This Web3 Data Warehouse Wants to Bring Big Data Analytics to Blockchain
Nate Holiday is the co-founder and CEO of Space and Time. He sat down with Jessica Abo to talk about his company and the intersection of data and blockchain.
Jessica Abo: Nate, tell us a little bit about Space and Time.
Space and Time is a decentralized data warehouse. If you think of a data warehouse, it’s where companies process data to better interact with their consumers on a regular basis. Companies all over the world run their data analysis in data warehouses. That ability to run applications and run businesses is something that has been around for a long time. We are now decentralizing that process, and that data, to interact and run with blockchain data in decentralized applications.
And why is it important?
There are many different types of blockchains around the world, and more and more people are starting to interact with different blockchains. And so it’s very important for businesses that are built on the blockchain, or businesses that are what we call “off-chain” or Web2, to start interacting with this data because that’s where their customers are.
If you want to understand your customers better, and understand what they do with the different types of applications and services offered through blockchain, it is very important to have tools, analytics and capabilities that can combine on-chain data from a blockchain with off-chain data from enterprise systems.
Your company aims to move enterprise data analytics to the blockchain. Why would anyone do that?
It’s a fair question. Look, I don’t think businesses are moving to the blockchain wholeheartedly. They have built years of data infrastructure and analytics architecture, and they have spent billions of dollars doing it. Taking all of that capability and simply moving it over and reusing it for the blockchain is something that I don’t think is going to happen. What is going to happen is that they will need blockchain data to interact and integrate with the large investments they have made over those years.
So we provide this bridge. We let them take all the blockchain data, get it flushed and indexed into relational data stores, and bring familiar tools that they can integrate into the overall system. So now they can combine all the years of architecture and data and analytics that they’ve built up over this period with new data coming from analytics and from blockchains, and combine that into a single cluster and provide analytics at scale.
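As a rough illustration of what combining indexed blockchain data with enterprise data in a relational store can look like, here is a minimal sketch using Python's built-in sqlite3 module. It is not Space and Time's API. The table names (customers, onchain_transfers), the columns, the wallet addresses and the helper functions are all hypothetical, invented only for this example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical off-chain and on-chain tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Off-chain: enterprise customer records (hypothetical schema).
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            wallet      TEXT NOT NULL
        );
        -- On-chain: transfers indexed from a blockchain (hypothetical schema).
        CREATE TABLE onchain_transfers (
            tx_hash     TEXT PRIMARY KEY,
            from_wallet TEXT NOT NULL,
            amount      REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Alice", "0xaaa"), (2, "Bob", "0xbbb")],
    )
    conn.executemany(
        "INSERT INTO onchain_transfers VALUES (?, ?, ?)",
        [("0x01", "0xaaa", 1.5), ("0x02", "0xaaa", 2.5), ("0x03", "0xccc", 9.0)],
    )
    return conn


def spend_per_customer(conn: sqlite3.Connection) -> list:
    """Join off-chain customers to their on-chain transfers by wallet address."""
    return conn.execute(
        """
        SELECT c.name,
               COUNT(t.tx_hash)           AS transfers,
               COALESCE(SUM(t.amount), 0) AS total_amount
        FROM customers AS c
        LEFT JOIN onchain_transfers AS t ON t.from_wallet = c.wallet
        GROUP BY c.customer_id
        ORDER BY c.name
        """
    ).fetchall()
```

The LEFT JOIN keeps customers who have no on-chain activity, so the result covers the whole customer base rather than only wallets that have already transacted.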
If you step back from this for a second, what do you think is wrong with how big data analytics is done today?
Well, the first thing I want to point out is that it’s all centralized. If you think about the last couple of years, everybody has migrated their data analytics and infrastructure to the cloud, and all the cloud service providers are centralized. There are many points in that architecture where data can be manipulated.
First, cloud service providers can decide what applications and what software runs in their cloud, and you see it all over the world where they decide on certain types of software to run; while certain types of companies can participate in their cloud, others cannot. Since companies build this software in the cloud, the service providers now have access to all this data. They are able to manipulate and control the data if they wish.
Now, if you build a company that keeps customer data in software running on that cloud provider, your company also has access to a huge amount of data, and that is centralized too. So if you think along this data line of cloud service providers, software creators, and companies that capture and use data from their customers, obviously these companies have employees who have access to all of that data. And when you think about “do no harm, always do good” in a centralized data architecture, you have all these points of failure where the data is centralized and can be manipulated and tampered with.
And what would be the benefits of decentralized data analytics?
Space and Time offers an alternative to the centralized computing architectures that the world has built. The difference is primarily in the servers on which we run the analyses. We do not own all the servers. They are decentralized. No one can own all the servers in a decentralized network, because that would defeat the purpose. So anyone can participate in the node operations and run the servers that run the software itself. The software we build is intended to be open source. And because the software runs on all these nodes, we don’t actually have the access or ability to manipulate or tamper with the software itself.
In addition to that, we have built in cryptographic guarantees. As data is processed in Space and Time, if someone changes, manipulates, inserts or deletes data during a query process, the cryptographic proofs will fail. This means that you wouldn’t actually be able to deliver the data back to, say, a smart contract on the blockchain, which requires tamper-proof delivery of data. This is why we are building this cryptographic proof, which we call Proof of SQL. Bad actors cannot change, manipulate or transform the data in ways that benefit them at the customer’s expense.
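To make the idea of a proof that fails when data is altered more concrete, here is a deliberately simplified sketch. It is not Proof of SQL, whose construction the interview does not describe. It swaps in a plain SHA-256 hash commitment, which only shows the basic principle that any change to the data changes the digest. The function names commit and verify and the sample rows are assumptions made for this example.

```python
import hashlib
import hmac
import json


def commit(rows: list) -> str:
    """Return a SHA-256 digest over a canonical JSON serialization of the rows.

    This is an illustrative hash commitment, not the Proof of SQL protocol.
    """
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify(rows: list, expected_commitment: str) -> bool:
    """Check that the rows still match a previously published commitment."""
    return hmac.compare_digest(commit(rows), expected_commitment)


# Hypothetical query result and a tampered copy of it.
original_rows = [{"wallet": "0xaaa", "amount": 4.0}]
tampered_rows = [{"wallet": "0xaaa", "amount": 40.0}]
published = commit(original_rows)
```

With these values, verify(original_rows, published) succeeds and verify(tampered_rows, published) does not, so a consumer such as a smart contract could reject the altered result. A real system has to prove much more than this, including that the query itself was executed correctly, which a bare hash cannot do.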
Nate, can you tell us about some of your partners?
We have really strong partnerships within the Web3 / blockchain space. Chainlink helps us bring data to and from the blockchain. When you want to interact with smart contracts, you need decentralized oracles. Data coming to and from smart contracts requires these oracles – it’s very important. And Chainlink is a leader in data and decentralized oracles.
We are also working with Polygon, Mysten Labs and Avalanche to bring Web2 customers to the blockchain. You see more and more news articles about Web2 companies starting to use blockchain data.
One of the most exciting partnerships we just announced was with Microsoft and M12 Ventures investing in Space and Time. I think the reason it’s so important is that you’re starting to see the leaders of the data industry, like Microsoft from a Web2 perspective and Chainlink from the Web3 industry, starting to look at how we bridge all this data. How do we bring best-in-class capabilities from off-chain data analytics and on-chain data to a single platform for data in general, one that interacts with blockchains and interacts with customers – wherever those customers are?
Finally, what would you say to people out there who are skeptical about all of this?
One thing I always say is don’t confuse blockchain technology with crypto. There is a lot in the news today about crypto exchanges, about trading exchanges, where you have bad actors participating in centralized formats. For example, if you think about these exchanges that fail, they’re all centralized. Blockchain technology, Web3 technology, was built to decentralize all of this. And the data infrastructure that we’re building is meant to further decentralize the ability to own your data and own your funds. All of these play a part in what blockchain and Web3 technology were built for.