Data Exchange Architecture
Version 1.1
Abstract
We are building the Permissionless Data-AI Coordination Layer—with a go-to-market focus on AI agents—to rethink from first principles how data owners collaborate with AI Devs and Agents without humans in the loop. If you're in a hurry, here’s a quick snapshot from one of our earliest backers on the “why” before you dive into the full Litepaper.
Access to specialized, niche datasets on demand is one of the biggest bottlenecks for AI today. We tackle this by supercharging incentive alignment for data owners through IP co-ownership, and by using intent-centric technologies to eliminate friction during discovery, negotiation, commitment, and settlement, at both the protocol level and the application level. We don’t care where or how someone built their AI agent or agent swarm: web2 or web3, Framework X or Framework Y, all are welcome at our Gas Station. Reppo should be thought of as a fuelling station for AI agents and devs, where they proactively make pit stops every day, sometimes every hour, to find, bid on, and transact on datasets and data pipelines, be it synthetic, human-generated, enterprise, proprietary, or sensitive data. We also provide in-platform access to private and secure training compute to minimize disruption to agentic workflows. All AI assets imported or bridged to Reppo are non-custodial. We use a novel security architecture powered by Gateway Protocol and Pocket Network for on-chain smart accounts we refer to as Pods. At launch, Reppo will support 5+ chains, including Solana, Base, and Celestia. We are on track to be the first intent-centric decentralized Data Exchange, acting as the primary point of interaction for data owners/providers and the agentic world.
Data Exchange Overview
The most widely adopted Data Exchange known to consumers today is the AWS Data Exchange. Permissioned. Centralized. No composability. No explicit fractional financialization of data as an asset class. As the Crypto x AI industry takes off, builders will obsess less and less over which agent framework is best for building agents and more over how to get access to the most unique and specialized fuel for their agentic workflows. Peer-to-peer ways to store and share data have so far been somewhat ironic, as we saw during our time at Filecoin, where the technology is incredible but the implementation relies heavily on centralized backends and platforms. We believe that agents, just like humans, will opt for convenience over security, and that the data layers and platforms that make user experience a priority will win. As with cryptocurrency exchanges, bootstrapping and aggregating liquidity will be critical, and we will leverage a class of human and autonomous actors, designed after Anoma’s solvers, who are incentivized to facilitate market making on the Data Exchange.
Reppo’s ecosystem is demand-driven, unlocking a market for data not yet scrapable and available on the internet. Accepting what’s available is a compromise AI builders are forced to settle on and we have already reached a tipping point where access to data must happen through permissionless infrastructure, competing with Web2 aggregators, providers, and brokers who extract value from users and businesses. It is time for a Reppolution.
Core Philosophy
Reppo is intent-centric, agentic-first and, for the most part, purpose-built. While we understand these can be seen as buzzwords in the industry, we want to emphasize that building a platform where AI agents can autonomously discover, negotiate, commit, and settle on data requires careful design choices and intentional protocol integrations to secure and facilitate each stage of a data trade. The only generalized components we are building are the core Protocol and a permissionless data transfer and consumption protocol for AI agents using Pocket Network’s Gateways. Unlike most decentralized teams that start with infrastructure and hope others will come build on it, burning their runway on hackathons and marketing, we are starting with the user, whom we have clearly defined as AI agents. Our belief is that once we build a system that works for agents, opening it up to AI Model and Application Devs will be a net positive for the platform. We are starting with the world’s first intent-centric data exchange deployed on an existing L2, and down the path we might consider launching our own chain to let the ecosystem control additional parts of the value accrual flow. Working within these constraints, we have already made more progress than we would have by trying to build the everything app or platform.
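To make the four stages of a data trade concrete, here is a minimal sketch of the lifecycle as a state machine. It is an illustration only, not the Reppo Protocol: the names Stage, DataTrade, and advance are hypothetical, and the rule that a trade moves strictly forward through the stages is an assumption made for the example.

```python
from enum import Enum


class Stage(Enum):
    DISCOVERY = "discovery"
    NEGOTIATION = "negotiation"
    COMMITMENT = "commitment"
    SETTLEMENT = "settlement"


# Assumption for this sketch: stages are visited in this order, with no skips or rollbacks.
ORDER = [Stage.DISCOVERY, Stage.NEGOTIATION, Stage.COMMITMENT, Stage.SETTLEMENT]


class DataTrade:
    """Hypothetical record of one agent-to-provider data trade."""

    def __init__(self, dataset_id: str) -> None:
        self.dataset_id = dataset_id
        self.stage = Stage.DISCOVERY
        self.history = [Stage.DISCOVERY]

    def advance(self) -> Stage:
        """Move the trade to the next stage, or fail if it is already settled."""
        index = ORDER.index(self.stage)
        if index == len(ORDER) - 1:
            raise ValueError("trade is already settled")
        self.stage = ORDER[index + 1]
        self.history.append(self.stage)
        return self.stage

    @property
    def settled(self) -> bool:
        return self.stage is Stage.SETTLEMENT


if __name__ == "__main__":
    trade = DataTrade("example-dataset")
    while not trade.settled:
        print(trade.advance().value)
```

Each protocol integration mentioned above would attach to one of these transitions; the sketch only tracks which stage a trade is in.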
The Reppo governance process is underway and we briefly touch on it later in this paper. Beginning at the end of February or early March, it will be open, so anyone can propose integrating new chains, protocols, and partnerships to deliver new functionality on Reppo.
Bringing Novel Primitives to Data Marketplaces
Most Data Exchanges today were created as a way for someone to list their data and some buyer to purchase or license such data i.e. a venue for human users to swap or trade one type of asset - data. In our view, that is extremely boring. Imagine if you could only use an exchange for swapping fiat for Bitcoin and vice versa. Just like the explosion of many types of digital assets, including stablecoins, we expect data will not just be one monolithic asset class but there will be many types of data assets. Private/public is the obvious one. So is consumer/enterprise. On-chain/off-chain is another one. Synthetic/human-generated is another. Raw/labelled is an emerging class. Just as the complexity of supporting swaps for different assets grew as new blockchains emerged, we believe there will emerge interesting primitives like derivatives and tokenization of datasets and data pipelines. Swapping data for crypto/fiat is a rudimentary application. Things get interesting when you can swap data for data.
Borrowing and Lending Data Perps
With the rise of on-chain IP, borrowing and lending datasets on demand will be a key feature of agentic workflows. Instead of paying a one-time spot “licensing fee”, agents and models will share part of their revenue with data owners when pricing the data up front is unreasonable, and such rights will be tradable. The Reppo Protocol is being designed with this in mind. As we have seen with currency exchanges, perpetual futures have grown in adoption but margin for spot trading has not. We expect this trend to be reflected in how data exchanges value data as an asset class for AI and agentic workflows.
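The trade-off between a one-time licensing fee and a revenue share can be sketched with simple arithmetic. The functions and numbers below are hypothetical and are not Reppo pricing; they only show that under a share rate r, the data owner earns more than a spot fee F once the agent's revenue exceeds F / r.

```python
def revenue_share_payout(agent_revenue: float, share_rate: float) -> float:
    """Amount paid to the data owner under a revenue-share agreement."""
    if not 0.0 <= share_rate <= 1.0:
        raise ValueError("share_rate must be between 0 and 1")
    if agent_revenue < 0:
        raise ValueError("agent_revenue must be non-negative")
    return agent_revenue * share_rate


def break_even_revenue(spot_fee: float, share_rate: float) -> float:
    """Agent revenue at which the revenue share equals the one-time spot fee."""
    if share_rate <= 0:
        raise ValueError("share_rate must be positive")
    return spot_fee / share_rate


if __name__ == "__main__":
    # Illustrative numbers only: a 200-unit spot fee versus a 5% revenue share.
    print(break_even_revenue(spot_fee=200.0, share_rate=0.05))
    print(revenue_share_payout(agent_revenue=10_000.0, share_rate=0.05))
```

Below the break-even point the agent pays less than the spot fee, which is the case where up-front pricing would have been hard to justify.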
Data Lending
This cycle, we saw the re-emergence of DataDAOs. While explaining what they are and how they work is out of our scope, we view services like DataDAOs and projects like OpenLedger, as well as L2s like Akave, as Data nodes, i.e. supply-side actors on the Reppo Network. Over the last few years, many DePINs have emerged that reward tokens for crowdsourcing data, but there is a serious misalignment of incentives due to centralization in the demand discovery and settlement process. DevCo is the broker and the end user does not benefit. This is not very different from how web2 data vendors, aggregators, and exchanges are incentivized to maximize return regardless of how the data was collected, leading to a race to the bottom where data assets are lent to the highest-bidding counterparties with little regard for fair use or for revenue flowing back to data creators. DePINs will suffer, soon, because of this issue, and our experience at Filecoin confirmed the second- and third-degree impacts of this incentive misalignment. Our view is that users must never give up custody of their assets, be it currency or data.
We expect more and more data lending protocols to emerge, and Reppo’s core protocol, designed around IP co-ownership of Data and AI assets, is well positioned to be the foundation on which builders can choose to build.
Data Options
Options are interesting. TradFi loves them and we are starting to see some adoption in crypto. Many reasons. Not the focus of this litepaper, but imagine if AI agents were able to negotiate contracts on data that is yet to be created or made available. This is already happening. The volume differential between spot and options on a decentralized data exchange like Reppo can be significant. This is not something we will roll out at the beginning, but we are committed to defining and driving adoption for this asset class.
Data liquidity is a newer concept and, due to lack of familiarity, it is unlikely that agents will be the first to adopt data options. But just as with other financialized instruments, we must not underestimate the power of markets and degens.
Staking, zkTLS, and On-Chain Reputation
Data verifiability was the bane of our team’s existence at Filecoin. Starting to define what is good data and bad data, either centrally or through some decentralized governance, leads to a very dark room. We trust the market and believe that it will converge to the optimal quality through feedback loops. Still, we plan to de-risk the system by leveraging staking primitives combined with zkTLS and an on-chain Reputation System for Agents to provide optimistic guarantees on Reppo’s data exchange. In simpler terms, there will be a notion of verified data on which consumers or other actors can participate in underwriting, but it is not an objective guarantee provided by Reppo. Today, to our knowledge, no centralized data exchange leverages cryptographic primitives for ensuring data and transaction quality. As a reminder, the Data Exchange itself is non-custodial, but we will have some form of custodial staking service available for several different data assets and blockchains to begin with.
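One way to picture how the three signals could combine into an optimistic "verified" label is the sketch below. Everything in it is an assumption made for illustration: the Listing fields, the thresholds, and the simple all-of rule are hypothetical, and the code does not verify a zkTLS proof itself; it only consumes the result of a check performed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """Hypothetical view of a data listing as seen by a consumer or underwriter."""

    provider_stake: float        # tokens the provider has put at risk
    has_zktls_proof: bool        # result of a zkTLS origin check done elsewhere
    provider_reputation: float   # on-chain reputation, normalized here to 0..1


def is_optimistically_verified(
    listing: Listing,
    min_stake: float = 100.0,       # illustrative threshold, not a protocol constant
    min_reputation: float = 0.5,    # illustrative threshold, not a protocol constant
) -> bool:
    """Return True when all three signals clear their thresholds.

    This is a market signal that others can choose to underwrite,
    not an objective guarantee about the data.
    """
    return (
        listing.has_zktls_proof
        and listing.provider_stake >= min_stake
        and listing.provider_reputation >= min_reputation
    )


if __name__ == "__main__":
    listing = Listing(provider_stake=250.0, has_zktls_proof=True, provider_reputation=0.8)
    print(is_optimistically_verified(listing))
```

A production design would more likely weight the signals and slash stake on disputes; the boolean rule here is only the simplest form of the idea.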
Some have asked about Liquid staking for Data. Yes, it will happen but we are not taking it up today, although if someone eventually creates wstrEPPO, we might be open to supporting such teams.
NFTs
NFTs are a great way to not only engage the community but also to provide rewards and benefits, which include ways to contribute to the Data Exchange either by providing liquidity, staking and securing or other mechanisms. We are committed to leveraging NFTs to build the Reppo Ecosystem. Some initial rewards of Utility NFTs that have been distributed so far can be found here.
NFTs will have a resurgence as they are used by AI agents as membership passes. Everyone is sleeping on this. We will showcase an implementation as part of the GA rollout and launch of Reppo’s Data Exchange.
User Acquisition
We buy that there will be a billion agents on-chain before there are a billion human users. Instead of the extreme cyclical adoption we currently see in crypto, the vast majority of agents will choose to live and play on-chain. Every agent, regardless of which factory it was designed, manufactured, or shipped from, will need access to high-quality fuel. While cycles might affect crazy growth curves, we expect more and more agents to adopt the Data Exchange to discover, negotiate, commit, and settle on data. Increased focus on the application layer will only amplify this trend by orders of magnitude. This presents a significant opportunity for a new platform like Reppo to challenge the web2 incumbents as well as data streamers from the previous cycle. Many protocols end up producing noise instead of value because they bring in their own biases about what is valuable and what is not. Just as we observe trends in datasets, we must scan a real-time stream of demand signals to form a fairly clear picture of which data dominates the market at a given time.
There have been 4-5 cycles so far in crypto and, regardless of who you are and where you stand on decentralization, user adoption has increased. While we are not here to predict the next hot trend, it is clear that AI x Crypto is here to stay. Beyond attention tokenization and memecoins, Reppo is best positioned to take advantage of the many emerging applications and services being built on-chain for AI. As mentioned previously, we don’t care where and how agents, apps, and models are built: Reppo will be the go-to pit stop, the fuelling station for all on-chain and eventually off-chain AI. Beyond a seamless and permissionless experience, Reppo will stand out through its demand-driven intent pooling and real-time data acquisition strategy. Safety and transparency will also be core winning traits and tenets of the system.
Distribution
In our view, the Crypto x AI market is severely distorted today: no one is sure what is valuable long term and what qualifies for a liquid strategy. Regardless, we are confident that AI agents will trade with each other and trade data beyond on-chain sources. This will give rise to a powerful feedback loop in which the protocols and applications that most efficiently capture growing market share and awareness restore the fundamental market dynamics that ensure strong competition and efficiency across the ecosystem.
Governance
We expect Reppo to evolve into more than just an exchange or a platform, especially as we add more and more on-chain services and open up the platform to new use cases and users. Credible neutrality will be core to Reppo Governance. We acknowledge that user growth and adoption are the goals of every project in the space, but with the rise of AI agents, we have a unique window of opportunity to build governance for Data-AI collaboration. It’s likely going to be a little weird, but a robust governance mechanism will be critical to onboard millions of new agentic users to a non-custodial platform that relies on on-chain infrastructure.
Reppo Governance is managed by a DAO, and a representative council governs the project (bounded delegation), consisting of five elected seats and two appointed seats. Broadly speaking, there is a continuum of governance structures available to govern a DAO. Direct token voting is at one end of this spectrum, where each token represents one vote. In the middle is an unbounded delegate system; tokens can be delegated to any number of representatives, each of which then exerts a weighted vote based on the power delegated to them. At the other end of the continuum is a bounded delegate system, or a representative council; in this coordination scheme, there are a finite number of delegates, each with equal voting power.
This system avoids diffusion of responsibility and governance fatigue by ensuring that key responsibilities are directly delegated to operational council members for whom the role is their sole priority. After many years of governance iteration, this system is one of the most stable and least easily captured that has been deployed.
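The three points on the governance continuum described above differ mainly in how votes are weighted, which the following sketch illustrates. The function names and the strict-majority rule are assumptions for the example; only the seven-seat council size (five elected plus two appointed) comes from this paper.

```python
from collections import defaultdict


def direct_token_vote(balances: dict[str, float], votes: dict[str, bool]) -> bool:
    """Direct token voting: each holder votes with the weight of their own balance."""
    yes = sum(balances[holder] for holder, vote in votes.items() if vote)
    no = sum(balances[holder] for holder, vote in votes.items() if not vote)
    return yes > no


def unbounded_delegate_vote(
    balances: dict[str, float],
    delegations: dict[str, str],
    delegate_votes: dict[str, bool],
) -> bool:
    """Unbounded delegation: any number of delegates, weighted by delegated tokens."""
    power: dict[str, float] = defaultdict(float)
    for holder, delegate in delegations.items():
        power[delegate] += balances[holder]
    yes = sum(power[d] for d, vote in delegate_votes.items() if vote)
    no = sum(power[d] for d, vote in delegate_votes.items() if not vote)
    return yes > no


def council_vote(seat_votes: dict[str, bool], seats: int = 7) -> bool:
    """Bounded delegation: a fixed council where every seat has equal voting power.

    Assumption for this sketch: a proposal passes with a strict majority of all seats.
    """
    if len(seat_votes) > seats:
        raise ValueError("more votes than council seats")
    yes = sum(1 for vote in seat_votes.values() if vote)
    return yes > seats // 2


if __name__ == "__main__":
    balances = {"a": 60.0, "b": 25.0, "c": 15.0}
    print(direct_token_vote(balances, {"a": False, "b": True, "c": True}))
    print(council_vote({f"seat{i}": i < 4 for i in range(7)}))
```

In the first two schemes a single large holder can decide the outcome, while in the council scheme token weight only matters at election time.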
Conclusion
The Reppo Data Exchange combines cutting-edge tech (from Anoma’s OS for intent-centric apps, to Gateway’s privacy-enhancing tech, to OpenGradient’s verifiable inference stack) to solve some of the biggest AI bottlenecks. Individually, each tech has its limits; together, they form a cohesive, scalable platform that’s as decentralized, permissionless, and verifiable as crypto itself. We’re set to be the first on-chain data exchange that transforms how data owners and AI agents communicate and commit, i.e. coordinate (Coordination = Communication + Commitment; thanks @Sreeram Kannan).
Welcome to our Fuelling Station.
RFD stands for Request for Data.
Validation is a combination of zkTLS + staking + on-chain reputation, discussed in the section Staking, zkTLS, and On-Chain Reputation.