OpenLedger: Building a Data-Driven, Model-Composable Agent Economy

2025-08-02 03:03:54

OpenLedger In-Depth Research Report: Building a data-driven, model-composable agent economy based on OP Stack + EigenDA

1. Introduction | The Model Layer Leap of Crypto AI

Data, models, and computing power are the three core elements of AI infrastructure, analogous to fuel (data), engine (model), and energy (computing power); none can be omitted. The Crypto AI field has followed an evolutionary path similar to that of infrastructure in the traditional AI industry. In early 2024, the market was dominated by decentralized GPU projects, most of which pursued an extensive growth logic of "competing on computing power." Entering 2025, the industry's focus has gradually shifted to the model and data layers, marking a transition for Crypto AI from competition over underlying resources to the construction of a more sustainable, application-driven middle layer.

General-Purpose Large Language Model (LLM) vs. Specialized Language Model (SLM)

Training a traditional large language model (LLM) relies heavily on large-scale datasets and complex distributed architectures. Parameter counts often range from 70B to 500B, and a single training run can cost several million dollars. An SLM (Specialized Language Model), by contrast, is a lightweight fine-tuning paradigm that reuses a foundation model: it typically starts from an open-source model and combines a small amount of high-quality domain data with techniques such as LoRA to quickly build an expert model with domain-specific knowledge, significantly reducing training costs and technical barriers.

Notably, an SLM is not merged into the LLM's weights. Instead, it works alongside the LLM through agent-architecture calls, dynamic routing in a plugin system, hot-swappable LoRA modules, and RAG (Retrieval-Augmented Generation). This architecture retains the LLM's broad coverage while the fine-tuned modules improve performance in specialized domains, yielding a highly flexible composite intelligent system.
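The dynamic routing idea can be illustrated with a minimal sketch. It assumes a simple keyword-overlap rule; the specialist names, keyword sets, and the `route` function are hypothetical and are not taken from OpenLedger. A production router would more likely use a classifier or embeddings.

```python
def route(query: str, specialists: dict[str, set[str]], default: str = "base-llm") -> str:
    """Pick the specialist whose keyword set overlaps the query the most.

    Falls back to the general-purpose model when no keyword matches.
    All names here are illustrative assumptions.
    """
    words = set(query.lower().split())
    best, best_hits = default, 0
    # Iterate in sorted order so ties resolve deterministically.
    for name in sorted(specialists):
        hits = len(words & specialists[name])
        if hits > best_hits:
            best, best_hits = name, hits
    return best


specialists = {
    "legal-slm": {"contract", "liability"},
    "medical-slm": {"dosage", "symptom"},
}
print(route("review this contract liability clause", specialists))
print(route("tell me a joke", specialists))
```

The base model stays untouched in this arrangement; only the choice of which specialized module handles a request changes per query.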

The value and boundaries of Crypto AI at the model layer

It is inherently difficult for Crypto AI projects to directly enhance the core capabilities of large language models (LLMs), for two main reasons:

High technical threshold: Training a foundation model requires data, computing resources, and engineering capability on an extremely large scale, and currently only technology giants in countries such as the United States and China have that capability.
Limitations of the open-source ecosystem: Although mainstream foundation models such as LLaMA and Mixtral have been open-sourced, the breakthroughs that advance models still come mainly from research institutions and closed-source engineering systems, leaving on-chain projects limited room to participate at the core model level.

However, on top of open-source foundation models, Crypto AI projects can still extend value by fine-tuning specialized language models (SLMs) and combining them with the verifiability and incentive mechanisms of Web3. Acting as the "peripheral interface layer" of the AI industry chain, they contribute in two core directions:

Trustworthy verification layer: Recording the model generation path, data contributions, and usage on-chain enhances the traceability and tamper-resistance of AI outputs.
Incentive mechanism: A native token rewards actions such as data uploading, model invocation, and agent execution, creating a positive cycle of model training and service.

AI Model Type Classification and Blockchain Applicability Analysis

The feasible landing points for model-focused Crypto AI projects are therefore mainly the lightweight fine-tuning of small SLMs, on-chain data access and verification for RAG architectures, and the local deployment and incentivization of edge models. By combining blockchain verifiability with token mechanisms, Crypto can offer distinctive value in these low-to-medium-resource model scenarios, differentiating the AI "interface layer".

An AI chain built around data and models can keep a clear, tamper-proof on-chain record of where each piece of data and each model came from, significantly improving the credibility of data and the traceability of model training. At the same time, smart contracts automatically trigger reward distribution whenever data or a model is called, turning AI activity into measurable, tradable tokenized value and building a sustainable incentive system. In addition, community users can evaluate model performance through token voting and take part in making and iterating the rules, strengthening decentralized governance.
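As a rough illustration of call-triggered reward distribution, the sketch below splits a per-call fee pro rata among contributors and appends the call to a log. It is an off-chain Python model under stated assumptions, not OpenLedger's contract logic: the contribution weights, the remainder rule, and the names `split_fee` and `RewardLedger` are all hypothetical.

```python
from collections import defaultdict


def split_fee(fee: int, weights: dict[str, int]) -> dict[str, int]:
    """Split an integer fee pro rata by contribution weight.

    Integer arithmetic mirrors on-chain token math. The remainder left by
    floor division goes to the largest contributor (an assumed rule) so the
    payouts always sum to the fee.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one positive contribution weight is required")
    payouts = {who: fee * w // total for who, w in weights.items()}
    remainder = fee - sum(payouts.values())
    top = max(weights, key=weights.get)
    payouts[top] += remainder
    return payouts


class RewardLedger:
    """Append-only call log plus running balances (hypothetical structure)."""

    def __init__(self) -> None:
        self.balances: dict[str, int] = defaultdict(int)
        self.calls: list[tuple[str, int, dict[str, int]]] = []

    def record_call(self, model_id: str, fee: int, weights: dict[str, int]) -> dict[str, int]:
        # Each model call triggers a payout and is logged for later audit.
        payouts = split_fee(fee, weights)
        self.calls.append((model_id, fee, payouts))
        for who, amount in payouts.items():
            self.balances[who] += amount
        return payouts


ledger = RewardLedger()
print(ledger.record_call("legal-slm", 100, {"data_provider": 1, "model_dev": 2}))
```

Keeping the log append-only is what makes the payouts auditable after the fact; on a real chain that property would come from the ledger itself rather than from a Python list.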

2. Project Overview | The AI Chain Vision of OpenLedger

OpenLedger is one of the few blockchain AI projects in the current market that focuses on incentive mechanisms for data and models. It was the first to propose the concept of "Payable AI", aiming to build a fair, transparent, and composable AI operating environment in which data contributors, model developers, and AI application builders collaborate on the same platform and earn on-chain rewards based on their actual contributions.

OpenLedger provides a full-chain closed loop from "data provision" to "model deployment" to "revenue sharing on invocation", with core modules including:

Model Factory: No programming required; users can fine-tune, train, and deploy custom models with LoRA on top of open-source LLMs;
OpenLoRA: Supports the coexistence of thousands of models and loads them dynamically on demand, significantly reducing deployment costs;
PoA (Proof of Attribution): Measures contributions and distributes rewards based on on-chain call records;
Datanets: Structured data networks for vertical scenarios, built and validated through community collaboration;
Model Proposal Platform: A composable, callable, and payable on-chain model marketplace.

Through these modules, OpenLedger has built a data-driven, model-composable "agent economy infrastructure" that brings the AI value chain on-chain.

On the blockchain side, OpenLedger uses OP Stack + EigenDA as its foundation, creating a high-performance, low-cost, and verifiable environment for the data and contract execution behind AI models.

Built on OP Stack: Based on the Optimism technology stack, supporting high throughput and low-cost execution;
Settlement on the Ethereum mainnet: Ensures transaction security and asset integrity;
EVM compatible: Lets developers deploy and extend quickly using Solidity;
Data availability from EigenDA: Significantly reduces storage costs while keeping data verifiable.

General-purpose AI chains such as NEAR sit closer to the base layer and emphasize data sovereignty and the "AI Agents on BOS" architecture. OpenLedger, by contrast, focuses on building an AI-specific chain for data and model incentives, aiming to make model development and invocation on-chain traceable, composable, and sustainable as a value loop. It serves as model incentive infrastructure for the Web3 world, combining model hosting, usage billing, and composable on-chain interfaces to advance the realization of "models as assets".

3. Core Components and Technical Architecture of OpenLedger

3.1 Model Factory, no-code model factory

ModelFactory is a large language model (LLM) fine-tuning platform within the OpenLedger ecosystem. Unlike traditional fine-tuning frameworks, ModelFactory is operated entirely through a graphical interface, with no command-line tools or API integration required. Users fine-tune models on datasets that have been authorized and reviewed on OpenLedger. The platform integrates data authorization, model training, and deployment into a single workflow whose core steps include:

Data access control: Users submit data requests, providers review and approve them, and approved data is automatically connected to the model training interface.
Model selection and configuration: Supports mainstream LLMs (such as LLaMA and Mistral), with hyperparameters configured through the GUI.
Lightweight fine-tuning: Built-in LoRA / QLoRA engine with real-time display of training progress.
Model evaluation and deployment: Built-in evaluation tools, with support for exporting a model for deployment or sharing it for invocation within the ecosystem.
Interactive verification interface: A chat-style interface for directly testing the model's Q&A capabilities.
RAG generation traceability: Answers carry source citations, enhancing trust and auditability.
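The traceability step can be sketched as retrieval that keeps a source identifier attached to every passage it returns. This is an assumption-laden toy: it ranks passages by word overlap instead of embeddings, stops before any text generation, and the corpus, document ids, and function names are hypothetical rather than ModelFactory's actual interface.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return up to k (source_id, passage) pairs ranked by word overlap."""
    q = tokenize(query)
    scored = []
    for source_id, passage in corpus.items():
        overlap = len(q & tokenize(passage))
        if overlap:
            scored.append((overlap, source_id))
    # Highest overlap first; ties broken by source id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(sid, corpus[sid]) for _, sid in scored[:k]]


def context_with_citations(query: str, corpus: dict[str, str], k: int = 2) -> dict[str, list[str]]:
    """Bundle the retrieved passages with the ids that must be cited."""
    hits = retrieve(query, corpus, k)
    return {
        "context": [passage for _, passage in hits],
        "citations": [sid for sid, _ in hits],
    }


corpus = {
    "doc-a": "LoRA freezes base weights",
    "doc-b": "EigenDA stores data blobs",
    "doc-c": "LoRA adapters are small",
}
print(context_with_citations("how does LoRA work", corpus))
```

Because the citation list is produced by the same call that selects the context, an answer built from that context can always be traced back to the passages that informed it.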

The Model Factory system architecture comprises six major modules spanning identity authentication, data permissions, model fine-tuning, evaluation and deployment, and RAG traceability, forming an integrated model service platform that is secure, controllable, interactive in real time, and able to monetize sustainably.

The following is a brief summary of the capabilities of the large language models currently supported by ModelFactory:

LLaMA series: The broadest ecosystem, an active community, and strong general performance make it one of the most mainstream open-source foundation models today.
Mistral: Highly efficient architecture with excellent inference performance, suitable for flexible deployment in resource-limited scenarios.
Qwen: Developed by Alibaba, it excels at Chinese-language tasks, has strong overall capabilities, and is a top choice for developers in China.
ChatGLM: Outstanding Chinese dialogue quality, suitable for vertical customer service and localization scenarios.
DeepSeek: Excels in code generation and mathematical reasoning, suitable for intelligent development assistance tools.
Gemma: A lightweight model launched by Google, with a clear structure that makes it easy to get started and experiment quickly.
Falcon: Once a performance benchmark, suitable for basic research or comparative testing, but community activity has decreased.
BLOOM: Strong multilingual support but weaker inference performance, suitable for language coverage research.
GPT-2: A classic early model, suitable only for teaching and validation purposes, not recommended for actual deployment.

Although OpenLedger's model lineup does not include the latest high-performance MoE or multimodal models, the strategy is not outdated. Rather, it is a "practicality-first" configuration shaped by the real constraints of on-chain deployment (inference costs, RAG adaptation, LoRA compatibility, and the EVM environment).

As a no-code toolchain, Model Factory builds a proof-of-contribution mechanism into every model, protecting the rights of data contributors and model developers. Compared with traditional model development tools, it offers a low entry barrier, monetizability, and composability:

For developers: A complete path for model incubation, distribution, and revenue;
For the platform: An ecosystem in which model assets circulate and combine;
For users: Models or agents can be combined and used as easily as calling an API.

3.2 OpenLoRA, on-chain assetization of fine-tuned models

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method that learns new tasks by inserting "low-rank matrices" into a pre-trained large model without modifying the original model parameters, significantly reducing training costs and storage requirements. Traditional large language models (such as LLaMA and GPT-3) typically have billions or even hundreds of billions of parameters, and using them for a specific task (such as legal question answering or medical consultation) requires fine-tuning. The core strategy of LoRA is to "freeze the parameters of the original large model and train only the newly inserted parameter matrices." Its parameter efficiency, fast training, and flexible deployment make it the mainstream fine-tuning method best suited to Web3 model deployment and composable invocation.
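The saving from "freeze the original, train only the inserted matrices" can be made concrete with a small worked example. The sketch below uses plain Python lists and the standard LoRA form y = W0 x + scale * B(A x), where W0 is frozen and only A and B are trained; the layer size 4096 x 4096 and rank 8 are illustrative assumptions, not figures from the report.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in               # fine-tuning W0 directly
    lora = rank * (d_out + d_in)      # B is d_out x rank, A is rank x d_in
    return full, lora


def matvec(M: list[list[float]], v: list[float]) -> list[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W0, A, B, scale: float = 1.0) -> list[float]:
    """Compute y = W0 x + scale * B (A x); W0 is never modified."""
    base = matvec(W0, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]


full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, lora / full)   # 16777216 65536 0.00390625

# Tiny rank-1 example: a 2x2 frozen matrix plus a low-rank update.
W0 = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]       # rank x d_in
B = [[2.0], [0.0]]     # d_out x rank
print(lora_forward([1.0, 2.0], W0, A, B))   # [7.0, 2.0]
```

For this assumed layer the adapter trains about 0.4% of the parameters that full fine-tuning would touch, which is why many adapters can be stored and swapped against a single shared base model.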

OpenLoRA is a lightweight inference framework built by OpenLedger, specifically designed for multi-model deployment and resource sharing. Its core objective is to address common issues in current AI model deployment, such as high costs, low reusability, and waste of GPU resources, and to promote the implementation of "Payable AI."

The OpenLoRA system architecture follows a modular design. Its core components cover model storage, inference execution, and request routing, enabling efficient, low-cost multi-model deployment and invocation:

LoRA Adapters Storage: Fine-tuned LoRA adapters are hosted on OpenLedger and loaded on demand, which avoids pre-loading every model into GPU memory and saves resources.
Model Hosting & Adapter Merging Layer: All fine-tuned models share the base model during inference.
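On-demand adapter loading against a shared base model can be sketched as a small least-recently-used cache. This is only one plausible policy, chosen for illustration: the report does not say how OpenLoRA decides which adapters stay resident, and `AdapterCache`, its capacity, and the loader callback are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class AdapterCache:
    """Keep at most `capacity` adapters resident, evicting the least recently used."""

    def __init__(self, capacity: int, loader: Callable[[str], object]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.loader = loader            # fetches adapter weights from storage
        self._resident: OrderedDict[str, object] = OrderedDict()
        self.loads = 0                  # counts trips to storage

    def get(self, adapter_id: str) -> object:
        if adapter_id in self._resident:
            # Cache hit: mark as most recently used, no load needed.
            self._resident.move_to_end(adapter_id)
            return self._resident[adapter_id]
        adapter = self.loader(adapter_id)
        self.loads += 1
        self._resident[adapter_id] = adapter
        if len(self._resident) > self.capacity:
            # Evict the least recently used adapter to free memory.
            self._resident.popitem(last=False)
        return adapter

    def resident_ids(self) -> list[str]:
        return list(self._resident)


cache = AdapterCache(capacity=2, loader=lambda adapter_id: f"weights:{adapter_id}")
for request in ["legal", "medical", "legal", "code"]:
    cache.get(request)
print(cache.resident_ids(), cache.loads)   # ['legal', 'code'] 3
```

Only the small adapters move in and out of memory in this sketch; the base model is loaded once and shared, which is the property the two components above describe.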
