A decentralized registry where companies list tools, agents discover and pay for them, and reputation emerges from verified usage — backed by Dolt DB, inspired by AT Protocol's data model.
AI agents are increasingly capable, but when they need specialized tools — fraud detection, geospatial analysis, compliance checking — they hit a wall.
Tools are hardcoded or manually configured. There's no Yellow Pages for agent capabilities.
There's no machine-native way to pay for a tool call. It's all API keys, billing dashboards, and enterprise contracts.
An agent can't know which tool provider is reliable, fast, or accurate without a human pre-vetting everything.
Switch your MCP server or API provider and you're rewiring everything from scratch.
When an agent makes a decision based on a tool's output, there's no versioned, reproducible record of what happened.
SaaS pricing models assume a human buyer. Per-seat licenses, annual contracts, "schedule a demo" funnels — the entire commercial infrastructure was designed for humans. An agent can't sit through a sales call.
The protocol debate (MCP vs. skills vs. REST vs. gRPC) is a distraction. The real gap is: how do agents find, trust, pay for, and audit tool usage across organizational boundaries?
Dan Abramov's article reframes AT Protocol as a distributed filesystem for social computing. The ToolShed borrows its design patterns — but doesn't depend on Bluesky's infrastructure.
"Our memories, our thoughts, our designs should outlive the software we used to create them." Replace "software" with "agent frameworks" and the same principle applies to tools.
A multi-agent orchestration system for coordinating 20-30+ Claude Code agents working simultaneously.
A SQL database you can fork, clone, branch, merge, push, and pull — just like Git. MySQL-compatible with full version history.
| Feature | Application in ToolShed |
|---|---|
| `dolt_history_*` | Full row-level history of every tool registration and invocation |
| `AS OF` queries | "What tools were available at time T? What schema was version N?" |
| `dolt_diff()` | "What changed between schema versions?" |
| Branch & merge | A/B test new pricing, preview schema changes before publishing |
| `dolt clone` / `push` | Distribute the registry, federate across organizations |
| Signed commits | Tamper-evident audit trail |
A language where every definition is identified by a hash of its syntax tree, not by its name. Names are just metadata — pointers to hashes.
“What we now think of as a dependency conflict is instead just a situation where there are multiple terms or types that serve a similar purpose.” The ToolShed applies this to tool schemas.
The ToolShed doesn't change the pattern developers already use — it is that pattern. Add one config entry and your agent has access to every tool in the registry. No new paradigm, no behavioral shift.
The ToolShed exposes a small set of meta-tools:
- `toolshed_search` — find tools by capability, price, latency, reputation
- `toolshed_invoke` — call a tool; handles payment, schema validation, logging
- Reputation lookup — check reliability, quality scores, SLA compliance
- `toolshed_review` — submit a proof-of-use upvote after using a tool
1. Agent has a task: "analyze this transaction for fraud"
2. Agent doesn't have a fraud tool — calls `toolshed_search({ capabilities: ["fraud"], max_price: 0.01 })`
3. ToolShed returns ranked results from the Dolt registry
4. Agent picks one, calls `toolshed_invoke({ tool: "fraud-detection-v3@acme.com", input: {...} })`
5. ToolShed gateway handles Stripe payment, calls the endpoint, validates the response
6. Agent gets the result, uses it — then calls `toolshed_review` with a quality signal
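The walkthrough above can be sketched from the agent's side. Everything here is illustrative: the `toolshed_*` functions are stand-ins for the real meta-tools, and the in-memory registry replaces the Dolt-backed one.

```python
# Illustrative sketch of the agent-side loop; the registry contents and
# meta-tool implementations are stand-ins, not the real ToolShed API.

REGISTRY = [  # what toolshed_search would rank from the Dolt registry
    {"tool": "fraud-detection-v3@acme.com", "price": 0.005, "reputation": 4.8},
    {"tool": "fraud-lite@cheapco.io", "price": 0.001, "reputation": 3.1},
]

def toolshed_search(capabilities, max_price):
    # filter by price cap, rank by reputation (descending)
    hits = [t for t in REGISTRY if t["price"] <= max_price]
    return sorted(hits, key=lambda t: t["reputation"], reverse=True)

def toolshed_invoke(tool, payload):
    # real gateway: pay via Stripe, call the endpoint, validate the response
    return {"risk_score": 0.92, "flags": ["velocity"]}

def toolshed_review(tool, quality):
    # real version: attaches proof-of-use (payment receipt, invocation hash)
    return {"subject": tool, "evaluation": {"quality": quality}}

# 1. agent lacks a fraud tool, so it searches
ranked = toolshed_search(["fraud"], max_price=0.01)
# 2. picks the top-ranked result and invokes it
best = ranked[0]["tool"]
result = toolshed_invoke(best, {"transaction_id": "txn_1", "amount": 950.0})
# 3. submits a proof-of-use quality signal
review = toolshed_review(best, quality=5)
print(best, result["risk_score"], review["evaluation"]["quality"])
```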
An agent that already has a hardcoded fraud tool will never search for fraud — it'll just use what it has. But the first time it needs something it doesn't have, the ToolShed is right there. Discovery happens organically at the edges.
- **Registry** — Dolt-backed. Tool records with schema, pricing, endpoint, payment methods, and SLA. Company identity verified via domain ownership or DIDs. Capability search and discovery.
- **Gateway** — thin routing + auth + metering. Protocol translation (MCP, REST, gRPC — it doesn't care). Payment negotiation, usage metering via Stripe, and response validation against the schema.
- **Ledger** — Dolt-backed audit trail. Every invocation is a commit. Time-travel, diff, reproduce any agent decision. Settlement and reconciliation records.
The MCP-vs-skills debate is a false choice. The invocation method is just a field in the tool record:
```json
{
  "protocol": "mcp",  // or "rest", "grpc", "graphql", "skill"
  "endpoint": "https://tools.acme.com/mcp",
  "tool_name": "fraud_check"
}
```
It's like how DNS doesn't care what protocol you speak once you've resolved the address. The schema is the contract; the protocol is a transport detail.
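A minimal sketch of that idea: the handler table and record shape below are assumptions, but they show how the `protocol` field reduces the debate to a dispatch key.

```python
# Sketch: the invocation record's "protocol" field selects a transport.
# Handler names and behavior are illustrative, not the real gateway.

def call_mcp(endpoint, tool_name, payload):
    return f"MCP {tool_name} @ {endpoint}"

def call_rest(endpoint, tool_name, payload):
    return f"POST {endpoint}/{tool_name}"

# grpc, graphql, skill: same idea, one more entry each
TRANSPORTS = {"mcp": call_mcp, "rest": call_rest}

def invoke(record, payload):
    # the schema is the contract; the transport is a lookup
    handler = TRANSPORTS[record["protocol"]]
    return handler(record["endpoint"], record["tool_name"], payload)

record = {
    "protocol": "mcp",
    "endpoint": "https://tools.acme.com/mcp",
    "tool_name": "fraud_check",
}
print(invoke(record, {"transaction_id": "txn_1"}))
```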
Every entity in the system is a record — a JSON document with a schema. No special servers for payment, reputation, or discovery. It's all records in the Dolt registry, with materialized views computed by whoever needs them.
A company registers a tool in two parts: an immutable **definition** (the contract) and a mutable **listing** (the metadata). The registry hashes the definition to produce a `content_hash` — the tool's true identity.
```json
{
  "provider": { "domain": "acme.com", "did": "did:plc:acme-corp" },
  "schema": {
    "input": {
      "transaction_id": { "type": "string" },
      "amount": { "type": "number" },
      "merchant_category": { "type": "string" }
    },
    "output": {
      "risk_score": { "type": "number", "min": 0, "max": 1 },
      "flags": { "type": "array", "items": { "type": "string" } }
    }
  },
  "invocation": {
    "protocol": "mcp",
    "endpoint": "https://tools.acme.com/mcp",
    "tool_name": "fraud_check"
  },
  "capabilities": ["fraud", "ml", "financial", "real-time"],
  "createdAt": "2026-03-01T00:00:00Z"
}
```
```json
{
  "definition_hash": "sha256:a1b2c3d4e5f6...",
  "name": "Fraud Detection",
  "version_label": "3.1.0",
  "description": "Real-time transaction fraud scoring with ML",
  "pricing": { "model": "per_call", "price": 0.005, "currency": "usd" },
  "sla": { "p99_latency_ms": 500, "uptime": "99.9%" },
  "updatedAt": "2026-03-01T00:00:00Z"
}
```
When an agent uses a tool and gets good results, it creates an upvote — a quality signal with proof that the agent actually paid for and used the tool:
```json
{
  "subject": "com.toolshed.tool/fraud-detection-v3@acme.com",
  "proof": {
    "payment_method": "stripe",
    "stripe_invoice_id": "in_1abc123def456",
    "invocation_hash": "sha256:deadbeef...",
    "ledger_commit": "dolt:76qerj11u38il8rb..."
  },
  "evaluation": {
    "quality": 5,
    "latency_met_sla": true,
    "schema_valid": true,
    "useful": true
  }
}
```
- **Tool definition** — immutable contract: schema, invocation, capabilities
- **Tool listing** — mutable metadata: name, pricing, SLA — points to a definition
- **Schema** — machine-readable input/output contract
- **Invocation** — record of each call: input hash, output hash, timing
- **Upvote** — quality signal with proof-of-use
Inspired by the Unison programming language, tool definitions are identified by a hash of their content, not by a name or version number. Names and version labels are mutable metadata that point to immutable hashes.
New schema → new hash → new definition. Old hash still exists. Agents pinned to the old hash keep working.
Two definitions with different schemas are different hashes. They coexist. No coordination needed.
After a successful call, an agent stores `sha256:abc123` — immutable and precise. Names can change; the hash is stable.
Two providers with the same schema and contract share a content hash. Discovery surfaces both providers for one definition.
Old hashes just exist. Stale reputation naturally pushes agents toward newer definitions.
`tool_definitions` is append-only. `dolt_history_tool_listings` tracks every pointer change. `AS OF` queries reproduce any point in time.
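The content-addressing step itself can be sketched briefly. The exact canonicalization ToolShed would use is an assumption; the point is that identical contracts hash identically and any schema change produces a new hash.

```python
import hashlib
import json

# Sketch of content-addressing: hash the canonical JSON of the immutable
# definition fields. The canonicalization scheme here (sorted keys, compact
# separators) is an assumption, not a ToolShed spec.

def content_hash(definition: dict) -> str:
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

defn = {
    "provider": {"domain": "acme.com"},
    "schema": {"input": {"transaction_id": {"type": "string"}}},
    "invocation": {"protocol": "mcp", "endpoint": "https://tools.acme.com/mcp"},
}

h1 = content_hash(defn)
# key order doesn't matter: same contract, same hash
h2 = content_hash(dict(reversed(list(defn.items()))))
assert h1 == h2

# any schema change yields a new hash -> a new definition; the old one survives
changed = {**defn, "schema": {"input": {"txn": {"type": "string"}}}}
assert content_hash(changed) != h1
print(h1)
```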
No special payment subsystem. The provider declares "send cash this way" as part of their tool registration. The agent reads the payment methods, picks one it supports, pays, and calls the tool.
```json
// MVP: Stripe metered billing
"payment": {
  "stripe": {
    "payment_link": "https://buy.stripe.com/...",
    "meter_id": "mtr_abc123"
  }
}

// Open source / community tools
"payment": { "free": {} }
```
New payment methods don't require protocol changes. Anyone publishes a new lexicon:
```
com.toolshed.defs#paymentStripe     ← MVP
com.toolshed.defs#paymentFree       ← open source
com.toolshed.defs#paymentLightning  ← future: micropayments
com.toolshed.defs#paymentCashu      ← future: bearer tokens
io.fedi.defs#paymentFedimint        ← community-defined
xyz.newrail.defs#paymentWhatever    ← anyone can extend
```
Validate on read. If a tool lists a payment method the agent doesn't understand, the agent skips it and picks one it does. If it can't pay at all, it moves on to the next tool.
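Validate-on-read can be sketched as a simple filter; the `SUPPORTED` set and the tool shapes below are hypothetical.

```python
# Sketch: the agent supports some payment lexicons and skips methods
# (or whole tools) it can't pay for. All names here are illustrative.

SUPPORTED = {"stripe", "free"}  # lexicons this agent knows how to pay with

def pick_payment(tool):
    # validate on read: take the first method this agent understands
    for method in tool["payment"]:
        if method in SUPPORTED:
            return method
    return None  # can't pay at all -> move on to the next tool

tools = [
    {"name": "exotic", "payment": {"lightning": {}}},  # unknown to this agent
    {"name": "acme-fraud", "payment": {"stripe": {"meter_id": "mtr_abc123"}}},
]

payable = [(t["name"], pick_payment(t)) for t in tools if pick_payment(t)]
print(payable)
```

An unknown lexicon never breaks the agent; it just narrows the candidate list.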
Reputation is not stored on the tool. It's derived — a materialized view computed from all upvote records in the Dolt registry. Nobody owns the score. Nobody can inflate it without paying for real usage. Anybody can compute it.
```sql
-- REPUTATION for acme's fraud-detection-v3:
SELECT AVG(quality_score), COUNT(*), COUNT(DISTINCT caller_domain)
FROM upvotes
WHERE tool_id = 'com.toolshed.tool/fraud-detection-v3@acme.com'
  AND proof_is_valid = true      -- payment receipt checks out
  AND invocation_exists = true   -- hash found in ledger

-- Nobody owns this score.
-- Nobody can inflate it without paying for real usage.
-- Anybody can compute it (clone the registry, run the query).
```
| Attack | Why It Fails |
|---|---|
| Fake upvotes (sybil) | Proof-of-use required. No valid payment receipt = unverifiable upvote. |
| Self-upvoting | Provider pays themselves real money. `caller_did == provider_did` — trivial to filter. |
| Wash trading | Detectable via diversity-of-upvoters weighting. PageRank-style graph analysis. |
| Buying upvotes | Requires real usage and real payment — the tool still has to deliver quality. |
| Deleting bad reviews | Impossible. Upvotes live in the reviewer's repo, not the provider's. |
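The first defenses can be sketched as a verification pass over upvote records; the ledger and receipt sets here are illustrative stand-ins for real Stripe and Dolt lookups.

```python
# Sketch of upvote verification: a counted upvote needs a real payment
# receipt, an invocation hash present in the ledger, and a caller that
# isn't the provider. Field names follow the record examples above;
# the data sets are illustrative.

LEDGER_HASHES = {"sha256:deadbeef"}    # invocation hashes from the Dolt ledger
VALID_RECEIPTS = {"in_1abc123def456"}  # receipts confirmed with Stripe

def upvote_counts(upvote, provider_did):
    proof = upvote["proof"]
    if proof.get("stripe_invoice_id") not in VALID_RECEIPTS:
        return False  # sybil: no real payment behind the upvote
    if proof.get("invocation_hash") not in LEDGER_HASHES:
        return False  # no matching call on record
    if upvote["caller_did"] == provider_did:
        return False  # self-upvote: trivial to filter
    return True

good = {
    "caller_did": "did:plc:buyer",
    "proof": {
        "stripe_invoice_id": "in_1abc123def456",
        "invocation_hash": "sha256:deadbeef",
    },
}
selfie = {**good, "caller_did": "did:plc:acme-corp"}

print(upvote_counts(good, "did:plc:acme-corp"),
      upvote_counts(selfie, "did:plc:acme-corp"))
```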
Because the registry is a Dolt database anyone can clone, anyone can build discovery algorithms over the data:
Clone the Dolt registry, write your own ranking SQL, expose it as an API. Competition between discovery algorithms improves quality for everyone.
Every table gets Git-style version control for free. Time-travel queries, schema diffs, branch-and-merge for tool configurations, and a tamper-evident audit trail.
```sql
-- Tool definitions (immutable, content-addressed)
-- Append-only: rows are never updated or deleted
CREATE TABLE tool_definitions (
    content_hash    VARCHAR(64) PRIMARY KEY,  -- sha256 of (schema + invocation + provider)
    provider_domain VARCHAR(255) NOT NULL,
    provider_did    VARCHAR(255),             -- Tier 2, nullable
    schema_json     JSON NOT NULL,
    invocation_json JSON NOT NULL,
    capabilities    JSON,
    created_at      DATETIME
);

-- Tool listings (mutable, human-readable metadata)
-- Points to a tool_definition via content_hash
CREATE TABLE tool_listings (
    id              VARCHAR(255) PRIMARY KEY,
    definition_hash VARCHAR(64) NOT NULL,     -- points to tool_definitions.content_hash
    provider_domain VARCHAR(255) NOT NULL,
    provider_did    VARCHAR(255),             -- Tier 2, nullable
    name            VARCHAR(255) NOT NULL,
    version_label   VARCHAR(32),              -- cosmetic, like a Git tag
    description     TEXT,
    pricing_json    JSON NOT NULL,
    payment_json    JSON NOT NULL,
    sla_json        JSON,
    capabilities    JSON,
    created_at      DATETIME,
    updated_at      DATETIME,
    FOREIGN KEY (definition_hash) REFERENCES tool_definitions(content_hash)
);

-- Upvotes (proof-of-use quality signals)
CREATE TABLE upvotes (
    id            VARCHAR(255) PRIMARY KEY,
    tool_id       VARCHAR(255) NOT NULL,
    caller_domain VARCHAR(255) NOT NULL,
    quality_score INT,
    proof_json    JSON NOT NULL,
    context_json  JSON,
    created_at    DATETIME,
    FOREIGN KEY (tool_id) REFERENCES tool_listings(id)
);

-- Invocation ledger
CREATE TABLE invocations (
    id              VARCHAR(255) PRIMARY KEY,
    tool_id         VARCHAR(255) NOT NULL,
    definition_hash VARCHAR(64) NOT NULL,  -- exact definition called (immutable pin)
    input_hash      VARCHAR(64) NOT NULL,
    output_hash     VARCHAR(64),
    payment_proof   VARCHAR(500),
    latency_ms      INT,
    created_at      DATETIME
);

-- Reputation (materialized view, recomputed periodically)
CREATE TABLE reputation (
    tool_id          VARCHAR(255) PRIMARY KEY,
    verified_upvotes INT DEFAULT 0,
    avg_quality      DECIMAL(3,2),
    unique_callers   INT DEFAULT 0,
    computed_at      DATETIME
);
```
1. Company already has their tool running (API, MCP server, whatever)
2. They submit a tool record (JSON) to the Dolt registry — schema, endpoint, pricing, payment
3. They verify domain ownership via a DNS TXT record or `.well-known`
That's it. No SDK. No middleware. No infrastructure changes.
1. Agent needs fraud detection for a financial analysis task
2. Queries the Dolt-backed registry: `capabilities LIKE '%fraud%' ORDER BY reputation DESC`
3. Gets ranked tools, validates the schema matches its needs
4. Reads the payment field, calls the tool — gateway reports usage to Stripe
5. Gets the result, validates it against the schema, creates invocation + upvote records
Quality → Visibility → Usage → Revenue → Quality ∞
Companies start with the simplest possible on-ramp and upgrade when the value is proven.
Both tiers write to the same Dolt tables. A Tier 1 tool and a Tier 2 tool sit side by side. The only difference is whether `at_uri` and `provider_did` are populated. Agents don't care — they see the same schema, same pricing, same endpoint.
| You Know This | ToolShed Equivalent |
|---|---|
| DNS | Tool discovery — resolve a capability to an endpoint |
| TLS Certificates | Domain verification / DIDs — prove your identity |
| npm Registry | Tool registry — search, install, version |
| App Store Ratings | Reputation — but only from verified purchasers |
| Stripe Connect | Payment — provider declares how to be paid |
| Google PageRank | Discovery algorithms — anyone can rank differently |
| Git + GitHub | Dolt + DoltHub — version control for registry & ledger |
npm + Stripe + Dolt, for AI agent tool calls, with AT Protocol's data philosophy.
Who defines `com.toolshed.*` lexicons? A foundation? A GitHub org? Follow AT Protocol norms — publish early, evolve carefully.
V1: Stripe customer ID + spending cap. V2: per-agent budgets. V3: prepaid balances. V4: autonomous agents with own funds (Lightning, Cashu).
When tool input/output changes, how do we handle backward compatibility? Follow lexicon rules — additive only, breaking changes = new name.
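The additive-only rule can be sketched as a compatibility check; this predicate is an illustration of the idea, not part of any spec.

```python
# Sketch of the "additive only" lexicon rule: a new schema version is
# backward compatible iff every existing field survives with its type
# unchanged; new fields may be added freely. Anything else is a breaking
# change and gets a new name (and a new content hash).

def is_additive(old: dict, new: dict) -> bool:
    return all(
        field in new and new[field]["type"] == spec["type"]
        for field, spec in old.items()
    )

v1 = {"transaction_id": {"type": "string"}, "amount": {"type": "number"}}
v2 = {**v1, "currency": {"type": "string"}}  # added a field: compatible
v3 = {"transaction_id": {"type": "string"}}  # dropped a field: breaking

print(is_additive(v1, v2), is_additive(v1, v3))
```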
Invocation logs contain sensitive data. Dolt ledger could be local-only, with only upvotes going to the shared registry.
What if a tool takes payment but returns garbage? Proof-of-use creates a public record. Low quality + valid payment = strong signal. Formal resolution is TBD.
Who runs relays that index tool records? Same model as AT Protocol relays — some public, some private, some subsidized by providers.