
Budget and Usage Reporting

The SDK enforces three budget layers, checked in order before any network call is made. A budget-exceeded condition never reaches the wire.

| Layer | Scope | Storage | Lifecycle |
|---|---|---|---|
| Per-request | Single Fetch call | Comparison only (no state) | Instant |
| Per-session | Client instance lifetime | In-memory counter | Resets on NewClient |
| Per-period | Calendar period (e.g. 30 days) | Local file or Redis | Survives restarts |

type Budget struct {
	// Maximum cost per individual request. Transactions above this
	// are rejected before querying Exchanges.
	MaxPerRequest float64

	// Maximum cumulative spend per session (in-memory, resets on restart).
	MaxPerSession float64

	// Maximum cumulative spend per period (persisted).
	MaxPerPeriod float64

	// Period duration for MaxPerPeriod (e.g. 720h = 30 days).
	Period time.Duration

	// Budget scope identifier for per-period tracking.
	// E.g. "user:u-12345" for per-user, "team:eng" for per-team.
	// Required when MaxPerPeriod is set.
	Scope string

	// ISO 4217 currency code. Default: "USD".
	Currency string
}

Per-request is the simplest layer. Before querying any Exchange, the SDK checks whether the MaxPerRequest budget can accommodate the request. After offer selection, it checks the actual offer price against the same limit:

// Pre-flight: is this request even possible?
if !budget.CanSpend(config.Budget.MaxPerRequest) {
	return BudgetExceededError{Layer: "per_request", ...}
}
// Post-selection: does the specific offer fit?
if offer.Pricing.Rate > config.Budget.MaxPerRequest {
	return BudgetExceededError{Layer: "per_request", ...}
}

The per-session layer tracks cumulative spend across the lifetime of a Client instance. The counter lives only in memory, so it starts at zero each time NewClient creates a client.

Thread safety: the session counter uses atomic.AddInt64.

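// Before transaction. sessionSpent is fixed-point (1 unit = 1/10000 of the
// currency), so convert it before comparing against the float64 limit.
spent := float64(atomic.LoadInt64(&sessionSpent)) / 10000
if spent + offer.Pricing.Rate > config.Budget.MaxPerSession {
	return BudgetExceededError{Layer: "per_session", ...}
}
// After successful transaction (round rather than truncate when converting)
atomic.AddInt64(&sessionSpent, int64(math.Round(txn.Cost.Amount * 10000))) // fixed-point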
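
A separate load followed by a later add leaves a window in which two goroutines can both pass the check. The sketch below shows one way to close that window with a compare-and-swap loop. It is an illustration under assumptions, not the SDK's implementation: the sessionBudget type and its methods are hypothetical names, and it reserves the cost before the transaction rather than recording it afterwards, so a failed transaction would need a matching release step.

```go
package main

import (
	"fmt"
	"math"
	"sync/atomic"
)

// scale converts currency amounts to fixed-point units (1 unit = 1/10000).
const scale = 10000

// sessionBudget is a hypothetical type; the SDK's internal names may differ.
type sessionBudget struct {
	spent int64 // fixed-point, accessed only through sync/atomic
	limit int64 // fixed-point
}

func toFixed(amount float64) int64 { return int64(math.Round(amount * scale)) }

func newSessionBudget(maxPerSession float64) *sessionBudget {
	return &sessionBudget{limit: toFixed(maxPerSession)}
}

// reserve atomically adds cost to the counter if the total stays within the
// limit, and reports whether the reservation succeeded.
func (b *sessionBudget) reserve(cost float64) bool {
	c := toFixed(cost)
	for {
		cur := atomic.LoadInt64(&b.spent)
		if cur+c > b.limit {
			return false
		}
		if atomic.CompareAndSwapInt64(&b.spent, cur, cur+c) {
			return true
		}
		// Another goroutine changed the counter; reload and try again.
	}
}

// spentAmount returns the current session spend as a currency amount.
func (b *sessionBudget) spentAmount() float64 {
	return float64(atomic.LoadInt64(&b.spent)) / scale
}

func main() {
	b := newSessionBudget(0.50)
	for i := 1; i <= 6; i++ {
		fmt.Printf("request %d accepted: %v\n", i, b.reserve(0.10))
	}
	fmt.Println("session spend:", b.spentAmount())
}
```

With a 0.50 limit and 0.10 per request, five reservations fit exactly and the sixth is refused, because the check uses a strict greater-than comparison.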

The per-period layer tracks cumulative spend across a calendar period (e.g., 30 days). The running total is persisted so that it survives restarts.

Default persistence: local JSON file at ~/.ramp/budget/<scope>.json:

u-12345.json
{
  "scope": "user:u-12345",
  "period_start": "2026-03-01T00:00:00Z",
  "period_duration": "720h",
  "currency": "USD",
  "spent": 4.27,
  "limit": 50.00
}

Multi-process agents: configure a shared Redis instance via Config.Budget.RedisURL for shared budget state across processes.

Thread safety: the period tracker uses a mutex for read-check-write.
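
The read-check-write cycle under a mutex, combined with the JSON file above, might look roughly like the following. This is a minimal sketch under assumptions: periodTracker, newPeriodTracker, loadPeriodState, and spend are hypothetical names, and the rule that an elapsed period restarts with zero spend is an assumption the text does not state.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"time"
)

// periodState follows the JSON layout shown above.
type periodState struct {
	Scope          string    `json:"scope"`
	PeriodStart    time.Time `json:"period_start"`
	PeriodDuration string    `json:"period_duration"`
	Currency       string    `json:"currency"`
	Spent          float64   `json:"spent"`
	Limit          float64   `json:"limit"`
}

// periodTracker is a hypothetical type guarding the state with a mutex.
type periodTracker struct {
	mu    sync.Mutex
	path  string
	state periodState
}

func newPeriodTracker(path, scope, period string, limit float64) *periodTracker {
	return &periodTracker{path: path, state: periodState{
		Scope: scope, PeriodStart: time.Now().UTC(),
		PeriodDuration: period, Currency: "USD", Limit: limit,
	}}
}

// loadPeriodState reads a previously persisted state file.
func loadPeriodState(path string) (periodState, error) {
	var s periodState
	data, err := os.ReadFile(path)
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(data, &s)
	return s, err
}

// spend runs read-check-write under one lock and persists before committing,
// so a failed write leaves the in-memory state unchanged.
func (t *periodTracker) spend(cost float64) (bool, error) {
	t.mu.Lock()
	defer t.mu.Unlock()

	next := t.state
	period, err := time.ParseDuration(next.PeriodDuration)
	if err != nil {
		return false, err
	}
	if now := time.Now().UTC(); !now.Before(next.PeriodStart.Add(period)) {
		next.PeriodStart, next.Spent = now, 0 // assumed rollover rule
	}
	if next.Spent+cost > next.Limit {
		return false, nil
	}
	next.Spent += cost
	data, err := json.MarshalIndent(next, "", "  ")
	if err != nil {
		return false, err
	}
	if err := os.WriteFile(t.path, data, 0o600); err != nil {
		return false, err
	}
	t.state = next
	return true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "ramp-budget")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "u-12345.json")

	t := newPeriodTracker(path, "user:u-12345", "720h", 1.00)
	fmt.Println(t.spend(0.75)) // fits within the 1.00 limit
	fmt.Println(t.spend(0.50)) // would bring the total to 1.25

	// Simulated restart: a new tracker picks up the persisted total.
	saved, err := loadPeriodState(path)
	if err != nil {
		panic(err)
	}
	restarted := &periodTracker{path: path, state: saved}
	fmt.Println(restarted.spend(0.50))
}
```

A mutex only serializes goroutines inside one process. Separate processes writing the same file would still race, which is the case the shared Redis option addresses.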

When any budget layer is exceeded, the SDK returns a typed error before making any network call:

type BudgetExceededError struct {
	Layer     string // "per_request", "per_session", "per_period"
	Limit     float64
	Current   float64
	Requested float64
	Currency  string
}

  • Budget limits are enforced client-side. They protect the agent operator from runaway spend.
  • The Exchange independently enforces its own credit/balance limits server-side.
  • Both layers must agree for a transaction to proceed.
  • The SDK checks budget BEFORE making any network call to the Exchange.
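
To make the ordering of the three layers and the caller-side handling concrete, here is a small self-contained sketch. It assumes the error is returned by value and that all three limits are set; the limits and spent types and the check function are hypothetical and exist only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// BudgetExceededError mirrors the struct documented above.
type BudgetExceededError struct {
	Layer     string
	Limit     float64
	Current   float64
	Requested float64
	Currency  string
}

func (e BudgetExceededError) Error() string {
	return fmt.Sprintf("%s budget exceeded: %.2f spent + %.2f requested > %.2f %s",
		e.Layer, e.Current, e.Requested, e.Limit, e.Currency)
}

// limits and spent are hypothetical holders for the configured caps and the
// running totals of the two stateful layers.
type limits struct{ perRequest, perSession, perPeriod float64 }
type spent struct{ session, period float64 }

// check walks the layers in order and returns the first one that the
// requested cost would exceed. No network call is involved.
func check(l limits, s spent, cost float64, currency string) error {
	layers := []struct {
		name           string
		limit, current float64
	}{
		{"per_request", l.perRequest, 0},
		{"per_session", l.perSession, s.session},
		{"per_period", l.perPeriod, s.period},
	}
	for _, layer := range layers {
		if layer.current+cost > layer.limit {
			return BudgetExceededError{
				Layer: layer.name, Limit: layer.limit,
				Current: layer.current, Requested: cost, Currency: currency,
			}
		}
	}
	return nil
}

func main() {
	l := limits{perRequest: 0.10, perSession: 0.50, perPeriod: 50.00}
	err := check(l, spent{session: 0.45, period: 4.27}, 0.10, "USD")

	var be BudgetExceededError
	if errors.As(err, &be) {
		// The Layer field tells the caller which limit to raise or wait out.
		fmt.Println("blocked by layer:", be.Layer)
	}
}
```
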
func TestBudgetTracker_PerSession(t *testing.T) {
	bt := budget.NewTracker(budget.Config{
		MaxPerRequest: 0.10,
		MaxPerSession: 0.50,
		Currency:      "USD",
	})
	// First five requests: ok (5 x 0.10 = 0.50 reaches the limit without exceeding it)
	for i := 0; i < 5; i++ {
		require.NoError(t, bt.Check(0.10))
		bt.Record(budget.Cost{Amount: 0.10, Currency: "USD"})
	}
	// Sixth request: would exceed session limit
	err := bt.Check(0.10)
	// Assumes the error is returned by value, as in the snippets above;
	// declare a *ramp.BudgetExceededError target if it is returned by pointer.
	var budgetErr ramp.BudgetExceededError
	require.ErrorAs(t, err, &budgetErr)
	assert.Equal(t, "per_session", budgetErr.Layer)
}

The SDK automatically submits usage reports after each successful Fetch. This is the default behavior (Config.Reporting.AutoReport = true).

  1. After a successful content fetch, the SDK enqueues a UsageReport in an in-memory bounded channel.
  2. A background goroutine drains the channel and submits reports to the appropriate Exchange via ReportUsage RPC.
  3. Reports approaching their deadline are prioritized.
  4. Failed submissions are re-enqueued with exponential backoff.
  5. Reporting never blocks the Fetch call.

type ReportingConfig struct {
	// Whether to auto-submit usage reports after each Fetch. Default: true.
	AutoReport bool

	// Maximum number of pending reports. When the queue is full, the oldest
	// report is dropped; Fetch never blocks. Default: 1000.
	MaxPendingReports int

	// Retry interval for failed report submissions. Default: 30s.
	RetryInterval time.Duration
}

type pendingReport struct {
	Report   *rampv1.UsageReport
	Deadline time.Time // obligation window end
	Retries  int
	NextTry  time.Time
}
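
Deadline-based prioritization of pending reports can be sketched with the standard container/heap package. This is an illustration only: the text does not say which data structure the SDK uses, the pendingReport here is trimmed to the fields needed for ordering (TransactionID stands in for the full report), and reportQueue and drainOrder are hypothetical names.

```go
package main

import (
	"container/heap"
	"fmt"
	"time"
)

// pendingReport is a trimmed stand-in for the struct above.
type pendingReport struct {
	TransactionID string
	Deadline      time.Time
}

// reportQueue is a min-heap ordered by deadline: the report whose
// obligation window ends soonest is popped first.
type reportQueue []*pendingReport

func (q reportQueue) Len() int           { return len(q) }
func (q reportQueue) Less(i, j int) bool { return q[i].Deadline.Before(q[j].Deadline) }
func (q reportQueue) Swap(i, j int)      { q[i], q[j] = q[j], q[i] }
func (q *reportQueue) Push(x any)        { *q = append(*q, x.(*pendingReport)) }
func (q *reportQueue) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	old[n-1] = nil // drop the reference so the report can be collected
	*q = old[:n-1]
	return item
}

// drainOrder empties the queue and returns transaction IDs in the order
// the reports would be submitted.
func drainOrder(q *reportQueue) []string {
	heap.Init(q)
	var ids []string
	for q.Len() > 0 {
		ids = append(ids, heap.Pop(q).(*pendingReport).TransactionID)
	}
	return ids
}

func main() {
	now := time.Now()
	q := &reportQueue{
		{TransactionID: "txn-late", Deadline: now.Add(24 * time.Hour)},
		{TransactionID: "txn-soon", Deadline: now.Add(5 * time.Minute)},
		{TransactionID: "txn-mid", Deadline: now.Add(2 * time.Hour)},
	}
	fmt.Println(drainOrder(q))
}
```
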
  • Capacity: bounded channel, default 1000 pending reports
  • Overflow: if the channel is full, the oldest report is dropped and a warning is logged. This is a soft failure: the Exchange may eventually block the agent for overdue reports (DENIAL_REASON_REPORTING_OVERDUE), but the current fetch is not affected.
  • Priority: reports approaching their deadline are submitted first
  • Retry: exponential backoff (30s, 60s, 120s, and so on), capped at 10 minutes
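
The overflow and retry rules in the list above can be sketched as two small functions. Both are hypothetical helpers written for illustration: enqueue uses plain strings in place of usage reports, and backoff assumes the 30s starting delay is the RetryInterval default.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseRetry = 30 * time.Second // assumed to match the RetryInterval default
	maxRetry  = 10 * time.Minute
)

// backoff returns the wait before retry number retries (0-based):
// 30s, 60s, 120s, and so on, capped at 10 minutes.
func backoff(retries int) time.Duration {
	d := baseRetry
	for i := 0; i < retries && d < maxRetry; i++ {
		d *= 2
	}
	if d > maxRetry {
		d = maxRetry
	}
	return d
}

// enqueue adds r to a bounded channel without ever blocking. When the
// channel is full it discards the oldest entry to make room and reports
// that a drop happened, so the caller can log a warning.
func enqueue(ch chan string, r string) (dropped bool) {
	for {
		select {
		case ch <- r:
			return dropped
		default:
		}
		select {
		case <-ch: // discard the oldest pending entry
			dropped = true
		default: // a consumer drained the channel meanwhile; retry the send
		}
	}
}

func main() {
	for retries := 0; retries <= 5; retries++ {
		fmt.Println(retries, backoff(retries))
	}

	ch := make(chan string, 2)
	for _, id := range []string{"txn-1", "txn-2", "txn-3"} {
		if enqueue(ch, id) {
			fmt.Println("queue full, dropped oldest report while adding", id)
		}
	}
	fmt.Println(<-ch, <-ch)
}
```

Dropping the oldest entry keeps the producer side non-blocking, at the cost of possibly losing the report closest to its deadline; a deadline-ordered queue would let the implementation choose a different victim.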

UsageReportResponse now returns a report_id, an Exchange-assigned identifier for the accepted report. The SDK stores this on FetchResult.ReportID.

The report_id is required for filing disputes. The dispute chain enforces reporting-before-disputing:

UsageReport -> UsageReportResponse{report_id} -> DisputeRequest{report_id}

If the agent needs to dispute a transaction, it must have a valid report_id. The SDK tracks this automatically when AutoReport is enabled.

The Usage message now includes structured attribution reporting via CitationFormat and AttributionDetail:

type Attribution struct {
	Format  CitationFormat      // e.g., CITATION_FORMAT_INLINE, CITATION_FORMAT_FOOTNOTE
	Details []AttributionDetail // per-asset attribution specifics
}

The SDK populates attribution fields in the UsageReport when the caller provides them via FetchResult or ReportUsage.

When AutoReport is disabled, the caller is responsible for submitting usage reports:

resp, err := client.ReportUsage(ctx, ramp.UsageReport{
	TransactionID: result.TransactionID,
	BillingID:     result.BillingID,
	Function:      "FUNCTION_AI_INPUT",
	TokenCount:    1500,
})
// resp.ReportID is needed for any subsequent dispute

In v1.0, ReportUsage returns a UsageReportResult containing ReportID, as in the call above. Previously it returned only an error, and the pre-v1.0 call looked like this:

err := client.ReportUsage(ctx, ramp.UsageReport{
	TransactionID: result.TransactionID,
	BillingID:     result.BillingID,
	Function:      "FUNCTION_AI_INPUT",
	TokenCount:    1500,
})

Each transaction may carry a ReportingObligation that specifies:

  • Whether reporting is required or optional
  • The reporting window (deadline by which the report must be submitted)
  • What fields are required in the report

The SDK tracks these obligations automatically. If an Exchange denies a future transaction with DENIAL_REASON_REPORTING_OVERDUE, the SDK flushes pending reports and retries.

client.Close(ctx) blocks until all pending usage reports are submitted or the context is cancelled:

// Graceful shutdown -- flushes all pending reports
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
err := client.Close(ctx)

If the context is cancelled before all reports are submitted, remaining reports are lost. The Exchange will track them as overdue.
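
The blocking behaviour of Close can be pictured as a race between "all reports done" and "context done". The sketch below is an assumption-laden illustration, not the SDK's code: reporter, submitAsync, and closeWithin are hypothetical names, a sleep stands in for the ReportUsage RPC, and returning ctx.Err() on cancellation is assumed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// reporter is a hypothetical stand-in for the SDK's reporting loop.
type reporter struct {
	wg sync.WaitGroup // one count per in-flight report
}

// submitAsync simulates one report submission that takes d to complete.
func (r *reporter) submitAsync(d time.Duration) {
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		time.Sleep(d) // stands in for the ReportUsage RPC
	}()
}

// Close blocks until every in-flight report finishes or ctx is done,
// whichever happens first.
func (r *reporter) Close(ctx context.Context) error {
	done := make(chan struct{})
	go func() {
		r.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return ctx.Err() // remaining reports are abandoned
	}
}

// closeWithin is a convenience wrapper that applies a timeout to Close.
func closeWithin(r *reporter, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return r.Close(ctx)
}

func main() {
	fast := &reporter{}
	fast.submitAsync(50 * time.Millisecond)
	fmt.Println("fast flush error:", closeWithin(fast, 30*time.Second))

	slow := &reporter{}
	slow.submitAsync(5 * time.Second)
	err := closeWithin(slow, 100*time.Millisecond)
	fmt.Println("slow flush timed out:", errors.Is(err, context.DeadlineExceeded))
}
```

Choosing the timeout is a trade-off: too short and reports become overdue at the Exchange, too long and shutdown stalls behind an unreachable Exchange.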

The SDK emits structured log events for every significant operation:

client, err := ramp.NewClient(ramp.Config{
	// ...
	Logger: slog.New(slog.NewJSONHandler(os.Stderr, nil)),
})
if err != nil {
	log.Fatal(err)
}

| Event | Level | Fields |
|---|---|---|
| ramp.fetch.start | Info | url, domain |
| ramp.discovery.cache_hit | Debug | domain, exchange_count |
| ramp.discovery.cache_miss | Info | domain |
| ramp.discovery.ramp_json | Info | domain, exchange_count, latency_ms |
| ramp.supply.query | Info | exchange, uri_count, latency_ms |
| ramp.selection.winner | Info | exchange, offer_id, unit_cost, subscription_id |
| ramp.budget.check | Debug | layer, limit, current, requested |
| ramp.budget.exceeded | Warn | layer, limit, current, requested |
| ramp.transaction.execute | Info | exchange, offer_id, latency_ms |
| ramp.transaction.denied | Warn | exchange, reason |
| ramp.content.fetch | Info | url, status_code, latency_ms, content_length |
| ramp.report.enqueue | Debug | transaction_id, deadline |
| ramp.report.submit | Info | transaction_id, accepted |
| ramp.report.failed | Warn | transaction_id, error, retry_count |

The SDK exposes Prometheus-compatible metrics via an optional metrics.Handler:

| Metric | Type | Description |
|---|---|---|
| ramp_fetch_total | Counter | Fetches by status (success, budget_exceeded, no_exchange, denied, error) |
| ramp_fetch_duration_seconds | Histogram | End-to-end Fetch latency |
| ramp_supply_query_duration_seconds | Histogram | Per-Exchange query latency |
| ramp_transaction_duration_seconds | Histogram | Per-Exchange transaction latency |
| ramp_budget_spent_total | Counter | Cumulative spend by currency |
| ramp_budget_remaining | Gauge | Remaining budget by scope |
| ramp_reports_pending | Gauge | Pending usage reports |
| ramp_reports_overdue | Gauge | Reports past their deadline |

  • The SigningKey (Ed25519 private key) is stored in the Config struct and never serialized, logged, or included in error messages.
  • Ed25519 signatures are computed per-request. The private key produces agent_signature on ResourceQuery and TransactionRequest messages.
  • The Exchange verifies signatures using the agent’s registered public key (looked up by LicenseID). The private key never leaves the agent.
  • All Exchange RPCs use HTTPS (TLS 1.2+)
  • Content fetch (signed URL) uses HTTPS
  • ramp.json discovery uses HTTPS
  • No plaintext HTTP by default, even for localhost development; http://localhost is accepted only with explicit opt-in
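
The per-request signing described in the list above can be illustrated with Go's standard crypto/ed25519 package. This is a sketch under assumptions: the text does not specify which bytes of a ResourceQuery or TransactionRequest are signed, so the payload here is an arbitrary placeholder, and newKeyPair, signRequest, and verifyRequest are hypothetical helper names.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// newKeyPair generates an Ed25519 key pair. In the flow described above,
// the public half is registered with the Exchange and the private half
// stays with the agent.
func newKeyPair() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// signRequest produces the signature the agent would attach to a request.
// payload stands in for the serialized message bytes.
func signRequest(priv ed25519.PrivateKey, payload []byte) []byte {
	return ed25519.Sign(priv, payload)
}

// verifyRequest is the Exchange-side check against the registered public key.
func verifyRequest(pub ed25519.PublicKey, payload, sig []byte) bool {
	return ed25519.Verify(pub, payload, sig)
}

func main() {
	pub, priv, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	payload := []byte("placeholder for serialized request bytes")
	sig := signRequest(priv, payload)

	fmt.Println("signature length:", len(sig))
	fmt.Println("valid:", verifyRequest(pub, payload, sig))
	fmt.Println("tampered payload valid:", verifyRequest(pub, []byte("tampered"), sig))
}
```

Only the signature and the message travel to the Exchange; verification needs nothing beyond the public key, which is why the private key never has to leave the agent.
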
| Category | What It Tests | Tooling |
|---|---|---|
| Unit: selection | unit_cost ranking, subscription preference, dedup | Table-driven tests, no I/O |
| Unit: budget | Per-request/session/period enforcement, edge cases | In-memory tracker |
| Unit: reporting | Queue overflow, deadline priority, retry backoff | Fake clock |
| Integration: mock | Full Fetch flow against MockExchange | testutil.NewMockExchange |
| Integration: reference | Full Fetch flow against reference Exchange | Real Connect server |
| E2E | SDK to reference Exchange to reference CDN to report | Docker compose |