Production integrations fail in predictable ways. This page lists the failure modes we’ve observed across partners and the recommended handling for each.

Chain RPC unavailable

What happens. Our upstream RPC (Alchemy / public node) drops requests or returns stale data. Affected endpoints: /position, /health, and lifecycle event delivery (events queue up but emit late).
What you see.
  • /position and /health return 503 with code: "chain_rpc_unavailable"
  • Lifecycle events arrive minutes-to-hours late (no event loss; we replay from the last confirmed block once RPC recovers)
Handling.
  • For /position: cache the last successful read with a timestamp; show “live data temporarily unavailable, last updated 3 min ago” rather than spinning forever
  • For /health: fall back to “treating as healthy” if your last successful response was within ~5 min, otherwise show maintenance banner
  • For lifecycle events: don’t show a “deposit failed” UI just because the event is late. Use the heuristic in Tx lost vs late below
  • Retry strategy: exponential backoff starting at 30s, max 5 minutes between attempts. Don’t retry tighter — you’ll just amplify the upstream incident
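The handling rules above can be sketched roughly as follows. This is a minimal illustration, not a reference client: the doubling factor, the function names, and the `CachedRead` type are assumptions made for the example. Only the 30s start, the 5 minute cap, and the ~5 min health grace window come from the guidance above.

```python
import time
from dataclasses import dataclass
from typing import Iterator, Optional


def backoff_delays(initial_s: float = 30.0, max_s: float = 300.0) -> Iterator[float]:
    """Yield retry delays in seconds: 30, 60, 120, 240, then capped at 300.

    Doubling is an assumed growth factor; the guidance only fixes start and cap.
    """
    delay = initial_s
    while True:
        yield min(delay, max_s)
        delay = min(delay * 2, max_s)


@dataclass
class CachedRead:
    payload: dict      # last successful /position body
    fetched_at: float  # Unix seconds of that successful read


def staleness_label(cached: CachedRead, now: Optional[float] = None) -> str:
    """Banner text to show next to cached /position data during an outage."""
    now = time.time() if now is None else now
    minutes = int((now - cached.fetched_at) // 60)
    return f"Live data temporarily unavailable, last updated {minutes} min ago"


def health_fallback(last_ok_at: Optional[float], now: Optional[float] = None,
                    grace_s: float = 300.0) -> str:
    """Treat the product as healthy if the last good /health read is recent."""
    now = time.time() if now is None else now
    if last_ok_at is not None and now - last_ok_at <= grace_s:
        return "healthy"
    return "maintenance"
```

Passing `now` explicitly keeps the helpers deterministic in tests; in production code you would leave it unset.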

Tx revert

What happens. The user’s deposit / redeem transaction reverts on-chain. Common causes: insufficient allowance, vault paused between simulation and submission, slippage on async vaults.
What you see.
  • The wallet returns the tx hash, but the receipt has status: 0
  • No lifecycle webhook fires — we only emit on successful logs
Handling.
  • Read the receipt from your own RPC after submission; don’t wait for a webhook to learn the tx failed
  • If a revert reason is available, map it to a user-facing message. Common ones: "ERC4626: insufficient allowance" → “Please approve the token first”, "Pausable: paused" → “This vault is temporarily unavailable”
  • Don’t mark a tx as failed based solely on webhook silence — see next section
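A minimal sketch of the receipt check and the message map, assuming you already have the receipt as a dict and the revert reason as a string (how you obtain the reason depends on your RPC client). The fallback copy and the function names are hypothetical.

```python
from typing import Optional

# Known revert reasons mapped to user-facing copy (the two examples listed above).
REVERT_MESSAGES = {
    "ERC4626: insufficient allowance": "Please approve the token first",
    "Pausable: paused": "This vault is temporarily unavailable",
}

# Hypothetical fallback copy for reasons that are missing or unrecognized.
GENERIC_MESSAGE = "Transaction failed. Please try again."


def tx_failed(receipt: dict) -> bool:
    """A mined receipt with status 0 means the transaction reverted."""
    status = receipt.get("status")
    # Raw JSON-RPC returns status as a hex string ("0x0"); some clients decode it to int.
    if isinstance(status, str):
        status = int(status, 16)
    return status == 0


def user_message(revert_reason: Optional[str]) -> str:
    """Translate a revert reason into copy the user can act on."""
    if revert_reason:
        for needle, message in REVERT_MESSAGES.items():
            # Substring match, since clients often wrap the reason in extra text.
            if needle in revert_reason:
                return message
    return GENERIC_MESSAGE
```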

Tx lost vs late

The problem. You submitted a tx, got a hash, but minutes later neither a webhook nor a receipt has arrived. Did it fail, get dropped, or is the indexer just slow?
Heuristic.
elapsed_since_submit > confirmation_depth_blocks * 2 * block_time
Past that threshold, treat as lost. Concretely:
Chain | Threshold
Ethereum mainnet | 6 min
Base | 2 min
Arbitrum | 30 sec
Optimism | 2 min
Polygon | 4 min
Before the threshold, show “Confirming…” and keep polling the receipt. After the threshold without a receipt, show “Transaction may have been dropped — please check your wallet” and offer a re-submit. Do not auto-retry — the user might have re-broadcast at higher gas already.
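The heuristic can be sketched as a small classifier. The thresholds are the table values converted to seconds; the chain keys, state names, and function names are assumptions for illustration. The depth and block time used in the usage note are hypothetical inputs that happen to reproduce the 6 min row, not documented parameters.

```python
# Thresholds in seconds, taken from the table above. Chain keys are hypothetical.
LOST_THRESHOLD_S = {
    "ethereum": 360,
    "base": 120,
    "arbitrum": 30,
    "optimism": 120,
    "polygon": 240,
}


def lost_threshold_s(confirmation_depth_blocks: int, block_time_s: float) -> float:
    """Right-hand side of the heuristic: depth * 2 * block time."""
    return confirmation_depth_blocks * 2 * block_time_s


def classify_pending(chain: str, elapsed_s: float, has_receipt: bool) -> str:
    """Decide what the UI should show for a submitted tx."""
    if has_receipt:
        return "resolved"       # inspect the receipt status next
    if elapsed_s > LOST_THRESHOLD_S[chain]:
        return "maybe_dropped"  # offer a manual re-submit, never auto-retry
    return "confirming"         # keep polling the receipt
```

For example, `lost_threshold_s(15, 12.0)` gives 360 seconds with an assumed depth of 15 blocks and a 12 second block time.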

Tx replacement (speed-up / cancel)

What happens. The user’s wallet submits a replacement tx with the same nonce but higher gas (speed-up) or zero value (cancel). Only one tx wins.
What you see.
  • If the replacement is a successful deposit / redeem at higher gas: lifecycle webhook fires for the replacement tx hash, not the original
  • If the replacement is a cancel (zero-value): no event fires; your “pending tx” entry has no resolution
Handling.
  • Don’t pin your pending state to one specific tx hash. Pin it to (user, nonce, intent) and update when any tx with that nonce confirms
  • For cancel detection: if a tx with the same nonce confirms with a different "to" address (typically the user’s own address), treat the original as cancelled
  • Most partners can ignore replacement entirely — the eventual webhook (if any) and the /position endpoint will reconcile
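If you do track replacements, one possible shape for the pending state is sketched below. The class and field names are hypothetical, and the store is in-memory for brevity. The lookup uses (user, nonce), since a nonce is unique per sender on a given chain, and the intent travels with the entry.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PendingTx:
    user: str            # sender address
    nonce: int
    intent: str          # e.g. "deposit" or "redeem"
    vault_address: str   # the "to" address of the original tx
    status: str = "pending"
    confirmed_hash: Optional[str] = None


class PendingStore:
    """Tracks pending intents by sender and nonce rather than by tx hash."""

    def __init__(self) -> None:
        self._by_nonce: Dict[Tuple[str, int], PendingTx] = {}

    def track(self, entry: PendingTx) -> None:
        self._by_nonce[(entry.user.lower(), entry.nonce)] = entry

    def on_confirmed(self, sender: str, nonce: int, tx_hash: str,
                     to_address: str) -> Optional[PendingTx]:
        """Resolve the entry when any tx with this sender and nonce confirms."""
        entry = self._by_nonce.get((sender.lower(), nonce))
        if entry is None or entry.status != "pending":
            return entry
        entry.confirmed_hash = tx_hash
        if to_address.lower() == entry.vault_address.lower():
            # Original or speed-up landed; still check the receipt status.
            entry.status = "confirmed"
        else:
            # Same nonce spent on a different target: treat as cancelled.
            entry.status = "cancelled"
        return entry
```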

Chain reorg

What happens. The chain rolls back blocks below the latest tip. A Deposit log that briefly existed now doesn’t.
What you see. Nothing. We hold lifecycle events until they’re at confirmation depth (see Lifecycle → Confirmation depth). Past that depth, reorgs are vanishingly rare. We do not deliver and then retract.
Handling. Trust the depth. If your business needs deeper certainty (e.g. > $1M deposits), use /position with a custom min_confirmations query param after the webhook arrives.

Vault pause mid-flight

What happens. Ops or upstream emergency triggers a pause while user deposits are in flight.
What you see.
  • A vault_pause event fires (operational webhook)
  • In-flight deposits either revert (most common) or complete normally — pause is forward-looking
  • The /api/partner/products/{slug}/health endpoint flips is_paused: true
Handling.
  • Hide the “Deposit” CTA the moment you receive vault_pause
  • For pending deposits at pause time: poll the receipt; treat reverts as user-facing failures
  • Withdrawals are typically allowed during pause; check product.status: paused allows redeem, deprecated is redeem-only forever
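As a rough sketch, the CTA logic can be driven from product.status alone. The "active" status name is an assumption (only paused and deprecated are named above), and unknown values are handled conservatively by hiding both actions.

```python
def allowed_actions(status: str) -> dict:
    """Map a product status to the CTAs the UI should show."""
    if status in ("paused", "deprecated"):
        # Redeem-only: temporarily while paused, permanently once deprecated.
        return {"deposit": False, "redeem": True}
    if status == "active":  # hypothetical name for the normal operating state
        return {"deposit": True, "redeem": True}
    # Unknown status: hide both rather than guess.
    return {"deposit": False, "redeem": False}
```

On receiving vault_pause you would re-render with the paused result immediately rather than waiting for the next status poll.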

Webhook delivery failure (your endpoint down)

What happens. Your endpoint returns 5xx, times out, or drops the connection.
What we do. Retry up to 3 attempts at 30s / 120s / 600s after the initial try. Past that, the delivery is marked failed but the row is durably stored, and you can manually replay it from the Portal.
Handling.
  • Always return 2xx quickly (under 10 sec) and process async. Long-running handlers cause us to time out and retry, which doubles your workload
  • Persist the event and its dedupe key ((event_type, chain_id, tx_hash, log_index) for lifecycle, (event_type, slug, date) for operational) before acking. If you ack first and crash before persisting, we stop retrying a delivery you already acknowledged, and you lose the event until you replay it manually
  • Set up alerting on failed deliveries in Portal — the row is preserved but no one will tell you it’s there
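The "persist, ack, then process" order can be sketched as below for lifecycle events. The payload field names are assumed to match the dedupe tuple, and the in-memory dict stands in for a durable store. A real handler would write to a database before returning and run the processing step in a background worker.

```python
import json
import queue
from typing import Callable, Dict, Tuple

Key = Tuple[str, int, str, int]


def lifecycle_key(event: dict) -> Key:
    """Dedupe key for lifecycle events (payload field names are assumed)."""
    return (event["event_type"], event["chain_id"],
            event["tx_hash"], event["log_index"])


class WebhookReceiver:
    def __init__(self, process: Callable[[dict], None]) -> None:
        self._inbox: Dict[Key, dict] = {}  # stand-in for a durable table
        self._pending = queue.Queue()      # keys waiting for async processing
        self._process = process

    def handle(self, raw_body: bytes) -> int:
        """Persist first, then ack. Returns the HTTP status to send."""
        event = json.loads(raw_body)
        key = lifecycle_key(event)
        # setdefault stores the event only if the key is new.
        stored = self._inbox.setdefault(key, event)
        if stored is event:
            self._pending.put(key)
        return 200  # duplicates are acked too, as a successful no-op

    def drain(self) -> None:
        """Process queued events; run this off the request path."""
        while True:
            try:
                key = self._pending.get_nowait()
            except queue.Empty:
                return
            self._process(self._inbox[key])
```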

Idempotency reminder

Every event family has a deterministic dedupe key. We may send the same delivery more than once due to network failures, our retries, or partner-initiated replay. Your handler must be idempotent.
  • Lifecycle events: (event_type, chain_id, tx_hash, log_index) — uniquely identifies a chain log
  • apy_change / tvl_alert: (event_type, slug, date)
  • vault_pause: (event_type, slug) — re-emitted while paused
If your DB is your dedupe store, use a unique index on these tuples and treat INSERT ... ON CONFLICT DO NOTHING as a successful no-op. Never branch on “have I seen this before” via a SELECT-then-INSERT pattern — race conditions will let duplicates through.
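A minimal sketch of the unique-index approach, using SQLite (3.24 or later, for ON CONFLICT support) purely because it ships with Python. The table name, column layout, and `record_event` helper are hypothetical. The same statement shape applies to PostgreSQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS lifecycle_events (
    event_type TEXT    NOT NULL,
    chain_id   INTEGER NOT NULL,
    tx_hash    TEXT    NOT NULL,
    log_index  INTEGER NOT NULL,
    payload    TEXT    NOT NULL,
    UNIQUE (event_type, chain_id, tx_hash, log_index)
);
"""


def record_event(conn: sqlite3.Connection, event_type: str, chain_id: int,
                 tx_hash: str, log_index: int, payload: str) -> bool:
    """Insert the event; return True if it was new, False if a duplicate.

    The unique constraint decides atomically, so two concurrent deliveries
    of the same log cannot both be recorded as new.
    """
    cur = conn.execute(
        "INSERT INTO lifecycle_events "
        "(event_type, chain_id, tx_hash, log_index, payload) "
        "VALUES (?, ?, ?, ?, ?) "
        "ON CONFLICT DO NOTHING",
        (event_type, chain_id, tx_hash, log_index, payload),
    )
    conn.commit()
    return cur.rowcount == 1
```

A `False` return is the successful no-op case: ack the delivery and skip processing.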
