Backpressure & Flow Control

Audience: Internal β€” this page is for the Rulecatch team, not customer-facing documentation.

The AI-Pooler implements adaptive throttling so that API unavailability, rate limiting, and server load spikes are handled gracefully.


Overview

Before sending events, the flush script:

  1. Checks local state β€” Is a backoff timer active?
  2. Asks the server β€” How much can I send?
  3. Sends in batches β€” Respecting server-recommended sizes and delays
  4. Records results β€” Updates backoff state on success or failure

Backpressure State

The state is persisted in ~/.claude/rulecatch/.backpressure-state:

  Field                 Type     Description
  backoffLevel          number   Current backoff level (0-10)
  nextAttemptAfter      number   Timestamp when the next attempt is allowed
  lastCapacity          object   Last server capacity response
  consecutiveFailures   number   Consecutive failure count
  lastSuccessTime       number   Last successful flush timestamp
  pendingEventCount     number   Events waiting in the buffer
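
The fields above suggest a shape like the following. This is a sketch, not the actual source: the JSON key names are assumed to match the table, and the fallback-to-defaults behavior on a missing or corrupt file is an assumption.

```typescript
import { readFileSync } from "fs";
import { homedir } from "os";
import { join } from "path";

// Hypothetical typing of the persisted backpressure state,
// assuming field names map directly to JSON keys.
interface BackpressureState {
  backoffLevel: number;        // 0-10
  nextAttemptAfter: number;    // epoch ms; 0 when no backoff is active
  lastCapacity: object | null; // last server capacity response
  consecutiveFailures: number;
  lastSuccessTime: number;     // epoch ms of the last successful flush
  pendingEventCount: number;
}

const STATE_PATH = join(homedir(), ".claude", "rulecatch", ".backpressure-state");

function loadState(): BackpressureState {
  try {
    return JSON.parse(readFileSync(STATE_PATH, "utf8")) as BackpressureState;
  } catch {
    // Missing or corrupt state file: start from a clean slate (assumed behavior).
    return {
      backoffLevel: 0,
      nextAttemptAfter: 0,
      lastCapacity: null,
      consecutiveFailures: 0,
      lastSuccessTime: 0,
      pendingEventCount: 0,
    };
  }
}
```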

Exponential Backoff

When a flush fails, the backoff delay increases exponentially:

  Level   Delay          After
  0       0s             No failures
  1       2s             1st failure
  2       4s             2nd failure
  3       8s             3rd failure
  4       16s            4th failure
  5       32s            5th failure
  6       64s            6th failure
  7       128s           7th failure
  8       256s           8th failure
  9       300s (5 min)   9th failure
  10      300s (5 min)   10th+ failure

Configuration

  Setting             Value
  Base delay          1,000ms
  Max delay           300,000ms (5 minutes)
  Multiplier          2x per level
  Max backoff level   10
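
The delay column in the backoff table follows directly from these settings. A minimal sketch of the calculation (constants taken from the Configuration table; the function name is hypothetical):

```typescript
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 300_000; // 5 minutes
const MAX_BACKOFF_LEVEL = 10;

// Delay before the next attempt at a given backoff level:
// base * 2^level, capped at the max delay.
function backoffDelayMs(level: number): number {
  if (level <= 0) return 0; // level 0: no failures, no delay
  const capped = Math.min(level, MAX_BACKOFF_LEVEL);
  return Math.min(BASE_DELAY_MS * 2 ** capped, MAX_DELAY_MS);
}
```

For example, level 1 gives 2,000ms and level 9 would give 512,000ms uncapped, which the 5-minute ceiling reduces to 300,000ms, matching the table.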

Circuit Breaker

After 10 consecutive failures, the circuit breaker opens:

  • All flush attempts are blocked until the nextAttemptAfter timer expires
  • This prevents hammering a downed server
  • Events continue to accumulate in the buffer safely
  • The circuit closes when the timer expires and the next attempt succeeds
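
Since both an active backoff timer and an open circuit block flushing until nextAttemptAfter passes, one comparison covers both; a sketch of the gate (function name and reason strings are hypothetical, the threshold of 10 is from this page):

```typescript
const CIRCUIT_OPEN_THRESHOLD = 10;

// Decide whether a flush attempt is allowed right now; the reason
// string only distinguishes the two blocked cases for logging.
function flushGate(
  state: { consecutiveFailures: number; nextAttemptAfter: number },
  now: number = Date.now(),
): { allowed: boolean; reason?: string } {
  if (now < state.nextAttemptAfter) {
    const reason =
      state.consecutiveFailures >= CIRCUIT_OPEN_THRESHOLD
        ? "circuit breaker open"
        : "backing off";
    return { allowed: false, reason };
  }
  return { allowed: true };
}
```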

Server Capacity

Before flushing, the pooler asks the server how much it can handle:

POST /api/v1/ai/pooler/capacity

The server responds with:

  Field                 Description
  ready                 Whether the server can accept events
  maxBatchSize          Maximum events per batch
  delayBetweenBatches   Milliseconds to wait between batches
  retryAfter            Seconds to wait if not ready
  loadPercent           Server load (0-100%)
  message               Optional status message
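
A hypothetical typing of this handshake. The endpoint path is from this page; the base URL parameter, the request body (none), and any auth headers are assumptions:

```typescript
// Shape of the capacity response, per the field table above.
interface CapacityResponse {
  ready: boolean;
  maxBatchSize: number;
  delayBetweenBatches: number; // milliseconds
  retryAfter?: number;         // seconds; relevant when ready is false
  loadPercent: number;         // 0-100
  message?: string;
}

// Sketch of the capacity check (assumed to be a bodyless POST).
async function fetchCapacity(baseUrl: string): Promise<CapacityResponse> {
  const res = await fetch(`${baseUrl}/api/v1/ai/pooler/capacity`, { method: "POST" });
  if (!res.ok) throw new Error(`capacity check failed: HTTP ${res.status}`);
  return (await res.json()) as CapacityResponse;
}
```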

Server Load Levels

  Load %    Server Status   Recommended Batch   Delay
  0-39%     Normal          100                 100ms
  40-59%    Moderate        50                  1,000ms
  60-79%    High            20                  2,000ms
  80-94%    Very High       10                  5,000ms
  95-100%   Overloaded      0 (not ready)       retryAfter
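
The table above can be read as a threshold ladder; a sketch of how the server might derive its recommendations (thresholds and values from the table, function name hypothetical):

```typescript
// Map current server load to the recommended batch size and delay.
function recommendationForLoad(loadPercent: number): {
  ready: boolean;
  maxBatchSize: number;
  delayBetweenBatches: number;
} {
  if (loadPercent >= 95) return { ready: false, maxBatchSize: 0, delayBetweenBatches: 0 };
  if (loadPercent >= 80) return { ready: true, maxBatchSize: 10, delayBetweenBatches: 5_000 };
  if (loadPercent >= 60) return { ready: true, maxBatchSize: 20, delayBetweenBatches: 2_000 };
  if (loadPercent >= 40) return { ready: true, maxBatchSize: 50, delayBetweenBatches: 1_000 };
  return { ready: true, maxBatchSize: 100, delayBetweenBatches: 100 };
}
```

When the server is overloaded, `ready` is false and the client is expected to fall back to the `retryAfter` value instead of a batch delay.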

Flush Flow

1. Load backpressure state from disk
2. Can we attempt? (check backoff timer + circuit breaker)
   β†’ No: log reason, save state, exit
   β†’ Yes: continue

3. Ask server for capacity
   β†’ Not ready: set nextAttemptAfter, save state, exit
   β†’ Ready: get maxBatchSize + delay

4. While events remain:
   a. Take batch of min(maxBatchSize, remaining)
   b. Send batch to API
   c. Success?
      β†’ Yes: record success, reduce backoff
      β†’ No: record failure, increase backoff, stop

   d. More events?
      β†’ Wait delayBetweenBatches
      β†’ Every 100 events: re-check capacity

5. Save final state to disk
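
Steps 3 and 4 above can be condensed into a sketch. The injected `askCapacity`, `sendBatch`, and `sleep` helpers are hypothetical; only the control flow mirrors the numbered steps, and the caller is assumed to handle steps 1, 2, and 5 (state load, gate check, state save).

```typescript
async function flushEvents(
  events: unknown[],
  deps: {
    askCapacity: () => Promise<{ ready: boolean; maxBatchSize: number; delayBetweenBatches: number }>;
    sendBatch: (batch: unknown[]) => Promise<boolean>;
    sleep: (ms: number) => Promise<void>;
  },
): Promise<number> {
  let capacity = await deps.askCapacity();             // step 3
  if (!capacity.ready) return 0;                       // not ready: exit

  let sent = 0;
  while (sent < events.length) {                       // step 4
    const batch = events.slice(sent, sent + capacity.maxBatchSize); // 4a
    if (!(await deps.sendBatch(batch))) break;         // 4b/4c: failure stops the loop
    sent += batch.length;                              // 4c: success

    if (sent < events.length) {                        // 4d
      await deps.sleep(capacity.delayBetweenBatches);
      // Re-check roughly every 100 events (exact when batch sizes divide 100).
      if (sent % 100 === 0) capacity = await deps.askCapacity();
    }
  }
  return sent; // caller records success/failure and persists state
}
```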

Recovery Behavior

On Success

  • backoffLevel decreases by 1 (gradual recovery)
  • consecutiveFailures resets to 0
  • nextAttemptAfter cleared (can flush immediately)

On Failure

  • backoffLevel increases by 1 (capped at 10)
  • consecutiveFailures incremented
  • nextAttemptAfter set based on calculated delay
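
Both recovery rules can be sketched as pure transitions on the persisted state (field names from the Backpressure State section; helper names and the constants' placement are assumptions):

```typescript
const MAX_LEVEL = 10;
const BASE_MS = 1_000;
const MAX_MS = 300_000;

type RecoveryState = {
  backoffLevel: number;
  consecutiveFailures: number;
  nextAttemptAfter: number;
};

// On success: step one level down, reset failures, clear the timer.
function recordSuccess(s: RecoveryState): RecoveryState {
  return {
    backoffLevel: Math.max(0, s.backoffLevel - 1), // gradual recovery
    consecutiveFailures: 0,
    nextAttemptAfter: 0, // cleared: next flush may run immediately
  };
}

// On failure: step one level up (capped), count it, arm the timer.
function recordFailure(s: RecoveryState, now: number = Date.now()): RecoveryState {
  const level = Math.min(s.backoffLevel + 1, MAX_LEVEL);
  const delay = Math.min(BASE_MS * 2 ** level, MAX_MS);
  return {
    backoffLevel: level,
    consecutiveFailures: s.consecutiveFailures + 1,
    nextAttemptAfter: now + delay,
  };
}
```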

Rate Limited (429)

  • Double the normal backoff delay
  • Parse Retry-After header if present

Server Overloaded (503)

  • Double the normal backoff delay
  • Parse Retry-After header if present
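
A sketch of how both cases might combine the doubled delay with the header. The "take whichever is longer" tie-break is an assumption, and only the delta-seconds form of Retry-After is handled here (the HTTP-date form is omitted):

```typescript
// Pick the delay before the next attempt after an HTTP error.
function retryDelayMs(
  status: number,
  normalDelayMs: number,
  retryAfterHeader: string | null,
): number {
  // 429 and 503 double the normal backoff delay per the rules above.
  const doubled = status === 429 || status === 503 ? normalDelayMs * 2 : normalDelayMs;
  if (retryAfterHeader !== null) {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds > 0) {
      // Honor the server's explicit hint when it is longer (assumed policy).
      return Math.max(doubled, seconds * 1_000);
    }
  }
  return doubled;
}
```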

Monitoring

Check backpressure status:

npx @rulecatch/ai-pooler backpressure

Backpressure Status

Status:           Backing Off

Failures:         3 consecutive
Backoff level:    3/10
Next attempt in:  6s
Last success:     45s ago

Pending events:   12

Last Server Response:
  Ready:          Yes
  Max batch:      50
  Delay between:  500ms
  Server load:    35%

Reset

To clear backpressure state (after fixing the underlying issue):

npx @rulecatch/ai-pooler backpressure --reset=true

Buffer Safety

During backoff, events continue to accumulate in the buffer directory. They are never lost:

  • Buffer files are only deleted after successful API acknowledgment
  • The buffer can grow indefinitely (limited only by disk space)
  • When connectivity is restored, all buffered events are gradually drained
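
The first bullet is the key invariant; a sketch of the delete-after-ack rule, with the file loader and sender injected (both hypothetical):

```typescript
import { unlinkSync } from "fs";

// Flush one buffer file: send first, delete only on acknowledgment.
// If the send fails, the file stays on disk and is retried later.
async function flushBufferFile(
  path: string,
  load: (p: string) => unknown[],
  send: (events: unknown[]) => Promise<boolean>,
): Promise<boolean> {
  const events = load(path);
  const acked = await send(events);
  if (acked) unlinkSync(path); // never deleted before the API acknowledges
  return acked;
}
```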

See Also