# Idempotency Patterns — Create Orders Exactly Once (With Retries)

Git Repo: https://github.com/mnafshin/idempotency

## Why This Matters

Retries are a fact of life in distributed systems. A mobile client times out, a load balancer gives up, a service restarts — and the same request arrives twice. Without idempotency, that can mean duplicate orders and duplicate charges.

This repository walks you through **three runnable, progressively richer implementations** so you can teach the pattern, validate it in tests, and eventually ship it in production:

| Module | Teaches | Audience |
|---|---|---|
| `idempotency-easy` (port 8081) | The core concept — dedup by key | New to idempotency |
| `idempotency-mature` (port 8082) | Multi-tenant + payload consistency check | Building multi-tenant APIs |
| `idempotency-production-redis` (port 8083) | Filter-based cross-cutting concern + pluggable store (in-memory / Redis) | Production services |

---

## The Idempotency Contract

A client picks a **random, unique `Idempotency-Key`** per logical operation and sends it as a header.

```
POST /api/orders
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da
```

Rules:
- **First call** → business logic runs, response is `201 Created`.
- **Retry with the same key + same payload** → original response is replayed, `200 OK`, `Idempotent-Replayed: true` header set. No business logic runs again.
- **Retry with the same key + different payload** → `409 Conflict`. Clients can't silently change the intent of an in-flight operation.
- **Different key** → treated as a new, independent operation.
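
The four rules reduce to one decision over two values: the fingerprint stored for the key (if any) and the fingerprint of the incoming request. The sketch below is illustrative only; `ContractSketch`, `Outcome`, and `decide` are hypothetical names, not classes from the repository.

```java
import java.util.Objects;

/** Hypothetical helper: maps the contract rules to an outcome. */
final class ContractSketch {

    enum Outcome { CREATED_201, REPLAYED_200, CONFLICT_409 }

    /**
     * @param storedFingerprint   fingerprint saved with the first request for this key,
     *                            or null if the key has not been seen before
     * @param incomingFingerprint fingerprint of the request being handled now
     */
    static Outcome decide(String storedFingerprint, String incomingFingerprint) {
        Objects.requireNonNull(incomingFingerprint, "incomingFingerprint");
        if (storedFingerprint == null) {
            return Outcome.CREATED_201;   // first call (or a different key): run business logic
        }
        if (storedFingerprint.equals(incomingFingerprint)) {
            return Outcome.REPLAYED_200;  // same key + same payload: replay stored response
        }
        return Outcome.CONFLICT_409;      // same key + different payload: reject
    }
}
```

A different key simply has no stored fingerprint, so it takes the first branch.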

---

## Level 1 — Easy Mode (Teaching)

**Module:** `idempotency-easy`  
**Goal:** explain idempotency in the fewest moving parts.

### What it does

```
POST /api/orders
Idempotency-Key: order-abc
```

- First request → new order, `201 Created`.
- Same key → same order returned, `200 OK`.
- No tenant scope, no payload check — just key dedup.

### How it works

`EasyOrderService` holds a `ConcurrentHashMap<String, Order>`. On each request it calls `putIfAbsent`:

```java
Order existing = orders.putIfAbsent(idempotencyKey, newOrder);
boolean replayed = existing != null;
return new OrderResult(replayed ? existing : newOrder, replayed);
```

The controller computes the HTTP status from the result:

```java
return ResponseEntity
    .status(result.replayed() ? HttpStatus.OK : HttpStatus.CREATED)
    .body(result.order());
```

### Limitations (and why they matter)

- Stored in an in-process `ConcurrentHashMap` → lost on restart, not shared across instances.
- No tenant isolation → two tenants using the same key would collide.
- No payload check → a client that accidentally changes the request body gets a silent replay of the wrong data.

These are exactly the gaps Level 2 closes.

---

## Level 2 — Mature Mode (Multi-Tenant + Payload Safety)

**Module:** `idempotency-mature`  
**Goal:** production-oriented semantics without external infrastructure.

### Required headers

```
X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da   # UUID
Idempotency-Key: order-abc                             # max 255 chars
```

### What changes

#### Tenant-scoped dedup key

```java
String dedupKey = tenantId + ":" + idempotencyKey;
```

Two tenants using `order-abc` get completely separate records.

#### Payload fingerprint

```java
String fingerprint = customerId + ":" + amount;
```

This is stored alongside the order. On retry, fingerprints are compared before replaying:

```java
if (!stored.requestFingerprint().equals(incomingFingerprint)) {
    throw new IdempotencyConflictException(...);
}
```

`409 Conflict` is returned when payloads differ — protecting clients from subtle bugs where a retry accidentally sends different data.

#### Race-safe first write

`ConcurrentHashMap.compute()` runs its remapping function atomically per key, so only one thread wins the first write no matter how many concurrent retries arrive.
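
The three pieces of this level (tenant-scoped key, fingerprint check, atomic first write) can be sketched together as follows. This is a minimal illustration, not the module's code: `MatureDedupSketch`, `StoredOrder`, `Result`, and `ConflictException` are hypothetical names, and the field must be typed as `ConcurrentHashMap` so that `compute()` carries its atomicity guarantee.

```java
import java.math.BigDecimal;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of tenant-scoped dedup with a payload fingerprint. */
final class MatureDedupSketch {
    record StoredOrder(UUID orderId, String requestFingerprint) {}
    record Result(StoredOrder stored, boolean replayed) {}

    /** A controller advice would map this to 409 Conflict. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) { super(message); }
    }

    private final ConcurrentHashMap<String, StoredOrder> orders = new ConcurrentHashMap<>();

    Result create(String tenantId, String idempotencyKey, String customerId, BigDecimal amount) {
        String dedupKey = tenantId + ":" + idempotencyKey;
        String fingerprint = customerId + ":" + amount;
        boolean[] created = {false};  // set inside the atomic block below

        StoredOrder stored = orders.compute(dedupKey, (key, existing) -> {
            if (existing == null) {
                created[0] = true;    // this caller won the first write
                return new StoredOrder(UUID.randomUUID(), fingerprint);
            }
            if (!existing.requestFingerprint().equals(fingerprint)) {
                // Throwing leaves the existing mapping untouched.
                throw new ConflictException("payload differs for key " + key);
            }
            return existing;          // same payload: replay
        });
        return new Result(stored, !created[0]);
    }
}
```

Note that `BigDecimal.toString()` keeps the scale, so `99.99` and `99.990` would produce different fingerprints in this sketch.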

### Behavior table

| Scenario | Response |
|---|---|
| First call | `201 Created` |
| Retry, same tenant + key + payload | `200 OK` (replay) |
| Retry, same tenant + key, **different payload** | `409 Conflict` |
| Same key, different tenant | Independent — both succeed |

---

## Level 3 — Production Mode (Filter-Based + Pluggable Store)

**Module:** `idempotency-production-redis`  
**Goal:** cross-cutting idempotency at the HTTP layer, switchable between in-memory and Redis.

### The key design insight

In mature mode, the service owns idempotency. In production mode, that responsibility moves to a **servlet filter** (`IdempotencyFilter extends OncePerRequestFilter`).

Why? Because idempotency is not business logic — it's a protocol guarantee. The service shouldn't care whether the response it created 24 hours ago is being replayed right now. With the filter in place:

```
IdempotencyFilter → [cache hit?] → replay stored response (service never called)
                  → [cache miss] → acquire lock → ProductionOrderController → ProductionOrderService
                                                 ↓
                                            store response in HttpReplayStore
                                            release lock
```

`ProductionOrderService` becomes a **plain order factory**:

```java
// No idempotency logic here — the filter ensures this is called exactly once per key
public Order createOrder(String tenantId, String idempotencyKey, CreateOrderRequest request) {
    return new Order(UUID.randomUUID(), UUID.fromString(tenantId),
                     idempotencyKey, request.customerId(), request.amount(),
                     "pending", Instant.now());
}
```

---

### IdempotencyFilter — step by step

```
┌─────────────────────────────────────────────────────┐
│  IdempotencyFilter.doFilterInternal()               │
│                                                     │
│  1. Skip non-mutating methods (GET, HEAD, OPTIONS)  │
│     or requests with no Idempotency-Key header      │
│  2. Validate X-Tenant-ID (UUID) + Idempotency-Key   │
│     → 400 Bad Request on violation                  │
│  3. SHA-256 fingerprint of raw request body         │
│  4. Look up dedupKey in HttpReplayStore             │
│     ├─ HIT, same fingerprint  → replay 200 OK       │
│     │     Idempotent-Replayed: true                 │
│     └─ HIT, diff fingerprint  → 409 Conflict        │
│  5. Cache miss → acquireLock (SETNX in Redis)       │
│     └─ lock held by concurrent thread → 409         │
│  6. Run filter chain (controller + service)         │
│  7. Store response in HttpReplayStore + TTL         │
│     ├─ success → release lock                       │
│     └─ store fails → EXTEND lock (don't release)   │
│        (client gets 5xx; lock prevents unsafe retry)│
└─────────────────────────────────────────────────────┘
```

Key implementation details:

**`CachedBodyRequestWrapper`** — the HTTP request body is a stream that can only be read once. This wrapper caches the raw bytes so the filter can compute the SHA-256 fingerprint _and_ Spring can still deserialize the body downstream.
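
Once the raw bytes are cached, the fingerprint in step 3 is a plain SHA-256 digest. A minimal sketch, assuming Java 17+ for `HexFormat` and a hex-encoded result (the repository's exact encoding is not shown here); `FingerprintSketch` is a hypothetical name:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical helper: hex-encoded SHA-256 over the raw request body. */
final class FingerprintSketch {
    static String sha256Hex(byte[] rawBody) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(rawBody));
        } catch (NoSuchAlgorithmException e) {
            // Every compliant JVM ships SHA-256, so this is not expected in practice.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

Because the digest covers raw bytes, two bodies that differ only in whitespace or key order hash differently and would be treated as different payloads.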

**`ContentCachingResponseWrapper`** — the response body is captured after the chain runs so the filter can store it in the replay store before releasing the lock.

**SETNX lock** (step 5) — Redis `SET … NX EX` (or in-memory equivalent). If two threads with the same new key arrive simultaneously, exactly one acquires the lock; the other gets a clear `409` rather than creating a duplicate order.
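
The "in-memory equivalent" of `SET … NX EX` can be approximated with a map from lock key to expiry time. The sketch below is an assumption about how such a lock might look, not the code of `InMemoryHttpReplayStore`; the injectable `LongSupplier` clock is added only to make expiry easy to test.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical in-memory lock with set-if-absent-plus-expiry semantics. */
final class InMemoryLockSketch {
    private final ConcurrentHashMap<String, Long> expiryByKey = new ConcurrentHashMap<>();
    private final LongSupplier nowMillis;

    InMemoryLockSketch(LongSupplier nowMillis) {
        this.nowMillis = nowMillis;
    }

    /** Returns true only for the caller that takes the lock. */
    boolean acquireLock(String lockKey, Duration ttl) {
        long now = nowMillis.getAsLong();
        boolean[] acquired = {false};
        // compute() is atomic per key, so two racing callers cannot both see "absent".
        expiryByKey.compute(lockKey, (key, expiresAt) -> {
            if (expiresAt == null || expiresAt <= now) {
                acquired[0] = true;            // absent or expired: take the lock
                return now + ttl.toMillis();
            }
            return expiresAt;                  // still held by someone else
        });
        return acquired[0];
    }

    void releaseLock(String lockKey) {
        expiryByKey.remove(lockKey);
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`.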

**Lock retention on store failure** (step 7): if the Redis write fails _after_ the order was created, releasing the lock would let a retry create a second order. Instead, the lock TTL is extended to match the idempotency TTL. The client gets a 5xx; until the extended lock expires, a retry with the same key hits the held lock (step 5, `409`) instead of re-running the business logic.
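
The control flow of steps 5 to 7 can be sketched without any servlet types. Everything here is illustrative: `LockRetentionSketch`, its trimmed `Store` interface, `save`, and `LockHeldException` are hypothetical, and the sketch deliberately leaves the lock to expire on its own if the business logic itself throws, since the text above does not describe that case.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical sketch of "release on success, extend on store failure". */
final class LockRetentionSketch {

    /** Minimal slice of a store: only what this flow needs. */
    interface Store {
        boolean acquireLock(String lockKey, Duration ttl);
        void releaseLock(String lockKey);
        void extendLockTtl(String lockKey, Duration ttl);
        void save(String dedupKey, String responseBody, Duration ttl); // may throw
    }

    /** The filter would translate this into 409 Conflict. */
    static final class LockHeldException extends RuntimeException {}

    static String runOnce(Store store, String dedupKey, String lockKey,
                          Duration lockTtl, Duration recordTtl,
                          Supplier<String> businessLogic) {
        if (!store.acquireLock(lockKey, lockTtl)) {
            throw new LockHeldException();               // step 5: someone else is in flight
        }
        String body = businessLogic.get();               // step 6: side effect happens here
        try {
            store.save(dedupKey, body, recordTtl);       // step 7: persist for replay
        } catch (RuntimeException storeFailure) {
            // The order exists but cannot be replayed: keep blocking retries.
            store.extendLockTtl(lockKey, recordTtl);
            throw storeFailure;                          // surfaces to the client as 5xx
        }
        store.releaseLock(lockKey);                      // safe: replay record is stored
        return body;
    }
}
```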

**`Idempotent-Replayed: true` header** — set on every replayed response so clients, load balancers, and observability tools can distinguish replays from fresh creation responses.

**Response size guard**: responses larger than `idempotency.store.max-response-bytes` (default 10 MB) are not cached; a `409` marker record is stored in their place. The business action still completed, but the response cannot be replayed verbatim. The cap keeps oversized bodies from exhausting memory.
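
A guard of this kind might look like the sketch below. The marker's content type and message are assumptions made for illustration, as are the names `ResponseSizeGuardSketch` and `toStorable`; only the `409` status and the size threshold come from the description above.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: decide what to store for a captured response. */
final class ResponseSizeGuardSketch {
    record ReplayRecord(int statusCode, String contentType, String body, String requestFingerprint) {}

    static ReplayRecord toStorable(int statusCode, String contentType, byte[] responseBody,
                                   String requestFingerprint, long maxResponseBytes) {
        if (responseBody.length > maxResponseBytes) {
            // Assumed marker shape: the action completed, but the body is not kept.
            return new ReplayRecord(409, "text/plain",
                    "response too large to replay", requestFingerprint);
        }
        return new ReplayRecord(statusCode, contentType,
                new String(responseBody, StandardCharsets.UTF_8), requestFingerprint);
    }
}
```

Storing a marker rather than nothing matters: a later retry still finds a record for the key, so the business logic is not run a second time.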

---

### HttpReplayStore — the pluggable store

```java
public interface HttpReplayStore {
    HttpReplayRecord get(String dedupKey);
    boolean putIfAbsent(String dedupKey, HttpReplayRecord record, Duration ttl);
    boolean acquireLock(String lockKey, Duration ttl);
    void releaseLock(String lockKey);
    void extendLockTtl(String lockKey, Duration ttl);

    record HttpReplayRecord(
        int    statusCode,
        String contentType,
        String body,               // UTF-8 string (JSON in practice)
        String requestFingerprint  // SHA-256 of raw request bytes
    ) {}
}
```

Two implementations are wired via `@ConditionalOnProperty`:

| Implementation | `idempotency.store.type` | Notes |
|---|---|---|
| `InMemoryHttpReplayStore` | `in-memory` *(default)* | `ConcurrentHashMap` with TTL. No Redis needed. |
| `RedisHttpReplayStore` | `redis` | Jackson JSON in Redis. Human-readable via `redis-cli`. |

The filter depends only on the interface — swapping stores requires no code changes.
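
To make the "`ConcurrentHashMap` with TTL" idea concrete, here is a minimal sketch of the `get` / `putIfAbsent` half of such a store. It is not the repository's `InMemoryHttpReplayStore`: the class name, the lazy eviction on read, and the injectable clock are assumptions for this example.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical in-memory replay store: records expire after their TTL. */
final class InMemoryReplayStoreSketch {
    record HttpReplayRecord(int statusCode, String contentType, String body, String requestFingerprint) {}
    private record Entry(HttpReplayRecord value, long expiresAtMillis) {}

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final LongSupplier nowMillis;

    InMemoryReplayStoreSketch(LongSupplier nowMillis) {
        this.nowMillis = nowMillis;
    }

    /** Returns the stored record, or null if absent or expired. */
    HttpReplayRecord get(String dedupKey) {
        Entry entry = entries.get(dedupKey);
        if (entry == null) {
            return null;
        }
        if (entry.expiresAtMillis() <= nowMillis.getAsLong()) {
            entries.remove(dedupKey, entry);  // lazy eviction; only removes this exact entry
            return null;
        }
        return entry.value();
    }

    /** Stores the record unless a live one exists; returns true if stored. */
    boolean putIfAbsent(String dedupKey, HttpReplayRecord value, Duration ttl) {
        long now = nowMillis.getAsLong();
        Entry fresh = new Entry(value, now + ttl.toMillis());
        Entry winner = entries.compute(dedupKey, (key, existing) ->
                existing == null || existing.expiresAtMillis() <= now ? fresh : existing);
        return winner == fresh;               // identity check: did our entry get in?
    }
}
```

A Redis-backed implementation would delegate expiry to Redis itself instead of tracking timestamps.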

#### What a record looks like in `redis-cli`

```
# Cache entry
GET order:idempotency:2b8de313-...:order-abc
{"status":201,"contentType":"application/json","body":"{\"id\":\"a1b2...\", ...}","requestFingerprint":"e3b0..."}

# In-flight lock (present only while first request is being processed)
EXISTS order:idempotency:lock:2b8de313-...:order-abc

# Scan all cache entries for a tenant
KEYS order:idempotency:2b8de313-...:*

# Scan all in-flight locks across all tenants
KEYS order:idempotency:lock:*
```

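The `:lock:` segment sits **before** the tenant ID, so a tenant-scoped pattern such as `order:idempotency:2b8de313-...:*` never matches lock keys, and locks can be listed on their own. Note that `KEYS` walks the whole keyspace and blocks Redis while it does; on a busy instance prefer `SCAN`.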
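
The key layout shown in the `redis-cli` output can be expressed as two small builders. `KeyNamingSketch` and its method names are hypothetical; the prefix is the documented default of `idempotency.store.key-prefix`.

```java
/** Hypothetical helper reproducing the key layout shown above. */
final class KeyNamingSketch {
    static final String PREFIX = "order:idempotency:";

    /** Cache entry: prefix + tenant + ":" + idempotency key. */
    static String cacheKey(String tenantId, String idempotencyKey) {
        return PREFIX + tenantId + ":" + idempotencyKey;
    }

    /** In-flight lock: the "lock:" segment comes before the tenant ID. */
    static String lockKey(String tenantId, String idempotencyKey) {
        return PREFIX + "lock:" + tenantId + ":" + idempotencyKey;
    }
}
```

Since `X-Tenant-ID` is validated as a UUID, a tenant ID can never equal the literal `lock`, which keeps the two namespaces from colliding.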

---

### Switching to Redis

**Step 1 — start Redis:**

```bash
# from idempotency-production-redis/
docker compose up -d
```

This starts `redis:7-alpine` on the default port `6379` with a persistent volume.

**Step 2 — run with the `redis` Spring profile:**

```bash
./gradlew :idempotency-production-redis:bootRun \
  --args='--spring.profiles.active=redis'
```

`application-redis.properties` is activated automatically, overriding:

```properties
idempotency.store.type=redis
spring.data.redis.host=localhost
spring.data.redis.port=6379
```

**Inspect live keys:**

```bash
docker compose exec redis redis-cli
> KEYS order:idempotency:*               # all cache entries and locks (same prefix)
> KEYS order:idempotency:lock:*          # all in-flight locks
> GET  order:idempotency:2b8de313-...:order-abc
> TTL  order:idempotency:2b8de313-...:order-abc
```

---

## Tests

### Production module test coverage

| Test class | Store | What it covers |
|---|---|---|
| `ProductionOrderServiceTest` | — | Service creates order with correct fields |
| `OrderIdempotencyIntegrationTest` | In-memory | 12 HTTP-level scenarios |
| `OrderIdempotencyRedisIntegrationTest` | Real Redis (Testcontainers) | Same scenarios against live Redis |

### Integration test scenarios

1. First call → `201 Created`
2. Retry same key + payload → `200 OK`, `Idempotent-Replayed: true`, identical body
3. Retry different payload → `409 Conflict`
4. Different key → independent `201 Created`
5. Missing `Idempotency-Key` → filter skips (passes through)
6. Missing `X-Tenant-ID` → `400 Bad Request`
7. Malformed `X-Tenant-ID` → `400 Bad Request`
8. Empty `Idempotency-Key` → `400 Bad Request`
9. Oversized `Idempotency-Key` (> 255 chars) → `400 Bad Request`
10. Same key, different tenants → isolated; both get `201`
11. GET request → filter skips; no idempotency logic applied
12. **20 concurrent threads, same key** → exactly one order created; all others get `409` (SETNX lock)
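
The shape of scenario 12 can be reproduced without Spring or Redis. The sketch below is a simplified stand-in, not the repository's test: a `ConcurrentHashMap` plays the role of the lock, the lock is never released during the race, and `ConcurrentRetrySketch` and `RaceResult` are hypothetical names. A `CyclicBarrier` makes all callers start together.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: N threads race for one key; count winners and losers. */
final class ConcurrentRetrySketch {
    record RaceResult(int created, int rejected) {}

    static RaceResult race(int threads) {
        ConcurrentHashMap<String, Boolean> locks = new ConcurrentHashMap<>();
        CyclicBarrier startLine = new CyclicBarrier(threads);
        // Each attempt waits for all others, then tries to take the lock once.
        Callable<Boolean> attempt = () -> {
            startLine.await();
            return locks.putIfAbsent("tenant:order-abc", Boolean.TRUE) == null;
        };
        List<Callable<Boolean>> attempts = Collections.nCopies(threads, attempt);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int created = 0;
            for (Future<Boolean> outcome : pool.invokeAll(attempts)) {
                if (outcome.get()) {
                    created++;        // this caller would create the order (201)
                }
            }
            return new RaceResult(created, threads - created); // the rest would get 409
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The pool size must equal the number of attempts; otherwise the barrier would never trip and the race would hang.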

### Testcontainers Redis wiring

```java
@Container
static GenericContainer<?> redis =
    new GenericContainer<>(DockerImageName.parse("redis:7-alpine"))
        .withExposedPorts(6379);

@DynamicPropertySource
static void redisProperties(DynamicPropertyRegistry registry) {
    registry.add("spring.data.redis.host", redis::getHost);
    registry.add("spring.data.redis.port", () -> redis.getMappedPort(6379));
}
```

`@DynamicPropertySource` injects the container's ephemeral port into the Spring context before it starts — no hardcoded ports, no external infrastructure in CI.

Run only the Redis-backed tests:

```bash
./gradlew :idempotency-production-redis:test \
  --tests "*.OrderIdempotencyRedisIntegrationTest"
```

---

## Running All Three Modules

```bash
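# Each bootRun blocks, so start every module in its own terminal
./gradlew :idempotency-easy:bootRun                # port 8081
./gradlew :idempotency-mature:bootRun              # port 8082
./gradlew :idempotency-production-redis:bootRun    # port 8083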
```

### Easy — port 8081

```bash
# First call
curl -i -X POST http://localhost:8081/api/orders \
  -H 'Content-Type: application/json' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":99.99}'
# → 201 Created

# Retry — same key, same payload
curl -i -X POST http://localhost:8081/api/orders \
  -H 'Content-Type: application/json' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":99.99}'
# → 200 OK  (same id, same createdAt)
```

### Mature — port 8082

```bash
# First call
curl -i -X POST http://localhost:8082/api/orders \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":99.99}'
# → 201 Created

# Payload mismatch
curl -i -X POST http://localhost:8082/api/orders \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":999.99}'
# → 409 Conflict
```

### Production — port 8083

```bash
# First call
curl -i -X POST http://localhost:8083/api/orders \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":99.99}'
# → 201 Created

# Retry — filter replays, service never called:
curl -i -X POST http://localhost:8083/api/orders \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-ID: 2b8de313-9c3c-4a15-a9b8-0cd1e34be3da' \
  -H 'Idempotency-Key: order-123' \
  -d '{"customerId":"25dfc44e-3ed7-4eb4-b412-6a6df8c6d355","amount":99.99}'
# → 200 OK
# Idempotent-Replayed: true
```

---

## Configuration Reference

All values in `idempotency-production-redis/src/main/resources/application.properties`:

| Property | Default | Description |
|---|---|---|
| `idempotency.store.type` | `in-memory` | `in-memory` or `redis` |
| `idempotency.store.key-prefix` | `order:idempotency:` | Redis key namespace |
| `idempotency.store.ttl` | `PT24H` | How long completed responses are kept |
| `idempotency.store.lock-ttl` | `PT30S` | Max time an in-flight lock is held |
| `idempotency.store.max-response-bytes` | `10485760` (10 MB) | Oversized responses get a marker instead of being cached |
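
The TTL defaults are ISO-8601 durations, the format `java.time.Duration` parses natively. A small sketch of what the default values mean, assuming the properties end up as `Duration` and `long` values (how the module actually binds them is not shown here; `TtlConfigSketch` is a hypothetical name):

```java
import java.time.Duration;

/** Hypothetical illustration of the documented defaults. */
final class TtlConfigSketch {
    /** idempotency.store.ttl default: 24 hours. */
    static Duration recordTtl() {
        return Duration.parse("PT24H");
    }

    /** idempotency.store.lock-ttl default: 30 seconds. */
    static Duration lockTtl() {
        return Duration.parse("PT30S");
    }

    /** idempotency.store.max-response-bytes default: 10 * 1024 * 1024. */
    static long maxResponseBytes() {
        return 10L * 1024 * 1024;
    }
}
```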

---

## Pattern Progression (Beyond This Repo)

| Level | Approach | Trade-offs |
|---|---|---|
| 1 | `ConcurrentHashMap` | Simple; in-process only |
| 2 | Tenant-scoped map + fingerprint | Multi-tenant; still single-instance |
| 3 *(this repo)* | Filter + pluggable store (in-memory / Redis) | Horizontally scalable; no DB dependency |
| 4 | Redis + DB unique constraint on `(tenant_id, idempotency_key)` | Durable across Redis restarts |
| 5 | Outbox / event store | Full audit trail; at-least-once delivery |

The filter approach in Level 3 is a sound production baseline for many REST APIs: it is transparent to the service, testable in isolation, and the store can be swapped without touching business logic. The limits listed in the next section still apply.

---

## Known Limits (Intentional for Teaching)

- **No SQL persistence** — Redis data loss means idempotency history is lost. A unique constraint on the orders table is the next step.
- **No fault-tolerant distributed lock**: the Redis lock is best-effort; a Redis failover mid-lock could theoretically allow a duplicate. Redlock or a DB advisory lock reduces this risk.
- **No metrics** — adding Micrometer counters (`idempotency.hit`, `idempotency.miss`, `idempotency.conflict`) is the obvious next production addition.
- **In-memory store is per-instance** — run two instances without Redis and deduplication breaks across them.

---

## Summary

Idempotency is a **protocol guarantee**, not a business rule. Keeping it in a filter (rather than the service) means:

- The **service stays a plain factory** — easy to test, easy to reason about.
- You can **swap the store** (in-memory → Redis → DB) without touching business logic.
- **Replay, conflict detection, locking, and the `Idempotent-Replayed` header** are all in one place.

The three-module structure lets you walk a team from "what is idempotency?" all the way to "here's how we'd ship it."

