Sentinel Service

Responsible for rate-limit enforcement and asynchronous fraud screening.


Purpose

Sentinel serves two roles. On the synchronous path, it makes Redis-backed token bucket decisions so that protected services never absorb traffic that should be rejected. Off the hot path, it consumes bid events from Redis Streams, batches them for model scoring, and records enforcement outcomes for the rest of the system.

Responsibilities

  • Enforce per-client request budgets through atomic token bucket decisions
  • Share rate-limit state across multiple service instances
  • Consume bid events from Redis Streams
  • Forward micro-batches to the Python sentinel-ml service
  • Write flagged actors to the banned_users Redis set

Rate-Limit Decision Flow

sequenceDiagram
    participant C as Client
    participant S as Sentinel Service
    participant R as Redis

    C->>S: POST /check
    S->>R: Execute token bucket Lua script
    R-->>S: Allowed / blocked + remaining tokens
    S-->>C: 200 OK or 429 Too Many Requests

The synchronous path is intentionally short. Sentinel receives a request, delegates the bucket update to Redis, and returns the decision without additional coordination.
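The translation from the script's result to the HTTP response can be sketched as a pure function. This is an illustrative model only; the names (`Decision`, `toStatus`) are hypothetical, and the real handler is a reactive WebFlux endpoint.

```java
// Hypothetical sketch: mapping a token bucket decision to an HTTP status.
// The actual handler delegates the bucket update to a Redis Lua script
// and then performs exactly this translation.
public class DecisionMapper {
    // Result shape suggested by the diagram: allowed/blocked + remaining tokens.
    record Decision(boolean allowed, long remainingTokens) {}

    // 200 OK when the request is allowed, 429 Too Many Requests otherwise.
    static int toStatus(Decision d) {
        return d.allowed() ? 200 : 429;
    }
}
```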

Fraud Analysis Flow

sequenceDiagram
    participant X as Redis Stream
    participant S as Sentinel Service
    participant M as sentinel-ml
    participant R as Redis

    S->>X: Read bid events
    X-->>S: Bid events
    S->>S: Form micro-batch
    S->>M: Submit batch for scoring
    M-->>S: Fraud predictions
    S->>R: Add flagged actors to banned_users

This pipeline stays off the request path. Bids are scored asynchronously, and enforcement data is written back to Redis without adding latency to bid execution.
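The batching step can be modeled as a buffer that flushes on either a size or an age threshold. The sketch below is an in-memory illustration under assumed trigger conditions; the class name, thresholds, and `String` event type are hypothetical, not the service's actual configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of micro-batching: events read from the stream are
// buffered and flushed when the batch is full or when the oldest buffered
// event has waited longer than a time budget.
public class MicroBatcher {
    private final int maxSize;
    private final long maxWaitMillis;
    private final List<String> buffer = new ArrayList<>();
    private long firstEventMillis = -1;

    public MicroBatcher(int maxSize, long maxWaitMillis) {
        this.maxSize = maxSize;
        this.maxWaitMillis = maxWaitMillis;
    }

    // Adds an event; returns the completed batch when a flush triggers, else null.
    public List<String> add(String event, long nowMillis) {
        if (buffer.isEmpty()) firstEventMillis = nowMillis;
        buffer.add(event);
        boolean full = buffer.size() >= maxSize;
        boolean stale = nowMillis - firstEventMillis >= maxWaitMillis;
        if (full || stale) {
            List<String> batch = new ArrayList<>(buffer);
            buffer.clear();
            return batch;
        }
        return null;
    }
}
```

Each flushed batch would then be submitted to sentinel-ml for scoring in a single call, amortizing the per-request overhead of model inference.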

Token Bucket Model

Each caller is identified by the X-User-ID header. Redis stores the current token count and last refill time for that caller. On each request, Sentinel:

  1. Computes elapsed time since the last refill
  2. Adds newly earned tokens, capped at bucket capacity
  3. Deducts the request cost when enough tokens are available
  4. Returns a deny response when capacity is exhausted

All four steps run inside one Lua script so refill and consumption happen as a single atomic operation.
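The four steps above can be sketched as plain arithmetic. This is a hypothetical in-memory model for illustration; the real service runs the same logic inside a Redis Lua script so that concurrent requests cannot interleave between refill and consumption.

```java
// Hypothetical in-memory model of the token bucket steps. In production the
// equivalent arithmetic runs atomically in a single Redis Lua script.
public class TokenBucketSketch {
    static final class Bucket {
        double tokens;
        long lastRefillMillis;
        Bucket(double tokens, long nowMillis) {
            this.tokens = tokens;
            this.lastRefillMillis = nowMillis;
        }
    }

    // Returns true when the request is allowed, false when capacity is exhausted.
    static boolean tryConsume(Bucket b, double capacity, double ratePerSec,
                              double cost, long nowMillis) {
        // 1. Compute elapsed time since the last refill
        double elapsedSec = (nowMillis - b.lastRefillMillis) / 1000.0;
        // 2. Add newly earned tokens, capped at bucket capacity
        b.tokens = Math.min(capacity, b.tokens + elapsedSec * ratePerSec);
        b.lastRefillMillis = nowMillis;
        // 3. Deduct the request cost when enough tokens are available
        if (b.tokens >= cost) {
            b.tokens -= cost;
            return true;
        }
        // 4. Deny when capacity is exhausted
        return false;
    }
}
```

With capacity=10, rate=1, cost=1, a full bucket allows a burst of 10 requests; after that the caller sustains one request per second as tokens refill.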

Runtime Stack

Layer              Technology
-----              ----------
Language           Java 25
Framework          Spring Boot 4 / WebFlux
State Store        Redis
Coordination       Lua scripts
Stream Processing  Redis Streams
ML Integration     Python FastAPI (sentinel-ml)

Quick Start

services:
  sentinel-service:
    build: ./sentinel-service
    ports:
      - "8081:8081"
    environment:
      - SPRING_DATA_REDIS_HOST=redis
    depends_on:
      - redis

  redis:
    image: redis:7.2-alpine
    ports:
      - "6379:6379"

docker compose up -d

Local Development

Start Redis

docker run -d -p 6379:6379 --name sentinel-redis redis:7.2-alpine

Run the service

./mvnw spring-boot:run

API

Endpoint

POST /check

Parameters

  • capacity - query parameter; maximum number of tokens in the bucket
  • rate - query parameter; refill rate in tokens per second
  • cost - query parameter; number of tokens consumed by the request
  • X-User-ID - request header; unique identifier for the caller

Example

curl -X POST "http://localhost:8081/check?capacity=10&rate=1&cost=1" \
  -H "X-User-ID: test_user"

Response

{
  "allowed": true
}

API Documentation

When the service is running locally, the OpenAPI documentation is available at:

http://localhost:8081/swagger-ui.html

Created by Justin Walker