Queue-Based Architecture

Understanding Bklit's high-performance event processing pipeline

Bklit uses a queue-based architecture for high performance, scalability, and zero data loss.

Overview

Instead of inserting events directly into ClickHouse on every request, events are:

  1. Queued in Redis (fast, durable)
  2. Processed in batches by a background worker
  3. Inserted into ClickHouse in bulk (about 100x faster than per-event inserts)

Architecture Diagram

Tracker SDK ←→ WebSocket Server → Redis Queue → Background Worker → ClickHouse
               (bklit.ws:8080)    (durable)     (batch 100)         (8-50ms)
                      ↓
           Broadcast to Dashboard
                      ↓
          Dashboard UI (WebSocket)

Components

1. WebSocket Server

Location: packages/websocket
Port: 8080
URL: wss://bklit.ws (production), ws://localhost:8080 (development)

Purpose:

  • Maintain persistent connections with SDK (visitor browsers) and dashboards
  • Receive events via WebSocket (pageviews, custom events)
  • Validate API tokens (with Redis caching)
  • Anonymize IP addresses and enrich with geolocation
  • Push to Redis queue
  • Broadcast events to connected dashboards
  • Detect session end instantly when a WebSocket disconnects

Benefits:

  • Fast response times (1-3ms, versus 300-500ms with direct inserts)
  • No ClickHouse connection overhead
  • Can scale independently
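
The ingestion path described above (validate, anonymize, enqueue, broadcast) can be sketched as a single handler. This is a minimal illustration, not Bklit's actual code: the names `IncomingEvent`, `IngestDeps`, `handleIncomingEvent`, and `anonymizeIpv4` are hypothetical, the anonymization scheme (zeroing the last IPv4 octet) is an assumption because the source does not say how addresses are anonymized, and geolocation enrichment is left out. The dependencies are synchronous here for brevity; real Redis and WebSocket calls would be asynchronous.

```typescript
// Hypothetical shapes: the real message format is not documented here.
interface IncomingEvent {
  token: string;
  type: "pageview" | "custom";
  ip: string;
  payload: Record<string, unknown>;
}

// Injected dependencies so the handler stays independent of Redis and sockets.
interface IngestDeps {
  isValidToken(token: string): boolean; // e.g. a Redis-cached token lookup
  enqueue(serialized: string): void;    // e.g. LPUSH onto analytics:queue
  broadcast(serialized: string): void;  // e.g. send to connected dashboards
}

// Assumed scheme: zero the last octet of an IPv4 address.
// Returns null for anything that is not a well-formed IPv4 address.
function anonymizeIpv4(ip: string): string | null {
  const parts = ip.trim().split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return null;
  octets[3] = 0;
  return octets.join(".");
}

// Returns true if the event was accepted and queued, false if rejected.
function handleIncomingEvent(event: IncomingEvent, deps: IngestDeps): boolean {
  // Reject before any work is queued.
  if (!deps.isValidToken(event.token)) return false;

  const serialized = JSON.stringify({
    type: event.type,
    ip: anonymizeIpv4(event.ip), // the raw IP is never serialized
    payload: event.payload,
    receivedAt: Date.now(),
  });

  deps.enqueue(serialized);   // durable path: the worker picks this up later
  deps.broadcast(serialized); // live path: dashboards see it immediately
  return true;
}
```

Note that the handler never touches ClickHouse, which is why the server can respond quickly and scale separately from the worker.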

2. Redis Queue

Key: analytics:queue
Format: LPUSH/RPOP (list)
Persistence: AOF (append-only file)

Purpose:

  • Buffer events for batch processing
  • Ensure zero data loss
  • Decouple ingestion from processing

Benefits:

  • Durable (survives crashes)
  • Fast (in-memory)
  • Scalable (multiple workers can consume)
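
To make the LPUSH/RPOP format concrete, here is a small in-memory model of a Redis list. It is not a Redis client, and `ListQueueModel` is a hypothetical name; it only mirrors the semantics of the three commands the pipeline relies on (LPUSH adds at the head, RPOP removes from the tail, LLEN reports the length), which together give first-in, first-out ordering.

```typescript
// In-memory stand-in for a Redis list, for illustration only.
class ListQueueModel {
  private items: string[] = []; // index 0 is the head (left side) of the list

  // Like LPUSH: insert at the head and return the new length.
  lpush(value: string): number {
    this.items.unshift(value);
    return this.items.length;
  }

  // Like RPOP: remove and return the tail element, or null when empty.
  rpop(): string | null {
    return this.items.pop() ?? null;
  }

  // Like LLEN: the current queue depth.
  llen(): number {
    return this.items.length;
  }
}

const queue = new ListQueueModel();
queue.lpush("a");
queue.lpush("b");
queue.lpush("c");
const first = queue.rpop(); // "a": the oldest event leaves first
```

Because producers push on one end and consumers pop from the other, several workers can call RPOP concurrently and each event is handed to exactly one of them.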

3. Background Worker

Location: packages/worker
Polling: Every 1 second
Batch Size: Up to 100 events

Processing Steps:

  1. Pop batch from Redis queue
  2. Look up EventDefinition UUIDs from Postgres (cached)
  3. Create/update sessions in ClickHouse
  4. Batch insert events to ClickHouse
  5. Publish to Redis pub/sub for real-time updates
  6. Verify inserts succeeded

Benefits:

  • Batch inserts are 100x faster than single inserts
  • Can retry failed inserts
  • Monitors queue depth
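
The first and fourth processing steps (pop a batch, then insert it in bulk) can be sketched as follows. The names `QueueLike`, `popBatch`, and `processOnce` are hypothetical, the queue and insert function are synchronous stand-ins for asynchronous Redis and ClickHouse calls, and the session, lookup, and publish steps are omitted. Only the batch size of 100 comes from the settings above.

```typescript
// Minimal view of the queue that the worker needs; hypothetical interface.
interface QueueLike {
  rpop(): string | null; // returns null when the queue is empty
}

const MAX_BATCH = 100; // "Up to 100 events" per the worker settings

// Pop up to `max` serialized events, stopping early if the queue runs dry.
function popBatch(queue: QueueLike, max: number = MAX_BATCH): string[] {
  const batch: string[] = [];
  while (batch.length < max) {
    const item = queue.rpop();
    if (item === null) break;
    batch.push(item);
  }
  return batch;
}

// One polling tick: drain a batch and hand it to a single bulk insert.
// Returns the number of events processed (0 means the queue was empty).
function processOnce(
  queue: QueueLike,
  insertBatch: (rows: unknown[]) => void,
): number {
  const raw = popBatch(queue);
  if (raw.length === 0) return 0;
  const rows = raw.map((s) => JSON.parse(s) as unknown);
  insertBatch(rows); // one bulk insert instead of one insert per event
  return rows.length;
}
```

With 250 events queued, three ticks process 100, 100, and 50 events, and a fourth tick finds the queue empty. Redis 6.2 and later also accept a count argument to RPOP, which would replace the loop with a single round trip.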

4. Real-time Updates

Channel: live-events (Redis pub/sub)
Consumer: WebSocket server

Flow:

  • Worker publishes processed events
  • WebSocket server broadcasts to connected clients
  • Dashboard UI updates instantly
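
The fan-out step on the WebSocket server can be pictured with the sketch below. It is an in-memory illustration under stated assumptions: `LiveEventFanout` and `DashboardClient` are hypothetical names, and `onLiveEvent` stands for whatever callback receives messages from the live-events channel. Dropping a client whose `send` throws is a design choice of this sketch, not documented behavior.

```typescript
// Anything that can receive a message, e.g. a wrapper around a WebSocket.
type DashboardClient = { send(message: string): void };

class LiveEventFanout {
  private clients = new Set<DashboardClient>();

  // Register a dashboard connection; the returned function unregisters it.
  add(client: DashboardClient): () => void {
    this.clients.add(client);
    return () => {
      this.clients.delete(client);
    };
  }

  // Called for each message received on the pub/sub channel.
  // Returns how many clients the message was delivered to.
  onLiveEvent(message: string): number {
    let delivered = 0;
    this.clients.forEach((client) => {
      try {
        client.send(message);
        delivered += 1;
      } catch {
        // One broken socket should not block delivery to the others.
        this.clients.delete(client);
      }
    });
    return delivered;
  }
}
```

Keeping delivery counts makes it easy to expose the number of live dashboard connections as a metric.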

Performance Comparison

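| Metric           | Old (Direct Insert) | New (Queue-Based)     | Improvement     |
|------------------|---------------------|-----------------------|-----------------|
| API Response     | 300-500ms           | 1-3ms                 | 100x faster     |
| ClickHouse Write | 1 insert/event      | Batch 100 events      | 100x faster     |
| Data Loss Risk   | Medium (if crash)   | Zero (queue persists) | Eliminated      |
| Scalability      | Limited             | Horizontal            | Unlimited       |
| Observability    | Console logs only   | /terminal UI          | Full visibility |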

Local Development

Starting Services

Terminal 1:

pnpm dev:services

Wait for:

  • [prisma] Prisma dev running...
  • [websocket] 🌐 WebSocket server ready
  • [worker] 🔄 Background worker started

Terminal 2:

pnpm dev

Debug UI

Visit the /terminal page to see real-time event flow:

http://localhost:3000/{organizationId}/{projectId}/terminal

Filter by stage, search logs, and trace individual events through the entire pipeline.

Monitoring

Queue Depth

Check if events are backing up:

docker exec bklit-redis-local redis-cli LLEN analytics:queue

The result should be 0 or very low, because the worker drains the queue every second.

Worker Performance

The /terminal UI shows:

  • Events processed per second
  • Average latency per stage
  • Batch sizes
  • Error rates

ClickHouse Data

Verify events are being saved:

docker exec bklit-clickhouse-local clickhouse-client --database=analytics \
  --query "SELECT count() FROM page_view_event"

Zero Data Loss

The architecture guarantees zero data loss through:

  1. Redis AOF Persistence - Every queue operation is logged to disk
  2. Worker Retry Logic - Failed inserts are retried with exponential backoff
  3. Dead Letter Queue - Permanently failed events are moved to analytics:queue:failed
  4. /terminal Monitoring - Immediate visibility into any failures
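
The retry and dead-letter steps can be sketched as below. The attempt count and delays are assumed values, since the source does not state them, and `backoffDelayMs` and `insertWithRetry` are hypothetical names. The `insert`, `deadLetter`, and `sleep` functions are injected so the sketch stays independent of any particular ClickHouse or Redis client; `deadLetter` would push the batch to analytics:queue:failed.

```typescript
// Assumed values: the source does not state retry counts or delays.
const MAX_ATTEMPTS = 5;
const BASE_DELAY_MS = 100;
const MAX_DELAY_MS = 5000;

// Exponential backoff for a 1-based attempt number: 100, 200, 400, ... capped.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

// Try the insert up to MAX_ATTEMPTS times, waiting longer after each failure.
// If every attempt fails, hand the batch to the dead letter queue.
async function insertWithRetry(
  batch: string[],
  insert: (batch: string[]) => Promise<void>,
  deadLetter: (batch: string[]) => Promise<void>,
  sleep: (ms: number) => Promise<void>,
): Promise<"inserted" | "dead-lettered"> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await insert(batch);
      return "inserted";
    } catch {
      // No point sleeping after the final attempt.
      if (attempt < MAX_ATTEMPTS) await sleep(backoffDelayMs(attempt));
    }
  }
  await deadLetter(batch); // kept for inspection rather than dropped
  return "dead-lettered";
}
```

With these values, a batch that never succeeds waits 100, 200, 400, and 800 ms between its five attempts before it is dead-lettered.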

Production Deployment

Status: Production deployment guide coming soon.

The architecture is production-ready and tested locally. Deployment involves:

  • WebSocket server → Hetzner VPS (bklit.ws, port 8080, gray cloud DNS)
  • Background worker → Hetzner (systemd service, co-located with ClickHouse)
  • Dashboard → Vercel (Next.js app with tRPC API routes)
  • Monitor queue depth and WebSocket connections

Backwards Compatibility

The new system is fully backwards compatible:

  • Uses same ClickHouse tables
  • Uses same EventDefinition UUIDs from Postgres
  • Queries work identically
  • Gradual migration possible (dual-write during transition)
