Kafka KRaft for Solo Projects

Building a real-time geospatial data platform (QuackNet) for crowdsourced network intelligence, I faced a common solo-dev dilemma: how to handle high-volume event ingestion without ops complexity?

My first instinct was to skip Kafka entirely—Redis Streams or NATS would be simpler. But after stress-testing the architecture, I realized I needed 100+ concurrent writers, fault-tolerant persistence, and partitioned parallelism. That's Kafka's job. The problem: traditional Kafka with ZooKeeper is overkill for one developer.

Enter KRaft mode: Kafka's new consensus protocol that eliminates ZooKeeper. Single-process broker+controller, one configuration file, and it just works. Here's what I learned shipping it.

Why Kafka at All?

The QuackNet pipeline is simple in theory:

Mobile app submits WiFi/cellular/BLE/GPS scan data
Backend API writes scan events to Kafka
Consumer group reads events and batches them into ClickHouse (OLAP database)
Frontend queries ClickHouse for analytics dashboards

Without Kafka, I'd either poll PostgreSQL constantly (wasteful) or have the API block until ClickHouse writes complete (slow, defeats the purpose of async).

Kafka lets the API fire-and-forget: write to Kafka in milliseconds, then let the consumer batch-insert into ClickHouse in parallel. If the consumer crashes, the events are still sitting in Kafka's topic partitions, and the consumer picks up from its last committed offset when it restarts. No events lost.

Also: multiple consumers. The same scan event feeds analytics, fraud detection, and push alerts. Each of those reads the topic as its own consumer group with its own offsets, so every group sees every event. No code duplication.
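To make the fan-out idea concrete, here's a toy in-memory model of one partition with a separate read offset per consumer group. This is not the Kafka client API; topicLog and its methods are made up for illustration, and the group names are just labels.

```go
package main

import "fmt"

// topicLog is a toy model of a single partition: an append-only slice of
// events plus one read offset per consumer group.
type topicLog struct {
	events  []string
	offsets map[string]int // consumer group name -> next offset to read
}

func newTopicLog() *topicLog {
	return &topicLog{offsets: make(map[string]int)}
}

// Append adds an event to the end of the log and returns its offset.
func (t *topicLog) Append(event string) int {
	t.events = append(t.events, event)
	return len(t.events) - 1
}

// Poll returns every event the group has not read yet and advances only
// that group's offset. Other groups are unaffected.
func (t *topicLog) Poll(group string) []string {
	start := t.offsets[group]
	unread := append([]string(nil), t.events[start:]...)
	t.offsets[group] = len(t.events)
	return unread
}

// Rewind moves a group back to an earlier offset so it can replay events.
// Out-of-range offsets are clamped to the valid range.
func (t *topicLog) Rewind(group string, offset int) {
	if offset < 0 {
		offset = 0
	}
	if offset > len(t.events) {
		offset = len(t.events)
	}
	t.offsets[group] = offset
}

func main() {
	log := newTopicLog()
	log.Append("scan-1")
	log.Append("scan-2")

	fmt.Println(log.Poll("analytics")) // [scan-1 scan-2]
	fmt.Println(log.Poll("fraud"))     // [scan-1 scan-2]

	log.Append("scan-3")
	fmt.Println(log.Poll("analytics")) // [scan-3]

	log.Rewind("analytics", 0) // replay from the start
	fmt.Println(log.Poll("analytics")) // [scan-1 scan-2 scan-3]
}
```

The point is that reading never removes anything from the log. Each group just moves its own cursor, which is also what makes replay possible.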

The KRaft Trade-Off

Traditional Kafka setup:

ZooKeeper cluster (3+ nodes for HA)
Kafka broker nodes
Monitoring, inter-node replication, metadata quorum
If you mess it up, ZooKeeper becomes a black box

KRaft mode:

Single broker+controller process (or multiple for HA)
No external dependencies
Raft consensus built into Kafka itself
Simpler: one Docker container

Trade-off: KRaft is newer (production-ready since Kafka 3.3), fewer operators know it well, and edge cases may exist. For a solo project? Perfect. For Airbnb's infrastructure? Maybe stick with ZooKeeper.

Docker Compose Setup (What Actually Works)

I tried bitnami/kafka first. Broke immediately. The bitnami image uses its own env var prefix (KAFKA_CFG_*), which I found confusing, and in my case an old Kafka version. After 2 hours of debugging, I switched to apache/kafka:3.8.1 (official). That worked.

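services:
  kafka:
    image: apache/kafka:3.8.1
    ports:
      - "9094:9094"   # EXTERNAL listener, for clients on the host
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: "broker,controller"
      # PLAINTEXT: clients inside the Compose network
      # CONTROLLER: Raft quorum traffic
      # EXTERNAL: clients on the host machine
      KAFKA_LISTENERS: "PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT"
      KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_INTER_BROKER_LISTENER_NAME: "PLAINTEXT"
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@kafka:9093"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_LOG_RETENTION_HOURS: 168
      KAFKA_LOG_DIRS: "/var/lib/kafka/data"
    volumes:
      - kafka-data:/var/lib/kafka/data

volumes:
  kafka-data:

Key settings:

KAFKA_NODE_ID: 1 - Node ID (a non-negative integer, unique within the cluster)
KAFKA_PROCESS_ROLES: "broker,controller" - This is the magic. Single process plays both roles.
KAFKA_LISTENERS / KAFKA_ADVERTISED_LISTENERS - Where the process binds, and the addresses it hands back to clients. Containers on the Compose network connect to kafka:9092; apps on the host connect to localhost:9094.
KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER" - Required in KRaft mode. Names the listener the controller quorum uses.
KAFKA_CONTROLLER_QUORUM_VOTERS: "1@kafka:9093" - Quorum of one (it votes for itself). The port is the CONTROLLER listener's, not the broker's.
KAFKA_LOG_DIRS: "/var/lib/kafka/data" - Must match the volume mount, otherwise the data lands elsewhere in the container and is lost when it is recreated.
KAFKA_LOG_RETENTION_HOURS: 168 - Keep 7 days of data. Adjust for your scale.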
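The quorum voters value follows the format id@host:port, comma-separated when there is more than one controller. As a sanity check on that format, here's a small parser sketch. Voter and parseVoters are hypothetical helpers, not part of any Kafka library, and the three-node string in main is a made-up HA example.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// Voter is one entry of a controller quorum voters string.
type Voter struct {
	NodeID int
	Host   string
	Port   int
}

// parseVoters parses a comma-separated list of id@host:port entries.
func parseVoters(s string) ([]Voter, error) {
	var voters []Voter
	for _, entry := range strings.Split(s, ",") {
		entry = strings.TrimSpace(entry)
		idPart, addr, ok := strings.Cut(entry, "@")
		if !ok {
			return nil, fmt.Errorf("voter %q: expected id@host:port", entry)
		}
		id, err := strconv.Atoi(idPart)
		if err != nil || id < 0 {
			return nil, fmt.Errorf("voter %q: node id must be a non-negative integer", entry)
		}
		host, portStr, err := net.SplitHostPort(addr)
		if err != nil {
			return nil, fmt.Errorf("voter %q: %w", entry, err)
		}
		port, err := strconv.Atoi(portStr)
		if err != nil || port < 1 || port > 65535 {
			return nil, fmt.Errorf("voter %q: invalid port %q", entry, portStr)
		}
		voters = append(voters, Voter{NodeID: id, Host: host, Port: port})
	}
	return voters, nil
}

func main() {
	// Hypothetical three-controller quorum, for illustration only.
	voters, err := parseVoters("1@kafka-1:9093,2@kafka-2:9093,3@kafka-3:9093")
	if err != nil {
		fmt.Println("invalid voters string:", err)
		return
	}
	for _, v := range voters {
		fmt.Printf("node %d -> %s:%d\n", v.NodeID, v.Host, v.Port)
	}
}
```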

Spin it up:

docker compose up -d kafka
docker compose exec kafka /opt/kafka/bin/kafka-topics.sh --create \
  --topic netintel.scan.events \
  --partitions 6 \
  --replication-factor 1 \
  --bootstrap-server localhost:9092

Done. Two commands, one container. No ZooKeeper cluster to manage.
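Why six partitions matter: when a message has a key, the producer hashes the key to pick a partition, so events with the same key stay in order on the same partition while different keys spread across all six. Here's a sketch of the idea. partitionFor is a hypothetical helper and uses FNV-1a purely for illustration; Kafka's Java client hashes keys with murmur2, and other clients may differ, so the actual partition numbers won't match.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a message key to one of numPartitions partitions.
// numPartitions must be greater than zero.
// FNV-1a is an illustrative choice, not what Kafka clients use.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numPartitions))
}

func main() {
	// Device IDs are made-up keys; the same key always maps to the same partition.
	for _, deviceID := range []string{"device-a", "device-b", "device-a"} {
		fmt.Printf("%s -> partition %d\n", deviceID, partitionFor(deviceID, 6))
	}
}
```

One consequence: a consumer group can run at most one active consumer per partition, so six partitions means up to six consumers working in parallel.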

The context.Background() Bug

Here's where I got burned. In my Go API handler, I was writing scans to Kafka asynchronously:

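func (h *Handler) SubmitScan(c *gin.Context) {
  go func() {
    // The returned error is discarded, so a failed write leaves no trace.
    _ = h.kafkaProducer.Produce(
      c.Request.Context(),  // ← WRONG! Cancelled once the handler returns.
      "netintel.scan.events",
      scanJSON,
    )
  }()
  c.JSON(200, gin.H{"status": "accepted"})
}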

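The problem: c.Request.Context() is cancelled as soon as the handler returns, which happens right after the response is written (it is also cancelled if the client disconnects). By the time the goroutine tries to write to Kafka, the context is usually already dead and the producer gives up. Because nothing checked the returned error, the failure was silent: no error log, the event just got dropped.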

The fix:

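import (
  "context"
  "log"
  "github.com/gin-gonic/gin"
)

func (h *Handler) SubmitScan(c *gin.Context) {
  go func() {
    err := h.kafkaProducer.Produce(
      context.Background(),  // ← Detached context
      "netintel.scan.events",
      scanJSON,
    )
    if err != nil {
      log.Printf("kafka produce failed: %v", err)
    }
  }()
  c.JSON(200, gin.H{"status": "accepted"})
}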

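context.Background() is never cancelled. It's the root context. Goroutines spawned with it will complete their writes to Kafka even after the HTTP response goes out. The flip side: it also has no deadline, so wrap it in context.WithTimeout if a stuck broker shouldn't hold goroutines open forever.

Golden rule: For async background work (Kafka, email, logging), use a context derived from context.Background(), ideally with a timeout. For handler-scoped work (database queries), use c.Request.Context().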

When to Use Kafka vs Alternatives

Kafka is powerful but adds complexity. Here's my mental model:

Question	Kafka	Redis/NATS
Durability (persist to disk?)	Yes, by default	Optional, costly
Replay old events?	Yes	Limited (Redis Streams and NATS JetStream can; Redis Pub/Sub and core NATS cannot)
Partitioning / scaling?	Native	Manual sharding
Ops complexity?	Higher (KRaft helps)	Much simpler
Best for	Multi-consumer analytics pipelines	Task queues, caching

For QuackNet: we need durability (scan data is valuable), replay capability (re-process events if our ClickHouse pipeline breaks), and multiple consumers (analytics, fraud detection, push alerts). Kafka wins.

If I were just building a job queue? Redis Streams or Bull would be faster to ship.

The Scaling Trap

One gotcha: watch disk usage if you run Kafka on your laptop for serious testing. Kafka keeps every event on disk until its retention window expires, so millions of test events pile up quickly. I once filled my 256GB SSD with Kafka logs because I didn't set KAFKA_LOG_RETENTION_HOURS correctly.

Solution:

Monitor the Docker volume (docker system df -v shows per-volume sizes); named volumes don't cap their size by default
Set retention aggressively in dev: KAFKA_LOG_RETENTION_HOURS: 1
Use KAFKA_LOG_SEGMENT_BYTES: 10485760 (10MB) so segments roll often; retention only deletes closed segments, so smaller segments get cleaned up sooner
Optionally cap size as well with KAFKA_LOG_RETENTION_BYTES (the limit applies per partition)

Is KRaft Production-Ready?

Yes, but with caveats.

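The Apache Kafka project marked KRaft production-ready for new clusters in Kafka 3.3 (late 2022). It's been battle-tested by mid-tier companies for ~2 years. The operator community is still smaller than for ZooKeeper-based Kafka, though ZooKeeper mode has been deprecated since Kafka 3.5 and is removed in 4.0.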

For a solo founder with a real-time pipeline? Go for it. For a 100-person SaaS company running 50 Kafka clusters with SLA guarantees? Maybe hire someone who knows ZooKeeper inside-out.

I'm shipping QuackNet with KRaft. When we hit scaling issues (and we will), I'll have time to migrate. For now, it's one less thing to think about.

Conclusion

Kafka KRaft lets solo developers use a battle-tested, production-grade event streaming platform without ops overhead. No ZooKeeper cluster. One Docker image. A handful of environment variables.

The gotchas are real (context cancellation, image selection, config typos), but they're learnable. If you're building an analytics pipeline, fraud detection layer, or real-time dashboard, Kafka is worth the effort.

KRaft makes it accessible. That's a win for indie builders.