Scalytics · KafScale

One endpoint.
Infinite scale.

Stateless Kafka on S3. Self-hosted. Apache 2.0.

The streaming spine for critical infrastructure and agentic AI. KafScale is the only Apache 2.0, S3-native, Kubernetes-first, Kafka-compatible stack that runs where your platform runs and stays where your data is.

Infinite Scale. S3 Storage. 100% Open Source.

You need Kafka at scale. Not a seven-figure renewal.

KafScale turns S3 into a high-performance, cost-crushing streaming spine for AI agents, defense, and critical infrastructure.
Open infrastructure · No control plane · No vendor toll booth

Your streaming spine should not share fate with the vendor that owns it.

What KafScale commits to that comparable platforms don't

Six commitments. No exceptions.

KafScale is purpose-built from real-world experience and architectural decisions that make Kafka-compatible streaming usable for engineering teams who own the consequences of their infrastructure choices. These commitments hold across every release, every API endpoint, and every deployment. They are documented in the source, enforced in the build, and visible at the protocol layer.

01

Apache 2.0. The whole way down.

Use it, modify it, redistribute it, sell services on top of it. No BSL clauses that convert in four years. No usage fees per GiB. No vendor control plane that holds your cluster hostage. The license is the same one Apache Kafka ships under, and KafScale is the only stateless, S3-native streaming platform that ships under it without an asterisk.

02

S3 is the single source of truth.

Brokers hold zero durable state. Immutable .kfs segments live in object storage with eleven-nines durability. Add a broker, the cluster scales. Remove a broker, the cluster shrugs. There is no partition rebalancing because there is no broker-local data to rebalance. Failure becomes a scheduling event, not an operational incident.

03

One endpoint. Infinite scale.

A proxy rewrites Kafka metadata responses so every client sees one DNS name. Brokers scale to N behind it. Clients never see a topology change, never trigger a reconnection storm, never need a custom SDK for most Kafka operations. KafScale scales with you, stays transparent to your producers and consumers, and keeps S3 as the storage layer.
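The rewrite step can be pictured in a few lines. This is an illustrative sketch only: the struct and function names are hypothetical, not KafScale's actual wire-protocol code. The idea is that every broker entry in a Metadata response gets its advertised address replaced with the proxy's single DNS name, while node IDs stay distinct so the proxy can still route internally.

```python
# Illustrative sketch of a metadata-rewriting proxy. Field and function
# names are hypothetical, not KafScale internals.
from dataclasses import dataclass, replace
from typing import List

@dataclass(frozen=True)
class BrokerEntry:
    node_id: int
    host: str
    port: int

def rewrite_metadata(brokers: List[BrokerEntry],
                     advertised_host: str,
                     advertised_port: int) -> List[BrokerEntry]:
    """Point every broker's advertised address at the proxy endpoint.

    Node IDs are preserved (the proxy routes on them internally), but
    every connection a client opens lands on the same DNS name, so
    broker churn never reaches the client.
    """
    return [replace(b, host=advertised_host, port=advertised_port)
            for b in brokers]

# Three real brokers behind the proxy...
cluster = [BrokerEntry(0, "broker-0.internal", 9092),
           BrokerEntry(1, "broker-1.internal", 9092),
           BrokerEntry(2, "broker-2.internal", 9092)]

# ...but clients only ever see one endpoint.
seen = rewrite_metadata(cluster, "kafka.example.com", 9092)
assert all(b.host == "kafka.example.com" for b in seen)
```

Scaling from three brokers to thirty changes the list the proxy rewrites, not anything the client observes.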

04

Open segment format. No broker bottleneck.

The .kfs format is documented as part of the public specification. Processors (Iceberg sinks, SQL engines, AI agent retrievers) read segments directly from S3, bypassing brokers entirely. Streaming traffic and analytical traffic read from S3 and never compete for the same compute. No other Kafka-compatible platform exposes its storage format this way.
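The direct-read path looks roughly like the sketch below: a processor fetches a segment object and walks its records without ever opening a broker connection. The layout shown (a magic prefix plus big-endian length-prefixed records) is an assumption for illustration; the authoritative layout is the published .kfs specification.

```python
# Hypothetical sketch of reading a segment straight from object storage.
# The on-disk layout here (magic bytes, u32 big-endian length prefixes)
# is assumed for illustration; consult the public .kfs spec for the
# real format.
import struct
from typing import Iterator

MAGIC = b"KFS1"  # assumed magic, not the real one

def iter_records(segment: bytes) -> Iterator[bytes]:
    if segment[:4] != MAGIC:
        raise ValueError("not a .kfs segment")
    offset = 4
    while offset < len(segment):
        (length,) = struct.unpack_from(">I", segment, offset)
        offset += 4
        yield segment[offset:offset + length]
        offset += length

# In production the bytes would come from S3, e.g. (boto3, hypothetical key):
#   body = s3.get_object(Bucket=bucket, Key="topic/0/00000001.kfs")["Body"].read()
payload = MAGIC + struct.pack(">I", 5) + b"hello" + struct.pack(">I", 5) + b"world"
assert list(iter_records(payload)) == [b"hello", b"world"]
```

Because the processor talks only to S3, an analytical scan over months of history puts zero load on the brokers serving real-time consumers.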

05

Kubernetes-first, not Kubernetes-tolerant.

A real operator with custom resource definitions for clusters and topics. HPA-driven scaling that does not require partition reassignment. Helm charts that deploy a working cluster in minutes. The deployment unit is a pod set on your cluster; not an agent in a vendor VPC, not a hosted service, not a binary you babysit.

06

Standards on every boundary.

Kafka wire protocol inbound and outbound. S3 as the durable storage layer, etcd for metadata. No proprietary RPC, no vendor SDK, no closed segment format sitting between your producers and your data. If we lose the deal, your data leaves on the same standards it arrived on.

Each commitment is a red line in the build. They are documented in the source, enforced in the integration pipeline, and visible at the API. A buyer who wants to verify them does not need a sales call. The repository is at github.com/KafScale/platform.

The system

Producers in. Brokers stateless. S3 owns the data.

KafScale accepts Kafka-protocol traffic from any standard client through a single proxy endpoint, flushes immutable .kfs segments to S3, and serves both real-time consumers through brokers and analytical workloads through direct-from-S3 processors. Streaming and analytics share the data; they never compete for the same compute.

PRODUCERS
Kafka clients
any library
any version
no SDK rewrite
PROXY
one DNS endpoint
metadata rewrite
topology hidden
topology-stable clients
BROKERS · 0..N
stateless pods
no local disk
HPA autoscale
flush to S3
S3
.kfs segments
11 nines durability
immutable
source of truth
↓ produce
→ flush
Processors  ·  Iceberg sinks · SQL engines · AI agent retrievers  →  read .kfs segments directly from S3, bypassing brokers entirely. Real-time and batch never contend for the same resources.
PRODUCTION TRANSPORT

Replace broker disks. Keep your clients.

For teams running Kafka with multi-terabyte partition disks, daily on-call pages from rebalance storms, and ETL pipelines that don't need sub-10ms latency. KafScale takes the load off broker disks and puts retention on S3 economics. Producers and consumers connect with their existing libraries.

▸ quickstart guide
BACKUP · DR · ISOLATION

A second spine that does not share fate with production.

For teams who run Confluent or Apache Kafka in production and need a parallel cluster for disaster recovery, agentic AI workloads, replay testing, or regulated workload separation. kaf-mirror live-syncs your production cluster to KafScale. Production stays untouched.

▸ see the pattern below
AGENTIC SPINE

The memory layer for autonomous agents.

For teams building agentic AI where decisions, tool calls, and reasoning chains need to be replayable, auditable, and storable for years without a per-GiB tax. The immutable S3 log becomes the substrate that agents query, reconcile, and reason over — bypassing the broker path entirely.

▸ KafClaw runtime
Backup, DR, and agentic isolation

Mirror production. Run agents on a separate spine.

Most engineering teams running Confluent or Apache Kafka in production cannot put agentic AI workloads on the same cluster. Replay testing, prompt-history retention, batch reprocessing, and disaster-recovery drills all compete with the same brokers that serve real-time. KafScale plus kaf-mirror separates the two without changing a single line of producer code.

PRODUCTION
Confluent / Kafka / MSK
untouched
real-time SLA preserved
LIVE SYNC
kaf-mirror replication
topic + offset preserving
per-topic regex mapping
KAFSCALE
parallel stack
DR cluster · agent workloads
replay · long retention

Production never sees agent load.

AI agents, batch reprocessors, and replay tests read from KafScale. The brokers serving your real-time traffic do not contend with reasoning chains that pull months of history.

Backup-DR without a vendor lock-in.

If your primary cluster goes down or your renewal terms shift, KafScale is already a warm standby on S3. Cluster Linking and MirrorMaker 2 are alternatives; KafScale is the one with no per-GiB licensing and no proprietary control plane.

Long retention without the storage tax.

Standard broker-resident retention for prompt histories, agent decisions, and tool call logs gets expensive at multi-month scale. S3 economics let you keep years of agent state for the price of object storage, which compounds in your favour as agent workloads grow.
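The economics are simple enough to work out on the back of an envelope. The prices below are loudly assumed list prices (roughly S3 Standard at $0.023/GiB-month versus ~$0.08/GiB-month block volumes at replication factor 3); substitute your region and negotiated rates. The point is the slope, not the cents.

```python
# Back-of-envelope retention cost. All prices are assumptions for
# illustration: ~$0.023/GiB-month object storage vs ~$0.08/GiB-month
# block storage replicated 3x on broker disks.
def monthly_cost_usd(tib: float, price_per_gib: float, copies: int = 1) -> float:
    return tib * 1024 * price_per_gib * copies

retained_tib = 50  # e.g. a year of agent decisions and tool-call logs

s3_cost = monthly_cost_usd(retained_tib, 0.023)              # one durable copy
broker_cost = monthly_cost_usd(retained_tib, 0.08, copies=3)  # RF=3 on disks

# Object storage wins by roughly an order of magnitude at this scale,
# and the gap widens as retention grows.
assert s3_cost < broker_cost / 10
print(f"S3: ${s3_cost:,.0f}/mo  vs  broker disks: ${broker_cost:,.0f}/mo")
```

Because the S3 copy is the source of truth rather than a tier, there is no second retention knob to reconcile against broker-local cleanup.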

Compliance boundaries stay clean.

Mirror only the topics that need to leave production. Apply different retention, different ACLs, different encryption keys to the agent spine. The kaf-mirror layer is the policy boundary, not a downstream afterthought.
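Per-topic selection with renaming is the core of that policy boundary. The sketch below shows the shape of such a rule set; the rule syntax is hypothetical, not kaf-mirror's actual configuration format.

```python
# Illustrative per-topic regex mapping of the kind a mirror layer applies.
# The rule structure is hypothetical, not kaf-mirror's real config syntax.
import re
from typing import Optional

RULES = [
    (re.compile(r"^agent\.(.+)$"), r"mirror.agent.\1"),  # mirrored and renamed
    (re.compile(r"^audit\..+$"),   None),                # mirrored as-is
]

def mirror_target(topic: str) -> Optional[str]:
    """Return the destination topic name, or None if the topic stays put."""
    for pattern, rename in RULES:
        if pattern.match(topic):
            return pattern.sub(rename, topic) if rename else topic
    return None  # anything unmatched never leaves production

assert mirror_target("agent.decisions") == "mirror.agent.decisions"
assert mirror_target("audit.access") == "audit.access"
assert mirror_target("payments.core") is None  # stays on the production cluster
```

The default-deny stance matters: a topic leaves production only because a rule says so, which is what makes the mirror layer auditable as a compliance boundary.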

Scalytics is a Confluent partner. KafScale is positioned as a complementary spine for workloads where production isolation is required — not as a wholesale Confluent replacement. The kaf-mirror project is open source: github.com/scalytics/kaf-mirror.

Honest comparison · April 2026

Where KafScale fits, and where it doesn't.

KafScale is not a drop-in for every Kafka workload. The table below summarises the architectural and licensing tradeoffs that actually matter when you're choosing what runs your streaming spine for the next five years. The full comparison, including cost snapshots and feature parity, lives at kafscale.io/comparison.

Platform                   | License      | Storage model                 | Self-hostable              | Best for
KafScale                   | Apache 2.0   | S3 only, etcd metadata        | Yes, no control plane      | BDR, agentic AI, ETL, regulated workloads
Apache Kafka               | Apache 2.0   | Local disk, replicated        | Yes                        | Sub-10ms latency, transactional workloads
Confluent Platform / Cloud | Proprietary  | Local disk + tiered S3        | Platform edition only      | Full Kafka ecosystem, ksqlDB, Connect
WarpStream                 | Proprietary  | S3 + Confluent control plane  | BYOC only                  | BYOC logging, observability
Redpanda                   | BSL 1.1      | Local disk + S3 tiering       | Yes, with BSL terms        | Low-latency Kafka replacement
AutoMQ                     | Apache 2.0*  | S3 + EBS write-ahead log      | Yes                        | Kafka migration with low latency
Bufstream                  | Proprietary  | S3 + PostgreSQL metadata      | Yes, with usage fees       | Lakehouse, Protobuf-first pipelines

WarpStream was acquired by Confluent in September 2024 and is now closed source. IBM announced its acquisition of Confluent in December 2025, expected to close mid-2026. AutoMQ converted from BSL to Apache 2.0 in May 2025. Among Kafka-compatible streaming platforms shipping in April 2026, KafScale is the only one that is S3-native, stateless, fully self-hostable, and Apache 2.0 licensed without an asterisk. It targets the eighty percent of workloads that do not require sub-10ms latency or exactly-once transactions.

Architecture commitments

Red lines, in plain text.

A buyer running streaming for a regulated process, a BDR strategy, or an agentic platform reads architectural commitments more carefully than they read benchmarks. Each line below is enforced in the build pipeline, documented in the source, and visible at the protocol layer.

  • Brokers hold no durable state. S3 is the source of truth. There is no broker disk to lose, drain, or rebalance.
  • The proxy presents one DNS endpoint. Standard Kafka clients connect without modification. Topology changes are invisible to producers and consumers.
  • The .kfs segment format is part of the public specification. Processors read directly from S3 without going through brokers.
  • No vendor control plane. The cluster operates on your Kubernetes, your S3, your etcd. There is no external dependency that can revoke access or change pricing.
  • The license is Apache 2.0. No BSL conversion clauses, no usage-based fees, no commercial-use restrictions. Forever.
  • Streaming and analytical workloads never compete for compute. Brokers serve the Kafka protocol. Processors read S3 directly. Two paths, one data set.
  • If you stop using KafScale, your data leaves on the same standards it arrived on. Kafka protocol out, S3 segments in your bucket. No vendor extraction step.
Deployment posture

Runs where your platform runs.

KafScale is designed for environments where a vendor cloud dependency is a dealbreaker. No external service calls in the data path. No telemetry back to the maintainers. The deployment unit is a pod set on your Kubernetes cluster: sovereignty and network boundaries stay your decision.

Footprint

A Kubernetes cluster you already run, an S3 bucket, and an etcd ensemble. That is the entire dependency list. Zero phone-home, zero external auth, zero hosted control plane.

Operator and CRDs

The KafScale operator manages clusters and topics through standard CRDs. kubectl apply -f topic.yaml creates a topic. HPA handles broker scaling without partition reassignment.
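A topic manifest of the kind the operator consumes might look like the sketch below. The apiVersion, kind, and field names are illustrative assumptions, not the published CRD schema; check the repository for the real definitions.

```yaml
# topic.yaml -- hypothetical manifest; field names are illustrative,
# not the published KafScale CRD schema.
apiVersion: kafscale.io/v1alpha1
kind: KafkaTopic
metadata:
  name: agent-decisions
spec:
  partitions: 12
  retention: 8760h   # a year of history, at S3 prices
```

Applying it is the same `kubectl apply -f topic.yaml` the operator workflow above describes; reconciliation is the operator's job, not a runbook's.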

Storage

Any S3-compatible object store: AWS S3, MinIO, GCS via S3 API, Azure Blob via S3 API, on-prem Ceph. Lifecycle policies enforce retention. There is no second tier to manage.
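Where the store is AWS S3, retention enforcement is a standard lifecycle configuration. The JSON below uses the real S3 lifecycle schema; the `topics/` prefix and 365-day window are assumptions about how segments might be keyed, not KafScale defaults.

```json
{
  "Rules": [
    {
      "ID": "expire-kfs-segments",
      "Filter": { "Prefix": "topics/" },
      "Status": "Enabled",
      "Expiration": { "Days": 365 }
    }
  ]
}
```

Equivalent policies exist on MinIO and Ceph, so retention stays a property of the bucket rather than of any broker.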

Backup and restore

S3 versioning, replication, and snapshots are the backup story. There is no broker-local state to dump, replay, or reconcile after a node failure. Recovery is a pod restart.

Air-gapped deployment

The data path has no external dependencies. KafScale runs in environments without internet egress, behind VPN, or in classified networks. Container images mirror to your registry; the cluster never reaches out.

Source access

The repository is at github.com/KafScale/platform. Inspect, fork, audit before purchase. There is no enterprise-only branch or hidden module.

Procurement questions

What buyers ask before a first briefing.

The seven questions platform leads, OT security architects, and CTOs raise in every early call. Answered here so the briefing time can go to your actual problem.

Is KafScale a drop-in replacement for our existing Confluent or Apache Kafka cluster?
No, and we recommend against framing it that way. KafScale is Kafka protocol-compatible for produce, consume, and consumer-group flows, but it does not implement transactions, compacted topics, or sub-10ms latency. It is the right choice for ETL, logs, async events, agentic workloads, and BDR mirrors of an existing cluster. For low-latency transactional workloads, run Apache Kafka, Confluent, or AutoMQ. Many KafScale deployments live alongside a production Kafka cluster, not in place of it.
How does KafScale work with our existing Confluent cluster?
kaf-mirror live-syncs topics from your production Kafka or Confluent cluster to KafScale using the franz-go client library. Production traffic is unaffected. AI agents, replay testing, batch reprocessors, and DR drills run on the KafScale side. You select which topics mirror, with what retention, under what ACLs. Scalytics is a Confluent partner; this pattern is designed to coexist, not displace.
What does KafScale require to run?
A Kubernetes cluster (1.24+), an S3-compatible object store, and an etcd ensemble. The KafScale operator deploys via Helm. Standard Kafka clients connect to a single proxy endpoint. There is no external control plane, no vendor cloud account, and no phone-home telemetry.
Is the source available before purchase?
Yes. The repository is open at github.com/KafScale/platform under Apache 2.0. Specifications for the .kfs segment format, the operator CRDs, and the supported Kafka API surface are all public. There is no enterprise-only branch.
What are the realistic latency expectations?
Steady-state p99 produce latency is approximately 200–500ms because writes are not acknowledged until the segment commits to S3. This is acceptable for ETL, logs, async events, and agentic workloads. It is not acceptable for synchronous trading or fraud-detection paths that require sub-10ms acknowledgement. The latency tradeoff is the price of broker statelessness.
Can KafScale operate fully air-gapped?
Yes. The data path makes no external service calls. Container images mirror to your registry. Operator, brokers, and processors run inside your network boundary. The only requirements are a reachable S3 endpoint and a reachable etcd ensemble, both of which can be on-premise or inside a SCIF.
What does Scalytics provide on top of the open-source project?
Architecture review for streaming and BDR strategies, hardening for regulated environments, integration with Apache Wayang for federated execution across the same data, integration with Scalytics Copilot for private LLM hosting alongside agent workloads, and 24/7 enterprise support contracts. The KafScale repository remains Apache 2.0 and the project's roadmap is independent of any specific support engagement.
Where this is in the build

Foundation shipped. Processors in flight. Design partners welcome.

The technical buyer reads this section more carefully than the case studies. A platform that admits what is shipped and what is in flight is one the buyer can plan against. A platform that claims everything is ready is one the buyer assumes is hiding the integration cost.

SHIPPED

Foundation

Stateless brokers, S3-native segment storage, etcd metadata, single-endpoint proxy, Kubernetes operator with cluster and topic CRDs, Kafka APIs covering produce, fetch, consumer groups, and topic administration. Helm chart and quickstart for kind clusters.

IN FLIGHT

Data Processors

The Iceberg processor is shipping incrementally with Unity Catalog, Polaris, and AWS Glue support. The KAFSQL processor (Postgres-compatible SQL over .kfs segments) is also available and in use.

OPEN

Design partners

We are working with platform teams running Confluent or Apache Kafka in production who want to add a KafScale spine for BDR, agentic workloads, or long-retention analytics. If your gap is the gap described on this page, request a briefing.

Who builds this

Built by the team that originally created Apache Wayang.

KafScale is developed and maintained by Scalytics, the company founded by the original inventors of Apache Wayang (now an Apache Top-Level Project for federated data processing). The same architectural philosophy - compute where the data lives, on standards you can audit - runs through every part of KafScale.

The team includes Apache committers, former Confluent and Cloudera engineers, and operators from regulated enterprise environments. Scalytics is an active Confluent partner; KafScale is positioned as a complementary spine for workloads where production isolation, Apache 2.0 licensing, or agentic AI infrastructure are the deciding constraints.

The team's experience includes Allianz · E.ON · Cloudera · Scout24 · Confluent · McKinsey

Request a technical briefing.

45 minutes. Architecture, BDR pattern, kaf-mirror integration, agentic workload separation, design partner terms. Bring your hardest streaming question. Or read the source first.