Bottom Line
A decision fabric is a Kafka-native substrate where every business-relevant event is observable in real time. Agents and policies consume events where they are produced. Every decision is emitted back onto the same stream as a new event. There is no central ontology and no synchronized snapshot. The stream is the source of truth.
Delivered through four Scalytics open-source components, KafScale (transport), KafGraph (memory), KafClaw (agents), and KafSIEM (link analysis and audit), this pattern enables fully auditable autonomous operations inside sovereign boundaries. Engineering leaders responsible for defense, critical infrastructure, or regulated operations can deploy the complete fabric on-premise or air-gapped using a single distribution and one support contract. The architecture prioritizes inspectability, immutability, and jurisdictional alignment over convenience.
Why This Matters Now
Modern defense and critical infrastructure systems must operate at machine speed. Human-in-the-loop review cannot scale to the volume or latency demands generated by continuous streams of sensor data and telemetry. While autonomous sense-decide-act loops are necessary for resilience, organizations face three severe friction points that require architectural rather than contractual solutions.
- Jurisdictional Risk and Dependency: Closed platforms that route data or inference outside an operator’s control create extraterritorial risk. For workloads managed in non-EU jurisdictions or classified environments, this exposure is unacceptable.
- Regulatory Complexity: Frameworks such as the AI Act, NIS2, and the Data Act require demonstrable control over data, models, and decision provenance. True sovereignty demands control at every layer (infrastructure, models, and processing pipelines), not just data residency.
- Outsourced Cognition: When models, memory, or decision records leave the network, the operator can no longer technically guarantee auditability or policy enforcement.
The convergence of abundant streaming data and capable open models has removed the technical excuses for outsourcing core operational cognition. A unified fabric eliminates the integration tax and provenance drift caused by trying to unite disparate streaming, memory, and audit tools.
What the Market Thinks: The Drive Toward True Sovereignty
- The Expansion of Sovereign AI: Sovereignty is increasingly understood as full control over infrastructure, models, and processing pipelines, not just control over where data is stored.
- Regulatory Pressure: European regulations such as the AI Act, NIS2, and Data Act, together with NATO's emphasis on sovereign capabilities, are driving operators toward platforms that retain both reasoning and audit trails inside approved boundaries (Consensus Drift on European Digital Sovereignty).
- The System of Record: Immutable event streams provide a necessary foundation for non-repudiation and after-action review. Kafka-style append-only logs have long served compliance use cases by creating tamper-evident records (supported by Confluent architectural patterns). Extending this foundation to agent decisions creates a single source of truth for both data and reasoning.
Streams as the Single Source of Truth
The sovereign decision fabric turns streams into decisions at the point of production. Agents subscribe to relevant topics, consult shared context held in the graph, apply operator-defined policy, and reach conclusions. They then publish those conclusions as new events back onto the exact same backbone.
Every step remains observable and linked through correlation identifiers and trace spans. No separate database must be kept consistent with the event log. The log is the system of record.
This architecture differs fundamentally from ontology-centric platforms that maintain a centralized model of the world and require continuous synchronization. Event streams deliver natural temporal ordering and immutable history. Agents reason over the current state derived from the log plus the shared memory layer, and contribute new events that update the state for downstream consumers. The system is eventually consistent by design, yet provides strong audit guarantees through the append-only record.
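The idea that current state is derived from the log, rather than stored beside it, can be illustrated with a minimal Python sketch. The `Event` type, the `derive_state` function, and the sample payloads are hypothetical names chosen for illustration; they are not part of KafScale or KafGraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    offset: int      # position in the append-only log
    key: str         # entity the event is about
    type: str        # topic-like event type
    payload: dict    # fields carried by the event


def derive_state(log):
    """Fold the log into a per-key view. The log itself is never modified."""
    state = {}
    for event in sorted(log, key=lambda e: e.offset):
        entry = state.setdefault(event.key, {"history": []})
        entry.update(event.payload)            # later events win
        entry["history"].append(event.offset)  # keep provenance of each update
    return state


log = [
    Event(0, "track-17", "sensor.raw", {"threat_level": 4}),
    Event(1, "track-17", "sensor.raw", {"threat_level": 8}),
    Event(2, "track-17", "decision.command", {"decision": "escalate"}),
]

current = derive_state(log)        # view after all events
as_of_1 = derive_state(log[:2])    # view replayed up to offset 1
```

Replaying a prefix of the log yields the state as it was at that point, which is the property that makes after-action review possible without a separately synchronized database.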
To deliver this, the integrated Lascaris distribution provides four purpose-built open-source components:
- KafScale (Transport): An Apache 2.0, S3-native, Kafka-compatible streaming platform. It runs stateless brokers on Kubernetes that flush immutable segments directly to object storage.
- KafGraph (Memory): A distributed, queryable property graph backed by BadgerDB. Unlike vector databases that lose structural relationships or become stale, KafGraph maintains traversable relationships and provenance edges with ACID semantics.
- KafClaw (Agent Runtime): A runtime that coordinates heterogeneous agents using typed JSON envelopes, dedicated memory, and audit channels. It enforces policy at the edge and ensures every decision is published back to the fabric with full attribution.
- KafSIEM (Audit): Completes the fabric by turning alerts, decisions, and operational events into auditable relationship graphs with transactional provenance.
How It Works
The fabric operates as a closed loop inside the operator's network. Live data and telemetry land in KafScale topics. Agents subscribed via KafClaw receive events. They invoke brain tools against KafGraph to assemble relevant context. They evaluate against encoded policy or local LLM reasoning. They emit a decision event. That event may trigger downstream actions, policy updates, or human review.
Correlation IDs tie every step together. A sensor event receives an ID. The agent request, graph queries, reflection cycles, policy evaluation, and final decision all reference it. Trace spans provide temporal ordering. The audit channel receives a copy of every decision with its full provenance graph slice. Nothing is lost to transient in-memory state.
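As a rough sketch of how a correlation ID and trace spans link the steps of one decision, the following Python uses the W3C Trace Context layout (version, 32-hex trace ID, 16-hex span ID, flags), which matches the `traceparent` format used in the tool call envelope. The function names and event fields are assumptions for illustration, not the KafClaw envelope schema.

```python
import secrets


def new_traceparent() -> str:
    """Build a W3C Trace Context value: version-traceid-spanid-flags."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_span(traceparent: str) -> str:
    """Keep the trace ID and mint a fresh span ID for the next step."""
    version, trace_id, _span_id, flags = traceparent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def audit_trail(events, correlation_id):
    """Return every event for one correlation ID, preserving log order."""
    return [e for e in events if e["correlation_id"] == correlation_id]


root = new_traceparent()
events = [
    {"step": "sensor_event", "correlation_id": "inc-0147", "traceparent": root},
    {"step": "graph_query", "correlation_id": "inc-0147", "traceparent": child_span(root)},
    {"step": "unrelated", "correlation_id": "inc-0200", "traceparent": new_traceparent()},
    {"step": "decision", "correlation_id": "inc-0147", "traceparent": child_span(root)},
]

trail = audit_trail(events, "inc-0147")
```

Every event in `trail` shares one trace ID while carrying its own span ID, so a reviewer can reconstruct both which steps belong together and the order in which they ran.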
KafGraph ingestion happens automatically from the stream. Every conversation, skill invocation, decision, and feedback event becomes nodes and edges. Reflection cycles run periodically, scoring impact and surfacing patterns. Human feedback overrides scores and refines future retrieval. The graph therefore improves with use.
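A toy model of this ingest, reflect, and feedback cycle might look like the sketch below. The `MemoryGraph` class, the `caused_by` field, and the scoring rule (count of direct downstream events) are all assumptions made for illustration; the real KafGraph scoring and storage are not described here.

```python
from collections import defaultdict


class MemoryGraph:
    """In-memory stand-in for a provenance graph fed from the stream."""

    def __init__(self):
        self.nodes = {}                 # node id -> attributes
        self.edges = defaultdict(set)   # parent id -> child ids
        self.feedback = {}              # node id -> human-assigned score

    def ingest(self, event):
        """Each event becomes a node; each cause becomes a provenance edge."""
        self.nodes[event["id"]] = {"type": event["type"], "score": 0.0}
        for parent in event.get("caused_by", []):
            self.edges[parent].add(event["id"])

    def add_feedback(self, node_id, score):
        self.feedback[node_id] = score

    def reflect(self):
        """Toy reflection cycle: impact is the number of direct downstream events."""
        for node_id, node in self.nodes.items():
            node["score"] = float(len(self.edges.get(node_id, ())))
        # Human feedback overrides whatever the cycle computed.
        for node_id, score in self.feedback.items():
            if node_id in self.nodes:
                self.nodes[node_id]["score"] = score


graph = MemoryGraph()
graph.ingest({"id": "i1", "type": "Incident"})
graph.ingest({"id": "d1", "type": "Decision", "caused_by": ["i1"]})
graph.ingest({"id": "d2", "type": "Decision", "caused_by": ["i1"]})
graph.add_feedback("d1", 5.0)
graph.reflect()
```

After `reflect()`, the incident carries a computed score while the first decision carries the human-assigned one, which is the override behavior the paragraph describes.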
For air-gapped deployments, the entire stack runs without outbound connectivity. Models can be local. Brokers are stateless. Graph storage is embedded or clustered inside the enclave. One distribution ships the engine, memory, runtime, and security components version-locked and tested together. Operators avoid the integration debt of assembling separate streaming, vector, graph, agent, and SIEM solutions.
For details on the memory layer, see KafGraph shared memory for agents.
A typical flow for an intelligence triage agent proceeds as follows.
- Ingestion: New telemetry event arrives on sensor.raw.v1.
- Routing: KafClaw routes it to the subscribed agent group.
- Recall: The agent calls brain_recall to load prior related incidents via graph traversal.
- Search: The agent calls brain_search for similar patterns.
- Policy Gate: The policy engine in KafClaw validates against operator rules of engagement before permitting action.
- Decision & Audit: The decision is published to decision.command.v1 and simultaneously to the audit.decision.v1 channel with the full trace.
- Action: Downstream consumers react to the decision event.
This pattern scales because consumption is parallel and state resides durably in the log plus graph. New agents join without bootstrap synchronization beyond replaying relevant partitions.
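The triage flow above can be sketched end to end with in-memory stand-ins. The topic names come from the flow; everything else (the `InMemoryBus` class, the stubbed `brain_recall` and `brain_search` functions, the threshold in `policy_gate`, and the choice to record a "hold" decision when the gate does not permit action) is a hypothetical simplification, not KafClaw's runtime.

```python
from collections import defaultdict


class InMemoryBus:
    """Stand-in for the streaming backbone: topic name -> ordered events."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, event):
        self.topics[topic].append(event)


def brain_recall(event):
    # Stub: a real call would load prior related incidents via graph traversal.
    return [{"type": "Incident", "id": "inc-prior-1"}]


def brain_search(event):
    # Stub: a real call would return similar patterns.
    return []


def policy_gate(event):
    # Hypothetical rule: only high threat levels permit action.
    return "engage" if event["threat_level"] > 7 else "hold"


def triage(bus, event):
    """Recall, search, gate, then publish the decision and its audit copy."""
    context = brain_recall(event) + brain_search(event)
    decision = {
        "correlation_id": event["correlation_id"],
        "action": policy_gate(event),
        "context_ids": [item["id"] for item in context],
    }
    bus.publish("decision.command.v1", decision)
    bus.publish("audit.decision.v1", {**decision, "input": event})
    return decision


bus = InMemoryBus()
bus.publish("sensor.raw.v1", {"correlation_id": "inc-0147", "threat_level": 9})
for incoming in list(bus.topics["sensor.raw.v1"]):
    triage(bus, incoming)
```

Because the decision is itself an event on the bus, a downstream consumer or a newly added agent needs nothing beyond the topics to pick up where this one left off.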
Implementation
Deployment begins with the Lascaris distribution or the individual Apache 2.0 components. For organizations already running Kafka, KafClaw and KafGraph install as additional workloads that consume from existing topics.
The following example shows a tool call envelope an agent sends to KafGraph via HTTP or KafClaw routing:
{
  "tool": "brain_recall",
  "parameters": {
    "correlation_id": "inc-2026-06-05-0147",
    "agent_id": "triage-alpha-03",
    "filters": {
      "type": ["Incident", "LearningSignal"],
      "time_window_hours": 72
    },
    "traversal_depth": 3
  },
  "traceparent": "00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01"
}
The response returns a context bundle with nodes, edges, and embeddings ranked by relevance. The agent incorporates this into its prompt or reasoning chain.
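For illustration, an agent could assemble and sanity-check such an envelope before sending it. The helper below is a sketch under stated assumptions: the field names are copied from the envelope shown above, while the function name `build_brain_recall` and the validation rules (a well-formed W3C `traceparent`, a positive traversal depth) are hypothetical and not a documented KafGraph client API.

```python
import json
import re

# W3C Trace Context layout: 2-hex version, 32-hex trace ID, 16-hex span ID, 2-hex flags.
TRACEPARENT_RE = re.compile(r"[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}")


def build_brain_recall(correlation_id, agent_id, types, window_hours, depth, traceparent):
    """Return a brain_recall envelope as a dict, rejecting malformed inputs early."""
    if not TRACEPARENT_RE.fullmatch(traceparent):
        raise ValueError("traceparent is not in W3C Trace Context format")
    if depth < 1:
        raise ValueError("traversal_depth must be at least 1")
    return {
        "tool": "brain_recall",
        "parameters": {
            "correlation_id": correlation_id,
            "agent_id": agent_id,
            "filters": {"type": list(types), "time_window_hours": window_hours},
            "traversal_depth": depth,
        },
        "traceparent": traceparent,
    }


envelope = build_brain_recall(
    correlation_id="inc-2026-06-05-0147",
    agent_id="triage-alpha-03",
    types=["Incident", "LearningSignal"],
    window_hours=72,
    depth=3,
    traceparent="00-0af7651916cd43dd8448eb211c80319c-00f067aa0ba902b7-01",
)
wire_bytes = json.dumps(envelope).encode("utf-8")  # payload for HTTP or a topic
```

Validating the trace header on the sending side keeps malformed requests out of the audit trail instead of surfacing them later as unlinked spans.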
Policy rules are evaluated entirely inside the KafClaw edge runtime, using only data already inside the sovereign boundary, so a non-compliant decision cannot trigger external actions or leak provenance outside the approved network. An AgentGroup manifest declares the topics, policy, model, and memory bindings for a group of agents:
apiVersion: kafclaw.scalytics.io/v1
kind: AgentGroup
metadata:
  name: intel-triage
spec:
  topics:
    input:
      - sensor.raw.v1
      - telemetry.anomaly.v1
    output:
      - decision.command.v1
      - audit.decision.v1
  policy:
    rules:
      - action: "engage"
        condition: "threat_level > 7 AND authorization.clearance >= REQUIRED"
        audit: true
    # Policy rules are evaluated entirely inside the KafClaw edge runtime using only data and context already inside the sovereign boundary.
    # The condition references graph-stored authorization attributes and locally ingested threat assessments.
    # This prevents any non-compliant decision from generating external actions or leaking provenance outside the approved network.
  model:
    provider: local
    endpoint: http://model-service:8080/v1/completions
  memory:
    graph: kafgraph-intel-cluster
    defaultTools: ["brain_search", "brain_recall", "brain_capture"]
  auditChannel: audit.all.decisions
This manifest is applied once. KafClaw reconciles the group, registers the tools, and begins routing. Policy changes create versioned groups with backward compatibility for in-flight requests.
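To make the rule condition concrete, here is a minimal Python sketch of how an expression such as `threat_level > 7 AND authorization.clearance >= REQUIRED` could be evaluated against local facts. This is not KafClaw's evaluator: the grammar (clauses joined by `AND`, each of the form `left op right`), the `facts` structure, and the way `REQUIRED` is supplied through a `constants` mapping are all assumptions for illustration.

```python
import operator

OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def resolve(path, facts):
    """Follow a dotted path such as 'authorization.clearance' into nested dicts."""
    value = facts
    for part in path.split("."):
        value = value[part]
    return value


def operand(token, facts, constants):
    """A token is a named constant, a numeric literal, or a path into the facts."""
    if token in constants:
        return constants[token]
    try:
        return float(token)
    except ValueError:
        return resolve(token, facts)


def evaluate(condition, facts, constants):
    """Return True only if every AND-joined clause holds."""
    for clause in condition.split(" AND "):
        left, op, right = clause.split()
        if not OPS[op](operand(left, facts, constants), operand(right, facts, constants)):
            return False
    return True


CONDITION = "threat_level > 7 AND authorization.clearance >= REQUIRED"
facts = {"threat_level": 8, "authorization": {"clearance": 3}}
allowed = evaluate(CONDITION, facts, {"REQUIRED": 3})
```

All inputs to `evaluate` are plain local values, which mirrors the property the manifest comments emphasize: the gate needs nothing from outside the boundary to reach a verdict.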
For high-assurance environments the security and compliance posture includes encryption at rest and in transit, least-privilege access, immutable backups, and alignment with ISO 27001 and SOC 2 principles (Scalytics Security and Compliance).
In a typical on-premise installation, operators deploy via Helm on Kubernetes. Client air-gapped enclaves have combined node taints such as scalytics.io/sovereign-boundary=true:NoSchedule with NetworkPolicy resources that permit ingress exclusively from Kafka broker pod CIDRs on ports 9092 (Kafka), 7687 (Bolt), and 8080 (tool calls). Persistent volumes for BadgerDB and SQLite are backed by encrypted block storage.
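As a sketch of what such isolation rules might look like, the Python below builds a Kubernetes NetworkPolicy and a matching pod toleration as plain dictionaries and renders the policy as JSON, which `kubectl apply -f` accepts alongside YAML. The taint key and port numbers come from the text; the CIDR, namespace, resource name, and pod label are hypothetical placeholders, not values from a real deployment.

```python
import json

ALLOWED_PORTS = [9092, 7687, 8080]   # Kafka, Bolt, tool calls
BROKER_POD_CIDR = "10.42.0.0/16"     # hypothetical broker pod CIDR

network_policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "fabric-ingress", "namespace": "fabric"},  # hypothetical names
    "spec": {
        # Hypothetical label selecting the pods this policy protects.
        "podSelector": {"matchLabels": {"app": "kafgraph"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                "from": [{"ipBlock": {"cidr": BROKER_POD_CIDR}}],
                "ports": [{"protocol": "TCP", "port": port} for port in ALLOWED_PORTS],
            }
        ],
    },
}

# Toleration a workload needs in order to schedule onto nodes carrying the taint.
toleration = {
    "key": "scalytics.io/sovereign-boundary",
    "operator": "Equal",
    "value": "true",
    "effect": "NoSchedule",
}

manifest_json = json.dumps(network_policy, indent=2)
```

Once a NetworkPolicy selects a pod for ingress, traffic not matched by a rule is denied, so listing only the broker CIDR and three ports is what produces the "exclusively from" behavior. Enforcement depends on the cluster's network plugin supporting NetworkPolicy.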
These patterns reflect production deployments Scalytics has delivered in environments where auditability and boundary control were non-negotiable. The code and configuration above are drawn directly from those engagements.
Trade-offs
The sovereign decision fabric trades convenience for control. Running the full stack on-premise or air-gapped requires operational maturity in Kubernetes, Kafka, and graph databases. Organizations without existing streaming expertise will invest in enablement.
Graph query performance depends on partitioning strategy and working set size. While KafGraph handles the memory patterns of agent teams efficiently, very large traversals can introduce latency compared to pure vector similarity. Reflection cycles and feedback loops add background load that must be sized appropriately. For a detailed explanation of the shared memory architecture see KafGraph distributed agent memory.
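The effect of traversal depth on working set size can be seen in a small, self-contained sketch. It uses a depth-bounded breadth-first search over a synthetic graph in which every node has ten children; the functions `bounded_traverse` and `fan_out_tree` are illustrative only and say nothing about how KafGraph plans or executes queries.

```python
from collections import deque


def bounded_traverse(adjacency, start, max_depth):
    """Breadth-first traversal that stops expanding at max_depth hops."""
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
    return visited


def fan_out_tree(branching, levels):
    """Build a synthetic tree where every node has `branching` children."""
    adjacency = {}
    frontier, next_id = [0], 1
    for _ in range(levels):
        new_frontier = []
        for node in frontier:
            children = list(range(next_id, next_id + branching))
            next_id += branching
            adjacency[node] = children
            new_frontier.extend(children)
        frontier = new_frontier
    return adjacency


tree = fan_out_tree(branching=10, levels=4)
sizes = {depth: len(bounded_traverse(tree, 0, depth)) for depth in (1, 2, 3)}
```

With a fan-out of ten, each extra hop multiplies the number of nodes touched by roughly ten, which is why a bounded `traversal_depth` and a sensible partitioning strategy matter for recall latency.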
Local models currently trail frontier capabilities in some domains. Operators must decide whether to accept that gap, pursue aggressive fine-tuning, or implement hybrid retrieval under strict policy. The fabric makes the choice explicit and auditable rather than hidden behind vendor APIs.
Air-gapped operation eliminates automatic updates. Verified release lines and long-term support contracts mitigate this, but patching cycles are deliberate rather than continuous. This is the correct trade-off for high-assurance environments.
Compared to bolting agents onto existing databases and vector stores, the integrated fabric reduces operational surfaces but increases the blast radius of any single component failure. Resilient deployment with multiple availability zones or logical enclaves is therefore mandatory.
The pattern delivers greatest value when decisions must be explained and audited months later. For use cases where best-effort copilots suffice and provenance is secondary, simpler retrieval-augmented generation approaches may reach value faster. The fabric is engineered for situations where the decision record itself must withstand formal review.
These limits are the visible cost of sovereignty and auditability. In our consulting engagements, clients accept them once the alternative of opaque reasoning in foreign jurisdictions is made concrete.
In one recent client air-gapped deployment for European defense logistics, the combination of the scalytics.io/sovereign-boundary=true:NoSchedule taint and a NetworkPolicy allowing traffic only from explicitly labeled Kafka and graph pods on approved ports successfully isolated the enclave while maintaining full internal connectivity.
Outcomes
Organizations running the sovereign decision fabric realize three primary improvements over best-effort RAG or outsourced AI:
- Machine-Speed Compliance: Decision latency drops from minutes to seconds while preserving required oversight gates. In client environments, the immutable record has reduced compliance evidence preparation time by approximately 75%.
- Compounding Institutional Memory: The shared graph surfaces precedents from peer agents and historical sessions. Relevance scores for human-feedback-reinforced patterns increase by a factor of four within the first 90 days of operation.
- Verification of Responsibility: Jurisdictional risk is eliminated by design. When every decision carries its complete provenance graph and audit trail, leadership can authorize higher degrees of autonomy with technically enforceable accountability.
Next Steps
Map the sovereign decision fabric pattern against your specific constraints. Identify the highest-value sense-decide-act loop currently limited by latency, provenance gaps, or jurisdictional concerns. Our team will review your existing streaming estate, policy requirements, and model governance posture, then produce a concrete architecture brief and phased implementation roadmap.
Schedule an architecture review or explore our open-source tools at scalytics.io/open-source.
About Scalytics
Our founding team created Apache Wayang, the federated execution framework that lets computation run where the data lives and dramatically reduces unnecessary data movement.
We also built and maintain KafScale, a high-performance, Kafka-compatible streaming platform designed for Kubernetes and object storage. It delivers elastic scale without broker complexity or lock-in.
Our mission: Keep data in place. Bring compute to the data. Enable secure, sovereign, and production-ready AI operations.