Scalytics | The Scalytics Story: Our Mission & Founding Team

April 21, 2023

Scalytics is modernizing how organizations process, govern, and operationalize data in regulated and distributed environments. Our platform, Scalytics Federated, enables analytics, machine learning, and AI to run directly where data resides. Instead of moving, copying, or centralizing information, the platform executes computation across existing systems and infrastructure. This helps organizations reduce complexity, strengthen compliance, and unlock value from data that traditionally remained siloed.

We are the team behind Scalytics: Zoi, Alexander, Kaustubh, and Mirko.

A few words about us, our story, and why we are here:

Zoi Kaoudi
Zoi leads our research vision in distributed systems, machine learning, and federated data processing. She is an Associate Professor at the IT University of Copenhagen and the principal architect behind Apache Wayang, the first cross-platform data processing system. Her work on cross-platform optimization introduced concepts that shaped modern data fabrics and continue to influence federated AI research.

Alexander Alten
Alexander contributes more than 15 years of experience in big data, distributed systems, AI, and IoT. His background spans regulated sectors including energy, finance, and healthcare, as well as roles with Cloudera and other data platform providers. Having worked across organizations where regulatory pressure, data fragmentation, and operational constraints collide, he brings the practical perspective that guides our product direction.

Kaustubh Beedkar
Kaustubh is a founding engineer and CTO. He earned his Ph.D. at the Max Planck Institute and the University of Mannheim, publishing research in top-tier venues. He built the federated SQL layer of Apache Wayang and leads the architectural development of Scalytics Federated. Kaustubh also teaches data management at the Indian Institute of Technology, Delhi.

Mirko Kämpf
Mirko contributes deep experience in distributed data, large-scale systems, and enterprise engineering through senior roles at Cloudera, Confluent, and ecolytiq. His expertise bridges open-source practice, data platform architecture, and real-world operational scaling.

In memory of Jorge
Jorge began exploring federated data processing and distributed AI in 2015. He co-developed early prototypes of what later became Apache Wayang and presented this work internationally. Jorge passed away unexpectedly in 2023. His contributions remain fundamental to how Scalytics Federated operates today.

Why We Started Scalytics

Scalytics emerged from a shared frustration with the limitations of traditional data architectures. Across industries we saw the same pattern: excessive data movement, costly ETL pipelines, duplicated systems, and growing regulatory pressure. Teams spent more time integrating platforms than analyzing data. Modern organizations were accumulating more technology but gaining less agility.

We built Apache Wayang to address these challenges programmatically. It introduced a clean abstraction between analytical logic and execution engines, enabling applications to run on Spark, Flink, Postgres, Java, or Python without rewriting code. This work became the technical foundation for Scalytics Federated.
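The separation between analytical logic and execution engine can be illustrated with a minimal sketch. This is not the Apache Wayang API: the `SumWhere` plan, the two backend classes, and the `payments` table are hypothetical names chosen for illustration, and plain Python and SQLite stand in for the engines named above. The point is only that one logical description of a query can be handed to different engines unchanged.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SumWhere:
    """Hypothetical logical plan: sum value_col over rows where filter_col >= threshold."""
    table: str
    value_col: str
    filter_col: str
    threshold: float


class PythonBackend:
    """Executes the plan over in-memory rows (dict name -> list of dict rows)."""

    def __init__(self, tables):
        self.tables = tables

    def execute(self, plan):
        rows = self.tables[plan.table]
        return sum(r[plan.value_col] for r in rows
                   if r[plan.filter_col] >= plan.threshold)


class SqliteBackend:
    """Executes the same plan by translating it to SQL for a SQLite connection."""

    def __init__(self, conn):
        self.conn = conn

    def execute(self, plan):
        # Identifiers come from the trusted plan object; the threshold is bound as a parameter.
        sql = (f'SELECT COALESCE(SUM("{plan.value_col}"), 0) '
               f'FROM "{plan.table}" WHERE "{plan.filter_col}" >= ?')
        return self.conn.execute(sql, (plan.threshold,)).fetchone()[0]


# Illustrative data, loaded into both "engines".
rows = [
    {"amount": 10.0, "risk": 1},
    {"amount": 25.0, "risk": 3},
    {"amount": 40.0, "risk": 5},
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (amount REAL, risk INTEGER)")
conn.executemany("INSERT INTO payments VALUES (?, ?)",
                 [(r["amount"], r["risk"]) for r in rows])

# The analytical logic is written once and is unaware of the engine.
plan = SumWhere(table="payments", value_col="amount",
                filter_col="risk", threshold=3)

python_result = PythonBackend({"payments": rows}).execute(plan)
sqlite_result = SqliteBackend(conn).execute(plan)
```

Both backends receive the same `plan` object, so swapping engines does not require rewriting the analytical logic. A real cross-platform system additionally chooses the engine through an optimizer, which this sketch leaves out.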

Our goal was straightforward: make distributed data usable without forcing organizations into centralization strategies or vendor-locked platforms.

The same fragmentation is visible in Matt Turck's Data & AI Landscape, which has expanded each year as tools and architectures multiply.

The Data and AI Landscape 2020, by Matt Turck

What We Built

Scalytics Federated brings together distributed data, heterogeneous processing systems, and modern AI into one execution and governance layer. It empowers teams to:

run analytics and AI directly on operational data across different platforms
minimize redundant ETL and avoid building new data silos
apply consistent governance and compliance across distributed systems
modernize existing architectures without replacing them
accelerate development by abstracting applications from underlying processing engines

The system operates across data lakes, warehouses, operational stores, and edge platforms. By unifying them at the execution layer, organizations can modernize their data strategy without disrupting their infrastructure.
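The idea of running computation where the data resides can be sketched as a federated aggregate. This is an illustrative assumption about the general pattern, not a description of how Scalytics Federated is implemented: the site names, `PartialMean`, `local_partial`, and `combine` are all hypothetical. Each site computes a small summary locally, and only those summaries are combined centrally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialMean:
    """Summary that leaves a site: a running total and a row count, never raw rows."""
    total: float
    count: int


def local_partial(values):
    """Runs at the site that holds the data and returns only the two-number summary."""
    return PartialMean(total=float(sum(values)), count=len(values))


def combine(partials):
    """Runs at the coordinator and merges per-site summaries into a global mean."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    if count == 0:
        raise ValueError("no rows at any site")
    return total / count


# Hypothetical sites; in practice each list would live in a separate system.
sites = {
    "warehouse_a": [4.0, 6.0],
    "lake_b": [10.0],
    "edge_c": [],
}

partials = [local_partial(values) for values in sites.values()]
global_mean = combine(partials)
```

The coordinator sees three `PartialMean` objects rather than the underlying rows, yet `global_mean` equals the mean of the pooled data. Aggregates that decompose this way (sums, counts, means) fit the pattern directly; others, such as exact medians, need more elaborate protocols.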

Our Vision

We're taking on the data market players that have deliberately built segregated products to lock clients into a single solution, hindering data cooperation and complicating compliance with data regulations. We've experienced the frustration, exhaustion, and anger when initiatives fail because of incompatibility, rising costs, and dependence on limited technology. We know what it's like to be under pressure to solve real-world data problems while no one is willing to step up. That's what drives us: the determination to revolutionize how we all work with data.

Data architectures have grown complex because the market rewarded siloed products and isolated ecosystems. This has made interoperability difficult and compliance harder. Our vision is to give organizations control over their data processing strategy by providing a neutral, federated, and extensible foundation.

We believe distributed, regulation-aligned processing is the future of enterprise AI. Scalytics Federated is designed to help organizations work with their data where it already is, unlock value without unnecessary movement, and support the next generation of decentralized AI systems.

Alexander, Zoi, Kaustubh, Mirko

About Scalytics

Scalytics architects mission-critical streaming, federated execution, and sovereign AI systems. We help defense, infrastructure, and regulated organizations turn real-time data streams into trusted decisions, reliably and under production load.
Our founding team created Apache Wayang, the federated execution framework that lets computation run where the data lives and dramatically reduces unnecessary data movement.
We also built and maintain kafSCALE, a high-performance, Kafka-compatible streaming platform designed for Kubernetes and object storage. It delivers elastic scale without broker complexity or lock-in.

Our mission: Keep data in place. Bring compute to the data. Enable secure, sovereign, and production-ready AI operations.