Scalytics Connect Release Notes

Updates on new features and performance improvements

Scalytics Connect v. 1.2.0 / November 2024

Scalytics Core

Apache Wayang is at the heart of our products. Apache Wayang (Incubating) is the only open-source cross-platform data processing engine. Application developers specify their applications using the Apache Wayang API.

Scalytics unique features:
  • The AI-based optimizer automatically selects an optimal configuration of classic data processing frameworks, such as Java Streams or Apache Spark, on which applications are executed.
  • Blossom Core carries out program execution. It abstracts the various platform-specific APIs and coordinates cross-platform communication.
  • Applications can run on multiple data processing platforms without changing the native code of the underlying platforms.
  • Federated data processing: in-situ processing at different sites without moving raw data away from its origin.
  • Build and execute cross-platform machine learning pipelines in a unified way.
  • NEW: Federated Machine Learning
    • Federated analytics by integrating multiple platforms across silos
    • Developers: Train ML models using federated learning in a platform-agnostic way
  • NEW: Support for unsupervised learning (e.g., K-means) and the stochastic gradient descent (SGD) optimization technique for federated learning across supported data platforms
  • NEW: Compliance auditing (who accessed what, and when) and basic training audits
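To make the federated learning items above more concrete, here is a minimal sketch of federated averaging with local SGD in plain Python. It is not the Scalytics or Apache Wayang API: the function names, the two example sites, the learning rate, and the round counts are all assumptions chosen for illustration. The point it shows is that each site trains on its own data and only model parameters are exchanged.

```python
import random

def local_sgd(model, data, lr=0.05, epochs=5, seed=0):
    """Run plain SGD for y ~ w*x + b on one site's private data."""
    w, b = model
    rng = random.Random(seed)
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = (w * x + b) - y      # prediction error for this sample
            w -= lr * err * x          # gradient step for the slope
            b -= lr * err              # gradient step for the intercept
    return (w, b)

def federated_average(updates, sizes):
    """Average site models, weighting each by its number of local samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def train_federated(sites, rounds=200):
    """Only model parameters leave a site; raw samples never do."""
    model = (0.0, 0.0)
    for r in range(rounds):
        updates = [local_sgd(model, data, seed=r) for data in sites]
        model = federated_average(updates, [len(d) for d in sites])
    return model

# Two hypothetical data silos, both drawn from y = 2x + 1.
site_a = [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0)]
site_b = [(x, 2 * x + 1) for x in (3.0, 4.0)]
w, b = train_federated([site_a, site_b])
print(f"w={w:.3f}, b={b:.3f}")
```

In a real deployment the local training step would run on whichever data platform hosts each silo; the averaging step is the only place where information from different sites meets.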

Data sources:
  • PostgreSQL
  • Tabular and columnar data files (e.g., CSV, Iceberg, Parquet, ORC)
  • SQLite (e.g., mobile, embedded)
  • Local file systems
  • Distributed file systems (e.g., HDFS, S3)
  • Apache Kafka
  • NEW: Remote files over HTTP(S)
  • NEW: JDBC-based data sources

Data Processing Platforms:
  • Java 8 Streams
  • Apache Spark / Databricks
  • PostgreSQL
  • SQLite
  • Apache Flink / Confluent, Decodable
  • NEW: Apache Kafka
  • NEW: TensorFlow
  • NEW: JDBC-based platforms
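As a rough illustration of why an optimizer might pick different platforms from the list above for different workloads, the sketch below uses a simple hand-written cost model. This is a stand-in and not the AI-based optimizer itself: the class, the function, and all cost figures are hypothetical and exist only to show the trade-off between fixed startup overhead and per-record throughput.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlatformCost:
    name: str
    startup_ms: float      # fixed cost to launch a job (made-up figure)
    per_record_ms: float   # marginal cost per input record (made-up figure)

    def estimate(self, records: int) -> float:
        """Estimated runtime in milliseconds for a given input size."""
        return self.startup_ms + self.per_record_ms * records

# Hypothetical cost profiles: a local engine starts fast but scales worse,
# a cluster engine pays a large startup cost but is cheaper per record.
PLATFORMS = [
    PlatformCost("Java Streams", startup_ms=5.0, per_record_ms=0.010),
    PlatformCost("Apache Spark", startup_ms=4000.0, per_record_ms=0.001),
]

def choose_platform(records: int, platforms=PLATFORMS) -> str:
    """Return the name of the platform with the lowest estimated cost."""
    return min(platforms, key=lambda p: p.estimate(records)).name

print(choose_platform(10_000))       # small input favors the local engine
print(choose_platform(10_000_000))   # large input favors the cluster engine
```

With these assumed figures the break-even point lies at a few hundred thousand records. A learned optimizer would replace the fixed numbers with estimates derived from observed executions.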

Programming APIs
  • Java
  • Scala
  • Basic SQL
  • NEW: Python (limited support)

Runtime
  • NEW: Actor-based runtime for building federated applications
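For readers unfamiliar with the actor model, the following sketch shows the general idea in plain Python: each actor owns private state, processes one message at a time from a mailbox, and communicates only by message passing. It does not reflect the actual Scalytics runtime; the `Actor` class, the `make_site` helper, and the message format are assumptions made for this example.

```python
import queue
import threading

class Actor:
    """Processes messages from its mailbox one at a time on its own thread."""

    def __init__(self, name, handler):
        self.name = name
        self._handler = handler
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Send a message asynchronously; the caller does not wait."""
        self._mailbox.put(message)

    def stop(self):
        """Ask the actor to finish and wait for its thread to exit."""
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is None:        # sentinel: shut down
                break
            self._handler(message)

def make_site(name, local_values):
    """A site actor that keeps raw values private and shares only aggregates."""
    def handle(message):
        kind, reply_to = message
        if kind == "partial_sum":
            reply_to.put((name, sum(local_values), len(local_values)))
    return Actor(name, handle)

# Two hypothetical sites; the coordinator only ever sees sums and counts.
sites = [make_site("site-a", [1.0, 2.0, 3.0]), make_site("site-b", [10.0, 20.0])]
replies = queue.Queue()
for site in sites:
    site.tell(("partial_sum", replies))
partials = [replies.get(timeout=5) for _ in sites]
for site in sites:
    site.stop()

global_mean = sum(s for _, s, _ in partials) / sum(n for _, _, n in partials)
print(global_mean)
```

Because sites exchange messages rather than shared memory, the same pattern extends to actors running on different machines, which is what makes it a natural fit for federated applications.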

Scalytics Studio: Simplifying Machine Learning Workflow Design

Scalytics Studio is a cloud-native, low-code extension to Scalytics Core, designed to streamline Machine Learning workflow design and enhance data management.

With an intuitive graphical user interface, Scalytics Studio enables you to:

  • Connect and Query Seamlessly: Effortlessly connect to various data sources and perform local queries.
  • Unify and Join Data: Combine data from multiple sources with ease for comprehensive analysis.
  • Transform Data Intuitively: Perform complex data transformations and execute them on the platform of your choice.

Supported Data Sources:
  • PostgreSQL
  • Files on local or distributed filesystems (e.g., HDFS)
  • NEW: Apache Kafka
  • NEW: Files over HTTP(S)

Supported Platforms
  • Java 8 Streams
  • Apache Spark / Databricks
  • NEW: TensorFlow
  • NEW: Apache Kafka

Supported Data Transformations
  • Map
  • Filter 
  • Reduce
  • GroupBy
  • Join
  • Union
  • Cross
  • Train (for ML pipelines)
  • Predict (for ML pipelines)
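The relational-style transformations in this list can be illustrated on small in-memory collections. The sketch below uses only the Python standard library and is meant to show what each operator computes, not how Scalytics Studio executes it; the sample `orders` and `regions` data are invented, and Train and Predict are left out because they depend on the chosen ML platform.

```python
from collections import defaultdict
from functools import reduce
from itertools import product

# Invented sample data: (customer, amount) and (customer, region).
orders = [("alice", 30), ("bob", 15), ("alice", 5)]
regions = [("alice", "EU"), ("bob", "US")]

# Map: apply a function to every record.
mapped = [(name, amount * 2) for name, amount in orders]

# Filter: keep only records that satisfy a predicate.
filtered = [order for order in orders if order[1] >= 10]

# Reduce: fold all records into a single value.
total = reduce(lambda acc, order: acc + order[1], orders, 0)

# GroupBy: collect records that share a key.
groups = defaultdict(list)
for name, amount in orders:
    groups[name].append(amount)

# Join: pair records from two inputs whose keys match.
joined = [(n, a, r) for n, a in orders for m, r in regions if n == m]

# Union: concatenate two inputs (duplicates are kept).
unioned = orders + [("carol", 7)]

# Cross: every combination of records from two inputs.
crossed = list(product(["a", "b"], [1, 2]))

print(total, dict(groups), joined)
```

The nested-loop join shown here is the simplest possible formulation; an execution engine would typically choose a hash or sort-merge strategy depending on the platform and input sizes.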

The future belongs to those who own their data and AI. Own yours!