What is Scalytics Studio?

Scalytics Studio is included in every license at no additional cost. Powered by Scalytics Core, it provides a cloud-native interface for creating, running, and monitoring data transformation tasks, ML and AI pipelines, and data modification tasks. With Apache Wayang (incubating) at its core, Scalytics Studio runs these workloads across different processing platforms and shows you the schema and the data produced at every step of a workflow.

A notable advantage of Scalytics Core is its built-in support for table formats such as Iceberg and file formats such as Parquet, JSON, and CSV, which simplifies data integration and transformation across a broad range of datasets. Scalytics Studio is not limited to structured data; it also works well with semi-structured data, which many conventional interfaces find challenging.

Scalytics Studio is a cloud-native add-on to Scalytics Core for developing data processing (ETL) pipelines in a low-code way.

Features of Scalytics Studio

When you start a task in Scalytics Studio, you can choose from a range of data sources, including PostgreSQL, local file systems, and distributed systems like HDFS. This lets you prepare data quickly for further analysis, even when it is spread across different systems. Scalytics Studio also provides tools to monitor ETL workflows, and the option to preview datasets at each step makes troubleshooting ETL tasks considerably easier.

With Scalytics Studio's intuitive interface, users can:

  • Extract data from sources like PostgreSQL or distributed filesystems like HDFS.
  • Set up diverse data transformations, including mapping, filtering, grouping, and joining.
  • Choose the execution platform, be it Java 8 Streams or Apache Spark.
  • Inspect dataset schemas or samples at every step of a task.
  • Effortlessly initiate, oversee, and manage tasks integrated into Scalytics Studio.
  • Share pipelines and processors with other users.
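To make the transformation types in the list above concrete, here is a minimal sketch in plain Python. It does not use the Scalytics Studio or Apache Wayang API; the sample data, the field names, and the `preview` helper are all hypothetical and only illustrate what filtering, joining, mapping, and grouping do, along with a per-step look at schema and sample rows.

```python
from collections import defaultdict

# Hypothetical sample data standing in for two separate sources
# (for example a PostgreSQL table and a file on HDFS).
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 120.0},
    {"order_id": 2, "customer_id": 11, "amount": 15.5},
    {"order_id": 3, "customer_id": 10, "amount": 80.0},
    {"order_id": 4, "customer_id": 12, "amount": 200.0},
]
customers = [
    {"customer_id": 10, "country": "DE"},
    {"customer_id": 11, "country": "US"},
    {"customer_id": 12, "country": "DE"},
]


def preview(name, rows, limit=2):
    """Print the field names and a small sample, like a per-step preview."""
    schema = sorted(rows[0].keys()) if rows else []
    print(f"{name}: schema={schema} sample={rows[:limit]}")


# Filter: keep orders with an amount of at least 50.
large_orders = [o for o in orders if o["amount"] >= 50.0]
preview("filter", large_orders)

# Join: attach the customer's country to each order via customer_id.
country_by_customer = {c["customer_id"]: c["country"] for c in customers}
joined = [
    {**o, "country": country_by_customer[o["customer_id"]]}
    for o in large_orders
    if o["customer_id"] in country_by_customer
]
preview("join", joined)

# Map: project each row down to the two fields the aggregation needs.
projected = [(row["country"], row["amount"]) for row in joined]

# Group: total amount per country.
totals = defaultdict(float)
for country, amount in projected:
    totals[country] += amount
revenue_by_country = dict(totals)
print(revenue_by_country)
```

In a visual editor each of these steps would correspond to one node in the workflow, and the preview at a node plays the role of the `preview` calls here.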

Built on Scalytics Core, Scalytics Studio is designed for creating and managing tasks that gather, refine, and unify data from multiple data sources without moving the data to a central place. Its graphical user interface lets you connect to and query different data sources, or join data across several of them, in an intuitive way. It also supports complex data transformations that can be processed on the platform of the user's choice. For more intricate requirements, Scalytics Studio can be used to diagnose and tailor job scripts.

The platform's visual job editor lets users:

  • Incorporate multiple data sources and targets.
  • Preview data at each workflow node.
  • Implement various data transformations, from simple mappings to complex joins.
  • Switch data processing frameworks quickly, which enables rapid testing and fast deployment.
  • Stay independent of the data platform by switching from one supported platform to another (for example, Spark to Flink).
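The idea behind switching frameworks can be sketched as a logical plan that is defined once and handed to interchangeable backends. The example below is an illustration of that concept only, not the Scalytics Studio or Apache Wayang API: `plan`, `run_eager`, `run_lazy`, and `execute` are hypothetical names, and the two toy backends stand in for real engines such as Java Streams, Spark, or Flink.

```python
# A logical plan: an ordered list of (operator, function) pairs.
# It describes what to compute, not which engine computes it.
plan = [
    ("filter", lambda x: x % 2 == 0),
    ("map", lambda x: x * x),
]


def run_eager(plan, data):
    """Backend A: materialize a full list after every operator."""
    rows = list(data)
    for op, fn in plan:
        if op == "filter":
            rows = [r for r in rows if fn(r)]
        elif op == "map":
            rows = [fn(r) for r in rows]
        else:
            raise ValueError(f"unknown operator: {op}")
    return rows


def run_lazy(plan, data):
    """Backend B: chain lazy iterators and materialize only at the end."""
    stream = iter(data)
    for op, fn in plan:
        if op == "filter":
            stream = filter(fn, stream)
        elif op == "map":
            stream = map(fn, stream)
        else:
            raise ValueError(f"unknown operator: {op}")
    return list(stream)


# Registry of available backends, keyed by a platform name.
BACKENDS = {"eager": run_eager, "lazy": run_lazy}


def execute(plan, data, platform):
    """Run the same plan on whichever backend is selected."""
    return BACKENDS[platform](plan, data)


eager_result = execute(plan, range(10), "eager")
lazy_result = execute(plan, range(10), "lazy")
print(eager_result, lazy_result)
```

Because the plan never mentions a backend, changing the platform is a one-argument change, which is what makes quick testing against different engines practical.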

The script editor in Scalytics Studio lets you write or modify the ETL code for your tasks. After laying out the initial design, you can fine-tune the generated script to match the specifics of your task. The performance dashboard gives a detailed view of your ETL tasks and reports on job runs over a selected timeframe.

Support for Dataset Partitioning

Scalytics Studio can work with partitioned datasets. You can process, filter, and transform partitioned data efficiently, without listing or loading partitions that a job does not need.
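Avoiding unnecessary listings and loads is commonly achieved through partition pruning: a filter on the partition key selects which directories to read before any data is opened. The sketch below shows the general technique using Hive-style `key=value` directories and CSV files. The source does not describe how Scalytics Studio implements this, so the layout, `write_partitioned`, and `read_partitions` are assumptions made for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_partitioned(root, rows, key):
    """Write rows into Hive-style directories such as root/year=2023/part.csv."""
    by_value = {}
    for row in rows:
        by_value.setdefault(row[key], []).append(row)
    for value, part_rows in by_value.items():
        part_dir = root / f"{key}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        # The partition key lives in the directory name, not in the file.
        fields = [k for k in part_rows[0] if k != key]
        with open(part_dir / "part.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fields)
            writer.writeheader()
            for r in part_rows:
                writer.writerow({k: v for k, v in r.items() if k != key})


def read_partitions(root, key, wanted):
    """Read only the partitions whose key value is in `wanted`.

    Paths are built directly from the requested values, so other
    partitions are neither listed nor opened. Returns the rows and
    the files that were actually opened.
    """
    rows, opened = [], []
    for value in sorted(wanted):
        part_file = root / f"{key}={value}" / "part.csv"
        if not part_file.exists():
            continue
        opened.append(part_file)
        with open(part_file, newline="") as f:
            for r in csv.DictReader(f):
                # CSV values are read back as strings.
                rows.append({**r, key: str(value)})
    return rows, opened


# Hypothetical sample events, partitioned by year.
events = [
    {"year": 2022, "user": "a", "clicks": 3},
    {"year": 2023, "user": "b", "clicks": 5},
    {"year": 2023, "user": "c", "clicks": 1},
    {"year": 2024, "user": "d", "clicks": 7},
]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_partitioned(root, events, "year")
    rows_2023, opened_files = read_partitions(root, "year", [2023])
    opened_names = [p.parent.name for p in opened_files]

print(opened_names)
print(rows_2023)
```

Only the `year=2023` directory is touched; the 2022 and 2024 partitions are never read.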

Why Choose Scalytics Studio?

Scalytics Studio, integrated with Scalytics Core, offers a streamlined way to build ETL workflows. Backed by Apache Wayang, it gives ETL developers a reliable way to process large, semi-structured datasets and load them into structured data environments. It combines a user-centric design with the versatility of Scalytics Core's processing engine.

Scalytics Studio simplifies job management and gives you a comprehensive view of your tasks and how they relate to one another. Its consolidated interface presents a continually refreshed view of ETL operations and resource allocation, which helps when optimizing data processing workflows.

About Scalytics

Legacy data infrastructure can't keep pace with the speed and complexity of modern AI initiatives. Data silos stifle innovation, slow down insights, and create scalability bottlenecks. Scalytics Connect, the next-generation data platform, solves these challenges. Experience seamless integration across diverse data sources, enabling true AI scalability and removing the roadblocks that hinder your AI ambitions. Break free from the limitations of the past and accelerate innovation with Scalytics Connect.

We enable you to make data-driven decisions in minutes, not days.
Scalytics is powered by Apache Wayang, and we're proud to support the project. You can check out the project's public GitHub repository. If you're enjoying our software, show your love and support: a star ⭐ would mean a lot!

If you need professional support from our team of industry-leading experts, you can always reach out to us via Slack or email.