Scalytics Connect is a fully managed, privacy-first AI infrastructure solution that provides organizations with dedicated GPU resources for running large language models privately within their own environment. Our solution handles all aspects of deployment, management, and optimization, allowing you to focus on using AI rather than maintaining it.
Unlike typical AI services that send your data to external APIs, Scalytics Connect provides dedicated infrastructure where all data processing happens within your private environment. We combine the convenience of managed services with the security and compliance benefits of private infrastructure. Additionally, we support multiple model families (DeepSeek, Mistral, Llama, Gemma, Phi) while most competitors focus on a single model or require you to handle complex DevOps tasks yourself.
No. While Scalytics Connect is powerful enough for AI specialists, it's designed to be accessible for organizations without extensive AI expertise. Our platform provides intuitive interfaces for using models, and our team handles all the complex technical optimization and maintenance.
Your data never leaves your dedicated environment. All processing happens on your private infrastructure, meaning sensitive information stays within your control at all times. This is fundamentally different from API-based services where your data must be sent to external servers.
You can run a variety of open models like DeepSeek, Mistral, Llama, Gemma, and Phi without restrictions. The exact model sizes and quantizations depend on the VRAM of the GPUs:
NVIDIA L4 GPU (24GB VRAM):
- 7B models in Q6/Q5/Q4 quantization (1-2 model instances per GPU)
- 12B models in Q5/Q4 quantization (1 model instance per GPU)
- 14B models in Q4 quantization (1 model instance per GPU)
NVIDIA H100 GPU (80GB VRAM):
- 7B models in full precision or Q6/Q5/Q4 quantization (4-6 model instances per GPU)
- 14B models in full precision or Q6/Q5/Q4 quantization (2-3 model instances per GPU)
- 34B models in Q6/Q5/Q4 quantization (1-2 model instances per GPU)
- 70B models in Q4 quantization (1 model instance per GPU)
Our platform is optimized for Q6 quantized models, with DeepSeek-R1-14B as our reference model for performance specifications. You can also integrate cloud models such as OpenAI and Anthropic through API keys to complement your private models and build powerful privacy-focused agents.
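As a rough rule of thumb, a model's VRAM footprint is roughly its parameter count times the effective bits per weight of the chosen quantization, plus a fixed overhead for the KV cache and runtime buffers. The sketch below illustrates this arithmetic; the bit widths and overhead figure are illustrative assumptions, not Scalytics Connect's internal sizing method, and real requirements also depend on context length, batch size, and the serving stack.

```python
# Rough VRAM estimate for a quantized LLM: weights + a fixed overhead for
# KV cache and runtime buffers. Figures are illustrative assumptions only.

BITS_PER_WEIGHT = {"FP16": 16, "Q6": 6.5, "Q5": 5.5, "Q4": 4.5}  # approx. effective bits

def estimate_vram_gb(params_billions: float, quant: str, overhead_gb: float = 2.0) -> float:
    """Return an approximate VRAM footprint in GB for a model of the given size."""
    weight_gb = params_billions * BITS_PER_WEIGHT[quant] / 8  # 1B params at 8 bits ~= 1 GB
    return weight_gb + overhead_gb

if __name__ == "__main__":
    for size, quant in [(7, "Q6"), (14, "Q6"), (34, "Q5"), (70, "Q4")]:
        print(f"{size}B @ {quant}: ~{estimate_vram_gb(size, quant):.1f} GB")
        # e.g. a 14B model at Q6 lands around 13-14 GB, which is why an
        # L4 (24 GB) fits one instance while an H100 (80 GB) fits several.
```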
Concurrent users are individuals actively querying your system within a short time window (approximately 10 seconds). Our measurements are based on users sending 1-2 requests per second with prompts of 100-500 tokens and responses of 100-200 tokens. The number indicates the capacity of your infrastructure to handle simultaneous requests.
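For a back-of-envelope feel for what this means, you can divide a deployment's aggregate generation throughput by the per-user token demand implied by the profile above. The throughput figure in the sketch below is a hypothetical example, not a Scalytics Connect benchmark.

```python
# Back-of-envelope capacity check using the request profile described above
# (1-2 requests/sec per user, 100-200 generated response tokens). The
# aggregate throughput is an assumed figure for illustration only.

def max_concurrent_users(
    tokens_per_second: float,            # aggregate generation throughput (assumed)
    requests_per_user_per_sec: float = 1.5,
    avg_response_tokens: int = 150,
) -> int:
    """Users whose combined generation demand fits within the throughput budget."""
    tokens_needed_per_user = requests_per_user_per_sec * avg_response_tokens
    return int(tokens_per_second // tokens_needed_per_user)

if __name__ == "__main__":
    # Example: a deployment sustaining ~2,000 generated tokens/sec overall
    print(max_concurrent_users(2000))  # -> 8 users at this hypothetical throughput
```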
Yes. You can easily upgrade from the Small Business tier to the SME tier as your usage increases. Upgrading from SME to Enterprise requires a technology migration due to the different GPU architecture (L4 to H100), which our team will manage for you.
Scalytics Connect can be deployed on AWS, Azure, GCP, or any Linux-based cloud that offers NVIDIA or similar GPUs. We also support on-premises deployments through our partnerships with HPC system builders and specialized GPU infrastructure providers across Europe.
For cloud deployments, we typically have your environment operational within 2-3 business days from contract signing. On-premises deployments vary based on your existing infrastructure and requirements.
Our team manages all aspects of maintenance, including security updates, performance optimizations, and model updates. You'll always have access to the latest features and improvements without needing to manage them yourself.
All plans include dedicated support. The Small Business tier provides business hours support, the SME tier includes 24/7 priority support, and the Enterprise tier offers white-glove support with a dedicated account manager. Our support team has deep expertise in AI infrastructure and can assist with both technical and strategic questions.
Data privacy is built into the core architecture of Scalytics Connect. Your data never leaves your dedicated environment, and all processing happens locally on your infrastructure. We implement comprehensive security measures including end-to-end encryption, role-based access control, and secure deployment practices.
Scalytics Connect's architecture supports compliance with major regulations since your data remains in your controlled environment. We can help implement specific controls required for GDPR, HIPAA, and other regulatory frameworks. Our team can work with your compliance officers to ensure proper documentation and controls.
Our RBAC system allows administrators to define precise permissions for users and groups. You can control which models users can access, limit token usage, restrict certain features, and enforce security policies. This ensures proper governance of your AI infrastructure according to your organization's requirements.
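To illustrate the kind of policy such a system can express, here is a minimal sketch of a role definition with allowed models and a per-user token budget. The field names and check logic are invented for this example and are not Scalytics Connect's actual configuration format or API.

```python
# Hypothetical RBAC policy sketch: which models a group may use and a daily
# token budget per user. Names and structure are illustrative assumptions.

ROLE_POLICIES = {
    "analysts": {"allowed_models": {"deepseek-r1-14b", "mistral-7b"}, "daily_token_limit": 200_000},
    "admins":   {"allowed_models": {"*"},                             "daily_token_limit": None},
}

def is_request_allowed(role: str, model: str, tokens_used_today: int, tokens_requested: int) -> bool:
    """Check a request against the role's model allowlist and token budget."""
    policy = ROLE_POLICIES.get(role)
    if policy is None:
        return False
    if "*" not in policy["allowed_models"] and model not in policy["allowed_models"]:
        return False
    limit = policy["daily_token_limit"]
    return limit is None or tokens_used_today + tokens_requested <= limit

# e.g. is_request_allowed("analysts", "deepseek-r1-14b", 150_000, 1_000) -> True
```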
Yes! We offer a 5-day trial period during which you can fully evaluate Scalytics Connect in your environment. During this trial, you can cancel anytime and you'll only be billed for the actual hardware costs incurred. This allows you to verify the performance, security, and usability of our platform with your specific use cases before making a longer commitment.
AI infrastructure requires significant setup and optimization for your specific needs. Annual contracts allow us to make this investment while ensuring stable, reliable service. They also provide you with cost predictability and dedicated resources that aren't shared with other organizations.
You're fully responsible for managing your usage within the infrastructure capacity you've purchased. We provide monitoring tools that show your usage patterns, but it's up to you to stay within appropriate usage levels. You can add more users beyond the recommended concurrent user count, but doing so will degrade performance: our concurrent user guidelines are designed to maintain optimal response times, and exceeding them means your users may experience slower responses or processing delays.
No. Our pricing is transparent and includes all aspects of the service: infrastructure, management, support, and software. The only additional costs would be if you choose to integrate with third-party models like OpenAI or Anthropic (you would pay for their API usage directly).