A large consulting firm helps companies across a wide variety of industries tackle complex business challenges. The company’s data science, analytics, and artificial intelligence department makes extensive use of machine learning to help clients identify competitive advantages and leverage data for growth.
Businesses increasingly rely on AI to improve performance and guide decision-making, but without the right AI platform, many companies struggle to see value or impact from their models.
Prior to this engagement, our client's data scientists were spending significant amounts of time setting up development and model training environments, working out how to connect to and consume data sources, and configuring data versioning and code repositories. When it came time to implement a trained model in a customer’s environment, these exercises often had to be repeated, for example when migrating from a development database to a customer’s data warehouse.
Our client needed a way to standardize and accelerate the development and deployment of machine learning models for their customers. Their goal was to have data scientists spending less time on the peripheral operations of data science and more time on the actual development of ML models.
The solution needed to be rapidly scalable, both in terms of onboarding new customers and increasing the compute power available for model training. To align with company security requirements, the system had to support multiple tenants with varying industry compliance requirements. As an extra layer of protection, each customer’s data and resources needed to be effectively air-gapped from those of other customers.
The Keyrus Cloud team worked with the customer to build a product with multiple AWS services at its core, including S3, KMS, EC2, Lambda, and others. To increase portability across platforms, the system also made heavy use of Terraform, Vault, Consul, Jenkins, Docker, and GitHub.
Each client had its own environment with separate S3 buckets, KMS keys, Kubernetes clusters, application stack, and zero-trust entry point, so that no data or resources were shared between environments and cross-contamination of data was prevented.
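The per-tenant isolation described above can be illustrated with a minimal sketch. It is not the project's actual implementation (which relied on Terraform and related tooling); it only shows the idea of deriving a dedicated bucket, key alias, and cluster name for each tenant and checking that no identifier is shared. The `mlplatform` prefix, the naming scheme, the tenant IDs, and all function names are assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical naming prefix; not taken from the original project.
PLATFORM_PREFIX = "mlplatform"

# Tenant IDs are restricted to lowercase letters, digits, and hyphens
# (1 to 32 characters) so the derived S3 bucket name stays within the
# 63-character limit and uses only characters S3 allows.
_TENANT_ID_PATTERN = re.compile(r"^[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?$")


@dataclass(frozen=True)
class TenantEnvironment:
    """Names of the resources dedicated to a single tenant."""

    tenant_id: str
    bucket_name: str
    kms_key_alias: str
    cluster_name: str

    def resource_ids(self) -> set[str]:
        return {self.bucket_name, self.kms_key_alias, self.cluster_name}


def build_tenant_environment(tenant_id: str) -> TenantEnvironment:
    """Derive a dedicated set of resource names for one tenant."""
    if not _TENANT_ID_PATTERN.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return TenantEnvironment(
        tenant_id=tenant_id,
        bucket_name=f"{PLATFORM_PREFIX}-{tenant_id}-data",
        # KMS aliases must begin with "alias/".
        kms_key_alias=f"alias/{PLATFORM_PREFIX}/{tenant_id}",
        cluster_name=f"{PLATFORM_PREFIX}-{tenant_id}-k8s",
    )


def find_shared_resources(envs: list[TenantEnvironment]) -> set[str]:
    """Return every resource identifier used by more than one environment."""
    seen: set[str] = set()
    shared: set[str] = set()
    for env in envs:
        ids = env.resource_ids()
        shared |= seen & ids
        seen |= ids
    return shared


if __name__ == "__main__":
    # Hypothetical tenants, for illustration only.
    tenants = [build_tenant_environment(t) for t in ("acme-health", "globex")]
    for env in tenants:
        print(env)
    print("shared resources:", find_shared_resources(tenants))
```

A check like `find_shared_resources` could run in a provisioning pipeline before any infrastructure is created, failing the build if two tenants would ever resolve to the same bucket, key, or cluster.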
The entire stack and application were designed to support compliance with HIPAA, GDPR, PCI, and NIST workload requirements. The Keyrus Cloud team also led the security team for the project, including supervision of penetration testing by an independent agency.
The resulting solution is scalable, secure, resilient, performant, and cost-effective. Internal data scientists use the platform to significantly decrease the lead time for model development, and it has saved thousands of hours in consulting time over approximately two years, speeding up development and delivery of machine learning projects.
After dozens of successful project implementations, the solution was ultimately acquired by a unicorn startup and incorporated into the startup’s existing platform.