A software company that offers accounting products and services to small businesses wanted to provide personalized recommendations in its cloud-based accounting products to better serve its customers. The company worked with Keyrus to achieve this goal and to differentiate its offerings from competitors.
This company’s application and API allow customers to issue invoices, charge for time and expertise, and access single company ledgers from any location.
Their products are used by millions of people at thousands of businesses, saving their customers over 192 hours of accounting work per year. With their products, the company’s customers can stay focused on strategic business operations.
Company leadership had a long-term plan to acquire businesses in other markets as part of a global expansion, but realized they lacked the consistent data strategy and toolset that the expansion would demand. The software company asked Keyrus for analysis and design assistance.
By partnering with Keyrus, the company positioned itself for rapid growth and for more order and consistency in its strategic efforts.
Due to their rapid growth, the software company’s data architecture became too fractured to fully support their ambition to build advanced analytics into their frontline applications. The company had no unified governance model across the organization.
Each of their teams developed separate methods of reporting using their own technology stack, leading to some challenges:
No formal data stewardship upon data creation or new source usage
No standard processes for data quality
Duplication of reports and efforts to prepare and analyze data
The Keyrus team first spoke to the client’s data science team about their objectives in a Discovery phase.
We found that marketing and sales analysts wanted more support from the data science team for deeper analysis to drive sales and marketing initiatives. They felt the data science team was disconnected from these initiatives.
In our discovery, we reviewed all the current data architecture and processes in the company’s entire stack:
The analytics architecture was on AWS, but their live application was on GCP. We untangled and documented this very large and complex end-to-end multi-cloud environment.
Their back end consisted of an Airflow process that took over 12 hours to refresh each day. Because the process finished in the afternoon of the working day, analytics were difficult. Each business unit ran its own set of DAGs (Directed Acyclic Graphs, which describe how to run a workflow by defining the pipeline in Python) independently, with no central coordination.
Data loading processes were inefficient, leading to increased costs for AWS Redshift, a cloud database with a pricing model that increases with additional compute requirements. This was amplified by hundreds of Looker reports which needed to query Redshift each day.
Each business unit had its own cloud strategy. Each built out their own workloads, ad hoc, using whichever platforms and web services they could make function quickly. This led to duplicated workloads and a lack of standardization across teams.
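To illustrate what a DAG expresses, the sketch below orders a handful of tasks by their dependencies using Python's standard-library graphlib module. It is not Airflow code and not the client's pipeline; the task names are hypothetical and chosen only to fit an accounting context.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each key maps to the set of tasks it depends on.
pipeline = {
    "extract_invoices": set(),
    "extract_payments": set(),
    "build_ledger": {"extract_invoices", "extract_payments"},
    "refresh_reports": {"build_ledger"},
}


def execution_order(dag):
    """Return one valid run order; graphlib raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

Both extract tasks always come before build_ledger, and refresh_reports always comes last. When every team keeps its own graphs with no shared view, dependencies that cross team boundaries cannot be expressed this way, which is one reason central coordination matters.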
After analyzing the software company’s current data practices, Keyrus developed a maturity model of their current and desired states that clearly identified actionable gaps in governance and organizational structures.
The maturity model, as shown below, illustrated that data governance strategy was one of the largest areas of growth from the software company’s current state to its desired state.
To get to their desired state, the software company needed a data governance strategy that would control risk and prevent redundancy and waste, while maintaining their decentralized and entrepreneurial culture.
Based on the maturity model, the second-largest gap between the current and desired state was Data I/O.
The software company’s data science team already had more than nine models in production using SageMaker and was considering how to incorporate real-time streaming pipelines for personalized recommendations.
They needed better control over pipelines. The company’s team found it difficult to integrate new data sources because real-time scoring and inference pipelines were lacking.
Our recommendation to address data ingestion included a simplified architecture of Rivery → Snowflake ingestion pipelines. It was part of a broader guide Keyrus developed to address the software company’s technological and strategic organization.
As part of developing a maturity model of the software company’s current and desired states, the Keyrus team organized workshops to:
Understand the data management organization, how it is organized, the different roles and responsibilities within the software company, and the processes provided
Obtain business, functional, and non-functional requirements for the new data governance organization
Identify missing elements necessary for a data governance organization
In addition to the workshops, Keyrus developed a future data governance strategy. Our team defined a Data Governance Structure, including roles and responsibilities. We also provided a set of Data Governance policies across Data Quality, Data Ownership, and Metadata Management.
A significant part of a data governance roadmap is specifying very clearly who owns which data. Defining who owns the data would ensure:
Trusted data: With a single data owner assigned, one person takes responsibility for the quality and standards of their data assets. This improves data quality and, in turn, the reliability of decisions based on that data.
No more redundancies: Redundancy arises when multiple teams address the same problem, for example when one team isn’t aware that another has already resolved it. Clear ownership eliminates these redundancies.
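The single-owner principle above can be checked mechanically. The following is a minimal sketch, assuming ownership claims are available as simple (asset, team) pairs; the asset and team names are hypothetical and do not come from the client's environment.

```python
from collections import defaultdict

# Hypothetical ownership claims: (data asset, team claiming ownership).
claims = [
    ("invoices", "finance-data"),
    ("customers", "marketing"),
    ("customers", "sales"),
    ("ledger_entries", "finance-data"),
]
# Hypothetical inventory of every data asset that should have an owner.
known_assets = {"invoices", "customers", "ledger_entries", "web_events"}


def audit_ownership(claims, known_assets):
    """Split assets into those with exactly one owner, several owners, or none."""
    owners = defaultdict(set)
    for asset, team in claims:
        owners[asset].add(team)
    single = {a: next(iter(t)) for a, t in owners.items() if len(t) == 1}
    contested = {a: sorted(t) for a, t in owners.items() if len(t) > 1}
    unowned = sorted(known_assets - set(owners))
    return single, contested, unowned


single, contested, unowned = audit_ownership(claims, known_assets)
```

Contested assets point to possible duplicated effort, and unowned assets point to data that nobody is accountable for.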
Our client now provides far better and more competitive cloud services to its customers. By providing consolidated recommendations for both batch and real-time pipelines, Keyrus supported the software company’s personalized product recommendations in the frontline of their applications.
Keyrus delivered a current and desired state architecture assessment with clear comparisons between AWS-, GCP-, and Snowflake-driven stacks, including the cost of each. The assessment identified an approximate $70-100K reduction in license costs per year.
With a data governance strategy in place, the software company gets more value from its data. All incoming data is assessed and managed in accordance with the Data Quality Policy and Process document to ensure that it is of a quality that is fit for its intended use.
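As an illustration of assessing incoming data against quality rules, here is a minimal sketch. The contents of the Data Quality Policy and Process document are not given in this case study, so the field names and rules below are assumptions made purely for the example.

```python
from datetime import date

# Hypothetical rules; the actual policy is not reproduced in this case study.
RULES = {
    "invoice_id": lambda v: isinstance(v, str) and v != "",
    "amount": lambda v: isinstance(v, (int, float)) and v >= 0,
    "issued_on": lambda v: isinstance(v, date) and v <= date.today(),
}


def assess(record):
    """Return the names of fields that are missing or fail their rule."""
    return [field for field, ok in RULES.items()
            if field not in record or not ok(record[field])]


def partition(records):
    """Separate records that pass every rule from those that need review."""
    accepted, rejected = [], []
    for record in records:
        failures = assess(record)
        if failures:
            rejected.append((record, failures))
        else:
            accepted.append(record)
    return accepted, rejected
```

Rejected records keep the list of failed fields, so the responsible data owner can see why a record was held back.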
The metadata management policy and process that Keyrus defined provide the following benefits:
Provide a better understanding of the nature and types of data collected
Help prevent data from being used inappropriately
Reduce data-oriented research time
Reduce training costs and lower the impact of staff turnover through documentation of data context, history, and origin
Overall, we empowered our client to better serve their customers while defining and executing a vision for strategic growth.