Our client works in the automotive industry. They receive many data source files from the Benelux region, which arrive in different formats, at different times, and from different origins. To yield useful insights, these source files need to be combined into a data warehouse.
Our client's infrastructure was outdated and inefficient by current standards. Adding or maintaining hundreds of ingestions was difficult, tedious, and error-prone. Making the data ingestion flow metadata-driven and automated makes it simpler to manage and gives more information when monitoring it.
First, we listed the detailed specifications for each data source: column names and their data types, dimension types of the data source, ingestion and storage requirements, etc. Then we designed a flexible data ingestion process that gathers the different data sources from a source location and loads them into the data warehouse (DWH). Finally, we created a foundational structure to manage the flow of the data sources.
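To illustrate the idea of a metadata-driven ingestion, here is a minimal Python sketch. It is not the client's implementation: the names `SourceSpec`, `ingest`, `run_all`, the sample sources, and the column types are all hypothetical, and an in-memory SQLite database stands in for the data warehouse. The point is only that each source is described once as metadata, and a single generic routine uses that description to parse and load the file.

```python
import csv
import io
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpec:
    """Hypothetical metadata record describing one data source."""
    name: str                 # identifier of the source
    table: str                # target table in the (stand-in) warehouse
    columns: dict[str, str]   # column name -> SQLite type
    delimiter: str = ","      # field separator used by the source file


# Map each declared column type to a Python cast.
CASTS = {"INTEGER": int, "REAL": float, "TEXT": str}


def ingest(conn: sqlite3.Connection, spec: SourceSpec, raw_text: str) -> int:
    """Load one source file into its target table, driven only by the spec."""
    cols = list(spec.columns)
    col_defs = ", ".join(f'"{c}" {t}' for c, t in spec.columns.items())
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{spec.table}" ({col_defs})')

    reader = csv.DictReader(io.StringIO(raw_text), delimiter=spec.delimiter)
    rows = [
        tuple(CASTS[spec.columns[c]](record[c]) for c in cols)
        for record in reader
    ]
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(f'INSERT INTO "{spec.table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


def run_all(conn: sqlite3.Connection, specs, files) -> dict[str, int]:
    """Ingest every registered source and report row counts for monitoring."""
    return {spec.name: ingest(conn, spec, files[spec.name]) for spec in specs}


# Assumed example metadata: two sources with different delimiters, one target table.
SALES_COLUMNS = {"dealer_id": "INTEGER", "model": "TEXT", "price": "REAL"}
SPECS = [
    SourceSpec("dealer_sales_be", "sales", SALES_COLUMNS, delimiter=";"),
    SourceSpec("dealer_sales_nl", "sales", SALES_COLUMNS, delimiter=","),
]

# Assumed example file contents, inlined so the sketch is self-contained.
FILES = {
    "dealer_sales_be": "dealer_id;model;price\n1;Golf;24500.0\n2;Polo;19900.5\n",
    "dealer_sales_nl": "dealer_id,model,price\n7,Corsa,21000\n",
}

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_all(connection, SPECS, FILES))
```

In a setup like this, adding a new data source means adding one more metadata entry rather than writing a new ingestion script, and the per-source row counts give a simple starting point for monitoring.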
The automated data ingestion flow gives the client a solid ingestion infrastructure for the data warehouse. This enables more accurate and more insightful queries of the existing data, reducing risk and cost while increasing business agility. The flow also makes it easier to add new data sources, which expands the possibilities for data analysis.