Our client is a leading independent provider of information services and business intelligence in the energy industry. Their research reports and publications cover energy areas such as petroleum, natural gas, energy transition, and custom intelligence.
Our client sought a new analytics platform to replace their old “black-box” system for mapping and calculating gas prices.
The company’s lack of a modern analytics platform was causing significant challenges. Much of their data analysis was performed ad hoc and manually via Excel.
Their previous system was a custom web development solution that required an expert for ongoing support, rather than allowing for analyst self-sufficiency. Their analysts, or editors, could not work remotely because the manual process relied on large Excel files that were slow and clunky.
Once the company published its insights and blended findings, changing them was very difficult. Whenever an industry source reported a data update, the company had to invest a significant amount of time to completely redo analyses and data blending. Similarly, data mapping was a recurring task that was very time-consuming for their editors.
The company needed a new solution that would handle these edge cases quickly and efficiently.
Keyrus designed an on-prem solution consisting of two platforms: Reference Data Admin (RDA) and Alteryx.
RDA is the interface that company editors use to create, read, update, delete, and interact with the data. This module is inside quilliup, a data quality testing and governance platform.
Alteryx is the ETL platform used to collect, transform, and move the data downstream, where our client can ultimately create the structured reports that they currently produce. The raw data is received from various sources in Excel format.
The process in Alteryx has a wrapper (or master workflow) and a source-specific macro for each source to handle all the loaded files. First, the Excel files are loaded from a directory. Then, a Python script extracts the sheet names from the Excel files. To make the Alteryx process as dynamic as possible, these names are used to create an XML file for each source. Each XML file is then used to run a separate macro process for its source; the macros handle the different file structures and together create a unified SQL table with all the data.
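The sheet-name extraction and per-source XML step described above could look roughly like the following. This is a minimal sketch, not the script used in the project: it assumes the files are .xlsx workbooks, that each source has its own subfolder, and the XML layout (`source`, `file`, `sheet` elements) and the function names are hypothetical. It reads sheet names straight from the workbook part of the .xlsx archive, so it needs only the Python standard library.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace of the workbook part inside an .xlsx (Office Open XML) archive.
_NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_path):
    """Return the sheet names of an .xlsx file, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [s.attrib["name"] for s in root.findall("m:sheets/m:sheet", _NS)]


def build_source_config(source, files):
    """Build a per-source XML tree listing each file and its sheets.

    The element and attribute names are assumptions for illustration only.
    """
    root = ET.Element("source", name=source)
    for path in sorted(files):
        file_el = ET.SubElement(root, "file", path=str(path))
        for name in sheet_names(path):
            ET.SubElement(file_el, "sheet", name=name)
    return ET.ElementTree(root)


def write_configs(input_dir, output_dir):
    """Write one XML file per source subfolder of input_dir; return the paths."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source_dir in sorted(p for p in input_dir.iterdir() if p.is_dir()):
        files = list(source_dir.glob("*.xlsx"))
        if not files:
            continue  # nothing to describe for this source
        out = output_dir / f"{source_dir.name}.xml"
        build_source_config(source_dir.name, files).write(
            out, encoding="utf-8", xml_declaration=True
        )
        written.append(out)
    return written
```

Calling `write_configs("incoming", "configs")` (both folder names are placeholders) would produce one XML file per source folder, which a downstream workflow could read to decide which sheets to process.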
In the following phase, an intermediate layer was built on the published data from the previous phase. Data becomes published once all of the source data has been extracted to the SQL tables and signed off by the client.
The intermediate layer supports the report creation feature on the company’s portal, rearranging the data to make it easier to create reports based on a single and simple query.
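One way to picture such an intermediate layer is as a view that pre-joins and pre-aggregates the published tables, so that a report needs only one simple query. The sketch below uses SQLite purely as a stand-in for the actual database, and every table, column, and view name in it (`published_prices`, `hub_mapping`, `report_prices`) is a hypothetical example rather than the client's schema.

```python
import sqlite3


def build_intermediate_layer(conn):
    """Create hypothetical published tables and a report-friendly view."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS published_prices (
            source TEXT, hub TEXT, price_date TEXT, price REAL
        );
        CREATE TABLE IF NOT EXISTS hub_mapping (hub TEXT, region TEXT);

        -- The view hides the join and aggregation from report authors.
        CREATE VIEW IF NOT EXISTS report_prices AS
        SELECT m.region,
               p.hub,
               p.price_date,
               AVG(p.price) AS avg_price,
               COUNT(*)     AS n_sources
        FROM published_prices AS p
        JOIN hub_mapping AS m ON m.hub = p.hub
        GROUP BY m.region, p.hub, p.price_date;
        """
    )


def report_for_region(conn, region):
    """Run the single, simple query a report would issue against the view."""
    cur = conn.execute(
        "SELECT hub, price_date, avg_price, n_sources "
        "FROM report_prices WHERE region = ? ORDER BY hub, price_date",
        (region,),
    )
    return cur.fetchall()
```

Because the view is computed from the published tables, a correction to a published value shows up in the next report query without any separate rebuild step; a materialized table would trade that freshness for query speed.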
This new solution allows editors to change and validate calculated values based on trends and industry knowledge. With a few minor adjustments, editors can now add more sources with different structures. The SQL tables are the base for the RDA platform, which lets editors view and edit the data without needing any SQL knowledge.
Editors save tremendous amounts of time with the ability to perform updates to already published data. They can work faster, and if a mistake occurs, editors do not need to spend the entire night fixing it.
A huge benefit of the system is the improved quality in organizing data, storing the full reported data in a database, and analyzing data more comprehensively. Our solution provided the business with the following results:
Data can be amended easily after signoff or publication. Updates are also propagated to downstream systems and reports.
Data mapping is done in a more straightforward manner, which can also account for updates.
Analysts can work remotely. Editors only need internet access to complete their work.
Business logic is maintained in one place.
Data alerts are easy to track and update.
Edge cases that used to take our client days to solve now take only minutes. More accurate price reporting can be done faster. Our main stakeholder stated that with the solution we built, there are no longer any edge cases that the company cannot solve.