Expert opinion


Responsible AI begins with responsible data engineering

Dandre Diedericks, Data Engineer at Keyrus

The conversation around responsible AI often gravitates towards algorithms, regulations, and explainability. Yet the real foundation of trustworthy AI lies further upstream, in the often-overlooked world of data engineering. Unless the pipelines feeding AI systems are transparent, well-governed, and auditable, even the smartest models risk producing biased, unreliable, or harmful outcomes.

Regulation is raising the stakes

Governments and regulators are tightening expectations around AI accountability. The European Union’s AI Act takes a risk-based approach to compliance, with documentation and traceability obligations, while South Africa’s Protection of Personal Information Act (POPIA) enforces strong data-handling responsibilities. These frameworks make one thing clear: organisations must demonstrate responsible data practices long before AI systems make decisions. For data engineers, this elevates their role from backend enablers to frontline custodians of ethical AI.

Data engineering: the first mile of responsible AI

AI is only as good as its inputs. Data engineers handle the collection, cleaning, transformation, and validation of the information that models rely on. If these steps are done haphazardly, without lineage, documentation, or quality checks, biases creep in unnoticed and governance collapses. Ethical AI, then, is less about the magic of machine learning and more about how the data got there in the first place.

I learned this lesson the hard way while working on a data pipeline for a retail analytics platform. Customer data from different regions had been consolidated without proper documentation of the transformations applied. When the AI-driven demand forecasting tool went live, it consistently under-predicted demand in certain geographies. The issue wasn’t with the model; it was with untracked filters applied months earlier. Rebuilding the pipeline with strict lineage and automated tests corrected the bias and restored confidence in the system. This experience reinforced the idea that responsible AI begins with responsible data engineering.
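To make the failure mode above concrete, here is a minimal sketch of how a pipeline step could record what it did, why, and how many rows each region kept. It is an illustration only, not the pipeline described in this article: the `LineageLog` class, the `regions_losing_rows` helper, the field names, and the sample orders are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable

Row = dict  # one record, e.g. {"region": "north", "channel": "online", "units": 3}


@dataclass
class LineageLog:
    """Hypothetical helper: records each step and its per-region row counts."""
    steps: list = field(default_factory=list)

    def apply(self, name: str, reason: str, rows: list[Row],
              transform: Callable[[list[Row]], list[Row]]) -> list[Row]:
        # Count rows per region before and after, so silent drops become visible.
        before = Counter(r["region"] for r in rows)
        result = transform(rows)
        after = Counter(r["region"] for r in result)
        self.steps.append({
            "step": name,
            "reason": reason,
            "rows_before": dict(before),
            "rows_after": dict(after),
        })
        return result


def regions_losing_rows(log: LineageLog, max_loss: float = 0.5) -> list[tuple]:
    """Return (step, region) pairs where a step dropped more than max_loss of a region's rows."""
    flagged = []
    for step in log.steps:
        for region, n_before in step["rows_before"].items():
            n_after = step["rows_after"].get(region, 0)
            if n_before and (n_before - n_after) / n_before > max_loss:
                flagged.append((step["step"], region))
    return flagged


# Made-up sample data: the "south" region only sells through stores.
orders = [
    {"region": "north", "channel": "online", "units": 3},
    {"region": "north", "channel": "online", "units": 5},
    {"region": "north", "channel": "store", "units": 2},
    {"region": "south", "channel": "store", "units": 4},
    {"region": "south", "channel": "store", "units": 6},
]

log = LineageLog()
online_only = log.apply(
    name="keep_online_orders",
    reason="forecast was originally scoped to e-commerce",
    rows=orders,
    transform=lambda rows: [r for r in rows if r["channel"] == "online"],
)

print(regions_losing_rows(log))
```

In this sketch the filter removes every row for one region, and the log both names the step responsible and records the stated reason, which is the kind of trail that was missing in the original pipeline.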

Five principles of ethical data engineering

  1. Transparency and Lineage: Every dataset should be traceable back to its source. Tools like dbt enforce transformations as code, making lineage auditable, while Snowflake integrates this visibility across analytics layers.

  2. Data Quality and Testing: Garbage in, garbage out. Automated testing frameworks ensure datasets meet reliability standards before flowing into AI pipelines.

  3. Governance and Access Control: Fine-grained permissions and governed semantic layers prevent misuse and help AI systems operate only on approved data.

  4. Fairness and Representativeness: Skewed or incomplete datasets can perpetuate discrimination. Auditing models for representativeness helps organisations detect and address these risks early.

  5. Automation with Oversight: Efficiency must be balanced with accountability. Automated workflows should always leave an auditable trail of what transformations occurred and why.

These principles are not abstract ideals; they are practical safeguards that build resilience against legal, reputational, and technical risks.
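As a rough illustration of principles 2 and 4, the sketch below runs a basic quality check on a batch and compares each region's share of rows against an expected share. Everything here is an assumption made for the example: the function names, the required columns, the 10-point tolerance, and the sample batch are hypothetical, and a real project would more likely express such checks in its testing framework of choice.

```python
from collections import Counter


def quality_issues(rows, required=("customer_id", "region", "units")):
    """Return a list of human-readable problems; an empty list means the batch passes."""
    issues = []
    for i, row in enumerate(rows):
        # Every required column must be present and non-null.
        for col in required:
            if row.get(col) is None:
                issues.append(f"row {i}: missing {col}")
        # Sold units can never be negative.
        units = row.get("units")
        if units is not None and units < 0:
            issues.append(f"row {i}: negative units")
    return issues


def underrepresented(rows, expected_share, tolerance=0.10):
    """Return regions whose share of rows falls short of the expected share by more than tolerance."""
    counts = Counter(r["region"] for r in rows if r.get("region") is not None)
    total = sum(counts.values())
    gaps = {}
    for region, share in expected_share.items():
        actual = counts.get(region, 0) / total if total else 0.0
        if share - actual > tolerance:
            gaps[region] = round(actual, 2)
    return gaps


# Made-up batch: one record has no units, and "south" is thinly represented.
batch = [
    {"customer_id": 1, "region": "north", "units": 4},
    {"customer_id": 2, "region": "north", "units": 2},
    {"customer_id": 3, "region": "north", "units": 6},
    {"customer_id": 4, "region": "south", "units": None},
]

print(quality_issues(batch))
print(underrepresented(batch, {"north": 0.5, "south": 0.5}))
```

Running checks like these before data reaches a model means a failing batch can be stopped or flagged at the pipeline stage, and the expected shares themselves become a documented, reviewable assumption.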

Why neglect carries heavy risks

Ignoring ethical data engineering isn’t just an efficiency issue. It can lead to opaque decision-making, regulatory penalties, or even lawsuits. Real-world consequences, like biased hiring algorithms or discriminatory lending models, highlight how damaging poor data practices can be. The cost of retrofitting governance after deployment far outweighs the investment in building responsible pipelines from the start.

Collaboration beyond engineering

Responsible AI is not the sole responsibility of data scientists or engineers. It requires a cross-functional effort involving legal, compliance, product, and domain experts. This collective approach ensures that blind spots, whether ethical, social, or regulatory, are identified before AI systems scale into production.

How Keyrus can help

At Keyrus, we recognise that building responsible AI is as much about cultural alignment and governance as it is about technical tooling. Our teams bring expertise in data engineering, AI implementation, and regulatory compliance to help organisations:

  • Audit pipelines for lineage, quality, and governance gaps.

  • Design frameworks that embed ethical principles into everyday workflows.

  • Implement leading tools such as dbt and Snowflake to automate testing, documentation, and access control.

  • Align diverse stakeholders, from engineers to compliance officers, around a shared vision of responsible AI.

Whether modernising legacy systems or building future-ready AI platforms, Keyrus helps transform responsible data engineering into a strategic advantage. By partnering with us, your organisation will not only safeguard compliance but also strengthen trust in your AI-driven decisions. Contact us at sales@keyrus.co.za.


References

dbt Labs (2025) Build reliable AI agents with the dbt MCP server

dbt Labs (2025) Introducing the dbt MCP Server

Snowflake Inc. (2023) Snowpark for Python – Empowering Secure and Scalable ML

Snowflake Inc. (2024) Data Cloud Security and Governance Overview

Gebru, T. et al. (2018) Datasheets for Datasets

European Commission (2021) Proposal for a Regulation on Artificial Intelligence (AI Act)
