EU AI Act Art. 10: Data Quality That Withstands Audits
What data engineering must deliver before the high-risk rules take effect (Part 1/3). In many organizations, data quality has long been treated as a hygiene topic: important, but rarely decisive. With the introduction of the High-Risk rules, data quality becomes a verifiable obligation. It must be measurable, controllable, and demonstrable with evidence in day-to-day operations.


We addressed this topic earlier in “The Impact of the EU AI Act on Business Intelligence”.
With this three-part series, we translate the regulatory requirements into operational practice. Part 1 starts at the foundation: data quality under Article 10.
Note: This is not legal advice, but a technical and practice-oriented perspective.
Why This Matters Now
Context: Alongside the phased application of the AI Act, simplifications and potential adjustments to the start date of certain High-Risk requirements are currently being discussed. These include possible linkages to support tools and standards. These proposals are still under discussion and not finalized.
Status Q1/2026: Details and potential easing measures are not yet final. At present, the most relevant orientation comes from published guidelines and FAQs of the European Commission.
The EU AI Act entered into force on 1 August 2024. It will generally apply from 2 August 2026. (🔗) Application is phased. Transitional periods apply until 2 August 2027 for certain High-Risk AI systems that act as safety components in regulated products. (🔗)
For data teams, this means: auditability is not created by documents, but by controlled data processes.
The Core Misconception: Compliance Is Not a Model Problem, It’s a Data Problem
Many discussions around regulation focus on the model layer: documentation, explainability, or frameworks. In mature BI and data landscapes, however, failure usually occurs much earlier. Compliance rarely fails at the model level. It fails because data was never built to be auditable.
This makes data engineering the compliance anchor, not as a blocker, but as the function that embeds measurability and operational control into the data pipeline.
Article 10 in Engineering Terms: From “Adjectives” to Controls
Article 10 (“Data and data governance”) defines requirements for data and data governance in High-Risk AI systems, particularly where data is used for training, validation, or testing.
Completeness note:
If no model training takes place, the obligations under Article 10 (2)–(5) apply to testing datasets and, where provided for, also to validation datasets.
Whether a BI or analytics application falls under the AI Act depends on whether it meets the definition of an “AI system.” The Commission has published guidelines on this definition. If a system falls under the AI Act and is additionally classified as High Risk, the data quality and data governance obligations of Article 10 apply.
Why BI and Data Organizations Can Still Be Affected
Data platforms are often shared infrastructure. As soon as an AI use case accessing this platform falls into a High-Risk category, the upstream data pipeline must provide the required evidence. Examples include HR selection, credit scoring, or critical infrastructure.
Operationally, this means: data quality and data governance must be measurable and auditable for High-Risk contexts. This includes documented data preparation and controls such as representativeness, error minimization, and completeness “to the best extent possible.”
Quality as Code can be an effective building block here. What matters is that controls and evidence are permanently embedded in operations.
Where It Breaks in Practice
These are not edge cases but everyday reality:
- Shadow logic (KPI definitions spread across SQL, Excel, or BI layers)
- Silent upstream schema changes that continue downstream
- Manual fixes without audit trails
- Tests without consequences (failures without stop or escalation)
In High-Risk contexts, these issues become compliance- and audit-relevant. What matters is what is controlled, documented, and traceable in operations.
From Here On It Gets Practical:
Minimum Viable Controls (MVC) for Data Quality
Many organizations fail not due to lack of intent, but due to overload. MVC means: small enough to start, robust enough to scale.
1. Data Contracts and Schema Guards
Expected columns, data types, nullability, and semantics are defined as a contract.
This includes rules for breaking changes, versioning, and communication.
Result: fewer silent breaks and fewer reporting surprises.
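A contract like this can be enforced directly in code. The sketch below shows a minimal schema guard, assuming records arrive as plain dicts; column names and rules are illustrative, and production setups would typically use a tool such as dbt or Great Expectations instead.

```python
# Minimal schema guard: validate incoming records against a declared
# contract. Violations are reported instead of passing silently.

# Illustrative contract: expected columns with type and nullability rules.
CONTRACT = {
    "customer_id": {"type": int, "nullable": False},
    "segment":     {"type": str, "nullable": False},
    "revenue":     {"type": float, "nullable": True},
}

def check_record(record: dict) -> list[str]:
    """Return a list of contract violations for one record."""
    violations = []
    for column, rules in CONTRACT.items():
        if column not in record:
            violations.append(f"missing column: {column}")
            continue
        value = record[column]
        if value is None:
            if not rules["nullable"]:
                violations.append(f"null in non-nullable column: {column}")
        elif not isinstance(value, rules["type"]):
            violations.append(f"wrong type in {column}: {type(value).__name__}")
    # Unexpected columns often signal an unannounced upstream change.
    for column in record:
        if column not in CONTRACT:
            violations.append(f"unexpected column: {column}")
    return violations
```

Returning a list of violations rather than raising immediately leaves the consequence (stop, quarantine, alert) as a separate, explicit decision.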
2. Quality Gates in the Pipeline
Baseline checks such as required fields, uniqueness on business keys, referential integrity, and value ranges can be established quickly. What matters is the consequence: stop, quarantine, or alert plus documented approval.
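To make the "consequence" concrete, here is a sketch of a quality gate that quarantines failing rows and stops the pipeline when a failure-rate tolerance is exceeded. The checks, field names, and threshold are illustrative assumptions, not a prescribed implementation.

```python
# Sketch of a quality gate with an explicit consequence: failing rows are
# quarantined, and the run halts if the failure rate exceeds a tolerance.

def run_quality_gate(rows, max_failure_rate=0.01):
    """Split rows into passed/quarantined; raise if too many fail."""
    passed, quarantined = [], []
    for row in rows:
        # Baseline checks: required field present, value in valid range.
        if row.get("order_id") is not None and row.get("amount", -1) >= 0:
            passed.append(row)
        else:
            quarantined.append(row)
    failure_rate = len(quarantined) / max(len(rows), 1)
    if failure_rate > max_failure_rate:
        # Stop: better a halted pipeline than silently corrupted reports.
        raise RuntimeError(
            f"quality gate failed: {failure_rate:.1%} of rows quarantined"
        )
    return passed, quarantined
```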
From inics practice:
In dbt setups, we frequently use dbt Docs and metadata as a pragmatic entrypoint into a data catalog. Models, definitions, and dependencies become discoverable and traceable for teams, a solid foundation for later catalog and governance expansion.
3. SLOs for Data
Quality becomes measurable as a service. This includes binding expectations for freshness and completeness, e.g. “complete by 06:30 on business days” or “≥ 99.x % of expected records,” with tolerance windows.
Where it truly matters, end-to-end latency targets for critical data products are added.
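The two SLO examples above can be expressed as simple checks. This is a minimal sketch with illustrative deadlines and thresholds; real setups would evaluate these against monitoring data and account for holidays and time zones.

```python
# Sketch of two data SLO checks: freshness ("complete by 06:30 on
# business days") and completeness ("at least 99.5 % of expected records").

from datetime import datetime, time

def freshness_met(last_load: datetime, deadline: time = time(6, 30)) -> bool:
    """SLO: data must be loaded before the deadline on business days."""
    if last_load.weekday() >= 5:  # Saturday/Sunday: out of scope here
        return True
    return last_load.time() <= deadline

def completeness_met(actual: int, expected: int, slo: float = 0.995) -> bool:
    """SLO: at least `slo` of the expected records must have arrived."""
    if expected == 0:
        return True
    return actual / expected >= slo
```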
4. Representativeness and Bias as Pipeline Topics
Representativeness is not something to be handled “later in the model.” A practical starting point includes
- checking coverage (segments, regions, channels present),
- measuring drift and anomalies on key features, and
- making data gaps visible.
This aligns directly with the governance logic of Article 10.
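A coverage check of this kind can live directly in the pipeline. The sketch below flags missing and under-represented segments; the segment names and the minimum-share threshold are illustrative assumptions.

```python
# Sketch of a coverage check: verify that expected segments are present
# in the data and flag gaps before they reach a model.

from collections import Counter

EXPECTED_SEGMENTS = {"retail", "corporate", "public"}  # illustrative

def coverage_report(records, key="segment", min_share=0.05):
    """Report missing and under-represented segments."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values()) or 1
    missing = sorted(EXPECTED_SEGMENTS - counts.keys())
    under_represented = sorted(
        s for s in EXPECTED_SEGMENTS & counts.keys()
        if counts[s] / total < min_share
    )
    return {"missing": missing, "under_represented": under_represented}
```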
5. Minimal Evidence Output from Data Quality Controls
The goal: evidence is generated automatically, without compliance PowerPoint decks.
For Part 1, the scope is deliberately lean. Per run, test results, row counts, freshness status, and pipeline version are stored as evidence.
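As a sketch, such an evidence record can be a few lines of code appended to an audit log after every run. The field names and JSON-lines format are illustrative assumptions; what matters is that the record is produced automatically and is append-only.

```python
# Sketch of a minimal per-run evidence record: test results, row counts,
# freshness status, and pipeline version, written as one JSON line.

import json
from datetime import datetime, timezone

def build_evidence(run_id, test_results, row_count, fresh, pipeline_version):
    """Assemble one evidence record for a pipeline run."""
    return {
        "run_id": run_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "test_results": test_results,   # e.g. {"not_null_id": "pass"}
        "row_count": row_count,
        "freshness_ok": fresh,
        "pipeline_version": pipeline_version,
    }

def write_evidence(record, path):
    """Append the record as one JSON line (append-only audit log)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```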
Important: The broader architectural questions - lineage, metadata, change gates, and audit trails as a system - are covered in Part 2, “Traceability by Design.”
From inics practice (RAG & GenAI): When RAG systems are in use, we additionally log:
- which sources were actually retrieved (document and section IDs),
- which index and embedding versions were used,
- which prompt and pipeline version generated the answer.
This ensures traceability does not have to be reconstructed later.
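The three items above can be captured in a single structured log record per answer. This is a sketch with illustrative field names, not a fixed schema.

```python
# Sketch of a RAG provenance record: retrieved sources, index and
# embedding versions, and the prompt/pipeline version behind an answer.

from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class RagTrace:
    request_id: str
    retrieved_sources: tuple      # (document_id, section_id) pairs
    index_version: str
    embedding_model_version: str
    prompt_version: str
    pipeline_version: str

    def as_log_record(self) -> dict:
        """Flatten to a dict suitable for structured logging."""
        return asdict(self)
```

Making the record frozen discourages after-the-fact edits: provenance is written once, at answer time, and then only read.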
6. Clear Ownership: Who Decides on a Failure?
A minimal setup that works in practice:
- Data Owner (Business): defines goals and tolerances
- Engineering Owner: responsible for implementation and root-cause resolution
- Platform & Ops: provide monitoring, runbooks, and escalation paths
Without ownership, alerts remain noise. With ownership, they become actionable.
Conclusion
The AI Act does not reward more documentation. It rewards platforms that can establish and prove quality in operation.
Organizations that treat data quality as an engineering and ownership topic reduce risk, while simultaneously increasing stability, delivery speed, and trust.
In Part 2 (2/3), we will focus on Traceability by Design and show how lineage, metadata, and change gates can be implemented so that auditability emerges directly from operations.

Ready for the Start of the High-Risk Rules?
We show you how to make data quality operationally measurable and resilient for High-Risk contexts.
Request a free initial consultation

Thomas Howert
Founder and Business Intelligence expert for over 10 years.
