Day 19 of 30

Data Governance During AI Development


Data is the foundation of every AI system. Bad data governance doesn't just produce bad models — it produces discriminatory, non-compliant, and potentially dangerous models. Today you'll learn the governance controls that must be applied to AI development data.

Data Quality Dimensions for AI

Traditional data quality dimensions (accuracy, completeness, timeliness) are necessary but not sufficient for AI. Add these AI-specific dimensions:

Representativeness — Does the data adequately represent all groups the AI will affect? If a facial recognition system is trained primarily on lighter-skinned faces, it will perform poorly on darker-skinned faces. This isn't just a technical problem — it's a governance failure.

Label accuracy — For supervised learning, labels define truth. Inaccurate or inconsistent labels directly degrade model quality. Governance must ensure labeling guidelines, quality assurance, and inter-rater reliability checks.

Temporal relevance — Is the data current enough for the intended use? A credit scoring model trained on pre-pandemic data may not reflect current economic conditions.

Distributional alignment — Does the training data distribution match the deployment environment? A model trained on US data deployed in EU markets may produce unreliable results.

Sufficiency — Is there enough data to train a reliable model? Insufficient data, especially for minority classes, leads to unreliable predictions for those groups.
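Distributional alignment can be checked quantitatively before deployment. A minimal sketch, using the Population Stability Index (PSI) to compare a training sample against a deployment sample; the synthetic income data and the 0.2 drift threshold are illustrative assumptions, not fixed standards:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between two 1-D samples."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    # Widen the outer edges so deployment values outside the
    # training range are still counted
    edges[0], edges[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0) / division by zero
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train = rng.normal(50_000, 15_000, 10_000)   # e.g., training-set incomes (made up)
deploy = rng.normal(62_000, 18_000, 10_000)  # shifted deployment population

print(f"PSI = {psi(train, deploy):.3f}")  # > 0.2 is a common rule of thumb for major drift
```

A PSI above the chosen threshold is a governance trigger: document the shift and decide whether retraining or restricted deployment is warranted.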

Knowledge Check
A medical AI diagnostic system trained primarily on data from one ethnic group shows significantly lower accuracy for other ethnic groups. Which data quality dimension was most likely neglected?
Representativeness is the key issue — the training data did not adequately represent all groups the AI would serve. This is a common and well-documented problem in medical AI, leading to disparate performance across demographic groups.

Bias Detection in Datasets

Bias can enter data at multiple points. Governance requires systematic detection:

Historical bias — Data reflecting past discrimination (e.g., historical hiring data in industries that excluded certain groups).

Selection bias — Non-random sampling that overrepresents or underrepresents certain populations.

Measurement bias — Inconsistent data collection methods across groups (e.g., different diagnostic criteria applied to different demographics).

Label bias — Annotators' subjective judgments reflecting personal or cultural biases.

Aggregation bias — Combining data from different contexts without accounting for population differences.

Governance response: Require demographic parity analysis of training datasets before model development begins. Document any identified biases and the mitigation strategies employed.
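A demographic parity analysis of a training set can be as simple as comparing each group's share of the data to its share of the affected population. A minimal sketch; the group names, counts, benchmark shares, and the 5-point tolerance are all invented for illustration:

```python
# Hypothetical dataset composition vs. a reference population benchmark
dataset_counts = {"group_a": 7_200, "group_b": 1_900, "group_c": 900}
population_share = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

total = sum(dataset_counts.values())
for group, count in dataset_counts.items():
    share = count / total
    gap = share - population_share[group]
    # 5-percentage-point tolerance is an assumed policy choice
    flag = "UNDER-REPRESENTED" if gap < -0.05 else "ok"
    print(f"{group}: {share:.1%} of data vs {population_share[group]:.1%} "
          f"of population ({flag})")
```

Flagged groups feed directly into the documentation requirement above: record the gap and the chosen mitigation before model development proceeds.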

Data Labeling Governance

Data labeling (annotation) is where human judgment enters the AI pipeline. Governance controls include:

Annotator guidelines — Clear, detailed instructions for labeling decisions; reducing ambiguity improves consistency.

Quality assurance — Double-labeling (two annotators label the same data independently), spot-checking, and regular accuracy reviews.

Inter-rater reliability — Statistical measures (Cohen's kappa, Fleiss' kappa) of agreement between annotators. Low reliability indicates unclear guidelines or subjective labeling.

Annotator demographics — The composition of the annotator team can introduce bias. A monolingual team labeling sentiment in multilingual data will produce biased labels.

Working conditions — Ethical treatment of annotators, especially for content moderation and sensitive data. This is both an ethical and quality concern — fatigued or distressed annotators produce lower-quality labels.
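Inter-rater reliability is straightforward to compute. A minimal sketch of Cohen's kappa for two annotators; the six example labels are made up, and a real audit would load annotation exports instead:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labeled at random using
    # their observed label frequencies
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n**2
    return (observed - expected) / (1 - expected)

a = ["urgent", "urgent", "non-urgent", "urgent", "non-urgent", "non-urgent"]
b = ["urgent", "non-urgent", "non-urgent", "urgent", "urgent", "non-urgent"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # → kappa = 0.33
```

Note that kappa can be low even when raw agreement looks acceptable, because it discounts agreement expected by chance; that is exactly why the knowledge check below uses 55% raw agreement as a red flag.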

Knowledge Check
Two annotators independently label the same dataset of customer complaints as "urgent" or "non-urgent." They agree on only 55% of labels. What governance action is MOST appropriate?
Low inter-rater agreement (55%) indicates ambiguous labeling guidelines. The fix is to improve the guidelines, not to average disagreements or defer to seniority. Increasing dataset size doesn't fix inconsistent labeling — it just creates more inconsistently labeled data.

Data Documentation Standards

Two widely recognized standards for AI data documentation:

Datasheets for Datasets (Gebru et al., 2021) — A structured documentation template covering:

- Motivation: Why was the dataset created?

- Composition: What's in the dataset? Demographics?

- Collection: How was the data collected? By whom?

- Preprocessing: What cleaning or transformation was applied?

- Uses: What is the dataset intended for? What should it NOT be used for?

- Distribution: How is the dataset shared?

- Maintenance: Who maintains the dataset? How are errors corrected?

Data Cards — A similar concept used by organizations like Google, providing a summary of dataset characteristics, intended uses, and limitations.

These documentation artifacts serve governance purposes: they create accountability, enable auditing, and inform downstream users about data limitations.
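Datasheet content can also be kept as structured data so it is versioned, validated, and queried alongside the dataset itself. A minimal sketch, assuming a simplified subset of the Gebru et al. template; the field names and example values are invented:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Datasheet:
    """A simplified, machine-readable datasheet record (illustrative)."""
    name: str
    motivation: str
    composition: str
    collection: str
    preprocessing: str
    intended_uses: list[str]
    prohibited_uses: list[str] = field(default_factory=list)
    maintainer: str = "unassigned"

sheet = Datasheet(
    name="complaints-v2",
    motivation="Train an urgency-triage classifier for support tickets.",
    composition="12k English tickets, 2022-2024; no demographic fields.",
    collection="Exported from the support platform with customer consent.",
    preprocessing="PII redacted; duplicates removed.",
    intended_uses=["urgency triage"],
    prohibited_uses=["customer credit decisions"],
    maintainer="data-governance-team",
)
# Downstream users can programmatically check documented limitations
print(asdict(sheet)["prohibited_uses"])
```

Storing the record next to the data (e.g., in the same repository) makes the accountability and audit purposes above enforceable rather than aspirational.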

Final Check
An organization discovers that its AI training dataset contains historical bias — the data reflects hiring decisions from a period when the company actively discriminated against a protected group. The BEST governance response is:
The best response is to identify, document, and mitigate the bias. Removing protected group data creates an even less representative dataset. Continuing with a disclaimer doesn't address the harm. Collecting entirely new data may be impractical and doesn't guarantee bias-free data. Mitigation techniques like resampling and reweighting, followed by validation, directly address the issue.
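Reweighting, one of the mitigation techniques mentioned above, can be sketched in a few lines: give each record an inverse-frequency weight so every group contributes equally during training. The group labels and counts are illustrative assumptions:

```python
from collections import Counter

groups = ["a"] * 80 + ["b"] * 15 + ["c"] * 5   # biased historical sample
counts = Counter(groups)
n, k = len(groups), len(counts)

# weight = n / (k * group_count), so each group's weights sum to n / k
weights = [n / (k * counts[g]) for g in groups]

per_group_total = {g: sum(w for g2, w in zip(groups, weights) if g2 == g)
                   for g in counts}
print(per_group_total)  # each group totals n/k ≈ 33.3
```

Reweighting changes the loss contribution, not the data itself, so validation on a representative holdout set (as the answer notes) remains essential.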
🎯 Day 19 Complete
"AI data governance goes beyond traditional data quality — add representativeness, label accuracy, and distributional alignment. Systematic bias detection must happen before training begins. Document everything with datasheets."
Next Lesson
AI Risk Assessment Methodologies