Day 22 of 30

Documentation — Model Cards, Data Sheets, and AI Impact Assessments

Documentation is the backbone of AI governance accountability. Without it, governance is unverifiable. Today you'll learn the key documentation artifacts and when each is required.

Documentation artifacts mapped to AI lifecycle stages

Each governance document has a specific point in the lifecycle where it should be created and maintained.

Model Cards

A model card (Mitchell et al., 2019) is a structured document that provides essential information about an AI model to its users and stakeholders.

Standard model card contents:

- Model details — Developer, model type, version, license

- Intended use — Primary and secondary use cases, out-of-scope uses

- Factors — Groups, instrumentation, and environmental factors relevant to the model

- Metrics — Performance measures used and why they were chosen

- Evaluation data — Datasets used for evaluation and their characteristics

- Training data — Overview of training data (more detail may be in datasheets)

- Ethical considerations — Known ethical issues, potential harms, and mitigations

- Limitations and recommendations — Known limitations and guidance for use

Audience: Model cards serve multiple audiences — deployers (how to use the model safely), governance teams (compliance and risk), regulators (oversight), and the public (transparency).
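
One way to make the section list above checkable is to hold a model card as structured data, so a governance reviewer can see which sections are still blank. The sketch below is a minimal illustration, not a standard schema: the `ModelCard` class, its field names, and the example values are all hypothetical, chosen only to mirror the sections listed in this lesson.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelCard:
    """One field per section listed above; an empty string means 'not written yet'."""

    model_details: str = ""
    intended_use: str = ""
    factors: str = ""
    metrics: str = ""
    evaluation_data: str = ""
    training_data: str = ""
    ethical_considerations: str = ""
    limitations_and_recommendations: str = ""

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still blank, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical example values, not taken from any real model card.
card = ModelCard(
    model_details="Example Org, image classifier, v1.2, Apache-2.0",
    intended_use="Photo tagging; out of scope: identity verification",
    limitations_and_recommendations="Not evaluated on individuals under 18 or over 65",
)

for name in card.missing_sections():
    print(f"Section still blank: {name}")
```

A check like this only confirms that a section has some text; whether the content is accurate and sufficient still needs human review.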

Knowledge Check

A model card states that a facial recognition system was "not evaluated on individuals under 18 or over 65." This information is MOST important for:

Known limitations about evaluation scope directly inform deployment decisions. If the model wasn't evaluated on certain age groups, its performance for those populations is unknown. Deployers must consider whether their use case involves these populations and whether additional evaluation is needed.

Datasheets for Datasets

Datasheets (Gebru et al., 2021) document the characteristics of datasets used to train and evaluate AI models.

Key sections:

- Motivation — Why was the dataset created? For what task?

- Composition — What data does it contain? Demographics? Class distributions?

- Collection process — How was data collected? By whom? What consent was obtained?

- Preprocessing — What cleaning, filtering, or transformation was applied?

- Uses — Recommended uses and explicit non-recommended uses

- Distribution — How is the dataset shared? Under what license?

- Maintenance — Who maintains it? How are errors reported and corrected?

Datasheets enable governance by making data decisions traceable and auditable.
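
As a rough illustration of what "traceable and auditable" can mean in practice, the sketch below renders a datasheet as plain text and marks any section that has no answer, so gaps are visible rather than silently omitted. The section keys, the `render_datasheet` function, and the sample answers are assumptions made for this example; they follow the seven sections listed above, not any official template.

```python
# Section keys mirror the seven datasheet sections listed above (names are illustrative).
DATASHEET_SECTIONS = [
    "motivation",
    "composition",
    "collection_process",
    "preprocessing",
    "uses",
    "distribution",
    "maintenance",
]


def render_datasheet(answers: dict[str, str]) -> str:
    """Render every section in a fixed order, flagging sections with no answer."""
    lines = []
    for section in DATASHEET_SECTIONS:
        title = section.replace("_", " ").capitalize()
        answer = answers.get(section, "").strip() or "UNANSWERED"
        lines.append(f"{title}: {answer}")
    return "\n".join(lines)


# Hypothetical answers for illustration only.
sample = {
    "motivation": "Benchmark for support-ticket routing",
    "composition": "40,000 anonymized tickets across 12 categories",
    "uses": "Routing research; not for employee performance evaluation",
}

print(render_datasheet(sample))
```

Rendering unanswered sections explicitly is a design choice: an auditor reading the output can tell the difference between "not applicable" (which someone should write down) and "nobody filled this in".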

EU AI Act Annex IV — Technical Documentation

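For high-risk AI systems, the EU AI Act's Annex IV specifies the mandatory contents of the technical documentation required by Article 11. In summary (the items below paraphrase the Annex and do not follow its official numbering), the documentation must cover: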

1. General description of the AI system (intended purpose, provider, version)

2. Detailed description of elements and development process

3. Monitoring, functioning, and control information

4. Detailed information on the risk management system

5. Description of changes made during the lifecycle

6. Performance metrics and accuracy levels

7. Detailed description of data governance practices

8. Information on human oversight measures

This is more prescriptive than model cards or datasheets — it's a legal requirement, not a best practice.

Version Control and Documentation Lifecycle

AI documentation is a living record that must be updated throughout the system lifecycle.

Version control requirements:

- Every document must have a version number, date, and author

- Changes must be tracked and auditable

- Previous versions must be retained (for regulatory and audit purposes)

- Clear relationship between document versions and model versions
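
The four requirements above can be sketched as a small append-only history. This is an illustrative data structure under stated assumptions, not a prescribed tool: `DocVersion`, `DocHistory`, and the version strings are hypothetical, and a real team would more likely rely on a version control system or document management platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocVersion:
    """One immutable record: version number, date, author, and the model it describes."""

    doc_version: str
    model_version: str
    dated: date
    author: str
    change_summary: str


class DocHistory:
    """Append-only history: earlier versions are retained, never overwritten."""

    def __init__(self) -> None:
        self._versions: list[DocVersion] = []

    def add(self, version: DocVersion) -> None:
        # Refuse to reuse a version number, so every change stays individually traceable.
        if any(v.doc_version == version.doc_version for v in self._versions):
            raise ValueError(f"doc version {version.doc_version} already recorded")
        self._versions.append(version)

    def for_model(self, model_version: str) -> list[DocVersion]:
        """All document versions written for a given model version."""
        return [v for v in self._versions if v.model_version == model_version]

    def undocumented(self, model_versions: list[str]) -> list[str]:
        """Model versions that have no document version linked to them."""
        documented = {v.model_version for v in self._versions}
        return [m for m in model_versions if m not in documented]


# Hypothetical records for illustration.
history = DocHistory()
history.add(DocVersion("1.0", "model-1.0", date(2024, 1, 15), "A. Author", "Initial model card"))
history.add(DocVersion("1.1", "model-1.0", date(2024, 2, 1), "A. Author", "Added fairness metrics"))

print(history.undocumented(["model-1.0", "model-2.0"]))
```

The `undocumented` check is the part that ties document versions to model versions: a model release with no linked record is a gap an audit would flag.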

Update triggers:

- Model retraining or significant update

- New test results or fairness metrics

- Change in intended use or deployment context

- Regulatory changes affecting requirements

- Incident or post-incident findings
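
The update triggers above lend themselves to a simple staleness check: any trigger event dated after the document's last update means the document is out of date. The sketch below assumes a hypothetical event log of `(date, Trigger)` pairs; the `Trigger` enum and the `pending_updates` function are names invented for this example.

```python
from datetime import date
from enum import Enum


class Trigger(Enum):
    """The five update triggers listed above."""

    MODEL_RETRAINED = "model retraining or significant update"
    NEW_TEST_RESULTS = "new test results or fairness metrics"
    USE_CHANGED = "change in intended use or deployment context"
    REGULATORY_CHANGE = "regulatory change affecting requirements"
    INCIDENT = "incident or post-incident findings"


def pending_updates(
    doc_last_updated: date, events: list[tuple[date, Trigger]]
) -> list[Trigger]:
    """Return the triggers that occurred after the document was last updated."""
    return [trigger for when, trigger in events if when > doc_last_updated]


# Hypothetical event log for illustration.
events = [
    (date(2024, 3, 1), Trigger.MODEL_RETRAINED),
    (date(2024, 5, 10), Trigger.INCIDENT),
]

for trigger in pending_updates(date(2024, 4, 1), events):
    print(f"Documentation update needed: {trigger.value}")
```

With a last-updated date of 1 April 2024, only the May incident is reported, because the March retraining predates the last documentation update.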

Common documentation failures:

- Creating documentation only at deployment (missing development-phase decisions)

- Not updating documentation after model changes

- Storing documentation separately from the model (losing traceability)

- Writing documentation for compliance rather than genuine transparency

Knowledge Check

Before deploying a high-risk AI system under the EU AI Act, which documentation must be completed first?

The EU AI Act (Article 11) requires the technical documentation described in Annex IV to be drawn up BEFORE a high-risk AI system is placed on the market or put into service, and to be kept up to date afterwards. Model cards are a best practice but not an EU AI Act requirement. Press releases and FAQs serve communication purposes but don't satisfy legal documentation requirements.

Real-World Scenario

Timnit Gebru and Margaret Mitchell, who co-led Google's Ethical AI team, were among the authors of the foundational research on model cards (Mitchell et al., 2019), and Gebru was lead author of the datasheets work (Gebru et al., 2021). Both became widely used reference points for AI documentation. Ironically, Google itself faced scrutiny in December 2020, when Gebru left the company after a dispute over a paper she had co-authored on the risks of large language models ("On the Dangers of Stochastic Parrots"). Gebru says she was fired; Google described it as accepting her resignation. The paper argued that large language models were being developed and deployed without sufficient documentation of their training data composition, environmental costs, or known biases. The incident highlighted the tension between documentation best practices and organizational incentives to ship products quickly.

This case became a landmark moment for AI governance because it demonstrated that even organizations that pioneer documentation frameworks can struggle to apply them internally. The EU AI Act's Annex IV requirements address the same gap: they require providers of high-risk AI systems to produce comprehensive technical documentation before market placement, as a legal obligation and not a voluntary best practice. The incident also underscored the importance of organizational independence for governance teams: documentation and risk assessments lose their value if the teams producing them face retaliation for candid findings.

For AIGP exam purposes, this scenario illustrates why documentation must be contemporaneous (created during development, not after), why governance teams need organizational independence, and why regulatory mandates like Annex IV exist — voluntary frameworks alone proved insufficient to ensure consistent documentation practices.

Final Check

An AI governance audit finds that a model's documentation was created 6 months after deployment and doesn't reflect the decisions made during development. What is the PRIMARY governance concern?

Post-hoc documentation is unreliable because it relies on memory rather than contemporaneous recording. Development decisions, data choices, and design rationale are best captured in real-time. The primary concern is that the documentation cannot be trusted for governance, audit, or regulatory purposes.

Day 22 Complete

"Model cards, datasheets, and Annex IV documentation serve different audiences but share a common purpose: traceable, auditable accountability. Create documentation during each lifecycle stage — post-hoc documentation is unreliable."
