Day 3 of 18

AI Asset and Data Lifecycle Management

⏱ 18 min 📊 Advanced ISACA AAISM Certification Prep

You can't secure what you can't see. Today we cover AI asset inventory, classification, and the full data lifecycle — from collection through retirement. As a security manager, your job is to ensure these processes exist, are followed, and are auditable.

AI asset inventory and classification

Traditional IT asset management tracks hardware, software, and data. AI asset management adds new asset types:

Models — Trained AI models, including version history, training data provenance, and deployment status. A single model may exist in multiple versions across development, staging, and production.

Datasets — Training data, validation data, test data, and production inference data. Each has different security and compliance requirements.

Pipelines — Data processing and model training pipelines. These are infrastructure assets that require their own access controls and monitoring.

APIs and endpoints — Model serving interfaces. Track who can access which models, rate limits, and usage patterns.

Prompts and configurations — For generative AI, system prompts and configuration parameters are security-relevant assets that affect model behavior.
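The asset types above can be captured in a single registry schema. This is a minimal sketch, not an ISACA-prescribed format; the field and class names (`AIAsset`, `AssetType`, `provenance`) are illustrative assumptions:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AssetType(Enum):
    MODEL = "model"
    DATASET = "dataset"
    PIPELINE = "pipeline"
    ENDPOINT = "endpoint"          # model-serving API
    PROMPT_CONFIG = "prompt_config"  # system prompts / generation parameters

@dataclass
class AIAsset:
    asset_id: str
    asset_type: AssetType
    owner: str                        # accountable business owner
    version: Optional[str] = None     # models/prompts under change control
    provenance: Optional[str] = None  # datasets: where the data came from
    deployment_status: str = "development"  # development / staging / production

# One logical model can yield several inventory entries, one per version:
registry = [
    AIAsset("credit-scorer", AssetType.MODEL, "risk-team",
            version="2.1", deployment_status="production"),
    AIAsset("credit-scorer", AssetType.MODEL, "risk-team",
            version="3.0-rc", deployment_status="staging"),
]
```

Keeping version and deployment status per entry is what makes the inventory auditable: the same model in staging and production is two assets with two control states.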

Classification should extend your existing data classification scheme. Add AI-specific categories:

Model sensitivity — Based on the data the model was trained on and the decisions it influences. A model trained on PII that makes credit decisions is higher sensitivity than an internal document summarizer.

Data provenance — Where did training data come from? Is it properly licensed? Were consent requirements met?
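One way to operationalize model sensitivity is to derive a tier from the training data's classification and the impact of the decisions the model influences. The tiers, labels, and scoring below are an illustrative scheme, not a standard:

```python
def model_sensitivity(training_data_class: str, decision_impact: str) -> str:
    """Derive a model's sensitivity tier from its training data
    classification and decision impact (illustrative scoring)."""
    data_rank = {"public": 0, "internal": 1, "confidential": 2, "pii": 3}
    impact_rank = {"informational": 0, "operational": 1, "consequential": 2}
    score = data_rank[training_data_class] + impact_rank[decision_impact]
    if score >= 4:
        return "high"
    if score >= 2:
        return "medium"
    return "low"

# A PII-trained credit-decision model outranks an internal summarizer:
model_sensitivity("pii", "consequential")       # -> "high"
model_sensitivity("internal", "informational")  # -> "low"
```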

[Figure: AI asset lifecycle flowchart showing six phases, from acquisition through retirement.]
The AI asset lifecycle spans six phases. Security controls are required at every stage, not just deployment.
Knowledge Check
During an AI asset inventory, the security team discovers 47 AI models in production but only 12 were registered in the governance system. What does this gap PRIMARILY indicate?
The gap indicates a **control failure,** not just a training or tooling issue. Effective governance requires preventive controls (deployment gates that require registration), not just detective controls (after-the-fact inventory). ISACA focuses on the control gap, not the symptom.
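A preventive deployment gate of the kind the answer describes can be sketched in a few lines. The function and exception names are hypothetical; in practice this check would live in your CI/CD pipeline:

```python
class UnregisteredModelError(Exception):
    """Raised when a model missing from the governance registry tries to deploy."""

def deployment_gate(model_id: str, governance_registry: set[str]) -> None:
    """Preventive control: refuse deployment of any model absent from the
    governance registry, instead of discovering the gap in a later audit."""
    if model_id not in governance_registry:
        raise UnregisteredModelError(
            f"{model_id} is not registered; register it before deploying."
        )

approved_models = {"fraud-detector-v4", "doc-summarizer-v2"}
deployment_gate("fraud-detector-v4", approved_models)   # passes silently
# deployment_gate("shadow-model-v1", approved_models)   # would raise UnregisteredModelError
```

With this gate in place, the 47-versus-12 discrepancy cannot arise: an unregistered model never reaches production.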

Data lifecycle for AI

AI data has a lifecycle that extends beyond traditional data management:

Collection — Where does training data come from? Internal systems, purchased datasets, scraped data, synthetic generation? Each source has different legal, quality, and security implications.

Labeling and annotation — Data labeling introduces human judgment into the pipeline. Who labels? What quality controls exist? Is labeling outsourced? If so, what data protection agreements are in place?

Training — Data used to train models. Requires integrity controls (was the data tampered with?), access controls (who can modify training data?), and quality gates (does the data meet minimum quality standards?).

Validation — Separate dataset used to evaluate model performance. Must be independent of training data. Requires controls against data leakage between training and validation sets.

Production inference — Data that flows through deployed models. Requires input validation, output filtering, and monitoring for anomalous inputs.

Archival and deletion — When do you delete training data? Model weights encode information from training data — does deleting the data but keeping the model satisfy retention requirements?
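The leakage control mentioned in the validation stage can be sketched as a set comparison over record fingerprints. Hashing records (rather than retaining them raw) is an illustrative choice that also helps at the archival stage, since fingerprints carry no personal data:

```python
import hashlib

def record_fingerprint(record: str) -> str:
    """Stable fingerprint of a record, so datasets can be compared
    without retaining raw (possibly personal) data."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()

def leakage_check(training: list[str], validation: list[str]) -> set[str]:
    """Detective control for the validation stage: fingerprints appearing
    in both sets indicate leakage that invalidates performance estimates."""
    train_fps = {record_fingerprint(r) for r in training}
    return {record_fingerprint(r) for r in validation} & train_fps

overlap = leakage_check(["rec-1", "rec-2"], ["rec-2", "rec-3"])
len(overlap)  # -> 1: "rec-2" leaked into the validation set
```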

IP and data rights for AI

AI creates new intellectual property questions that traditional data governance doesn't address:

Training data licensing — Can you legally use this data for training? Many datasets have licenses that restrict commercial use or AI training specifically.

Model ownership — If you fine-tune an open-source model on your proprietary data, who owns the result? What license terms apply?

Output ownership — Who owns content generated by AI? This varies by jurisdiction and is evolving rapidly.

Data subject rights — If a person requests deletion of their data under GDPR, does that require retraining models that learned from their data? The answer is complex and evolving.

As the security manager, your job is to ensure these questions have documented answers before deployment, not after a legal dispute.

Knowledge Check
A team has fine-tuned a large language model using customer support transcripts. A customer requests data deletion under GDPR. What is the MOST appropriate governance response?
This is a governance question, not a technical one. The organization should have **documented procedures** for handling data subject rights requests involving AI models. The answer depends on whether the model can regenerate personal data and the organization's legal interpretation — which should be pre-determined in policy.

Model versioning and lifecycle

Models aren't static. They change, degrade, and eventually need retirement.

Version control — Every model version should be tracked with its training data, hyperparameters, validation results, and approval status. This is your audit trail.

Retraining triggers — Define when models must be retrained: performance degradation beyond threshold, data drift detection, regulatory changes, or scheduled intervals.

Retirement criteria — When should a model be decommissioned? Performance below acceptable thresholds, regulatory non-compliance, superior replacement available, or business need no longer exists.

Rollback capability — Can you revert to a previous model version if a new version performs poorly? This requires maintaining previous versions and their associated infrastructure.

Think of model lifecycle management as change management for AI. The same governance principles apply: approval workflows, testing requirements, rollback plans, and documentation.
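The change-management principles above can be sketched as a minimal lifecycle record: each version carries its training data reference, hyperparameters, validation results, and approval status, and the previous version is retained so rollback stays possible. Class and field names are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelVersion:
    version: str
    training_data_ref: str   # pointer/hash of the training dataset snapshot
    hyperparameters: dict
    validation_accuracy: float
    approved: bool           # passed the approval workflow

class ModelLifecycle:
    """Change management for one model: only approved versions deploy,
    and prior versions are retained so rollback is always possible."""

    def __init__(self) -> None:
        self.history: list[ModelVersion] = []  # oldest first; last = current

    def deploy(self, candidate: ModelVersion) -> None:
        if not candidate.approved:
            raise PermissionError(f"{candidate.version} lacks approval")
        self.history.append(candidate)

    def rollback(self) -> ModelVersion:
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()       # retire the faulty current version
        return self.history[-1]  # previous version becomes current again
```

The audit trail falls out of the same structure: `history` is exactly the record of what was deployed, with what data and what approvals.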

Final Check
A production AI model's accuracy has dropped from 94% to 87% over three months. What is the BEST governance response?
ISACA wants **process-driven responses.** The correct answer is to evaluate against predefined thresholds and follow documented procedures. The threshold determines whether this is acceptable drift or requires action — and the procedure determines what action to take.
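The scenario above can be sketched as a threshold check. The threshold values here are illustrative assumptions, not ISACA-prescribed figures; the point is that the response is determined by predefined policy, not by judgment in the moment:

```python
def evaluate_drift(baseline: float, current: float,
                   alert_drop: float = 0.03,
                   retrain_drop: float = 0.05) -> str:
    """Evaluate accuracy degradation against predefined thresholds
    (threshold values are illustrative policy parameters)."""
    drop = baseline - current
    if drop >= retrain_drop:
        return "trigger retraining procedure"
    if drop >= alert_drop:
        return "alert model owner; increase monitoring"
    return "acceptable drift; continue monitoring"

evaluate_drift(0.94, 0.87)  # 7-point drop -> "trigger retraining procedure"
```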
📦
Day 3 Complete
"You can't govern AI you can't see. Asset inventory, data lifecycle management, and model versioning are the foundation of AI security management."
Next Lesson
Building an AI Security Program