You can't secure what you can't see. Today we cover AI asset inventory, classification, and the full data lifecycle — from collection through retirement. As a security manager, your job is to ensure these processes exist, are followed, and are auditable.
Traditional IT asset management tracks hardware, software, and data. AI asset management adds new asset types:
Models — Trained AI models, including version history, training data provenance, and deployment status. A single model may exist in multiple versions across development, staging, and production.
Datasets — Training data, validation data, test data, and production inference data. Each has different security and compliance requirements.
Pipelines — Data processing and model training pipelines. These are infrastructure assets that require their own access controls and monitoring.
APIs and endpoints — Model serving interfaces. Track who can access which models, rate limits, and usage patterns.
Prompts and configurations — For generative AI, system prompts and configuration parameters are security-relevant assets that affect model behavior.
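To make the inventory concrete, here is a minimal sketch of a record covering these asset types. All field names and the `AIAsset` class are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class AssetType(Enum):
    MODEL = "model"
    DATASET = "dataset"
    PIPELINE = "pipeline"
    ENDPOINT = "endpoint"
    PROMPT = "prompt"

@dataclass
class AIAsset:
    asset_id: str
    asset_type: AssetType
    owner: str                  # accountable team or individual
    version: str                # e.g. semantic version or commit hash
    environment: str            # development / staging / production
    provenance: str             # where the asset came from
    linked_assets: list = field(default_factory=list)  # e.g. a model links to its training datasets

# The same model can exist as different versions in different environments,
# so each (version, environment) pair gets its own inventory record.
staging = AIAsset("credit-model", AssetType.MODEL, "risk-team",
                  "2.1.0", "staging", "internal", ["loans-2023-q4"])
prod = AIAsset("credit-model", AssetType.MODEL, "risk-team",
               "2.0.3", "production", "internal", ["loans-2023-q3"])
```

The key design point is the `linked_assets` field: it lets an auditor trace a deployed model back to the datasets and pipelines that produced it.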
Classification should extend your existing data classification scheme. Add AI-specific categories:
Model sensitivity — Based on the data the model was trained on and the decisions it influences. A model trained on PII that makes credit decisions is higher sensitivity than an internal document summarizer.
Data provenance — Where did training data come from? Is it properly licensed? Were consent requirements met?
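One hedged way to operationalize model sensitivity is to let a model inherit the highest classification of its training data, escalated when it influences consequential decisions. The levels and escalation rule below are placeholder assumptions, not a prescribed scheme:

```python
# Hypothetical sensitivity scoring: the model inherits the highest
# classification level of its training data, bumped up one level if
# it makes consequential decisions (credit, hiring, medical, etc.).
DATA_LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}
LEVEL_NAMES = {v: k for k, v in DATA_LEVELS.items()}

def model_sensitivity(training_data_levels, makes_consequential_decisions):
    score = max(DATA_LEVELS[level] for level in training_data_levels)
    if makes_consequential_decisions:
        score = min(score + 1, 3)   # escalate one level, cap at restricted
    return LEVEL_NAMES[score]

# A credit model trained on PII outranks an internal document summarizer
print(model_sensitivity(["confidential"], True))   # restricted
print(model_sensitivity(["internal"], False))      # internal
```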
AI data has a lifecycle that extends beyond traditional data management:
Collection — Where does training data come from? Internal systems, purchased datasets, scraped data, synthetic generation? Each source has different legal, quality, and security implications.
Labeling and annotation — Data labeling introduces human judgment into the pipeline. Who labels? What quality controls exist? Is labeling outsourced? If so, what data protection agreements are in place?
Training — Data used to train models. Requires integrity controls (was the data tampered with?), access controls (who can modify training data?), and quality gates (does the data meet minimum quality standards?).
Validation — Separate dataset used to evaluate model performance. Must be independent of training data. Requires controls against data leakage between training and validation sets.
Production inference — Data that flows through deployed models. Requires input validation, output filtering, and monitoring for anomalous inputs.
Archival and deletion — When do you delete training data? Model weights encode information from training data — does deleting the raw data while keeping the model actually satisfy a deletion obligation?
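The integrity controls mentioned for the training stage are often implemented with a hash manifest: record a checksum for every file in the dataset at approval time, then re-verify before each training run. This is one common approach, sketched here with assumed function names:

```python
import hashlib
import tempfile
from pathlib import Path

def build_manifest(data_dir):
    """Record a SHA-256 hash for every file in a training dataset."""
    manifest = {}
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            manifest[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest

def verify_manifest(data_dir, manifest):
    """Return the names of files whose contents changed since the manifest was built."""
    current = build_manifest(data_dir)
    return [name for name in manifest if current.get(name) != manifest[name]]

# Demo with a throwaway directory
with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "train.csv"
    f.write_text("id,label\n1,0\n")
    manifest = build_manifest(d)
    ok_before = verify_manifest(d, manifest)   # nothing changed yet
    f.write_text("id,label\n1,1\n")            # simulate tampering with a label
    changed = verify_manifest(d, manifest)     # the modified file is flagged
```

In practice the manifest would be stored separately from the data (so a tamperer cannot rewrite both), and verification would be a mandatory gate in the training pipeline.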
AI creates new intellectual property questions that traditional data governance doesn't address:
Training data licensing — Can you legally use this data for training? Many datasets have licenses that restrict commercial use or AI training specifically.
Model ownership — If you fine-tune an open-source model on your proprietary data, who owns the result? What license terms apply?
Output ownership — Who owns content generated by AI? This varies by jurisdiction and is evolving rapidly.
Data subject rights — If a person requests deletion of their data under GDPR, does that require retraining models that learned from their data? The answer is complex and evolving.
As security manager, you ensure these questions have documented answers before deployment, not after a legal dispute.
Models aren't static assets. The data flowing into them drifts, their performance degrades, and they eventually need retirement.
Version control — Every model version should be tracked with its training data, hyperparameters, validation results, and approval status. This is your audit trail.
Retraining triggers — Define when models must be retrained: performance degradation beyond threshold, data drift detection, regulatory changes, or scheduled intervals.
Retirement criteria — When should a model be decommissioned? Performance below acceptable thresholds, regulatory non-compliance, superior replacement available, or business need no longer exists.
Rollback capability — Can you revert to a previous model version if a new version performs poorly? This requires maintaining previous versions and their associated infrastructure.
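Retraining triggers like those above lend themselves to automated checks. The sketch below is illustrative: the thresholds are placeholders, not recommended values, and real drift detection would use a statistical test rather than a single score:

```python
# Illustrative retraining-trigger check. Thresholds are assumptions
# for the example, not recommended operating values.
def needs_retraining(current_accuracy, baseline_accuracy, drift_score,
                     max_accuracy_drop=0.05, max_drift=0.2):
    """Flag a deployed model for retraining when its performance degrades
    beyond threshold or its inputs drift from the training distribution."""
    reasons = []
    if baseline_accuracy - current_accuracy > max_accuracy_drop:
        reasons.append("performance degradation")
    if drift_score > max_drift:
        reasons.append("data drift")
    return reasons   # empty list means no trigger fired

print(needs_retraining(0.91, 0.93, 0.05))   # within tolerance: no trigger
print(needs_retraining(0.85, 0.93, 0.31))   # both triggers fire
```

A check like this runs on a schedule against production monitoring data; any non-empty result opens a change-management ticket rather than retraining automatically, keeping a human approval step in the loop.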
Think of model lifecycle management as change management for AI. The same governance principles apply: approval workflows, testing requirements, rollback plans, and documentation.