California adopts training data transparency rules for generative AI systems

A new California law requiring transparency around the training data used in generative AI systems will take effect on 1 January 2026. The measure introduces disclosure obligations for developers and marks a shift toward greater scrutiny of how generative models are built and updated.

The US state of California has enacted Assembly Bill No. 2013, establishing new transparency requirements for developers of generative artificial intelligence systems. The law takes effect on 1 January 2026 and applies to generative AI systems or services released or substantially modified on or after 1 January 2022.

The legislation defines generative AI as artificial intelligence that can generate derived synthetic content, such as text, images, video, and audio, which emulates the structure and characteristics of its training data. Under the new rules, developers must publish documentation on their websites describing the data used to train their systems. This obligation applies both at the point of initial public release and whenever a system is substantially modified.

The required documentation must provide a high-level summary of the training datasets, including their sources and intended purposes, the types of data points involved, and whether the datasets include copyrighted material or personal information. Developers must also disclose whether synthetic data was used during training. The law does not require disclosure of proprietary details or full datasets; its aim is to improve public understanding of how generative AI systems are developed.
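
The statute does not prescribe a format for this documentation. Purely as an illustration, the fields summarised above could be captured in a structured record along the lines of the following Python sketch; the schema, field names, and example values are assumptions made here for clarity, not anything AB 2013 mandates.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class TrainingDataDisclosure:
    """Hypothetical record of the high-level fields described in the article.

    AB 2013 does not prescribe any schema or file format; this structure
    is an illustrative assumption only.
    """
    system_name: str
    release_date: str                     # initial release or substantial modification
    dataset_sources: list[str]            # high-level sources of the training datasets
    intended_purposes: list[str]          # what the datasets are intended to support
    data_point_types: list[str]           # kinds of data points involved
    contains_copyrighted_material: bool
    contains_personal_information: bool
    synthetic_data_used: bool


# Example: serialise a disclosure for publication on a developer's website.
# All values below are invented for illustration.
disclosure = TrainingDataDisclosure(
    system_name="ExampleGen-1",
    release_date="2026-01-15",
    dataset_sources=["public web crawl", "licensed news archive"],
    intended_purposes=["general-purpose text generation"],
    data_point_types=["web text", "news articles"],
    contains_copyrighted_material=True,
    contains_personal_information=False,
    synthetic_data_used=True,
)
print(json.dumps(asdict(disclosure), indent=2))
```

A structured record of this kind would also make it straightforward to republish updated documentation whenever a system is substantially modified, as the law requires.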

Certain categories of AI systems are exempt from the requirements, including systems whose sole purpose is to help ensure security and system integrity or to operate aircraft in the national airspace, as well as systems developed for national security, military, or defense purposes and made available only to a federal entity.

By mandating training data disclosures, the new law places California among the first jurisdictions to impose binding transparency obligations specifically targeting generative AI, adding to broader debates on accountability, intellectual property, and data protection in AI development.
