Atlas: A Framework for ML Lifecycle Provenance & Transparency

Marcin Spoczynski, Marcela S. Melara, Sebastian Szyller · February 26, 2025

Summary

Atlas is a framework for improving security and transparency across the ML lifecycle. It builds on open specifications for data and software supply chain provenance to collect verifiable records of model artifact authenticity and metadata. By combining trusted hardware with transparency logs, it preserves data confidentiality and limits unauthorized access while addressing risks such as data poisoning and supply chain attacks.

Related work spans several areas. In content provenance, Origin provides cryptographic tools for content authenticity, origin verification, and tamper-evident tracking, while approaches such as AMP, the News Provenance Project, and Collomosse et al. use distributed ledger technologies to keep transparent, immutable records. For data version control, LakeFS combines Git-like semantics with object store concepts.

Supply chain integrity for ML is an emerging area. Efforts such as bills of materials (BOMs), software BOMs (SBOMs), and frameworks such as AIBOM and SLSA focus on tracking dependencies and maintaining metadata. in-toto collects authenticated claims across supply chain steps, enabling end-to-end policy verification. Sigstore and SCITT provide transparency logs for issuing signing credentials and validating digital signatures, so that these operations are globally visible and auditable.
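The idea of collecting authenticated claims at each supply chain step and then checking them end to end can be illustrated with a small sketch. This is not the Atlas or in-toto API. The function names (`make_claim`, `verify_chain`), the step names, and the per-step keys are all hypothetical, and an HMAC stands in for the digital signatures a real system would use. The sketch assumes a strictly linear pipeline in which each step consumes only outputs of the step before it.

```python
import hashlib
import hmac
import json


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies an artifact."""
    return hashlib.sha256(data).hexdigest()


def make_claim(step: str, inputs: list, outputs: list, key: bytes) -> dict:
    """Build a claim that links input digests to output digests for one step.

    Assumption: an HMAC with a per-step key stands in for a real signature.
    """
    body = {"step": step, "inputs": sorted(inputs), "outputs": sorted(outputs)}
    payload = json.dumps(body, sort_keys=True).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def verify_chain(claims: list, keys: dict, expected_steps: list,
                 final_artifact: bytes) -> bool:
    """Check step order, claim authenticity, and artifact linkage end to end."""
    # Policy: the steps must appear exactly in the expected order.
    if [c["body"]["step"] for c in claims] != expected_steps:
        return False

    produced = None  # digests emitted by the previous step
    for claim in claims:
        body = claim["body"]
        key = keys.get(body["step"])
        if key is None:
            return False
        # Recompute the authentication tag over the canonical claim body.
        payload = json.dumps(body, sort_keys=True).encode()
        expected_sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected_sig, claim["sig"]):
            return False
        # Every input after the first step must come from the previous step.
        if produced is not None and not set(body["inputs"]) <= produced:
            return False
        produced = set(body["outputs"])

    # The artifact under inspection must be an output of the last step.
    return produced is not None and digest(final_artifact) in produced


# Hypothetical two-step pipeline: preprocess the data, then train a model.
keys = {"preprocess": b"preprocess-key", "train": b"train-key"}
raw, clean, model = b"raw dataset", b"clean dataset", b"model weights"

claims = [
    make_claim("preprocess", [digest(raw)], [digest(clean)], keys["preprocess"]),
    make_claim("train", [digest(clean)], [digest(model)], keys["train"]),
]

ok = verify_chain(claims, keys, ["preprocess", "train"], model)
tampered = verify_chain(claims, keys, ["preprocess", "train"], b"poisoned weights")
```

Here `ok` is `True` because every claim authenticates and the digests link from the raw data to the model, while `tampered` is `False` because the substituted weights match no recorded output. A real deployment would use asymmetric signatures and publish the claims to a transparency log so that third parties can audit them.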

Advanced features