Audio and Video Content Provenance
Encypher supports six audio formats and four video formats. C2PA manifests are embedded in container-native structures that travel with files through distribution and streaming.
Container-Native Embedding
Audio and video files use container formats - structured wrappers around compressed media data. The container includes metadata fields, timing information, track definitions, and now, with C2PA, provenance manifests.
Each audio and video format has a container type with specific mechanisms for storing extension data. C2PA uses these native extension mechanisms rather than adding new metadata fields that could be stripped by processing tools:
- RIFF containers (WAV, AVI): JUMBF data in a dedicated RIFF chunk
- MP3 (ID3 container): JUMBF data in an ID3v2 GEOB (General Encapsulated Object) frame
- ISO BMFF containers (MP4, MOV, M4V, M4A, AAC): JUMBF data in a uuid box
These are the same extension mechanisms that other metadata uses. A uuid box in an MP4 file, for example, is a standard part of the ISO BMFF specification. Processing tools that do not understand the C2PA content typically skip over it and leave it intact rather than stripping it.
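As a rough illustration of the RIFF case, the sketch below walks the chunks of a RIFF byte string (WAV or AVI) and returns their IDs and payloads. The chunk ID `C2PA` used in the example is an assumption to check against the C2PA specification, and `list_riff_chunks` and `make_chunk` are hypothetical helper names, not part of any Encypher or C2PA SDK.

```python
import struct


def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Serialize one RIFF chunk: 4-byte ID, little-endian size, payload, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def list_riff_chunks(data: bytes):
    """Return (form_type, [(chunk_id, payload), ...]) for a RIFF byte string."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    riff_size = struct.unpack_from("<I", data, 4)[0]
    form_type = data[8:12].decode("ascii")
    end = min(len(data), 8 + riff_size)
    chunks = []
    offset = 12
    while offset + 8 <= end:
        chunk_id = data[offset:offset + 4].decode("ascii", errors="replace")
        size = struct.unpack_from("<I", data, offset + 4)[0]
        chunks.append((chunk_id, data[offset + 8:offset + 8 + size]))
        # Chunks are padded to an even length; the pad byte is not counted in size.
        offset += 8 + size + (size & 1)
    return form_type, chunks


# Build a tiny in-memory RIFF/WAVE file with a placeholder provenance chunk.
body = make_chunk(b"fmt ", b"\x01\x00\x02") + make_chunk(b"C2PA", b"jumbf-bytes")
riff = b"RIFF" + struct.pack("<I", 4 + len(body)) + b"WAVE" + body
form_type, chunks = list_riff_chunks(riff)
```

A tool that only cares about `fmt ` and `data` can step over the extra chunk using its declared size, which is why an unknown chunk can pass through untouched.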
Audio Formats
| Format | Container | Primary Use Cases |
|---|---|---|
| WAV (audio/wav) | RIFF chunk | Music production, broadcast audio, podcast masters |
| MP3 (audio/mpeg) | ID3 GEOB frame | Podcast distribution, music streaming, audio journalism |
| M4A (audio/mp4) | ISO BMFF uuid box | Apple Music, iTunes distribution, high-quality audio |
| AAC (audio/aac) | ISO BMFF uuid box | Streaming platforms, mobile audio, broadcasting |
| FLAC (audio/flac) | Custom JUMBF/COSE | Lossless music, archival audio, hi-fi streaming |
| MPA (audio/mpa) | MPEG audio frame | Broadcast audio, multimedia applications |
The ID3 GEOB Frame (MP3)
MP3 files use the ID3 tagging format for metadata. ID3 supports multiple frame types including GEOB (General Encapsulated Object), which stores arbitrary binary data with a MIME type identifier. Encypher stores the C2PA JUMBF manifest data in a GEOB frame identified by the C2PA MIME type.
This is the C2PA-standard embedding method for MP3. The GEOB frame is preserved by ID3-aware tools that process MP3 files. Most podcast distribution platforms and music distribution services pass ID3 tags through without stripping them, making the C2PA manifest durable through typical podcast and music distribution workflows.
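To make the GEOB layout concrete, here is a minimal sketch that builds and parses an ID3v2.3 GEOB frame: a text-encoding byte, then a null-terminated MIME type, filename, and description, followed by the binary object. The MIME type string `application/x-c2pa-manifest-store` is an assumption to confirm against the C2PA specification, the function names are hypothetical, and ID3v2.4 (which uses synchsafe frame sizes) is not handled.

```python
import struct

# Assumed C2PA MIME type for the GEOB frame; verify against the C2PA specification.
C2PA_MIME = "application/x-c2pa-manifest-store"


def build_geob_frame(mime_type: str, payload: bytes,
                     filename: str = "", description: str = "") -> bytes:
    """Build an ID3v2.3 GEOB frame (header plus body) with ISO-8859-1 text fields."""
    body = (
        b"\x00"  # text encoding byte: 0 means ISO-8859-1
        + mime_type.encode("latin-1") + b"\x00"
        + filename.encode("latin-1") + b"\x00"
        + description.encode("latin-1") + b"\x00"
        + payload
    )
    # ID3v2.3 frame header: 4-byte ID, 4-byte big-endian size, 2 flag bytes.
    return b"GEOB" + struct.pack(">I", len(body)) + b"\x00\x00" + body


def parse_geob_frame(frame: bytes):
    """Return (mime_type, filename, description, payload) from an ID3v2.3 GEOB frame."""
    if frame[:4] != b"GEOB":
        raise ValueError("not a GEOB frame")
    size = struct.unpack_from(">I", frame, 4)[0]
    body = frame[10:10 + size]
    if not body or body[0] != 0:
        raise ValueError("only ISO-8859-1 text encoding is handled in this sketch")
    # Split off the three terminated text fields; the payload may itself contain zeros.
    mime, rest = body[1:].split(b"\x00", 1)
    filename, rest = rest.split(b"\x00", 1)
    description, payload = rest.split(b"\x00", 1)
    return (mime.decode("latin-1"), filename.decode("latin-1"),
            description.decode("latin-1"), payload)


frame = build_geob_frame(C2PA_MIME, b"\x00\x01jumbf\x00bytes", description="c2pa")
parsed = parse_geob_frame(frame)
```

A verifier would scan the ID3 tag for GEOB frames and pick the one whose MIME type matches; the placeholder payload here stands in for real JUMBF data.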
MP3 is particularly important for podcast provenance. A podcast episode signed at production carries provenance through distribution to Apple Podcasts, Spotify, and other platforms. The episode's authorship, production organization, and publication date are documented in the file itself.
Video Formats
| Format | Container | Primary Use Cases |
|---|---|---|
| MP4 (video/mp4) | ISO BMFF uuid box | Web video, streaming platforms, news video |
| MOV (video/quicktime) | ISO BMFF uuid box (QuickTime) | Professional video production, film, Apple ecosystem |
| M4V (video/x-m4v) | ISO BMFF uuid box | Apple TV, iTunes, DRM-protected video |
| AVI (video/x-msvideo) | RIFF chunk | Legacy video archives, Windows ecosystem, surveillance |
ISO BMFF: The Dominant Video Container
MP4, MOV, M4V, and M4A all use ISO Base Media File Format (ISO BMFF) as their container architecture. ISO BMFF organizes file content into "boxes" (also called "atoms" in Apple's QuickTime terminology). Each box has a type identifier and length, and boxes can contain other boxes.
C2PA uses a "uuid" box type - a box identified by a 16-byte UUID - to store the JUMBF manifest in ISO BMFF files. The UUID is the C2PA-designated identifier. File processing tools that do not understand the uuid box type skip it during processing, preserving it intact.
This makes MP4 provenance durable through common video processing workflows. Processing that leaves the container structure and the signed media data unchanged preserves the manifest; re-encoding the media changes the bytes that the manifest's hash covers, so re-encoded output has to be signed again. Streaming platforms that serve MP4 files directly pass the manifest to recipients. The manifest survives download and local storage.
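The box structure can be sketched in a few lines. The example below walks the top-level boxes of an ISO BMFF byte string, reads the 16-byte identifier of any uuid box, and returns the payload of the box whose identifier matches. `EXAMPLE_UUID` is a made-up placeholder rather than the C2PA-designated value, the helper names are hypothetical, and the layout of the payload after the identifier (defined by the C2PA specification) is not parsed here.

```python
import struct
import uuid


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize a box with a 32-bit size field: size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_top_level_boxes(data: bytes):
    """Yield (box_type, usertype_or_None, payload) for each top-level box."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit "largesize" follows the type field
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:  # the box extends to the end of the file
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("malformed box at offset %d" % offset)
        usertype = None
        if box_type == b"uuid":
            if size < header + 16:
                raise ValueError("uuid box too short")
            usertype = uuid.UUID(bytes=data[offset + header:offset + header + 16])
            header += 16
        yield box_type.decode("latin-1"), usertype, data[offset + header:offset + size]
        offset += size  # a reader that ignores this box simply skips ahead by its size


def find_uuid_payload(data: bytes, wanted: uuid.UUID):
    """Return the payload of the first top-level uuid box matching `wanted`, else None."""
    for _box_type, usertype, payload in iter_top_level_boxes(data):
        if usertype == wanted:
            return payload
    return None


# Placeholder identifier for illustration only; not the C2PA-designated UUID.
EXAMPLE_UUID = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
sample = (
    make_box(b"ftyp", b"isom\x00\x00\x00\x00")
    + make_box(b"uuid", EXAMPLE_UUID.bytes + b"manifest")
    + make_box(b"mdat", b"\x01\x02")
)
```

Because every box declares its own size, a player that does not recognize the identifier can advance past the box without interpreting it.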
Deepfake Detection and Media Provenance
The same C2PA provenance infrastructure that authenticates human-created audio and video also provides the mechanism for identifying AI-generated synthetic media. A C2PA manifest on an AI-generated video records that it was generated by a specific AI system, supporting the transparency requirements of EU AI Act Article 50 (numbered Article 52 in earlier drafts of the Act).
For broadcasters and news organizations, the practical value is inverse: signing authentic footage with C2PA provenance creates a documented record of what is real. When a question arises about whether footage has been manipulated, provenance verification either confirms the footage is unmodified (hash matches the signed original) or detects modification (hash does not match).
This is a materially different capability from deepfake detection tools, which analyze video frames for synthetic patterns. Detection tools estimate whether content is synthetic. Provenance verification confirms whether signed content is unmodified. Both capabilities are useful; they address different questions.
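The hash comparison at the core of provenance verification can be sketched as follows. This is a simplified illustration, assuming a SHA-256 hash over the file with the embedded manifest's byte range excluded; checking the signature on the manifest itself is omitted, and `content_hash` and `is_unmodified` are hypothetical names.

```python
import hashlib
import hmac


def content_hash(data: bytes, exclusions=()) -> bytes:
    """SHA-256 over `data`, skipping each (start, length) byte range in `exclusions`."""
    digest = hashlib.sha256()
    position = 0
    for start, length in sorted(exclusions):
        digest.update(data[position:start])
        position = max(position, start + length)
    digest.update(data[position:])
    return digest.digest()


def is_unmodified(data: bytes, signed_hash: bytes, exclusions=()) -> bool:
    """Compare the recomputed hash with the hash recorded at signing time."""
    return hmac.compare_digest(content_hash(data, exclusions), signed_hash)


# The manifest occupies bytes 4..11 in this toy file and is excluded from the hash.
original = b"HEAD" + b"MANIFEST" + b"frames"
manifest_range = [(4, 8)]
signed_hash = content_hash(original, manifest_range)

tampered = b"HEAD" + b"MANIFEST" + b"framez"
```

With these values, `is_unmodified(original, signed_hash, manifest_range)` is true and the same call on `tampered` is false. The result is a yes-or-no answer about the bytes rather than a likelihood score.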
Live Streams
C2PA 2.3 Section 19 defines provenance for live video streams. Live stream provenance uses per-segment manifests linked in a backward chain, so each segment of the stream carries its own signed manifest and points to the previous segment. This creates a continuous tamper-evident record across a live broadcast.
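The backward-chain idea can be illustrated with a small hash chain. This is a conceptual sketch only: it is not the C2PA wire format, signatures are left out, and `SegmentRecord`, `build_chain`, and `verify_chain` are hypothetical names. Each record stores the hash of its segment and the digest of the previous record.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SegmentRecord:
    index: int
    segment_hash: bytes
    previous_record_hash: Optional[bytes]  # None for the first segment

    def digest(self) -> bytes:
        """Digest of this record; the next record points back to it."""
        h = hashlib.sha256()
        h.update(self.index.to_bytes(8, "big"))
        h.update(self.segment_hash)
        h.update(self.previous_record_hash or b"")
        return h.digest()


def build_chain(segments):
    """Create one record per segment, each linked to the record before it."""
    records, previous = [], None
    for index, segment in enumerate(segments):
        record = SegmentRecord(index, hashlib.sha256(segment).digest(), previous)
        records.append(record)
        previous = record.digest()
    return records


def verify_chain(segments, records) -> bool:
    """Check every segment hash and every backward link, in order."""
    if len(segments) != len(records):
        return False
    previous = None
    for index, (segment, record) in enumerate(zip(segments, records)):
        if record.index != index:
            return False
        if record.segment_hash != hashlib.sha256(segment).digest():
            return False
        if record.previous_record_hash != previous:
            return False
        previous = record.digest()
    return True


segments = [b"segment-0", b"segment-1", b"segment-2"]
records = build_chain(segments)
```

Altering, dropping, or reordering a segment breaks either its own hash or the link held by the segment that follows, which is what makes the record tamper-evident across the whole broadcast.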
Live stream provenance is covered in detail at Content Provenance for Live Streams.
Sign Your Audio and Video Content
API and SDK support for all 10 audio and video formats. Batch signing for archives. Free verification for any recipient.