
Audio and Video Content Provenance

Six audio formats and four video formats. C2PA manifests embedded in container-native structures that travel with files through distribution and streaming.

Container-Native Embedding

Audio and video files use container formats, which are structured wrappers around compressed media data. The container includes metadata fields, timing information, track definitions, and now, with C2PA, provenance manifests.

Each audio and video format has a container type with specific mechanisms for storing extension data. C2PA uses these native extension mechanisms rather than adding new metadata fields that could be stripped by processing tools:

  • RIFF containers (WAV, AVI): JUMBF data in a dedicated RIFF chunk
  • MP3 (ID3 container): JUMBF data in an ID3v2 GEOB (General Encapsulated Object) frame
  • ISO BMFF containers (MP4, MOV, M4V, M4A, AAC): JUMBF data in a uuid box

These are the same extension mechanisms that other metadata uses. A uuid box in an MP4 file is a standard part of the ISO BMFF specification. Processing tools that do not understand the C2PA content generally pass it through as opaque data rather than stripping it.
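To make the RIFF case concrete, here is a minimal sketch of walking the top-level chunks of a WAV or AVI file and pulling out one chunk by its four-byte ID. The function names are hypothetical, and the chunk ID `C2PA` is an assumption that should be checked against the C2PA specification. The sketch only locates the raw payload; it does not parse JUMBF or validate anything.

```python
import struct


def iter_riff_chunks(data: bytes):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF file."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    # Bytes 4-8: little-endian size of everything that follows them.
    # Bytes 8-12: form type, e.g. b"WAVE" or b"AVI ".
    (riff_size,) = struct.unpack_from("<I", data, 4)
    end = min(len(data), 8 + riff_size)
    pos = 12
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)


def find_chunk(data: bytes, chunk_id: bytes = b"C2PA"):
    """Return the payload of the first chunk with this ID, or None.

    The default ID is an assumption made for this sketch.
    """
    for cid, payload in iter_riff_chunks(data):
        if cid == chunk_id:
            return payload
    return None
```

A tool that does not know a chunk ID can still copy the chunk unchanged, because the size field tells it exactly how many bytes to carry over.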

Audio Formats

Format | MIME Type  | Container             | Primary Use Cases
-------|------------|-----------------------|------------------------------------------------------
WAV    | audio/wav  | RIFF chunk            | Music production, broadcast audio, podcast masters
MP3    | audio/mpeg | ID3 GEOB frame        | Podcast distribution, music streaming, audio journalism
M4A    | audio/mp4  | ISO BMFF uuid box     | Apple Music, iTunes distribution, high-quality audio
AAC    | audio/aac  | ISO BMFF uuid box     | Streaming platforms, mobile audio, broadcasting
FLAC   | audio/flac | Custom JUMBF/COSE     | Lossless music, archival audio, hi-fi streaming
MPA    | audio/mpa  | MPEG audio frame      | Broadcast audio, multimedia applications

The ID3 GEOB Frame (MP3)

MP3 files use the ID3 tagging format for metadata. ID3 supports multiple frame types including GEOB (General Encapsulated Object), which stores arbitrary binary data with a MIME type identifier. Encypher stores the C2PA JUMBF manifest data in a GEOB frame identified by the C2PA MIME type.

This is the C2PA-standard embedding method for MP3. The GEOB frame is preserved by ID3-aware tools that process MP3 files. Most podcast distribution platforms and music distribution services pass ID3 tags through without stripping them, making the C2PA manifest durable through typical podcast and music distribution workflows.
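As an illustration of where the manifest sits, the following sketch reads GEOB frames from an ID3v2.3 or ID3v2.4 tag using only the standard library. The function names are hypothetical, and the MIME type string is an assumption to verify against the C2PA specification. Tags that use unsynchronisation or an extended header are rejected to keep the example short.

```python
import struct

# Assumed MIME type for the manifest store; verify against the C2PA spec.
C2PA_MIME = "application/x-c2pa-manifest-store"


def _syncsafe(b: bytes) -> int:
    """Decode a 4-byte syncsafe integer (7 usable bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def _skip_text(buf: bytes, wide: bool) -> bytes:
    """Drop one terminated string from the front of buf."""
    step = 2 if wide else 1
    for i in range(0, len(buf) - step + 1, step):
        if buf[i:i + step] == b"\x00" * step:
            return buf[i + step:]
    raise ValueError("unterminated string in GEOB frame")


def iter_geob_frames(data: bytes):
    """Yield (mime_type, payload) for each GEOB frame in an ID3v2.3/2.4 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return  # no ID3v2 tag at the start of the file
    major, flags = data[3], data[5]
    if major not in (3, 4) or flags & 0xC0:
        raise ValueError("unsupported ID3v2 version or tag flags")
    end = 10 + _syncsafe(data[6:10])
    pos = 10
    while pos + 10 <= end and data[pos] != 0:  # a zero byte marks padding
        frame_id = data[pos:pos + 4]
        raw = data[pos + 4:pos + 8]
        # v2.4 frame sizes are syncsafe; v2.3 sizes are plain big-endian.
        size = _syncsafe(raw) if major == 4 else struct.unpack(">I", raw)[0]
        body = data[pos + 10:pos + 10 + size]
        pos += 10 + size
        if frame_id != b"GEOB":
            continue
        wide = body[0] in (1, 2)  # encodings 1 and 2 are UTF-16 variants
        mime_end = body.index(b"\x00", 1)  # MIME type is always Latin-1
        rest = body[mime_end + 1:]
        rest = _skip_text(rest, wide)  # filename
        rest = _skip_text(rest, wide)  # content description
        yield body[1:mime_end].decode("latin-1"), rest


def find_c2pa_geob(data: bytes):
    """Return the payload of the first GEOB frame with the assumed C2PA MIME type."""
    for mime, payload in iter_geob_frames(data):
        if mime == C2PA_MIME:
            return payload
    return None
```

The returned payload is the raw encapsulated object; parsing and validating the JUMBF manifest is a separate step that this sketch does not attempt.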

MP3 is particularly important for podcast provenance. A podcast episode signed at production carries provenance through distribution to Apple Podcasts, Spotify, and other platforms. The episode's authorship, production organization, and publication date are documented in the file itself.

Video Formats

Format | MIME Type       | Container                     | Primary Use Cases
-------|-----------------|-------------------------------|----------------------------------------------------
MP4    | video/mp4       | ISO BMFF uuid box             | Web video, streaming platforms, news video
MOV    | video/quicktime | ISO BMFF uuid box (QuickTime) | Professional video production, film, Apple ecosystem
M4V    | video/x-m4v     | ISO BMFF uuid box             | Apple TV, iTunes, DRM-protected video
AVI    | video/x-msvideo | RIFF chunk                    | Legacy video archives, Windows ecosystem, surveillance

ISO BMFF: The Dominant Video Container

MP4, MOV, M4V, and M4A all use ISO Base Media File Format (ISO BMFF) as their container architecture. ISO BMFF organizes file content into "boxes" (also called "atoms" in Apple's QuickTime terminology). Each box has a type identifier and length, and boxes can contain other boxes.

C2PA uses a "uuid" box, which is a box identified by a 16-byte UUID rather than only a four-character type, to store the JUMBF manifest in ISO BMFF files. The UUID is the C2PA-designated identifier. File processing tools that do not recognize this UUID can skip the box using its length field, which typically leaves it intact.
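The box layout described above can be sketched as a short top-level box walker. The function names are hypothetical, and the UUID constant is an assumption that should be confirmed against the C2PA specification. The payload returned for a uuid box is everything after the 16-byte UUID; it may carry further C2PA-defined fields ahead of the JUMBF data, which this sketch does not interpret.

```python
import struct
import uuid

# Assumed C2PA box UUID; confirm against the C2PA specification.
C2PA_UUID = uuid.UUID("d8fec3d6-1b0e-483c-9297-5828877ec481")


def iter_top_level_boxes(data: bytes):
    """Yield (box_type, usertype_or_None, payload) for each top-level box."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type.
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - pos
        if size < header or pos + size > len(data):
            raise ValueError("malformed box")
        usertype = None
        if box_type == b"uuid":
            # uuid boxes carry a 16-byte identifier right after the header.
            usertype = uuid.UUID(bytes=data[pos + header:pos + header + 16])
            header += 16
        yield box_type, usertype, data[pos + header:pos + size]
        pos += size


def find_c2pa_box(data: bytes):
    """Return the payload of the first uuid box with the assumed C2PA UUID."""
    for box_type, usertype, payload in iter_top_level_boxes(data):
        if box_type == b"uuid" and usertype == C2PA_UUID:
            return payload
    return None
```

Because every box declares its own length, a reader can step over a uuid box it does not recognize without understanding its contents.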

This makes MP4 provenance durable through common video processing workflows. Processing that does not modify the container structure preserves the manifest, although any step that re-encodes the media changes the content hash, so the output needs its own signature. Streaming platforms that serve MP4 files directly pass the manifest to recipients. The manifest survives download and local storage.

Deepfake Detection and Media Provenance

The same C2PA provenance infrastructure that authenticates human-created audio and video also provides the mechanism for identifying AI-generated synthetic media. A C2PA manifest on an AI-generated video records that it was generated by a specific AI system, which supports the transparency obligations of the EU AI Act (Article 50 in the adopted regulation, numbered Article 52 in the original proposal).

For broadcasters and news organizations, the practical value is inverse: signing authentic footage with C2PA provenance creates a documented record of what is real. When a question arises about whether footage has been manipulated, provenance verification either confirms the footage is unmodified (hash matches the signed original) or detects modification (hash does not match).
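The hash comparison at the heart of that check can be sketched as follows. This is an illustration under stated assumptions, not the C2PA validation procedure: the function names are hypothetical, SHA-256 is assumed as the hash algorithm, and the signed hash is taken as already extracted from a manifest whose signature has been verified separately. The exclusion ranges stand in for the bytes of the embedded manifest, which cannot be part of the hash it contains.

```python
import hashlib
import hmac


def content_hash(data: bytes, exclusions=()) -> bytes:
    """SHA-256 over data, skipping (start, length) byte ranges.

    The excluded ranges model the embedded manifest itself.
    """
    h = hashlib.sha256()
    pos = 0
    for start, length in sorted(exclusions):
        h.update(data[pos:start])
        pos = start + length
    h.update(data[pos:])
    return h.digest()


def is_unmodified(data: bytes, signed_hash: bytes, exclusions=()) -> bool:
    """True if the file's current hash equals the hash recorded at signing."""
    return hmac.compare_digest(content_hash(data, exclusions), signed_hash)
```

Changing a single byte of the media outside the excluded range produces a different digest, so the comparison fails; changes inside the excluded range are caught by the manifest's signature instead, which this sketch leaves out.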

This is a materially different capability from deepfake detection tools, which analyze synthetic patterns in video frames. Detection tools estimate, with some error rate, whether content is synthetic. Provenance verification confirms whether signed content is unmodified. Both capabilities are useful; they address different questions.

Live Streams

C2PA 2.3 Section 19 defines provenance for live video streams. Live stream provenance uses per-segment manifests linked in a backward chain, so each segment of the stream carries its own signed manifest and points to the previous segment. This creates a continuous tamper-evident record across a live broadcast.
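The backward-chain idea can be illustrated with a simplified hash chain. This is a conceptual sketch only: `SegmentManifest`, `build_chain`, and `verify_chain` are hypothetical names, the record layout is not the C2PA data model, and the per-segment signatures are omitted so that only the linking logic is shown.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SegmentManifest:
    """Simplified stand-in for a per-segment manifest (not the C2PA structure)."""
    index: int
    segment_hash: bytes
    prev_manifest_hash: Optional[bytes]  # None for the first segment

    def digest(self) -> bytes:
        prev = self.prev_manifest_hash or b""
        return hashlib.sha256(
            self.index.to_bytes(8, "big") + self.segment_hash + prev
        ).digest()


def build_chain(segments):
    """Create one manifest per segment, each pointing at the previous one."""
    chain, prev = [], None
    for i, seg in enumerate(segments):
        manifest = SegmentManifest(i, hashlib.sha256(seg).digest(), prev)
        chain.append(manifest)
        prev = manifest.digest()
    return chain


def verify_chain(segments, chain) -> bool:
    """Check every segment hash and every backward link."""
    if len(segments) != len(chain):
        return False
    prev = None
    for i, (seg, manifest) in enumerate(zip(segments, chain)):
        if manifest.index != i or manifest.prev_manifest_hash != prev:
            return False
        if manifest.segment_hash != hashlib.sha256(seg).digest():
            return False
        prev = manifest.digest()
    return True
```

Altering, reordering, or dropping a segment breaks either a segment hash or a backward link, which is what makes the record tamper-evident across the whole broadcast.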

Live stream provenance is covered in detail at Content Provenance for Live Streams.

Sign Your Audio and Video Content

API and SDK support for all 10 audio and video formats. Batch signing for archives. Free verification for any recipient.
