
Image Content Provenance

C2PA manifests embedded in 13 image formats. Stored in the file container, not in EXIF fields. The ownership record survives download, re-upload, and redistribution.

The EXIF Problem

Every major social platform strips EXIF metadata from uploaded images. Instagram, Twitter/X, Facebook, and LinkedIn all remove EXIF data - including photographer name, copyright notice, and creation date - when images are uploaded. CDN optimization pipelines often do the same. The industry standard for image ownership metadata is systematically stripped by the infrastructure through which images are distributed.

This is a known problem and the C2PA specification addresses it directly. C2PA manifests are not stored in EXIF fields. They are stored in format-specific container structures: JUMBF boxes carried in APP11 marker segments of JPEG files, PNG chunks, RIFF chunks, ISO BMFF uuid boxes, and other format-native embedding locations. These structures are part of the file format itself, so tools that strip only EXIF fields leave them in place.

Social platforms do sometimes strip C2PA manifests during upload processing. This is an evolving area - platforms like LinkedIn and some major news distribution services are implementing C2PA support. For B2B distribution, syndication feeds, and API-based distribution, C2PA manifests typically survive intact. For consumer social platforms, the landscape is improving as platform adoption of C2PA grows.

JUMBF: The Container Standard

JUMBF (JPEG Universal Metadata Box Format) is an ISO standard (ISO/IEC 19566-5) for embedding structured metadata into image and media files. C2PA uses JUMBF as the container format for manifests across all supported media types.

The JUMBF box contains the complete C2PA manifest: the claim (what is being asserted about the content), assertions (specific claims like authorship, creation timestamp, and rights terms), the content hash, and the COSE signature (the cryptographic proof that authenticates the manifest).

Each image format has a format-specific location for the JUMBF box. The C2PA specification defines these locations precisely. JPEG files carry the box in APP11 marker segments, split across several segments when the manifest is large. PNG files use a custom PNG chunk. WebP files use a RIFF chunk. ISO BMFF containers (AVIF, HEIC, HEIF, M4V) use a uuid box. The embedding location is format-native to ensure compatibility with existing image processing tools.
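To make the idea of a format-native location concrete, the sketch below walks the chunks of a PNG byte string and returns the payload of the manifest chunk. It assumes the C2PA chunk type is `caBX`; the function names and the demo file are illustrative only, and the code does not parse or validate the JUMBF payload.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
C2PA_PNG_CHUNK = b"caBX"  # assumption: chunk type that holds the manifest store


def iter_png_chunks(data: bytes):
    """Yield (chunk_type, payload) for each chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        yield chunk_type, payload
        offset += 12 + length


def find_manifest_store(data: bytes):
    """Return the raw payload of the first manifest chunk, or None."""
    for chunk_type, payload in iter_png_chunks(data):
        if chunk_type == C2PA_PNG_CHUNK:
            return payload
    return None


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk (used here only to create demo input)."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return (struct.pack(">I", len(payload)) + chunk_type + payload
            + struct.pack(">I", crc))


# Structurally valid demo file: header chunk, manifest chunk, end chunk.
demo_png = (PNG_SIGNATURE
            + make_chunk(b"IHDR", bytes(13))
            + make_chunk(b"caBX", b"jumbf-bytes")
            + make_chunk(b"IEND", b""))

print(find_manifest_store(demo_png))
```

A tool that rewrites only EXIF data would leave this chunk untouched, while a pipeline that re-encodes the image and drops unknown chunks would remove it.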

Supported Image Formats

Format | Container | Primary Use Cases
JPEG | JUMBF in APP11 marker segments | News photography, editorial images, stock photography
PNG | JUMBF in custom PNG chunk | Infographics, screenshots, logos, data visualizations
WebP | JUMBF in RIFF-based WebP container chunk | Web publishing, CDN-served images, CMS content
TIFF | JUMBF box in TIFF IFD structure | Professional photography, print publishing, archival
AVIF | JUMBF box in ISO BMFF container | Next-gen web images, high-quality web delivery
HEIC | JUMBF box in ISO BMFF/HEIF container | iPhone/iPad photos, mobile photography
HEIF | JUMBF box in ISO BMFF/HEIF container | Cross-platform high-efficiency images, camera manufacturers
HEIC Sequence | JUMBF box in ISO BMFF/HEIF sequence container | Burst photography, live photos, image sequences
HEIF Sequence | JUMBF box in ISO BMFF/HEIF sequence container | Animated and sequential image content
SVG | JUMBF data in SVG metadata element | Logos, icons, illustrations, data visualizations
DNG | JUMBF box in TIFF/EP IFD structure | RAW photography, professional photo workflows
GIF | JUMBF data in GIF application extension block | Animated content, memes, simple web graphics
JPEG XL | Custom JUMBF/COSE embedding | Next-gen web delivery, photography archives

pHash Attribution Search

Perceptual hashing (pHash) generates a compact fingerprint of an image based on its visual content rather than its exact pixel values. Two images that look similar to a human eye produce similar pHash values, even if they differ in resolution, compression, or minor visual modifications.
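The sketch below illustrates the principle with an average hash, a simpler relative of the DCT-based pHash algorithm, chosen here because it fits in a few lines. It is not Encypher's implementation. Images are plain grids of grayscale values, and all function names are hypothetical.

```python
def downscale(pixels, size=8):
    """Block-average a square grayscale grid down to size x size."""
    n = len(pixels)
    if n % size or any(len(row) != n for row in pixels):
        raise ValueError("expected a square grid whose side is a multiple of size")
    block = n // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            total = sum(pixels[y][x]
                        for y in range(by * block, (by + 1) * block)
                        for x in range(bx * block, (bx + 1) * block))
            row.append(total / (block * block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return a size*size-bit integer: 1 where a cell is brighter than the mean."""
    flat = [v for row in downscale(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def split_image(n, left_right=True, dark=10, bright=200):
    """Demo input: half dark, half bright, split left/right or top/bottom."""
    return [[bright if (x if left_right else y) >= n // 2 else dark
             for x in range(n)] for y in range(n)]


original = split_image(32)
thumbnail = split_image(16)                            # same picture, lower resolution
brighter = [[v + 20 for v in row] for row in original]  # brightness shifted
unrelated = split_image(32, left_right=False)          # different picture

print(hamming(average_hash(original), average_hash(thumbnail)))  # 0
print(hamming(average_hash(original), average_hash(brighter)))   # 0
print(hamming(average_hash(original), average_hash(unrelated)))  # 32
```

The resized and brightened versions hash identically to the original, while the unrelated image differs in half of the 64 bits.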

Encypher's attribution search uses pHash to find derivative works: versions of a signed image that have been resized, compressed, color-adjusted, or lightly edited. When a photographer publishes a signed original, attribution search can identify compressed social media versions, thumbnail derivatives, and editorial crops of the same image as related to the original.

This is particularly useful for photojournalism. A news organization's signed original photo may be redistributed as compressed versions across dozens of outlets. pHash attribution search identifies those derivatives and links them back to the original provenance record, extending the ownership documentation to derivative works that do not themselves carry the original manifest.
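A lookup of this kind can be sketched as a nearest-match search over stored 64-bit hashes. The index contents, the record identifiers, and the distance threshold of 10 bits are assumptions made for illustration. A linear scan like this one would be replaced by an indexed structure at archive scale.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_originals(query_hash, index, max_distance=10):
    """Return (distance, record_id) pairs within max_distance, closest first."""
    matches = [
        (hamming(query_hash, stored_hash), record_id)
        for record_id, stored_hash in index.items()
    ]
    return sorted(m for m in matches if m[0] <= max_distance)


# Hypothetical index: provenance record id -> hash of the signed original.
index = {
    "photo-001": 0x0F0F0F0F0F0F0F0F,
    "photo-002": 0xFFFFFFFF00000000,
}

# Hash of an unsigned copy found in the wild; one bit differs from photo-001.
derivative = 0x0F0F0F0F0F0F0F0B

print(find_originals(derivative, index))  # [(1, 'photo-001')]
```

The threshold trades recall against false matches: a looser threshold catches heavier edits but links more unrelated images.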

What the Image Manifest Records

The C2PA manifest embedded in a signed image includes:

  • Publisher/photographer identity (verified certificate)
  • Creation and publication timestamps
  • Content hash (SHA-256 over the file's bytes, excluding the embedded manifest itself)
  • Rights assertions (machine-readable licensing terms)
  • Ingredient list (for images that incorporate other signed assets)
  • Actions (creation, editing, publishing - each as a separate assertion)
  • Location data (optional, for photojournalism with geographic context)

The full manifest is accessible through any C2PA-compatible verification tool. Verification does not require Encypher access - the open-source c2pa-python and c2pa-js libraries verify manifests directly from the image file.
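One step a verifier performs is recomputing the content hash and comparing it with the value recorded in the manifest. The sketch below assumes the hash covers the file's bytes with the manifest's own byte range skipped, and it uses a single exclusion range for simplicity. The function name and the demo bytes are hypothetical, and signature validation is out of scope.

```python
import hashlib


def hash_excluding(data: bytes, start: int, length: int) -> str:
    """SHA-256 of data with the byte range [start, start + length) skipped."""
    if start < 0 or length < 0 or start + length > len(data):
        raise ValueError("exclusion range is outside the file")
    digest = hashlib.sha256()
    digest.update(data[:start])
    digest.update(data[start + length:])
    return digest.hexdigest()


# Demo file: a manifest store inserted six bytes into the unsigned image.
image_bytes = b"HEADER" + b"pixel-data" + b"TRAILER"
manifest = b"[manifest-store]"
signed = image_bytes[:6] + manifest + image_bytes[6:]

recorded = hashlib.sha256(image_bytes).hexdigest()      # value stored at signing time
recomputed = hash_excluding(signed, 6, len(manifest))   # value a verifier derives
print(recomputed == recorded)  # True

tampered = signed.replace(b"pixel-data", b"pixel-DATA")
print(hash_excluding(tampered, 6, len(manifest)) == recorded)  # False
```

Skipping the manifest's range is what lets the manifest contain a hash of the file it is embedded in.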

AI Image Generation and Provenance

AI image generation training sets contain billions of images scraped from the web. Many of those images were created by photographers and visual artists who did not license them for AI training. The provenance problem for images is both a publisher protection issue and an AI compliance issue.

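Photographers and visual publishers who sign their archives with C2PA provenance create a documented ownership record for every image. When those images appear in AI training data, the manifest provides evidence of who owned the image and what licensing terms applied, which supports a claim that the use was unlicensed if no license was obtained. The EU AI Act's transparency obligations (Article 50 in the final text, Article 52 in earlier drafts) require image-generating AI systems to mark their outputs as AI-generated in a machine-readable format. C2PA manifest embedding is one way to meet that requirement.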

C2PA has explicit support for documenting AI-generated images: the manifest can record that an image was generated by a specific AI model, the generation prompt (optionally), and the generating organization's identity. This creates a provenance record for AI-generated images that is as strong as the provenance record for human-created images.
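As a rough illustration, an AI-generation record might be expressed as an actions assertion like the one built below. The `c2pa.created` action and the IPTC `trainedAlgorithmicMedia` source type come from the published vocabularies. The exact shape of `softwareAgent` varies between specification versions, and placing the prompt under `parameters` is an assumption made for this sketch. The model name is hypothetical.

```python
import json

# IPTC digital source type for media created by a trained model.
TRAINED_ALGORITHMIC_MEDIA = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)


def ai_generation_assertion(model_name, prompt=None):
    """Build a dict shaped like a c2pa.actions assertion for a generated image."""
    action = {
        "action": "c2pa.created",
        "digitalSourceType": TRAINED_ALGORITHMIC_MEDIA,
        "softwareAgent": model_name,  # assumption: plain string form
    }
    if prompt is not None:
        # Assumption: the optional prompt is carried as an action parameter.
        action["parameters"] = {"prompt": prompt}
    return {"label": "c2pa.actions", "data": {"actions": [action]}}


assertion = ai_generation_assertion("example-image-model-1", prompt="a red bicycle")
print(json.dumps(assertion, indent=2))
```

Leaving `prompt` unset omits it from the record, which matches the optional nature of prompt disclosure described above.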


Sign Your Image Archive

API and SDK support for all 13 image formats. Batch signing available. Free verification for any recipient, no account required.
