How to Extract Metadata from Docker Container Images
Docker container images carry structured metadata far beyond the filesystem layers themselves. OCI manifests, image configs, labels, layer history, and registry-level tags all hold information that matters for security audits, compliance checks, and build reproducibility. This guide covers five practical extraction methods, from docker inspect for local images to registry API calls for remote inspection without pulling.
What Metadata Does a Docker Image Contain
A Docker image is more than a stack of filesystem layers. The OCI image specification defines several metadata structures that travel with every image, and each one serves a different purpose.
Image manifest. The manifest is a JSON document that lists every layer in the image by digest, along with the image configuration reference. For multi-architecture images, a manifest list (also called an OCI image index) wraps multiple platform-specific manifests into a single tag. The manifest is the first thing a registry returns when you ask about an image.
Image configuration. The config JSON stores runtime defaults: the entrypoint, CMD, environment variables, exposed ports, working directory, and the user the container runs as. It also records the OS, architecture, and the ordered history of every layer, including which Dockerfile instruction created it.
Labels. Key-value pairs baked into the image at build time. Common labels include org.opencontainers.image.source for the source repo, org.opencontainers.image.version for the release tag, and org.opencontainers.image.authors for maintainer contact. Labels are the primary mechanism for attaching custom metadata to images.
Layer history. Each entry in the history array corresponds to a Dockerfile step. It records the command that produced the layer, when it was created, an optional author and comment, and whether the layer is empty (as with ENV or LABEL instructions that change config but don't add files).
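As a rough illustration of the history array, the sketch below reads an image config JSON document on stdin and prints one line per history entry, marking entries that set empty_layer. The function name summarize_history is hypothetical and jq is assumed to be installed; created_by and empty_layer are the field names used in the OCI image config.

```shell
# Summarize the history array of an image config read from stdin.
# Prints one tab-separated line per step: "layer" or "empty", then the command.
# Hypothetical helper; assumes jq is available.
summarize_history() {
  jq -r '.history[] |
    [(if .empty_layer then "empty" else "layer" end), (.created_by // "")] |
    @tsv'
}
```

For example, skopeo inspect --config docker://docker.io/library/nginx:latest | summarize_history would list the steps in build order.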
Registry-level metadata. Tags, digests, and timestamps live at the registry level rather than inside the image itself. A single image digest can have multiple tags, and registries track when each tag was last pushed.
A production image usually contains multiple layers, each with its own history entry. The image config is small compared with the layers themselves, but it encodes the build provenance chain that auditors and deployment systems rely on.
Method 1: docker inspect for Local Images
The simplest starting point is docker inspect, which works on any image already pulled to your local daemon.
docker inspect nginx:latest
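This returns the full image configuration as JSON, including the image ID, creation timestamp, architecture, OS, config (Cmd, Entrypoint, Env, Labels, ExposedPorts), and the ordered list of layer digests in RootFS.Layers. These are the digests of the uncompressed layers (diff IDs), so they differ from the compressed layer digests listed in a registry manifest.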
Extracting Specific Fields
Use Go template formatting or pipe through jq to pull exactly what you need:
docker inspect --format='{{json .Config.Labels}}' nginx:latest | jq .
That gives you just the labels. Replace .Config.Labels with .Config.Env for environment variables, .Config.Cmd for the default command, or .Architecture for the target platform.
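The same template can back a simple label audit. The sketch below is one possible approach, not a standard tool: missing_labels is a hypothetical helper that prints every required label key absent from a local image. It assumes jq is installed and that the image is already present in the local daemon.

```shell
# Report which of the given label keys are missing from a local image.
# Usage: missing_labels IMAGE KEY [KEY...]
# Prints each missing key on its own line; returns 1 if any are missing.
missing_labels() {
  image=$1
  shift
  # .Config.Labels renders as "null" when the image has no labels at all.
  labels=$(docker inspect --format '{{json .Config.Labels}}' "$image") || return 2
  rc=0
  for key in "$@"; do
    if ! printf '%s' "$labels" | jq -e --arg k "$key" '(. // {}) | has($k)' >/dev/null; then
      echo "$key"
      rc=1
    fi
  done
  return $rc
}
```

A call such as missing_labels nginx:latest org.opencontainers.image.source org.opencontainers.image.version prints nothing and returns 0 when both labels exist.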
To get layer history with the original Dockerfile commands:
docker history --no-trunc nginx:latest
The --no-trunc flag is important. Without it, Docker truncates commands at 45 characters, which cuts off most useful detail.
Limitations
docker inspect only works on images that exist in your local daemon. You must pull the image first, which means downloading every layer. For a 500MB image you just want to check labels on, that's a lot of wasted bandwidth. It also requires a running Docker daemon, which rules it out in restricted CI environments or air-gapped audit workflows.
Method 2: Skopeo for Remote Inspection Without Pulling
Skopeo solves the biggest limitation of docker inspect: it queries registries directly without pulling the image or needing a Docker daemon.
skopeo inspect docker://docker.io/library/nginx:latest
This returns image metadata including the tag, digest, layers, architecture, OS, labels, and environment variables. The entire operation transfers a few kilobytes of JSON rather than the full image.
Raw Manifest Access
Add --raw to get the exact manifest JSON from the registry:
skopeo inspect --raw docker://docker.io/library/nginx:latest | jq .
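For multi-architecture images, this returns the manifest list (OCI index). The --raw output is not filtered by platform, so to inspect one platform, drop --raw and set the global --override-arch option, which goes before the inspect subcommand:
skopeo --override-arch arm64 inspect docker://docker.io/library/nginx:latest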
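When the raw output is a manifest list or OCI index, its platform entries can be listed with jq. The sketch below assumes jq is installed; list_platforms is a hypothetical helper name. It skips entries whose architecture is "unknown", which is how attestation manifests commonly appear in an index.

```shell
# List the platforms in a manifest list / OCI index read from stdin.
# Prints "os/arch[/variant]<TAB>digest" per entry; prints nothing for a
# single-platform manifest, which has no .manifests array.
list_platforms() {
  jq -r '.manifests[]?
    | select(.platform.architecture != "unknown")
    | "\(.platform.os)/\(.platform.architecture)\(if .platform.variant then "/" + .platform.variant else "" end)\t\(.digest)"'
}
```

Usage: skopeo inspect --raw docker://docker.io/library/nginx:latest | list_platforms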
Full Image Configuration
The --config flag returns the OCI image config, which includes the full layer history with Dockerfile commands:
skopeo inspect --config docker://docker.io/library/nginx:latest | jq .
Why Skopeo Matters for Automation
Skopeo runs without root privileges and without a container runtime. That makes it the standard choice for CI pipelines where you need to validate image metadata before deploying. You can check that required labels exist, verify the base image digest hasn't changed, or confirm the target architecture matches your cluster, all before any image bytes hit your nodes.
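A pre-deployment gate along these lines might look like the sketch below. require_arch is a hypothetical helper, jq is assumed to be installed, and the Architecture and Digest keys are the ones skopeo inspect prints in its JSON output. For a multi-architecture tag, skopeo reports the platform it selected, which defaults to the machine running the check.

```shell
# Fail unless the remote image's architecture matches what the cluster expects.
# Usage: require_arch docker://REGISTRY/REPO:TAG ARCH
# On success, prints the image digest so later steps can deploy by digest.
require_arch() {
  meta=$(skopeo inspect "$1") || return 2
  arch=$(printf '%s' "$meta" | jq -r '.Architecture')
  digest=$(printf '%s' "$meta" | jq -r '.Digest')
  if [ "$arch" != "$2" ]; then
    echo "architecture mismatch for $1: got $arch, want $2" >&2
    return 1
  fi
  echo "$digest"
}
```

In a pipeline step this could be used as digest=$(require_arch docker://docker.io/library/nginx:latest amd64) || exit 1.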
Install it on most Linux distributions with your package manager (apt install skopeo on Debian/Ubuntu, dnf install skopeo on Fedora/RHEL).
Store and Search Your Container Metadata
Export image configs and manifests to a Fast.io workspace. Intelligence Mode indexes everything for semantic search, and Metadata Views turn document collections into queryable structured data. Start with 50 GB free, no credit card required.
Method 3: Docker Registry HTTP API
When you need metadata extraction without installing any tools at all, the Docker Registry HTTP API v2 works with plain curl. This is useful inside minimal containers, in scripts that can't install Skopeo, or when building custom registry integrations.
Authentication
Docker Hub requires a bearer token. Fetch one first:
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/nginx:pull" | jq -r .token)
Fetching the Manifest
Request the manifest with the appropriate Accept header:
curl -s -H "Authorization: Bearer $TOKEN" \
-H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
"https://registry-1.docker.io/v2/library/nginx/manifests/latest" | jq .
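The response includes the schema version, media type, config digest, and the ordered list of layer digests with their sizes. The Docker-Content-Digest response header gives you the manifest's own digest, which is the immutable identifier for that exact image version. For a multi-architecture tag, the registry may return a manifest list or OCI index instead (or an error if the Accept header does not list those media types); in that case the config digest lives in the platform-specific manifest that the index points to.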
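To read the Docker-Content-Digest header without downloading the manifest body, a HEAD request is enough. The sketch below is an illustration under a few assumptions: manifest_digest is a hypothetical helper, the registry host is Docker Hub as in the examples above, and the Accept header lists both Docker and OCI media types so that multi-architecture tags resolve as well.

```shell
# Print the Docker-Content-Digest header for REPO:TAG on Docker Hub.
# Usage: manifest_digest TOKEN REPO TAG
manifest_digest() {
  # -I sends a HEAD request, so only the response headers come back.
  curl -sI -H "Authorization: Bearer $1" \
    -H "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json" \
    "https://registry-1.docker.io/v2/$2/manifests/$3" |
    tr -d '\r' |
    awk -F': ' 'tolower($1) == "docker-content-digest" { print $2 }'
}
```

With the token from the previous step, manifest_digest "$TOKEN" library/nginx latest prints a single sha256 digest.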
Fetching the Image Config
The config digest from the manifest points to the full image configuration:
CONFIG_DIGEST=$(curl -s -H "Authorization: Bearer $TOKEN" \
-H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
"https://registry-1.docker.io/v2/library/nginx/manifests/latest" | jq -r .config.digest)
curl -sL -H "Authorization: Bearer $TOKEN" \
"https://registry-1.docker.io/v2/library/nginx/blobs/$CONFIG_DIGEST" | jq .
This returns the same configuration data you'd get from docker inspect, but without ever pulling the image. The -L flag tells curl to follow redirects, which matters because Docker Hub redirects blob requests to its storage backend. The config blob is typically 2 to 10 KB, compared to hundreds of megabytes for the full image.
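For multi-architecture tags, the manifest request can return an index rather than a single manifest, so a script usually needs one extra hop. The sketch below is a minimal version of that flow, assuming Docker Hub, a linux platform, and jq; get_manifest and fetch_config are hypothetical helper names, and the default architecture of amd64 is an arbitrary choice.

```shell
ACCEPT='application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json'

# Usage: get_manifest TOKEN REPO REFERENCE   (REFERENCE is a tag or digest)
get_manifest() {
  curl -sL -H "Authorization: Bearer $1" -H "Accept: $ACCEPT" \
    "https://registry-1.docker.io/v2/$2/manifests/$3"
}

# Usage: fetch_config TOKEN REPO REFERENCE [ARCH]
# Prints the image config JSON for the chosen linux architecture.
fetch_config() {
  manifest=$(get_manifest "$1" "$2" "$3") || return 2
  # An index has a .manifests array; pick the entry for the wanted platform.
  if printf '%s' "$manifest" | jq -e 'has("manifests")' >/dev/null 2>&1; then
    child=$(printf '%s' "$manifest" | jq -r --arg a "${4:-amd64}" \
      'first(.manifests[] | select(.platform.os == "linux" and .platform.architecture == $a) | .digest)')
    [ -n "$child" ] || { echo "no linux/${4:-amd64} entry in index" >&2; return 1; }
    manifest=$(get_manifest "$1" "$2" "$child") || return 2
  fi
  config_digest=$(printf '%s' "$manifest" | jq -r '.config.digest // empty')
  [ -n "$config_digest" ] || { echo "no config digest in manifest" >&2; return 1; }
  # -L follows the redirect that the registry may issue for blob downloads.
  curl -sL -H "Authorization: Bearer $1" \
    "https://registry-1.docker.io/v2/$2/blobs/$config_digest"
}
```

For example, fetch_config "$TOKEN" library/nginx latest arm64 | jq .config.Labels prints the labels of the arm64 variant.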
Listing Tags
To enumerate all available tags for a repository:
curl -s -H "Authorization: Bearer $TOKEN" \
"https://registry-1.docker.io/v2/library/nginx/tags/list" | jq .
This approach works against any OCI-compliant registry (Docker Hub, GitHub Container Registry, Amazon ECR, Google Artifact Registry, Azure Container Registry) with the appropriate authentication changes.
Method 4: crane for Scriptable Registry Operations
Crane is a Go-based CLI from Google's go-containerregistry project. It fills a gap between Skopeo's inspection focus and the raw HTTP API by providing a composable set of subcommands designed for scripting.
Getting the Image Config
crane config docker.io/library/nginx:latest | jq .
This returns the full OCI image configuration JSON, the same data as docker inspect but fetched directly from the registry.
Reading the Manifest
crane manifest docker.io/library/nginx:latest | jq .
For multi-arch images, this returns the manifest list. Add --platform linux/arm64 to select a specific platform.
Extracting Labels and Digests
Combine crane with jq for targeted extraction:
crane config docker.io/library/nginx:latest | jq '.config.Labels'
crane digest docker.io/library/nginx:latest
The crane digest command returns the image's immutable digest without downloading the config, which is useful for pinning image versions in deployment manifests.
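Pinning can be scripted by turning each tag reference into a digest reference. The following sketch assumes crane is on the PATH and that the input contains one image reference per line; pin_images is a hypothetical helper name.

```shell
# Rewrite REPO:TAG references as REPO@DIGEST using crane digest.
# Reads one image reference per line on stdin.
pin_images() {
  while IFS= read -r ref; do
    [ -n "$ref" ] || continue
    # References that already carry a digest are passed through unchanged.
    case $ref in
      *@*) echo "$ref"; continue ;;
    esac
    digest=$(crane digest "$ref" </dev/null) || { echo "failed: $ref" >&2; continue; }
    # Only treat ":" as a tag separator when it appears after the last "/",
    # so a registry port such as localhost:5000 is left alone.
    last=${ref##*/}
    case $last in
      *:*) repo=${ref%:*} ;;
      *)   repo=$ref ;;
    esac
    echo "$repo@$digest"
  done
}
```

Usage: pin_images < images.txt > pinned.txt, where images.txt is a hypothetical file listing the tags your manifests reference.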
Why Choose crane
Crane ships as a single static binary with no runtime dependencies. It installs with go install github.com/google/go-containerregistry/cmd/crane@latest or via Homebrew (brew install crane). Because it's built on the same library used by tools like ko and Tekton, its behavior matches what the broader container ecosystem expects. It's also lightweight: each invocation fetches only the small JSON documents it needs and writes them to stdout, so you can process hundreds of images in a shell loop.
Method 5: docker buildx imagetools for Multi-Platform Images
The docker buildx imagetools inspect command is purpose-built for inspecting multi-platform images and their attestations. If you're working with images that target multiple architectures, this is the most direct tool.
docker buildx imagetools inspect docker.io/library/nginx:latest
The output shows the manifest list with each platform variant, its digest, and its media type. It also surfaces build attestations (SBOM and provenance) when present.
Raw Manifest Output
docker buildx imagetools inspect --raw docker.io/library/nginx:latest | jq .
This returns the complete OCI index JSON, which you can parse to enumerate every platform, check annotation fields, and verify that all expected architectures are present.
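Checking that every expected architecture is present can be reduced to a small filter over that index. This is a sketch only: missing_platforms is a hypothetical helper, jq is assumed to be installed, and platforms are compared as os/architecture without variants.

```shell
# Report expected platforms that are absent from an OCI index read on stdin.
# Usage: ... | missing_platforms linux/amd64 linux/arm64
# Prints each missing platform; returns 1 if any are missing.
missing_platforms() {
  present=$(jq -r '.manifests[]? | "\(.platform.os)/\(.platform.architecture)"')
  rc=0
  for want in "$@"; do
    # -x matches whole lines, -F treats the platform as a literal string.
    if ! printf '%s\n' "$present" | grep -qxF "$want"; then
      echo "$want"
      rc=1
    fi
  done
  return $rc
}
```

Usage: docker buildx imagetools inspect --raw docker.io/library/nginx:latest | missing_platforms linux/amd64 linux/arm64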
Checking Annotations
OCI annotations attach arbitrary metadata to manifests and descriptors. Buildx surfaces these automatically:
docker buildx imagetools inspect docker.io/library/nginx:latest \
--format "{{json .Manifest}}" | jq '.annotations'
Annotations are increasingly used for supply chain metadata: build timestamps, source commit references, and SBOM attachment pointers.
When to Use imagetools
Use docker buildx imagetools inspect when you specifically need to verify multi-architecture builds, check attestation metadata, or inspect OCI annotations. For single-platform images or simple label checks, Skopeo or crane are lighter options.
Choosing the Right Method and Storing Results
Each extraction method fits a different workflow:
- Local debugging: docker inspect and docker history when the image is already pulled.
- CI/CD validation: Skopeo for pre-deployment checks without a Docker daemon.
- Minimal environments: Registry HTTP API with curl when you can't install tools.
- Scripting at scale: crane for processing metadata across hundreds of images.
- Multi-arch verification: buildx imagetools for platform matrix and attestation checks.
For one-off checks, any of these methods works fine. The differences matter when you're building metadata extraction into a repeatable process: scanning images on push, auditing labels for compliance, or tracking base image versions across a fleet of services.
Structuring Extracted Metadata
Raw JSON output from these tools is useful for debugging but hard to query across many images. Teams that run regular metadata audits typically normalize the output into a structured format: CSV for spreadsheet analysis, or a database for querying across repositories.
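One way to do that normalization is to flatten each image config into a CSV row with jq. The column choice below is an assumption made for illustration, and config_to_csv_row is a hypothetical helper; it expects an OCI image config on stdin, such as the output of crane config or skopeo inspect --config.

```shell
# Turn one image config (stdin) into a CSV row.
# Usage: crane config IMAGE | config_to_csv_row IMAGE
# Columns: image, architecture, os, created, source label, version label.
config_to_csv_row() {
  jq -r --arg image "$1" '[
      $image, .architecture, .os, .created,
      (.config.Labels["org.opencontainers.image.source"] // ""),
      (.config.Labels["org.opencontainers.image.version"] // "")
    ] | @csv'
}
```

Running it in a loop over a list of images and redirecting the output to a file gives a spreadsheet-ready audit table; missing labels show up as empty fields rather than errors.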
For document-heavy metadata workflows beyond container images, Fast.io's Metadata Views handle the structuring automatically. You describe the fields you want extracted in plain language, and the platform builds a typed schema (text, dates, URLs, booleans) across your files. It works with PDFs, images, spreadsheets, and scanned documents. For container metadata specifically, you could export config JSON from any of the methods above, store the files in a Fast.io workspace, and use Intelligence Mode to search across your collected manifests and configs without building a custom indexing pipeline.
Teams that store extracted metadata in shared workspaces also get audit trails and version history on each export, which matters when you need to prove what a particular image contained at deployment time.
Frequently Asked Questions
How do I see metadata of a Docker image?
Run `docker inspect <image>` for local images. This returns the full image configuration as JSON, including labels, environment variables, the entrypoint, exposed ports, and layer digests. For remote images you haven't pulled, use `skopeo inspect docker://<image>` to fetch the same data directly from the registry without downloading the image layers.
What metadata does a Docker image contain?
A Docker image contains several layers of metadata defined by the OCI image specification. The image manifest lists all layers by digest and references the config. The image configuration stores runtime defaults (entrypoint, CMD, environment variables, exposed ports, working directory, user), the OS and architecture, and the ordered history of every Dockerfile instruction. Labels provide custom key-value metadata. At the registry level, tags, digests, and push timestamps add additional context.
How do I get Docker image labels?
For local images: `docker inspect --format='{{json .Config.Labels}}' <image> | jq .` returns just the labels. For remote images: `skopeo inspect docker://<image> | jq .Labels` or `crane config <image> | jq '.config.Labels'` both work without pulling the image.
How do I inspect a Docker image without pulling it?
Use Skopeo (`skopeo inspect docker://<image>`), crane (`crane config <image>`), or the Docker Registry HTTP API with curl. All three query the registry directly and return metadata without downloading image layers. Skopeo and crane are purpose-built for this and handle authentication automatically.
What is the difference between docker inspect and skopeo inspect?
docker inspect works only on images already pulled to your local Docker daemon. It requires a running daemon and root access (or Docker group membership). Skopeo inspect queries registries remotely, needs no daemon, runs without root, and never downloads image layers. Skopeo also supports the --raw flag for raw manifest access and --config for the full OCI image configuration.
How do I extract the manifest from a Docker image?
Three options: `skopeo inspect --raw docker://<image>` returns the manifest JSON, `crane manifest <image>` does the same, or you can query the registry HTTP API directly with curl against the `/v2/<repo>/manifests/<tag>` endpoint using the Accept header `application/vnd.docker.distribution.manifest.v2+json`.