working with git: archive

You're about to deploy to production and your PM asks for "just a clean copy of the code without all that Git history stuff." Or maybe you need to package a specific release for a client who shouldn't see your development branches. Enter git archive — the Git command that's criminally underused despite solving these exact problems elegantly.

What Git Archive Actually Does

Think of git archive as Git's built-in export tool. It creates a clean snapshot of your repository at any commit, tag, or branch — without the .git directory, without the history, without any of the Git metadata. Just the files, exactly as they existed at that point in time.

Unlike cloning or downloading a zip from GitHub, git archive gives you surgical precision over what gets exported and how it's packaged. You can target specific commits, exclude certain paths, and output in multiple formats.

# Basic syntax
git archive --format=zip --output=release.zip HEAD

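That's it. You now have a clean zip file of the commit HEAD points to, without any Git baggage. Only committed, tracked files are included; uncommitted changes and untracked files stay out.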
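To see what "clean" means in practice, here is a minimal sketch that builds a throwaway repository and lists what lands in the archive. The file names app.txt and notes.txt and the demo identity are invented for illustration. Only the committed file should show up; the untracked file and the .git directory should not.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the demo never touches a real project.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "tracked" > app.txt
git add app.txt
git commit --quiet -m "Initial commit"

# Created after the commit and never added, so HEAD knows nothing about it.
echo "scratch" > notes.txt

git archive --format=tar --output=snapshot.tar HEAD

# List the archive contents.
contents=$(tar -tf snapshot.tar)
echo "$contents"
```

The same check works on a real repository: run `tar -tf` (or `unzip -l` for zip output) on the result before shipping it anywhere.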

Why You'd Want This Over Alternatives

Clean Deployments

Your production server doesn't need your commit history, branch information, or that embarrassing commit message from 3am. git archive gives you just the code.

Client Deliverables

When delivering source code to clients, you often want to provide a clean package without exposing your development process: no commit history, no unmerged branches, no local Git configuration or hooks.

Release Packaging

Creating official releases becomes trivial. No more manual file copying or worrying about accidentally including development files.

Compliance and Auditing

Some organizations require clean, traceable code packages for compliance. git archive provides reproducible exports tied to specific commits.
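The "tied to specific commits" part can be checked mechanically. When git archive is given a commit or tag and writes tar output, it records the commit ID in the archive header, and `git get-tar-commit-id` reads it back. The sketch below assumes a throwaway repository with a made-up file (policy.txt) and archive name (audit.tar).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository for the demo.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "v1" > policy.txt
git add policy.txt
git commit --quiet -m "Auditable change"

git archive --format=tar --output=audit.tar HEAD

# Read the commit ID stored in the tar header and compare it with HEAD.
recorded=$(git get-tar-commit-id < audit.tar)
expected=$(git rev-parse HEAD)

if [ "$recorded" = "$expected" ]; then
  echo "audit.tar was built from commit $recorded"
fi
```

For a .tar.gz, decompress first (`gunzip -c file.tar.gz | git get-tar-commit-id`). Zip archives carry the same ID as the archive comment, which `unzip -z` should display.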

Practical Examples

Basic Archive Creation

# Create a zip archive of the current branch
git archive --format=zip --output=myproject-latest.zip HEAD

# Create a tar.gz archive of a specific tag
git archive --format=tar.gz --output=v2.1.0.tar.gz v2.1.0

# Archive a specific commit
git archive --format=zip --output=hotfix.zip a1b2c3d

Targeting Specific Paths

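# Only archive the contents of src/ (files land at the archive root)
git archive --format=zip --output=source-only.zip HEAD:src/

# Exclude certain paths. git archive has no --exclude option;
# use exclude pathspecs after the tree-ish instead (needs a reasonably recent Git).
git archive --format=tar --prefix=myproject/ --output=no-tests.tar HEAD \
  -- . ':(exclude)tests' ':(exclude)*.test.js'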
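There are two different ways to narrow an archive to part of the tree, and they produce different layouts. A small sketch, assuming a throwaway repository with invented src/ and docs/ directories, shows the tree-ish form, the pathspec form, and an exclude pathspec side by side.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository for the demo.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

mkdir -p src docs
echo 'console.log("hi")' > src/index.js
echo '# Guide' > docs/guide.md
git add .
git commit --quiet -m "Add src and docs"

# Tree-ish form: src/ becomes the archive root, so entries lose the src/ prefix.
subtree=$(git archive --format=tar HEAD:src | tar -tf -)

# Pathspec form: the repository root stays the archive root, limited to src/.
limited=$(git archive --format=tar HEAD src | tar -tf -)

# Exclude pathspec: everything except docs/.
excluded=$(git archive --format=tar HEAD -- . ':(exclude)docs' | tar -tf -)

printf 'HEAD:src ->\n%s\n\n' "$subtree"
printf 'HEAD src ->\n%s\n\n' "$limited"
printf 'exclude docs ->\n%s\n' "$excluded"
```

One design consequence: `HEAD:src` names a tree rather than a commit, so Git uses the current time for file timestamps and does not record a commit ID in the archive. If traceability matters, the pathspec form is likely the safer choice.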

Advanced Archive Strategies

# Archive with a directory prefix (useful for releases)
git archive --format=tar.gz \
--prefix=myproject-v1.2.0/ \
--output=myproject-v1.2.0.tar.gz \
v1.2.0

# Archive directly to stdout and pipe to remote server
git archive --format=tar HEAD | ssh user@server 'cd /var/www && tar -xf -'

# Create archives of the same tag in more than one format (one command per format)
git archive --format=zip --output=release.zip v1.0.0
git archive --format=tar.gz --output=release.tar.gz v1.0.0

Real-World Scenarios

Scenario 1: Production Deployment

#!/bin/bash
# deploy.sh - Clean deployment script
set -euo pipefail   # stop on the first failed command or unset variable

# Most recent tag reachable from HEAD
VERSION=$(git describe --tags --abbrev=0)
git archive --format=tar.gz \
  --prefix=app/ \
  --output="deploy-${VERSION}.tar.gz" \
  "${VERSION}"

# Upload and extract on production server (unpacks into /var/www/app/)
scp "deploy-${VERSION}.tar.gz" prod-server:/tmp/
ssh prod-server "cd /var/www && tar -xzf /tmp/deploy-${VERSION}.tar.gz"

Scenario 2: Client Code Delivery

# Create a clean client package excluding internal files
# (exclude pathspecs go after the tree-ish; git archive has no --exclude option)
git archive --format=zip \
  --output="client-delivery-$(date +%Y%m%d).zip" \
  --prefix=project-source/ \
  HEAD \
  -- . \
  ':(exclude)*.internal.*' \
  ':(exclude)docs/internal' \
  ':(exclude).env*'

Alternative Approaches and When to Use Them

Git Clone with Cleanup

git clone --depth 1 --branch v1.0.0 repo.git
rm -rf repo/.git

Pros: Familiar workflow, works with any Git repository
Cons: Slower, requires cleanup, less precise control

GitHub Release Downloads

Pros: Easy for public repositories, automatic via GitHub
Cons: No custom exclusions, limited to tagged releases, platform dependent

Manual File Copying

cp -r src/ ../clean-copy/

Pros: Simple, obvious
Cons: Error-prone, misses hidden files, inconsistent results

Rsync with Exclusions

rsync -av --exclude='.git' --exclude='node_modules' ./ ../clean-copy/

Pros: Powerful exclusion patterns, works with any directory
Cons: Not Git-aware, doesn't handle specific commits

When Git Archive Shines

Perfect for:

  • Production deployments requiring clean code
  • Client deliverables without development history
  • Creating reproducible release packages
  • Compliance scenarios requiring clean exports
  • CI/CD pipelines needing specific commit snapshots

Skip it when:

  • You need the full Git history
  • Working with local development copies
  • The recipient needs to contribute back to the repository
  • You're just moving code between development machines

Advanced Tips

Combining with Git Attributes

Create a .gitattributes file to control archive behavior:

# Exclude development files from archives
*.development.* export-ignore
tests/ export-ignore
docs/internal/ export-ignore

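Now git archive automatically leaves these paths out, with no extra arguments on the command line. The attributes are read from the tree being archived, so commit .gitattributes first; to apply an uncommitted working-copy version instead, pass --worktree-attributes.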
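Here is a small sketch of export-ignore in action, using a throwaway repository with made-up paths (src/main.txt, tests/main.test.txt). It also marks .gitattributes itself as export-ignore, a common convention when the recipient has no use for it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository for the demo.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

mkdir -p src tests
echo "code" > src/main.txt
echo "spec" > tests/main.test.txt

# Paths marked export-ignore are skipped by git archive.
printf '%s\n' \
  'tests/ export-ignore' \
  '.gitattributes export-ignore' > .gitattributes

# The rules must be committed to take effect for an archive of HEAD.
git add .
git commit --quiet -m "Add export rules"

exported=$(git archive --format=tar HEAD | tar -tf -)
echo "$exported"
```

Because the rules live in the repository, everyone who runs git archive on that commit gets the same exclusions, which is usually what you want for release packaging.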

Scripted Release Process

#!/bin/bash
# release.sh - Complete release packaging
set -euo pipefail

TAG=${1:-}
if [ -z "$TAG" ]; then
  echo "Usage: ./release.sh <tag>" >&2
  exit 1
fi

# Verify tag exists (refs/tags/ avoids matching a branch or commit with the same name)
git rev-parse --verify --quiet "refs/tags/$TAG" >/dev/null || {
  echo "Tag $TAG does not exist" >&2
  exit 1
}

# Create multiple archive formats
git archive --format=zip --prefix="project-$TAG/" --output="project-$TAG.zip" "$TAG"
git archive --format=tar.gz --prefix="project-$TAG/" --output="project-$TAG.tar.gz" "$TAG"

echo "Release packages created for $TAG"
ls -la "project-$TAG".*

Key Takeaways

  • git archive creates clean snapshots without Git metadata — perfect for deployments and deliverables
  • It offers surgical precision over what gets exported, unlike crude alternatives like manual copying
  • Use .gitattributes with export-ignore to automatically exclude development files from archives
  • Combine with tags and scripting for powerful, reproducible release processes
  • It's faster and more reliable than cloning and cleaning up repositories

Next time someone asks for "just the code files," you'll know exactly which Git command was built for that job. git archive isn't flashy, but it's the kind of utility that makes you wonder how you managed deployments and code delivery before you knew it existed.