Ever wonder what Git does behind the scenes when you run git commit or git push? That mysterious .git folder holds all the magic: it's Git's brain, memory, and filing system rolled into one. Today we're cracking it open to see how Git really works under the hood.
The .git Folder: Git's Mission Control
When you run git init, Git creates a .git folder that becomes the nerve center of your repository. This isn't just storage; it's a database that tracks every change, branch, and piece of metadata about your project.
Here's what a typical .git folder looks like:
.git/
├── HEAD # Points to current branch/commit
├── config # Repository configuration
├── description # Repository description (used by GitWeb)
├── index # Staging area (binary file)
├── packed-refs # Packed references for efficiency
├── hooks/ # Git hook scripts
│   ├── pre-commit.sample
│   ├── commit-msg.sample
│   ├── pre-push.sample
│   └── ...
│
├── info/ # Repository metadata
│   ├── exclude # Local ignore patterns
│   └── refs # Ref list for dumb HTTP transport (written by git update-server-info)
│
├── logs/ # Reference history logs
│   ├── HEAD # HEAD movement history (reflog)
│   └── refs/
│       ├── heads/ # Branch movement logs
│       └── remotes/ # Remote branch logs
│
├── objects/ # Git's object database
│   ├── 01/ # Objects with hash starting "01..."
│   ├── 02/ # Objects with hash starting "02..."
│   ├── ... # More hash directories (00-ff)
│   ├── info/ # Object database info
│   └── pack/ # Packed objects for efficiency
│       ├── pack-*.idx # Pack index files
│       └── pack-*.pack # Packed object files
│
├── refs/ # References (pointers to commits)
│   ├── heads/ # Local branches
│   │   ├── main # Main branch pointer
│   │   ├── develop # Develop branch pointer
│   │   └── feature-* # Feature branch pointers
│   ├── remotes/ # Remote tracking branches
│   │   └── origin/
│   │       ├── main # Origin's main branch
│   │       └── develop # Origin's develop branch
│   └── tags/ # Tag references
│       ├── v1.0.0 # Version tags
│       └── v1.1.0
│
└── branches/ # Legacy branch storage (rarely used)
Let's explore each component and understand what makes Git tick.
Meet Your New Best Friend: cat-file
Before we dive deep, you need to know about git cat-file, your Swiss Army knife for exploring Git's internals. This command lets you peek inside Git objects and see what's really stored in them.
Here are the essential options you'll use:
# Show the type of an object (blob, tree, commit, tag)
git cat-file -t <object-hash>
# Show the content of an object (pretty-printed)
git cat-file -p <object-hash>
# Show the size of an object in bytes
git cat-file -s <object-hash>
# Check if an object exists (prints nothing and exits with status 0 if it does)
git cat-file -e <object-hash>
You don't need the full SHA-1 hash: Git accepts any prefix of at least 4 characters as long as it is unambiguous. Git's own abbreviated output uses 7 or more characters, which is a safer habit in larger repositories:
# These are equivalent (if the short hash is unique)
git cat-file -p a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
git cat-file -p a1b2
You'll see git cat-file throughout this article; it's how we'll explore every corner of Git's object database.
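To make the "unambiguous prefix" rule concrete, here is a minimal Python sketch of how a short hash could be matched against a set of known object IDs. The function name resolve_prefix and the in-memory list of IDs are assumptions for illustration; Git's real lookup works against its object database and is more involved.

```python
def resolve_prefix(prefix, object_ids):
    """Return the one object ID that starts with prefix, or raise ValueError.

    Hypothetical helper: mirrors the rule that a short hash must be at
    least 4 hex characters long and must match exactly one object.
    """
    if len(prefix) < 4:
        raise ValueError("prefix too short: at least 4 hex characters needed")
    prefix = prefix.lower()
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if not matches:
        raise ValueError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]
```

With two IDs that both start with a1b2, the prefix a1b2 would be rejected as ambiguous, while a1b2c would resolve as long as only one ID continues with c.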
Objects: Git's Content Database
The objects/ folder is where Git stores all your content. Everything (files, directories, commits, tags) becomes an object with a unique SHA-1 hash.
The Four Types of Git Objects
Blob Objects: Your File Content
Blobs store the actual content of your files. Here's how to peek inside:
# Find a blob object
git ls-tree HEAD | head -1
# Example output: 100644 blob a1b2c3d4... README.md
# View the blob content
git cat-file -p a1b2c3d4
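Where does a blob's hash come from? In a SHA-1 repository, Git hashes a small header ("blob", the content length, and a NUL byte) followed by the file content. The Python sketch below reproduces that calculation; the function name blob_id is an assumption for illustration, and the result should match what git hash-object reports for the same bytes.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git would assign to a blob with this content."""
    # Header format: "blob <size in bytes>" followed by a NUL byte.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The bytes "hello world" plus a trailing newline
print(blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Note that the file name is not part of the hash. Two files with identical content share one blob, which is why names live in tree objects instead.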
Tree Objects: Directory Structure
Trees represent directories and link to blobs and other trees:
# View a tree object
git cat-file -p HEAD^{tree}
# Example output:
# 100644 blob a1b2c3d4... README.md
# 100644 blob c3d4e5f6... package.json
# 040000 tree b2c3d4e5... src
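The pretty-printed listing hides the fact that a tree is stored in a compact binary form: each entry is the mode, a space, the name, a NUL byte, and then the raw 20-byte SHA-1. The Python sketch below parses that payload, assuming a SHA-1 repository and input that has already been decompressed and stripped of its "tree <size>" header. The name parse_tree is hypothetical.

```python
def parse_tree(raw: bytes):
    """Parse the raw payload of a tree object into (mode, type, id, name) tuples."""
    entries = []
    pos = 0
    while pos < len(raw):
        space = raw.index(b" ", pos)      # end of the mode field
        nul = raw.index(b"\x00", space)   # end of the entry name
        mode = raw[pos:space].decode()
        name = raw[space + 1:nul].decode()
        oid = raw[nul + 1:nul + 21].hex()  # 20 raw bytes, shown as 40 hex chars
        if mode == "40000":
            kind = "tree"
        elif mode == "160000":
            kind = "commit"               # submodule entry
        else:
            kind = "blob"
        # Directory modes are stored without the leading zero; pad for display.
        entries.append((mode.zfill(6), kind, oid, name))
        pos = nul + 21
    return entries
```

Each tuple corresponds to one line of git cat-file -p output for the same tree.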
Commit Objects: Snapshots in Time
Commits tie everything together with metadata:
# View a commit object
git cat-file -p HEAD
# Example output:
# tree a1b2c3d4e5f6...
# parent b2c3d4e5f6a1...
# author Ris Adams <email> 1640995200 -0500
# committer Ris Adams <email> 1640995200 -0500
#
# Add user authentication feature
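A commit object is plain text: header lines, a blank line, then the message. The Python sketch below splits that text into its parts. The function name parse_commit is an assumption, and the parser deliberately skips multi-line headers such as signatures, so treat it as a simplification rather than a full parser.

```python
def parse_commit(text: str):
    """Split commit object text into (headers, message).

    headers maps each header name to its value; "parent" is always a list
    because merge commits have more than one parent.
    """
    header_part, _, message = text.partition("\n\n")
    headers = {"parent": []}
    for line in header_part.splitlines():
        if line.startswith(" "):
            continue  # continuation of a multi-line header; ignored in this sketch
        key, _, value = line.partition(" ")
        if key == "parent":
            headers["parent"].append(value)
        else:
            headers[key] = value
    return headers, message
```

Feeding it the output of git cat-file -p HEAD would give you the tree ID under headers["tree"] and an empty parent list for a root commit.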
Object Storage Deep Dive
Git uses a clever storage system. Objects are stored in subdirectories based on the first two characters of their SHA-1 hash:
objects/
├── a1/
│ └── b2c3d4e5f6... (full hash: a1b2c3d4e5f6...)
├── b2/
│ └── c3d4e5f6a1...
└── pack/ (compressed object packs)
├── pack-abc123.idx
└── pack-abc123.pack
You can explore this yourself:
# List object directories (two-character hash prefixes)
ls -d .git/objects/??
# Count all files under objects/ (loose objects plus pack and info files)
find .git/objects -type f | wc -l
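To connect the directory layout to actual files, here is a small Python sketch that maps an object ID to its loose-object path and decodes a loose object's bytes. Loose objects are zlib-compressed, and the decompressed data starts with a "<type> <size>" header followed by a NUL byte. The function names are hypothetical, and packed objects are not handled.

```python
import os
import zlib


def loose_object_path(git_dir: str, oid: str) -> str:
    """Return where a loose object with this ID would live under git_dir."""
    # First two hex characters name the directory, the rest name the file.
    return os.path.join(git_dir, "objects", oid[:2], oid[2:])


def decode_loose(compressed: bytes):
    """Decompress loose-object bytes and return (type, payload)."""
    store = zlib.decompress(compressed)
    header, _, payload = store.partition(b"\x00")
    kind, size = header.decode().split(" ")
    if int(size) != len(payload):
        raise ValueError("size in header does not match payload length")
    return kind, payload
```

Reading the file at loose_object_path(".git", some_id) in binary mode and passing its bytes to decode_loose should give the same type that git cat-file -t reports, provided the object has not been moved into a pack.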
References: Git's Pointer System
The refs/ folder contains all the pointers that make Git navigation possible.
Branch References (refs/heads/)
Each file in refs/heads/ represents a branch and contains the SHA-1 of the latest commit:
# View the main branch reference
cat .git/refs/heads/main
# Output: a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
# git rev-parse gives the same answer, and also works when the ref is stored in packed-refs
git rev-parse main
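One catch when reading ref files by hand: after a repack, a branch may no longer have its own file and may appear only in packed-refs. The Python sketch below checks the loose file first and then falls back to packed-refs. The name read_ref is an assumption, and the sketch ignores symbolic refs and alternative ref storage backends.

```python
import os


def read_ref(git_dir: str, ref_name: str):
    """Return the object ID a ref points to, or None if the ref is not found."""
    # 1. Loose ref: one file per ref, e.g. <git_dir>/refs/heads/main
    loose = os.path.join(git_dir, *ref_name.split("/"))
    if os.path.isfile(loose):
        with open(loose) as f:
            return f.read().strip()

    # 2. Packed refs: one "<id> <refname>" pair per line
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.isfile(packed):
        with open(packed) as f:
            for line in f:
                line = line.strip()
                # Skip blanks, the "#" header, and "^" peeled-tag lines
                if not line or line[0] in "#^":
                    continue
                oid, _, name = line.partition(" ")
                if name == ref_name:
                    return oid
    return None
```

Calling read_ref(".git", "refs/heads/main") should agree with git rev-parse main in the simple cases this sketch covers.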
Remote References (refs/remotes/)
These track the state of branches on remote repositories:
# View remote branch state
cat .git/refs/remotes/origin/main
# See all remote references
git branch -r
Tags (refs/tags/)
Tags point to specific commits (or, for annotated tags, to tag objects, the fourth object type):
# Lightweight tag (points directly to commit)
cat .git/refs/tags/v1.0.0
# Annotated tag (points to tag object)
git cat-file -t $(cat .git/refs/tags/v1.1.0) # outputs "tag"
git cat-file -p $(cat .git/refs/tags/v1.1.0) # shows tag metadata
HEAD: Your Current Location
The HEAD file tells Git where you are right now:
# Typical HEAD content (on a branch)
cat .git/HEAD
# Output: ref: refs/heads/main
# During detached HEAD state
# Output: a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
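The two shapes of HEAD are easy to tell apart in code. Here is a minimal Python sketch, with the hypothetical name parse_head, that classifies the file's content as either a branch pointer or a detached commit ID.

```python
def parse_head(content: str):
    """Classify HEAD content as ("branch", ref_name) or ("detached", commit_id)."""
    content = content.strip()
    if content.startswith("ref: "):
        # Symbolic reference: HEAD follows whatever the named branch points to.
        return ("branch", content[len("ref: "):])
    # Otherwise HEAD holds a commit ID directly (detached HEAD).
    return ("detached", content)
```

For example, parse_head("ref: refs/heads/main\n") returns ("branch", "refs/heads/main").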
The Index: Git's Staging Area
The index file (also called the staging area) is where Git prepares your next commit:
# View what's staged
git ls-files --stage
# Example output:
# 100644 a1b2c3d4... 0 README.md
# 100644 b2c3d4e5... 0 src/main.js
# 100644 c3d4e5f6... 0 package.json
The index contains:
- File mode (permissions)
- SHA-1 hash of the blob
- Stage number (0 for normal, 1-3 for merge conflicts)
- File path
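The four fields listed above map directly onto each line of git ls-files --stage, where the path is separated from the other fields by a tab. The Python sketch below parses such lines and picks out paths with merge conflicts. IndexEntry, parse_stage_line, and conflicted_paths are hypothetical names, and the sketch assumes paths that Git does not need to quote.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    mode: str   # e.g. "100644"
    oid: str    # blob hash
    stage: int  # 0 normally, 1-3 during a merge conflict
    path: str


def parse_stage_line(line: str) -> IndexEntry:
    """Parse one line of `git ls-files --stage` output."""
    meta, _, path = line.rstrip("\n").partition("\t")
    mode, oid, stage = meta.split(" ")
    return IndexEntry(mode, oid, int(stage), path)


def conflicted_paths(lines):
    """Return the sorted set of paths that have any entry with a nonzero stage."""
    entries = (parse_stage_line(line) for line in lines)
    return sorted({entry.path for entry in entries if entry.stage != 0})
```

A conflicted file shows up as several entries for the same path (stages 1, 2, and 3), which is why conflicted_paths collects paths into a set before sorting.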
Logs: Git's Memory
The logs/ folder keeps a history of where your references have pointed:
# View HEAD's movement history
git reflog
# Reads .git/logs/HEAD and prints it newest-first in a friendlier format
# View branch history
cat .git/logs/refs/heads/main
Each log entry shows:
- Previous SHA-1
- New SHA-1
- Identity (name and email) of whoever made the change, plus timestamp
- Action description
a1b2c3d4 b2c3d4e5 Ris Adams <email> 1640995200 -0500 commit: Add feature
b2c3d4e5 c3d4e5f6 Ris Adams <email> 1640995260 -0500 checkout: moving from main to feature-branch
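If you want to process these log files yourself, each line can be pulled apart with a regular expression. The Python sketch below assumes the on-disk layout in which a tab separates the timestamp and timezone from the action description; parse_reflog_line and the field names are hypothetical.

```python
import re

# old-id new-id name <email> unix-time timezone TAB message
REFLOG_LINE = re.compile(
    r"^(?P<old>[0-9a-f]+) (?P<new>[0-9a-f]+) "
    r"(?P<name>.*) <(?P<email>[^>]*)> "
    r"(?P<time>\d+) (?P<tz>[+-]\d{4})\t(?P<message>.*)$"
)


def parse_reflog_line(line: str) -> dict:
    """Parse one line from a file under .git/logs/ into a dict of fields."""
    match = REFLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognized reflog line: {line!r}")
    fields = match.groupdict()
    fields["time"] = int(fields["time"])  # seconds since the Unix epoch
    return fields
```

Entries in the file are appended in chronological order, so the last line is the most recent movement of the ref.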
Configuration: Git's Settings
The config file stores repository-specific settings:
# View local config
cat .git/config
# Example content:
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
url = https://github.com/user/repo.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
remote = origin
merge = refs/heads/main
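The fetch line in the config above is a refspec: it says which refs on the remote map to which refs in your repository, and the leading + allows non-fast-forward updates. Here is a rough Python sketch of that mapping for the single-wildcard form shown above. The name map_refspec is an assumption, and real refspec handling covers more cases than this.

```python
def map_refspec(refspec: str, remote_ref: str):
    """Map a ref on the remote to its local name under a fetch refspec.

    Returns (local_ref, force), or None if the refspec does not match.
    """
    force = refspec.startswith("+")  # "+" permits non-fast-forward updates
    src, _, dst = refspec.lstrip("+").partition(":")
    if "*" not in src:
        return (dst, force) if remote_ref == src else None

    prefix, _, suffix = src.partition("*")
    if (len(remote_ref) < len(prefix) + len(suffix)
            or not remote_ref.startswith(prefix)
            or not remote_ref.endswith(suffix)):
        return None
    # Whatever the wildcard matched is substituted into the destination.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return (dst.replace("*", matched, 1), force)
```

With the refspec from the example config, refs/heads/main on the remote maps to refs/remotes/origin/main locally, which is exactly the file you saw under refs/remotes/origin/.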
Hooks: Git's Automation System
The hooks/ folder contains scripts that run at specific Git events:
# List available hooks
ls .git/hooks/
# Common hooks:
# pre-commit - Runs before commits
# post-commit - Runs after commits
# pre-push - Runs before pushes
# post-receive - Runs on the server after receiving pushes
Here's a simple pre-commit hook to run tests:
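#!/bin/sh
# .git/hooks/pre-commit (must be executable: chmod +x .git/hooks/pre-commit)
echo "Running tests before commit..."
# A non-zero exit status from the hook aborts the commit
if ! npm test; then
    echo "Tests failed. Commit aborted."
    exit 1
fi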
Practical Git Internals Commands
Here's your toolkit for exploring Git internals:
# Object inspection
git cat-file <option> <sha1> # Inspect an object
# Repository exploration
git count-objects # Count loose objects
git verify-pack -v .git/objects/pack/*.idx # Examine pack files
git fsck # Verify repository integrity
# Reference management
git update-ref refs/heads/test-branch <sha1> # Create/update reference
git symbolic-ref HEAD refs/heads/main # Update symbolic reference
# Index manipulation
git ls-files --stage # Show staged files
git update-index --add <file> # Add file to index manually
When Git Internals Knowledge Pays Off
Understanding Git's internals helps in several scenarios:
Repository Corruption Recovery
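# Find unreachable objects (dangling commits, blobs, and trees)
git fsck --unreachable
# Recover lost commits: find the commit, then point a new branch at it
git reflog
git fsck --lost-found
git branch recovered-work <sha1> # "recovered-work" is a placeholder name
# Caution: git reflog expire --expire-unreachable=now --all followed by
# git gc --prune=now permanently deletes unreachable objects; it recovers nothing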
Performance Optimization
# Repack objects for better performance
git gc --aggressive
# Check repository size
git count-objects -vH
Advanced Debugging
# Trace Git's decision-making
GIT_TRACE=1 git status
GIT_TRACE_PACK_ACCESS=1 git log --oneline -5
Key Takeaways
- The .git folder is a complete database that tracks everything about your project
- Objects store all content using SHA-1 hashes for integrity
- References provide human-readable names for commits
- The index bridges your working directory and repository
- Understanding internals helps with troubleshooting and advanced Git operations
Going Deeper
Want to explore more? Try building a simple Git implementation or dive into the Git source code. The more you understand Git's internals, the more confident you'll become with complex operations like rebasing, cherry-picking, and repository maintenance.
Next time you run git status, you'll know exactly what Git is checking behind the scenes, and that's pretty powerful knowledge to have in your toolkit.