So, you use Git every day, pushing, pulling, and occasionally rage-quitting because of a merge conflict.
But do you actually know what’s going on under the hood?
1. Git’s Data Model: Snapshots, Not Diffs
Most people assume Git tracks changes (like traditional version control systems such as SVN or Mercurial), but nope.
Git is a snapshot-based system.
Each commit is a complete snapshot of your repository at that moment in time. The magic? If a file hasn’t changed, Git just reuses the same reference instead of storing a duplicate.
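Proof With `cat-file`
Run `git cat-file -p HEAD` in a Git repo. The output starts with a `tree` line, followed by the parent, author, and committer lines and the commit message.
That `tree` hash (for example, `a1b2c3d4e5`) represents the snapshot of the files at that commit.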
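Here is a minimal sketch of that check, run in a throwaway repository so it touches nothing of yours. The file names and the `Demo` identity are made up for the illustration. It shows that an unchanged file resolves to the same blob hash in two consecutive snapshots.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)            # throwaway repository
cd "$repo"
git init -q

echo "one" > a.txt
echo "two" > b.txt
git add . && git commit -q -m "first"

echo "one, edited" > a.txt   # change a.txt only
git commit -q -am "second"

# The commit object names a tree: the snapshot for that commit.
git cat-file -p HEAD

# b.txt did not change, so both snapshots reference the same blob.
b_first=$(git rev-parse HEAD~1:b.txt)
b_second=$(git rev-parse HEAD:b.txt)
echo "b.txt blob in first commit:  $b_first"
echo "b.txt blob in second commit: $b_second"
```

The two `b.txt` hashes match, while `a.txt` gets a new blob in the second commit.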
2. Git’s Directed Acyclic Graph (DAG)
Git stores commits as a Directed Acyclic Graph (DAG)—a fancy way of saying commits point backward in time, forming a tree-like structure with no cycles.
Each commit has:
- A tree (which maps to actual file contents)
- A parent (or multiple parents for merges)
- Metadata (author, message, etc.)
For example, in a linear history `A <- B <- C`, commit C points to its parent B, and B points to A.
This structure makes history traversal super fast.
3. SHA-1 Hashing: Git’s Fingerprint System
Git identifies everything (commits, trees, blobs, etc.) using SHA-1 hashes. This ensures data integrity.
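Let’s see how Git hashes things. Pipe some text into `git hash-object --stdin` (for example, `echo "Hello Git" | git hash-object --stdin`) and it prints a 40-character hexadecimal SHA-1.
Strictly speaking, that is not the SHA-1 of the bare string: Git hashes a short header (the object type, the content size in bytes, and a NUL byte) followed by the content. Git does this for all files, commits, and trees, each with its own type in the header.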
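To see what actually goes into the hash, here is a small sketch that computes it two ways. It assumes `sha1sum` is installed and that Git is using its default SHA-1 object format; `printf` is used instead of `echo` so no trailing newline sneaks into the content.

```shell
set -e

# What Git computes for a blob: SHA-1 over "blob <size>\0<content>".
git_hash=$(printf 'Hello Git' | git hash-object --stdin)

# The same value computed by hand: header, NUL byte, then the 9 bytes of content.
manual_hash=$(printf 'blob 9\0Hello Git' | sha1sum | cut -d' ' -f1)

# A plain SHA-1 of the content alone gives a different value.
plain_hash=$(printf 'Hello Git' | sha1sum | cut -d' ' -f1)

echo "git hash-object: $git_hash"
echo "header+content:  $manual_hash"
echo "content only:    $plain_hash"
```

The first two lines print the same hash; the third differs because it leaves out the header.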
4. Intervals and Git’s History Walks
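How `git log` Walks the Graph
When you run `git log`, Git does not scan every object in the repository. It starts from the commit you name (`HEAD` by default) and follows parent pointers, visiting each reachable commit once.
Example:
If your history has two branches that later merge, Git doesn’t read one branch to the end and then the other. It walks both lines of history together, normally emitting commits in date order. For a range such as `main..feature`, it stops following a path once every commit on it is known to be excluded, and when a commit-graph file is present it can skip stretches of history without reading each commit object, minimizing redundant work.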
5. Exploring Git Internals With Code
Want to see Git’s raw storage? Let’s play!
1. List All Git Objects
Run `ls .git/objects`. You’ll see folders with two-character names (the first 2 characters of each SHA-1 hash), plus `info` and `pack`.
2. Inspect a Commit Object
Find a commit hash and run `git cat-file -p <commit-hash>`.
It prints the commit details.
3. Check a Tree Object (File Snapshot)
Grab the `tree` hash from the commit and run `git cat-file -p <tree-hash>`.
Now you see the folder structure!
4. Inspect a Blob (File Content)
Find a file’s blob hash and run `git cat-file -p <blob-hash>`.
Boom! The file’s content appears.
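The four steps above can be chained into one walk from commit to tree to blob. This is a sketch in a throwaway repository; the file name `readme.txt` and its content are invented for the example, and the object path check assumes the object is still loose (not yet packed).

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "hello internals" > readme.txt
git add . && git commit -q -m "first"

# Commit -> tree: the first line of the commit object names the tree.
commit_hash=$(git rev-parse HEAD)
tree_hash=$(git cat-file -p "$commit_hash" | awk '/^tree / {print $2; exit}')

# Tree -> blob: each tree entry is "<mode> <type> <hash> <name>".
blob_hash=$(git cat-file -p "$tree_hash" | awk '$4 == "readme.txt" {print $3}')

# Blob -> content.
content=$(git cat-file -p "$blob_hash")
echo "commit $commit_hash"
echo "tree   $tree_hash"
echo "blob   $blob_hash"
echo "content: $content"

# On disk, a loose object lives under the first two characters of its hash.
obj_path=".git/objects/$(echo "$blob_hash" | cut -c1-2)/$(echo "$blob_hash" | cut -c3-)"
ls "$obj_path"
```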
6. Git’s Garbage Collection and Packfiles
Git compresses objects using packfiles. These bundle multiple objects into a single file for efficiency.
Run `git gc`.
Git will repack objects, saving space.
To inspect packfiles, list `.git/objects/pack`, or run `git verify-pack -v` on one of the `.idx` files there.
7. What Is a Git Branch? (It’s Just a Pointer!)
Most people assume a branch is a separate folder or copy of files. Nope.
A Git branch is just a file that contains a commit hash. That’s it.
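Let’s check it out! Run `cat .git/refs/heads/main` (assuming your branch is named `main` and its ref has not been moved into `.git/packed-refs`).
The output is a single 40-character hash.
That’s just a commit hash!
When you create a branch, Git creates a new file under `.git/refs/heads/` containing the hash of the commit you branched from, so at first both branches hold the same hash. This means branches are cheap: creating one writes a few bytes, and committing just moves the pointer.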
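A quick sketch to confirm this in a throwaway repository. It assumes the default files-based ref storage with loose (unpacked) refs, which is what a freshly created repository normally has; the branch name `feature` is just an example.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main"
echo "hello" > file.txt
git add . && git commit -q -m "first"

# The branch is a small text file holding one commit hash.
branch_file=$(cat .git/refs/heads/main)
head_commit=$(git rev-parse HEAD)
echo "refs/heads/main contains: $branch_file"
echo "HEAD resolves to:         $head_commit"

# Creating a branch writes one more small file; no project files are copied.
git branch feature
feature_file=$(cat .git/refs/heads/feature)
echo "refs/heads/feature contains: $feature_file"
```

All three values are the same hash, because the new branch starts at the commit you were on.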
8. How `HEAD` Works: The Active Branch
`HEAD` is a special reference that tells Git which branch you’re currently on.
Run `cat .git/HEAD`. On the `main` branch it prints `ref: refs/heads/main`.
This means `HEAD` is pointing to `main`. When you switch branches, Git just updates this file.
Try switching with `git checkout <branch>` and reading the file again. Now it shows `ref: refs/heads/<branch>`.
No files copied. No magic. Just a simple pointer change.
9. What Happens When You Switch Branches?
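Let’s break it down:
- Git updates `HEAD` to point to the new branch.
- Git updates the working directory (and the index) to match the new branch’s commit.
- Git carries uncommitted changes over when they don’t conflict, and refuses to switch when they would be overwritten.
Try `git checkout test-branch` (assuming that branch exists), then check with `cat .git/HEAD`.
Output: `ref: refs/heads/test-branch`
Git also updates your working files to match the latest commit in `test-branch`.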
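Here is a small sketch of both effects, the `HEAD` rewrite and the working-file update, in a throwaway repository. The file name and contents are made up for the demo.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "main version" > file.txt
git add . && git commit -q -m "on main"

head_before=$(cat .git/HEAD)
echo "before: $head_before"

# Create and switch to a branch, then commit a different file version there.
git checkout -q -b test-branch
echo "branch version" > file.txt
git commit -q -am "on test-branch"
head_after=$(cat .git/HEAD)
echo "after:  $head_after"

# Switching back rewrites .git/HEAD and restores the files from main.
git checkout -q main
file_on_main=$(cat file.txt)
echo "file.txt on main: $file_on_main"
```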
10. How Merging Works Under the Hood
When you merge branches, Git looks for a common ancestor (usually the latest shared commit).
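Example: suppose `main` contains commits A, B, and C, and `feature` branched off at B and now ends at E.
If you merge `feature` into `main`, Git finds commit B (the last shared commit), then combines the changes made on each side since B, up to C and up to E.
Merge Types
Fast-forward merge
If no new commits exist on `main` since `feature` branched off, Git just moves the branch pointer forward. Example:
```shell
git checkout main
git merge feature
```
This just updates `main` to point to `E`.
Three-way merge
If `main` has new commits, Git needs to create a merge commit.
```shell
git merge feature
```
Git creates a new commit with two parents, combining the changes from both branches.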
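Both merge types can be reproduced in a throwaway repository. In this sketch the `commit` helper and the `hotfix` branch are hypothetical conveniences, and each commit touches its own file so nothing conflicts.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b feature; commit E     # feature branches off at B
git checkout -q main; commit C           # main moves on independently

# The merge base is the last commit both branches share.
base=$(git log -1 --format=%s "$(git merge-base main feature)")
echo "merge base: $base"

# main has its own commit (C), so this is a three-way merge with two parents.
git merge -q --no-edit feature
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "parents of merge commit: $parents"

# A branch that is strictly ahead of main can be fast-forwarded instead.
git checkout -q -b hotfix; commit F
git checkout -q main
git merge -q --ff-only hotfix
main_tip=$(git rev-parse main)
hotfix_tip=$(git rev-parse hotfix)
echo "main == hotfix after fast-forward: $main_tip"
```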
11. How Git Deletes and Recovers Branches
Deleting a Branch
A branch is just a file, so deleting it is easy: run `git branch -d feature`.
This deletes `.git/refs/heads/feature`. Git refuses if the branch has commits that have not been merged yet.
To force delete, run `git branch -D feature`.
Recovering a Deleted Branch
If you deleted a branch but need it back:
Find the last commit hash:
```shell
git reflog
```
Restore the branch:
```shell
git checkout -b feature <commit-hash>
```
Boom! The branch is back.
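The whole delete-and-recover round trip looks roughly like this in a throwaway repository. The sketch relies on the reflog order this particular script produces (`HEAD@{1}` is the commit made just before switching back to `main`); in a real repository you would read the hash off the `git reflog` output instead.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "base" > file.txt
git add . && git commit -q -m "base"

git checkout -q -b feature
echo "work" > feature.txt
git add . && git commit -q -m "feature work"

git checkout -q main
git branch -q -D feature     # the branch file is gone, the commit object is not

# HEAD's reflog still records the commit made on the deleted branch.
git reflog
lost=$(git rev-parse 'HEAD@{1}')

# Recreate the branch at that commit.
git checkout -q -b feature "$lost"
restored=$(git log -1 --format=%s)
echo "restored tip: $restored"
```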
12. How Git Rebase Works Internally
Rebasing is one of Git’s most misunderstood features. Instead of merging branches, it rewrites history by moving commits.
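Example scenario: `main` contains commits A, B, and C, and `feature` branched off at B with two commits of its own, D and E.
If we run `git rebase main` while on `feature`, Git does this internally:
- Finds the common ancestor (B).
- Moves the `feature` commits (`D` and `E`) onto `main`, replaying them one by one.
- Updates `feature` to point to the new commit history.
The result is a linear history: A, B, C, D', E'.
Rebasing rewrites commit hashes, creating new commits `D'` and `E'`. This is why you shouldn’t rebase shared branches: it changes history that other people may already have built on!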
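A sketch of that scenario in a throwaway repository, showing that the branch tip gets a new hash and the history becomes linear. The `commit` helper is a hypothetical convenience; each commit touches its own file so the replay cannot conflict.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b feature; commit D; commit E
git checkout -q main; commit C

git checkout -q feature
old_tip=$(git rev-parse feature)
git rebase -q main                 # replay D and E on top of C
new_tip=$(git rev-parse feature)

history=$(git log --format=%s --reverse | paste -sd ' ' -)
echo "old tip: $old_tip"
echo "new tip: $new_tip"
echo "history: $history"
```

The subjects are unchanged, but `feature` now points to a different hash and sits directly on top of `main`.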
13. The Git Reflog: Your Undo Button
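Git rarely loses commits, even “deleted” ones. Every movement of `HEAD` is logged in the reflog (`git reflog`), and by default entries are kept for 30 to 90 days before they expire.
Run `git reflog`.
Each line of output has the form `<short-hash> HEAD@{n}: <action>: <description>`.
This shows recent actions, like branch checkouts and commits.
If you accidentally reset away a commit, use `git reset --hard <commit-hash>` with the hash shown in the reflog.
Or restore a deleted branch with `git checkout -b <branch-name> <commit-hash>`.
Git is hard to break, thanks to the reflog.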
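As a sketch, here is a hard reset being undone through the reflog in a throwaway repository. The commit messages are invented; `HEAD@{1}` means "where `HEAD` was one move ago", which in this script is the commit that was reset away.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "v1" > file.txt
git add . && git commit -q -m "keep me"
echo "v2" > file.txt
git commit -q -am "oops, reset away"

# Throw away the latest commit.
git reset -q --hard HEAD~1
after_reset=$(git log -1 --format=%s)

# The reflog remembers where HEAD was before the reset.
git reflog
git reset -q --hard 'HEAD@{1}'
recovered=$(git log -1 --format=%s)
echo "after reset: $after_reset"
echo "recovered:   $recovered"
```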
14. How Git Stash Works Internally
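When you run `git stash`, Git doesn’t create a branch. Instead, it:
- Saves your uncommitted changes as commit objects (one for the working tree, one for the index).
- Resets the working directory and index to match `HEAD`, leaving you with a clean working directory. `HEAD` itself does not move.
The latest stash is stored in `.git/refs/stash`; older stashes live in that ref’s reflog.
To view them, run `git stash list`.
To restore the latest one, run `git stash pop`.
Each stash is a commit, so you can even inspect them, for example with `git show stash@{0}`.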
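A short sketch that pokes at a stash in a throwaway repository. It assumes loose refs (so `.git/refs/stash` exists as a file) and a stash made without `-u`, in which case the stash commit has two parents: the commit you were on and a commit recording the index.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "committed" > file.txt
git add . && git commit -q -m "base"

echo "work in progress" >> file.txt
head_before=$(git rev-parse HEAD)
git stash push -q

# The stash ref is a plain file holding a commit hash.
stash_ref=$(cat .git/refs/stash)
stash_type=$(git cat-file -t "$stash_ref")
stash_parents=$(git log -1 --format=%P refs/stash | wc -w | tr -d ' ')
dirty=$(git status --porcelain)      # empty: the working directory is clean
head_after=$(git rev-parse HEAD)     # unchanged: HEAD did not move
echo "stash object type: $stash_type, parents: $stash_parents"

git stash pop -q
restored=$(tail -n 1 file.txt)
echo "restored line: $restored"
```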
15. Packfiles: How Git Optimizes Storage
If you check `.git/objects`, you’ll see lots of small files.
Over time, Git packs these into a single compressed file called a packfile.
Check packfiles with `ls .git/objects/pack`.
To manually optimize storage, run `git gc`.
Git then:
- Compresses objects into fewer files, storing similar objects as deltas.
- Removes loose objects that are already packed.
- Reduces repository size.
This is why Git repos stay efficient, even with thousands of commits.
16. Git Garbage Collection: Cleaning Up Unused Objects
Git automatically removes orphaned objects (like old commits no longer referenced by any branch, tag, or reflog entry) once they are older than a grace period, two weeks by default.
To see loose objects, run `git count-objects -v`.
Run garbage collection manually with `git gc`.
This removes:
- Orphaned commits
- Old packfiles
- Unreferenced blobs
If you accidentally delete a commit before garbage collection runs, you can still find it with `git reflog`.
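To watch loose objects turn into a packfile, here is a sketch in a throwaway repository. The five revisions are arbitrary; the `count:` and `packs:` fields are parsed from `git count-objects -v`.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Five commits produce a handful of loose blob, tree, and commit objects.
i=1
while [ "$i" -le 5 ]; do
  echo "revision $i" > file.txt
  git add file.txt && git commit -q -m "revision $i"
  i=$((i + 1))
done

loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
git gc -q
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs=$(git count-objects -v | awk '/^packs:/ {print $2}')

echo "loose objects before gc: $loose_before"
echo "loose objects after gc:  $loose_after"
echo "packfiles after gc:      $packs"
ls .git/objects/pack
```

Every object in this repository is reachable, so nothing is deleted; the objects just move from loose files into the pack.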
17. Git Bisect: Debugging Like a Time Traveler
Ever had a bug that wasn’t there yesterday? Instead of manually checking old commits, Git can automatically find the exact commit where the bug was introduced.
How It Works
`git bisect` performs a binary search on your commit history.
Start bisect mode:
```shell
git bisect start
```
Mark a good commit:
```shell
git bisect good <commit-hash>
```
Mark a bad commit:
```shell
git bisect bad HEAD
```
Git will now check out the midpoint commit and ask you to test.
- If the commit is good, run:
```shell
git bisect good
```
- If the commit is bad, run:
```shell
git bisect bad
```
Git repeats this process until it finds the first bad commit.
When finished, reset:
```shell
git bisect reset
```
This automates debugging by letting Git find exactly where the problem started.
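The manual good/bad loop can also be handed to a script with `git bisect run`. The sketch below builds a throwaway history of eight commits and plants a "bug" (a line reading `BUG`) in the fifth; the file name and the test command are invented for the demo. The test command exits 0 for a good commit and non-zero for a bad one.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Eight commits; the fifth one introduces the "bug".
i=1
while [ "$i" -le 8 ]; do
  if [ "$i" -eq 5 ]; then echo "BUG" >> app.txt; else echo "line $i" >> app.txt; fi
  git add app.txt && git commit -q -m "commit $i"
  i=$((i + 1))
done

first=$(git rev-list --max-parents=0 HEAD)   # the root commit, known good

# Start with HEAD as bad and the root as good, then let Git drive the search.
git bisect start HEAD "$first" >/dev/null
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

# When the search ends, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit"

git bisect reset >/dev/null 2>&1
```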
18. Worktrees: Multiple Checkouts at Once
Ever wanted to work on two branches at the same time without stashing or committing? That’s what `git worktree` does.
How Worktrees Work
A Git worktree is another checkout of the same repository but in a different folder.
Creating a Worktree: run `git worktree add ../feature-branch feature`.
This creates a new folder `../feature-branch` where you can work on the `feature` branch without switching your main repo.
Listing Worktrees: run `git worktree list`.
Removing a Worktree: run `git worktree remove ../feature-branch`.
This is super useful for working on multiple branches without constantly switching.
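Here is the full add, list, remove cycle as a sketch in a temporary directory. The folder name `main-repo` is made up; `../feature-branch` and `feature` match the names used above. It assumes a Git version recent enough to have `git worktree remove`.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
mkdir "$base/main-repo" && cd "$base/main-repo"
git init -q
echo "hello" > file.txt
git add . && git commit -q -m "first"
git branch feature

# Check out "feature" in a sibling folder; the main checkout stays where it is.
git worktree add ../feature-branch feature >/dev/null 2>&1
wt_branch=$(git -C ../feature-branch rev-parse --abbrev-ref HEAD)
wt_count=$(git worktree list | wc -l | tr -d ' ')
git worktree list
echo "branch in the new worktree: $wt_branch"

# Remove the extra checkout again; the branch itself is untouched.
git worktree remove ../feature-branch
```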
19. Bare Repositories: What’s Inside Remote Git Repos?
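When you run `git clone <url>` against a hosting service, you’re typically cloning from a bare repository.
What’s a Bare Repository?
A bare repo has no working directory, just the Git data.
To create one, run `git init --bare <name>.git`.
A bare repo contains only what normally lives inside the `.git` directory, placed at its top level: `HEAD`, `config`, `objects/`, `refs/`, and so on.
This is what GitHub, GitLab, and other remote services use to store repos centrally.
Why Use a Bare Repo?
- Collaboration: Remote repositories need to accept pushes, and by default Git refuses a push to the checked-out branch of a regular repo because that would leave its working directory out of date.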
- Centralized Storage: CI/CD systems often use bare repositories for automation.
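A local sketch of the same setup: a bare repository standing in for the remote, and a clone that pushes to it. The names `central.git` and `work` are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
cd "$base"

# The "remote": Git data only, no checked-out files.
git init -q --bare central.git
is_bare=$(git -C central.git rev-parse --is-bare-repository)
ls central.git        # HEAD, config, objects, refs, ... at the top level

# A regular clone with a working directory.
git clone -q central.git work 2>/dev/null
cd work
git symbolic-ref HEAD refs/heads/main
echo "hello" > file.txt
git add . && git commit -q -m "first"
git push -q origin main

# The commit is now stored in the bare repository.
pushed=$(git -C ../central.git log -1 --format=%s main)
echo "bare: $is_bare, tip of main in central.git: $pushed"
```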
20. Git Submodules: Repositories Inside Repositories
Sometimes, you need to include another Git repo inside your own (e.g., a shared library). That’s where submodules come in.
Adding a Submodule
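Run `git submodule add <repo-url> <path>`.
This creates a `.gitmodules` file recording the submodule’s URL and path, plus a checkout of the submodule at `<path>`.
Cloning a Repo With Submodules
By default, submodules aren’t cloned. To fix this, clone with `git clone --recurse-submodules <repo-url>`.
Or if you forgot, run `git submodule update --init --recursive` inside the clone.
How Submodules Work Internally
Submodules aren’t stored as normal files. Instead, Git stores a special commit reference: a tree entry with mode `160000` and type `commit` (often called a gitlink).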
This tells Git which commit of the submodule to check out.
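To see that commit reference for yourself, here is a sketch using two throwaway repositories, `lib` and `app` (both names hypothetical). Because the submodule is added from a local path, the sketch passes `-c protocol.file.allow=always`, which recent Git versions require for file-based submodule clones; with a normal HTTPS or SSH URL you would not need it.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
cd "$base"

# The shared library that will become the submodule.
git init -q lib
(cd lib && echo "lib code" > lib.txt && git add . && git commit -q -m "lib v1")
lib_head=$(git -C lib rev-parse HEAD)

# The project that includes it.
git init -q app
cd app
echo "app code" > app.txt
git add . && git commit -q -m "app"
git -c protocol.file.allow=always submodule --quiet add "$base/lib" libs/lib >/dev/null 2>&1
git commit -q -m "Add lib as a submodule"

# The tree entry for the submodule: mode 160000, type commit, pinned hash.
entry=$(git ls-tree HEAD libs/lib)
mode=$(echo "$entry" | awk '{print $1}')
kind=$(echo "$entry" | awk '{print $2}')
pinned=$(echo "$entry" | awk '{print $3}')
echo "$entry"
cat .gitmodules
```

The pinned hash equals the tip of `lib` at the time it was added; the superproject stores that hash, not the library’s files.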
21. Git Hooks: Automate Everything
Git has built-in automation via hooks—scripts that run before/after Git commands.
Where Hooks Live
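Hooks are stored in `.git/hooks/`. A freshly initialized repository usually ships example scripts there with a `.sample` suffix; a hook only runs once the suffix is removed and the file is executable.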
Common Hooks:
Hook | Runs When? |
---|---|
pre-commit | Before a commit is created |
pre-push | Before a git push |
commit-msg | When a commit message is entered |
post-merge | After a successful merge |
Example: Preventing Bad Commit Messages
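Create `.git/hooks/commit-msg` as a script. Git passes it the path of the file holding the proposed message as its first argument, and a non-zero exit status rejects the commit.
Make it executable with `chmod +x .git/hooks/commit-msg`.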
Now Git rejects bad commit messages!
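As an illustration, here is one possible `commit-msg` hook, installed and exercised in a throwaway repository. The rule it enforces (first line at least 10 characters long) is a made-up example, not a Git convention; swap in whatever check your team wants.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p .git/hooks

# Hypothetical rule: the first line of the message needs 10+ characters.
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file that holds the proposed commit message.
first_line=$(head -n 1 "$1")
if [ "${#first_line}" -lt 10 ]; then
  echo "Commit message too short (minimum 10 characters)." >&2
  exit 1
fi
EOF
chmod +x .git/hooks/commit-msg

echo "x" > f.txt
git add f.txt

# A three-character message is rejected by the hook...
if git commit -q -m "fix" 2>/dev/null; then short_result=accepted; else short_result=rejected; fi
# ...while a descriptive one goes through.
if git commit -q -m "Fix typo in README heading" 2>/dev/null; then long_result=accepted; else long_result=rejected; fi

echo "short message: $short_result"
echo "long message:  $long_result"
```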
22. What Is a Git Patch?
A Git patch is a text-based representation of a commit (or multiple commits). Instead of pushing/pulling, you can export a change as a `.patch` file and apply it elsewhere.
Think of it like a portable commit—you can send it via email, copy it to another machine, or even manually review it.
Example workflow:
- Generate a patch file (`git format-patch`).
- Send it to someone (email, Slack, etc.).
- Apply it on another repository (`git apply`).
This is how Linux kernel development and many open-source projects handle contributions!
23. Creating a Git Patch
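To generate a patch for the last commit, run `git format-patch -1 HEAD`.
This creates a `.patch` file named after the commit subject, like `0001-<subject>.patch`.
To generate patches for multiple commits, pass a count or a range, for example `git format-patch -3 HEAD` for the last three commits.
This creates one `.patch` file per commit.
To create a single patch for all changes, redirect a diff to a file, for example `git diff > changes.patch`.
This is useful when you haven’t committed changes yet.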
24. Structure of a Git Patch File
A `.patch` file is just plain text! A patch produced by `git format-patch` breaks down as follows:
Metadata (Commit Info)
- `From` (the first line, without a colon) → Commit hash
- `From:` → Author
- `Date:` → Timestamp
- `Subject:` → Commit message
File Change Summary
- Shows the number of insertions/deletions.
Unified Diff Format (`diff --git`)
- `index` → Shows blob hashes before/after the change.
- `---` and `+++` → Name the old and new versions of the file.
- `@@` → Shows line numbers where changes occurred.
25. Applying a Git Patch
To apply a patch, run `git apply <file>.patch`.
This applies the change but does not commit it.
To apply and commit, run `git am <file>.patch`.
The `am` (apply mailbox) command preserves the original commit message and author.
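Here is a sketch of the round trip between two throwaway repositories, `source` and `target` (names invented). The author `Alice Example` is hypothetical and is only there to show that `git am` keeps the original author rather than using the identity of whoever applies the patch.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

base=$(mktemp -d)
cd "$base"

# Two repositories that share a starting point.
git init -q source
(cd source && echo "line 1" > notes.txt && git add . && git commit -q -m "base")
git clone -q source target

# A new commit in "source", written by a different (hypothetical) author.
cd source
echo "line 2" >> notes.txt
GIT_AUTHOR_NAME="Alice Example" git commit -q -am "Add second line"

# Export the last commit; format-patch prints the path of the file it wrote.
patch=$(git format-patch -1 HEAD -o "$base/patches")
echo "wrote $patch"

# Apply and commit it in "target".
cd ../target
git am -q "$patch"
subject=$(git log -1 --format=%s)
author=$(git log -1 --format=%an)
echo "applied: $subject (author: $author)"
```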
26. How Git Stores and Processes Patches
Internally, a patch file is just a diff. When you run `git diff HEAD`, Git runs the diff algorithm to compute differences between the latest commit and your working directory.
When applying a patch, Git:
- Reads the `diff` data.
- Finds the target file(s).
- Applies changes hunk by hunk, using the surrounding context lines to locate each one.
- Checks for conflicts (if needed).
What Happens If a Patch Fails?
If a patch doesn’t match the current state, you get errors such as `error: patch failed: <file>:<line>` and `error: <file>: patch does not apply`.
This means the file changed since the patch was created. You’ll need to manually fix conflicts before retrying.
27. Git Patches vs Cherry-Picking
Another way to move changes between branches is `git cherry-pick <commit-hash>`.
This applies a commit from one branch to another.
Feature | Git Patch | Cherry-Pick |
---|---|---|
Requires commit history? | ❌ No | ✅ Yes |
Can be shared via email? | ✅ Yes | ❌ No |
Preserves author info? | ✅ Yes | ✅ Yes |
Can apply multiple commits? | ✅ Yes | ✅ Yes |
Patches are more flexible because they work even if the repo history is different.
28. Interactive Patch Editing
Want to apply only part of a patch? Use `--reject`, as in `git apply --reject <file>.patch`.
This applies as much as possible and creates `.rej` files for the hunks that fail.
To manually inspect first, run `git apply --check <file>.patch`.
This dry-runs the patch without applying it.
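A sketch of the dry run in action, in a throwaway repository. The file `list.txt` and its contents are invented. The same patch passes `--check` while the file still matches and fails once the file has drifted.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'alpha\nbeta\n' > list.txt
git add . && git commit -q -m "base"

# Capture an uncommitted change as a patch, then put the file back.
patch_file=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > list.txt
git diff > "$patch_file"
git checkout -q -- list.txt

# The file matches what the patch expects, so the dry run passes.
if git apply --check "$patch_file" 2>/dev/null; then clean_check=ok; else clean_check=failed; fi

# After unrelated edits the context lines no longer match, so it fails.
printf 'one\ntwo\n' > list.txt
if git apply --check "$patch_file" 2>/dev/null; then drift_check=ok; else drift_check=failed; fi

# Restore the file and apply the patch for real.
git checkout -q -- list.txt
git apply "$patch_file"
last_line=$(tail -n 1 list.txt)
echo "clean: $clean_check, after drift: $drift_check, last line: $last_line"
```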
29. Git Cherry-Picking
Running `git cherry-pick <commit-hash>` creates a new commit on your branch with the same changes as `<commit-hash>`, but without merging the whole branch.
30. How Cherry-Picking Works Internally
When you cherry-pick a commit, Git:
- Finds the commit you specified.
- Applies the changes from that commit onto your current branch.
- Creates a new commit with the same changes but a new hash.
Under the Hood:
A cherry-pick is roughly equivalent to:
- Running `git diff <commit-hash>^ <commit-hash>` to get the changes.
- Applying those changes to the working directory.
- Creating a new commit with those changes.
Example: suppose commit `F` exists only on `feature`.
If you run `git cherry-pick <hash-of-F>` while on `main`, Git applies the changes from `F` on top of `main`.
The result: even though `F` already exists in `feature`, a new commit `F'` is created on `main`.
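To check that `F'` really is the same change under a new hash, this sketch compares the two commits with `git patch-id`, which fingerprints a diff independently of the commit it lives in. The `commit` helper is a hypothetical convenience for the throwaway repository.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b feature; commit F
orig=$(git rev-parse HEAD)
git checkout -q main; commit B

# Copy F onto main: same change, new parent, therefore a new hash.
git cherry-pick "$orig" >/dev/null
copy=$(git rev-parse HEAD)

orig_patch=$(git show "$orig" | git patch-id | cut -d' ' -f1)
copy_patch=$(git show "$copy" | git patch-id | cut -d' ' -f1)
echo "F : $orig (patch-id $orig_patch)"
echo "F': $copy (patch-id $copy_patch)"
```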
31. Cherry-Picking Multiple Commits
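You can cherry-pick multiple commits at once with `git cherry-pick <hash1> <hash2>`.
Or a range of commits with `git cherry-pick <start>..<end>`. The range excludes `<start>` itself.
For example, in a history A, B, C, D, E, `git cherry-pick B..E` picks C, D, and E.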
32. What Happens If a Cherry-Pick Fails?
If Git can’t apply a commit cleanly, it results in a conflict, reported with a message like `error: could not apply <hash>... <subject>`.
Git will stop cherry-picking and let you resolve conflicts manually.
Fixing a Cherry-Pick Conflict
Open conflicted files and fix them.
Mark them as resolved:
```shell
git add <fixed-file>
```
Continue the cherry-pick:
```shell
git cherry-pick --continue
```
If you want to cancel, run `git cherry-pick --abort`.
33. Cherry-Picking vs Merging vs Rebasing
Cherry-picking isn’t the only way to move commits between branches. Here’s how it compares to merging and rebasing:
Feature | Cherry-Picking | Merging | Rebasing |
---|---|---|---|
Selects specific commits? | ✅ Yes | ❌ No | ❌ No |
Creates new commit hashes? | ✅ Yes | ❌ No | ✅ Yes |
Maintains original history? | ❌ No | ✅ Yes | ❌ No |
Can be undone easily? | ✅ Yes (`revert`) | ✅ Yes (`revert`) | ⚠️ No (rewrites history) |
When to Use Cherry-Picking:
✅ When you need only one commit from another branch.
✅ When you don’t want to merge an entire branch.
✅ When applying a hotfix from one branch to another.
34. Automating Cherry-Picking With -x
By default, cherry-picked commits don’t track where they came from. To include a reference, run `git cherry-pick -x <commit-hash>`.
This appends a line like `(cherry picked from commit <original-hash>)` to the commit message.
Now it’s clear that the commit came from elsewhere!
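A short sketch of `-x` in a throwaway repository. The branch and file names are made up; the original commit has a subject line only, so the recorded reference ends up as the entire body of the new commit’s message.

```shell
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "base" > base.txt
git add . && git commit -q -m "base"

# A hotfix committed on another branch.
git checkout -q -b release
echo "fix" > fix.txt
git add . && git commit -q -m "Hotfix: handle empty input"
orig=$(git rev-parse HEAD)

# Copy it to main and record where it came from.
git checkout -q main
git cherry-pick -x "$orig" >/dev/null
body=$(git log -1 --format=%b)
echo "message body: $body"
```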
35. Undoing a Cherry-Pick
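If you made a mistake, you can undo a cherry-pick that is still in progress (for example, one stopped on a conflict) with `git cherry-pick --abort`.
If you already committed the cherry-pick, run `git revert <commit-hash>`, using the hash of the new cherry-picked commit.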
This creates a reverse commit that undoes the cherry-picked changes.