Git Internals

So, you use Git every day, pushing, pulling, and occasionally rage-quitting because of a merge conflict.

But do you actually know what’s going on under the hood?

1. Git’s Data Model: Snapshots, Not Diffs

Most people assume Git tracks changes (like traditional version control systems such as SVN or Mercurial), but nope.

Git is a snapshot-based system.

Each commit is a complete snapshot of your repository at that moment in time. The magic? If a file hasn’t changed, Git just reuses the same reference instead of storing a duplicate.

Proof With `cat-file`

Run this in a Git repo:

git cat-file -p HEAD

You’ll see something like:

tree a1b2c3d4e5
parent 1234567890
author You <you@example.com> 1647890189 +0000
committer You <you@example.com> 1647890189 +0000

Added README

That tree hash (a1b2c3d4e5) represents the snapshot of the files at that commit.
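To see the reuse of unchanged files for yourself, here is a minimal sketch. It assumes a throwaway repository in a temporary directory; the file names `a.txt` and `b.txt` are made up for illustration.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch "main" on any Git version
git config user.name "You" && git config user.email "you@example.com"

echo "stable" > a.txt
echo "v1" > b.txt
git add . && git commit -q -m "first"

echo "v2" > b.txt                          # only b.txt changes
git commit -q -am "second"

# Blob IDs of each file in the previous and the current commit
a_old=$(git rev-parse HEAD~1:a.txt); a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt); b_new=$(git rev-parse HEAD:b.txt)
echo "a.txt: $a_old -> $a_new"             # same ID: the blob is reused
echo "b.txt: $b_old -> $b_new"             # different ID: a new blob was stored
```

Both commits are full snapshots, yet the unchanged file costs nothing extra because both trees point at the same blob.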

2. Git’s Directed Acyclic Graph (DAG)

Git stores commits as a Directed Acyclic Graph (DAG)—a fancy way of saying commits point backward in time, forming a tree-like structure with no cycles.

Each commit has:

A tree (which maps to actual file contents)
A parent (or multiple parents for merges)
Metadata (author, message, etc.)

Here’s a quick ASCII example of a Git history:

A ← B ← C ← D  (main branch)
        ↖
         E ← F (feature branch)

Because every commit stores the IDs of its parents, walking history is just a matter of following pointers, which makes traversal fast.

3. SHA-1 Hashing: Git’s Fingerprint System

Git identifies everything (commits, trees, blobs, etc.) using SHA-1 hashes. This ensures data integrity.

Let’s see how Git hashes things:

echo "Hello Git" | git hash-object --stdin

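Output (a 40-character hex object ID; the exact value depends on the input bytes, so yours may differ from this sample):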

8ab686eafeb1f44702738c8b0f24f2567c36da6d

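That’s a SHA-1 hash, but not of the raw string alone: Git first prepends a small header (the object type, the content size, and a NUL byte) and hashes the result. Git does this for all files, commits, and trees.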
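The object ID can be reproduced by hand. This is a sketch that assumes `sha1sum` is available (on macOS, `shasum` plays the same role) and that you run it outside a repository or inside one that uses SHA-1.

```sh
set -eu
# Git hashes "blob <size>\0" followed by the content, not the content alone.
# "Hello Git" plus the newline added by echo is 10 bytes.
manual=$( { printf 'blob 10\0'; printf 'Hello Git\n'; } | sha1sum | cut -d' ' -f1 )
via_git=$(echo "Hello Git" | git hash-object --stdin)
echo "manual:  $manual"
echo "via git: $via_git"                   # the two IDs match
```

Because the ID is derived only from the type, size, and bytes, identical content always gets the same ID, which is what lets Git share blobs between commits.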

4. How Git Walks History

How `git log` Walks the Graph

When you run:

git log --graph --oneline --all

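Git starts at every ref (that is what `--all` asks for) and follows parent pointers backward. It marks each commit as seen, so history shared by several branches is visited only once, and with `--graph` it shows every commit before its parents.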

Example:

If your commit history looks like:

A ← B ← C ← D (main)
        ↖
         E ← F (feature)

Git doesn’t scan the object database sequentially. It starts at the two branch tips (D and F) and follows parent pointers from each; once the walks reach commits they have in common, those commits are processed only once.
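A small experiment shows that shared history is counted once. This sketch builds a throwaway repository with empty commits named after the diagram; for simplicity it branches `feature` off at B and stops at C and E.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"

git commit -q --allow-empty -m "A"
git commit -q --allow-empty -m "B"
git checkout -q -b feature
git commit -q --allow-empty -m "E"
git checkout -q main
git commit -q --allow-empty -m "C"

on_main=$(git rev-list --count main)        # A, B, C
on_feature=$(git rev-list --count feature)  # A, B, E
total=$(git rev-list --count --all)         # A, B, C, E: shared commits counted once
echo "main=$on_main feature=$on_feature all=$total"
```

`git rev-list` is the plumbing command behind `git log`, so the same walk drives both.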

5. Exploring Git Internals With Code

Want to see Git’s raw storage? Let’s play!

1. List All Git Objects

ls .git/objects

You’ll see folders with two-character names (the first 2 hex characters of each object’s hash), plus the `info` and `pack` directories.

2. Inspect a Commit Object

Find a commit hash and run:

git cat-file -p <commit-hash>

It prints the commit details.

3. Check a Tree Object (File Snapshot)

Grab the tree hash from the commit and run:

git cat-file -p <tree-hash>

Now you see the folder structure!

4. Inspect a Blob (File Content)

Find a file’s blob hash and run:

git cat-file -p <blob-hash>

Boom! The file’s content appears.
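The four steps above can be chained so that no hash has to be copied by hand. This is a sketch in a throwaway repository; the `README` file and its content are placeholders.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo "hello" > README
git add README && git commit -q -m "Added README"

commit=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')        # the snapshot the commit points to
blob=$(git rev-parse HEAD:README)          # the file content inside that snapshot

for id in "$commit" "$tree" "$blob"; do
  printf '%s %s\n' "$(git cat-file -t "$id")" "$id"   # object type, then its ID
done
git cat-file -p "$tree"                    # mode, type, ID, and name of each entry
git cat-file -p "$blob"                    # prints: hello
```

`git cat-file -t` reports the type of any object, which is handy when you are not sure what a hash refers to.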

6. Git’s Garbage Collection and Packfiles

Git compresses objects using packfiles. These bundle multiple objects into a single file for efficiency.

Run:

git gc

Git will repack objects, saving space.

To inspect packfiles:

ls .git/objects/pack

7. What Is a Git Branch? (It’s Just a Pointer!)

Most people assume a branch is a separate folder or copy of files. Nope.

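A Git branch is just a file that contains a commit hash. That’s it. (Git may later move that hash into `.git/packed-refs`; if the `cat` below finds no file, `git rev-parse main` prints the same value.)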

Let’s check it out!

cat .git/refs/heads/main

Example output:

9fceb02b3beecf73c4f0d7b24b3b9d09981fb17e

That’s just a commit hash!

When you create a branch, Git creates a new file under .git/refs/heads/ containing the hash of the commit you branched from. This means branches are cheap: creating one writes a tiny file, and committing just moves the pointer.
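Here is a quick check that a new branch is only a second name for an existing commit. The sketch assumes a throwaway repository using the default loose-file ref storage; the branch name `topic` is made up.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
git commit -q --allow-empty -m "first"

git branch topic                           # create a branch without switching to it

main_tip=$(git rev-parse main)
topic_tip=$(git rev-parse topic)
echo "main:  $main_tip"
echo "topic: $topic_tip"                   # same hash: both names point at one commit
cat .git/refs/heads/topic                  # the branch file holds that same hash
```

Nothing was copied: the only new thing on disk is the small file under `.git/refs/heads/`.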

8. How `HEAD` Works: The Active Branch

HEAD is a special reference that tells Git which branch you’re currently on. (It can also hold a raw commit hash, which is called a detached HEAD.)

Run:

cat .git/HEAD

Example output:

ref: refs/heads/main

This means HEAD is pointing to main. When you switch branches, Git just updates this file.

Try:

git checkout -b new-branch
cat .git/HEAD

Now it shows:

ref: refs/heads/new-branch

No files copied. No magic. Just a simple pointer change.
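HEAD does not have to name a branch. As a sketch (throwaway repository, same branch name as above), compare the file when HEAD is attached to a branch and when it points straight at a commit, a state Git calls a detached HEAD:

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
git commit -q --allow-empty -m "first"

git checkout -q -b new-branch
attached=$(cat .git/HEAD)
echo "$attached"                           # ref: refs/heads/new-branch

git checkout -q --detach                   # point HEAD at the commit itself
detached=$(cat .git/HEAD)
echo "$detached"                           # a raw commit hash, no "ref:" prefix
```

In the detached state new commits are not recorded on any branch, so create a branch before switching away if you want to keep them.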

9. What Happens When You Switch Branches?

Let’s break it down:

Git updates HEAD to point to the new branch.
Git updates the index and working directory to match the new branch’s commit.
Git keeps uncommitted changes that don’t conflict; if a change would be overwritten, Git refuses to switch instead of discarding it.

Try this:

git checkout -b test-branch

Now check:

cat .git/HEAD

Output:

ref: refs/heads/test-branch

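Because test-branch was just created from the current commit, your working files stay the same. When you switch to a branch that points at a different commit, Git also updates your working files to match it.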

10. How Merging Works Under the Hood

When you merge branches, Git looks for a common ancestor (usually the latest shared commit).

Example:

A ← B ← C (main)
     ↖
      D ← E (feature)

If you merge feature into main, Git finds commit B (the last shared commit), then combines the changes made on each side since B (C on main; D and E on feature).

Merge Types

Fast-forward merge
- If no new commits exist on main, Git just moves the branch pointer forward.
- Example:
  git checkout main
  git merge feature
  This just updates main to point to E.
Three-way merge
- If both main and feature have new commits since the common ancestor, Git creates a merge commit with two parents.
  git merge feature
  The merge commit records the combined result of both sides.
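Both merge types can be reproduced in a scratch repository. This is a sketch; the commit labels follow the diagram loosely and the file names are invented so that the two sides never conflict.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo base > base.txt && git add . && git commit -q -m "A"
git checkout -q -b feature
echo f > feature.txt && git add . && git commit -q -m "D"

# Fast-forward: main has no new commits, so its pointer simply moves to D.
git checkout -q main
git merge -q feature
main_after_ff=$(git rev-parse main)
feature_tip=$(git rev-parse feature)

# Three-way: both sides add a commit, so a merge commit is needed.
git checkout -q feature
echo f2 >> feature.txt && git commit -q -am "E"
git checkout -q main
echo m > main.txt && git add . && git commit -q -m "C"
base=$(git merge-base main feature)        # the last shared commit (D here)
git merge -q --no-edit feature
second_parent=$(git rev-parse 'HEAD^2')    # a merge commit has a second parent
echo "merge base:    $base"
echo "second parent: $second_parent"
```

After the fast-forward, `main` and `feature` are the same commit; after the three-way merge, `HEAD^2` is the tip of `feature`.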

11. How Git Deletes and Recovers Branches

Deleting a Branch

A branch is just a file, so deleting it is easy:

git branch -d feature

This deletes .git/refs/heads/feature. Git refuses if feature contains commits that have not been merged yet.

To force delete:

git branch -D feature

Recovering a Deleted Branch

If you deleted a branch but need it back:

Find the last commit hash:
git reflog

Restore the branch:

git checkout -b feature <commit-hash>

Boom! The branch is back.
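Here is the whole delete-and-recover cycle as a sketch in a throwaway repository. The commit message `work on feature` is only a marker used to find the right reflog entry; in practice you would read the reflog and pick the hash yourself.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
git commit -q --allow-empty -m "base"
git checkout -q -b feature
git commit -q --allow-empty -m "work on feature"
lost=$(git rev-parse HEAD)

git checkout -q main
git branch -D feature >/dev/null           # the branch file is gone, the commit is not

# HEAD's reflog still records the commit we made on the deleted branch.
found=$(git reflog --format='%H %gs' | grep 'work on feature' | head -n 1 | cut -d' ' -f1)
git branch feature "$found"                # recreate the pointer
echo "recovered feature at $found"
```

`git branch feature <hash>` restores the branch without switching to it; `git checkout -b` does the same and switches as well.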

12. How Git Rebase Works Internally

Rebasing is one of Git’s most misunderstood features. Instead of merging branches, it rewrites history by replaying commits onto a new base.

Example scenario:

A ← B ← C (main)
      ↖
       D ← E (feature)

If we run:

git checkout feature
git rebase main

Git does this internally:

Finds the common ancestor (B).
Replays feature’s commits (D and E) on top of main one by one, creating a new commit for each.
Updates feature to point to the new commit history.

The result:

A ← B ← C ← D' ← E' (feature)

Rebasing rewrites commit hashes, creating new commits D' and E'. This is why you shouldn’t rebase shared branches—it changes history!
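The hash change is easy to observe. This sketch uses a throwaway repository with a single feature commit D, and files invented so that nothing conflicts.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo b > b.txt && git add . && git commit -q -m "B"
git checkout -q -b feature
echo d > d.txt && git add . && git commit -q -m "D"
git checkout -q main
echo c > c.txt && git add . && git commit -q -m "C"

old=$(git rev-parse feature)
git checkout -q feature
git rebase -q main
new=$(git rev-parse feature)
echo "D  = $old"
echo "D' = $new"                           # different hash, same change
git log --format=%s main..feature          # prints: D
```

D' has C as its parent instead of B, and since the parent ID is part of what gets hashed, the commit ID has to change.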

13. The Git Reflog: Your Undo Button

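Git rarely loses commits, even ones you thought you deleted. Every move of HEAD or a branch is logged locally in the reflog (git reflog). Entries are kept for a limited time (90 days by default, 30 for commits no longer reachable from any branch) before garbage collection may remove them.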

Run:

git reflog

Example output:

9fceb02 HEAD@{0}: commit: Added README
2d4f7b6 HEAD@{1}: checkout: moving from feature to main

This shows recent actions, like branch checkouts and commits.

If you accidentally reset past a commit, you can move the branch back to an earlier reflog entry (note that --hard also discards uncommitted changes):

git reset --hard HEAD@{1}

Or restore a deleted branch:

git checkout -b feature 2d4f7b6

Git is hard to break, thanks to the reflog.

14. How Git Stash Works Internally

When you run:

git stash

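Git doesn’t create a branch. Instead, it:

Saves your uncommitted changes as commit objects (one for the index, one for the working directory) and points refs/stash at the result.
Resets the index and working directory to match HEAD. HEAD itself does not move.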

The newest stash is stored in refs/stash; older ones live in that ref’s reflog:

cat .git/refs/stash

To view them:

git stash list

To restore:

git stash apply

Each stash is a commit, so you can even inspect them:

git stash show -p
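To confirm that a stash entry really is a commit, here is a sketch in a throwaway repository with a single placeholder file, `file.txt`:

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo v1 > file.txt && git add . && git commit -q -m "base"

echo v2 > file.txt                         # an uncommitted edit
git stash -q

type=$(git cat-file -t refs/stash)         # the stash entry is a commit object
status=$(git status --porcelain)           # empty: working tree matches HEAD again
echo "refs/stash is a $type"
git log -1 --format='parents: %P' refs/stash   # the HEAD commit plus a commit of the index

git stash pop -q                           # bring the edit back
cat file.txt                               # prints: v2
```

The stash commit has two parents, which is how Git keeps the staged and unstaged parts of your work apart.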

15. Packfiles: How Git Optimizes Storage

If you check .git/objects, you’ll see lots of small files.

Over time, Git packs these into a single compressed file called a packfile.

Check packfiles:

ls .git/objects/pack

To manually optimize storage:

git gc

Git then:

Compresses objects into fewer files.
Stores similar objects as deltas against one another.
Reduces repository size.

This is why Git repos stay efficient, even with thousands of commits.
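You can watch loose objects move into a packfile with `git count-objects`. This is a sketch in a throwaway repository; `notes.txt` is a placeholder file.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
for i in 1 2 3; do
  echo "line $i" >> notes.txt
  git add notes.txt && git commit -q -m "commit $i"
done

git count-objects -v | grep -E '^(count|in-pack):'   # before: objects are loose
git gc -q
loose=$(git count-objects -v | sed -n 's/^count: //p')
packed=$(git count-objects -v | sed -n 's/^in-pack: //p')
echo "after gc: loose=$loose packed=$packed"
```

In `git count-objects -v` output, `count` is the number of loose objects and `in-pack` is the number stored in packfiles.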

16. Git Garbage Collection: Cleaning Up Unused Objects

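Git eventually removes orphaned objects (like old commits no longer referenced by any branch, tag, or reflog entry). By default, garbage collection only prunes such objects once they are older than two weeks.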

To see unreachable objects:

git fsck --unreachable

Run garbage collection manually:

git gc --prune=now

This removes objects that no ref or reflog entry still points to, including:

Orphaned commits
Old packfiles
Unreferenced blobs

If you accidentally delete a commit before garbage collection runs, you can still find it with git reflog.
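The simplest way to watch pruning happen is with an object that nothing references. This sketch writes such a blob directly with `git hash-object -w` in a throwaway repository; the blob's content is arbitrary.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
git commit -q --allow-empty -m "base"

# Write a blob that no commit, tree, or ref points to.
orphan=$(echo "nobody references me" | git hash-object -w --stdin)
git fsck --unreachable 2>/dev/null | grep "$orphan"   # listed as an unreachable blob

git gc -q --prune=now
if git cat-file -e "$orphan" 2>/dev/null; then pruned=no; else pruned=yes; fi
echo "pruned=$pruned"
```

Without `--prune=now` the blob would survive this run, because it is younger than the default grace period.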

17. Git Bisect: Debugging Like a Time Traveler

Ever had a bug that wasn’t there yesterday? Instead of manually checking old commits, Git can automatically find the exact commit where the bug was introduced.

How It Works

git bisect performs a binary search on your commit history.

Start bisect mode:
git bisect start
Mark a good commit:
git bisect good <commit-hash>
Mark a bad commit:
git bisect bad HEAD
Git will now check out the midpoint commit and ask you to test.
- If the commit is good, run:
  git bisect good
- If the commit is bad, run:
  git bisect bad
Git repeats this process until it finds the first bad commit.
When finished, reset:
git bisect reset

This automates debugging by letting Git find exactly where the problem started.
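If the test can be scripted, `git bisect run` answers good or bad for you. This sketch builds eight commits in a throwaway repository and plants the "bug" in commit 5; the files `state.txt` and `counter.txt` are invented for the demo.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo good > state.txt && git add . && git commit -q -m "commit 1"
first=$(git rev-parse HEAD)
for i in 2 3 4 5 6 7 8; do
  if [ "$i" -eq 5 ]; then echo bad > state.txt; fi      # the bug lands in commit 5
  echo "$i" > counter.txt
  git add . && git commit -q -m "commit $i"
  if [ "$i" -eq 5 ]; then culprit=$(git rev-parse HEAD); fi
done

git bisect start HEAD "$first" >/dev/null               # bad = HEAD, good = first commit
# The test command: exit 0 means "good", exit 1 means "bad".
git bisect run grep -q good state.txt >/dev/null 2>&1
found=$(git rev-parse refs/bisect/bad)                  # left on the first bad commit
git bisect reset >/dev/null 2>&1
echo "first bad commit: $found"
```

Any command works as the test, for example your test suite, as long as its exit code tells good from bad.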

18. Worktrees: Multiple Checkouts at Once

Ever wanted to work on two branches at the same time without stashing or committing? That’s what git worktree does.

How Worktrees Work

A Git worktree is another checkout of the same repository but in a different folder.

Creating a Worktree:

git worktree add ../feature-branch feature

This creates a new folder ../feature-branch where you can work on the feature branch without switching your main repo.

Listing Worktrees:

git worktree list

Removing a Worktree:

git worktree remove ../feature-branch

This is super useful for working on multiple branches without constantly switching.

19. Bare Repositories: What’s Inside Remote Git Repos?

When you run:

git clone git@github.com:user/repo.git

The copy on the server is typically a bare repository; your local clone gets a working directory on top of the same Git data.

What’s a Bare Repository?

A bare repo has no working directory—just the Git data.

To create one:

git init --bare myrepo.git

A bare repo contains only what would normally live inside the .git directory:

myrepo.git/
 ├── HEAD
 ├── refs/
 ├── objects/
 ├── hooks/

This is what GitHub, GitLab, and other remote services use to store repos centrally.

Why Use a Bare Repo?

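Collaboration: Remote repositories need to accept pushes, and by default Git refuses a push to the checked-out branch of a regular (non-bare) repo, because that would leave its working directory out of sync.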
Centralized Storage: CI/CD systems often use bare repositories for automation.

20. Git Submodules: Repositories Inside Repositories

Sometimes, you need to include another Git repo inside your own (e.g., a shared library). That’s where submodules come in.

Adding a Submodule

git submodule add https://github.com/some/library.git libs/library

This creates:

libs/library/   # A separate Git repo
.gitmodules     # Tracks submodule settings

Cloning a Repo With Submodules

By default, submodules aren’t cloned. To fix this:

git clone --recurse-submodules <repo-url>

Or if you forgot:

git submodule update --init --recursive

How Submodules Work Internally

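Submodules aren’t stored as normal files. Git records them in two places. The .gitmodules file maps each submodule’s path to its URL:

cat .gitmodules

The commit to check out is recorded in the parent repo’s tree as a special entry with mode 160000 (a “gitlink”); git ls-tree HEAD libs/library shows it.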

21. Git Hooks: Automate Everything

Git has built-in automation via hooks—scripts that run before/after Git commands.

Where Hooks Live

Hooks are stored in .git/hooks/:

ls .git/hooks

Common Hooks:

| Hook | Runs When? |
| --- | --- |
| `pre-commit` | Before a commit is created |
| `pre-push` | Before a `git push` |
| `commit-msg` | When a commit message is entered |
| `post-merge` | After a successful merge |

Example: Preventing Bad Commit Messages

Create .git/hooks/commit-msg:

#!/bin/sh
if ! grep -qE "^(feat|fix|docs|chore):" "$1"; then
  echo "Commit message must start with feat:, fix:, docs:, or chore:"
  exit 1
fi

Make it executable:

chmod +x .git/hooks/commit-msg

Now Git rejects bad commit messages!
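To try the hook without touching a real project, this sketch installs the same script in a throwaway repository and attempts two commits. It assumes no global `core.hooksPath` setting redirects hooks elsewhere; the commit messages are examples.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
if ! grep -qE "^(feat|fix|docs|chore):" "$1"; then
  echo "Commit message must start with feat:, fix:, docs:, or chore:"
  exit 1
fi
EOF
chmod +x .git/hooks/commit-msg

# Git passes the path of the message file to the hook as $1.
if git commit -q --allow-empty -m "update stuff" >/dev/null 2>&1; then bad=accepted; else bad=rejected; fi
if git commit -q --allow-empty -m "fix: handle empty input" >/dev/null 2>&1; then good=accepted; else good=rejected; fi
echo "bad message: $bad, good message: $good"
```

Hooks live inside `.git`, so they are not copied by `git clone`; each collaborator has to install them.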

22. What Is a Git Patch?

A Git patch is a text-based representation of a commit (or multiple commits). Instead of pushing/pulling, you can export a change as a .patch file and apply it elsewhere.

Think of it like a portable commit—you can send it via email, copy it to another machine, or even manually review it.

Example workflow:

Generate a patch file (git format-patch).
Send it to someone (email, Slack, etc.).
Apply it on another repository (git am, or git apply for a plain diff).

This is how Linux kernel development and many open-source projects handle contributions!

23. Creating a Git Patch

To generate a patch for the last commit:

git format-patch -1

This creates a .patch file like:

0001-Added-feature-X.patch

To generate patches for the last three commits:

git format-patch HEAD~3

This creates one .patch file per commit.

To create a single patch from uncommitted changes:

git diff > my_changes.patch

This captures unstaged changes only; use git diff HEAD to include staged changes as well.

24. Structure of a Git Patch File

A .patch file is just plain text! Let’s break down an example:

From 4f5a4e7c1342b5a1b6c6ff3a0b2f1d4a52a3d5b8 Mon Sep 17 00:00:00 2001
From: John Doe <johndoe@example.com>
Date: Sat, 1 Mar 2025 14:30:00 -0700
Subject: [PATCH] Fix typo in README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3b18b12..6f7e8f1 100644
--- a/README.md
+++ b/README.md
@@ -1 +1 @@
-Hello Git Usres!
+Hello Git Users!

Breakdown:

Metadata (Commit Info)
- From <hash> → Commit hash (the first line, which has no colon)
- From: → Author
- Date: → Timestamp
- Subject: → Commit message
File Change Summary
- Shows the number of insertions/deletions.
Unified Diff Format (diff --git)
- index → Shows blob hashes before/after the change.
- --- and +++ → Indicates file modifications.
- @@ → Hunk header: the starting line and line count in the old and new file.

25. Applying a Git Patch

To apply a patch:

git apply 0001-Added-feature-X.patch

This applies the change but does not commit it.

To apply and commit:

git am 0001-Added-feature-X.patch

The am (apply mailbox) command preserves the original commit message and author.
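Here is the round trip from `git format-patch` to `git am` as a sketch. It uses two throwaway repositories, `a` and `b`, in a temporary directory; the names and identities are placeholders.

```sh
set -eu
work=$(mktemp -d) && cd "$work"
git init -q a
git -C a symbolic-ref HEAD refs/heads/main
git -C a config user.name "John Doe" && git -C a config user.email "johndoe@example.com"
echo "Hello Git Usres!" > a/README.md
git -C a add README.md && git -C a commit -q -m "Add README"

git clone -q a b                           # a second repo with the same starting point
git -C b config user.name "Maintainer" && git -C b config user.email "maintainer@example.com"

echo "Hello Git Users!" > a/README.md
git -C a commit -q -am "Fix typo in README"
git -C a format-patch -q -1 -o "$work/patches"   # writes 0001-Fix-typo-in-README.patch

git -C b am -q "$work"/patches/*.patch
git -C b log -1 --format='%an | %cn | %s'  # author from the patch, committer is local
```

The author comes from the patch while the committer is whoever ran `git am`, which is how mailing-list workflows give credit to contributors.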

26. How Git Stores and Processes Patches

Internally, a patch file is just a diff. When you run:

git diff > changes.patch

Git runs the diff algorithm to compute differences between the index (staging area) and your working directory. With nothing staged, that is the same as comparing against the latest commit.

When applying a patch, Git:

Reads the diff data.
Finds the target file(s).
Applies changes line by line.
Checks for conflicts (if needed).

What Happens If a Patch Fails?

If a patch doesn’t match the current state, you get:

error: patch failed: README.md:1
error: README.md: patch does not apply

This means the file changed since the patch was created. You’ll need to manually fix conflicts before retrying.

27. Git Patches vs Cherry-Picking

Another way to move changes between branches is git cherry-pick:

git cherry-pick <commit-hash>

This applies a commit from one branch to another.

| Feature | Git Patch | Cherry-Pick |
| --- | --- | --- |
| Requires commit history? | ❌ No | ✅ Yes |
| Can be shared via email? | ✅ Yes | ❌ No |
| Preserves author info? | ✅ Yes | ✅ Yes |
| Can apply multiple commits? | ✅ Yes | ✅ Yes |

Patches are more flexible because they work even if the repo history is different.

28. Interactive Patch Editing

Want to apply only part of a patch? Use --reject:

git apply --reject my_changes.patch

This applies as much as possible and creates .rej files for conflicts.

To check whether a patch would apply cleanly:

git apply --check my_changes.patch

This dry-runs the patch without applying it.

Here’s a deep dive into Git Cherry-Picking, starting at section 29.

---
title: "Git Internals: How Cherry-Picking Works Under the Hood"
description: "A detailed look at how Git cherry-picking works, its internals, and when to use it effectively."
slug: "git-cherry-pick-internals"
date: 2017-10-22
image: "post/Articles/IMAGES/37.jpg"
categories: ["Git", "Version Control", "Internals"]
tags: ["Git", "Cherry-Pick", "Commits", "Merge", "Conflict Resolution"]
draft: false
weight: 428
---

# Git Internals: How Cherry-Picking Works Under the Hood

Sometimes, you need to **pick just one commit** from another branch without merging the entire branch. That’s where **Git cherry-picking** comes in.

Cherry-picking lets you **selectively apply commits** from anywhere in your repo’s history, without merging unwanted changes.

In this article, we’ll cover:
- What cherry-picking is and why it’s useful
- How cherry-picking works internally
- How to cherry-pick multiple commits
- Handling conflicts during cherry-picking
- Cherry-picking vs merging vs rebasing

Let’s get started!

---

## 29. What Is Git Cherry-Picking?

Git cherry-picking **applies an individual commit** from another branch to your current branch.

### Example Use Case:
You’re working on `main`, but a bugfix was added to `feature-branch`. You don’t want the **entire branch**, just the bugfix.

Instead of merging, you can **cherry-pick** the fix:

```sh
git cherry-pick <commit-hash>
```

This creates a new commit on your branch with the same changes as <commit-hash>, but without merging the whole branch.

30. How Cherry-Picking Works Internally

When you cherry-pick a commit, Git:

Finds the commit you specified.
Applies the changes from that commit onto your current branch.
Creates a new commit with the same changes but a new hash.

Under the Hood:

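A cherry-pick is roughly equivalent to the following (Git actually performs a three-way merge with the picked commit’s parent as the base, which is why conflicts can occur):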

Running git diff <commit-hash>^ <commit-hash> to get the changes.
Applying those changes to the working directory.
Creating a new commit with those changes.

Example:

A ← B ← C ← D (main)
      ↖
       E ← F ← G (feature)

If you run:

git cherry-pick F

The result:

A ← B ← C ← D ← F' (main)
      ↖
       E ← F ← G (feature)

Even though F already exists in feature, a new commit F' is created on main.
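One way to see that F and F' are different commits carrying the same change is to compare their patch IDs, which are hashes of the diff itself. This is a sketch in a throwaway repository; the files are invented so that the pick applies cleanly.

```sh
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "You" && git config user.email "you@example.com"
echo base > base.txt && git add . && git commit -q -m "B"
git checkout -q -b feature
echo e > e.txt && git add . && git commit -q -m "E"
echo f > f.txt && git add . && git commit -q -m "F"
f=$(git rev-parse HEAD)
git checkout -q main
echo d > d.txt && git add . && git commit -q -m "D"

git cherry-pick "$f" >/dev/null            # creates F' on main
f_prime=$(git rev-parse HEAD)

# patch-id hashes the diff, so it is the same for F and F'.
id_f=$(git show "$f" | git patch-id --stable | cut -d' ' -f1)
id_fp=$(git show "$f_prime" | git patch-id --stable | cut -d' ' -f1)
echo "commit IDs: $f vs $f_prime"
echo "patch IDs:  $id_f vs $id_fp"
```

Only F was copied: `e.txt` from commit E does not appear on `main`.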

31. Cherry-Picking Multiple Commits

You can cherry-pick multiple commits at once:

git cherry-pick <commit1> <commit2>

Or a range of commits:

git cherry-pick <start-commit>^..<end-commit>

For example:

git cherry-pick C^..E

This picks C, D, and E.

32. What Happens If a Cherry-Pick Fails?

If Git can’t apply a commit cleanly, it results in a conflict:

error: could not apply <commit-hash>

Git will stop cherry-picking and let you resolve conflicts manually.

Fixing a Cherry-Pick Conflict

Open conflicted files and fix them.
Mark them as resolved:
git add <fixed-file>
Continue the cherry-pick:
git cherry-pick --continue

If you want to cancel:

git cherry-pick --abort

33. Cherry-Picking vs Merging vs Rebasing

Cherry-picking isn’t the only way to move commits between branches. Here’s how it compares to merging and rebasing:

| Feature | Cherry-Picking | Merging | Rebasing |
| --- | --- | --- | --- |
| Selects specific commits? | ✅ Yes | ❌ No | ❌ No |
| Creates new commit hashes? | ✅ Yes | ❌ No | ✅ Yes |
| Maintains original history? | ❌ No | ✅ Yes | ❌ No |
| Can be undone easily? | ✅ Yes (`revert`) | ✅ Yes (`revert`) | ⚠️ No (rewrites history) |

When to Use Cherry-Picking:

✅ When you need only one commit from another branch.
✅ When you don’t want to merge an entire branch.
✅ When applying a hotfix from one branch to another.

34. Automating Cherry-Picking With `-x`

By default, cherry-picked commits don’t track where they came from. To include a reference:

git cherry-pick -x <commit-hash>

This adds a reference like:

(cherry picked from commit 4a5b3c)

Now it’s clear that the commit came from elsewhere!

35. Undoing a Cherry-Pick

If a cherry-pick stopped on a conflict and you want to back out, abort it:

git cherry-pick --abort

If you already committed the cherry-pick:

git revert <commit-hash>

This creates a reverse commit that undoes the cherry-picked changes.
