Inside Git: A Simple Guide to How Git Works Internally

if you haven’t used git don’t read this article because before reading it you should have at least used git or used some commands

Look, we've all been there. You start using Git because everyone says you have to, and pretty soon you're just typing the same commands over and over

git add .

git commit -m "fix stuff"

git push

It works fine... until it doesn't. One day you get a merge conflict that looks like code you have never seen or you accidentally reset something and lose work, or Git just refuses to do what you want. Suddenly it's frustrating as hell.

the fix isn’t learning commands or just rewriting commands again. It's finally understanding what Git is actually doing behind those commands. Once you get the internals, everything clicks no more guessing, no more fear.

this article will explain how git works internally what is .git folder and what it actually has inside it and how git tracks changes using hash and objects

1. First Things First: Git Doesn't Track Changes like we think

almost everyone you ask or see talking about git they say git tracks changes in your files that’s true but it’s misleading like what you know till now ? or think that git tracks changes line by line that’s not true Git stores full snapshots of your entire project at each commit not differences

every time you commit git

takes a snapshot of tracked files
reuses unchanged data
links it to previous snapshots

let’s take an example and understand it more clearly

suppose i have a shopping.txt

Start: Your first list

File content:

milk

eggs

bread

You commit this (Version 1). Git takes a full photo (snapshot) of the file exactly like this.

Then: You add one thing

You change the file to:

milk

eggs

bread

apple

You commit again (Version 2). Git takes another full photo of the entire file now.

it looks like git saved two different files , but here is the best part about git Git notices that the first three lines ("milk", "eggs", "bread") are exactly the same as before. So it doesn't duplicate those it just reuses the old parts and only saves the new line ("apples").

So even though Git is saving full snapshots, it's smart and only stores the truly new stuff.

Later: You remove something

Now change it back to just:

milk

eggs

Commit (Version 3) Git takes yet another full snapshot but again it reuses the old "milk" and "eggs" parts that already exist from Version 1. It doesn't save full copies again.

Why call it snapshots not changes?

Because when you want to go back to Version 1, Git can instantly give you the complete exact file from that moment no guessing or patching things together. It's like pulling out an old photo that's already perfectly complete. That's why it's fast, safe, and doesn't balloon in size.

2. The .git Folder :-

Run git init in a folder, and Git quietly creates a hidden directory called .git.

Most people ignore it, Big mistake the .git folder is your Git repository, when we run the command git init after that everything that happens in that folder will be tracked and even if you delete the working copy or code files Git can restore them from .git but if you delete the .git your entire history ,branches ,commits are gone your files remain but they are now normal files no longer version controlled

git is like time machine if it knows something it can bring it back, edit it , can see the commits etc

you can go back into past see what you did but if you break the time machine nothing like this will happen

Structure of .git Folder

.git/

├── objects/ # All your data: files, folders, commits stored as "objects"

├── refs/ # Pointers to branches and tags

├── HEAD # Points to your current branch/commit

├── index # The staging area (more on this soon)

├── config # Project-specific Git settings

└── ... # Other stuff like hooks, logs

Version control with Git | PracticalSeries: Brackets-Git and GitHub

This folder is Git's private database. Treat it like the heart of your project.

3. Git's Core Building Blocks: The Three Objects

Git is an object database. Everything important is stored as one of three types of objects, each identified by a unique SHA-1 hash (that long hex string like e69de29bb2d1d6434b8b29ae775ad8c2e48c5391).

Blob:

it stores pure file content, No filename, no path just the raw bytes of the file.

If two files have same content, they share the same blob. Super efficient.
Tree: Represents a directory. It lists filenames and points to blobs (for files) or other trees (for subfolders). This rebuilds your folder structure.

project/

├── index.html

└── style.css

tree object:

index.html —→blob

style.css——>blob

Commit: The snapshot itself. Points to one root tree your whole project, plus metadata: parent commits, author, date, message.

it contains
reference to a tree
reference to a parent commit

Every commit → points to a tree → which points to blobs (and sub-trees).

Git - Git Objects

Git Book - The Git Object Model

Commit C1 |

└── Tree

├── app.js → Blob

└── index.html → Blob

every commit forms a chain through parent commit

4. How Tracking Changes Actually Works

Since Git uses snapshots:

When a file changes, Git creates a new blob for the new content if the files are unchanged reuse the old blob
The new commit gets a new tree that points to the mix of old + new blobs.

Git detects changes by comparing hashes. Same hash = same content = no duplication.

Lets discuss what is hash?

this term often seems confusing but once you understand it all makes sense i hope i can make you understand

i hope you know about ASCII value (google it) or go to this website and read about it https://www.w3schools.com/charsets/ref_html_ascii.asp let’s design a hash function from it

hash= ΣASCII(char)* position

ASCII(char)=ASCII value of the character

position=at what place the character is we position starts from one like in cat position of c=1

let’s say we take cat

c ASCII(99) 1 = 99

a ASCII(97) 2 = 194

t ASCII(166) 3 = 348

Σ it means summation so add all the three values 99+194+348=641

hash(“cat”)=641 now change the letter the value will change

now git doesn’t use such a simple hash function it it takes time author etc.. like it’s much more complex when you change a simple text in the file hash value changes aggressively

5. The Famous Three Areas

Forget memorizing commands. Think in areas

Working Directory: Your actual files on disk. What you edit in your editor.
Staging Area (Index): A "preview" of your next commit. Lives in .git/index.

Coding Career Advice: Using Git for Version Control Effectively ...

Repository : Permanent storage in .git/objects. Your commit history.

The flow:

You edit files → git add moves them to staging → git commit saves the staging snapshot as a commit in the repo.

Git Gud: The Working Tree, Staging Area, and Local Repo | by Lucas ...

6. What Really Happens When You Run git add and git commit

Let's trace it with an example. You modify app.js.

git add app.js:

Git reads the file content.
Computes its hash → creates a new blob object and stores it in .git/objects/.
Updates the index (staging area) to include this blob (with filename/path) for the next commit.

index is a temporary area that stores what will go into next commit

git add doesn’t create a commit it prepares a snapshot

git commit:

when you run git commit -m “message”

it reads the current index and builds tree objects from it.
Creates a new commit object pointing to that root tree, linking to the previous commit.
Updates your current branch (in .git/refs/heads/main) to point to the new commit.
Moves HEAD to stay on the branch.

nothing is deleted nothing is overwritten

Git, GitHub, & Workflow Fundamentals - DEV Community

Nothing gets overwritten. Old commits stay forever until garbage collected much later

7. HEAD and Branches: Just Pointers

HEAD: it is a pointer which tells git this is the current commit i’m on

usually HEAD—> main ——> latest commit

when you commit HEAD moves forward

when you checkout HEAD moves backward
Branches: Tiny files in .git/refs/heads/ containing a commit hash.

Creating a branch? Just make a new pointer to the current commit. Switching? Move HEAD.

That's why branching is basically free.

8. Now Git will make sense

Once you understand how Git works internally, Git stops feeling scary or random.
Things that earlier felt like errors now feel logical.

Let’s break this down one by one in plain language.

Why Merge Conflicts Suddenly Make Sense

Before understanding internals, a merge conflict feels like

earlier when merge conflicts used to occur we were like what went wrong etc…

Now think in Git’s language:

You have two commits
Both commits changed the same file
Each change created a different blob
Git sees two different versions of the same content

Git’s problem:

since you have changed different blobs of the same file git is confused which blog you want to keep

So Git asks you to decide, A merge conflict is not an error it’s Git asking for human input when logic isn’t enough

What `git reset` Really Does

Earlier, git reset sounded like confusing and it went over the head now after understanding that HEAD is just a pointer

git reset moves that pointer backward or forward

Depending on the option:

It may also update the staging area
Or even your working files

Lost Commits Are Usually Not Lost

Sometimes you think:

I lost my commit. It’s gone forever.

Internally:

Commits are stored as objects in .git/objects
Even if a branch no longer points to them
The data still exists

That’s where this command helps:

git reflog

reflog shows:

Where HEAD was earlier
Which commits you visited

Git keeps a record of your moves
You can usually recover work if you know where to look

summary

Git saves complete pictures of your project every commit It smartly reuses the old data via hashes no wasteful copies
The .git folder is everything: Your code is just the current view. All history commits branches live in this hidden folder
Three simple objects power it all:
- Blob: Raw file content
- Tree: Directory structure (names + pointers)
- Commit: Snapshot (points to tree + parent)
Three areas mental model:
- Working Directory (your files)
- Staging Area (what's prepped for commit)
- Repository (permanent history in .git)
git add → git commit flow: Add builds blobs and updates staging. Commit creates trees + new commit object, advances your branch.
Git isn't scary magic. It's a clever system of snapshots built from blobs/trees/commits, stored safely in .git with cryptographic hashes. Understand this once and merge conflicts, resets, branches all of it just make sense

thanks for reading :)

Inside Git: How It Really Works