Git

day 9

<2025-12-27 Sat>

gdb Demonstration

Here is the relevant mailing list discussion that I based this on: https://lore.kernel.org/git/xmqqbjjqslgq.fsf@gitster.g/T/#m7d9288bd28a9f9a51781bf42330a9c15fe9016ff

There is a bug that offers the chance to learn how to use gdb. If you are feeling ambitious, go ahead and attempt solving this on your own before reading my proposed solution.

The bug is that the command `git restore -source branch` (note the single dash before `source`) prints an error message that looks like it contains a typo.

$ git checkout 66ce5f8e8872f0183bb137911c52b07f1f242d13

Go ahead and try it on that commit and then continue reading for a demonstration of how to use gdb to diagnose what is going wrong.

Tutorial

Build git with meson (or use the Makefile). See here.

$ meson setup build/
$ cd build
$ meson compile

Use gdb to find the bug

$ gdb ./git
(gdb) run restore -source branch

Aha! There is the error

fatal: could not resolve ource

which seems like a typo? …

It isn't. Set a breakpoint on the `die` function. Yes, that really is the name of the function responsible for that fatal error. Run until you get there and observe.

(gdb) break die
(gdb) r
Starting program: ~/repos/github.com/git/git/build/git restore -source branch
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Breakpoint 1, die (err=0x5555559d8b6a "could not resolve %s") at ../usage.c:202
202     {
(gdb)

If you have the coding guidelines in your marrow, you will see the reason for the issue right away (further inspection is needed to understand why the message reads `ource` in the first place).

From the `Documentation/CodingGuidelines.adoc` we have:

Enclose the subject of an error inside a pair of single quotes

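among other tips. And that is how we fix it: wrap that `%s` in single quotes, so the format string becomes "could not resolve '%s'" and `ource` is clearly marked as the thing Git could not resolve.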
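
As for why the message says `ource`: as far as I can tell, the single-dash `-source` is read as the short option `-s` with `ource` stuck to it as its value. Git uses its own option parsing code for this, so the sketch below is only an analogy. It uses POSIX `getopts` (my choice, not something Git calls) and a hypothetical `parse_source` helper to show the same short-option convention.

```shell
#!/bin/sh
# Hypothetical helper: report how POSIX getopts splits a "-s<value>" argument.
# This mimics the short-option convention only; it is not Git's own parser.
parse_source() {
    OPTIND=1    # reset so the function can be called more than once
    while getopts "s:" opt "$@"; do
        case "$opt" in
            s) printf 'option=-s value=%s\n' "$OPTARG" ;;
            *) return 1 ;;
        esac
    done
}

parse_source -source branch    # prints: option=-s value=ource
```

With two dashes the same word would be a long option, which is presumably what the user meant to type.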

It feels like I am only scratching the surface of what is possible with gdb, but I am happy to demonstrate some usage based on a trivial patch. I plan to look for more interesting one-line changes and continue this journey in the future.

day 8

<2025-12-26 Fri>

I am not sure how to best use this site. It seems that I need more structure. With that said, I am returning to the baby-git repository and the Decoding Git book in order to understand the initial commit of Git.

My goal is to get through this book before January 22nd, which is when the next semester starts. I've gone through Part 1 already, so I am reviewing that and then moving on to Part 2.

It'd be kind of neat to traverse the entirety of Git's code by starting at the initial commit and checking out the next commit until getting to the HEAD.
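
A rough sketch of how that walk might start, assuming a local clone: `git rev-list --reverse` prints commits oldest first, and `--first-parent` keeps the walk on one line of history instead of wandering into merged branches. The `oldest_first` helper and the `~/src/git` path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list a repository's commits oldest first, one ID per line.
# Argument: path to a repository (defaults to the current directory).
oldest_first() {
    git -C "${1:-.}" rev-list --reverse --first-parent HEAD
}

# Usage sketch (the path is an assumption): visit each commit in turn.
# Every checkout leaves the repository in a detached HEAD state.
#
#   oldest_first ~/src/git | while read -r commit; do
#       git -C ~/src/git checkout --quiet "$commit"
#       # ... read the code as it was at this commit ...
#   done
```

Checking out every commit of a large history this way would take a long time, so in practice I would probably pipe the list through `head` and go a few commits at a time.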

Here is a summary of the history of Git:

https://about.gitlab.com/blog/journey-through-gits-20-year-history/

I used grep in the git.github.io repo in order to find where people were talking about "Emacs", "Magit", "GDB" or "gdb". These tools seem important to me, and I'd like to see if other people are using them. I installed the lei package, but haven't set much up with it or unsubscribed from the mailing list.

day 7

<2025-12-22 Mon>

I am setting up an agenda to outline my studies leading up to March 31st when the GSoC applications are due. I also cloned the git.github.io repo so I can see the developer pages from Emacs. This is important because when they release those microprojects, I want to be ready to work on them, so I will be fetching the changes to this repo on Emacs startup. I found the fun Gitstery repository which is a murder mystery that you can solve by using Git commands.

day 6

<2025-12-21 Sun>

I'm moving all of my private repositories off of GitHub and using Git over SSH to work on them from my laptop.

I learned about `git rebase` and feel enlightened. The words my high school math teacher told me when he found out that I wanted to study computer science echo in my head:

With great power comes great responsibility.

day 5

<2025-12-07 Sun>

Today I learned how to install Git on Windows and helped install Git on a colleague's machine. They use Codium, which seems to be pretty similar to VSCode. It was satisfying to install Git and then see their editor become capable of committing their changes.

day 4

<2025-12-05 Fri>

Found more resources:

day 3

<2025-10-17 Fri>

Using the baby-git repository and the Decoding Git book, I've learned the following:

Git uses the SHA-1 hash function to map file contents to hash values.

There are the following four basic components in Git's initial commit:

  • objects
  • an object database
  • a current directory cache
  • a working directory

Objects

Object types:

  • blob
  • tree
  • commit

An object is an abstraction of data and metadata. It is indexed and referenced through its hash value. The name of an object is its hash value. This hash value is used to refer to and look up the specific content.

The general structure of an object is:

object tag
' '                 (single space)
size of object data (in bytes)
'\0'                (null character)
object binary data

The first part of an object consists of the object metadata. The second part consists of the object data (the binary data). The space and null byte are used to separate the parts. The object tag names the type of object (one of: blob, tree, commit), and the size is the length of the object data in bytes before deflation.
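
Here is a minimal sketch of that layout in shell, using a hypothetical `make_object` helper of my own. It writes the tag, a space, the size, and a NUL byte, then the raw data, and pipes the result through `od -c` so the separator bytes are visible. It skips the deflation step entirely.

```shell
#!/bin/sh
# Hypothetical helper: print "<tag> <size>\0" followed by the file's bytes.
make_object() {
    tag=$1
    file=$2
    size=$(( $(wc -c < "$file") ))     # arithmetic expansion strips padding from wc
    printf '%s %d\0' "$tag" "$size"    # metadata: tag, space, size, NUL
    cat "$file"                        # data: the raw bytes, not deflated here
}

demo=$(mktemp)
printf 'hello\n' > "$demo"             # 6 bytes of object data
make_object blob "$demo" | od -c       # dump the 13 bytes, including the NUL
rm -f "$demo"
```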

Blobs (binary large objects)

Whoa, that just blew my mind: I didn't know blobs are just binary large objects. Any file that the user adds becomes a blob, whether it is a video, plain text, or anything else; the blob holds the file's binary representation. Git generates a blob object that is named, indexed, and referenced through the deflated blob object's SHA-1 hash value.
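
To get a feel for naming by hash, here is a small sketch with hypothetical `sha1_hex` and `blob_id` helpers. One caveat: it follows the scheme current Git uses, which hashes the undeflated `blob <size>\0<content>` bytes. It will therefore not reproduce IDs from the initial commit, which hashes the deflated object as described above. It also assumes either `sha1sum` or `shasum` is installed.

```shell
#!/bin/sh
# Hypothetical helper: hex SHA-1 of stdin, using whichever tool is installed.
sha1_hex() {
    if command -v sha1sum >/dev/null 2>&1; then sha1sum; else shasum -a 1; fi |
        cut -d ' ' -f 1
}

# Hypothetical helper: blob ID the way current Git computes it,
# i.e. SHA-1 over "blob <size>\0" followed by the file's bytes.
blob_id() {
    file=$1
    size=$(( $(wc -c < "$file") ))
    { printf 'blob %d\0' "$size"; cat "$file"; } | sha1_hex
}

demo=$(mktemp)
printf 'hello\n' > "$demo"
blob_id "$demo"    # prints ce013625030ba8dba906f756967f9e9ca394464a
rm -f "$demo"
```

Running `git hash-object` on the same file should print the same ID, which is a handy cross-check.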

Tree object

A tree object contains a list of files added to a repository. Each file has a mode, a path, and a SHA-1 hash. The size of the tree is the sum of the sizes of the file information entries in the tree object data.
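
The size rule can be sketched with a little arithmetic. This assumes each entry is laid out as the mode in octal text, a space, the path, a NUL byte, and a 20-byte binary SHA-1. That layout is my reading of the format and is not spelled out above. `tree_entry_size` and `tree_size` are hypothetical helpers, and `${#path}` counts characters, so this only holds for ASCII paths.

```shell
#!/bin/sh
# Hypothetical helper: size in bytes of one tree entry, ASSUMING the layout
# "<mode> <path>\0" followed by a 20-byte binary SHA-1.
tree_entry_size() {
    mode=$1
    path=$2
    echo $(( ${#mode} + 1 + ${#path} + 1 + 20 ))
}

# Hypothetical helper: sum the entry sizes for a list of "mode path" pairs.
tree_size() {
    total=0
    while [ "$#" -ge 2 ]; do
        total=$(( total + $(tree_entry_size "$1" "$2") ))
        shift 2
    done
    echo "$total"
}

tree_entry_size 100644 hello.txt            # prints 37
tree_size 100644 hello.txt 100755 run.sh    # prints 71
```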

Commit object

A commit contains the hash value of the tree object being committed, the hash values of any parent commit objects specified by the user, metadata about the user who committed the tree, the time and date when the commit was made, and a user-supplied comment known today as the commit message.

day 2

<2025-10-14 Tue>

Today I am exploring the Git source code and trying to figure out how things work.

Finding list of commands

Git has lots of commands. Here is how you can find where the commands are in the source. I used the command

grep -nr "list of commands"

to find that there is a list of commands in the `git.c` file:

Documentation/MyFirstContribution.adoc:220:The list of commands lives in `git.c`.

In that file is the list of commands. Here are the first and last five:

"add"
"am"
"annotate"
"apply"
"archive"
...
"verify-tag"
"version"
"whatchanged"
"worktree"
"write-tree"

How does `git add` work?

Let's focus on a command I've probably used hundreds of times already:

git add

We can find the following in `builtin/add.c`:

static struct option builtin_add_options[] = {
  OPT__DRY_RUN(&show_only, N_("dry run")),
  OPT__VERBOSE(&verbose, N_("be verbose")),
  OPT_GROUP(""),
  OPT_BOOL('i', "interactive", &add_interactive, N_("interactive picking")),
  OPT_BOOL('p', "patch", &patch_interactive, N_("select hunks interactively")),
  OPT_DIFF_UNIFIED(&add_p_opt.context),
  OPT_DIFF_INTERHUNK_CONTEXT(&add_p_opt.interhunkcontext),
  OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")),
  OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files"), 0),
  OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")),
  OPT_BOOL(0, "renormalize", &add_renormalize, N_("renormalize EOL of tracked files (implies -u)")),
  OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that the path will be added later")),
  OPT_BOOL('A', "all", &addremove_explicit, N_("add changes from all tracked and untracked files")),
  OPT_CALLBACK_F(0, "ignore-removal", &addremove_explicit,
                 NULL /* takes no arguments */,
                 N_("ignore paths removed in the working tree (same as --no-all)"),
                 PARSE_OPT_NOARG, ignore_removal_cb),
  OPT_BOOL( 0 , "refresh", &refresh_only, N_("don't add, only refresh the index")),
  OPT_BOOL( 0 , "ignore-errors", &ignore_add_errors, N_("just skip files which cannot be added because of errors")),
  OPT_BOOL( 0 , "ignore-missing", &ignore_missing, N_("check if - even missing - files are ignored in dry run")),
  OPT_BOOL(0, "sparse", &include_sparse, N_("allow updating entries outside of the sparse-checkout cone")),
  OPT_STRING(0, "chmod", &chmod_arg, "(+|-)x",
             N_("override the executable bit of the listed files")),
  OPT_HIDDEN_BOOL(0, "warn-embedded-repo", &warn_on_embedded_repo,
                  N_("warn when adding an embedded repository")),
  OPT_PATHSPEC_FROM_FILE(&pathspec_from_file),
  OPT_PATHSPEC_FILE_NUL(&pathspec_file_nul),
  OPT_END(),
};

Okay … That is how it works.
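
To see one row of that table from the outside: `OPT__DRY_RUN` is what gives `git add` its `-n`/`--dry-run` flag. The sketch below (with a hypothetical `dry_run_demo` helper) makes a throwaway repository, runs a dry-run add, and then asks `git status` whether anything was actually staged. It assumes `git` and `mktemp` are on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: show that "git add --dry-run" reports but does not stage.
dry_run_demo() {
    repo=$(mktemp -d)
    git init -q "$repo"
    printf 'hello\n' > "$repo/hello.txt"
    git -C "$repo" add --dry-run hello.txt    # the flag declared by OPT__DRY_RUN
    git -C "$repo" status --porcelain         # the file is still untracked
    rm -rf "$repo"
}

# Only run the demo when git is available.
if command -v git >/dev/null 2>&1; then
    dry_run_demo
    # prints:
    #   add 'hello.txt'
    #   ?? hello.txt
fi
```

The `??` in the second line is how `git status --porcelain` marks an untracked file, so the dry run really did leave the index alone.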

day 1

<2025-10-13 Mon>

Here are some resources I'm looking into:

TIL about `git shortlog`

This feature is awesome: you can use it to easily see how many commits each person has made to a repository. With the command

git shortlog -ns

you are able to see who has committed the most to a repository. Here is the output of that command on the Git repository:

27457  Junio C Hamano
 4611  Jeff King
 2390  Johannes Schindelin
 1945  Ævar Arnfjörð Bjarmason
 1824  Nguyễn Thái Ngọc Duy
 1810  Patrick Steinhardt
 1401  Shawn O. Pearce
 1314  René Scharfe
 1203  Elijah Newren
 1118  Linus Torvalds
  954  Michael Haggerty
  902  brian m. carlson

Fascinating. Junio C Hamano is legendary!

git baby steps

<2025-08-24 Sun>

My goal is to improve enough that I can communicate with and contribute to any project.

These notes are meant to serve as a road map for people who come after me and are looking to learn enough to be useful and contribute to Git.