Werner Digital

Technology

Git Guide I - Basic

Old schooler's guide to Git source control.

In 2005, Source Code Management (SCM) got a huge boost made necessary by the growing complexity of managing large public projects like Linux. These tools have gradually matured enough to help almost any software project.

Overview

Why use Git?

Git (and source/version control management software in general) is mostly touted for coordinating work across teams, but it also has incredible value for solo coders and even infrequent coders.

The safety nets that are available in Git can speed development, help find and fix bugs faster, and provide usable change documentation. They help organize and archive your source code, and allow for code pattern re-use in other projects.

Adding cloud services like GitHub, Bitbucket, etc. isn't necessary but extends the usefulness of Git tremendously. These services provide a simple way to keep a centralized master copy and still retain all the advantages of local copies. In addition, they add a simple interface for teams, build/deployment and testing automation, code analysis, issue/bugfix management, documentation, etc.

  • lightweight and portable
    • free command line interface (although GUIs are available)
    • portable across most vendors and operating systems
    • doesn't require constant internet access
    • can run directly from a thumbdrive with no install
    • converts a directory into a repository using normal file structures
    • more information on installing Git
  • code tools like grep, zip/tar, and distribution
    • search for text across the project with familiar unix grep command
    • zip (or tar) directory for releases
    • package repository with history into a single file (git bundle)
    • keep single file backups that can directly clone new repositories
    • easily keep git software on same media to preserve recovery
  • short and long term snapshots
    • basic, unobtrusive, low overhead version and source control
    • your choice of simple housekeeping options, flexible workflows
    • automatic revision history with changes and notes
    • synchronized undo/redo across multiple files/directories
  • easy branching
    • keeping a good working copy while making breaking changes
    • quickly stopping/starting work on a topic
    • working on multiple features simultaneously
    • experiment with much less risk
    • develop workflows that can be used with teams or solo
  • GitHub, Bitbucket, etc.
    • off-site centralization/backup
    • small accounts are free on most of the bigger cloud platforms
    • synchronize working/testing source on multiple devices
    • easy access to automated cloud build/deploy
    • supports team workflows, public domain projects, and private projects

Git Basic Startup


Installing Git

Downloads

Git comes pre-installed on most Linux systems. Installers for Windows, macOS, and Linux/Unix are on the Git downloads page.

Quick Config

At minimum, you should set up a few general things using the git config command. Your identity:

  • git config --global user.name "John Doe"
  • git config --global user.email johndoe@example.com

Your editor, if you don't want to use the system default (usually vi). Here is an editor list with setup instructions.

You can override any value in a specific repository by dropping the --global option. For example:

  • git config user.email johndoe@some-project.com
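The global and per-repository settings above can be tried out safely with a sketch like the following. It assumes a POSIX shell and points HOME at a scratch directory so that your real configuration is left alone; the nano editor and the scratch paths are illustrative choices only. Run it as a script or in a subshell, since it changes HOME for the rest of the session.

```shell
# Sandbox: point HOME at a scratch directory so the demo leaves your real ~/.gitconfig alone.
export HOME="$(mktemp -d)"
export XDG_CONFIG_HOME="$HOME/.config"

git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
git config --global core.editor nano     # nano is only an example editor

# A scratch repository with a per-project override.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email johndoe@some-project.com

git config --global --get user.email     # the global value
git config --get user.email              # the effective value inside this repository
git config --get user.name               # no local override, so the global value applies
```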

More detailed information on installation and customization is in the second part of our guide, under the topic Customizing Installs.

initialize repository

There are two basic commands to initialize a repository: git init, which is covered here, and git clone, which copies an existing repository.

To create a new project, or convert an existing project:

  • cd /path/to/project
  • git init - sets up git hidden files
  • if new project, copy any files you want to start the project with
  • git add . - tracks and adds the files to a staging index
  • git commit -m "Initial commit" - commits files in the staging index with the title "Initial commit"

If your project generates files during build or other automated tasks, consider adding a .gitignore file, which lists files that shouldn't be tracked or stored with the source.
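As a rough sketch of these steps, the script below builds a throwaway repository with a .gitignore in place before the first commit. The file names (hello.sh, build/hello.out) and the ignore patterns are hypothetical, and the symbolic-ref line is just one way to name the first branch main on older Git versions.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"          # an identity is required before committing
git config user.email johndoe@example.com

# Hypothetical project files: one source file and one generated build artifact.
mkdir build
echo 'echo hello' > hello.sh
echo 'generated output' > build/hello.out

# Ignore everything under build/ plus any log files.
printf 'build/\n*.log\n' > .gitignore

git add .
git commit -q -m "Initial commit"
git ls-files                             # lists .gitignore and hello.sh, nothing under build/
```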

More information on staging and commit below.

Status, log, show, diff, whatchanged

status

git status is probably my most used git command. It displays a summary of any files that have been changed, are untracked, or are staged up for the next commit. It also displays info about remote tracking branches that you may need to sync up with using push/pull.

log

git log --oneline --graph -20 is another frequently used command that lists the last 20 commits in history, one line each, with a graphical representation of the branch. It also indicates which commits have references (like branches or tags).

git log branch1..branch2 lists the commits that are on branch2 but not on branch1.
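The two-dot range is directional, which is easiest to see in a small experiment. This sketch assumes a scratch repository and uses branch2 as a hypothetical branch name; the file name and commit titles are made up for the demo.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo one > notes.txt
git add notes.txt
git commit -q -m "initial"

git switch -q -c branch2                 # hypothetical branch name
echo two >> notes.txt
git commit -q -a -m "(feature) second line"
echo three >> notes.txt
git commit -q -a -m "(feature) third line"

git log --oneline --graph -20            # whole history, newest first
git log --oneline main..branch2          # the two commits on branch2 that main lacks
git log --oneline branch2..main          # empty: main has nothing that branch2 lacks
```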

show

git show displays the info in the last commit. Display other commits by adding a commit id. The info also contains the diff from the previous commit.

diff

git diff displays the line-by-line differences for each file that has changed but is not yet staged. To include staged changes as well (everything since the last commit), use git diff HEAD. You can also display the changes between any two commits by specifying up to two commit ids, or anything that can be translated into a commit id (such as branch names, tags, etc).

git diff --cached displays what changes are staged and ready to commit.

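git diff "@{yesterday}" displays what has changed since yesterday, comparing the working tree with where the current branch pointed a day ago (according to the local reflog).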
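Which changes each diff variant reports depends on whether they have been staged. A minimal sketch, assuming a scratch repository and a hypothetical notes.txt file:

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "initial"

echo "line 2" >> notes.txt               # modify the tracked file without staging it
git diff                                 # shows the new line (working tree vs. staging index)
git diff --cached                        # empty: nothing is staged yet

git add notes.txt
git diff                                 # now empty: the change has moved into the index
git diff --cached                        # shows the new line (index vs. last commit)
git diff HEAD                            # staged and unstaged changes together vs. last commit
```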

whatchanged

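git whatchanged --since="2 weeks ago" lists the files that were changed by each commit in the last 2 weeks. The Git documentation encourages new users to use git log instead; git log --stat --since="2 weeks ago" gives a similar per-commit file listing.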

Changes and Commits

Now that the repository is set up, it is ready to start recording changes. The frequency of snapshot recording is up to the individual developer. Common practice guidelines include:

  • Commit early and often - don't wait to perfect things, save your work.
  • Try to make single purpose commits - this makes it easier to review, to document, and to reverse if necessary. It can also help you organize tasks and line them up with project management tools.
  • Use meaningful commit titles. Personally, I include a category prefix to the title such as (fix), (feature), (chore) etc.
  • In contrast to the first guideline, I prefer to only make public commits that compile/build. This may not include linting and testing steps, but helps things like bisect debugging, which is discussed later as an advanced topic.

To commit changes into history, you prepare the staging index to describe the files and changes involved, then define the commit with a descriptive title.

  • git add /path/to/file1
  • git add /path/to/file2
  • git commit -m "fix - corrected issue x"

Staging Overview

Staging offers a chance to organize the files that have changed into a commit, primarily via the git add command. The staging area is often called the index in the Git docs. From the docs:

The "index" holds a snapshot of the content of the working tree, and it is this snapshot that is taken as the contents of the next commit. Thus after making any changes to the working tree, and before running the commit command, you must use the add command to add any new or modified files to the index.

git status shows which files have been changed since the last snapshot, along with which files have been added (and are untracked) and which have been deleted/renamed.

To add a file into the staging area, use the command git add /path/to/filename

You can also add all modified/added files with the command git add . or to see what would have been added use git add . --dry-run

Using staging to clean up

As mentioned above, creating a snapshot with commit does not necessarily update the branch with every file you have changed. You can group sets of non-interdependent changes together to make the change more descriptive.

For instance, every time I touch a react project, I run an npm update first and test it. I can then go ahead and start updating code, knowing that when I am committing the changes I can stage and commit the package.json and package-lock.json into a separate commit to keep each commit focused on related changes.
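The pattern of splitting unrelated edits into focused commits might look like the sketch below. It assumes a scratch repository; the contents of package.json and app.js are placeholders rather than a real npm project, and no npm command is actually run.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

# Hypothetical starting point: a dependency manifest and one source file.
echo '{"version": "1.0.0"}' > package.json
echo 'console.log("v1");' > app.js
git add .
git commit -q -m "Initial commit"

# Two unrelated edits land in the working tree at the same time.
echo '{"version": "1.0.1"}' > package.json
echo 'console.log("v2");' > app.js

git add package.json                     # stage only the dependency change
git commit -q -m "(chore) npm update"
git add app.js                           # then the code change on its own
git commit -q -m "(feature) new greeting"

git log --oneline --name-only -2         # each commit lists a single file
```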

After commit touch ups

Modifying the history after a commit is like changing the past in Git, so you will have to do a little extra work if there are copies or remotes to keep synced up. If you aren't a solo coder, rewriting history is often strongly discouraged, but to keep it simple we will ignore those complications for the moment.

If you want to make a quick touch up to a commit, go ahead and stage the changes, then use the command

  • git commit --amend --no-edit

This will use the same commit title and will replace the commit id with a new one.
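A quick way to see the effect of an amend is to record the commit id before and after. This is a sketch in a scratch repository; greeting.txt and the variable names before and after are hypothetical.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo "helo world" > greeting.txt         # note the typo
git add greeting.txt
git commit -q -m "(feature) add greeting"
before="$(git rev-parse HEAD)"

echo "hello world" > greeting.txt        # the touch up
git add greeting.txt
git commit -q --amend --no-edit
after="$(git rev-parse HEAD)"

git log --oneline                        # still one commit, same title
echo "before: $before"
echo "after:  $after"                    # a different commit id
```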

There are a lot of reasons to do daily or regular small commits, and to break changes into many smaller commits during the main development. After the feature is largely done, it may be handy to consolidate all those changes into fewer commits that will make reading the history more useful. This is a much larger touch up.

It is possible to do very complex history and commit revisions. I use the interactive rebase tool git rebase -i when things get very complicated.
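To give a feel for consolidating commits with an interactive rebase, here is a sketch that folds three small commits into one. Normally the rebase opens a todo list in your editor; to keep the example unattended, the edit is scripted through the GIT_SEQUENCE_EDITOR and GIT_EDITOR environment variables, which is an assumption of this demo rather than the usual interactive workflow. File names and commit titles are hypothetical.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Three small work-in-progress commits for one hypothetical feature.
for n in 1 2 3; do
  echo "part $n" > "part$n.txt"
  git add "part$n.txt"
  git commit -q -m "(feature) part $n"
done

# Interactively you would change "pick" to "squash" on the 2nd and 3rd lines of
# the todo list. Here sed makes that same edit, and "true" accepts the combined
# commit message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -q -i HEAD~3

git log --oneline                        # two commits: the initial one and the combined feature
```

All three part files are still present afterwards; only the way the history describes them has changed.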

More information on rebasing, and the issues (and solutions) involved in rewriting history, is in the next article, Git II - Intermediate.

Using Branches

Git is designed to help manage and organize snapshots without using much space or effort. One of the cleaner ways to do this is with branches.

Underneath the surface, branches are a way to organize sets of commits. Each branch is really only a 40-character pointer in a file (a commit id, which is a hash of the contents and history), but it represents a complete copy of the project directory, along with the commit history that got it to this point.

Each repository has a default branch. We use the name main, but older repositories often use master or prod. You are free to use any unique name that suits you (aside from a few reserved words like HEAD), and you can change it at will. You can also set the default branch name; see custom config (/blog/git-intermediate#custom).

When you create branches it looks like you are making a complete copy of the current state of the branch you are currently in.
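The pointer idea can be checked directly by looking inside the .git directory. This sketch assumes a freshly created scratch repository using Git's default file-based ref storage (refs that have been packed, or other storage backends, will not show up as individual files); experiment is a hypothetical branch name.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo hello > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

cat .git/refs/heads/main                 # the branch is a small file holding one commit id
git rev-parse main                       # the same id, resolved through Git

git branch experiment                    # hypothetical branch name; no project files are copied
cat .git/refs/heads/experiment           # a second pointer to the very same commit
```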

Parallel branches

The concept of parallel branches provides a simple model to keep a clean working production branch while you are introducing features on a development branch. When you are ready to update the production branch, you can apply the commits from the development branch to it with a git merge branchname command.

---> time

init    current
 O------O                     Main Branch
         \------O------O       Development Branch
               chg1   chg2
                      (locally current)

after the merge becomes

init                   current
 O------O-------O------O       Main Branch (production)
         \------O------O        Development Branch

This is referred to as a fast-forward merge. The main branch didn't change (no commits on main that aren't on development) between the time the development branch diverged and was merged back.
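A fast-forward merge like the one in the diagram can be reproduced in a scratch repository. In this sketch, development is a hypothetical branch name and app.txt a placeholder file.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo init > app.txt
git add app.txt
git commit -q -m "init"

git switch -q -c development             # hypothetical branch name
echo chg1 >> app.txt
git commit -q -a -m "chg1"
echo chg2 >> app.txt
git commit -q -a -m "chg2"

git switch -q main
git merge development                    # reports "Fast-forward"
git log --oneline --graph                # a straight line: no merge commit was needed
```

After the merge, main and development point at the same commit.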

You can have more than one development branch running in parallel. These development branches (if successful) are expected to eventually get merged into the main production branch. If other development branches have already been merged into main (or really if main has picked up any commits since branching off into the development branch) the merge is much more complicated behind the scenes, but to the user it is usually just the same git merge command. There are cases where the standard merge cannot resolve without help. These conflicts can be avoided for the most part with workflows, but can require manual intervention. This is discussed later in the advanced merge section.
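When main has moved on since the branch point, the same command produces a merge commit instead of a fast-forward. The sketch below assumes a scratch repository and edits that touch different files, so no conflict arises; development, feature.txt, and hotfix.txt are hypothetical names, and --no-edit simply accepts the default merge message.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo init > app.txt
git add app.txt
git commit -q -m "init"

git switch -q -c development             # hypothetical branch name
echo feature > feature.txt
git add feature.txt
git commit -q -m "(feature) add feature.txt"

git switch -q main
echo hotfix > hotfix.txt                 # main picks up its own commit in the meantime
git add hotfix.txt
git commit -q -m "(fix) add hotfix.txt"

git merge --no-edit development          # same command, but Git records a merge commit
git log --oneline --graph                # the graph shows the two lines joining
```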

Additional basic commands

  • git branch -vv lists existing local branches with extra detail (latest commit and upstream tracking branch)
  • git diff main shows the difference between your current working tree and the branch named main.

More information on branches and related commands is in the next article, Git II - Intermediate.

Basic Workflow Examples

Exploring with branches

Example: you get a crazy idea you want to try out in the code, but don't want to stomp on your working version. In the old days, I would copy /s the entire directory into a backup directory, and work on the code there. When I was done, I could copy the changed files to the regular project directory or just abandon the backup directory and go back to the regular project untouched.

Git simplifies this pattern: (starting from main branch)

  • create a new branch with git switch -c crazy-idea-branch
  • perform necessary code changes, add and remove files, etc.
  • test it, organize/stage/commit changes to crazy-idea-branch
  • go back to main for now with git switch main
  • like the new changes?
    • put changes into main with git merge crazy-idea-branch
    • ditch changes with git branch -D crazy-idea-branch (the lowercase -d refuses to delete a branch that has not been merged)
  • if you can't make up your mind yet, you can always hold on to the branch for a while, maybe start a new branch from main and see which solution you like better. Clean it up whenever.
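The steps above can be sketched end to end in a scratch repository. This version takes the "ditch" path and leaves the "keep" path as comments; app.txt and its contents are placeholders.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git switch -q -c crazy-idea-branch
echo "wild experiment" >> app.txt
git commit -q -a -m "(feature) crazy idea"
git switch -q main                       # app.txt is back to the stable version

# Option 1: keep the idea.
#   git merge crazy-idea-branch
#   git branch -d crazy-idea-branch      # the safe delete works once merged

# Option 2 (taken here): ditch the idea. The branch was never merged,
# so the safe "git branch -d" would refuse; capital -D forces the delete.
git branch -D crazy-idea-branch
git branch --list                        # only main remains
```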

Checkpoints

Example: You can also do the equivalent of making a backup and then continuing to work on the current copy. This can be handy if you have things pointed to the current copy for testing or deployment and don't want to have to duplicate them to test or build in the backup copy. The simple pattern is to: (starting from main branch)

  • start a new branch (but stay on main) with git branch milestone-1
  • perform code changes, add and remove files, etc.
  • test it
  • like the changes?
    • organize/stage/commit changes to main,
    • remove the checkpoint with git branch -d milestone-1
  • don't like the changes?
    • move main back to checkpoint with git reset --hard milestone-1

Note that the last step git reset --hard milestone-1 doesn't require a branch name, and could use a tag, or a commit id (long or short) directly. This means that you could just as easily have done this without using a branch beforehand, simply by looking in the log for the commit id that you want to reset to.
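A sketch of the checkpoint pattern, taking the "don't like the changes" path. It assumes a scratch repository and, for illustration, commits the unwanted change before rolling back; the variable name checkpoint and the file app.txt are hypothetical. Keep in mind that git reset --hard also discards any uncommitted edits to tracked files.

```shell
# Scratch repository so the demo never touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # name the first branch "main" on any Git version
git config user.name "John Doe"
git config user.email johndoe@example.com

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git branch milestone-1                   # checkpoint; we stay on main
checkpoint="$(git rev-parse --short milestone-1)"

echo "v2 (broken)" > app.txt
git commit -q -a -m "(feature) risky change"

# Don't like the changes: move main back to the checkpoint.
git reset -q --hard milestone-1
# Equivalent, using the commit id instead of the branch name:
#   git reset --hard "$checkpoint"

cat app.txt                              # v1
git branch -d milestone-1                # checkpoint no longer needed
```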

Documenting with branches

The command git log --oneline --graph shows the merging history of branches. The pattern of adding features using a branch named after the feature, and chores like npm updates with simple commits, makes a nice self-documenting version history. There are times when I am making a change similar to one made before, and it is nice to be able to quickly find and view the previous changes.

More information on branches and workflows is in the next article, Git II - Intermediate.