How Does Git Work
Git (/ɡɪt/) is a distributed version-control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows. If you want to create a local branch based on a remote-tracking branch (i.e. in order to actually work on it), you can do that with git branch --track or git checkout --track -b, which is similar but also switches your working tree to the newly created local branch.
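To make the tracking-branch commands a bit more tangible, here is a minimal sketch. It builds a throwaway "remote" repository in a temporary directory, clones it, and then creates a local branch from a remote-tracking branch. The directory layout, the branch name feature and the user identity are all made up for the illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A throwaway repository that plays the role of the remote (hypothetical names).
git init -q "$work/remote"
cd "$work/remote"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "add file"
git branch feature                        # the branch we want to work on later

# Clone it: the clone sees "feature" only as the remote-tracking branch origin/feature.
git clone -q "$work/remote" "$work/clone"
cd "$work/clone"

# Create a local branch based on the remote-tracking branch and switch to it.
git checkout -q --track -b feature origin/feature

current=$(git rev-parse --abbrev-ref HEAD)
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')
echo "on branch: $current, tracking: $upstream"
```

Using git branch --track feature origin/feature instead would create the same local branch but leave you on master.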
I’m working with Git now, but only for my personal projects and those I have on GitHub. At work we still use TFS and SVN (as of now). Recently a trainer came to our company to hold a course about Agile planning, and since Git was quite new to most of my mates, he also quickly explained Git in the context of refactoring. I really liked his way of explaining it, and that’s why I’d like to replicate his explanation here.
Just before we start. How is Git different from other VCS (Version Control Systems)? Probably the most obvious difference is that Git is distributed (unlike SVN or TFS for instance). This means you’ll have a local repository which lives inside a special folder named .git, and you’ll normally (but not necessarily) have a remote, central repository where different collaborators may contribute their code. Note that each of those contributors has an exact clone of the repository on their local workstation.

Git itself can be imagined as something that sits on top of your file system and manipulates files. Even better, you can imagine Git as a tree structure where each commit creates a new node in that tree. Nearly all Git commands actually serve to navigate this tree and to manipulate it accordingly.

As such, in this tutorial I’d like to take a look at how Git works by viewing a Git repository from the point of view of the tree it constructs. To do so I walk through some common use cases like:
adding/modifying a new file
creating and merging a branch with and without merge conflicts
viewing the history/changelog
performing a rollback to a certain commit
sharing/syncing your code to a remote/central repository

Before starting here, I highly recommend to first go through the initial pages of the, especially the.
Illustration of the main three states of your Git versioned file’s lifecycle

Terminology
Here’s the git terminology:
master - the repository’s main branch. Depending on the workflow it is the one people work on or the one where the integration happens.
clone - copies an existing git repository, normally from some remote location to your local environment.
commit - submitting files to the repository (the local one); in other VCS it is often referred to as “checkin”.
fetch or pull - is like “update” or “get latest” in other VCS. The difference between fetch and pull is that pull combines both: it fetches the latest code from a remote repo and also performs the merge.
push - is used to submit the code to a remote repository.
remote - these are “remote” locations of your repository, normally on some central server.
SHA - every commit or node in the Git tree is identified by a unique SHA key. You can use them in various commands in order to manipulate a specific node.
head - is a reference to the node to which our working space of the repository currently points.
branch - is just like in other VCS, with the difference that a branch in Git is actually nothing more special than a particular label on a given node.
It is not a physical copy of the files as in other popular VCS.

Workstation Setup
I do not want to go into the details of setting up your workstation as there are numerous tools which partly vary on the different platforms. For this post I perform all of the operations on the command line. Even if you’re not the shell-guy you should give it a try (it never hurts ;) ).

To set up command line Git access simply go to where you’ll find the required downloads for your OS. More detailed information can be found.

After everything is set up and you have “git” in your PATH environment variable, the first thing you have to do is to configure git with your name and email:
$ git config --global user.name 'Juri Strumpflohner'
$ git config --global user.email 'your email address here'

Let’s get started: Create a new Git Repository
Before starting, let’s create a new directory where the git repository will live and cd into it:
$ mkdir mygitrepo
$ cd mygitrepo

Now we’re ready to initialize a brand new git repository.
$ git init
Initialized empty Git repository in c:/projects/mystuff/temprepos/mygitrepo/.git/

We can check the current status of the git repository by using
$ git status
# On branch master
#
# Initial commit
#
nothing to commit (create/copy files and use 'git add' to track)

Create and commit a new file
The next step is to create a new file and add some content to it.
$ touch hallo.txt
$ echo Hello, world! > hallo.txt

Again, checking the status now reveals the following
$ git status
# On branch master
#
# Initial commit
#
# Untracked files:
#   (use 'git add <file>...' to include in what will be committed)
#
#       hallo.txt
nothing added to commit but untracked files present (use 'git add' to track)

To “register” the file for committing we need to add it to git using
$ git add hallo.txt

Checking the status now indicates that the file is ready to be committed:
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use 'git rm --cached <file>...' to unstage)
#
#       new file:   hallo.txt
#

We can now commit it to the repository
$ git commit -m 'Add my first file'
1 file changed, 1 insertion(+)
create mode 100644 hallo.txt

It is common practice to use the present tense in commit messages. So rather than writing “added my first file” we write “add my first file”.

So if we now step back for a second and take a look at the tree we would have the following.
State of the repo tree after 1st commit
There is one node where the “label” master points to.

Add another file
Let’s add another file:
$ echo "Hi, I'm another file" > anotherfile.txt
$ git add .
$ git commit -m 'add another file with some other content'
1 file changed, 1 insertion(+)
create mode 100644 anotherfile.txt

Btw, note that this time I used git add . which adds all files in the current directory (.).

From the point of view of the tree we now have another node and master has moved on to that one.

Create a (feature) branch
Branching and merging is what makes Git so powerful and what it has been optimized for, being a distributed version control system (VCS).
Indeed, feature branches are quite popular with Git. A feature branch is created for every new piece of functionality you’re going to add to your system, and it is normally deleted once the feature is merged back into the main integration branch (normally the master branch). The advantage is that you can experiment with new functionality in a separate, isolated “playground” and quickly switch back and forth to the original “master” branch when needed. Moreover, a feature branch can easily be discarded again (in case it is not needed) by simply dropping it. There’s a nice article on which you should definitely read.

But let’s get started.
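Before walking through it step by step, here is the whole “playground” idea condensed into one small script. Take it as a rough sketch rather than a recipe: the temporary repository, the file app.txt, the branch name experiment and the user identity are invented for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "add app"

# Open the playground: the new branch starts as a second label on the same commit.
git checkout -q -b experiment
if [ "$(git rev-parse master)" = "$(git rev-parse experiment)" ]; then
    same_start=yes
else
    same_start=no
fi

echo "risky idea" >> app.txt
git commit -q -a -m "try a risky idea"

# Switch back: the working directory shows master's content again.
git checkout -q master
on_master=$(cat app.txt)

# The experiment is not needed after all, so we drop the branch.
# -D is required because its commit was never merged into master.
git branch -D experiment > /dev/null
remaining=$(git branch --list experiment)

echo "same starting commit: $same_start"
echo "app.txt on master: $on_master"
```

Had the experiment worked out, we would have merged it into master instead of deleting it, which is what the rest of this walkthrough does.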
First of all I create the new feature branch:
$ git branch my-feature-branch

Executing
$ git branch
* master
  my-feature-branch
we get a list of branches. The * in front of master indicates that we’re currently on that branch. Let’s switch to my-feature-branch instead:
$ git checkout my-feature-branch
Switched to branch 'my-feature-branch'

Again
$ git branch
  master
* my-feature-branch

Note you can directly use the command git checkout -b my-feature-branch to create and check out a new branch in one step.

What’s different from other VCS is that there is only one working directory.
All of your branches live in the same one and there is not a separate folder for each branch you create. Instead, when you switch between branches, Git will replace the content of your working directory to reflect the one in the branch you’re switching to.

Let’s modify one of our existing files
$ echo 'Hi' >> hallo.txt
$ cat hallo.txt
Hello, world!
Hi
and then commit it to our new branch
$ git commit -a -m 'modify file adding hi'
[my-feature-branch 2fa266a] modify file adding hi
1 file changed, 1 insertion(+)

Note, this time I used git commit -a -m to add and commit a modification in one step. This works only on files that have already been added to the git repo before. New files won’t be added this way and need an explicit git add as seen before.

What about our tree?
So far everything seems pretty normal and we still have a straight line in the tree, but note that now master remained where it was and we moved forward with my-feature-branch.

Let’s switch back to master and modify the same file there as well.
$ git checkout master
Switched to branch 'master'

As expected, hallo.txt is unmodified:
$ cat hallo.txt
Hello, world!

Let’s change and commit it on master as well (this will generate a nice conflict later).
$ echo 'Hi I was changed in master' >> hallo.txt
$ git commit -a -m 'add line on hallo.txt'
[master c8616db] add line on hallo.txt
1 file changed, 1 insertion(+)

Our tree now visualizes the branch:

Polishing your feature branch commits
When you create your own, personal feature branch you’re allowed to do as many commits as you want, even with kinda dirty commit messages. This is a really powerful approach as you can jump back to any point in your dev cycle.
However, once you’re ready to merge back to master you should polish your commit history. This is done with the rebase command like this:
git rebase -i HEAD~N
where N is the number of most recent commits you want to revise. The following animated GIF shows how to do it:
Demo on cleaning up your commit history

Merge and resolve conflicts
The next step would be to merge our feature branch back into master.
This is done by using the merge command
$ git merge my-feature-branch
Auto-merging hallo.txt
CONFLICT (content): Merge conflict in hallo.txt
Automatic merge failed; fix conflicts and then commit the result.

As expected, we have a merge conflict in hallo.txt.
Hello, world!
<<<<<<< HEAD
Hi I was changed in master
=======
Hi
>>>>>>> my-feature-branch

Let’s resolve it:
Hello, world!
Hi I was changed in master
Hi

and then commit it
$ git commit -a -m 'resolve merge conflicts'
[master 6834fb2] resolve merge conflicts

The tree reflects our merge.
Fig 1: Tree state after the merge

Jump to a certain commit
Let’s assume we want to jump back to a given commit. We can use the git log command to get all the SHA identifiers that uniquely identify each node in the tree.
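If you’d like to see the whole tree with its SHA identifiers in one go, the following sketch replays the branch, conflict and merge steps from above in a temporary repository and then prints the history as a graph. The user identity and the temporary directory are assumptions made for the example, and the SHAs you get will differ from the ones shown in this post.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello, world!" > hallo.txt
git add hallo.txt
git commit -q -m "add my first file"

# Diverge: one commit on the feature branch, one on master, both touching hallo.txt.
git checkout -q -b my-feature-branch
echo "Hi" >> hallo.txt
git commit -q -a -m "modify file adding hi"

git checkout -q master
echo "Hi I was changed in master" >> hallo.txt
git commit -q -a -m "add line on hallo.txt"

# The merge stops with a conflict, so a non-zero exit status is expected here.
if git merge my-feature-branch > /dev/null 2>&1; then
    merge_status="clean"
else
    merge_status="conflict"
fi

# Resolve by keeping both lines, then conclude the merge with a commit.
printf 'Hello, world!\nHi I was changed in master\nHi\n' > hallo.txt
git commit -q -a -m "resolve merge conflicts"

# rev-list --parents prints the commit followed by its parents; a merge node has two.
merge_parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))

# One line per node, with the abbreviated SHA in front and the branches drawn as a graph.
git log --graph --oneline
```

Any of the abbreviated SHAs printed by git log can then be passed to other commands to address that specific node.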
Git offers a huge array of really useful tools that are great to explore. Today I’ll talk about one of these: Git Bisect, which is very useful when we want to search for something specific within our code.
What is Git Bisect and how does it work?
Bisect comes from the binary search algorithm, which is an efficient way to search through a large set of (sorted) data. By repeatedly dividing the batch in half and performing some kind of validation check, we are able to scan through a huge amount of data in a limited number of iterations. Bisect could be performed manually, by simply checking out a specific commit and looking through the code. Yet the provided implementation protects us from all sorts of possible errors and does a lot of manual labor for us.

When is it a good time to use bisect? When you are not exactly sure when a specific change happened in the code. It may be a bug whose introduction is hard to track down, or an unwanted change, like code that was deleted by mistake.
In all those cases bisect comes in handy.

For the purpose of example let’s simply say we have a bug. It’s hard to track which change initially caused it, but we know for sure that a week ago it was not there! Perfect, let’s start bisecting.

Before you start bisecting, please store your unfinished work with commit or stash!

At first, we need to initialise the search with:
git bisect start

Now we need to mark two commits with good and bad labels, to specify the boundaries for the search. I will mark the current HEAD as being bad, since the bug can be reproduced right now:
git bisect bad

Next we need to mark the place in time when we are sure everything was still working fine. Same thing, it may be specified with the commit’s SHA, tag or branch, or simply by checking out a particular commit and marking it as good:
git checkout 1234567
git bisect good
or
git bisect good 1234567

From now on Git will start to move the HEAD between commits, offering us a possibility to verify the state of the code at that moment in time. The instructions are quite self-explanatory; we may see something like this:
Bisecting: 191 revisions left to test after this (roughly 8 steps)
[commit123] Add new transaction endpoint

Our task is to validate the code, whether by compiling and running the application or by launching a test case for the given problem. Everything depends on the specific case.
Git will run us through the history step by step, optimising the number of validations we need to perform. Our job is just to let Git know if at that point in time the code was still “good” or already “bad”: “git bisect good” or “git bisect bad”. Git will automatically jump over to the next candidate where we need to state our judgment:
git bisect good
Bisecting: 95 revisions left to test after this (roughly 7 steps)
[commit321] Replace the User model with new implementation

After a certain number of selections we will be presented with the suspicious commit and all its information.

At the end, don’t forget to “reset”. Also don’t hesitate to reset at any given moment, in case something went wrong and you wish to start over.
Therefore:
git bisect reset
is what you need to gracefully finish the algorithm.

In case you forgot to do that, Git stores all the necessary data in the .git directory in the root of your repository. Removing all the `BISECT` files from there will probably fix a great part of the possible problems with bisect you may encounter.
rm .git/BISECT_*

Ok, let’s recap a little. Open a terminal, if you haven’t done that yet, `cd` yourself to some git repo you happen to have somewhere around and let’s practice a little.

First, check if there are any pending changes and commit or stash them. Then do this:
git rev-list --max-parents=0 HEAD
This will give you the SHA of your repo’s initial commit.

You may want to simply bisect start, or start by passing the bad and good points (in that order) right away. Try this:
git bisect start firstcommitsshanumberhere HEAD
You should see a fail message. Well, it makes sense, right?
Why would you want to look for the place where you made things right?

The bisect start may accept one or two revision params: just the bad, or bad and good, so let’s fix it:
git bisect start HEAD firstcommitsshanumberhere
HEAD is bad.
First commit was good.

Now:
`git bisect good/bad`

You may also try `git bisect next` or `git bisect skip`. Those are meant to allow you to pass over a commit in case it cannot be clearly validated.

After you finish, try:
git bisect log
to see it all over again, and
git bisect run
to replay the fun. It is worth mentioning here that “run” accepts a script to automatically verify the code.

Try also:
git bisect visualize
which will open your default visual tool. Try to experiment with it in the middle of bisecting.

Ok, final trick.
I should probably have started with this one, but it would have spoiled all the fun of explaining all this to you. And for all those who haven’t yet jumped to the console, this is your last chance. Just type:
git bisect
There you go, a handy reminder:
usage: git bisect [help|start|bad|good|skip|next|reset|visualize|replay|log|run]

Summing up:
Bisect is an easy-to-use search algorithm that allows us to scan even a huge history in a reasonably short time. It’s also non-invasive and relatively simple. Just remember to always start with a clean working directory and reset it after you are done!
Feel free to use it any time you need to search through your history.

Of course, the full manual is available by typing: `git help bisect`.
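And if you don’t have a suitable repo at hand, here is a small sandbox you can play with. It is only a sketch under a few assumptions: the temporary repository, the file app.txt, the word BUG standing in for a real defect and the ten-commit history are all invented. It uses “run” with a tiny check command, so Git does the good/bad marking by itself.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Ten commits; the made-up "bug" (the word BUG in app.txt) sneaks in with commit 7.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -eq 7 ]; then
        echo "BUG" >> app.txt
    else
        echo "line $i" >> app.txt
    fi
    git add app.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 7 ]; then
        expected=$(git rev-parse HEAD)   # remember the culprit so we can compare later
    fi
    i=$((i + 1))
done

first=$(git rev-list --max-parents=0 HEAD)

# Bad revision first, then the good one.
git bisect start HEAD "$first" > /dev/null

# The check command exits 0 when the code is "good" and 1 when it is "bad".
git bisect run sh -c '! grep -q BUG app.txt' > /dev/null

# Once bisect is done, refs/bisect/bad points at the first bad commit.
found=$(git rev-parse refs/bisect/bad)

# Always clean up when finished.
git bisect reset > /dev/null 2>&1

git log -1 --oneline "$found"
```

In a real project the check would typically be your build or a failing test case rather than a grep, as long as it follows the same convention of exit code 0 for good and a non-zero code for bad.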