Sunday, November 24, 2019

An introduction to git architecture and command line interface

Git is a distributed version control system. Using Git, many developers can change the same code base at the same time without accidentally overwriting someone else’s changes. Git tracks what changed in each file, so every modification is recorded in the project history.
Git was created by Linus Torvalds in 2005 for development of the Linux kernel. It has since become a must-have tool for software development companies and organizations. The basics of Git are simple to pick up, but there are many concepts to learn before you become a pro. So let’s get started.


How to install?

Git is primarily a command line program, and you will work with it through git commands almost all the time. You can install Git on your system using a package manager, an installer, or by building it from source.
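As a rough sketch, the package manager route looks like the commands below. The two install lines are left as comments because they depend on your system (apt on Debian/Ubuntu and Homebrew on macOS are assumptions here); the last command only checks that Git is reachable from your terminal.

```shell
# Debian/Ubuntu (assumption: apt is your package manager):
#   sudo apt-get install git
# macOS (assumption: Homebrew is already installed):
#   brew install git

# Confirm that Git is on your PATH and print the installed version.
git --version
```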
Git itself does not require a remote repository, but to collaborate you need one (a remote code base) where the changes made by many developers accumulate. Two of the most popular repository hosting websites are GitHub and GitLab; you can also host your own remote repository on a server.
If you are not comfortable with Git commands, you can use the graphical tools that come with Git (git gui and gitk, which some systems ship as a separate package). If your repository is on GitHub, you can also download GitHub's official desktop application from https://desktop.github.com/.

How does Git work?

Once you have Git installed on your system, sign up on GitHub and create your first remote repository.

This will be your remote repository, where you and other developers will push changes. When you create a repository, it is empty, meaning that it doesn’t contain any code yet.
So initially, you need to push some code so that other developers can work on it too. Make sure to copy the link to the remote repository, which is shown once you click the create repository button. It will be used to link your local code base to the remote one.
When you work on some code, you work on a local copy, that is, a repository stored on your own machine such as a PC or a laptop. Once you have finished your changes, you push that code to the remote repository. For this, Git must know where the remote repository is located, so we need to give Git the remote repository URL we just copied. Once Git knows where the remote repository is, it can bring the local repository in sync with the remote repository on demand.
So first, let’s create a local repository and add some code. I am going to use JavaScript for the sample code files. You can use whatever you want.
You can also keep the remote repository on your local system, without any internet connection. Simply use the path to a folder on your system instead of the URL you copied above. In that case, however, other developers cannot contribute unless they have access to that folder over a network.
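A minimal sketch of such a folder-based remote, assuming a scratch directory created with mktemp so it can be tried anywhere. The names remote_dir and local_dir are made up for this illustration, and a bare repository (one without working files) is used for the remote side, which is the usual choice for a repository that only receives pushes.

```shell
# Work in a scratch directory so nothing on your system is touched.
scratch="$(mktemp -d)"
remote_dir="$scratch/git-test.git"   # hypothetical folder acting as the remote
local_dir="$scratch/git-test"        # hypothetical local repository

# Create the folder-based "remote" as a bare repository.
git init --bare "$remote_dir"

# Create the local repository and point it at the folder instead of a URL.
git init "$local_dir"
git -C "$local_dir" remote add origin "$remote_dir"

# Show the link that was just created.
git -C "$local_dir" remote -v
```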


Configure Git

After Git is installed on your system, you need to tell it who you are, so that every commit can be attributed to the right person when many people work on the same project. The most important settings are your name and your email address, stored under the user.name and user.email keys and set with the git config command.
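For example, the two settings could be stored like this. "Jane Doe" and the email address are placeholders, and --global is assumed because you usually want the same identity for every repository on your machine.

```shell
# Placeholders: replace the name and email with your own details.
# --global writes to your per-user configuration file.
git config --global user.name "Jane Doe"
git config --global user.email "jane@example.com"
```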
After this, Git will record this identity in every commit you make, and it travels with the commits when you push them to the remote repository. You can view all Git settings with the git config --list command.
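A quick way to inspect the settings is sketched below. The second command is an optional extra that also prints which file each value comes from.

```shell
# List every setting Git currently sees (system, user and repository level).
git config --list

# Same list, with the file each setting was read from.
git config --list --show-origin
```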


Initializing Git Repository

Let’s make a sample folder named git-test, the same name as the repository we created on GitHub. This will be your local repository. Then open a terminal window inside that folder.
First of all, we need to make this folder a Git repository. This is done by running the git init command.
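Put together, these steps might look like the following. The scratch directory from mktemp is an assumption added so the sketch can be run anywhere; on your own machine you would create git-test wherever you keep your projects.

```shell
# Start in a scratch directory (assumption, for a safe trial run).
cd "$(mktemp -d)"

# Create the sample folder and move into it.
mkdir git-test
cd git-test

# Turn the folder into a Git repository.
git init

# List hidden entries; a .git directory should now be present.
ls -a
```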
After git init completes successfully, Git creates a hidden .git directory in the current folder. This directory contains the object files and metadata Git uses to store the repository's history and keep track of changed files. Unless a folder has a .git directory inside it (or in one of its parent folders), Git won’t treat it as a repository.
As of now, Git doesn’t know where the remote repository is located. We need to link the local repository to it with the git remote add origin repo-url command.
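A sketch of that linking step is shown below. The GitHub URL is a placeholder (your-username is hypothetical), and the scratch directory is only there so the example is safe to run; adding a remote does not contact the server.

```shell
# Scratch repository for the demonstration (assumption).
cd "$(mktemp -d)"
git init git-test
cd git-test

# Placeholder URL: substitute the link you copied from GitHub.
git remote add origin https://github.com/your-username/git-test.git

# Print the URL now stored under the short name "origin".
git remote get-url origin
```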
Here repo-url is the remote repository URL we copied earlier, and origin is a short name for this long URL, which comes in handy whenever we push code to the remote repository. A local repository can track multiple remote repositories; you just have to give each one a different short name and URL.
If your remote repository already contains some files, you need to pull them into your local repository. For that, use git pull origin followed by the name of the branch you want to pull.
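The sketch below simulates this situation entirely on disk. It assumes a bare repository in a scratch directory as a stand-in for GitHub, and a throwaway repository named seed (hypothetical) that puts one file into it. The branch name is read from Git rather than hard-coded, because the default branch name differs between installations.

```shell
cd "$(mktemp -d)"
remote="$PWD/remote.git"          # stand-in for the remote repository URL
git init --bare "$remote"

# Give the remote some content from a throwaway repository.
git init seed
cd seed
git config user.name "Jane Doe"               # placeholder identity
git config user.email "jane@example.com"
echo 'console.log("hello");' > index.js
git add index.js
git commit -m "Initial commit"
git push "$remote" HEAD
cd ..

# The steps from the article: init, link, then pull.
git init git-test
cd git-test
git remote add origin "$remote"
branch="$(git symbolic-ref --short HEAD)"     # current (default) branch name
git pull origin "$branch"
```

After the pull, index.js from the remote is present in the local git-test folder.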
When you already have a remote repository, the three steps above can be combined into a single step with the git clone repo-url folder command.
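As an illustration, the sketch below clones a stand-in remote into a folder named git-test. The bare repository in a scratch directory is an assumption that replaces the GitHub URL so the example works offline; Git warns that the clone is empty, which is expected here.

```shell
cd "$(mktemp -d)"

# Stand-in for the remote repository (normally a GitHub URL).
git init --bare remote.git

# Clone it into the folder git-test: creates the folder, runs the
# equivalent of git init, and sets origin to the source repository.
git clone remote.git git-test

# Confirm that origin was set up automatically.
git -C git-test remote -v
```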
Here folder is an optional path to the local folder (which will become the local repository). We could pass git-test, but we can leave it out: if folder is not given, Git creates a new folder named after the remote repository. The clone command also initializes the .git directory inside that folder, sets origin to repo-url, and pulls the code from the remote repository into the local repository.
You can verify that a local repository is linked to the remote repository with the git remote -v command, which lists each remote's short name together with its fetch and push URLs.

(Screenshot: output of git remote -v)
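To try this yourself, a small sketch with two remotes follows. Both URLs are placeholders, and the second short name, backup, is made up for this example to show that one local repository can track more than one remote.

```shell
cd "$(mktemp -d)"
git init git-test
cd git-test

# Two remotes under different short names (placeholder URLs).
git remote add origin https://github.com/your-username/git-test.git
git remote add backup https://gitlab.com/your-username/git-test.git

# One (fetch) line and one (push) line is printed per remote.
git remote -v
```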

You can replace the remote URL with the git remote set-url origin new-repo-url command, where new-repo-url is the new address.
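For instance, if the repository was renamed on GitHub, the change could be applied as sketched here. Both URLs are placeholders invented for the example.

```shell
cd "$(mktemp -d)"
git init git-test
cd git-test

# The remote initially points at the old address (placeholder).
git remote add origin https://github.com/your-username/old-name.git

# Point origin at the new address (placeholder) and check the result.
git remote set-url origin https://github.com/your-username/git-test.git
git remote -v
```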


But how does Git really work?

Well, we talked about how Git works in layman’s terms, but technically it is much more sophisticated. So far, we have set up a local repository. Let’s say that we made some changes to our code or created new files, and we want other developers to have them too. Then we need to push these changes to the remote repository. Once the changes are on the remote repository, other developers can use the git pull command to bring them into their local repositories.
There are a few key steps one must go through to push changes to a remote repository. Let’s first understand what a commit is.

What are Commits?

We often change one or more files at a time. A commit records such a set of changes as a single unit in the repository's history. When we push, Git does not resend the whole repository; it transfers only the commits and file contents that the remote repository does not already have, which keeps network transfers fast. Each commit is identified by a SHA-1 hash, and a series of commits makes up the Git history. A commit object is more complex than it looks, but basically it contains a reference to the state of the files at that point, the name of the author, a timestamp of when the commit was made, a commit message, and the hash of the previous (parent) commit. The hash of a new commit is computed from this information, so if anything inside a commit changes, its hash changes too. If you are familiar with blockchain technology, you can think of the commit history as being like a blockchain in which the commits are the blocks.
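You can look inside a commit object with git cat-file. The sketch below builds a throwaway repository with two commits (the folder name demo, the identity and the file contents are all made up) and prints the newest commit, whose parent line holds the hash of the commit before it.

```shell
cd "$(mktemp -d)"
git init demo
cd demo
git config user.name "Jane Doe"               # placeholder identity
git config user.email "jane@example.com"

# First commit.
echo 'console.log("v1");' > index.js
git add index.js
git commit -m "First commit"

# Second commit, built on top of the first.
echo 'console.log("v2");' >> index.js
git add index.js
git commit -m "Second commit"

# The history: one line per commit, newest first.
git log --oneline

# The raw commit object: tree, parent, author, committer and message.
git cat-file -p HEAD
```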
Whenever you use git pull or git push, you are only fetching commits from or sending commits to the remote repository. When you push, Git on the remote server adds these commits to its own copy of the repository (our remote repository).
Your local repository has three different virtual zones or areas: the working area, the staging area, and the commit area. The working area is where you create new files, delete old files, or change existing files. Once you are done with these changes, you add them to the staging area with git add. The staging area is also sometimes called the ‘index’. At that point the staging area holds the changes that are ready to be committed. Whenever you create a commit with git commit, Git takes the changes from the staging area and stores them as a commit in the commit area. Until you run git push, these commits are not sent to the remote repository.
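The journey of one file through the three areas can be sketched as follows. A bare repository in a scratch directory stands in for the remote (an assumption so the example works offline), and the identity and file contents are placeholders.

```shell
cd "$(mktemp -d)"
remote="$PWD/remote.git"          # stand-in for the remote repository
git init --bare "$remote"

git init git-test
cd git-test
git config user.name "Jane Doe"               # placeholder identity
git config user.email "jane@example.com"
git remote add origin "$remote"

# Working area: a new, untracked file.
echo 'console.log("hello");' > index.js
git status --short                # prints: ?? index.js

# Staging area (index): the file is queued for the next commit.
git add index.js
git status --short                # prints: A  index.js

# Commit area: the staged change becomes a commit.
git commit -m "Add index.js"
git status --short                # prints nothing, the tree is clean

# Only now does the commit leave your machine.
git push origin HEAD
```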


