2 posts tagged with "git"

Git - The Four Areas

January 17, 2020 · 5 min read

Maintainer of DevSecOpsGirl.in

If you have read my first blog, Git - Understanding the Model, then you know what Git is. But knowing the commands and the model is still not enough. You also need to understand how the commands work: what actually happens when we run a git command.

Four areas of Git

So, let's understand the four areas of Git: the working area, the repository, the index and the stash. Many of you might not be familiar with these areas and the mechanism behind them, so we will go through them one by one.

The four areas of Git:

Working Area -> The project directory on disk, where we keep the current files we are about to work on.
Repository -> Contains the entire history of the project.
Index Area -> An intermediate (staging) area, a place where we put our changes before a commit.
Stash -> A temporary storage area for changes we want to set aside.

Workflow in four areas

Let's see how data moves between these areas when we run various commands:

Workflow Left to right

When we talk about moving from left to right, we talk about three things: editing the file, staging the file and committing the file. So we move the data from the working area to the index and then to the repository.

Starting with the working area: it's where our files currently live. When we make changes to our files, the changes stay in the working area only. We can check the status using git status

$ git status

Git says, Changes not staged for commit. The message itself says that in order to commit, we first need to stage the changes.

In order to stage the changes, we run

$ git add .

Now the changes are staged. Let's check the status

$ git status

git status now lists the file as modified under "Changes to be committed". We can also verify that the changes are staged by checking the difference between the working area and the index using git diff. Once everything is staged, git diff prints nothing because there is no difference left to show.

We can also check the difference between the Index and Repository using git diff --cached

$ git diff --cached

Now, in order to move the files from the index (staging area) to the repository, we run git commit

$ git commit -m "Add cricket to list"

-m : m stands for message; it lets us pass a message for the commit we are making

As soon as we commit, the updated file is added to the repository, and all three areas contain the same data.
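The left-to-right flow can be tried end to end in a throwaway repository. This is only a minimal sketch: the file name list.txt, the temporary directory and the author identity are assumptions made for the example, not part of the original walkthrough.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q

# A first commit so that there is something to modify.
echo "badminton" > list.txt
git add list.txt
git commit -q -m "Add list"

# 1. Edit: the change lives only in the working area.
echo "cricket" >> list.txt
unstaged_before=$(git diff --name-only)          # working area vs index
staged_before=$(git diff --cached --name-only)   # index vs repository

# 2. Stage: the change is copied into the index.
git add list.txt
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --cached --name-only)

# 3. Commit: the change is recorded in the repository.
git commit -q -m "Add cricket to list"
status_after_commit=$(git status --porcelain)

echo "before add:   unstaged=[$unstaged_before] staged=[$staged_before]"
echo "after add:    unstaged=[$unstaged_after] staged=[$staged_after]"
echo "after commit: status=[$status_after_commit]"
```

Before git add the file shows up only in git diff, after git add only in git diff --cached, and after the commit both are empty because all three areas agree.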

Right to left

When we talk about moving from right to left, we talk about moving data from the repository to the index and then to the working area.

We use the checkout command, which does two things. It changes the repository first and then moves data.

In the repository, it moves the HEAD reference, generally to another branch, so it changes the current commit.
Second, it takes the data from the new current commit and copies it from the repository to the working area and the index.

For example, when we check out him, HEAD moves to the him branch. The current commit becomes the one him points to, and its data is copied to the index and the working area.
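To watch checkout move data from right to left, here is a small sketch under assumed names: the branch him comes from the post, while list.txt, the temporary directory and the identity values are made up for the example.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, whatever the local default is

echo "badminton" > list.txt
git add list.txt
git commit -q -m "Add list"

# Give the him branch one extra commit.
git branch him
git checkout -q him
echo "cricket" >> list.txt
git commit -q -am "Add cricket on him"

# Each checkout rewrites the working area and the index from the target commit.
git checkout -q master
lines_on_master=$(wc -l < list.txt | tr -d ' ')
git checkout -q him
lines_on_him=$(wc -l < list.txt | tr -d ' ')
head_now=$(git symbolic-ref --short HEAD)

echo "master has $lines_on_master line(s), him has $lines_on_him line(s), HEAD is on $head_now"
```

The same file has different content after each checkout, even though we never edited it by hand in between.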

$ git rm --cached him

It will unstage the file and remove it from the index.

$ rm him

It will delete the file from the working area.

So far we have discussed the working area, the index and the repository. Let's cover the fourth area, the stash.

The Stash

We use git stash to store changes that are not ready to be committed when, in the meantime, we need to switch branches to carry out some other work.

$ git stash apply

In the above image we can see all the data in the working area and index which we wish to put aside in stash for some time.

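To save the changes in the stash, use the command below. It takes all the data from the working area and the index that is not in the current commit in the repository and sets it aside.

$ git stash save

(Newer versions of Git deprecate git stash save in favor of git stash push, which does the same job.)

$ git stash --include-untracked

By default, the stash does not store untracked files; the --include-untracked option stores them as well.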

In the image below, we can see that our file chess.txt is not available as it is now successfully stored in the stash area.

Now let's check that chess.txt is present in the stash area using the command

$ git stash list

In the above image, we have only one element so it is stash@{0}

Now, to get all our data back into the working area and the index as it was before we stashed it, use the command below

$ git stash apply

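$ git stash apply --index

The --index option also restores the staged changes to the index instead of leaving them unstaged. When there are multiple stashes, name the one you want to apply, for example

$ git stash apply stash@{1}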

We can conclude that the stash is like a clipboard: a place to store the stuff that needs to be set aside for some time.
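The whole stash round trip can be sketched in a throwaway repository. chess.txt is the file from the post; list.txt, the file contents and the identity values are assumptions for the example, and git stash push is used here as the current spelling of git stash save.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q

echo "badminton" > list.txt
git add list.txt
git commit -q -m "Add list"

# Work in progress: one tracked change and one untracked file.
echo "cricket" >> list.txt
echo "e4 e5" > chess.txt

# Set everything aside, untracked files included.
git stash push -q --include-untracked
stash_count=$(git stash list | wc -l | tr -d ' ')
status_while_stashed=$(git status --porcelain)
if [ -e chess.txt ]; then chess_while_stashed=yes; else chess_while_stashed=no; fi

# Bring the work back; apply keeps the entry in the stash list.
git stash apply -q
if [ -e chess.txt ]; then chess_after_apply=yes; else chess_after_apply=no; fi
stash_count_after_apply=$(git stash list | wc -l | tr -d ' ')

echo "stashes=$stash_count, chess.txt while stashed=$chess_while_stashed, after apply=$chess_after_apply"
```

Note that apply leaves the entry in the stash list, so the same stash can be applied again later or removed with git stash drop.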

Here we finished with the stash area. Hope you have a better understanding of Git working areas and their workflow.

Thank you for reading.

Git - Understanding the Model

January 17, 2020 · 11 min read

Priyam Singh

Maintainer of DevSecOpsGirl.in

Have you recently started using git? Or have you been using it for a long time but still feel you are missing something, or can't relate the concepts to each other? Then read on: here you will level up and become a power user in git!


What is Git ?

Git is a Distributed Revision Control System. To make it easier to understand, we can break the name down as:

Distributed
Revision
Control
System

Types of Commands:

Git has two types of commands:

Porcelain Commands (High level commands):

Things with which the user interacts and gets work done. $ git help -a lists the available commands, including the main porcelain commands.

git add, git commit, git push, git pull, git branch, git checkout, git merge, git rebase ...

Plumbing Commands (Low level Commands):

Things which git uses internally to get things done.

git cat-file, git hash-object, git count-objects ...

If you are wondering why I have mentioned the commands and their types, bear with me; it will make more sense as you read the rest of the blog.

If you want to understand Git, don't focus on learning the commands. Instead, learn the model. To build an effective understanding, imagine that Git is a model layered like the Earth's structure.

Just like the Earth, Git has four layers:

LAYER 1 : System
LAYER 2 : Control
LAYER 3 : Revision
LAYER 4 : Distributed

In this blog we will explore this structure inside out: we will begin digging from the innermost layer and work our way up to the crust, explaining all four layers of Git on the way. So let's understand each layer thoroughly.

Layer 1 : System

In this section, we will understand how Git at its core is just a map. This means that it's a table with keys and values, where a value is any sequence of bytes.

When a value is passed to Git, it generates a key (hash) for it. Every value has its own hash, calculated using the SHA-1 algorithm. Two different values producing the same hash is so unlikely that, for practical purposes, we can treat every hash as unique.

Let's find the hash of the string Badminton. Here "Badminton" is the value and "f50df...fa1" is the key (hash) for it. A sample command is shown below

$ echo "Badminton" | git hash-object --stdin

We echo Badminton and pipe (|) the result into a Git plumbing command called hash-object, and we tell hash-object to read its input from stdin (standard input) instead of from a file.
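A quick way to convince ourselves that the key depends only on the value is to hash the same string twice and then a slightly different one. This is just a sketch; the lowercase variant "badminton" is added here purely for comparison.

```shell
# The same value always maps to the same key...
hash_a=$(echo "Badminton" | git hash-object --stdin)
hash_b=$(echo "Badminton" | git hash-object --stdin)

# ...while even a one-character difference gives a different key.
hash_c=$(echo "badminton" | git hash-object --stdin)

echo "Badminton -> $hash_a"
echo "Badminton -> $hash_b"
echo "badminton -> $hash_c"
```

The first two lines print the same hash and the third prints a different one, which is exactly what we expect from a map keyed by content.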

Every hash is a key, and its value can be decoded and read using another plumbing command called git cat-file. When we decode a hash, we get an object; the object types are explained below.

Git objects are present in .git/objects. Git has four types of objects: blobs, trees, commits and tags.

BLOB: The content of a file, whether plain text or any other type of data, is stored as a blob.
TREES: The equivalent of directories, i.e. a tree is a directory stored in Git.
COMMITS: A commit holds the metadata for each change recorded in the repository, including the author, committer, commit date and log message.
ANNOTATED TAG: A name which is assigned to a commit. It's difficult to remember a commit hash, so we give the hash a name, just like a domain name for an IP address.

Let's understand how and where Git stores the hash values, taking the example of Badminton again.

First initialize the git repository by using the command

$ git init

The image below shows that the value of Badminton is stored in the .git/objects/f5 directory. (For the object to be written there, hash-object has to be run with the -w option; without it, the hash is only computed.)

Note: You will notice directories inside .git/objects whose names are two characters long, such as f5. Git uses the first two characters of the hash as the directory name and the rest of the hash as the name of the file stored inside it.

To see the type of the object stored under a hash, we run

$ git cat-file -t f50dfa851f76f8d7e89660e40cc0bdc34ee77fa1

Note:
-t : prints the type of the object stored under the hash
cat-file : a plumbing command which reads the data inside an object

$ git cat-file -p f50dfa851f76f8d7e89660e40cc0bdc34ee77fa1

-p : called pretty print, as it prints the data inside the object

Bat

The value got stored in the form of a blob. So, as the example above shows, we can track where our information is saved; hence a map is formed, or we can say a persistent map.
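The persistent map can be explored in a scratch repository. In this sketch the temporary directory is an assumption, and the hash is captured in a variable rather than typed by hand, so the example does not depend on any particular hash value.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

# -w writes the object into .git/objects instead of only printing its hash.
hash=$(echo "Badminton" | git hash-object -w --stdin)

# First two characters -> directory name, remaining characters -> file name.
dir=$(printf '%s' "$hash" | cut -c1-2)
file=$(printf '%s' "$hash" | cut -c3-)
object_path=".git/objects/$dir/$file"

# Read the value back through its key.
type=$(git cat-file -t "$hash")
content=$(git cat-file -p "$hash")

echo "key:    $hash"
echo "stored: $object_path"
echo "type:   $type"
echo "value:  $content"
```

The file under .git/objects is compressed, which is why we read it back with git cat-file instead of opening it directly.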

Here we finished with the first layer of git that is the system and also got the idea of keys and values. Moving further we come to the second layer of our model.

Layer 2: Control

Content tracker

So far we have seen that Git is a persistent map, but we usually see it as something more than that: something that tracks our files and directories, a content tracker. Let's see what that means:

We will now take a very simple example, sportslist

The sportslist directory contains one directory (games) and 3 files (Badminton_top.txt, README.txt, list.txt).

Now we initialize Git as we earlier did in Layer 1

$ git init

When we check the objects directory under .git, we find nothing other than the info and pack subdirectories, which means nothing is stored there yet.

First Commit

Let's do the first commit in our sportslist example. Before committing, we first need to add all the changed files to the staging area. Whatever is in the staging area will get into the repository after the commit.

In order to check the status of files, we use

$ git status

In the above picture, the files are untracked, meaning Git is not tracking them yet and they are not in the staging area.

Note: I will talk more about the staging area and the three other areas, which are very important for understanding Git, at the end of the blog.

So, to add the files to the staging area, use the command below

$ git add .

$ git add list.txt detail   # to add individual files instead

To check whether they are staged or not again use the status command.

$ git status

In the above image, the files and folders are shown under "Changes to be committed" in green, which means they are properly staged.

Now in order to commit :

$ git commit -m "first commit!"

The log shows the hash of the commit we just made, along with the commit message, the author, and the date and time. From the commit hash we can find the hashes of all the files and folders which we committed.

In order to see the logs, we use the command

$ git log

Where first commit got stored

To find where the first commit got stored, we go to .git/objects. You will now find object directories there, where earlier it was empty as I discussed.

The first two characters of the commit hash c88887....c0 are c8, and there is a directory in .git/objects with those same two characters as its name.

On opening this directory, we find a file named after the rest of the hash. Let's see what is inside this object.

$ git cat-file -p c88887

-p stands for pretty print

This object contains metadata about the commit.

First a tree with hash value 91...8e.
Second author and committer details
Third message of commit

Let's go further into the tree hash and see what it contains.

In the above image we see a tree hash for the games directory and a blob hash for the list.txt file we made in the beginning in the sportslist directory. This is how we can track the data which is stored inside the Git repository.
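The walk from a commit to its tree and on to the blobs can be reproduced with a few plumbing commands. This is a sketch with assumed contents: games and list.txt follow the sportslist example, while games/board.txt, the file contents and the identity values are made up so that the directory is not empty.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q

mkdir games
echo "chess" > games/board.txt    # hypothetical file inside games
echo "badminton" > list.txt
git add .
git commit -q -m "first commit!"

commit=$(git rev-parse HEAD)
tree=$(git rev-parse 'HEAD^{tree}')

# The commit object names its root tree, the author, the committer and the message.
git cat-file -p "$commit"

# The tree object lists one entry per file (blob) or directory (tree).
git cat-file -p "$tree"

commit_type=$(git cat-file -t "$commit")
tree_type=$(git cat-file -t "$tree")
games_type=$(git cat-file -t HEAD:games)
list_type=$(git cat-file -t HEAD:list.txt)
echo "commit=$commit_type root=$tree_type games=$games_type list.txt=$list_type"
```

Following the hashes this way is all that "tracking" means here: the commit points to a tree, and the tree points to more trees and blobs.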

This is how Git does tracking, and that's why we call it a tracking or control system.

Now let's move on to the third layer of Git and the features that make it a revision control system, features like branches and merges.

Layer 3 : Revision

BRANCH

Branches are actually just references to commits, and our default branch is the master branch.

We can create a new branch using the command below

$ git branch him

To see all the branches in your current repository type

$ git branch

The branch marked with an asterisk '*' is the current branch.

In the above image we see that we have two branches, the master branch and the him branch.

Let's find the commit hash of the two branches in order to verify my statement above that branches are actually references to commits

$ cat .git/refs/heads/master

$ cat .git/refs/heads/him

Both the 'master' branch and the new branch 'him' contain the same hash because they point to the same commit, c888873...1ac0.
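The claim that a branch is only a reference can be checked in a scratch repository. This sketch assumes a fresh repository using Git's default file-based ref storage, so each branch is a small file under .git/refs/heads; list.txt and the identity values are placeholders.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, whatever the local default is

echo "badminton" > list.txt
git add list.txt
git commit -q -m "first commit!"

# Creating a branch only writes a new reference; no files are copied.
git branch him

master_ref=$(cat .git/refs/heads/master)
him_ref=$(cat .git/refs/heads/him)
head_commit=$(git rev-parse HEAD)

echo "master -> $master_ref"
echo "him    -> $him_ref"
```

Both files hold nothing but the hash of the commit the branch points to.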

What is head

HEAD is a reference to the current branch, and another way to find the current branch is to read the HEAD file under .git.

$ cat .git/HEAD

So in the above image, HEAD is pointing to the master branch.

Now, to change the current branch, we use checkout.

Checkout moves HEAD and updates the working area.

$ git checkout him

So, in the above image, checkout switched to the branch him, and HEAD is now pointing at the him branch.
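Reading the HEAD file before and after a checkout makes the move visible. As before, this is a sketch in a throwaway repository; list.txt and the identity values are assumptions for the example.

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, whatever the local default is

echo "badminton" > list.txt
git add list.txt
git commit -q -m "first commit!"
git branch him

# HEAD is a one-line file that names the current branch.
head_before=$(cat .git/HEAD)
git checkout -q him
head_after=$(cat .git/HEAD)

echo "before checkout: $head_before"
echo "after checkout:  $head_after"
```

So HEAD points to a branch, and the branch in turn points to a commit.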

Merging of Branches

Merging of branches is one of the important features offered by Git. Merging allows a user to combine two different branches into one. Every branch has its own data; once the merge is done, the current branch contains the changes from both branches. Follow the example below to merge two branches.

Here we will merge the him branch into master. While on master, we use the command:

$ git merge him

Here a conflict occurs, as the him and master branches have changed the same data differently, so to complete the merge we have to resolve this conflict.

Editors such as Visual Studio Code offer 4 options for a conflict:

Accept Current Change - means to keep the content of the master branch and ignore the him branch
Accept Incoming Change - means to keep the content of the him branch and ignore the master branch
Accept Both Changes - to keep both changes (master's content and him's content)
Compare Changes - just to compare the changes in both branches side by side

Once the conflict is resolved and committed, the two branches are merged successfully: the merge commit has a new tree and two parents, one from the master branch and the other from the him branch.
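The same conflict and its resolution can be reproduced on the command line. This is only a sketch: list.txt, its contents and the identity values are assumptions, and the conflict is resolved by hand with the equivalent of "Accept Both Changes".

```shell
# Throwaway repository; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, whatever the local default is

echo "badminton" > list.txt
git add list.txt
git commit -q -m "Add list"
git branch him

# master and him change the same line in different ways.
echo "cricket" > list.txt
git commit -q -am "master: cricket"
git checkout -q him
echo "chess" > list.txt
git commit -q -am "him: chess"
git checkout -q master

# The merge stops with a conflict and a non-zero exit status.
if git merge him > /dev/null 2>&1; then merge_result=clean; else merge_result=conflict; fi
conflicted=$(git diff --name-only --diff-filter=U)

# Resolve by keeping both changes, then stage and conclude the merge.
printf 'cricket\nchess\n' > list.txt
git add list.txt
git commit -q --no-edit

parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge: $merge_result, conflicted file: $conflicted, parents of merge commit: $parents"
```

The merge commit lists two parent lines, one for each branch tip that went into it.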

So, we have covered the third section of our model that is Revision with the understanding of features of git that are branches and merging of it. Moving further towards the Layer 4 of our model.

Layer 4: Distributed

Here we come to our last and final layer, the distributed layer, which is useful for connecting us to other people working remotely. Distributed means the work is shared between multiple people and machines. We can work in a distributed way by using platforms like GitHub, GitLab and Bitbucket.

To do so, go to GitHub/GitLab/Bitbucket and make a new repository.

As we already have our repository on our local machine, choose the second option shown in the above image, that is:

$ git remote add origin https://github.com/Priyam5/sportslist.git

Now we have added origin, the remote to which we will push our sportslist project. The git push command is used to push commits to the remote repository. The command used for pushing to GitHub is given below.

$ git push -u origin master

Here -u sets origin/master as the upstream of the local master branch, which means that next time you don't have to specify the remote name (origin) and branch (master) again.

When we need to update the local repo with someone else's changes from the remote repo, we use the command

$ git pull

It will pull the changes from the remote repo and update the local repository.
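The push and pull cycle can be rehearsed without a GitHub account. In this sketch a local bare repository stands in for the remote, and a second clone named colleague plays the other person; both names, the file contents and the identity values are assumptions for the example.

```shell
# Scratch area; the identity values below are placeholders.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d) && cd "$work"

# A bare repository stands in for the one created on GitHub.
git init -q --bare remote.git

# Our local project.
git init -q sportslist
cd sportslist
git symbolic-ref HEAD refs/heads/master   # name the first branch master, whatever the local default is
echo "badminton" > list.txt
git add list.txt
git commit -q -m "first commit!"

# Connect to the remote and push; -u records origin/master as the upstream.
git remote add origin "$work/remote.git"
git push -q -u origin master
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')

# Someone else clones the remote, commits and pushes.
git clone -q -b master "$work/remote.git" "$work/colleague"
(
  cd "$work/colleague"
  echo "cricket" >> list.txt
  git commit -q -am "Add cricket"
  git push -q origin master
)

# Back in our repository, a plain pull is enough because the upstream is set.
git pull -q
pulled_lines=$(wc -l < list.txt | tr -d ' ')
echo "upstream: $upstream, list.txt now has $pulled_lines line(s)"
```

With a real hosting platform only the URL passed to git remote add changes; the push and pull commands stay the same.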

So, we have completed the 4 layers of Git. While writing this blog I figured that it is important to understand the four areas of Git, which I will be writing about in my upcoming post on Git.

Thank You for reading.
