Introduction to Version Control and Git.

Hello all 👋. Stick around until the end for a surprise! 🎁

In this article, we will look at version control systems: why they are needed and what types exist. Finally, we will learn how Git, one of the most widely used version control systems, works. This is going to be a long one, so bear with me 😀

What is version control?

Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later. You may think this is only for software source code files, but in reality, you can do this with nearly any type of file on a computer.

If you are a developer, you will want to keep track of the changes made to your source code files, so a Version Control System (VCS) is a very wise thing to use. It allows you to revert selected files to a previous state, revert the entire project to a previous state, compare changes over time, see who last modified something that might be causing an issue, and much more. Sounds magical, right? 🪄 Well, it's not! It's just technology! 😎

Types of Version Control Systems

  • Local version control systems:

    Many people's version-control method of choice is to copy files into another directory. This approach is common because it is so simple, but it is also very error-prone. It is easy to forget which directory you're in and accidentally write to the wrong file or copy over files you didn't mean to. To deal with this, programmers developed local VCSs that had a simple database that kept all changes to the files under revision control.

  • Centralized version control systems:

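    The next major issue people encountered is that they needed to collaborate with developers on other systems. To deal with this, Centralized Version Control Systems (CVCSs) were developed. They have a single server that contains all the versioned files and the history of what everyone on the project is doing. However, this setup has some serious downsides. If the central server goes down, nobody can collaborate or save versioned changes while it is unavailable. If the hard disk holding the central database becomes corrupted and there are no proper backups, you lose the entire history of the project, apart from whatever single snapshots people happen to have on their local machines.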

  • Distributed version control systems:

    This is where Distributed Version Control Systems (DVCSs) step in! In a DVCS, users don't just check out the latest snapshot of the files; rather, they fully mirror the repository (folder), including its full history. Thus, if the server dies, and these systems were collaborating via that server, any of the users' repositories can be copied back to the server to restore it.

What is Git?

Try to absorb this section well, as it will make the later concepts much easier to grasp.

Git is a distributed version control system developed by Linus Torvalds. It is an open-source project initiated in 2005.

We will now look at some features that make Git a one-of-a-kind technology!

  1. Snapshots, Not differences

    The major difference between Git and other VCSs (CVS, Subversion, Perforce, etc.) is the way Git thinks about data. These other systems think of the information they store as a set of files and the changes made to each file over time, storing data as changes to a base version of each file.

    Git doesn't think of data this way. Instead, it thinks of data more like a series of snapshots of a miniature filesystem. With Git, every time you commit, or save the state of your project, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. In other words, Git treats its data more like a stream of snapshots.
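To make the contrast concrete, here is a minimal Python sketch of the two ways of thinking about data. It is only a toy model and not how Git or any other VCS is implemented: the function names are made up for illustration, each version is a plain dictionary of file name to content, and files are assumed never to be deleted.

```python
# Toy comparison of delta-based and snapshot-based storage.
# All names are hypothetical; real version control systems are far more refined.

def store_as_deltas(versions):
    """Keep the first version in full, then only the files that changed."""
    base = dict(versions[0])
    deltas = []
    previous = base
    for current in versions[1:]:
        changed = {name: text for name, text in current.items()
                   if previous.get(name) != text}
        deltas.append(changed)
        previous = current
    return base, deltas


def rebuild_from_deltas(base, deltas, index):
    """Replay the first `index` deltas on top of the base version."""
    files = dict(base)
    for delta in deltas[:index]:
        files.update(delta)
    return files


def store_as_snapshots(versions):
    """Record every version as a full picture of all files."""
    return [dict(version) for version in versions]


versions = [
    {"a.txt": "one", "b.txt": "x"},
    {"a.txt": "two", "b.txt": "x"},
    {"a.txt": "two", "b.txt": "y"},
]
base, deltas = store_as_deltas(versions)
snapshots = store_as_snapshots(versions)

# With deltas, version 2 has to be rebuilt; with snapshots, it is simply looked up.
print(rebuild_from_deltas(base, deltas, 2) == snapshots[2])
```

In the delta model, getting any version back means replaying changes on top of the base, while in the snapshot model each version is available directly.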

  2. Nearly every operation is local

    Most operations in Git only need local files and resources to operate. Since the whole codebase and history of your project are available on your local machine, most Git operations seem almost instantaneous.

    This also means your work is not hindered if you are offline. If you are travelling somewhere and you wish to get some work done, you can do so without worries as all the magic happens locally. How cool is that!

  3. Git has integrity

    Everything in Git is checksummed before it is stored and is then referred to by that checksum. This means it's impossible to change the contents of any file or directory without Git knowing about it. You can't lose information in transit or get file corruption without Git being able to detect it. The mechanism Git uses for this checksumming is the SHA-1 hash: a 40-character string of hexadecimal characters (0-9 and a-f), calculated from the contents of a file or directory structure. It looks something like this:

24b9da6552252987aa493b52f8696cd6d3b00373

    In fact, Git stores everything in its database not by file name but by the hash value of its contents.
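The idea of naming content by its checksum can be sketched in a few lines of Python. This goes slightly beyond what is described above: it assumes Git's default SHA-1 object format, in which a file (a "blob") is hashed together with a small header made of the word `blob`, the content length in bytes, and a NUL byte. The function name `git_blob_hash` is made up for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 ID for a file ("blob") with the given content.

    Assumes Git's default SHA-1 object format: the hash covers the header
    b"blob <size>\\0" followed by the raw file content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_hash(b"hello\n")
changed = git_blob_hash(b"hello!\n")

print(original)             # a 40-character hexadecimal string
print(original == changed)  # any change to the content changes the ID
```

If you have Git installed, the value printed for a given content should match what `git hash-object` reports for a file with exactly the same bytes. Because the ID depends only on the content, two files with identical contents get the same ID regardless of their names.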

  4. The three states

    Git has three main states that your files can reside in: Modified, Staged and Committed.

    • Modified means that you have changed the file but have not yet staged or committed those changes.

    • Staged means that you have marked a modified file in its current version to go into your next commit snapshot.

    • Committed means that the data is safely stored in your local database.

The basic Git workflow goes something like this:

1.  You modify files in your working tree.

2.  You selectively stage those changes you want to be part of your next commit, which adds only those changes to the staging area.

3.  You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently in your Git directory.
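The three steps above can be sketched as a toy model in Python. This is a deliberately simplified illustration, not Git's actual data structures: the `ToyRepo` class and its method names are hypothetical, and file contents are plain strings held in memory.

```python
class ToyRepo:
    """A tiny in-memory model of the working tree, staging area and commits."""

    def __init__(self):
        self.working_tree = {}  # file name -> content you are currently editing
        self.staging_area = {}  # file name -> content queued for the next commit
        self.commits = []       # stored snapshots, oldest first

    def modify(self, name, content):
        """Step 1: change a file in the working tree."""
        self.working_tree[name] = content

    def stage(self, name):
        """Step 2: copy the file's current content into the staging area."""
        self.staging_area[name] = self.working_tree[name]

    def commit(self):
        """Step 3: store the staging area as a new snapshot."""
        snapshot = dict(self.staging_area)
        self.commits.append(snapshot)
        return snapshot


repo = ToyRepo()
repo.modify("a.txt", "hello")
repo.modify("b.txt", "draft")
repo.stage("a.txt")        # only a.txt is selected for the next commit
snapshot = repo.commit()

print(sorted(snapshot))    # b.txt was modified but never staged, so it is not included
```

The point of the sketch is that a commit records what is in the staging area, not whatever happens to be in the working tree at that moment.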

To summarize, if a particular version of a file is in the Git directory, it's committed. If it has been modified and added to the staging area, it is staged. And if it was changed since it was checked out but has not been staged, it is modified.
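The summary above can also be read as a small decision rule. The sketch below is a simplification with a made-up function name, `file_state`: it compares one file's content in the working tree, the staging area and the last commit. In real Git a file can have some changes staged and others not; this toy version simply reports "modified" first in that case.

```python
def file_state(working, staged, committed):
    """Classify one file by comparing its content in the three places."""
    if working != staged:
        return "modified"   # changed since checkout, change not staged
    if staged != committed:
        return "staged"     # change is in the staging area, not yet committed
    return "committed"      # this version is already stored in the Git directory


print(file_state("v2", "v1", "v1"))  # edited, not yet staged
print(file_state("v2", "v2", "v1"))  # staged, waiting for a commit
print(file_state("v2", "v2", "v2"))  # safely committed
```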

If you have come this far, I really appreciate your effort. So, leave with a smile on your face!

References:

  1. Pro Git by Scott Chacon and Ben Straub, Second edition

Stay tuned for more articles on this topic. 😎
