Git Is Your Friend not a Foe Vol. 1: Distributed

Recently, I’ve been preaching Git to everyone who uses inferior version control software (like SVN or, pardon me, CVS). But for some reason the main obstacle I see is that these people are so used to the SVN workflow that they don’t see the magnificence and flexibility Git offers. Most of them are able to read http://whygitisbetterthanx.com/ and acknowledge the fact that more and more projects are switching over to it.

But still, many of them don’t grasp the benefits Git gives, falling back to the classic centralized edit-and-commit-to-server workflow of SVN and whining that “this stupid Git didn’t commit changes in that file; this stupid Git complains about ‘non fast-forward’; this stupid Git ate my kittens; etc.” I would like to clear a few things up and introduce them to a better world.

First of all, Git is a distributed version control system. What does that mean? In a classic VCS you have a single holy place called The Repository, where all the project’s history is kept. Developers get only a small fraction of the information from it: the actual files from the latest revision (termed the “working copy”, which is obviously an exaggeration). Basically, the only thing an SVN client is able to do is compare your files with the latest revision and send this diff to the server. In SVN, communication is possible only between The Repository and the puny client with the working copy.

[Figure: SVN]

In contrast, Git does not differentiate His Holiness The Repository from mere mortal working copies. Everyone gets a repository of their own. Everyone can do anything they want with it. Each developer can communicate with any other developer. This gives a developer so much freedom that he often does not grasp it at first, and simply asks:

Uhm, an entire development history? With every working copy? Man, that will eat a lot of disk space! And I can’t even imagine how long it will take to check out that repository!

Well, first, not checkout, but clone. Checkout in Git is a somewhat different operation, and that is the Git club entry fee: you need to lose your centralized VCS habits and get used to new terms and ways. This can be painful at first, but it pays off in the end. You’ll thank me later.

So, back to the repository size. Yes, Git requires you to have the whole repository on your person. Yes, it does increase your project directory size. But Git is extremely efficient at packing stuff, so that increase should not hurt you. In fact, a whole Git repository (with full project history) often takes less space than an SVN checkout of the same project, since the SVN working copy also keeps a pristine copy of every file. And SVN’s checkout process is so inefficient that for many projects a Git clone takes less time than an SVN checkout.

Okay, now the next question is: what is so cool about having the whole repository along with project files? Well, the most basic advantage is that a developer can do everything without access to the server, i.e.:

  • view the revision log starting from the very first commits;
  • browse old versions of the project;
  • and more importantly, commit his changes.
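
To make that list concrete, here is a minimal sketch that does all three things in a throwaway repository, with no network involved at any point. The directory name, file name and commit messages are made up for illustration; only the Git commands themselves are real.

```shell
#!/bin/sh
# Sketch: every command below talks to the local repository only.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
git init -q project
cd project

# Build a tiny history of two commits.
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second draft" > notes.txt
git commit -q -am "Revise notes"

# 1. View the revision log, all the way back to the first commit.
commit_count=$(git rev-list --count HEAD)
git log --oneline

# 2. Browse an old version of a file without touching the working tree.
old_text=$(git show HEAD~1:notes.txt)

# 3. Commit a new change, still without any server.
echo "third draft" > notes.txt
git commit -q -am "Revise notes again"
new_count=$(git rev-list --count HEAD)
```

Nothing here needs a remote to be configured at all; publishing the commits is a separate step you take whenever you are ready.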

It is nice to be able to browse the history without Internet access, especially for people with a slow link or people who travel a lot. But being able to commit things without asking anyone’s permission is so important that it’s worth a separate paragraph. Here it goes.

Most software teams recognize two simple principles that a developer should follow: keep commits atomic and don’t commit bad stuff. The problem is that centralized VCSs make these principles incompatible. People just don’t work in a linear, discrete fashion; instead they tend to jump between several things: a touch there, a refactor here, an occasional stupid bug fix. In the end you get a working tree with a bunch of unrelated, uncommitted and untested changes. In Git you can commit as often as you want, because commits are local to your repository and no one sees them except you until you publish them. You can commit total rubbish and test everything later; you can edit every single commit before anyone else sees it, without fear of embarrassment and humiliation. You can find out that the way you started to implement this killer feature everyone wanted is totally wrong and start from scratch, without spoiling the project’s version history.
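
Here is one way this can look in practice, as a rough sketch rather than the one true workflow. It makes two sloppy checkpoint commits, folds them into a single clean commit with git reset --soft, and then touches up that commit with git commit --amend. File names and messages are invented for the example, and this kind of rewriting is only safe for commits you have not yet shared with anyone.

```shell
#!/bin/sh
# Sketch: tidy up private commits before anyone else sees them.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
git init -q scratch
cd scratch

echo "base" > feature.txt
git add feature.txt
git commit -q -m "Start project"

# Two quick checkpoints made while experimenting.
echo "half-finished idea" >> feature.txt
git commit -q -am "WIP: rubbish, do not look"
echo "working version" >> feature.txt
git commit -q -am "WIP: seems to work"

# Fold both checkpoints into one clean commit. --soft moves the branch
# back two commits but keeps all the changes staged.
git reset -q --soft HEAD~2
git commit -q -m "Add feature"

# Forgot a file? Amend the latest commit instead of adding a new one.
echo "documentation" > README
git add README
git commit -q --amend -m "Add feature with documentation"

final_count=$(git rev-list --count HEAD)
last_subject=$(git log -1 --format=%s)
```

The published history ends up with two tidy commits, and the "rubbish" checkpoints never leave your machine.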

The second advantage is that developers can exchange their revisions with each other without the central server. Imagine John having reworked the main loop of a nuclear reactor’s coolant control computer. He doesn’t want to incorporate this change into a live system, so he asks Fred to download the respective changes from his repository and test them on his nuclear plant in a less populated area. After not having heard any loud explosions, John knows that at least one plant survived the change.
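
A minimal sketch of that exchange, with two local directories standing in for John’s and Fred’s machines (in real life Fred would use an ssh:// or git:// URL instead of a path). The branch names coolant-rework and test-coolant and the file contents are assumptions made up for the example.

```shell
#!/bin/sh
# Sketch: Fred fetches John's work directly, with no central server.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# John's repository with the current main loop.
cd "$work"
git init -q john
cd john
echo "loop v1" > mainloop.txt
git add mainloop.txt
git commit -q -m "Initial main loop"

# Fred already has his own clone of the project.
cd "$work"
git clone -q "$work/john" fred

# John reworks the main loop on a topic branch in his own repository.
cd "$work/john"
git checkout -q -b coolant-rework
echo "loop v2" > mainloop.txt
git commit -q -am "Rework main loop"
john_head=$(git rev-parse HEAD)

# Fred fetches just that branch from John and checks it out for testing.
cd "$work/fred"
git fetch -q origin coolant-rework
git checkout -q -b test-coolant FETCH_HEAD
fred_head=$(git rev-parse HEAD)
fred_text=$(cat mainloop.txt)
```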

You can also benefit from this even if you are the only developer. Imagine you have several different computers (for example Mac, Linux x86 and Linux amd64). You have developed something on your Mac box, tested it thoroughly, and are ready to push it to the main repository. But you may also push it first to your Linux boxes and test it there. In SVN you would have to generate a patch, transfer it to the boxes, and apply it. Everything manually. So you most probably wouldn’t bother at all, would discover that nasty bug that occurs only on 64-bit computers only two months later, and would lose your job.
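
Roughly, this could be set up as below. The sketch uses two bare repositories in a temporary directory as stand-ins for the Linux boxes, because by default Git refuses a push into the currently checked-out branch of a non-bare repository. The remote names x86 and amd64 and all paths are invented for the example; on real machines the remotes would be ssh URLs.

```shell
#!/bin/sh
# Sketch: push the same commit to several test machines before publishing it.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-ins for the repositories on the two Linux boxes.
git init -q --bare "$work/linux-x86.git"
git init -q --bare "$work/linux-amd64.git"

# The repository on the Mac, where the work was done.
git init -q mac
cd mac
echo "portable code" > main.c
git add main.c
git commit -q -m "Add main.c"
branch=$(git symbolic-ref --short HEAD)

# Register each box as a remote once, then push to all of them.
git remote add x86 "$work/linux-x86.git"
git remote add amd64 "$work/linux-amd64.git"
for box in x86 amd64; do
    git push -q "$box" "$branch"
done

mac_head=$(git rev-parse HEAD)
x86_head=$(git --git-dir="$work/linux-x86.git" rev-parse "$branch")
amd64_head=$(git --git-dir="$work/linux-amd64.git" rev-parse "$branch")
```

On each box you would then check the branch out from its repository and run the tests there.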

[Figure: Git]

Finally, the concept of a “central repository” may be eliminated altogether. Every developer gets a “public” repository where he keeps the stuff he is not ashamed of and a private repository where he works as he wants. Or a bunch of private repositories. The developers exchange their work by pulling commits from each other’s public repositories. Or they can have a single lead developer who collects the good commits, and use his repository as the “blessed” repository. The lead developer either watches for changes in the other public repositories, or waits for a “merge request”. A merge request is a message (traditionally an e-mail) that says something along the lines of “Hey, Sam, I’ve implemented the automatic road crosser for blind one-legged homosexuals, ‘git pull git://acmesoftware.com/~dave/shiny.git crosser’, love, Dave”. Sam copies and pastes the command, which merges Dave’s crosser branch into whatever branch Sam has checked out, tests the result, and then pushes it to his blessed public repository.
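
Sam’s side of such a merge request might look like the sketch below. Local directories stand in for Sam’s and Dave’s repositories, so the git:// URL from the e-mail becomes a path, and the review-crosser branch is my own addition: since git pull merges into the current branch, Sam can create a scratch branch first to keep his main line untouched while he tests.

```shell
#!/bin/sh
# Sketch: the lead developer acts on a merge request.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# Sam's repository.
cd "$work"
git init -q sam
cd sam
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial version"

# Dave clones it and implements his feature on a branch named crosser.
cd "$work"
git clone -q "$work/sam" dave
cd dave
git checkout -q -b crosser
echo "crosser" > crosser.txt
git add crosser.txt
git commit -q -m "Implement crosser"

# Sam receives the merge request and pulls Dave's branch into a
# scratch branch for testing.
cd "$work/sam"
git checkout -q -b review-crosser
git pull -q "$work/dave" crosser
pulled_subject=$(git log -1 --format=%s)
```

If the tests pass, Sam merges review-crosser into his main branch and pushes that to the blessed repository.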

For large projects (for example, Linux) the lead developer has several people responsible for specific subsystems (the so-called Lieutenants). They collect the small commits from their fellow developers, test them and forward them to Linus, who aggregates all the good stuff in his own repository. This ensures that the code is seen by at least one other person before it gets stored in the repository and completely forgotten.

The aforementioned site has a nice section about different Git workflows (see under Any workflow) with pictures.

Also, a nice side effect of Git being a distributed system is that every repository is essentially a backup of the main repository. It doesn’t mean you should not do backups (you should!); it just means that in case everything crashes and burns, any developer can provide you with the full revision history, not only the recent project files.
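
As a rough illustration of that disaster scenario, the sketch below deletes the “main” repository and rebuilds it from a developer’s clone. All directory names are made up. Keep in mind that a clone only carries what it has actually fetched, so the recovered history is as fresh as the developer’s last fetch.

```shell
#!/bin/sh
# Sketch: recover the main repository from any developer's clone.
set -eu

# Placeholder identity for the demo commits.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)

# The main repository with two commits of history.
cd "$work"
git init -q central
cd central
echo "one" > file.txt
git add file.txt
git commit -q -m "First"
echo "two" > file.txt
git commit -q -am "Second"

# A developer clones it at some point.
cd "$work"
git clone -q "$work/central" developer

# Everything crashes and burns.
rm -rf "$work/central"

# Rebuild a bare main repository from the developer's clone.
git clone -q --bare "$work/developer" "$work/restored.git"
restored_count=$(git --git-dir="$work/restored.git" rev-list --count HEAD)
```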

There are some more things that confuse novice users, especially branches and the staging area. I shall cover them in the following posts, stay tuned!


Comments: 27

Hey, my name is Bono. Your article is so offensive, it made me drop my shades. And I still haven’t found what I’m looking for.

What now?

Bono , 14:12 (after 7 days)

It’s me again, Bono. Now The Edge is getting upset too, and threatens to throw his guitar. You really don’t want that. Please do something ASAP, and also get my shades back.

Cheers, Bono (from U2!)

Bono , 12:04 (after 9 days)

Hi,

distributed VCS certainly provide nice features. But we should not forget some things:

  1. Git has small disk usage and is fast. That may be true for simple branches; I think it could be better without the full history, but it is okay because it is comfortable (e.g. you do not have to type your password to see the log). But you cannot check out the whole KDE repository, that would be too much. Therefore there are smaller repositories represented by dozens of .git-files. But KDE is not a .git-file collection for me. KDE has a structure, and a VCS, distributed or not, should be able to represent structures. That is why I like the svn commands “svn up -N” and “svn co -N”. I have got a proper structure of the KDE project on my disk (svn co -N svn.kde.org/home/kde/trunk) without having to download everything. And CMake partly supports this system (optional_add_subdirectory). A heap of projects is simply ugly. It would be perfect if it would also be possible to display multiple branches in the same directory/to switch between them.
  2. It is nice to provide the same features for servers and peers. But that does not mean, it is bad to provide some syntactic sugar for the central server. Git thinks there is no central branch, but in reality there is, and it is good for us if we share a central branch and the VCS knows that and supports us with some syntactic sugar. Feature branches may be useful, but they can also become a risk.
  3. Unfortunately we had no choice. Some projects moved to git and now everybody has to follow.
  4. Now git is a fact for KDE. Thank you for your articles, I really need to learn more about git.

Jonathan

Jonathan , 18:52 (after 7 days)

Thanks for your input. It’s true that SVN’s recursive nature allows forming meta-repositories such as KDE’s, and there is a certain charm to it. But in my opinion the ends do not justify the means. Git’s storage efficiency and speed far outweigh the directory structure bonuses.

Also, there is ongoing development of git-submodules, but I am not sure they will become useful soon.

A heap of projects is simply ugly.

I fail to see how it is ugly.

It would be perfect if it would also be possible to display multiple branches in the same directory/to switch between them.

If you clone your repository on the same machine (i.e. git clone amarok amarok-copy), Git hardlinks its internal files, so you can get several directories with no storage penalty at all. So it is possible in Git, and it is better in Git.

It is nice to provide the same features for servers and peers. But that does not mean, it is bad to provide some syntactic sugar for the central server.

There is a syntactic sugar: “git push” and “git pull” with no arguments use “origin” as remote.

Edward , 20:47 (after 7 days)

I do not like such project heaps, because I want to see KDE in a structured way; it is structured. An example: maybe I do not want to download the whole kdesdk, I just want to compile kapptemplate, and then I download the rest. And I do not have to recompile kapptemplate, because CMake supports me. Or I want to try all playground plugins for KDevelop, etc. Some structure is always good. The hardlink stuff sounds very nice (it is also supported by other DVCSs, iirc), but I meant managing different branches in the same recursive structure. That is simply a wish-list item. For example, I would like to be able to easily compare files with the stable version, but I think there is nothing like this in SVN; maybe in bzr (I have seen something like that), maybe in git (I do not know). I am no VCS pro, and I am more often happy with my structured KDE checkout than complaining about missing merging features. (It would be wrong to say the same for local commits: I do not miss them all the time, but I would use them very often.) And I am very “angry”, because other DVCSs do not use these stupid .git-files but a recursive structure.

Syntactic sugar: Yes, I said nonsense. We should simply take care that there are not too many unmaintained branches and the main repository is used as main repository. Maybe git is not that difficult, I believe, and it is just too different.

Jonathan , 21:14 (after 7 days)

a working copy is a checkout, and that’s not different in git at all.

what you meant are revisions, and as it happens, they are not different in git, either (only that they may have multiple parents).

the latest revision on a branch may be called the tip (e.g., in mercurial speak) or a head. as it happens, git even uses CVS’s capitalization of that name …

but, duh, you seem to already know that, judging by the second article …

Ossi , 19:07 (after 7 days)

Have you ever tried SVN’s “-N”-parameter? My block “2” is about habits, I know that you can do things like checkouts in git.

Jonathan , 19:46 (after 7 days)

nice article :)

for KDE, i don’t think having “no central repository” would be a good thing at all. having more peer-to-peer communication can be great, but given our workflows and how we work with others outside of KDE, having a diffuse cloud of repositories would be really difficult for us.

this is based on talking with our downstream packagers and many application developers about this exact issue.

Aaron Seigo , 02:32 (after 8 days)

for KDE, i don’t think having “no central repository” would be a good thing at all. having more peer-to-peer communication can be great, but given our workflows and how we work with others outside of KDE, having a diffuse cloud of repositories would be really difficult for us.

Yes, I agree here. That is better suited to projects that have a solitary dictator, such as Linux, or proprietary projects.

Edward , 07:18 (after 8 days)

The problem is, you and other git advocates keep telling us “you don’t like git because you aren’t using it right, you should change your workflow”, but what if we don’t want to change the workflow and have good reasons for it? A big project like KDE will always have a centralized server and I don’t see why it’s a mistake to push your changes to the central server as soon as you’re done. You don’t need local commits because you should just commit to the centralized server whenever you want to commit. You should never be ashamed of your work. “Release early, release often” also applies to SCM commits. I normally don’t have uncommitted changes in my SVN or CVS working copies: before I leave the computer, I commit. The big problem is that by switching the central repositories to git, you’re forcing your workflow (which isn’t necessarily one which should be encouraged; IMHO, local commits are actually a very bad thing, cooperative development means you should publish your work even when it’s not finished so anyone can use the parts that are already there and/or help finishing the work) onto everyone.

Kevin Kofler , 04:42 (after 8 days)

PS: And I don’t commit only when I leave the computer, I tend to commit after every small change, just like git users do, except it’s a commit to the central server (and of course commits don’t get “squashed”, SVN/CVS don’t support it and IMHO it’s a bad thing to rewrite History anyway).

Kevin Kofler , 04:45 (after 8 days)

You don’t need local commits because you should just commit to the centralized server whenever you want to commit.

What if I want to commit in an airplane?

I normally don’t have uncommitted changes in my SVN or CVS working copies: before I leave the computer, I commit.

If you commit perfect code every time and don’t even flinch, then I sincerely envy you.

workflow […] isn’t necessarily one which should be encouraged; IMHO, local commits are actually a very bad thing

[…]

it’s a bad thing to rewrite History anyway

Are those opinions based on experience in a project that uses Git? Somehow, most projects tend to keep only good and tested commits in the History. Many proprietary development models also include some form of code review by peers. Developers are (mostly) merely humans.

What about new developers? To gain repository write access one obviously should prove himself worthy by committing code, but that is impossible because he doesn’t have access. What should he do? Send patches, which obviously conform neither to “release early, release often” nor to “keep commits atomic”?

I didn’t say “pick that specific workflow”, I said “see how many wonderful workflows there are”.

Edward , 07:54 (after 8 days)

Actually, I love the interim commit model. I’m in the middle of refactoring a core library using svn. Committing it to trunk after each interim step is unthinkable as it would be broken for some scenarios and probably either break all the clients or require them to be changed each time. Not acceptable. Instead I like to complete one round of refactoring to change one thing, then commit that change so I have a checkpoint to fall back to if the next round doesn’t work so well. The svn solution is a branch in playground then try to merge back into trunk. The git solution is local commits, then commit the branch to trunk when it doesn’t break everything else, and decide if you want to keep the full interim commit history at that stage.

The big danger of working this way is that no one else sees what you are working on, which is where Gitorious will come into play. If you push your work branches there on a regular basis and keep your module informed of what you are doing, I don’t see how that is different from doing something in playground.

John Layt , 11:26 (after 8 days)

No, playground is different. The playground is not personal. There you can find plugins and some libraries and applications which are totally unstable. Some of them are unmaintained, some of them are under active development by some people and will move to review, extragear or KDE in a few months. But I like local commits. Sometimes I perform changes breaking compilation. Of course I do not want to commit them to trunk, but I want to save my work.

openid , 12:32 (after 8 days)

It was a very good article, then I read the “fat QA woman” example, you lost me there. It is highly offensive.

Ana , 00:27 (after 28 days)

He, clearly ppl are different — I found the fat QA woman and the road crosser for blind one-legged homosexuals very funny, just like the nuclear reactor one. While offending people is a bad thing, I’m not sure if such clearly humorous comments should be read as insults. Let’s not be too politically correct all the time.

Actually, I think if someone takes those as an insult it says far more about that person than it says about the writer — a lack of humor is more of an issue than the ability to make jokes freely :D

jospoortvliet , 14:44 (after 44 days)

Hi,

Thanks for the useful articles, with the move to git we will be needing all the education we can get :-) Could I just ask though that you tone down the references like “Fat woman from QA” and “blind one-legged homosexuals” which may cause offence to some people in the community. The posts are also going out on the planet, the public face of the KDE community, and will likely be picked up by other aggregators given how useful they are, and phrases like that are not the sort of image we want to project.

Thank you.

John Layt , 11:59 (after 7 days)

Sure, no problem! It’s just I didn’t mean them to go to Planet KDE when I wrote them :)

Edward , 13:17 (after 7 days)

I agree with John. It’s very offensive. And it’s still there.

Elvis Stansvik , 13:46 (after 7 days)

Agreed here also, it greatly detracts from an otherwise good article. And referring to people who prefer subversion as stupid and whiny is not going to win them over, quite the opposite. Plus it makes you appear like a self righteous git.

Lindsay , 22:17 (after 7 days)

Another +1 to getting rid of the offensive references. If you have time to comment you have time to edit the above picture and text. Please do so ASAP.

Mike Arthur , 13:29 (after 8 days)

If it’s no problem, why is it still there? And furthermore, it’s offensive anyways, so whether you intended it to go to Planet KDE should be irrelevant.

Disgusted , 20:05 (after 8 days)

As a proponent of free speech, I would be very offended if the text were changed in order to satisfy the interests of those reading or syndicating it. I request that you leave it the way it is, or make changes based on your own personal preferences rather than those of others. It should be the decision of others whether or not to continue reading your blog, and they should not be allowed to censor your freedom of expression; nor should you be relied on to determine what is or is not offensive to others. What if you replace the reference to “homosexuals” with “elephants” and are subsequently prevailed upon by animal rights activists for the same problem? Is “one-legged stool” okay? Or could that be considered an offensive fecal reference? Where does it all end? Must all examples contain spam, so to speak? Is “blind one-legged spam” acceptable? What if the religious are offended by the idea of a layperson “blessing” something?

If other sources want to reference your post without damaging the sensitivities of certain parties, they can presumably redact it to a state which will meet their requirements, or simply attach a disclaimer emphasizing that they are not responsible for the content therein. I notice that your blog is deployed using git; if your repo is publicly-accessible, they could perhaps (depending on your licensing arrangements) clone it and create their own branch, edited explicitly for their intended audience. Of course they might have to read the offensive bits of this post in order to learn how to do that.

I also have to say that I’ve lost a bit of respect for the KDE project based on what I’ve read here: I didn’t expect that its representatives would attempt to pressure someone to change their work simply because they had chosen to carry it. Of course there’s no guarantee that the earlier commenter actually represents KDE.

I don’t want to imply that I’m opposed to people mentioning to Edward that they find his examples offensive; on the contrary, it’s great for him to be so easily made aware of this. However implying that he “should” change his text, or demanding that it be done, because of decisions made by others to read or syndicate his blog seems entirely contrary to the principles of individual freedom upon which free software is based.

That being said, it would be great for people who are offended by such expressions to have an organized way of registering their feelings with the poster in an inobtrusive way; perhaps blogging systems should have an “offensiveness” plugin that would allow various parties to isolate and vote on the offensiveness of particular phrases. That way at least Edward would be able to get an idea of the percentage of audience members he was potentially alienating, rather than wondering if perhaps comments to that effect were from a vocal minority. This information could even be tracked and used to produce more advanced RSS readers and web browsers that present warnings for certain pages based on the reader’s previously-registered instances of offence.

Ted , 21:18 (after 43 days)

Thanks, Ted!

I have never and will never offend anyone in particular, so I exercise my right to ignore any groundless demands to remove or alter my blog posts.

I’m glad that there are still people that understand that.

I haven’t removed comments with ridiculous demands partly for the same reason — freedom of speech. And partly for the amusement of people who understand that I meant my posts to help and entertain people, not to offend them. See, for example, the Bono comments below.

Regarding the “offensiveness” plugin: I think the very non-existence of such a plugin proves that no one gives a duck about so-called offensiveness, except the vocal minority.

Thanks again for your support!

Edward , 22:19 (after 43 days)

I have never and will never offend anyone in particular

Bollocks, you quite obviously did. And no one would have cared if it was just your own personal blog but it went out on the general KDE feed — as said earlier the public face of KDE.

As a proponent of free speech

Oh please, the generalised catch phrase for children who don’t know the difference between censorship and being polite. Both you and the parent poster need to grow up, learn the rules of civilised discourse and get over your precious selves.

As said earlier it was an excellent article spoiled by gratuitous and irrelevant abusive references. No one wants to censor you, just see a good and useful article published on the wider community not be spoiled by this.

Lindsay , 22:40 (after 43 days)

Free speech is a right, and Hades has the right to express his views up to the point where they may be considered incitement. As distasteful as I may find his words, he comes no-where near crossing that boundary so he has the right to let his post stand and have his words speak for themselves.

Posting on Planet KDE is not a right, it is a privilege, one granted on the condition that the posters behave in a way acceptable to the community as is clearly set out by the Planet KDE Guidelines and the KDE Code of Conduct. Hades requested to be syndicated on the planet and so agreed to play by those rules.

This post was a clear breach of the Guidelines and Code. Hades acknowledged that he didn’t intend for the post to go out on the Planet, which indicates to me he knew that it would break the rules. At that point I think he had a clear choice to either edit the post to conform, or to remove the post from the Planet. That he chose to do neither, I think, shows a lack of respect for the community, and perhaps some enjoyment of his brief notoriety.

KDE is a massive and diverse community that survives only because of the mutual respect we have for each other, and by people consciously choosing to get along rather than unnecessarily stepping on each other’s toes (and that includes dealing politely with people when they break the rules). We’d like to keep it that way.

John Layt , 10:07 (after 44 days)
