Posts tagged ‘git’

Contributing to git projects on github

I spent some time this week helping a colleague learn the ins and outs of Git rebase. He mentioned that he could not find a git introduction that summarized the basic workflow you might use when contributing to an open source project on Github. This article attempts to fill the gap.

Note that there are many possible ways to interact with git and github and none of those ways are canonically correct. The steps here will help a newcomer to git or to collaborative editing on git. I outline the steps in short form first, then summarize alternatives below.

  1. Fork the remote project on github.
  2. Clone your new fork.
    git clone ssh://git@github.com/[yourname]/[reponame]/
    cd [reponame]
  3. Add a shortcut to upstream.
    git remote add upstream https://github.com/[developername]/[reponame].git
  4. Optionally choose a branch to hack on.
    git checkout development_branch_name
  5. Create a local copy of that branch.
    git checkout -b working_branch_name
  6. Hack.
  7. Commit your changes
    git add [-p] filenames
    git commit

    Repeat the hack,add,commit steps until your commits look the way you want them to.

  8. Grab any commits that were added upstream while you were working.
    git checkout development_branch_name
    git pull upstream development_branch_name
  9. Rebase your work onto the end of the development branch.
    git checkout working_branch_name
    git rebase development_branch_name
  10. Push your commits to your github fork for the world to see.
    git push origin working_branch_name
  11. Make a pull request on github. Pick development_branch_name as the one you want your code pulled into, and working_branch_name as the one you are pulling from in your repo.
  12. Oh nos! The upstream devs didn’t like your changes. That’s fine. Repeat the hack/commit step as needed. You may be able to just append commits to your previous commits. If you need to alter your history (e.g. because they don’t like how one of your commits looks), check the man pages for git commit --amend and git rebase -i.
  13. Once you have the working_branch in a state you like, repeat the remote pull and rebase steps to ensure your commits “look like” they happened at the end of the remote developer’s branch.
  14. Push your changes to your github fork, forcing an overwrite of any existing changes.
    git push -f origin working_branch_name
  15. Check your pull request on github. It tracks working_branch_name, so the existing pull request is updated with the rewritten commits and there is no need to open a new one.
  16. Your code has been accepted upstream, congratulations on your contribution to an important open source project!
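
The steps above can be rehearsed end to end without touching github. What follows is a minimal sketch, assuming two local bare repositories (the hypothetical upstream.git and fork.git) stand in for the upstream project and your fork, and a scratch repository named dev plays the upstream developer. The commit_line helper, the demo identity, and the file names are made up for illustration.

```shell
#!/bin/sh
# Rehearsal of steps 1-10 in a throwaway directory.
# Assumption: upstream.git and fork.git are local stand-ins for github repos.
set -e
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: append a line to a file in a repo and commit it.
commit_line() {  # usage: commit_line <repo> <file> <text>
    echo "$3" >> "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "$3"
}

# The upstream developer publishes a development branch.
git init -q --bare "$sandbox/upstream.git"
git --git-dir="$sandbox/upstream.git" symbolic-ref HEAD refs/heads/development_branch_name
git init -q "$sandbox/dev"
git -C "$sandbox/dev" symbolic-ref HEAD refs/heads/development_branch_name
commit_line "$sandbox/dev" README "initial upstream commit"
git -C "$sandbox/dev" push -q "$sandbox/upstream.git" development_branch_name

# Step 1: "fork" the project (a bare clone stands in for the github fork).
git clone -q --bare "$sandbox/upstream.git" "$sandbox/fork.git"

# Steps 2-5: clone the fork, add upstream, create the working branch.
git clone -q "$sandbox/fork.git" "$sandbox/work"
cd "$sandbox/work"
git remote add upstream "$sandbox/upstream.git"
git checkout -q development_branch_name
git checkout -q -b working_branch_name

# Steps 6-7: hack and commit.
commit_line . feature.txt "my change"

# Meanwhile, upstream moves on.
commit_line "$sandbox/dev" README "later upstream commit"
git -C "$sandbox/dev" push -q "$sandbox/upstream.git" development_branch_name

# Step 8: bring the local development branch up to date.
git checkout -q development_branch_name
git pull -q --ff-only upstream development_branch_name

# Step 9: replay your commits on top of the updated branch.
git checkout -q working_branch_name
git rebase -q development_branch_name

# Step 10: publish the working branch to the fork.
git push -q origin working_branch_name
git log --oneline
```

After the rebase, the development branch is an ancestor of the working branch, which is the property that lets upstream fast-forward.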

Notes

  1. (Step 4) This represents a branch the upstream developer has pushed to their remote repository. The branch chosen really depends on where you want to develop. Some projects do all their development on master. Others push a remote branch named ‘devel’ or branches named after the feature being developed on that branch. Either way, your ultimate goal is to have your commits added to this branch on the remote repository.
  2. (Step 5) It is possible to do your development on a single branch and rearrange commits as needed. However, branching is free in git, and storing your changes on a different logical branch will really help you identify where everything fits together.
  3. (Step 7) Each commit should contain one logical change. A commit message should not contain the word “and”. If you have been working on two logical changes at the same time, you can stage only the relevant hunks using git add -p.
  4. (Step 8) You’re checking out the original branch here. You don’t want to pull all the foreign commits into the branch you’ve been developing on; that branch should only contain your commits at this time.
  5. (Step 9) This is the same as “moving” all your commits to the end of the development branch. The idea is to have your commits as a “tail” that all come after the upstream changes. Upstream will love this because they can do a fast-forward merge of your commits. Hopefully there aren’t any conflicts. If there are, resolve them so that your commits can apply cleanly. You’ll want to review the rebase documentation before continuing. Also, use three-way merging; it makes life so much easier.
  6. (Step 14) There is a philosophical debate as to whether it is acceptable to rewrite public history, as Dieter has discussed in detail. In this case, since you have communicated (via the pull request) an intent to rewrite history, it’s not only acceptable, but desirable. Most upstream developers will consider it good form to edit older commits so that they reflect what the history “should have” looked like, so that people reading your history can understand it.
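
The history rewrite discussed in the last note can be rehearsed safely. This is a minimal sketch, assuming a local bare repository (the hypothetical fork.git) stands in for your github fork. It uses git push --force-with-lease, a stricter variant of the -f flag from step 14: it refuses to overwrite the remote branch if that branch no longer matches your remote-tracking copy. The demo identity and file names are made up.

```shell
#!/bin/sh
# Amend an already-pushed commit, then overwrite the copy in the fork.
# Assumption: fork.git is a local stand-in for the github fork.
set -e
sandbox=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare "$sandbox/fork.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/working_branch_name
git remote add origin "$sandbox/fork.git"

# First attempt, pushed for review with a typo in the message.
echo "first draft" > feature.txt
git add feature.txt
git commit -q -m "Add feture"
git push -q origin working_branch_name

# The reviewers object, so rewrite the commit locally.
git commit -q --amend -m "Add feature"

# A plain push is rejected because the histories have diverged.
if git push -q origin working_branch_name 2>/dev/null; then
    echo "plain push unexpectedly succeeded"
else
    echo "plain push rejected, as expected"
fi

# Overwrite the fork's branch, but only if nobody else has moved it.
git push -q --force-with-lease origin working_branch_name
```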

Learning hg as a git user

As my friend Jason Chu recently noted, I am primarily a git user who has discovered a need to understand and use Mercurial. I am trying to refrain from judgment on Mercurial, as I’m easily bored by bikeshed discussions and holy wars. I have a pragmatic “use what you like and let me use what I like” philosophy, but when you are interacting with other people’s code, you occasionally have to use what they like.

I have read several articles discussing the differences, along with cheat sheets of hg equivalents of common git commands, and I do not intend to link to any of them. They are utterly useless. Mercurial and git have different design philosophies, as Jason noted, even though the end result of their usage is much the same. If you’re comfortable with git, and interested in learning Mercurial, you may find my own eureka moment helpful.

hg commit is not the same as git commit.

Most comparisons of git and hg do not notice this distinction, but I was really puzzled by how hg could be more powerful than Subversion and supposedly equally powerful to my beloved git until I came to this realization.

In git, when you make a commit, you are creating local history that can be easily changed, modified, or ratified. You can rebase over those changes as many times as you like. You can use git commit --amend to change the commit or add changes to it. The history is not remotely considered “permanent” until you push it to a public repo, and even then, there are times when it is acceptable to rewrite it.
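
To make the "add changes to it" case concrete, here is a minimal sketch in a throwaway repository. The file names, commit message, and demo identity are made up for illustration.

```shell
#!/bin/sh
# Fold a forgotten file into the previous commit instead of adding a new one.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "print('hello')" > app.py
echo "docs" > README

# Oops: the message mentions README, but only app.py was staged.
git add app.py
git commit -q -m "Add app with README"
before=$(git rev-parse HEAD)

# Stage the missing file and rewrite the commit, keeping its message.
git add README
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)

# Still one commit, now containing both files, under a new hash.
git log --oneline
git ls-tree --name-only HEAD
```

The amended commit gets a new hash, which is why this is harmless on unpublished history and needs care once the commit has been pushed.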

Conversely, in hg, when you commit, you are doing what the word actually says: committing. You are saying “this commit looks the way I want it to, I am finished with it.” You may not be pushing the commit to a remote repo any time soon, you may not be publishing it, but you have written in sandstone that this commit is complete.

I say written in sandstone rather than stone because there are a variety of hg commands and extensions that allow local history editing, rebasing, and rollbacks. I haven’t learned how fluid these extensions are compared to equivalent history modification in git, but the feeling I am getting is that such changes would be considered much more invasive in hg than in git. History editing is a third party extension; this says to me “not officially supported” (as compared to built-in extensions like Mercurial queues). Mercurial typically wants us to think of a commit as an object that is permanently in the history. Many of the other slightly-deeper-than-cosmetic differences between the two systems seem to stem from this same basic difference.

In git, I have gotten quite used to coding first, and then creating an appropriate history later. There are numerous other potential workflows with git, but that’s the one I like. At first, I thought this was impossible or very difficult with Mercurial. However, when I realized that “commit” falls somewhere between the commands “commit” and “push” in git, things started to fall into place.

Mercurial has a powerful tool called queues that allows you to manipulate history to your heart’s content before you call commit. I’ve been using these effectively to create a workflow that I am comfortable with. It’s not the same as what I’d do in git, not remotely, but the overall outcome is similar.

A related basic understanding that is a little better documented than the difference between hg commit and git commit is the following:

hg branch is not the same as git branch.

Once again, hg branch lies somewhere between git branch and git push origin. When you call hg branch, you are stating an intent that the branch will be public. In git, you can have as many unpublished branches as you want. In Mercurial, this behavior is better achieved by the use of bookmarks, although I’ve found that Mercurial queues are easier to work with.

There are many tutorials for new hg users coming from a svn background, and a few tutorials for those coming from a git background. If you are hoping to learn Mercurial effectively, I suggest avoiding most of those options. It is much better to study Mercurial from the perspective of a programmer who hasn’t seen version control before. Such coders don’t exist (I hope!), but this attitude allows you to learn how the new system should be used, not how to make it behave like a system you are previously used to.

For mercurial basics, I strongly recommend http://hginit.com/, an irreverent and entertaining tutorial on the simpler concepts.

I had a lot of trouble understanding hg queues until I read the hg book chapter on the topic. I intend to read the entire red-bean book at some point, as it appears to be much more coherent than the official Mercurial documentation. Now that I’ve been playing with hg queues for a day or so, I have come to understand that they can cover several common git tasks that appear to be missing from hg, including stashing, rebase -i, and similar. The key takeaway is you don’t commit your queues until you are quite certain you want them to become permanent history.

I haven’t yet figured out just when to choose hg queues over hg bookmarks, but a good read for getting used to hg bookmarks can be found here.

I strongly recommend enabling the hgk extension. Just add the following to your ~/.hgrc. This will enable an hg view command that is more similar to gitk than it should be, considering the basic differences between branches in the two systems.

[extensions]
hgk =
mq =
bookmarks =

(The mq line enables hg queues and the bookmarks line enables bookmarks.)

Overall, I suspect that I will always prefer git to hg. However, unlike Subversion, Mercurial does supply me with the tools I need to work effectively. Different tools from git, but effective nonetheless.

One beef I have with both git and Mercurial is that they violate the “one best way to do things” principle, which makes learning, communicating about, and deciding how to use them more complicated than it needs to be.

How I reverted several git commits in a single commit

I hate to publicly admit this, but I recently made four commits that should have been merged into one commit, including two with embarrassing commit messages like, “third commit without testing, for shame!” I’m thoroughly shocked that fellow coder Dan McGee hasn’t already attacked me for my misdemeanor.

Please forgive me. I was tired and in a hurry, and I was working on something that was easier to test on the production server. I most certainly deserve to be attacked by a velociraptor.

To complicate matters, there was a fifth commit in the middle of these four commits that pertained to an unrelated task, and several other users had committed changes after those commits.

Fast-forward to today. Those four commits, made in a hurry, now have to be reverted. As with any task, there are several ways to do this using git, but none of them is immediately obvious. git reset --keep was out of the question because of the newer commits. I think I could have git rebased the changes out of a new branch and merged them, but the method that made the most sense to me was to revert them independently, and then squash them.

Here’s how my history looked:

A–C1–C2–Ex–C3–C4–O1–O2–O3

The four C commits are the ones I want to revert. Ex was an extraneous commit I want to keep and the O commits were made by other authors later.

This was the desired end state:

A–C1–C2–Ex–C3–C4–O1–O2–O3–R

where R is a commit reverting the changes made in the four C commits. As embarrassing as the C commits are, I didn’t want to simply erase them (which can be done easily with git rebase), because they are public history that had been pushed to other users.

My process was to run git revert several times:

git revert C4
git revert C3
git revert C2
git revert C1

There may be a way to do all of this in one command; I’m not sure. This left me with:
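
One candidate for a shorter route is git revert --no-commit, which accepts several commits and applies each reversal to the index without committing, so a single git commit afterwards records them all as one R. The following is a minimal sketch on a toy history, not the repository described here; the commit_line helper, the file names, and the demo identity are made up, and it assumes the reverts apply without conflicts.

```shell
#!/bin/sh
# Revert four non-adjacent commits with a single resulting commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: append a line to a file and commit it.
commit_line() {  # usage: commit_line <file> <text>
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

# Toy history: A - C1 - C2 - Ex - C3 - C4 - O1
commit_line c.txt base
commit_line c.txt c1;  C1=$(git rev-parse HEAD)
commit_line c.txt c2;  C2=$(git rev-parse HEAD)
commit_line ex.txt ex
commit_line c.txt c3;  C3=$(git rev-parse HEAD)
commit_line c.txt c4;  C4=$(git rev-parse HEAD)
commit_line o.txt o1

# Apply the four reversals, newest first, to the index only.
git revert --no-commit "$C4" "$C3" "$C2" "$C1"

# Record them as the single commit R.
git commit -q -m "Revert the C1..C4 changes"
git log --oneline
```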

A–C1–C2–Ex–C3–C4–O1–O2–O3–R1–R2–R3–R4

where the four R commits are reversions of the four C commits.

Then I ran:

git rebase -i HEAD~5

git rebase -i is my favourite method of rebasing. It lists the five most recent commits in vim asking me what to do with each one. You can choose several options for each commit. Here is what I chose:

pick O3
reword R1
squash R2
squash R3
squash R4

pick O3 says to include that commit and leave it unchanged. When rebasing, I usually go one commit earlier than I expect to make sure I’m modifying the correct history. The reword command simply allows me to change the commit message of R1 to “Revert the XYZ changes because I no longer need them”. The squash commands mean that those three R commits are merged into the previous commit, R1. And my end state is as desired:

A–C1–C2–Ex–C3–C4–O1–O2–O3–R

I’m pretty sure there are other ways to do this. I chose this multi-step process because it allows me to understand what is going on at each step and to double check that I haven’t accidentally removed, merged, or reverted a commit I didn’t mean to.
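
For comparison, here is a non-interactive way to reach the same end state. It is a different technique from the git rebase -i session above, not a transcript of it: because the four R commits are the newest commits and have not been pushed, git reset --soft HEAD~4 followed by one git commit collapses them. The toy history, the commit_line helper, and the demo identity are made up for illustration.

```shell
#!/bin/sh
# Four separate reverts, then collapse them into one commit without rebase -i.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: append a line to a file and commit it.
commit_line() {  # usage: commit_line <file> <text>
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

# Toy history: A - C1 - C2 - Ex - C3 - C4 - O1
commit_line c.txt base
commit_line c.txt c1;  C1=$(git rev-parse HEAD)
commit_line c.txt c2;  C2=$(git rev-parse HEAD)
commit_line ex.txt ex
commit_line c.txt c3;  C3=$(git rev-parse HEAD)
commit_line c.txt c4;  C4=$(git rev-parse HEAD)
commit_line o.txt o1

# One revert commit per C commit, newest first: R1, R2, R3, R4.
git revert --no-edit "$C4"
git revert --no-edit "$C3"
git revert --no-edit "$C2"
git revert --no-edit "$C1"

# Move the branch back over the four R commits but keep their combined
# changes staged, then commit them as the single commit R.
git reset --soft HEAD~4
git commit -q -m "Revert the XYZ changes because I no longer need them"
git log --oneline
```

The interactive rebase remains the better fit when the commits to squash are not the most recent ones.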

Fixing Git Bash Completion

I didn’t know until yesterday about the __git_ps1 command. You can include it in your bash PS1 like this:

PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '

and whenever you’re in a git directory, it will include the current branch in your prompt, along with a few other goodies.

I did this and it didn’t work. It just displayed __git_ps1 in my prompt all the time, which is ugly and not terribly useful.

I couldn’t find an answer on Google, so I ended up disabling lines in my .bashrc until I could figure out what was wrong. The culprit turned out to be this line:

shopt -u promptvars

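I don’t know why it was there; perhaps I had a reason for it once and then copied the bashrc from computer to computer, but it’s gone now and my git bash prompt works. With promptvars unset, bash does not perform command substitution on the prompt string, so the $(__git_ps1 " (%s)") call in PS1 is never run.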

So if you’ve recently heard about __git_ps1 and it’s not working for you, look for the promptvars shopt.
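
A defensive ~/.bashrc fragment along these lines might avoid the problem. It is a sketch, assuming the git prompt helper that defines __git_ps1 has already been sourced earlier in the file; where that helper lives varies by system and is not shown here.

```shell
# ~/.bashrc fragment (bash only)

# Make sure prompt strings undergo command substitution. This is bash's
# default, but an earlier `shopt -u promptvars` turns it off.
shopt -s promptvars

# Only reference __git_ps1 if the helper function has actually been loaded.
if declare -F __git_ps1 >/dev/null; then
    PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
fi
```

Running shopt promptvars in an interactive shell reports whether the option is currently on or off.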