Chapter 5. Distributed Git

leonzhx


Blog category:

Pro Git reading notes

Git DVCS

1. You can easily continue using a centralized workflow with Git. Simply set up a single repository and give everyone on your team push access; Git won't let users overwrite each other. If one developer clones, makes changes, and then tries to push while another developer has pushed in the meantime, the server will reject the push. The developer will be told that the push is a non-fast-forward change and cannot go through until they fetch and merge.
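The rejection can be reproduced locally. This is a minimal sketch, assuming a throwaway bare repository stands in for the server and that two clones named alice and bob (hypothetical names) play the two developers:

```shell
set -e
work=$(mktemp -d) && cd "$work"

# A bare repository stands in for the shared central server.
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

# Alice creates the initial history and publishes it.
git init -q alice && cd alice
git symbolic-ref HEAD refs/heads/master
git config user.name alice && git config user.email alice@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git remote add origin ../central.git
git push -q origin master
cd ..

# Bob clones, then both developers commit independently.
git clone -q central.git bob
git -C bob config user.name bob && git -C bob config user.email bob@example.com
(cd alice && echo a > a.txt && git add a.txt && git commit -q -m "Add a.txt" && git push -q origin master)
(cd bob && echo b > b.txt && git add b.txt && git commit -q -m "Add b.txt")

# Bob's push is not a fast-forward, so the server rejects it.
if git -C bob push -q origin master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

# After fetching and merging, the push goes through.
git -C bob fetch -q origin
git -C bob merge -q --no-edit origin/master
git -C bob push -q origin master
echo "first push: $first_push"
```

The merge happens in Bob's clone, not on the server; the server only ever accepts fast-forward updates here.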

2. The Integration-Manager Workflow (used on GitHub, for example) often includes a canonical repository that represents the "official" project. To contribute to that project, you create your own public clone of the project and push your changes to it. Then you can send a request to the maintainer of the main project to pull in your changes. They can add your repository as a remote, test your changes locally, merge them into their branch, and push back to their repository. The process works as follows:

1) The project maintainer pushes to their public repository.

2) A contributor clones that repository and makes changes.

3) The contributor pushes to their own public copy.

4) The contributor sends the maintainer an e-mail asking them to pull changes.

5) The maintainer adds the contributor's repo as a remote and merges locally.

6) The maintainer pushes merged changes to the main repository.

3. The Dictator and Lieutenants Workflow (used by the Linux kernel, for example) is a variant of a multiple-repository workflow. It's generally used by huge projects with hundreds of collaborators. Various integration managers are in charge of certain parts of the repository; they're called lieutenants. All the lieutenants have one integration manager known as the benevolent dictator. The benevolent dictator's repository serves as the reference repository from which all the collaborators need to pull. The process works like this:

1) Regular developers work on their topic branch and rebase their work on top of master. The master branch is that of the dictator.

2) Lieutenants merge the developers' topic branches into their master branch.

3) The dictator merges the lieutenants' master branches into the dictator's master branch.

4) The dictator pushes their master to the reference repository so the other developers can rebase on it.

4. The Git project provides a document that lays out a number of good tips for creating commits from which to submit patches—you can read it in the Git source code in the Documentation/SubmittingPatches file.

5. You don't want to submit any whitespace errors. Git provides an easy way to check for this: before you commit, run git diff --check, which identifies possible whitespace errors and lists them for you. Running that command before committing tells you whether you're about to commit whitespace issues that may annoy other developers.
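A quick way to see the check in action is to add a line with trailing spaces to a tracked file. The sketch below assumes a scratch repository and a file called notes.txt, both made up for illustration:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git config user.name "Demo User" && git config user.email demo@example.com

printf 'clean line\n' > notes.txt
git add notes.txt && git commit -q -m "Add notes"

# Introduce a line with trailing whitespace (two spaces before the newline).
printf 'dirty line  \n' >> notes.txt

# --check reports the offending line and exits non-zero when it finds problems.
if report=$(git diff --check); then result=clean; else result=dirty; fi
echo "$report"
echo "working tree is $result"
```

Because the exit status reflects the result, the same command can be used in a pre-commit script.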

6. Try to make each commit a logically separate changeset. Don't code for a long time on five different issues and then submit them all as one massive commit. Even if you don't commit until you have finished coding all five issues, use the staging area to split your work into at least one commit per issue, with a useful message per commit. If some of the changes modify the same file, use git add --patch to partially stage files. This approach also makes it easier to pull out or revert one of the changesets later if you need to.

7. Getting in the habit of creating quality commit messages makes using and collaborating with Git a lot easier. Your messages should start with a single line that's no more than about 50 characters and that describes the changeset concisely, followed by a blank line, followed by a more detailed explanation. The Git project requires that the more detailed explanation include your motivation for the change and contrast its implementation with previous behavior. It's also a good idea to use the imperative present tense in these messages. In other words, instead of "I added tests for" or "Adding tests for," use "Add tests for."

8. Here is a template originally written by Tim Pope at tpope.net:

Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary. Wrap it to about 72 characters or so. In some contexts, the first line is treated as the subject of an email and the rest of the text as the body. The blank line separating the summary from the body is critical (unless you omit the body entirely); tools like rebase can get confused if you run the two together.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, preceded by a single space, with blank lines in between, but conventions vary here

9. The Git project has well-formatted commit messages. Run git log --no-merges to see what a nicely formatted project-commit history looks like.

10. For a private project with one or two other developers, you can follow a workflow similar to what you might do when using CVCS. You still get the advantages of things like offline committing and vastly simpler branching and merging, but the workflow can be very similar; the main difference is that merges happen client-side rather than on the server at commit time.

11. To see which commits on the remote master branch the work on branch issue54 will have to be merged with:

$ git log --no-merges origin/master ^issue54

--no-merges excludes commits with more than one parent. Git shows all commits that are reachable from origin/master but not from issue54.
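To try this range selection out, you can simulate a teammate's push against a local bare repository. In this sketch, the repository names central.git, dev, and mate, as well as the commit messages, are assumptions made for the demo:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q --bare central.git
git --git-dir=central.git symbolic-ref HEAD refs/heads/master

git init -q dev && cd dev
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
git remote add origin ../central.git
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git push -q origin master

# Local topic work on issue54.
git checkout -q -b issue54
echo fix > fix.txt && git add fix.txt && git commit -q -m "Fix issue 54"

# Simulate a teammate's push: advance the server's master from a second clone.
git clone -q ../central.git ../mate
(cd ../mate && git config user.name Mate && git config user.email mate@example.com \
  && echo other > other.txt && git add other.txt \
  && git commit -q -m "Teammate change" && git push -q origin master)

git fetch -q origin
# Commits reachable from origin/master but not from issue54.
incoming=$(git log --no-merges --format=%s origin/master ^issue54)
echo "$incoming"
```

Only the teammate's commit is listed, since the base commit is reachable from both branches.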

12. In a small private team, you can follow a workflow similar to what you might do when using centralized systems. You work for a while, generally in a topic branch. When you want to share that work, you merge it into your own master branch, then fetch and merge origin/master if it has changed, and finally push to the master branch on the server.

13. In a private managed team, let's say that John and Jessica are working together on one feature, while Jessica and Josie are working on a second. In this case, the company is using a type of integration-manager workflow where the work of the individual groups is integrated only by certain engineers, and the master branch of the main repository can be updated only by those engineers. In this scenario, all work is done in team-based branches and pulled together by the integrators later.

14. You may want to use rebase -i to squash your work down to a single commit, or rearrange the work in the commits to make the patch easier for the maintainer to review.

15. When your work has been pushed up to your fork, you need to notify the maintainer. This is often called a pull request, and you can either generate it via the website (GitHub has a "pull request" button that automatically messages the maintainer) or run the git request-pull command and e-mail the output to the project maintainer manually.

16. The request-pull command takes the base branch into which you want your topic branch pulled and the Git repository URL you want them to pull from, and outputs a summary of all the changes you're asking to be pulled in:

$ git remote add myfork git://githost/simplegit.git

$ git push myfork featureA

$ git request-pull origin/master myfork

The following changes since commit 1edee6b1d61823a2de3b09c160d7080b8d1b3a40:

John Smith (1):

added a new function

are available in the git repository at:

git://githost/simplegit.git featureA

Jessica Smith (2):

add limit to log function

change log output to 30 from 25

lib/simplegit.rb | 10 +++++++++-

1 files changed, 9 insertions(+), 1 deletions(-)

17. On a project for which you're not the maintainer, it's generally easier to have a branch like master always track origin/master and to do your work in topic branches that you can easily discard if they're rejected. Having work themes isolated into topic branches also makes it easier for you to rebase your work if the tip of the main repository has moved in the meantime and your commits no longer apply cleanly.

18. Suppose the project maintainer has pulled in a bunch of other patches and then tried your branch, but it no longer merges cleanly. In this case, you can rebase that branch on top of origin/master, resolve the conflicts for the maintainer, and then resubmit your changes:

$ git checkout featureA

$ git rebase origin/master

$ git push -f myfork featureA

Because you rebased the branch, you have to pass -f to your push command in order to replace the featureA branch on the server with a commit that isn't a descendant of it.
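The need for -f can be checked against two local bare repositories. This is a sketch under assumptions: upstream.git and fork.git are hypothetical stand-ins for the main project and your fork, and the upstream change is pushed from the same clone purely to keep the demo short:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q --bare upstream.git
git init -q --bare fork.git
for r in upstream.git fork.git; do git --git-dir="$r" symbolic-ref HEAD refs/heads/master; done

git init -q dev && cd dev
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
git remote add origin ../upstream.git
git remote add myfork ../fork.git
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git push -q origin master

# Topic branch published to the fork.
git checkout -q -b featureA
echo feature > feature.txt && git add feature.txt && git commit -q -m "Add feature"
git push -q myfork featureA

# Upstream moves on.
git checkout -q master
echo more > more.txt && git add more.txt && git commit -q -m "Upstream change"
git push -q origin master

# Rebase the topic branch onto the new upstream tip.
git checkout -q featureA
git rebase -q origin/master

# A plain push is rejected because history was rewritten; -f replaces the branch.
if git push -q myfork featureA 2>/dev/null; then plain_push=accepted; else plain_push=rejected; fi
git push -q -f myfork featureA
echo "plain push: $plain_push"
```

Force-pushing is reasonable here because the branch lives in your own fork; on a shared branch it would discard other people's view of the history.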

19. Suppose the maintainer has looked at the work in your branch and likes the concept, but would like you to change an implementation detail. You can also take this opportunity to move the work to be based off the project's current master branch. You start a new branch based off the current origin/master branch, squash the featureB changes there, resolve any conflicts, make the implementation change, and then push that up as a new branch:

$ git checkout -b featureBv2 origin/master

$ git merge --no-commit --squash featureB

$ (change implementation)

$ git commit

$ git push myfork featureBv2

The --squash option takes all the work on the merged branch and squashes it into one non-merge commit on top of the branch you're on. The --no-commit option tells Git not to automatically record a commit. This allows you to introduce all the changes from another branch and then make more changes before recording the new commit.
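The effect of combining the two options can be verified in a scratch repository. In the following sketch the file names and commit messages are invented, and the local master branch plays the role of origin/master:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"

# featureB carries two commits.
git checkout -q -b featureB
echo one > one.txt && git add one.txt && git commit -q -m "First step"
echo two > two.txt && git add two.txt && git commit -q -m "Second step"

# Start a fresh branch from master and stage featureB's combined changes.
git checkout -q -b featureBv2 master
git merge -q --squash --no-commit featureB

# Make a further change before recording the single commit.
echo tweak >> two.txt && git add two.txt
git commit -q -m "Implement feature B (reworked)"

# Count the commits added on top of master and the parents of the new commit.
commits=$(git rev-list --count master..featureBv2)
parents=$(git rev-list --parents -n 1 HEAD | awk '{print NF - 1}')
echo "commits: $commits, parents: $parents"
```

The new branch ends up with a single ordinary commit (one parent) that contains both featureB steps plus the extra tweak.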

20. For larger public projects that accept patches via a developer mailing list, you use git format-patch to generate mbox-formatted files that you can e-mail to the list. It turns each commit into an e-mail message with the first line of the commit message as the subject and the rest of the message plus the patch that the commit introduces as the body:

$ git format-patch -M origin/master

0001-add-limit-to-log-function.patch

0002-changed-log-output-to-30-from-25.patch

The format-patch command prints out the names of the patch files it creates. The -M switch tells Git to look for renames.
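As a small illustration, the sketch below builds a topic branch with two commits, one of them a rename, and exports them. The branch name topic and the file names are assumptions for the demo:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"

git checkout -q -b topic
echo limit > log.txt && git add log.txt && git commit -q -m "Add limit to log function"
git mv log.txt history.txt && git commit -q -m "Rename log file"

# One patch file per commit on topic that is not on master; -M detects renames.
patches=$(git format-patch -M master)
echo "$patches"
```

With rename detection, the second patch records the move as a rename instead of a full delete plus add, which keeps the e-mail short.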

21. You can also edit these patch files to add more information for the e-mail list that you don't want to show up in the commit message. If you add text between the "---" line and the beginning of the patch, developers can read it, but it is excluded when the patch is applied.

22. Git provides a tool to help you send properly formatted patches via IMAP. You can read detailed instructions for a number of mail programs at the end of the Documentation/SubmittingPatches file in the Git source code.

23. To send a patch, you first need to set up the imap section in your ~/.gitconfig file:

[imap]

folder = "[Gmail]/Drafts"

host = imaps://imap.gmail.com

user = user@gmail.com

pass = p4ssword

port = 993

sslverify = false

If your IMAP server doesn't use SSL, the last two lines probably aren't necessary, and the host value will be imap:// instead of imaps://.

When that is set up, you can use git imap-send to place the patch series in the Drafts folder of the specified IMAP server:

$ cat *.patch | git imap-send

At this point, you should be able to go to your Drafts folder, change the To field to the mailing list you're sending the patch to, and send it.

24. When you're thinking of integrating new work, it's generally a good idea to try it out in a topic branch, a temporary branch specifically made to try out that new work. This way, it's easy to tweak a patch individually and leave it if it's not working until you have time to come back to it. If you create a simple branch name based on the theme of the work you're going to try, such as ruby_client or something similarly descriptive, you can easily remember it if you have to abandon it for a while and come back later. The maintainer of the Git project tends to namespace these branches as well, such as sc/ruby_client, where sc is short for the person who contributed the work.

25. If you received the patch from someone who generated it with the git diff or a Unix diff command, you can apply it with the git apply command:

$ git apply /tmp/patch-ruby-client.patch

This modifies the files in your working directory. It's almost identical to running patch -p1 to apply the patch, although it's more paranoid and accepts fewer fuzzy matches than patch. It also handles file adds, deletes, and renames if they're described in the git diff format, which patch won't do. Finally, git apply follows an "apply all or abort all" model where either everything is applied or nothing is, whereas patch can partially apply patch files, leaving your working directory in a weird state. git apply won't create a commit for you: after running it, you must stage and commit the introduced changes manually.

26. You can also use git apply to see if a patch applies cleanly before you try actually applying it—you can run git apply --check with the patch. If there is no output, then the patch should apply cleanly. This command also exits with a non-zero status if the check fails, so you can use it in scripts if you want.
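The exit status makes the dry run easy to script. Here is a minimal sketch, assuming a scratch repository and a patch file named change.patch (both hypothetical):

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git config user.name "Demo User" && git config user.email demo@example.com
printf 'line 1\nline 2\n' > file.txt
git add file.txt && git commit -q -m "Add file"

# Produce a plain diff, then undo the change so the patch can be re-applied.
printf 'line 1\nline 2 changed\n' > file.txt
git diff > ../change.patch
git checkout -q -- file.txt

# Dry run: no output and exit status 0 mean the patch applies cleanly.
if git apply --check ../change.patch; then first_check=ok; else first_check=fail; fi
git apply ../change.patch

# The same patch cannot be applied twice, so --check now fails.
if git apply --check ../change.patch 2>/dev/null; then second_check=ok; else second_check=fail; fi
echo "first: $first_check, second: $second_check"
```

Note that git apply only touched the working directory; nothing has been staged or committed at this point.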

27. To apply a patch generated by format-patch, you use git am. Technically, git am is built to read an mbox file, which is a simple, plain-text format for storing one or more e-mail messages in one text file.

28. If someone has e-mailed you the patch properly using git send-email, and you download it in mbox format, then you can point git am at that mbox file, and it will start applying all the patches it sees. If you run a mail client that can save several e-mails in mbox format, you can save an entire patch series into one file and then use git am to apply the patches one at a time:

$ git am 0001-limit-log-function.patch

It applied cleanly and automatically created the new commit for you. The author information is taken from the e-mail's From and Date headers, and the message of the commit is taken from the Subject and body (before the patch) of the e-mail.
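The round trip from format-patch to git am can be sketched without any mail server by handing the patch file over directly. The repository names contributor and maintainer and the e-mail addresses below are assumptions for the demo:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q contributor && cd contributor
git symbolic-ref HEAD refs/heads/master
git config user.name "Jessica Smith" && git config user.email jessica@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"

# The maintainer starts from the same history.
git clone -q . ../maintainer
git -C ../maintainer config user.name "Maintainer"
git -C ../maintainer config user.email maintainer@example.com

# The contributor commits and exports the work as an mbox-style patch.
echo limit > limit.txt && git add limit.txt && git commit -q -m "Limit log function"
git format-patch -q -1 -o "$work/outgoing"

# The maintainer applies it; git am creates the commit automatically.
cd ../maintainer
git am -q "$work"/outgoing/0001-Limit-log-function.patch
author=$(git log -1 --format='%an <%ae>')
committer=$(git log -1 --format=%cn)
subject=$(git log -1 --format=%s)
echo "$subject, by $author, committed by $committer"
```

The resulting commit keeps the contributor as author while the person who ran git am is recorded as committer.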

29. It's possible that the patch won't apply cleanly. In that case, the git am process will fail and put conflict markers in any files it has issues with, much like a conflicted merge or rebase operation. You solve this issue in much the same way: edit the file to resolve the conflict, stage the new file, and then run git am --resolved to continue to the next patch.

30. If you want Git to try a bit more intelligently to resolve the conflict, you can pass the -3 option to it, which makes Git attempt a three-way merge. This option isn't on by default because it doesn't work if the commit the patch says it was based on isn't in your repository.

31. If you're applying a number of patches from an mbox, you can also run the am command in interactive mode with the -i option, which stops at each patch it finds and asks if you want to apply it.

32. It's often helpful to get a review of all the commits that are in the topic branch but that aren't in your master branch. You can exclude commits in the master branch by adding the --not option before the branch name. For example, if your contributor sends you two patches and you create a branch called contrib and apply those patches there, you can run this:

$ git log contrib --not master

33. To see what changes each commit introduces, you can pass the -p option to git log and it will append the diff introduced to each commit. To see a full diff of what would happen if you were to merge this topic branch with master branch, you may run this:

$ git diff master

Git directly compares the snapshots of the last commit of the topic branch you're on and the snapshot of the last commit on the master branch.

To have Git compare the last commit on your topic branch with the first common ancestor it has with the master branch, you can explicitly figure out the common ancestor and then run your diff on it:

$ git merge-base contrib master

36c7dba2c95e6bbb78dfa822519ecfec6e1ca649

$ git diff 36c7db

Or you can put three periods after another branch to do a diff between the last commit of the branch you're on and its common ancestor with another branch:

$ git diff master...topic

This command shows you only the work your topic branch has introduced since its common ancestor with master.
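The difference between the two diff forms shows up as soon as master has moved on after the topic branch was created. The following sketch assumes a scratch repository with a contrib branch and invented file names:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"

# The topic branch adds one file.
git checkout -q -b contrib
echo topic > topic.txt && git add topic.txt && git commit -q -m "Topic work"

# master gains an unrelated file afterwards.
git checkout -q master
echo later > later.txt && git add later.txt && git commit -q -m "Later work on master"
git checkout -q contrib

# Commits on contrib that are not on master.
only_contrib=$(git log --format=%s contrib --not master)

# Direct comparison against master: later.txt also shows up (as if removed).
tip_diff=$(git diff --name-only master)
# Three-dot form: only what contrib introduced since the common ancestor.
topic_diff=$(git diff --name-only master...contrib)
printf 'direct:\n%s\nthree-dot:\n%s\n' "$tip_diff" "$topic_diff"
```

The direct comparison is misleading for review purposes because it makes the topic branch look as though it deletes work that was simply added to master later.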

34. One simple workflow of integrating contributed work is to merge contributed work into your master branch. In this scenario, you have a master branch that contains basically stable code. When you have work in a topic branch that you've done or that someone has contributed and you've verified, you merge it into your master branch, delete the topic branch, and then continue the process.

35. If you have more developers or a larger project, you'll probably want to use at least a two-phase merge cycle. In this scenario, you have two long-running branches, master and develop, in which you determine that master is updated only when a very stable release is cut and all new code is integrated into the develop branch. You regularly push both of these branches to the public repository. Each time you have a new topic branch to merge in, you merge it into develop; then, when you tag a release, you fast-forward master to wherever the now-stable develop branch is.
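One possible rendering of this cycle as commands is shown below. It is a sketch only: the topic branch name, the tag v1.0, and the choice to tag right after the fast-forward are assumptions made for the demo:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git branch develop

# Integrate a topic branch into develop, then delete it.
git checkout -q -b topic develop
echo work > work.txt && git add work.txt && git commit -q -m "Topic work"
git checkout -q develop
git merge -q topic
git branch -q -d topic

# At release time, fast-forward master to develop and tag the release.
git checkout -q master
git merge -q --ff-only develop
git tag -a v1.0 -m "Release 1.0"
```

Using --ff-only makes the release step fail loudly if master has somehow received commits that develop does not contain.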

36. In the Large-Merging Workflow, the Git project has four long-running branches: master, next, and pu (proposed updates) for new work, and maint for maintenance backports. When new work is introduced by contributors, it's collected into topic branches in the maintainer's repository. At this point, the topics are evaluated to determine whether they're safe and ready for consumption or whether they need more work. If they're safe, they're merged into next, and that branch is pushed up so everyone can try the topics integrated together. If the topics still need work, they're merged into pu instead. When it's determined that they're totally stable, the topics are re-merged into master, and next and pu are then rebuilt from the topics that were in next but didn't yet graduate to master. This means master almost always moves forward, next is rebased occasionally, and pu is rebased even more often. The maint branch is forked off from the last release to provide backported patches in case a maintenance release is required.

37. The other way to move introduced work from one branch to another is to cherry-pick it. A cherry-pick in Git is like a rebase for a single commit. It takes the patch that was introduced in a commit and tries to reapply it on the branch you're currently on. This is useful if you have a number of commits on a topic branch and you want to integrate only one of them, or if you only have one commit on a topic branch and you'd prefer to cherry-pick it rather than run rebase.

If you want to pull commit e43a6 from a topic branch into your master branch, you can run:

$ git cherry-pick e43a6

This pulls the same change introduced in e43a6, but you get a new commit SHA-1 value, because the date applied is different.
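The new SHA-1 is easy to observe in a scratch repository. In this sketch the branch name ruby_client is borrowed from the earlier naming example, while the files and commit messages are invented:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"

# A topic branch with two commits; only the first one is wanted.
git checkout -q -b ruby_client
echo one > one.txt && git add one.txt && git commit -q -m "First topic commit"
echo two > two.txt && git add two.txt && git commit -q -m "Second topic commit"
wanted=$(git rev-parse HEAD~1)

# master moves on independently.
git checkout -q master
echo other > other.txt && git add other.txt && git commit -q -m "Work on master"

# Reapply just the wanted change on top of master.
git cherry-pick "$wanted" >/dev/null
picked=$(git rev-parse HEAD)
echo "original: $wanted"
echo "picked:   $picked"
```

The picked commit carries the same change and message as the original but has a different parent and commit date, hence the different identifier.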

38. When you've decided to cut a release, you'll probably want to drop a tag so you can re-create that release at any point going forward:

$ git tag -s v1.5 -m 'my signed 1.5 tag'

If you do sign your tags, you may have the problem of distributing the public PGP key used to sign your tags. The maintainer of the Git project has solved this issue by including their public key as a blob in the repository and then adding a tag that points directly to that content. To do this, you can first figure out which key you want:

$ gpg --list-keys

/Users/schacon/.gnupg/pubring.gpg

---------------------------------

pub 1024D/F721C45A 2009-02-09 [expires: 2010-02-09]

uid Scott Chacon <schacon@gmail.com>

sub 2048g/45D02282 2009-02-09 [expires: 2010-02-09]

Then, you can directly import the key into the Git database by exporting it and piping that through git hash-object, which writes a new blob with those contents into Git and gives you back the SHA-1 of the blob:

$ gpg -a --export F721C45A | git hash-object -w --stdin

659ef797d181633c87ec71ac3f9ba29fe5775b92

Then you can create a tag that points directly to it by specifying the new SHA-1 value that the hash-object command gave you:

$ git tag -a maintainer-pgp-pub 659ef797d181633c87ec71ac3f9ba29fe5775b92

If you run git push --tags, the maintainer-pgp-pub tag will be shared with everyone. If anyone wants to verify a tag, they can directly import your PGP key by pulling the blob directly out of the database and importing it into GPG:

$ git show maintainer-pgp-pub | gpg --import

They can use that key to verify all your signed tags. Also, if you include instructions in the tag message, running git show <tag> will let you give the end user more specific instructions about tag verification.
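The blob-plus-tag mechanism itself does not depend on GPG, so it can be tried with placeholder text in place of a real exported key. The sketch below assumes exactly that; the key text is a dummy string, not usable key material:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git config user.name "Demo User" && git config user.email demo@example.com

# Stand-in for the output of `gpg -a --export <key-id>`.
key_text="-----BEGIN PGP PUBLIC KEY BLOCK----- (placeholder, not a real key)"

# Write the text into the object database as a blob and keep its SHA-1.
blob=$(printf '%s\n' "$key_text" | git hash-object -w --stdin)

# Point an annotated tag directly at the blob instead of at a commit.
git tag -a maintainer-pgp-pub -m "Maintainer public key" "$blob"

# Peel the tag back to the blob to recover the exact content.
restored=$(git cat-file blob "maintainer-pgp-pub^{blob}")
echo "$restored"
```

Reading the blob with git cat-file returns only the stored content, without the tag header that git show prints first.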

39. If you want to have a human-readable name to go with a commit, you can run git describe on that commit. Git gives you the name of the nearest tag with the number of commits on top of that tag and a partial SHA-1 value of the commit you're describing:

$ git describe master

v1.6.2-rc1-20-g8c5b85c

If you build Git from source code cloned from the Git repository, git --version also gives you something that looks like this. If you're describing a commit that you have directly tagged, git describe gives you the tag name. The git describe command favors annotated tags (tags created with the -a or -s flag). You can also use this string as the target of a checkout or show command, although it relies on the abbreviated SHA-1 value at the end, so it may not be valid forever. For instance, the Linux kernel recently jumped from 8 to 10 characters to ensure SHA-1 object uniqueness, so older git describe output names were invalidated.
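The shape of the describe string can be checked in a scratch repository. This sketch assumes an annotated tag named v1.0 followed by two extra commits, all invented for the demo:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git tag -a v1.0 -m "Release 1.0"

# Two more commits on top of the tag.
echo one > one.txt && git add one.txt && git commit -q -m "First change"
echo two > two.txt && git add two.txt && git commit -q -m "Second change"

# <tag>-<commits since tag>-g<abbreviated SHA-1>
name=$(git describe master)
echo "$name"

# The string resolves back to the commit it describes.
resolved=$(git rev-parse "$name")
```

Describing the tagged commit itself (git describe v1.0) prints just v1.0, with no suffix.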

40. You can create an archive of the latest snapshot of your code:

$ git archive master --prefix='project/' | gzip > `git describe master`.tar.gz

$ ls *.tar.gz

v1.6.2-rc1-20-g8c5b85c.tar.gz

You can also create a zip archive in much the same way, by passing the --format=zip option to git archive:

$ git archive master --prefix='project/' --format=zip > `git describe master`.zip
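To confirm what ends up inside such an archive, you can list its contents after creating it. The sketch below assumes a scratch repository with a single README file and a v1.0 tag, both hypothetical:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User" && git config user.email demo@example.com
echo hello > README && git add README && git commit -q -m "Add README"
git tag -a v1.0 -m "Release 1.0"

# Name the tarball after the describe string and nest files under project/.
name=$(git describe master)
git archive --prefix='project/' master | gzip > "$name.tar.gz"

# List the archive to see the effect of --prefix.
listing=$(gzip -dc "$name.tar.gz" | tar -tf -)
echo "$listing"
```

The trailing slash in the prefix matters: without it, the prefix is glued onto the file names instead of forming a directory.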

41. A nice way of quickly getting a sort of change log of what has been added to your project since your last release or e-mail is to use the git shortlog command. It summarizes all the commits in the range you give it:

$ git shortlog --no-merges master --not v1.0.1

You get a clean summary of all the commits since v1.0.1, grouped by author.
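A small scratch history is enough to see the grouping. In this sketch the author names are borrowed from the earlier examples, while the e-mail addresses, files, and commit messages are made up:

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Jessica Smith" && git config user.email jessica@example.com
echo base > base.txt && git add base.txt && git commit -q -m "Add base file"
git tag v1.0.1

# Commits made after the release, by two different authors.
echo a > a.txt && git add a.txt && git commit -q -m "Add limit to log function"
echo b > b.txt && git add b.txt && git commit -q -m "Change log output"
echo c > c.txt && git add c.txt
git commit -q --author="John Smith <john@example.com>" -m "Add a new function"

# Summarize everything on master that is not part of v1.0.1.
summary=$(git shortlog --no-merges master --not v1.0.1)
echo "$summary"
```

The commit tagged v1.0.1 itself is excluded, so only the three later commits appear, listed under their respective authors.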