02-git-1.pdf
RW344: Software Engineering
Lauren Hayward, Bernd Fischer
[email protected]

Version Control

A project usually consists of many versions of many files.
A version control system (VCS) keeps track of all the files and how they have changed in a repository.
Allows users to keep track of versions, and thus to go back to previous versions when required.
  – typically features a text-based history log of the changes
  – sometimes also graphical representations
Allows many users to work concurrently on the files of the project, and manages the integration of their respective changes.
Allows users to manage release versions of the entire project.
Provides access control.

Version Control: System History

SCCS (Source Code Control System), 1972
  – repositories are local
  – invented deltas
RCS (Revision Control System), 1982
  – repositories are local
  – reverse deltas
CVS (Concurrent Versions System), 1986
  – client-server front-end for RCS
  – popularized merging instead of locking
  – snapshots for binaries
SVN (Subversion), 2000
  – open-source bugfix successor of CVS
Git, 2005
  – developed within the Linux kernel project (Torvalds)
  – distributed approach, no central server
  – has become industry standard
Mercurial, 2005
  – similar to Git
More detailed comparison can be found at
https://en.wikipedia.org/wiki/Comparison_of_version-control_software

Repository storage mechanisms

Forward deltas: stores versions as deltas (conceptually from an empty start, ∅)
conceptually simple: all versions are deltas
storage-efficient: deltas are typically (much) smaller than files
most recent version must be reconstructed from version history
  – access gets slower as the history grows

Reverse deltas: stores latest version in full, earlier versions as deltas
conceptually more complex: versions are a mixture of deltas and sources
storage-efficient: most versions are deltas
most recent version immediately available

Snapshots: stores all versions in full, typically compressed
conceptually simple: all versions are sources
  – delta between any two versions can be computed easily
high storage demand: all versions stored as sources
most recent version immediately available

Repository concurrency models

Lock: single user gets exclusive access to repo version of file(s)
check-out required, locks files until check-in
conflicts can emerge only on check-out
  – must be resolved before check-in
not well suited for distributed development
"Nobody in their right mind used version control with central locking in the last 35 years."

Merge: users can commit files freely, but repo blocks conflicting commits
conflicts can emerge on update and commit
  – typically resolved automatically (three-way merge)
typically combined with branching
  – logically separate copies of the version space
  – updates in one branch are not reflected in others unless explicitly merged

Three-way merging is a merge conflict resolution heuristic that exploits the version history.
[Figure: versions A and B derived from base; ↯ marks the conflict between them]
Consider the following scenario:
A updates base file and edits at l1 and l3
B updates base file, edits at l3 and l5, and commits
A tries to commit, but A's copy and the repository version (now B's) differ at l1, l3, and l5
  – l1 should come from A and l5 from B
  – l3 is a merge conflict that needs manual resolution
Solution: three-way merge
identify most recent common ancestor (here: base)
  – exists if version space is a meet semilattice, construct if not
compute diffs ΔA and ΔB between base and A and B, respectively
apply (ΔA ∪ ΔB) \ (ΔA ∩ ΔB) to base, manually resolve ΔA ∩ ΔB

Centralized vs Distributed Version Control

[Figure: SVN (centralized) vs Git (distributed)]

Distributed Version Control – Advantages

Avoids relying on a single physical machine.
  – A server disk crash is a non-event with distributed revision control…
Users can continue work even when not connected to the network.
Most operations are much faster since no network is involved.
Allows private work, so you can use your revision control system even for early drafts you don’t want to publish.
Allows participation without permission from project authorities.
  – Still permits central control of the “release version”.

Distributed Version Control – Disadvantages

Concepts of DVCSs may be more difficult to grasp.
Corporate environments tend to favour a centralized server with more control.
DVCS may require more manual conflict resolution when merging trees.

Git Basics

Creating a repository

You can turn a directory into a (local) git repository:
    $ git init
Central repositories should be created as bare:
    $ git init --bare
  => no working directory, cannot commit into repo (only push)
Typically you clone from a central repo:
    $ git clone <repository-url>

Initializing a Repository

Git grants access using keys, so where do these names come from, and why are they different?
The name that appears next to a commit is set using git config.
You can set these per repository:
    $ git config user.name "Lauren Hayward"
    $ git config user.email "[email protected]"
Or you can set them globally:
    $ git config --global user.name "Lauren Hayward"
    $ git config --global user.email "[email protected]"
Remember that if you work on someone else’s PC without changing the config for the repository, their name will appear in your commit logs!

Initializing a Repository – Main Branch

By default the first branch you create will be the main branch …
… but anything can be the main branch of a repository, and it is possible to change it afterwards.
It was historically also called “master”, but that name is being phased out in favour of “main”.

Creating Branches

There are a couple of ways to create a branch. The simplest way is:
    $ git branch <branch-name>
This creates the branch, but doesn’t switch to it.
To create a branch and switch to it at the same time, you can use this:
    $ git switch -c <branch-name>
Note that neither of these pushes the branch upstream.
You can do that immediately after creation, or later using:
    $ git push <remote> <branch-name>

Deleting Branches

There are two ways to delete a local branch. The safe way is:
    $ git branch -d <branch-name>
This only deletes the local branch if it is fully merged.
To force delete a local branch, you can use this:
    $ git branch -D <branch-name>
To delete the remote version of a branch, you must use:
    $ git push <remote> --delete <branch-name>

To push, or not to push…

Unless you’ve written a nice .gitignore that you’re very confident in,
    $ git add -A
is not your friend...
You will add things like compiled files, IDE-specific configuration files and temporary files generated by the tools you’re using.
As a rule of thumb, never commit anything that can be regenerated or recreated using the source code.
Typically you want to specify what to add when you commit and push.
If you find that there are so many files that it is tedious to do so, you are probably working too long before committing anyway.
A good compromise is adding specific folders at a time:
    $ git add src/

Tags

Tags are references to a specific point in commit history.
They are usually used to indicate a release.
Tags are like branches that don’t change and can be checked out in the same way.
You can create a tag using:
    $ git tag <tag-name> -m "Optional message"
You have to push tags explicitly using:
    $ git push --tags
To check out a tag, use:
    $ git checkout <tag-name>

Amending mistakes

Oh no, I’ve written something I shouldn’t have!
To change the name of the current branch, use:
    $ git branch -m <new-name>
To change the message of the most recent commit, use:
    $ git commit --amend
If you already pushed, you need to follow this by
    $ git push -f
but note this also pushes all committed changes.

Amending mistakes

Oh no, I’ve done something I shouldn’t have!
If you’ve changed a file you didn’t mean to, you can undo that change to the last committed version using:
    $ git restore <file>
If you’ve already staged the changes (with git add), you can unstage them using:
    $ git restore --staged <file>
Once they’re no longer staged, the command above will undo the changes.
If you’ve already pushed the changes, you can undo them by reverting that commit:
    $ git revert <commit-sha>
You can obtain the commit SHA using:
    $ git log
Note that the changes made in that commit are removed from the current state; the revert itself is recorded as a new commit.
If you wish to undo the commit, but retain the changes that were part of that commit, you can use a reset. To undo the most recent commit, use:
    $ git reset HEAD^

Rebasing

Rebasing is moving the base of a branch to a different commit from the one it was originally created from, making it appear as if you branched from a different commit.
This is most often used to bring changes that have been pushed to the main branch into a feature branch.
To start rebasing the branch you are currently on:
    $ git rebase <new-base>
The new base can be the name of a branch, a tag, or a commit SHA.
You will likely have to resolve merge conflicts as you complete the rebase.
After resolving a conflict, you continue the rebase using:
    $ git rebase --continue
Finally, if the branch had already been pushed, you will have to force push the changes to the remote.

Merge Conflicts

To merge a branch into our current branch, we use:
    $ git merge <branch-name>
Git may not be able to merge these branches automatically, in which case it asks us to resolve the merge conflict.
Most IDEs provide a visual tool with which to resolve conflicts, a nice one being VSCode.
Once conflicts have been resolved, all changed files must be added and committed.
Pushing upstream should not require a force push after a merge.

Stashing and Unstashing

You may want to switch branches without having to commit your code.
To do so, we use stashing and unstashing.
To store all the uncommitted changes, use:
    $ git stash
The stash itself is a stack, and can hold more than one set of changes at a time.
When you want to get your changes back, use:
    $ git stash pop
This will reapply the most recently stashed set of changes.

Development Workflows

Git can be used to support different collaboration styles or workflows for the development of new features.
The most widely used ones are:
  – centralized workflow
  – feature branch workflow
  – gitflow workflow
  – forking workflow
More information can be found at
https://www.atlassian.com/git/tutorials/comparing-workflows#!workflow-gitflow
https://www.youtube.com/watch?v=w3jLJU7DT5E

Centralized Workflow

All changes are committed to main branch
  – uses git in SVN style
Must be careful that code is correct
  – typically ensured by continuous integration
Merge conflicts need to be resolved (immediately)

Feature Branch Workflow

All changes happen in a new branch made for each new feature
Feature branch gets merged back into main when completed
main only ever contains working code

Gitflow Workflow

meant for larger projects with release dates
main and development branches
main has version tags
all changes happen on development branch
now you add feature branches as in feature branch workflow
Advanced Gitflow uses additional default branches: hotfixes, release branches

Forking Workflow

Each developer has two repos, one private, one public
Only maintainer(s) push(es) to official repo
Often used for open source projects
Fork the master repo
Add a new feature branch and add the changes
Commit to own copy
Do a Pull Request to master repo
Maintainer accepts pull request and merges into their local copy of master…
… then updates master

Summary of Workflows
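
As a recap of the feature branch workflow described above, the following is a minimal sketch of one full cycle in a throwaway local repository. It assumes Git 2.23 or later (for git switch). The scratch directory, the identity, the file README.txt and the branch name feature/greeting are all hypothetical placeholders, and no remote is involved, so the push and pull request steps are left out.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository, so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Per-repository identity (placeholder values).
git config user.name "Example Student"
git config user.email "student@example.com"

# Make sure the first branch is called "main", whatever the local default is.
git symbolic-ref HEAD refs/heads/main

# First commit on main.
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# All work for the feature happens on its own branch.
git switch -q -c feature/greeting
echo "hello, world" > README.txt
git add README.txt
git commit -q -m "Extend greeting"

# When the feature is complete, merge it back into main.
git switch -q main
git merge -q --no-ff -m "Merge feature/greeting" feature/greeting

# The safe delete succeeds because the branch is fully merged.
git branch -q -d feature/greeting

# Show the resulting history.
git log --oneline --graph
```

The --no-ff option is a choice made for this sketch: it forces a merge commit so the feature branch stays visible in the history. A plain git merge would fast-forward here, since main did not change in the meantime.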