CI1 Git Version Management PDF
Document Details
University of Strasbourg
2024
Summary
This document gives an overview of version control systems, including the advantages and disadvantages of local, centralized and distributed systems. It uses diagrams to explain these concepts and describes how versions are stored. It also covers practical use, benefits, principles, and the main commands for managing local and remote repositories.
Full Transcript
Software Engineering
Version management: Git
Licence Informatique, 2024

Why version files?
- Go back (to a stable version) and make corrections
- Keep a history of all operations (what changed, by whom, when)
- Teamwork (locks, file conflict management)
- Work in parallel on several branches
- Guarantee file security, integrity, availability and confidentiality

Most versioning systems are based on diff
- diff has existed since the origin of Unix
- It shows the difference between two text files
- From the terminal: the diff command
- From an IDE
- It also works with folders
- Demo

The different version control systems
- Local: Local Version Control System (LVCS)
- Centralized: Centralized Version Control System (CVCS)
- Distributed: Distributed Version Control System (DVCS)
- French terms: Logiciel de Gestion de Version (LGV) or Système de Gestion de Version (SGV)
- English term: Version Control System

LVCS - Local Version Control System
- Advantages: simple
- Disadvantages: easy to make mistakes; collaboration with other developers is complex
- Examples: SCCS (1972), RCS (1982)

CVCS - Centralized Version Control System
- Advantages: everyone accesses the same code; easy to administer (user roles)
- Disadvantages:
  - "single point of failure" (network problem, hard disk without backups = collaboration impossible)
  - latency in use: you have to communicate with the server for each operation
- A few examples: CVS (1990), Subversion (2000)

DVCS - Distributed Version Control System
- In practice: one of the computers serves as a central repository
- Advantages: each "clone" is a complete backup of the project; you can work offline and synchronize later
- Disadvantages: more difficult to master at the beginning; slower initial cloning
- Examples: Git (2005), Mercurial

How are the versions stored?
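One common answer is delta storage: keep the first version in full, then only what changed. Below is a minimal Python sketch of that idea using the standard difflib module. The helper names make_delta and apply_delta are invented for this illustration, and real version control systems use their own delta formats.

```python
from difflib import SequenceMatcher

def make_delta(old, new):
    """Return the edits that turn `old` into `new` (both are lists of lines)."""
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    # Keep only the changed regions; unchanged lines are not stored again.
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]

def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result, cursor = [], 0
    for i1, i2, replacement in delta:
        result.extend(old[cursor:i1])   # copy the unchanged lines
        result.extend(replacement)      # add the new or changed lines
        cursor = i2                     # skip the old lines that were replaced
    result.extend(old[cursor:])
    return result

v1 = ["int main() {", "  return 0;", "}"]
v2 = ["int main() {", '  printf("Hello\\n");', "  return 0;", "}"]

delta = make_delta(v1, v2)              # only the inserted line is recorded
assert apply_delta(v1, delta) == v2     # version 2 = version 1 + delta
```

Storing v1 and delta is enough to get v2 back, which is the trade-off behind "delta-based" systems: less space, but a version has to be rebuilt before it can be read.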
- The entire project is not saved every time!
- Local VCS / Centralized VCS (CVS, SVN, …): "delta-based" version control
  - All files are saved the first time
  - Then only the differences are saved
- Distributed VCS (git, …)
  - Git also uses deltas, coupled with zlib compression (delta compression)
  - All files are stored only once
  - Git saves snapshots of a miniature filesystem
  - Unchanged files: git stores only a link to the previous identical file

Why Git?
[Figure. Source: Google Trends]
« As of January 2023, GitHub reported having over 100 million developers and more than 372 million repositories, including at least 28 million public repositories. It is the world's largest source code host as of June 2023. »
Source: The GitHub Blog, 2023

In practice
- Distributed: each copy is a complete repository
- Each developer works with their own repository and complete history
- In practice, many projects use a "central" repository (on GitHub or GitLab, for example) to facilitate collaboration
- This "central" repository is not necessary for Git to function, but it serves as a canonical point of truth
- More freedom for developers, but merging changes requires coordination

Benefits
- Organization: work offline on your local repository and synchronize later
- Code comparison: easily compare code versions (over time or between different variants)
- Easy undo: quickly revert to an earlier, more stable version
- Storage: generally, we only add to git, so it is hard to lose information completely

Principles
- A remote repository
- Each developer has a "clone" on their machine = local repository
- Commands for interacting within your local repository (add, commit, status, log, etc.)
- Commands for local and remote synchronization (push, pull, clone, fetch)
- Tools to label and organize your work (branches, tags)
[Figure: one remote repository and several local clones. pull, clone and fetch go from the remote to a clone; push goes from a clone to the remote; add, commit, status and log act within each clone.]

Local file status in git
Each file within a working directory is in one of the following 4 states:
- "Untracked": the file is not tracked by git
- "Modified" (not staged): the file is tracked by git and has been modified compared to the previous version, but has not yet been committed
- "Staged": the modified file is marked so that its changes are added to the next commit snapshot
- "Committed": the file is safely stored in the local git repository in its current version
[Figure: modified → staged → committed, all local]

Sharing committed files
- Committed files are still local
- They need to be shared with the other repositories: "push" or "merge request"
[Figure: modified → staged → committed (local) → pushed (remote), via push or merge request]

Local repository management - a few commands
General format: "git <command> [options]"
- git init: initialize a git repository in an existing directory
- git add: track a file with git / mark a file as "to be sent" with the next commit
- git commit -m "my comment": record the changes in the staged files
- git commit -am "my comment": add + commit (for files that are already tracked)
- git status: find out the current branch and file status:
  - untracked: not tracked in git
  - unmodified: tracked but not modified
  - modified: tracked and modified
  - staged: tracked, modified and included in the list for the next commit
- git diff: show modified lines
- git log: display revision history
[Figure: add moves a file from modified to staged, commit from staged to committed, push from committed to pushed]

Demo 1
- Initialize a local repository
- Add files
- Consult the "diff"
- Create a commit
- Modify files
- Create a second commit
- View history (git log)

What does a version contain?
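Part of the answer is that file content is stored in "blob" objects identified by a hash of that content. As a hedged sketch, assuming Git's default SHA-1 object format (the function name blob_id is invented here), a blob's ID is the hash of a short header plus the file's bytes, so identical content always gets the same ID whatever the file is called:

```python
import hashlib

def blob_id(content: bytes) -> str:
    """ID Git would give a blob with this content (SHA-1 object format assumed)."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

readme = b"Hello, world!\n"
copy_of_readme = b"Hello, world!\n"   # same bytes stored under another file name

# The file name is not part of the hash, so both files map to a single blob.
assert blob_id(readme) == blob_id(copy_of_readme)

# Any change to the content produces a different blob.
assert blob_id(readme) != blob_id(b"Hello, world?\n")

# The empty file has a well-known ID.
assert blob_id(b"") == "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```

In a real repository the result can be compared with the output of git hash-object on the same file.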
When committing, several objects are created:
- One blob (binary large object) per file
  - stores only the content, as a series of bytes, without metadata (file names, path, permissions, …)
  - two different files with exactly the same content → one blob
  - if a file is unchanged, only a link is stored
- One tree (and subtrees if needed)
  - lists the content of a directory and specifies which file is attached to which blob
  - stores the metadata
  - if a tree is unchanged, only a link is stored
- One commit object that points to the root of the tree
  - contains some metadata, including an ID
- After a while, git packs objects to save space
[Figure: objects linked by pointers, each with its metadata]

Successive commits
- Each commit also stores a pointer to the previous commit (its parent)
- This makes it possible to revert to previous versions

Branches
- Organize: several local versions when developing. This keeps a clean main version always accessible while one or several experimental functionalities are developed.
- Diverge: each new functionality is developed in a branch. A branch can be seen as a variant.
- Merge: when stable and final, new functionalities are added to the main version by merging the temporary variant into the main branch.

Branches - a few commands
Branch = pointer to a commit
Default branch: main (convention; also "master" or "develop")
Format: "git <command> [options]"
- git merge [branch name] ➝ merges [branch name] into the current branch
- git branch -d [branch name] ➝ deletes a branch
- git checkout [branch name] or git switch [branch name] ➝ switches to an existing branch
- git branch [branch name] ➝ creates a new branch but stays in the current branch
- git checkout -b [branch name] or git switch -c [branch name] ➝ creates a new branch and immediately switches to it

Demo 2
- Create a branch
- Switch between two branches
- Make changes on a branch (commit)
- Merge one branch with another

Branches - how the pointers move
A branch is in fact a pointer to a commit. The diagrams use the commits 98ca9 ← 34ac2 ← f30ab.
- Start: a pointer named HEAD points to the current branch. The default branch is main; main points to f30ab and HEAD points to main.
- git branch test: creates a new pointer, test, to the commit the current branch points to (f30ab). HEAD stays on main.
- git checkout test: moves the HEAD pointer to the new branch. HEAD can be moved from one branch to another.
- Modifications + git commit -a -m 'fixed xxx bug': moves the test branch pointer to the new commit (87ab2). HEAD moves with it.
- git checkout main: moves HEAD back to the main branch pointer.
- Modifications + git commit -a -m 'fixed yyy bug': moves the main branch pointer to the new commit (c2b9e). HEAD moves with it.

Merging branches - simple case: "fast forward"
Only one branch diverges from main; the main branch did not change.
[Figure: 98ca9 ← 34ac2 ← f30ab ← 87ab2 ← 57cb5, with main behind test; HEAD on test]
- Put HEAD on main: git checkout main
- Merge: git merge test
- Delete the test branch: git branch -d test

Merging branches - complex case, no conflicts
The test branch and main both evolved.
[Figure: main at c2b9e and test at 57cb5, both descending from f30ab; after the merge, a new commit 4a65a on main]
- Put HEAD on main: git checkout main
- Merge: git merge test
- Delete the branch

Merging branches - complex case with conflicts
git merges and adds markers in the conflicted file.

HEAD version:
    #include <stdio.h>
    int main() {
        printf("Hello, world!\n");
        // This line was added in the current branch (HEAD)
        printf("This is the HEAD branch.\n");
        return 0;
    }

Merged version, with markers (the feature-branch lines did not survive in this transcript):
    #include <stdio.h>
    int main() {
        printf("Hello, world!\n");
    <<<<<<< HEAD
        // This line was added in the current branch (HEAD)
        printf("This is the HEAD branch.\n");
    =======
        [feature-branch version of these lines]
    >>>>>>> feature-branch
        return 0;
    }

To resolve, edit the file directly: remove the markers and do any of the following:
- Keep the HEAD version
- Keep the branch version
- Keep both
- Change some parts or everything
When happy, mark the file as resolved: git add [filename]. Staging a file marks it as resolved.

Remote repository
- A version of your repository hosted on a server or a cloud platform and accessible by multiple users over a network
- Often used as a centralized location to synchronize your work and collaborate with others
- Enables collaboration and version control across different machines and locations
- Users synchronize local and remote repositories by using communication commands
- By default, when a repository is cloned, the remote is named origin

Remote repository management - a few commands: clone, fetch, pull
Get new data (remote → local)
Format: "git <command> [options]"
- git clone [url] ➝ creates a local copy of an existing remote project (first time)
- git fetch ➝ remote to local: retrieves new data (latest commits) from a remote repository without merging it into the working copy. This lets you first check the changes between the working copy and the local repository; once happy, you can merge the local repository into the working copy.
- git pull ➝ remote to local: retrieves new data (latest commits) from a remote repository and merges it: fetch + merge directly into the working copy. May cause conflicts.

Remote repository management - a few commands: push, merge request
Send modified data (local → remote)
Format: "git <command> [options]"
- git push: local to remote, sends data to a remote server
- Good practice in collaborative projects: use merge requests instead
  - Avoids unwanted pushes
  - Lets the owner know you have something new to merge (merge request); the owner decides whether to validate it; if validated, the code is sent
- Forks and pull requests
  - A fork is a replica of the project that appears as a new project owned by you
  - You can work independently from the initial project
  - Send a pull request to the owner of the original project when ready

Remote repository management - a few commands: git remote
Manage different sources / repositories
- When cloning, a bookmark named "origin" is automatically created and linked to the cloned repo
- Remote connections are stored as bookmarks to URLs
Format: "git <command> [options]"
- git remote: lists current bookmarks to other repositories
- git remote -v: same, but also displays URLs
- git remote add [name] [url]: adds a connection to a new remote repo
  Example: git remote add john http://dev.example.com/john.git
- git remote rm [name]: removes a connection to a repo
- git remote rename [old-name] [new-name]: changes the name attached to a repo
- git remote show [name]: inspects an existing remote

Demo 3 - initialization (local first)
Initial setup: how to create a local repository, a remote repository, and synchronize them.
- Initialize a local repository with git init
- Create a remote on GitLab
- Connect the two repositories

Demo 4 - initialization (remote first)
Initial setup: how to create a remote repository and clone it into a local repository.
- Create a repository on GitLab
- Local clone

Recap: best practice guidelines - general use of git
- Consider the main branch of the remote repo (origin/main) as the stable element of the project. Do not send untested or incomplete changes. Send only valid and tested code.
- Consider your main branch as the stable element of your local version. Use temporary branches to develop new features. When the work is ready, merge into the main branch and delete the temporary branch.
- Before "pushing", always "pull" and "merge" remote changes into your local branch to check for conflicts before sending your changes to others. Send only clean and compatible code.
- Pull requests / merge requests are recommended for collaboration instead of push.
- Project initialization is done only once, at the start, for both local and remote (git init and git remote add / git clone).

Best practice guidelines - commits
- Make "atomic" commits: one commit = one single change (a feature or a bugfix)
  - Good example: fix a bug, change a few lines in one or more files
  - Good example: add a class / configuration file / simple functionality
  - Bad example: add one or more long, large files and commit them all at once
  - Bad example: commit all the miscellaneous changes of the day at the end of the day
  - Don't hesitate to commit several times a day!
- Name commits with a short, clear message describing the changes
  - Good example: "Solved a display bug in the main screen"
  - Good example: "Added automatic backup functionality"
  - Good example: "Added command to generate docs in Makefile"
  - Bad example: "Updated main.c file" (not specific enough)
  - Bad example: "Solved a few bugs"
- If you commit code that still has bugs, that is OK, but say so in the message!
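The message guidelines above can be turned into a rough automated check. The sketch below is only an illustration: the function message_problems, the two patterns (taken from the bad examples on the slide) and the 72-character limit are assumptions made here, not a Git feature or a rule from the course.

```python
import re

# Hypothetical patterns, modelled on the slide's two bad examples.
VAGUE_PATTERNS = [
    r"^updated? \S+ file$",   # e.g. "Updated main.c file"
    r"\ba few bugs\b",        # e.g. "Solved a few bugs"
]

def message_problems(message, max_length=72):
    """Return a list of problems found in a commit message (empty list = OK)."""
    lines = message.strip().splitlines()
    subject = lines[0] if lines else ""
    problems = []
    if not subject:
        problems.append("empty message")
    if len(subject) > max_length:          # assumed limit, adjust to taste
        problems.append("subject too long")
    if any(re.search(p, subject, re.IGNORECASE) for p in VAGUE_PATTERNS):
        problems.append("too vague")
    return problems

for msg in ["Updated main.c file",
            "Solved a few bugs",
            "Added automatic backup functionality"]:
    print(msg, "->", message_problems(msg))
```

A check like this can only catch obvious cases; whether a message really describes the change still needs a human reader.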
Comments (commit messages) are essential when you want to go back to a specific working version.
Further information on conventions:
- Conventional Commits: https://www.conventionalcommits.org/en/v1.0.0/
- https://xkcd.com/1296/

Typical process - add a commit to the main branch
In simple projects with few collaborators, or when working on a small change, we sometimes commit directly to the main branch.
- git checkout main: switch the local repository to the main branch
- git pull: retrieve the latest changes to this branch from the remote repository, to work on the most up-to-date code
- git add . (or git add [filename]): "stage" the changes made
- git commit -am "seg fault bug fix": create a commit on the branch with an appropriate message
- git pull: optional; if the remote repository has changed in the meantime, retrieve the remote changes again and resolve any conflicts before pushing
- git push: send local changes to the remote repository

Demo 5 - add a commit to the main branch
Same steps as the typical process above.

Typical process - adding via a branch
In more complex projects, or with more collaborators, the main branch is often locked. To add new functionalities, you need to create a new branch and then merge it into the main branch (after reviewing the code).
- git checkout main && git pull: switch the local repository to the main branch and retrieve the latest changes to this branch from the remote repository
- git checkout -b bugfix123: create a branch named bugfix123 from the most recent commit on main, and immediately switch to this new branch
- git add . (or git add [filename]): "stage" the changes made
- git commit -m "seg fault bug fix": create a commit on the branch with an appropriate message
- git checkout main && git pull: switch to the main branch and retrieve remote changes in case anything has changed
- git checkout bugfix123 && git merge main: switch back to the work branch, merge the main branch into it to bring it up to date, and resolve any conflicts
- Option 1) pull request: git push: send the working branch to the remote repository, then open a pull request and merge the branches on the remote repository via the web interface
- Option 2) local merge: git checkout main && git merge bugfix123: return to the main branch and merge the bugfix123 branch into it

Demo 6 - adding via a branch
Same steps as the typical process above; the demo creates the commit with git commit -am "seg fault bug fix".

About Git
- Many different uses, depending on project needs:
  - Team size, organizational structure
  - Deployment frequency and nature of the project (utilities, video game, web platform)
  - Environment: open source or proprietary?
- Main platforms: GitHub, GitLab, Bitbucket (remote repository and collaboration overlay)
- Also used for versioning documents, configuration files, graphic design assets, etc.

A few resources
- Git Book: https://git-scm.com/book/en/v2
- Branches at a glance: https://git-scm.com/book/fr/v2/Les-branches-avec-Git-Les-branches-en-bref
- Interactive learning: https://learngitbranching.js.org/

Questions?
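As a recap of the branch slides (a branch is a pointer to a commit, HEAD points to the current branch, a fast-forward merge only moves a pointer), here is a toy Python model. It is a sketch for intuition only, not how Git is implemented: the names parents, branches, commit and can_fast_forward are invented here, the commit IDs are the ones from the diagrams, and merge commits with two parents are not modelled.

```python
# Toy model: every commit knows its parent, a branch is a name pointing
# at a commit, and `head` holds the name of the current branch.
parents = {"98ca9": None, "34ac2": "98ca9", "f30ab": "34ac2"}
branches = {"main": "f30ab"}
head = "main"

def commit(new_id):
    """Add a commit on the current branch and move that branch's pointer."""
    parents[new_id] = branches[head]
    branches[head] = new_id

def is_ancestor(older, newer):
    """True if `older` can be reached from `newer` by following parents."""
    while newer is not None:
        if newer == older:
            return True
        newer = parents[newer]
    return False

def can_fast_forward(target, source):
    """Fast-forward works when target's commit is an ancestor of source's."""
    return is_ancestor(branches[target], branches[source])

branches["test"] = branches[head]   # git branch test
head = "test"                       # git checkout test
commit("87ab2")                     # two commits on test
commit("57cb5")
head = "main"                       # git checkout main

ff_before = can_fast_forward("main", "test")   # main did not change
commit("c2b9e")                                # now main has its own commit
ff_after = can_fast_forward("main", "test")    # both branches evolved

print(ff_before, ff_after)  # True False
```

In the first situation git merge test would simply move main to 57cb5; in the second, Git has to create a merge commit, which is the "complex case" from the slides.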