Git Guacamole
Credits to Bryon Smith (@bsmith89) for this. He started this lesson, James Mickley and I added to it
Pre-Workshop Setup
Open up Socrative again
QUESTION: How many of you have used git or other version control software before?
1. Introduction: Automated Version Control
- PhDComics slide
- Unlimited undo, helps us know what each change does: Who made it & why?
- Lets people collaborate on the same code at the same time (Have you used google docs?)
- All of these things are useful for repeatability, history, fixing problems etc.
- Introduce the concept of repositories and commits
2. Setting up Git
- Git needs to know who we are (so we know who changed what)
git config --global user.name "Vlad Dracula"
git config --global user.email "vlad@tran.sylvan.ia"
git config --global color.ui "auto"
git config --global core.editor "nano -w"
git config --list
- To get help
git config --help
or git -h
- Talk about how git is run with subcommands
git <verb>
- Kind of like saying bash . Git is another commandline interpreter
- If email privacy is needed:
- https://help.github.com/articles/keeping-your-email-address-private/
3. Creating a repository
- Create the repository
cd ~/Desktop
mkdir guacamole; cd guacamole
git init
- We can check that it's there. That .git folder stores all the data, don't delete!
git status
4. Tracking changes
Recording changes, checking status, and adding notes
---------- Socrative #1 ----------
- Instead of opening an editor, we can just put the commit message in the command
git commit -m "Edited the guacamole instructions
- Oops, we need to add the file first
git status
git add instructions.txt
git commit -m "Edited the guacamole instructions
git log
- Draw a diagram of the three "areas" of git
- [filesystem] -->add [staging area] -->commit [repository]
- Think of commit as taking a snapshot, and add as deciding what's in it
- Staging area
- Lets us only commit some of our file changes.
- Keep changes with a related theme together
---------- Socrative #2 ----------
---------- Socrative #3 ----------
- Add a new file
ingredients.txt
: 2 Avocados, 1 lime, 2 tsp salt
git status
git add ingredients.txt
git commit
- Add a new ingredient, onion, to both the instructions and the ingredients;
git status
git diff
git add .
git status
git commit -m "Added onion to the recipe"
- Discuss changes to multiple files
git log
- Our log is getting long, some solutions:
- d = down, u = up, q = quit
git log -2
git log --oneline
---------- Socrative #4 ----------
5. Exploring History
Identifying and recovering old versions of files, reviewing changes
---------- Socrative #5 ----------
---------- Socrative #6 ----------
6. Ignoring things (.gitignore
)
- What if we include some files that we don't want to track?
nano a.log
nano b.log
git status
could get obnoxious
nano .gitignore
git status
git add .gitignore
git commit -m "Ignore some files."
git status
- We could add anyway by using the -f force option
git add a.log
git add -f a.log
git status
git reset HEAD .
git status
git status --ignored
7. Remotes in Github
- Git is great for collaboration!
- Direct learners to make a github account and/or log in
- Make a github repository called 'guacamole'
- This is like mkdir, cd, and git init on the GitHub server
- But we want to connect the two
- Follow the instructions to add the remote repository
- Be sure to use HTTPS
- SSH is preferred and more secure, but it's more complicated to setup
git remote add origin [URL]
git remote -v
- The name origin is the local nickname for the remote repository.
- Push our local repository to the remote
git push origin master
- Pushing branch master to origin
- Explore the repository on github
- Viewing files, seeing history of file, editing
- Show students that local changes which are NOT committed don't end up on the remote.
- Ask students to discuss the difference between 'commit' and 'push'.
- To get the latest version FROM the remote:
8. Collaborating
9. Conflicts
How git handles conflicts
- When people are working in parallel (or on 2 computers), there's a chance they could both edit the same file
So let's make one, and see what happens and what to do about it
- Make an arbitrary change to your collaborators
instructions.txt
.
- Push that change upstream.
- Make (and commit) a different change to your
instructions.txt
on the same line as your collaborator changed.
git push origin master
- What happened? Git detected overlap and prevented us from messing it up
- We now have to get the changes from the server, and merge them with ours
git pull origin master
git status
cat instructions.txt
- HEAD is your version, the commit id is the version on the github repository (your collaborator's)
- You need to rectify things first before you can send it back to the repository
nano instruction.txt
git add instuctions.txt
git status
git commit <...>
This actually is the merge
git log
git push origin master
- In practice, it's a good idea to run
git pull origin master
before making changes
- Show them how to comment on a diff in Github
Merging is another reason to split up changes into small groups of files
15. Branching
- Branching is just collaboration with yourself.
- You're carrying on a parallel line of thought
- Experiment with reformatting your entire thesis without risking your progress.
- Or rewrite a paper for a particular publication
13. Hosting
- One nice thing about git is that it's "distributed"
- Any git repository can also be the server, there's no need for a central server
- So two users can just collaborate with each other without the need for github etc.
- It's also easy to move from github to some other service
- But websites like GitHub are very useful. They're a little nicer than the commandline version of git.
- Github, however, is built on openness. By default all repositories are public
- This may not be ideal for you, or may even be illegal (HIPAA)
- Some options:
- Use another service that allows private repositories. Gitlab and Bitbucket are a bit better.
- Apply for gitlab for Educational Use: https://education.github.com/discount_requests/new
- Setup a computer as a server at your institution. More maintainence but you have control
- You can even install your own private Gitlab or BitBucket
10. Open Science
- This is basically an electronic lab notebook
- It's also a good way to share information for reproducibility
- The data, and code are there together. Maybe also figures or the paper itself
14. Using Git from RStudio
11. Licensing
12. Citation