Git for Data Science: Working with Git on Oracle Data Science - Part 3

Author: Philip Godfrey

What is Git?

Git is a version control system that tracks changes made to a set of files. This makes it well suited to collaboration between teams, and it allows you to revert to a previous version of the files as needed.

Putting your code under version control is essential so you can keep track of changes as you work through your various data science projects.

In the previous blogs, we configured Git and initialized the repository in Oracle Data Science, then worked through cloning a repository and working with Git, such as creating a branch and making changes. If you missed either of these blogs, you can read them here and here.

In this blog we will focus on staging changes, committing those changes, and publishing the branch we’ve recently created.

Stage change


Staging a change moves the file from “Changed” to “Staged”, which marks it for inclusion in the next commit.

We can now see the file has moved into the “Staged” area.
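For readers who prefer a terminal, the same step can be sketched with the Git command line. This is a minimal sketch rather than the exact commands the Oracle Data Science interface runs: the temporary repository, the placeholder identity and the file contents are assumptions made so that the example runs on its own.

```shell
set -eu

# Throwaway repository so the example is self-contained (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "# DataScience" > Readme.md
git add Readme.md
# Placeholder identity, used for this one commit only.
git -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Initial commit"

# Edit the file: it is now changed but not yet staged.
echo "Additional text" >> Readme.md
unstaged_before="$(git diff --name-only)"

# Stage the change, which is what moving the file to "Staged" does.
git add Readme.md
unstaged_after="$(git diff --name-only)"
staged_after="$(git diff --cached --name-only)"

echo "Changed before staging: $unstaged_before"
echo "Staged after git add:   $staged_after"
```

Here `git diff --name-only` lists files with unstaged changes and `git diff --cached --name-only` lists files that are staged, so the file should disappear from the first list and appear in the second.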


Before committing, it’s useful to add a summary and description, which summarise your changes.

For example:

Summary: Additional text added to readme.md

Description (optional): Additional text has been added to the Readme.md file to show how changes to files are made


Committing a change

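When we click Commit, the staged change is recorded in the local repository on the current branch (DataScience_Branch). Nothing is sent to GitHub until the branch is published.

Before this happens, we need to provide the name and email address of the user who is making the change. These are recorded with the commit, which is useful if any further information or comments are required.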


We will receive confirmation at the bottom of the screen that the commit has been made.
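On the command line, the name and email address are typically set with `git config`, and the summary and description map onto two `-m` options of `git commit`. The sketch below assumes a throwaway repository and a placeholder identity ("Example User"); only the branch name, summary and description come from this blog.

```shell
set -eu

# Throwaway repository so the example is self-contained (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity: replace with your own name and email address.
git config user.name "Example User"
git config user.email "user@example.com"

echo "# DataScience" > Readme.md
git add Readme.md
git commit -q -m "Initial commit"

# Work on the branch created in the previous blog.
git checkout -q -b DataScience_Branch
echo "Additional text" >> Readme.md
git add Readme.md

# First -m is the summary, second -m is the optional description.
git commit -q \
    -m "Additional text added to readme.md" \
    -m "Additional text has been added to the Readme.md file to show how changes to files are made"

subject="$(git log -1 --format=%s)"
author="$(git log -1 --format='%an <%ae>')"
branch="$(git rev-parse --abbrev-ref HEAD)"
echo "Committed \"$subject\" as $author on $branch"
```

The commit is recorded locally on DataScience_Branch with the author details attached; nothing has been sent to a remote at this point.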



Publish Branch

Before we can confirm the changes in GitHub, we need to publish the branch. This makes the branch available in the DataScience repository. What we expect to see is that we have a main branch, but also a DataScience_Branch which sits underneath.

Again, you will be prompted for your credentials. Once they are provided, the branch will be published and we’ll receive confirmation that it has been successfully pushed.
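Publishing a branch corresponds to pushing it to the remote and setting it as the upstream. In the sketch below, a local bare repository stands in for GitHub so that the example runs without credentials; that stand-in, the directory names and the placeholder identity are all assumptions.

```shell
set -eu

work="$(mktemp -d)"

# A local bare repository stands in for the GitHub remote (assumption).
git init -q --bare "$work/remote.git"

git init -q "$work/DataScience"
cd "$work/DataScience"
# Name the initial branch "main" regardless of the local Git default.
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "# DataScience" > Readme.md
git add Readme.md
git commit -q -m "Initial commit"
git remote add origin "$work/remote.git"
git push -q origin main

# Create the branch and commit a change on it.
git checkout -q -b DataScience_Branch
echo "Additional text" >> Readme.md
git commit -q -am "Additional text added to readme.md"

# Publish the branch: push it and record origin as its upstream.
git push -q -u origin DataScience_Branch

remote_branches="$(git ls-remote --heads origin | awk '{print $2}')"
upstream="$(git rev-parse --abbrev-ref 'DataScience_Branch@{upstream}')"
echo "$remote_branches"
echo "Upstream: $upstream"
```

After the push, the remote lists both main and DataScience_Branch. Against a real GitHub repository the same `git push -u origin DataScience_Branch` command would prompt for credentials.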


Confirm change in GitHub

The final check is to go into GitHub and confirm these changes have been made, so we expect the following:

· A new branch has been created (DataScience_Branch)

· This branch has a Readme.md file which contains additional text



We can see the new branch exists, and the change to the readme file has been made. GitHub also advises that the branch is 1 commit ahead of main.


We can review these changes within GitHub to see which files have been changed, and what has changed in each.
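The same checks can be approximated locally with the Git command line: counting how many commits the branch is ahead of main, and listing what differs between the two. This is a sketch under assumptions (a throwaway repository with a placeholder identity and example file contents), not a description of how GitHub computes its comparison.

```shell
set -eu

# Throwaway repository so the example is self-contained (assumption).
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Name the initial branch "main" regardless of the local Git default.
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "# DataScience" > Readme.md
git add Readme.md
git commit -q -m "Initial commit"

git checkout -q -b DataScience_Branch
echo "Additional text" >> Readme.md
git commit -q -am "Additional text added to readme.md"

# Commits reachable from the branch but not from main.
ahead="$(git rev-list --count main..DataScience_Branch)"

# Files that differ between main and the branch.
changed_files="$(git diff --name-only main DataScience_Branch)"

echo "Commits ahead of main: $ahead"
echo "Files changed: $changed_files"

# Line-by-line view of what changed.
git --no-pager diff main DataScience_Branch
```

With one commit on the branch, the count is 1 and Readme.md is the only file listed.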



In the next blog we will focus on some further Git practices, including creating pull requests and merging them.

