Git for Data Science: Working with Git on Oracle Data Science - Part 4

 Author: Philip Godfrey

What is Git?

Git is a version control system that tracks changes made to a set of files. This makes it well suited to collaboration between teams, and lets you revert to a previous version of the files as needed.

Putting your code under version control is essential so you can keep track of any changes as you work through your various data science projects.

In the previous blogs, we configured Git and initialized the repository in Oracle Data Science, cloned a repository, made some changes to files, and then staged them.

In this blog, we will focus on creating a pull request and merging that pull request into the main branch.

 

Pull request

A pull request proposes merging the changes we’ve made in a branch into the main branch. The changes are only added to main once the pull request is merged.

At this point we can also add Reviewers, Assignees, Projects, Milestones, etc., which are essential to a fully functioning project team, successful code development, and CI/CD.



For the purposes of the demo, we will create the pull request without any reviewers, but in a real project scenario, the use of reviewers is recommended.
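
The same pull request can also be opened from a notebook session instead of the GitHub web page. The sketch below is a minimal illustration, assuming GitHub's REST API endpoint for creating pull requests. The owner `my-user`, the branch `my-feature-branch`, the title, and the token placeholder are hypothetical values, not ones taken from this series. It only builds the request, so nothing is sent unless you uncomment the last line.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_pull_request(owner, repo, head, base, title, body, token):
    """Build (but do not send) a request that opens a pull request."""
    payload = {"title": title, "head": head, "base": base, "body": body}
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# All argument values below are placeholders for illustration.
pr_request = build_pull_request(
    owner="my-user",
    repo="DataScience",
    head="my-feature-branch",  # branch that contains the changes
    base="main",               # branch the changes should be merged into
    title="Update notebook",
    body="Adds the changes made on my branch.",
    token="<personal-access-token>",
)

print(pr_request.get_method(), pr_request.full_url)
# urllib.request.urlopen(pr_request)  # uncomment to actually create the pull request
```

Reviewers are left out here to mirror the demo. In a real project you would normally request them after the pull request exists.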


At this point, GitHub checks whether main has changed since we branched and whether there are any conflicts. As no conflicts have arisen, we are able to merge the pull request.
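
Merging can be scripted in the same way as clicking the merge button. As a hedged sketch, assuming GitHub's REST API endpoint for merging a pull request, the function below builds the call without sending it. The owner `my-user`, the pull request number `1`, and the token placeholder are assumptions made for illustration.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_merge_request(owner, repo, pull_number, token, merge_method="merge"):
    """Build (but do not send) a request that merges an open pull request."""
    if merge_method not in ("merge", "squash", "rebase"):
        raise ValueError(f"unsupported merge method: {merge_method}")
    payload = {"merge_method": merge_method}
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{owner}/{repo}/pulls/{pull_number}/merge",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="PUT",
    )


# Placeholder values: replace with your own account, repository and PR number.
merge_request = build_merge_request(
    owner="my-user",
    repo="DataScience",
    pull_number=1,
    token="<personal-access-token>",
)

print(merge_request.get_method(), merge_request.full_url)
# urllib.request.urlopen(merge_request)  # uncomment to actually merge
```

If the pull request has conflicts, the call is expected to fail rather than merge, just as the merge button is unavailable on the web page.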

It is also good practice to add a comment when closing a pull request. Typically a pull request resolves a problem or question, which GitHub refers to as an “issue”. Each issue is numbered, and referencing that number (#1) in the pull request links the pull request to the issue.

After merging, now that the changes are part of main, you can safely delete your branch (your working copy of the code).
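
Deleting the branch on GitHub does not remove the copy in your notebook session. The following is a minimal sketch of the local tidy-up, assuming the remote is named `origin` and the branch is called `my-feature-branch` (both are assumptions). It runs in dry-run mode by default, so it only prints the Git commands it would execute.

```python
import subprocess


def cleanup_commands(branch, main="main", remote="origin"):
    """Return the Git commands that tidy up after a merged pull request."""
    return [
        ["git", "checkout", main],            # switch back to the main branch
        ["git", "pull", remote, main],        # bring in the merged changes
        ["git", "branch", "-d", branch],      # delete the local branch
        ["git", "fetch", "--prune", remote],  # drop references to deleted remote branches
    ]


def run_cleanup(branch, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in cleanup_commands(branch):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


run_cleanup("my-feature-branch")  # dry run: prints the commands only
```

`git branch -d` refuses to delete a branch that Git does not consider fully merged, which acts as a safety net. That can also happen after a squash merge, where the branch's own commits never reach main.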



Now if we look in the DataScience repository, we can see the changes have been applied to the main branch, ready for other users to clone from.


Note: when merging a pull request, you can also close the issue at the same time. As mentioned earlier, issues are numbered, and referencing the issue number (#1) links the pull request and the issue together. For GitHub to close the issue automatically on merge, place a closing keyword such as “Resolves” directly before the number, e.g.,

 Resolves #1
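
GitHub documents a small set of closing keywords (close, fix, resolve and their variants) that close an issue on merge when written directly before the issue reference. The sketch below is a simplified approximation of that rule, not GitHub's own parser, and the function name `issues_closed_by` is hypothetical. It can be used to check a pull request description before submitting it.

```python
import re

# Closing keyword, optional colon, whitespace, then "#<number>".
CLOSING_REFERENCE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?):?\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(description):
    """Return the issue numbers a description would close under this simplified rule."""
    return [int(number) for number in CLOSING_REFERENCE.findall(description)]


print(issues_closed_by("Resolves #1"))              # [1]
print(issues_closed_by("Fixes #12 and closes #7"))  # [12, 7]
print(issues_closed_by("Related to #3"))            # []
```

A plain mention such as "Related to #3" still links the two in GitHub, but under this rule it would not close the issue.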


If there is anything else you'd like to see in Git / Data Science, reach out and let me know in the comments.

Hope you enjoyed it! 

