Combining Unrelated Git Repositories: When Projects Collide!
2022-05-07T00:24:05.237Z
How do you merge two Git repositories?
Categories:
- git
- projects
I recently started a project to build an AI-powered bookmarking extension for Chrome, using two separate Git repositories to get the guts created: one for the Chrome extension and one for a React/TypeScript front-end.
I figured it would be simpler to start with isolated repositories and combine them once I had basic functionality working for each part. Eventually I needed to combine these repositories in order to use the front-end for the popup and Bookmark page override in the chrome extension.
I went Google hunting and found this concise, well-voted answer on how to merge one repository into another:
How do you merge two Git repositories?
_Basically, I rewrote the history of the my-plugin repository so that it appeared all development took place in the…_stackoverflow.com
It is definitely worth reading and is by definition the TLDR for this article.
After digging through the comments there I decided to go line by line through the recommended approach and check it against the git documentation in order to better understand what was happening under “the porcelain” as they say. Lucky for you, I took notes.
For reference, here is the top-voted and accepted solution to merge project-a into project-b, with line numbers that link to the explanations below:
1 cd path/to/project-b
2 git remote add project-a /path/to/project-a
3 git fetch project-a --tags
4 git merge --allow-unrelated-histories project-a/master # or whichever branch you want to merge
5 git remote remove project-a
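To try the five steps without touching real projects, here is a minimal sketch that runs them against two throwaway repositories in a temporary directory. The file names, commit messages, and demo identity are made up for illustration, and --no-edit is added only so the merge does not open an editor when run as a script.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox, safe to delete afterwards
# Placeholder identity so the demo commits work on a machine with no git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two unrelated one-commit repositories (hypothetical contents)
for name in project-a project-b; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # pin the branch name
  echo "$name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "Initial commit of $name"
done

cd "$work/project-b"                                                   # step 1
git remote add project-a "$work/project-a"                             # step 2
git fetch -q project-a --tags                                          # step 3
git merge -q --no-edit --allow-unrelated-histories project-a/master    # step 4
git remote remove project-a                                            # step 5

git log --oneline --graph   # both initial commits plus the merge commit
ls                          # files from both projects side by side
```

If your default branch is called main rather than master, adjust the branch name in step 4 accordingly.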
We start easily enough:
1 cd path/to/project-b
The cd command changes your working directory to project-b so that we can merge project-a into it. Simple :)
2 git remote add project-a /path/to/project-a
Git - git-remote Documentation
_With no arguments, shows a list of existing remotes. Several subcommands are available to perform operations on the…_git-scm.com
The git remote command ‘manages the set of repositories (“remotes”) whose branches are tracked’, and we’ll be using it again later as well.
git remote add adds the named remote repository from the specified path or URL. So here we are adding project-a as a remote of the project-b repository from the location /path/to/project-a.
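As a quick illustration of what git remote add actually records, the sketch below adds a remote in a scratch repository and prints the resulting configuration. The temporary paths are placeholders; only the remote name project-a comes from the walkthrough.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
git init -q "$work/project-a"
git init -q "$work/project-b"

cd "$work/project-b"
git remote add project-a "$work/project-a"

git remote -v                                # lists project-a for fetch and push
git config --get remote.project-a.url        # the path we passed in
git config --get remote.project-a.fetch      # refspec mapping its branches to project-a/*
```

Nothing is downloaded at this point; the command only writes the URL and a fetch refspec into project-b’s configuration.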
3 git fetch project-a --tags
Git - git-fetch Documentation
_all Fetch all remotes. -a --append Append ref names and object names of fetched refs to the existing contents of…_git-scm.com
git fetch ‘downloads objects and refs from another repository’.
fetch downloads branches and tags (collectively, ‘refs’) from the named repositories, along with the objects necessary to complete their histories. By default, any tag that points into the fetched histories is also fetched.
Using the --tags option fetches all tags from the remote’s ‘refs/tags/’ namespace into local tags with the same name.
As these projects are both quite young and I have not begun to version them, I do not have any tags of value to combine, so I am not going to include the --tags flag in my command.
For a bit more depth on tags refer to my diversion below.
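To see the difference between a plain fetch and fetch --tags, here is a small sketch. It assumes a project-a with one tag on its branch history and one tag on a commit that no branch reaches; the tag names v0.1 and experiment are invented for the demo.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/project-a"
cd "$work/project-a"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git tag v0.1                                   # tag on the branch history
git checkout -q --detach
git commit -q --allow-empty -m "side experiment"
git tag experiment                             # tag on a commit no branch reaches
git checkout -q master

git init -q "$work/project-b"
cd "$work/project-b"
git remote add project-a "$work/project-a"

git fetch -q project-a                         # follows only tags on fetched history
after_plain=$(git tag)
echo "after plain fetch:  $after_plain"

git fetch -q project-a --tags                  # brings over every tag
echo "after --tags fetch: $(git tag | tr '\n' ' ')"
```

For young, untagged projects like the ones in this article, both commands end up fetching the same thing, which is why dropping the flag is harmless here.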
4 git merge --allow-unrelated-histories project-a/master # or whichever branch you want to merge
Git - git-merge Documentation
_commit --no-commit Perform the merge and commit the result. This option can be used to override --no-commit. With…_git-scm.com
Here we get to the meat of the process with git merge, which ‘joins two or more development histories together’.
The merge command incorporates changes from the named commit(s) into the current branch. It will ‘replay’ the changes made on the named commit’s branch since it diverged from the current branch and record the result in a new commit, along with the names of the two parent commits and a log message from the user. As you might already know, this is a common scenario with multiple branches in a single repository, and that is the way the merge command is generally designed and used. It does this by comparing the state of the files in the named commit’s branch to the same files’ state in the current branch, using their common ancestor as the base, and applying a diffing algorithm to find where they diverge. Non-overlapping changes are merged automatically, while overlapping changes, or ‘conflicts’, are presented to the user to resolve.
Rather than dig too deep on the underlying mechanics I’ll just point to this simple answer as a good introduction to the three-way-merge algorithm that git defaults to, a good overview of the different merge strategies, and Atlassian’s worthy tutorial entry.
In our case we are not merging branches with a shared history, so the --allow-unrelated-histories flag explicitly lets us merge two histories that do not have a common ancestor. By default, git merge refuses to do this, as a safety check against accidentally joining the histories of projects that were never meant to be combined; the flag overrides that check. Since combining two separate projects is exactly what we want here, passing it is appropriate.
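The refusal is easy to observe. The sketch below, which assumes a Git version recent enough to have the flag and uses made-up file names, attempts the merge without the flag first and then repeats it with the flag.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for name in project-a project-b; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/master
  echo "$name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "Initial commit of $name"
done

cd "$work/project-b"
git remote add project-a "$work/project-a"
git fetch -q project-a

# Without the flag: git should refuse, since the histories share no ancestor
if git merge --no-edit project-a/master 2>"$work/err.txt"; then
  refused=no
else
  refused=yes
  cat "$work/err.txt"   # git's own explanation of the refusal
fi

# With the flag: the merge proceeds and records a commit with two parents
git merge -q --no-edit --allow-unrelated-histories project-a/master
git log --oneline --graph
```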
For the truly adventurous, here is the actual implementation of git merge, where a quick Find in Page for the string ‘allow_unrelated’ will take you to where the --allow-unrelated-histories flag is handled.
An important note picked up from the docs is to make sure your working tree is up to date with all changes on both the named and current branches (or, in our case, repositories), with no outstanding uncommitted changes that you might care about. Otherwise, a conflicted merge can leave you in a state that is hard to back out of, and those outstanding changes have a chance of being lost.
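One way to check for outstanding changes before merging is git status --porcelain, which prints nothing when the working tree is clean. The helper name require_clean_tree below is hypothetical, as is the notes.txt file used to make the tree dirty.

```shell
set -e
# Hypothetical helper: succeeds only if the repository at $1 has no
# staged, unstaged, or untracked changes.
require_clean_tree() {
  if [ -n "$(git -C "$1" status --porcelain)" ]; then
    echo "uncommitted changes in $1, commit or stash them first" >&2
    return 1
  fi
}

work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/project-b"
echo draft > "$work/project-b/notes.txt"          # an uncommitted file

if require_clean_tree "$work/project-b"; then before=clean; else before=dirty; fi

git -C "$work/project-b" add notes.txt
git -C "$work/project-b" commit -q -m "Save notes"

if require_clean_tree "$work/project-b"; then after=clean; else after=dirty; fi
echo "before commit: $before, after commit: $after"
```

Running the same check in both repositories before step 2 is a cheap safeguard.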
5 git remote remove project-a
Finally, we return to git remote and use git remote remove to remove the remote named project-a from project-b’s configuration. All remote-tracking branches and configuration settings for project-a are removed from project-b, so that our now-merged project no longer refers to the still-existing project-a. This completes our process by uncoupling the two repositories after we have successfully incorporated project-a’s files and history into project-b.
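A short sketch of what removal does and does not delete: the remote-tracking refs disappear, while the commits that were merged stay in project-b’s history. Repository contents and commit messages are placeholders.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for name in project-a project-b; do
  git init -q "$work/$name"
  git -C "$work/$name" symbolic-ref HEAD refs/heads/master
  echo "$name" > "$work/$name/$name.txt"
  git -C "$work/$name" add .
  git -C "$work/$name" commit -q -m "Initial commit of $name"
done

cd "$work/project-b"
git remote add project-a "$work/project-a"
git fetch -q project-a
git merge -q --no-edit --allow-unrelated-histories project-a/master

tracking_before=$(git branch -r)
git remote remove project-a
tracking_after=$(git branch -r)

echo "remote-tracking branches before: $tracking_before"
echo "remote-tracking branches after:  $tracking_after"
git log --format=%s    # project-a's commit is still part of the history
```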
Although it is not included in the original Stack Overflow answer, we could add a line like
rm -rf path/to/project-a
if we no longer wanted to keep the copy of the original project-a.
I wound up cleaning up both projects’ working trees and then ran these exact commands to merge my two projects. It went as smoothly as advertised in the answer comments, with only one line of conflict in my .gitignore file which was easily fixed. If you’ve read this far, I hope you’ve got a couple of repos to smash together! Good luck!
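For anyone who hits a similar .gitignore conflict, here is a sketch of one way to resolve it: keep the union of both files. This is an assumption about a reasonable fix, not a record of what I actually typed, and the ignore patterns and the make_repo helper are invented for the demo.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one-commit repo whose .gitignore holds the two given lines
make_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  printf '%s\n' "$2" "$3" > "$1/.gitignore"
  git -C "$1" add .gitignore
  git -C "$1" commit -q -m "Add .gitignore"
}
make_repo "$work/project-a" 'node_modules/' 'build/'
make_repo "$work/project-b" 'node_modules/' 'dist/'

cd "$work/project-b"
git remote add project-a "$work/project-a"
git fetch -q project-a

# Both sides add .gitignore with different content, so the merge stops
if git merge --no-edit --allow-unrelated-histories project-a/master >/dev/null 2>&1; then
  conflicted=no
else
  conflicted=yes
fi
git status --short   # should list .gitignore as unmerged

# Resolve by taking every line from our version (:2:) and theirs (:3:)
{ git show :2:.gitignore; git show :3:.gitignore; } | sort -u > .gitignore
git add .gitignore
git commit -q --no-edit   # concludes the merge with the default message
cat .gitignore
```

Editing the conflict markers by hand works just as well; the union approach only suits files like .gitignore where line order does not matter.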
A diversion on tags in Git
Here I felt the need to update myself on the tag functionality in git. Once again, the excellent git docs lay it out.
Git - Tagging
_Like most VCSs, Git has the ability to tag specific points in a repository’s history as being important. Typically…_git-scm.com
Essentially, tags in git can be used to create a label for a particular commit, allowing you to reference it later. They provide a fixed reference to a specific commit, as opposed to a branch, which starts at a specific commit and then moves as new commits are added. This allows for, in particular, versioning notation, so that a specific commit can be labeled as e.g. v1.4 and later referenced via git commands.
There are two types of tags: ‘lightweight’ and ‘annotated’.
A lightweight tag stores only the tag name and a reference to the commit checksum. It can be used for temporary tags that mark transitional states, or more generally for tags that are not expected to be maintained or shared.
Annotated tags include a message as well as a tag name, and store a full checksummed object in the git database including the tagger’s information and tagging date. This is the recommended type of tag for most tagging as it provides full information and can be signed and verified with GPG if needed.
A nice feature is the ability to tag commits after the fact so that the labeling process in the case of versioning does not have to take place in real-time but can be managed separately.
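The two tag types, and tagging after the fact, can be sketched in a scratch repository as follows. The tag names scratch and v1.4 and the commit messages are placeholders.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
git commit -q --allow-empty -m "first commit"
git commit -q --allow-empty -m "second commit"
first=$(git rev-list --max-parents=0 HEAD)   # hash of the root commit

git tag scratch                              # lightweight tag on the current commit
git tag -a v1.4 -m "Release 1.4" "$first"    # annotated tag on an older commit

git cat-file -t scratch    # a lightweight tag resolves straight to a commit
git cat-file -t v1.4       # an annotated tag is its own object in the database
git show -s --format=%s 'v1.4^{commit}'      # subject of the commit that v1.4 labels
```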
Tags are not shared by default when using e.g. git push to a remote, so, just like our git fetch --tags call, a git push --tags option exists to send tags to a remote repository if desired.
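Here is a small sketch of that behaviour using a local bare repository as a stand-in for a real remote. It assumes default push settings (for instance, push.followTags is not enabled); the remote name origin and the tag v1.4 are placeholders.

```shell
set -e
work=$(mktemp -d)   # throwaway sandbox
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare "$work/remote.git"        # stand-in for a hosted remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git tag -a v1.4 -m "Release 1.4"
git remote add origin "$work/remote.git"

git push -q origin master                    # pushes the branch only
tags_after_plain_push=$(git -C "$work/remote.git" tag)
echo "remote tags after plain push: [$tags_after_plain_push]"

git push -q origin --tags                    # now the tag goes too
echo "remote tags after --tags push: [$(git -C "$work/remote.git" tag)]"
```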