merging git histories between repos

at work, i’m currently open sourcing a Moodle plugin we’ve maintained for a while now. at first the plugin was only soft-forked into our Moodle repo, so the history from the earlier work on it doesn’t exist there.

we wanted to share it as our way of giving back to the community, and this led to the conclusion that we should also preserve our commit history from the past 3-4 years of light maintenance on our part. the original plugin has “died” and is no longer maintained by its past maintainers, which was the reason we soft-forked it (from what i’ve been told).

this led to a new, interesting problem i’d never encountered: migrating a specific plugin’s history from one Git repo into another, while preserving the commit history and making it follow on from the new repo’s existing commit history.

while this might sound like an easy rebase, i (as a junior) had my difficulties with it hehe. the task is basically to “extract” a sub-directory from a large project (our Moodle instance) and merge it into a different existing project (the fork to be open sourced).

this is my first time working on open sourcing something professionally, and i don’t have a lot of experience with it, but i’m super hyped about it. my senior literally said to the others, during the sprint meeting (which i couldn’t attend due to working for another client), that “Sid is gonna be a bit hyped when he sees this task on his Jira board”.

well, for those who might have the same problem and wonder “how am I gonna do this”, i got you!

setting up the projects

first off, let’s call the large project “Project A” and the smaller fork “Project B” for easier understanding. we’ll then start by creating a copy of Project A as a safety measure, so the original repo is never touched.

cp -r project-a project-a-copy
cd project-a-copy

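then we need to filter out the unnecessary files and history with git filter-repo (a separate tool you install yourself, it doesn’t ship with git). git filter-repo deletes everything in the repository except the files inside ${path-to-sub-directory}, as well as all commits that didn’t touch those files. we end up keeping only the sub-directory from the large project, with its commits. heads up: filter-repo can refuse to run on a repo that doesn’t look like a fresh clone; since we’re working on a throwaway copy anyway, adding --force gets past that.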

git filter-repo --path ${path-to-sub-directory}
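
if you want a rough idea of how many commits should survive the filter, you can count the ones that touch the sub-directory before running it. here’s a minimal sketch on a throwaway repo; the repo, the mod/myplugin path and the commit messages are all made up for the demo, so swap in your own path when trying it on the real copy.

```shell
#!/bin/sh
set -eu

# throwaway demo repo; every name and path below is hypothetical
demo=$(mktemp -d)
cd "$demo"
git init -q project-a
cd project-a
git config user.name "Demo Dev"
git config user.email "demo@example.com"

# one commit outside the plugin, two commits inside it
echo "core" > core.txt
git add core.txt
git commit -q -m "core: unrelated change"
mkdir -p mod/myplugin
echo "v1" > mod/myplugin/version.txt
git add mod
git commit -q -m "plugin: first version"
echo "v2" > mod/myplugin/version.txt
git commit -q -am "plugin: second version"

# all commits vs. the commits that touch the sub-directory
total=$(git rev-list --count HEAD)
kept=$(git rev-list --count HEAD -- mod/myplugin)
echo "total commits: $total, touching mod/myplugin: $kept"
```

in this linear demo history the second number is what you’d expect to see in git log after filtering. with lots of merges in the real history the numbers may not match exactly, so treat it as a sanity check and not a guarantee.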

now the copy of Project A has only the files for the sub-directory, but they will still sit under their original path (under /src or something, for example). this can quickly be fixed by lifting the content to the root, using

git filter-branch -f --subdirectory-filter ${path-to-sub-directory} -- --all
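
to see what “lifting to root” actually does, here’s a small sketch you can run in a scratch folder. it assumes a made-up repo with a hypothetical mod/myplugin folder, and it sets FILTER_BRANCH_SQUELCH_WARNING only to skip the deprecation notice newer git versions print for filter-branch.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the filter-branch deprecation notice

# throwaway demo repo; names and paths are hypothetical
demo=$(mktemp -d)
cd "$demo"
git init -q project-a-copy
cd project-a-copy
git config user.name "Demo Dev"
git config user.email "demo@example.com"

# one commit outside the plugin folder, one inside it
echo "core" > core.txt
git add core.txt
git commit -q -m "core: unrelated change"
mkdir -p mod/myplugin
echo "v1" > mod/myplugin/version.txt
git add mod
git commit -q -m "plugin: first version"

# make mod/myplugin the new repo root, on all branches
git filter-branch -f --subdirectory-filter mod/myplugin -- --all >/dev/null

# tracked files and commit count after the rewrite
files=$(git ls-files)
commits=$(git rev-list --count HEAD)
echo "tracked files: $files"
echo "commits left: $commits"
```

the plugin file ends up at the top level and the commit that never touched the folder is gone. git filter-repo also has its own --subdirectory-filter option that does the filtering and the lifting in one go, if you’d rather stick to one tool.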

connecting the projects and merging

now that we’ve prepared Project A, we can move on to the folder for Project B. here we want to add a remote that points to the cleaned-up Project A copy. this will make it so Project B can see the commits from the extracted sub-directory (the plugin in my case).

cd ../project-b
git remote add plugin-source ../project-a-copy
git fetch plugin-source
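
before merging, it can be worth checking what the fetch actually brought in, and that the two histories really are unrelated. a minimal sketch with two made-up repos follows; the make_repo helper, the file names and the main branch name are assumptions for the demo, not something from my actual setup.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
cd "$demo"

# hypothetical helper: create a repo with a single commit on a branch called main
make_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
    git -C "$1" config user.name "Demo Dev"
    git -C "$1" config user.email "demo@example.com"
    echo "$2" > "$1/$2.txt"
    git -C "$1" add .
    git -C "$1" commit -q -m "$2: initial commit"
}
make_repo project-a-copy plugin
make_repo project-b fork

cd project-b
git remote add plugin-source ../project-a-copy
git fetch -q plugin-source

# list the remote-tracking branches the fetch created
remote_branches=$(git branch -r)
echo "$remote_branches"

# with no common ancestor, merge-base prints nothing and exits non-zero
if git merge-base HEAD plugin-source/main >/dev/null; then
    related=yes
else
    related=no
fi
echo "shared history: $related"
```

git branch -r is also the quickest way to find out whether the branch you need is called main, master or something else entirely.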

“plugin-source” is just the name for our remote. it could be named whatever you want, like “target-commits”.

let’s merge them! run the following command (swap main for whatever the branch in the Project A copy is called), go through your merge conflicts and merge that merge.

git merge plugin-source/main --allow-unrelated-histories

PS: little heads up, as our projects A and B don’t share a common ancestor (they are completely different projects), git will refuse to merge them by default. that is why the --allow-unrelated-histories flag is added to force git to combine them, stitching the history of the sub-directory in Project A into Project B.
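
if you’d like to see both behaviours without risking your real repos, here’s a rough sketch on two made-up ones. the make_repo helper, the author names and the file names are hypothetical, and the files are chosen so they don’t conflict, which won’t necessarily be true for your merge.

```shell
#!/bin/sh
set -eu

demo=$(mktemp -d)
cd "$demo"

# hypothetical helper: one commit on main, with its own author name
make_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/main
    git -C "$1" config user.name "$3"
    git -C "$1" config user.email "demo@example.com"
    echo "$2" > "$1/$2.txt"
    git -C "$1" add .
    git -C "$1" commit -q -m "$2: initial commit"
}
make_repo project-a-copy plugin "Plugin Dev"
make_repo project-b fork "Fork Dev"

cd project-b
git remote add plugin-source ../project-a-copy
git fetch -q plugin-source

# without the flag, git stops before touching anything
if git merge --no-edit plugin-source/main 2>/dev/null; then
    plain_merge=accepted
else
    plain_merge=refused
fi
echo "plain merge: $plain_merge"

# with the flag, the two histories are joined by a merge commit
git merge -q --no-edit --allow-unrelated-histories plugin-source/main

# author and subject of every commit now reachable from Project B
history=$(git log --format='%an | %s')
echo "$history"
```

the log should list the commit from the plugin repo with its original author next to the fork’s own commit, plus the new merge commit on top.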

finito

the result of this operation is that Project B now contains the sub-directory/plugin/whatever you wanted to merge into, and if you look at the git log in Project B, you will see all the original development history (authors, dates, messages) from when that plugin was still part of Project A.

toodles!
