Moving a repository to a new source control management (SCM) system can be challenging in an enterprise environment. To the novice it might seem simple: just copy the files from one system to another and “bam,” it’s done. However, most projects in an enterprise are constantly being updated, and changing these processes can feel like trying to change a tire while the car is still moving. In my career I have done this with many different types of source control systems, from centralized to distributed. Here are some focus areas and an approach to converting them successfully.
This post is written from the perspective of moving to a Git-based system. Git has been around for over a decade and is the most popular source control system in use today, and many companies have written applications that help you manage Git repositories.
Tool Selection – Select the new source control system and ensure it will work for your development process. Two considerations are market adoption, since a widely adopted tool has the most support and makes it easier to find talent who already know the process, and your organization’s strategic direction, since aligning with it can streamline support and procurement. Generally, picking an obscure tool for one or two special features, and then not being able to get support for it, will cause grief down the road.
Select Branching Strategy – This is generally not a hard process, and git flow is what most folks recommend. However, some systems, especially COTS systems, need some modification to this strategy. You may also want to ensure that things like generated files that cannot be merged with a previous version are properly handled.
Define Access Management – Access management is very important to define up front, as you will want to restrict some branches to the development team and make them updatable only via pull request and review processes. Start this list early on and figure out a good process. Most times an Admin\Developer\User approach works well, but you may have other considerations.
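To make the Admin\Developer\User idea concrete, here is a minimal sketch of how a branch policy could be written down and checked before you configure it in your Git hosting tool. The branch patterns, role names, and the `can_push_directly` helper are all assumptions for illustration and not any product's API; everyone who is not allowed to push directly would go through a pull request.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: (branch pattern, roles allowed to push without a pull request).
# Rules are checked top to bottom and the first matching pattern wins.
BRANCH_RULES = [
    ("main", {"admin"}),
    ("release/*", {"admin"}),
    ("*", {"admin", "developer"}),  # feature branches and everything else
]


def can_push_directly(role, branch):
    """Return True if the role may push to the branch without a pull request."""
    for pattern, allowed_roles in BRANCH_RULES:
        if fnmatchcase(branch, pattern):
            return role in allowed_roles
    return False  # no rule matched: deny by default


if __name__ == "__main__":
    for role in ("admin", "developer", "user"):
        for branch in ("main", "release/1.0", "feature/login"):
            print(role, branch, can_push_directly(role, branch))
```

Writing the policy as data like this makes it easy to review with the team before anyone touches the real permission settings.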
History Migration Approach – By far the easiest choice is to not move history, as you will likely not have to rely on it and can always retain access to the old system for a period of time. In over 27 years in the business, I have never seen a request for access to code more than a year old lead to that code being re-implemented in production. If this is a problem area for you, looking at feature flags might be a better option than re-implementing old code. If you’re working with a team that insists on history, or you need it for regulatory purposes, there are many conversion programs that will assist with this. Plan on adding some time for this during your conversion process.
Define verification process for conversion – During the conversion you will want to do a bit of cleanup, which is described below. With these types of changes, you need to be able to define the “test” that ensures you are not going to disrupt existing processes.
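One possible form of that “test” is to compare an export of the old repository with a checkout of the new one and review every difference. The sketch below is only an illustration under that assumption: the directory arguments and the `IGNORED_DIRS` set are hypothetical, and a real check would probably also build the project and run its test suite.

```python
import hashlib
from pathlib import Path

# Assumption: SCM metadata folders that should not be part of the comparison.
IGNORED_DIRS = {".git", ".svn"}


def build_manifest(root):
    """Map each file's relative path to the SHA-256 hash of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and not IGNORED_DIRS.intersection(rel.parts):
            manifest[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def compare_manifests(old, new):
    """List files missing from the new tree, added to it, or changed in it."""
    shared = set(old) & set(new)
    return {
        "missing": sorted(set(old) - set(new)),
        "added": sorted(set(new) - set(old)),
        "changed": sorted(p for p in shared if old[p] != new[p]),
    }
```

Anything that shows up as missing should line up with the cleanup you planned, such as removed binaries. Anything that shows up as changed deserves a closer look.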
Removing binaries and executables – Garbage can collect in existing repositories. Most of the time it is binaries used to set up an environment, dependencies, or generated binaries. None of these should exist in a Git repository; they will cause your repo to explode in size and significantly slow down the cloning process. They need to be moved to an artifact repository and managed there.
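A quick way to build the list of candidates is to walk the working copy and flag anything that looks binary or is unusually large. This is a rough sketch, not a definitive tool: the 1 MB `SIZE_LIMIT` is an arbitrary assumption, and checking for a NUL byte in the first few thousand bytes is a common heuristic that can misjudge some files.

```python
from pathlib import Path

SIZE_LIMIT = 1_000_000  # assumption: flag anything larger than roughly 1 MB


def looks_binary(path, sample_size=8000):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


def find_artifact_candidates(root, size_limit=SIZE_LIMIT):
    """Return (relative path, size in bytes) for binary or oversized files."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or ".git" in rel.parts:
            continue
        size = path.stat().st_size
        if looks_binary(path) or size > size_limit:
            hits.append((rel.as_posix(), size))
    return hits


if __name__ == "__main__":
    for name, size in find_artifact_candidates("."):
        print(f"{size:>12}  {name}")
```

Review the output with the team before deleting anything, since some small binaries, such as icons, may legitimately belong in the repo.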
Technical Debt Removal – Other things, like old project files that are no longer being used, should also be removed to reduce the overall technical debt in the system. If you have high technical debt that will require significant code changes, you may want to break this off into its own effort. For an SCM conversion there is likely some low-hanging fruit that can be handled quickly and make a big impact.
Identify current connections to existing SCM systems – It is likely that other systems connect to the old repositories, and those connections will need to be moved over to the new system. Identify them and add them to the plan so they are moved during the conversion.
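If the configuration for those connected systems lives in files you can get at, a simple search for the old server's host name is a reasonable starting point for the inventory. The sketch below assumes a made-up host, `svn.example.com`, and a guessed list of config file extensions; substitute your own values.

```python
from pathlib import Path

OLD_SCM_HOST = "svn.example.com"  # hypothetical host name of the old SCM server
# Assumption: file types that are likely to hold job or script configuration.
CONFIG_SUFFIXES = {".xml", ".yml", ".yaml", ".json", ".properties", ".sh", ".cfg"}


def find_references_in_text(text, old_host=OLD_SCM_HOST):
    """Return (line number, stripped line) for each line mentioning the old host."""
    return [
        (line_no, line.strip())
        for line_no, line in enumerate(text.splitlines(), start=1)
        if old_host in line
    ]


def find_scm_references(root, old_host=OLD_SCM_HOST):
    """Search config-like files under root for references to the old host."""
    root = Path(root)
    findings = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in CONFIG_SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for line_no, line in find_references_in_text(text, old_host):
                findings.append((path.relative_to(root).as_posix(), line_no, line))
    return findings
```

A search like this will not find connections configured only inside a tool's user interface, so it supplements asking the teams who own those tools rather than replacing it.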
Scanning for Secrets in existing repository – One of the most important things not to overlook is scanning for secrets in the existing repo and ensuring they do not get moved to the new system. A big consideration is that repos are designed to hold all history, so moving first and cleaning afterward is not a good idea. It is better to clean up in place, or before the very first commit into the new system. Many bots exist just to look for this type of thing, and a discovered secret can be damaging to the organization. Don’t skip this step!
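To give a feel for what a scan looks for, here is a minimal sketch that checks text line by line against a few regular expressions. The three patterns are illustrative assumptions and far from complete, and the sketch only looks at the text you hand it, not at repository history, so treat it as a way to understand the idea and not as a replacement for a dedicated secret scanner.

```python
import re

# Illustrative patterns only; real scanners ship with many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"(?i)\b(password|passwd|secret)\s*[:=]\s*['\"]?[^\s'\"]+"
    ),
}


def scan_text(text):
    """Return (line number, rule name) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, rule))
    return findings


if __name__ == "__main__":
    import sys

    for file_name in sys.argv[1:]:
        with open(file_name, encoding="utf-8", errors="ignore") as handle:
            for line_no, rule in scan_text(handle.read()):
                print(f"{file_name}:{line_no}: {rule}")
```

Any secret that does turn up should be rotated as well as removed, since it has already been exposed to everyone with access to the old repo.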
Training – This should be done in at least two phases. It is a good idea to have a few selected team members test and walk through the process and really think about how these things will affect the development process for the team. You may be surprised at some of the things you discover from getting feedback in this way. Additionally, team-wide training is critical. Even if folks know or are familiar with the new system, the repo and the process have changed, so a clear definition of how things will now work is essential.
Perform test migration – Testing the migration is an important step; it will flush out problems and ensure your defined process is working properly. Also, reviewing it with select members and architects on the team will produce additional findings that make the conversion better and leave you with a cleaner repo in the end.
Perform actual migration – Most teams cannot disrupt the value to the business by stopping feature development, so it is a good idea to plan for the outage that is least disruptive. With all the prep done above, this should be a smooth process. During this migration, apply the permissions and governance, and move the existing connections to the new SCM.
Old repository retention and destruction process – Factors in deciding when to retire your old systems include continued licensing, regulatory requirements, the possibility of PII or secrets in old code, and the process and technical debt that keeping them around adds to the system. Pick a plan and set a reminder to have it reviewed.
follow me: https://twitter.com/jmgress