Migrating to Git: Part one - CVS

In this three-part series, we look at our experience migrating projects to the Git version control system. Part by part, we will describe how to convert CVS, SVN and Mercurial repositories.

Our projects use four version control systems (CVS, SVN, Mercurial and Git). This complicates repository management and requires developers to know all four systems.

The migration primarily targets older repositories in CVS and SVN, which now seem entirely inadequate compared with newer alternatives.

Over the last two years, we have chosen either Git or Mercurial for new projects. Now we have decided to settle on a single solution, and after many discussions we chose Git.

The final impetus was the introduction of GitLab, a more convenient way to manage Git repositories.

New projects will be developed mainly in Git. We are not pushing for current Mercurial projects to be migrated, but we explored how to convert them as well.

We looked for a suitable solution for each of the three systems and chose the following tools:

  • CVS - cvs2git
  • SVN - svn2git
  • Mercurial - hg-fast-export

In today's episode we look at migration from CVS:

Prerequisites

  • Conversion tool cvs2git (modified version of cvs2svn)
  • Python
  • Git
  • CVS
  • Direct access to the CVS repository files on the machine where the conversion runs (for example, mounted as a directory)

Transfer

Settings

  • Copy the file cvs2git-example.options from the cvs2git directory to cvs2git-PROJECT.options
  • Change ctx.tmpdir = r'cvs2git-tmp' ⇒ ctx.tmpdir = r'tmp-DIRECTORY' (an existing working directory)
  • In ctx.revision_collector = GitRevisionCollector(… change the value 'cvs2git-tmp/git-blob.dat' ⇒ 'tmp-DIRECTORY/git-blob.dat'
  • Set the encoding: adjust every CVSTextDecoder (or list only utf8 if the CVS data is already in UTF-8). The encodings are tried in the order listed:
    CVSTextDecoder(
        [
            'cp1250',
            'utf8',
            'ascii',
            ],
        #fallback_encoding='ascii'
        )
  • Fill in author_transforms={… with the list of authors from CVS.
    To list every author in the history, run this in a checked-out CVS project directory:
    $ cvs log 2> /dev/null | grep -E -o "author[^;]+;" | sort -u

    The result will be something like this:
    author: novak;
    author: svoboda;
    author: novotny;

    Then build the setting from that list:
    author_transforms={
        'novak' : ('Jan Novak', 'jan.novak@etnetera.cz'),
        'svoboda' : ('Jan Svoboda', 'jan.svoboda@etnetera.cz'),
        'novotny' : ('Jan Novotny', 'jan.novotny@etnetera.cz'),
    
        # This one will be used for commits for which CVS doesn't record
        # the original author, as explained above.
        'cvs2git' : 'cvs2git <admin@example.com>',
        }
  • Change run_options.set_project(r'test-data/main-cvsrepos', ⇒ run_options.set_project(r'/mnt/REMOTE/cvs.etn/srv-cvs/repository-cvs/PROJECT', (path to the CVS project in the mounted directory)
  • To convert LF line endings to CRLF (Linux ⇒ Windows), the following three changes are required:
    • DefaultEOLStyleSetter(None) ⇒ DefaultEOLStyleSetter('CRLF')
    • Add: from cvs2svn_lib.svn_run_options import SVNEOLFixPropertySetter (among the from imports)
    • Add: SVNEOLFixPropertySetter(), (at the end of ctx.file_property_setters.extend([ …)
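
The author list from the settings above can also be generated rather than typed by hand. The following is only a minimal sketch: the helper names extract_authors and format_author_transforms, the sample log text and the example.com domain are illustrative assumptions, not part of cvs2git. It pulls the "author: NAME;" fields out of cvs log output and prints an author_transforms skeleton whose full names and e-mail addresses still have to be edited manually.

```python
import re

# Matches the "author: NAME;" field that `cvs log` prints for every revision.
AUTHOR_RE = re.compile(r"author:\s*([^;]+);")


def extract_authors(cvs_log_text):
    """Return the sorted, de-duplicated CVS user names found in `cvs log` output."""
    return sorted({m.group(1).strip() for m in AUTHOR_RE.finditer(cvs_log_text)})


def format_author_transforms(authors, domain="example.com"):
    """Render an author_transforms skeleton; names and e-mails are placeholders."""
    lines = ["author_transforms={"]
    for user in authors:
        # The real full name is unknown here, so the login is reused as a
        # placeholder that must be replaced by hand (hypothetical domain too).
        lines.append("    %r : (%r, %r)," % (user, user, "%s@%s" % (user, domain)))
    lines.append("    },")
    return "\n".join(lines)


# Made-up sample imitating three revisions in `cvs log` output.
sample_log = (
    "date: 2012/01/05 10:11:12;  author: novak;  state: Exp;  lines: +1 -1\n"
    "date: 2012/02/06 09:00:00;  author: svoboda;  state: Exp;  lines: +3 -0\n"
    "date: 2012/03/07 08:30:00;  author: novak;  state: Exp;  lines: +2 -2\n"
)
print(format_author_transforms(extract_authors(sample_log)))
```

In practice the text would come from a saved copy of the cvs log output instead of the sample string.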

Launch

Run the cvs2git Python script with our options file:

$ python ./cvs2git --options=cvs2git-PROJECT.options

  • When it finishes, tmp-DIRECTORY should contain git-dump.dat and git-blob.dat
  • Then create a directory for the project, initialize a bare Git repository in it and import the conversion result:
    $ mkdir PROJECT.git
    $ cd PROJECT.git
    $ git init --bare
    $ git fast-import --export-marks=../tmp-DIRECTORY/git-marks.dat < ../tmp-DIRECTORY/git-blob.dat
    $ git fast-import --import-marks=../tmp-DIRECTORY/git-marks.dat < ../tmp-DIRECTORY/git-dump.dat

    This should prepare the Git repository. You can then continue with the remaining steps in the cvs2git documentation (at the very least, check the history and branches).
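
If several projects are converted, the two import passes can be scripted. This is a hedged sketch, not part of cvs2git: the function names fast_import_steps and run_import are hypothetical, and it assumes git is on the PATH and that the bare repository has already been created with git init --bare. The order matters, because the second pass resolves file contents through the marks file written by the first.

```python
import os
import subprocess


def fast_import_steps(tmp_dir):
    """Return (argv, input_file) pairs for the two import passes, in order."""
    marks = "%s/git-marks.dat" % tmp_dir
    return [
        # Pass 1: load the file contents (blobs) and record their marks.
        (["git", "fast-import", "--export-marks=" + marks],
         "%s/git-blob.dat" % tmp_dir),
        # Pass 2: load commits, branches and tags, looking blobs up via the marks.
        (["git", "fast-import", "--import-marks=" + marks],
         "%s/git-dump.dat" % tmp_dir),
    ]


def run_import(repo_dir, tmp_dir):
    """Run both passes inside an already initialised bare repository."""
    # Absolute paths keep the marks file valid even though git runs in repo_dir.
    tmp_dir = os.path.abspath(tmp_dir)
    for argv, input_file in fast_import_steps(tmp_dir):
        with open(input_file, "rb") as stream:
            # check=True raises CalledProcessError if a pass fails,
            # so the second pass never runs on a broken first pass.
            subprocess.run(argv, stdin=stream, cwd=repo_dir, check=True)
```

A call such as run_import("PROJECT.git", "tmp-DIRECTORY") would then correspond to the two fast-import commands above.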

Placing the repository on the Git server

  • Connect our work to the project on the server and push
    $ git remote add origin git@github.com:USER/PROJECT.git
    $ git push -u origin master
    
    # or push other branches
    
    # if you want to transfer the tags as well (and there are some after the conversion)
    $ git push origin --tags

Cleanup

It is advisable to carry out the cleanup after the conversion on a cloned local copy.
Rename the .cvsignore files to .gitignore and at least add a slash / to the beginning of each line (or unify and clean them up as needed).
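
The .cvsignore step can be automated for larger trees. The sketch below rests on a few assumptions: the function names are hypothetical, .cvsignore entries are treated as whitespace-separated patterns, and a lone "!" (which clears the ignore list in CVS) is simply skipped. It writes a .gitignore next to each .cvsignore and leaves the original file in place, so removing the old files remains a manual step.

```python
import os


def cvsignore_to_gitignore(text):
    """Translate the contents of one .cvsignore file into .gitignore lines."""
    lines = []
    for pattern in text.split():
        if pattern == "!":
            # A lone "!" resets the CVS ignore list; .gitignore has no direct
            # equivalent, so this sketch skips it.
            continue
        # The leading slash anchors the pattern to the directory that holds
        # the file, so it does not also match in subdirectories.
        lines.append("/" + pattern)
    return "\n".join(lines) + "\n" if lines else ""


def convert_tree(root):
    """Write a .gitignore next to every .cvsignore under root; return new paths."""
    written = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into repository metadata
        if ".cvsignore" in filenames:
            with open(os.path.join(dirpath, ".cvsignore")) as src:
                converted = cvsignore_to_gitignore(src.read())
            target = os.path.join(dirpath, ".gitignore")
            with open(target, "w") as dst:
                dst.write(converted)
            written.append(target)
    return written
```

Running convert_tree(".") in the cloned working copy and reviewing the result with git status before committing keeps the change easy to inspect.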

Sources

cvs2git documentation

Article has 2 comments

  • MarMax

    1
    Well, I'm curious about SVN -> GIT, I've had my fun with that for the last two months... :)
  • Lukáš Voborský

    2
    We plan to publish the article on SVN -> GIT in the next few days.