
Using Subversion to do automatic backups

Subversion is version control software: it was designed for groups working on the same project and modifying a common set of files. The data (the set of files, including the directory tree) is stored in a central "repository", and each user checks out a "working copy" to edit. When users want to share their changes with others, they commit their work. Subversion then uploads the changes to the repository, trying to resolve any conflict with changes made by others. In doing so, it automatically keeps the state of the repository from before the commit, in a way that allows the changes to be undone.

By using a small script to commit my work every night, I can restore any file I work on to its state at midnight on any given date.

There is plenty of version control software. The reason I favor Subversion is that it has a great Windows GUI that is integrated with Explorer. If you know that you are not going to use Windows, you might want to look at Bazaar, which is easier to use when working in small teams.

Apart from this document, I have found a very well written tutorial for beginners, in French, on Laurent Pointal's page. I suggest you read it if you speak French and want a deeper understanding.

1   Installing subversion

1.1   Under Windows

Under Windows there is a very nice integration of Subversion into Windows Explorer: TortoiseSVN. I suggest you install it to get a friendly interface.

Integration of SVN in the explorer

1.2   Under Ubuntu Linux

Install the package subversion with synaptic or apt-get:

sudo apt-get install subversion

You can also install a graphical frontend, for instance the package svn-workbench [1].

2   Using subversion

Subversion works like most version control software: when you want to put a bunch of files under version control, you first have to create a repository, where the files and their history will be stored. The repository is a black box: the files are stored in it, but you never modify them there directly.

The files you work on are called the working copy. You can have several working copies: if you want to make temporary modifications that you know you will never keep, you can make a new working copy and delete it afterward. The working copy knows by itself where its repository is, and when you are happy with your changes you save them to the repository.

2.1   Creating a repository

There is a good introduction to version control in the Software Carpentry lectures. I will only cover the basics very quickly here.

First you have to create a repository for your data: a location on your computer or on a remote computer where the successive versions of your files will be kept.

  • Under Windows the easiest way to do this (after having installed TortoiseSVN) is by right-clicking on an empty folder and choosing "TortoiseSVN -> Create repository here".

  • Under Linux just open a terminal in the folder where you want to create a repository and use the command:

    svnadmin create .
    

You can now use the import command to import your data into the repository. Once you have done this, you need to check out a working copy of the data into an empty folder. This will be the folder where you work. If everything went fine and your data was properly checked out, you should have a complete copy of your files in the working folder, in addition to the data in the repository. You can therefore erase the original data that you imported into the repository.
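
The import and checkout steps can be sketched in Python as follows. This is only an illustration under assumptions: the function names and the "Initial import" message are made up for the example, the repository is assumed to be a local folder (hence the file:// URL), and the svn command line client is assumed to be installed.

```python
import subprocess
from pathlib import Path


def repo_url(repo_dir):
    """Return the file:// URL of a local repository folder."""
    return Path(repo_dir).resolve().as_uri()


def import_and_checkout_commands(data_dir, repo_dir, work_dir):
    """Build the two svn commands: import the data, then check out a working copy."""
    url = repo_url(repo_dir)
    return [
        # Copy the existing files into the repository.
        ["svn", "import", str(data_dir), url, "-m", "Initial import"],
        # Create the working copy in a new (or empty) folder.
        ["svn", "checkout", url, str(work_dir)],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(import_and_checkout_commands("mydata", "myrepo", "work"))` would then perform both steps; the three folder names here are placeholders.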

2.2   Using subversion to track modifications to files

When you have modified a file, you can save the changes to the repository using the commit command. If you want to undo your changes instead, you can use the revert command.

The log command allows you to track which files were modified, and when. Subversion gives the repository a revision number, which is incremented each time a commit is made.

History of a file
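
Besides revision numbers, Subversion also accepts a date in curly braces wherever a revision is expected, which is what makes it possible to get a file back as it was on a given day. Here is a small sketch, assuming the svn command line client is installed; the function names are made up for the example.

```python
import datetime
import subprocess


def cat_at_date_command(path, day):
    """Build the svn command that prints `path` as it was at the start of `day`."""
    # A date alone, e.g. {2012-03-22}, is read as 00:00 on that day.
    revision = "{" + day.isoformat() + "}"
    return ["svn", "cat", "-r", revision, str(path)]


def file_at_date(path, day):
    """Return the content (as bytes) of a versioned file as of 00:00 on `day`."""
    result = subprocess.run(
        cat_at_date_command(path, day), check=True, capture_output=True
    )
    return result.stdout
```

Because a bare date means the very start of that day, asking for a given date returns the last state committed before that day began.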

2.3   Adding and deleting files to the repository

Committing only tracks changes to files that are already in the repository. New files have to be added manually with the add command, and deleted files have to be removed with the delete command. For our purpose of doing a backup automatically overnight, we need a script that finds the added and deleted files by itself.
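
As a rough sketch of how a script can find these files (an illustration, not the actual autocommit script): the output of svn status marks files that are not under version control with "?" and files that have disappeared from the working copy with "!". The function below, whose name is made up for the example, sorts the lines of that output accordingly. It makes the simplifying assumption that the other status columns are blank on those lines.

```python
def find_added_and_deleted(status_output):
    """Split the text printed by `svn status` into unversioned and missing paths."""
    unversioned, missing = [], []
    for line in status_output.splitlines():
        if not line:
            continue
        # The first character is the status flag; the path follows the flag columns.
        flag, path = line[0], line[1:].strip()
        if flag == "?":
            unversioned.append(path)  # candidate for `svn add`
        elif flag == "!":
            missing.append(path)  # candidate for `svn delete`
    return unversioned, missing
```

Modified files (flag "M") are ignored here, since the commit command already takes care of them.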

3   Sharing your work with svn

When you are working with colleagues on the same set of files, they can check out a working copy from your repository. Actually, the best solution would be to have a common repository accessible by everybody on the project; I give a few hints on how to do this in the last section of this article. When your colleagues are done with their work on their working copy, they can give you their files (the whole set), and you can commit them to your repository. Their changes are thus integrated into your repository, and you can use svn's powerful history functions to see what changes they made, and to review or undo them (see for instance the "diff" feature in the TortoiseSVN GUI). You no longer need to do this by hand.

4   The autocommit script

I wrote a script that scans an existing subversion working copy and commits the changes, including the added or deleted files.

I do not advise you to use this script too much, as it will back up any temporary file and does not let you create a clean backup of your work. The standard procedure requires you to manually add, delete or rename the files in the repository. Doing this requires a bit more work, but it is good practice as it forces you to back up only the important files. Modern software tends to create a huge number of files that are not terribly important (backups, history, preferences), and if you use this script everything will be saved to the repository, which will inflate its size.

There are a couple of things one should be aware of when using this script. If you move a folder without using the Subversion move command (in the TortoiseSVN menu under Windows), then SVN gets lost, and the autocommit script does not work. To solve the problem, either move the folder back where it was and use the SVN move command, or remove all the .svn folders in the folder that you moved and in its subfolders (the first option is better).
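
If you do go for the second option, removing the .svn folders by hand is tedious. The following sketch, with function names made up for the example, lists them first so that you can check what would be deleted before actually removing anything.

```python
import os
import shutil


def find_svn_dirs(root):
    """Return every folder named .svn under `root`, including the one in `root` itself."""
    found = []
    for current, dirnames, _filenames in os.walk(root):
        if ".svn" in dirnames:
            found.append(os.path.join(current, ".svn"))
            # Do not descend into the administrative folder itself.
            dirnames.remove(".svn")
    return sorted(found)


def remove_svn_dirs(root):
    """Delete the .svn folders found under `root` and return their paths."""
    removed = find_svn_dirs(root)
    for path in removed:
        shutil.rmtree(path)
    return removed
```

After this, the moved folder is no longer a working copy: Subversion will see it as new, unversioned content.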

I wrote the script in Python as it is a powerful language that runs on most systems.
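
To give an idea of what such a script involves, here is a minimal sketch of the overall flow. It is not the script distributed on this page, only an illustration under assumptions: the function name and the `run` parameter are made up for the example, the svn command line client must be installed, and the status parsing is simplified (it assumes the extra status columns are blank).

```python
import subprocess


def autocommit(working_copy, run=subprocess.run):
    """Add new files, delete missing ones, then commit the working copy."""
    status = run(
        ["svn", "status", working_copy],
        check=True, capture_output=True, text=True,
    ).stdout
    for line in status.splitlines():
        flag, path = line[:1], line[1:].strip()
        if flag == "?":
            # Not under version control yet: schedule it for addition.
            run(["svn", "add", path], check=True)
        elif flag == "!":
            # Removed from the working copy: schedule it for deletion.
            run(["svn", "delete", path], check=True)
    run(["svn", "commit", "-m", "Automatic commit", working_copy], check=True)
```

It would be called as `autocommit("/full/path/to/working/copy")`. The `run` parameter only exists so that the function can be tried out with a fake runner, without touching a real repository.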

4.1   Using the autocommit script under windows

To do automatic backups you need to be able to commit with a script. TortoiseSVN is not controllable by script, so you also need to install the command line client of SVN (Subversion). I had to fight a bit to find a setup for SVN under Windows, but at the time of writing you can find one here.

Now that you have both the frontend (TortoiseSVN) and the backend (SVN), you can interact with Subversion both through the Explorer interface and through scripts.

The script has been compiled to an executable file using py2exe.

I suggest that you put it in c:\program files. You can associate folders (Explorer, menu "tools->folder options") with it: it needs to be called with the path of the working copy to be committed as an argument. The script needs plain SVN to be installed. It will not work with TortoiseSVN alone.

To make a backup overnight, all you need to do now is add a "scheduled task" (in the Control Panel) that calls the script with the path to the working copy you want to commit as an argument (you need to edit the advanced properties of the task to be able to give the script an argument).

Note: Ulrich Dinger found a bug in the script, which has been corrected. Unfortunately, I do not have a Windows box lying around, so I cannot regenerate the executable file.

Note

Gabe Nodland has informed me that under Windows, the following batch file would also work:

for /f "tokens=2*" %%i in ('svn status %1 ^| find "?"') do svn add "%%i"
for /f "tokens=2*" %%i in ('svn status %1 ^| find "!"') do svn delete "%%i"
svn commit -m "Automatic commit" %1

Simply save the 3 lines above in a file called 'autocommit.bat'. If you run it from the working directory, you don't need to specify a parameter. If you are in another directory, you can call it like this: autocommit.bat c:\projects\calc

4.2   Using the autocommit script under linux

Under Linux you can use the Python code directly, but I also have a shell script that does the same thing:

Just copy one of them to a folder in your path (/usr/local/bin for instance) and make it executable.

You can then add it to the list of tasks to be run by the cron daemon:

edit the list with:

crontab -e

and add the line:

27 3 * * * /usr/local/bin/svnautocommit /full/path/to/working/copy

for instance, to run the backup at 3:27 every night.

Thanks

Thanks to Jason Judge for finding a nasty bug in this script (Feb 2nd 2010).

5   Using a repository on a different computer

Keeping backups is nice, but it is even nicer if the backups are on a different computer: this minimises the risk of data loss. If you have a computer running Linux, it is very easy to put your repository on it and access it from other computers.

5.1   Setting up the repository to be accessible from other computers

All you need is an SSH server. On Ubuntu Linux, you need to install the openssh-server package.

5.2   Accessing the repository from a distant Linux computer

On most Linux computers an ssh client is installed by default, so there is nothing to set up. You can check out the repository installed on computer "servername" by giving Subversion the URL svn+ssh://servername/full/path/to/repository.

Once the initial checkout is done you can proceed as with a local repository.
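
The way the URL is put together can be sketched as follows. The function names are made up for the example, and the optional username corresponds to a user account on the server that has access to the repository.

```python
def svn_ssh_url(server, repo_path, username=None):
    """Build an svn+ssh:// URL for a repository stored at `repo_path` on `server`."""
    host = f"{username}@{server}" if username else server
    # The repository path is the full path on the server, so it follows the host directly.
    return f"svn+ssh://{host}/{repo_path.lstrip('/')}"


def checkout_command(server, repo_path, work_dir, username=None):
    """Build the svn command that checks out the remote repository into `work_dir`."""
    return ["svn", "checkout", svn_ssh_url(server, repo_path, username), work_dir]
```

For instance, `svn_ssh_url("servername", "/full/path/to/repository")` gives back the URL quoted above.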

To avoid entering your password each time you do an operation on the repository, you can use keys. Here are two HOWTOs on using keys:

5.3   Accessing the repository from a distant Windows computer

On Windows you can use the ssh program used by TortoiseSVN (which you have already installed). Copy the file "TortoisePlink.exe", located in the "bin" subdirectory of TortoiseSVN's install path, to the "bin" subdirectory of Subversion's install path, and rename it to ssh.exe.

You can check out the repository installed on computer "servername" by giving Subversion the URL svn+ssh://username@servername/full/path/to/repository, where username is the name of a user on the Linux server who has access to the repository.

To avoid entering your password each time you do an operation on the repository, you can use Pageant:


[1] pysvn Workbench
Page last modified Thu Mar 22 09:43:37 2012. Created by Gaël Varoquaux with rest2web.