Backup Project Data Storage

Storage
Research Data Management
Author

Esther Plomp

Published

September 3, 2022

This post describes how to automatically and manually backup files from the Project Data Storage to a local drive.

Automatic backup is only possible if you are the only/single user of the Project Data Storage. Back up needs to happen manually if there are multiple users.

Automatic backup

  1. Map/mount the Project Data Storage as a local drive/folder (use sshfs (rather than sftp) to mount the drives)

  2. Use a folder synchronization tool that can automatically synchronize newer files.

    • For Unices cp --archive --update is a start;

    • rsync (available for Unix and Windows) allows more control but make sure to use the right arguments;

    • tar is not suited because it creates archives instead of synchronizing folders.

  3. Use a scheduler (Unices: cron, Windows: Task Scheduler) to automate the synchronization (step 1/2).

You can also do automated backup using autorestic, restic and rclone. Rclone does the data transfer to the drive, and restic deals with backup snapshots (see restic for more information).

Note

When the Project Data Storage is used from different computers, point 2 becomes very difficult because the “standard” tools only work one-way (they will only copy newer local files to the Project Data Storage but not synchronize newer files on the Project Data Storage to your local computer). Of course you can run the tools both ways, but that breaks horribly when you delete a file locally or revert to an older version, and the synchronization then happily restores the (newer) file from the Project Data Storage.

Note

This is not easy/perfect/suitable for everyone, it is not a one script fits all solution. You will probably want to tweak it for your own set up.

Manual backup

  • Unison: does two-way synchronisation and asks the user in case of conflicts (changes from multiple sides/users)

  • FileZilla and WinSCP can also do folder comparisons and show differences