The HFS is a valuable University resource with a large but ultimately finite capacity. Certain guidelines are therefore required to ensure the fair use of its resources so that the facility can be used as widely as possible.
We constantly review our operational limits in the light of available resources and capabilities. Currently we recommend that a single backup account store no more than 10TB of data. We are aware that some systems manage much greater amounts, but unfortunately our resources do not extend to covering these. Neither do we support 'partitioning' such systems into multiple backup accounts to circumvent the 10TB limit, so such systems must implement their own backup solution.
Independent of the amount of data stored, where a large number of files are held locally, we strongly recommend, and may insist, that either these are bundled into a local archive file before being sent to the HFS, or the local disk is repartitioned into smaller filesystem/volume/drive partitions. There is no fixed limit here, as the requirement depends partly on the client machine's own resources and its ability to sort large numbers of files. Typically, the figure is around several million file objects in any one partition.
Additionally we insist that our clients:
- Connect to our service and upload data at a reasonable speed; see Connection speed to the HFS.
- Try to avoid backing up some types of information; see What to exclude from backups.
- Try to ensure that only University work related data is included in the backup; see What to exclude from backups.
- Try to avoid backing up data from duplicate locations; see Moving data around.
Connection speed to the HFS
Long, slow backups can cause considerable problems with the service: they lock server resources that need to be spread across many clients. Consequently, we require clients to upload data at a rate that allows their backup to complete in a reasonable amount of time.
A client session will be cancelled once it has run for 10 hours. Note that this is an extreme measure enacted in order to manage server resources - it should not be seen as a boundary limit below which all backups will be tolerated. This is particularly important for 'server' level clients. The guiding principle is: if you have a lot of data to back up, then you must have the machine resources and network bandwidth to do it.
Please ensure that, as well as sufficient network bandwidth, a server client has the CPU and memory resources to sort and process the number of files it has on it. As a guide the IBM backup client requires 300 bytes of memory per file, so a partition of 3 million files will require just short of 1GB of memory.
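The memory guide above can be worked through as a short calculation. The following is only an illustrative sketch: the function names are hypothetical, the 300 bytes per file is the guide figure quoted above rather than an exact requirement, and gigabytes are assumed to be decimal (1GB = 1,000,000,000 bytes).

```python
# Guide figure from the text above: roughly 300 bytes of client memory per file.
# This is an approximation, not an exact requirement of the backup client.
BYTES_PER_FILE = 300


def estimated_memory_bytes(file_count: int, bytes_per_file: int = BYTES_PER_FILE) -> int:
    """Return the approximate memory needed to process file_count files."""
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    return file_count * bytes_per_file


def to_decimal_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (assumption: 1GB = 10**9 bytes)."""
    return n_bytes / 1_000_000_000


if __name__ == "__main__":
    needed = estimated_memory_bytes(3_000_000)
    # prints "900000000 bytes (about 0.9 GB)"
    print(f"{needed} bytes (about {to_decimal_gb(needed):.1f} GB)")
```

Running the same calculation with the file count of your own largest partition gives a rough lower bound on the memory the client machine should have free during a backup.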
We would expect most server clients to complete a backup of 300GB of data within 5 hours - equating to an upload speed of 16.67MB/sec. Should a client consistently fail to complete its backups, with no transient error (for example a network misconfiguration) as the cause, then we reserve the right to exclude this client from the service.
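The upload figure above follows from simple arithmetic, sketched here so that it can be repeated for other data sizes. The function names are hypothetical, and decimal units are assumed (1GB = 1000MB), which is the convention that reproduces the 16.67MB/sec figure.

```python
# Sessions are cancelled at 10 hours (see above); this is a hard stop,
# not a target, so the expected completion time is well below it.
SESSION_LIMIT_HOURS = 10


def required_rate_mb_per_sec(data_gb: float, hours: float) -> float:
    """Sustained upload rate (MB/sec) needed to send data_gb within hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (data_gb * 1000) / (hours * 3600)


def upload_hours(data_gb: float, rate_mb_per_sec: float) -> float:
    """Hours needed to send data_gb at a sustained rate_mb_per_sec."""
    if rate_mb_per_sec <= 0:
        raise ValueError("rate_mb_per_sec must be positive")
    return (data_gb * 1000) / rate_mb_per_sec / 3600


if __name__ == "__main__":
    # 300GB in 5 hours, the expectation stated above.
    print(f"{required_rate_mb_per_sec(300, 5):.2f} MB/sec")  # prints "16.67 MB/sec"
    # The same 300GB at a slower 10MB/sec would take over 8 hours.
    print(f"{upload_hours(300, 10):.2f} hours")  # prints "8.33 hours"
```

A result from `upload_hours` that approaches `SESSION_LIMIT_HOURS` suggests the client needs more bandwidth or less data per backup, since a session reaching that limit is cancelled.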
Current limits are documented in our FAQ items What limits are there on use of the HFS? and How much data can I back up?. It should be noted that there is currently no maximum connection speed limit.
What to exclude from backups
The HFS backup service is intended for active data that is in use by current members of the University. Certain file types which should not be backed up are excluded from backup by the HFS set of default exclusions. However, these exclusions cannot catch every case. We therefore ask that users ensure that the following data are excluded from HFS Backup on all client machines they own or manage.
- Data that is not related to your work with the University.
- Data on drives shared from other machines (excluded by default on Windows machines).
- Block-level (or other format) files that constitute an 'image' of a machine or part of a machine (for example Ghost images).
- Backups of the local or other machines (for example Time Machine, native Windows backups).
- Virtual machine images which are not already excluded by default (the default exclusions being *.vmdk(.*) and *.vmem).
- Copyrighted data for which the client is neither the copyright holder nor a licensee.
- Other bootable images on a multi-platform-boot system (for example the /WINDOWS partition on a Linux system).
- Continuously date- and/or time-stamped files (for further information on this restriction, see our FAQ item Why am I not allowed to back up files with time- or date-stamps in their names?).
- Duplicate data: it is a waste of resources to back up the same data more than once, either from one or from multiple machines.
- Outlook archive files, called archive.pst: these are date-stamped by Outlook every time it is run, and therefore get needlessly resent by the backup software on every backup. (As an alternative, backup of these files is permitted if you detach them from Outlook via File > Data File Management. Please contact email@example.com if you wish to do this.)
- Large database files: you should not use the standard Backup-Archive Client to back up either large running databases or flat file 'dumps' of such databases.
Examples of how to exclude files and folders from HFS backups can be found in our page on how to exclude files and folders from backup. Further help can be obtained by contacting firstname.lastname@example.org.
Moving data around
The IBM backup software recognises a file moved to a new location on a system as a new file and thus a candidate for backup. A renamed mountpoint, drive, volume or partition will also, unless specifically excluded, occasion a fresh backup of all data on it. Windows drives derive their name from the Windows computer name, so a change to this (which might itself be due to a change in the machine's TCP/IP name) can act as a partition rename. In these cases the data is in effect copied again and hence is held in duplicate within the HFS Backup system. It is thus imperative that users avoid unnecessary moves and renames.
In cases where you do anticipate essential large scale moves or renames of your local datastore, please contact email@example.com beforehand so that we may manage this.
Additionally, we may insist on local measures being implemented in order that the backup or archival of that data does not consume a large amount of (HFS) system resources. These local measures may include, but are not limited to: excluding files, repartitioning into smaller filesystem/volume/drive partitions, or preprocessing files before backup.