From version 8.1 onwards, the IBM TSM (Tivoli Storage Manager) backup/archive product has been rebranded as IBM Spectrum Protect. The change is in name only: unless specifically indicated, there is no difference in the operation of the two brands. Version 7 of the (old) TSM client is still available for installation on older operating systems, so you will see references to both IBM Spectrum Protect and TSM throughout these pages.
The HFS archive service is subject to a formal application procedure requiring explicit permission over and above the eligibility criteria for simple backup. If you are engaged in a project which produces data of long term value to the University, you should refer to the Policy on Computer Archiving Services. A simple table of what constitutes suitable candidate data for archive is also listed for quick reference. Please also bear the following considerations in mind:
- Data retention
Data cannot be archived indefinitely. We ask project applicants to state a realistic lifetime for a project. Some projects have a natural conclusion whereas others may extend into the future due to regulatory requirements or simply because the data is of great value. Typically, projects are granted archival for 1-10 years. We require that a case for continued archival is made on every five-year anniversary; this is not too onerous, while ensuring that the archive receives appropriate occasional consideration.
- Backup accounts
Archived data must not be backed up to the HFS backup service. Where a client machine is sending data to both the backup service and the archive service, the client software must be configured in such a way that avoids sending the same data to both backup and archive. We can help with the setup of this configuration. In practice this normally requires a physical separation of the archive data from backup data on the client (for example into a separate partition or root folder-path).
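As a rough illustration of the separation described above, the following sketch checks that no archive location lies inside a backup location, or vice versa. It is a minimal example only: the function name `overlapping_paths` and the example paths are our own assumptions, and the code does not read or alter any client configuration.

```python
from pathlib import PurePosixPath


def overlapping_paths(backup_roots, archive_roots):
    """Return (backup, archive) pairs where one path equals or contains the other."""
    overlaps = []
    for backup in map(PurePosixPath, backup_roots):
        for archive in map(PurePosixPath, archive_roots):
            # A path overlaps another if they are equal or one is an ancestor of the other.
            if backup == archive or backup in archive.parents or archive in backup.parents:
                overlaps.append((str(backup), str(archive)))
    return overlaps


# Hypothetical layouts: the first mixes archive data into a backed-up tree,
# the second keeps it under a separate root folder-path.
print(overlapping_paths(["/home", "/etc"], ["/home/project/archive"]))
# [('/home', '/home/project/archive')]
print(overlapping_paths(["/home"], ["/archive/project"]))
# []
```

An empty result suggests the two sets of paths are physically separate; any pair returned indicates data that could be sent to both services unless the client is configured to exclude it.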
Long-term projects, and those requiring large amounts of data (> 5TB), will probably be asked to contribute to the on-costs of archive storage (see the HFS Service Level Description). Externally-funded projects should have a defined element for data storage and will be expected to contribute to these costs.
There are limits to the amount of data the service can store. In practice a range of between 50GB and 50TB is acceptable. Requests at either end of those scales will, however, be subject to greater scrutiny and possibly additional restrictions. At the lower end we might ask why local storage cannot be used. Towards the upper end, we might ask for additional assurances on data curation, provenance and access. Projects above 50TB may be considered but additional storage charges above 4TB may make a local storage solution more of an economic proposition.
Please use the archive request form to submit your application for your data to be archived on the HFS. You can also apply via the HFS Portal by selecting register new node, then choosing Project Archive account and answering some questions. The two methods are equivalent - you will be asked the same questions and your application will be processed in the same way.
The configuration and use of the HFS archive/retrieve client software differs subtly from the HFS Backup/Restore client software. For this reason you are strongly encouraged to read the following sections in order.
There are some general considerations on archive and retrieve that apply to all client users, irrespective of the client platform they are running. Please therefore read the next section before reading the platform specific pages on how to use the HFS archive/retrieve software.
Archiving is in some ways a completely separate concept to that of backup. Additionally, by its very nature, archiving suggests an inherent value in the data to be secured. For these reasons it is advisable to start using the HFS archive software by setting up a test area within the project and exploring the capabilities and features of the archive client in a test environment, before moving on to archive "live" project data.
Some areas to consider before archiving your project data are:
- Archive file identification
When using the Backup Client, the location of a file - i.e. its directory/folder path - identifies that file. This is because at most two versions of the same file can be held on the backup server. In contrast, the Archive Client allows unlimited versions of the same file to be kept, and the source files on the local machine may then be deleted. As such, the local directory/folder structure may provide little clue as to the archive contents and will certainly provide no information as to how versions of the same file differ.
Possible solutions to the above are to add a README or INDEX file in each directory folder, listing descriptions, dates and times of each file archived in that location. Alternatively and/or additionally, an entry in the Description field may be used. By default the software client populates this field with the text "Archive Date: Date" which clearly becomes useless if you archive the same file twice on one day. We recommend that more descriptive entries in this field (max length 255 characters) be used for each file archived to group and distinguish archived files.
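One possible way to produce consistent, distinguishable Description entries is sketched below. The naming scheme (project, label, timestamp to the second) and the function name `build_description` are illustrative assumptions rather than an HFS convention; only the 255-character limit is taken from the text above.

```python
from datetime import datetime

MAX_DESCRIPTION_LENGTH = 255  # limit stated for the Description field


def build_description(project, label, when):
    """Combine a project name, a free-text label and a timestamp into one entry."""
    text = f"{project} | {label} | {when:%Y-%m-%d %H:%M:%S}"
    if len(text) > MAX_DESCRIPTION_LENGTH:
        raise ValueError(
            f"description is {len(text)} characters; the limit is {MAX_DESCRIPTION_LENGTH}"
        )
    return text


# Hypothetical project name and label, for illustration only.
desc = build_description("survey2019", "raw scans, batch 3", datetime(2019, 5, 14, 9, 30, 0))
print(desc)
# survey2019 | raw scans, batch 3 | 2019-05-14 09:30:00
```

Including the time as well as the date means that two archives of the same file on one day remain distinguishable. The resulting text can be entered in the Description field when archiving; please check the client documentation for your version for how to supply it.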
- Multiple user accounts
It is also important to consider the account (or username) under which you are going to do the archiving. On a Linux/Unix system, only the user who archived a file can see it within the archive, and only that same user can retrieve it (except that, as usual, root can see all and do all). If you are repeatedly archiving related material, it will probably be confusing for you (or anyone else) at retrieval time if the archiving has been performed under different usernames.
- Symbolic file links (Linux/Unix)
Under Unix, when you name a symbolic link in an archive operation, the object pointed to by the link is archived, not the link data. This behaviour differs from that of backup, where the link data is backed up.
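Because a named link results in its target being archived, it can be useful to know which symbolic links exist in a folder tree before archiving it. The sketch below simply lists them using the Python standard library; the function name `find_symlinks` and the example path are assumptions for illustration, and the code does not interact with the archive client.

```python
import os


def find_symlinks(root):
    """Yield (link_path, target) for every symbolic link found under root."""
    # followlinks=False: linked directories are reported but not descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                yield path, os.readlink(path)


# Hypothetical project folder; nothing is printed if it does not exist.
for link, target in find_symlinks("/data/project-archive"):
    print(f"{link} -> {target}")
```

Reviewing this list beforehand makes it easier to decide whether each link's target is something you intend to archive.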
- Local file deletion
The archive client offers an option to delete the local files immediately on successful archival to the server. This option needs explicitly stating on the command line or setting in the Archive Options in the GUI, and probably should not be used. The archive data is secured on the Archive Server by making three copies to tape. This process occurs early each morning, between 00:00 and 01:00. We therefore recommend that where Archive clients need to delete archived material from their local machines - for example for reasons of space - they should wait until the day following the archival process for any particular file, and then delete it through local operating system commands.
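The waiting period recommended above can be expressed as a simple date calculation. The sketch below takes a conservative reading, treating 01:00 on the calendar day after the archive operation as the earliest point for local deletion; that interpretation and the function names are our assumptions, not a guarantee that the tape copies have completed.

```python
from datetime import datetime, time, timedelta

TAPE_COPY_WINDOW_END = time(1, 0)  # tape copies run 00:00-01:00 per the text above


def earliest_safe_delete(archived_at):
    """Return 01:00 on the calendar day after the archive operation."""
    next_day = archived_at.date() + timedelta(days=1)
    return datetime.combine(next_day, TAPE_COPY_WINDOW_END)


def safe_to_delete(archived_at, now):
    """True once the assumed waiting period for this file has passed."""
    return now >= earliest_safe_delete(archived_at)


archived = datetime(2020, 3, 10, 14, 0)  # example archive time
print(earliest_safe_delete(archived))
# 2020-03-11 01:00:00
print(safe_to_delete(archived, datetime(2020, 3, 10, 23, 59)))
# False
print(safe_to_delete(archived, datetime(2020, 3, 11, 9, 0)))
# True
```

A check of this kind could precede any local clean-up script, with the deletion itself still carried out through ordinary operating system commands.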
- Archive file deletion
Unlike the backup client, the archive client allows the deletion of files archived on the server. Obviously care should be taken in use of this, as once deleted from the server, a file cannot later be retrieved.
3. Software installation & configuration
The HFS Archive client uses the same software as the HFS backup client, namely IBM TSM, rebranded as IBM Spectrum Protect starting with version 8 of the software. Consequently, archive clients that have already installed the software and are using it for backup need only follow the additional configuration steps available from the links below in order to use the HFS archive service.
If your machine does not already have the HFS backup client software installed, you should first click the appropriate link below to download and install it:
|Windows||Download & Install|
|Mac OS X||Download & Install|
|Linux/Solaris||Software download||Additional Configuration|
For simplicity, usage of the Archive client software is grouped according to interface type and function. Click on the appropriate link below for a tutorial on how to Archive and Retrieve files: