Friday 5 February 2010

Archival file systems

Most file systems are fairly straightforward - objects are written to them, read from them, rewritten, and deleted. In other words, files are created, modified and deleted, and as more files are created than are ever deleted, the number of files grows and eventually the filesystem fills up.

In such a context backup is equally straightforward - you periodically copy the files to another, lower-cost medium, typically tape, and that way you build up a set of copies allowing you to roll back files to a given point in time and to guard against corruption by having multiple copies. There is an implication here that the files most likely to become corrupt are those that stay on disk longest.

Such filesystems are, however, very much the middle case; there are extreme examples at both ends. One, typically exemplified by student filestores, is the case where the system is never quiet and has a high rate of churn, with lots of files being created, changed and destroyed. The other is the archival filestore, as found in a digital repository, where ideally one never deletes any object from the filestore.

In both of these scenarios what one needs to do is write multiple copies of the objects, record their checksums and the date each file was last touched (changed), and then periodically rescan the filestore for anomalous changes, i.e. where one copy differs from all the others, replacing any such copy with a known good one. It's also true to say that, as these filestores are not required to handle a lot of fast transactions, the write and read performance can be variable within limits.
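By way of illustration, here is a minimal scrub-and-repair pass sketched in Python. It assumes each object sits at the same relative path under three replica directories and that a manifest records the SHA-256 checksum taken when the object was first written; the directory names and the manifest are made up for the example, not taken from any particular system.

    # Minimal scrub-and-repair sketch. Assumes each object is stored at the same
    # relative path under several replica directories (names are illustrative).
    import hashlib
    import shutil
    from pathlib import Path

    REPLICA_ROOTS = [Path("/data/copy1"), Path("/data/copy2"), Path("/data/copy3")]

    def sha256_of(path):
        """Checksum a file in chunks so large objects don't exhaust memory."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def scrub(manifest):
        """manifest: dict mapping relative path -> checksum recorded at write time."""
        for rel_path, expected in manifest.items():
            good, bad = [], []
            for root in REPLICA_ROOTS:
                try:
                    digest = sha256_of(root / rel_path)
                except OSError:
                    digest = None      # a missing or unreadable copy counts as bad
                (good if digest == expected else bad).append(root)
            if not good:
                print(f"ALERT: no good copy of {rel_path} left - needs manual attention")
                continue
            for root in bad:
                # Overwrite the anomalous copy from a known good one.
                print(f"repairing {root / rel_path} from {good[0] / rel_path}")
                (root / rel_path).parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(good[0] / rel_path, root / rel_path)

Run from cron overnight, something of this shape is all the "self healing" really amounts to: compare, complain if every copy is bad, otherwise quietly repair.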

Precisely because we are not envisaging their use for high-demand transaction processing systems, where fast, predictable performance is required, we can happily use commodity technologies for the disk systems and Ethernet-based solutions for the network interfaces, making such solutions potentially cheap to deploy.

Essentially, in both cases we are looking at a clustered, self-healing, network-attached filestore built out of commodity technology. Add some geographical separation and we have resilience and the ability to cope with unexpected events such as power loss.

The nice thing is that the system is self-maintaining once deployed, and also self-contained as regards the smarts involved. For our applications it only needs to present a well-known interface, such as an ext3 filesystem.

If we wish to run this as a high-demand filesystem and support heterogeneous clients, we simply need to front the filestore with simple boxes that re-export the appropriate parts of it.

In the case of archival filestores all access should be mediated by the repository application, but of course there is nothing to stop you reusing an object across multiple applications, so the PDF of a research paper can appear in an open access repository, a learning objects repository, or indeed on a staff directory page that generates a dynamic publications list.

However, the key takeaway is that the system is fault-tolerant. And if it's fault-tolerant the individual components don't have to be super reliable - they just need to be reliable enough to ensure that you have enough good copies of the files to cope with component failure or local filesystem corruption. In essence it becomes a local storage cloud...
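To put a rough figure on "reliable enough": if each copy of an object has an independent chance of being lost or corrupted between scrub passes (independence being exactly what the geographical separation helps buy you), the chance of losing every copy at once falls off very quickly with the number of copies. A back-of-the-envelope sketch, with an assumed 1% per-copy failure rate purely for illustration:

    # Back-of-the-envelope: chance that every copy of an object is lost or
    # corrupted between scrub passes, assuming copies fail independently
    # with probability p (the 1% figure below is purely illustrative).
    def chance_all_copies_lost(p, copies):
        return p ** copies

    for copies in (1, 2, 3):
        print(f"{copies} copies: {chance_all_copies_lost(0.01, copies):.0e}")
    # 1 copy: 1e-02, 2 copies: 1e-04, 3 copies: 1e-06 - three moderately
    # reliable copies are already one-in-a-million territory.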


