In my career in long-term preservation I have had the chance to look under the skin of several systems developed with the aim of fulfilling the OAIS reference model. Over a couple of years I became familiar with Rosetta, SDB, RODA and one home-grown system in Slovakia, and, more briefly, with some other single-customer or custom-built solutions. Since joining the LTP-Pilot project at Masaryk University in Brno I have started to look more closely at Archivematica, as this is the focus of the project (and of other projects in the Czech Republic as well: Archivematica is reportedly used in the National Archives of the Czech Republic, and the National Film Archives and the Library of the Academy of Sciences are looking at this system too).
The OAIS reference model (ISO 14721) describes functional entities and an information model. For some reason I tend to feel that, to understand an OAIS system, it is always very useful to start with the information model: that is, to understand how the information types described in OAIS are mapped to the AIP, how the SIP can be structured and how it is converted to an AIP, and what can be done with the AIP later on inside the repository. The AIP model expresses the long-term preservation philosophy of the system's creator, and it also constrains the possible functionality of the system.
Seeing yet another system with the same ambition (to be the “OAIS system”), I realized that there are quite a few differences between them at the level of the AIP data model. If, during 10 years of existence of the OAIS, we see approaches as different as the FOXML-based model of RODA, the BagIt-based model of Archivematica, or systems such as Rosetta, SDB and others that use a simple, single metadata container (METS or non-METS), not to mention other approaches like SPAR at the BnF, how can we expect any level of interoperability in 50 years' time? How do we expect to migrate AIPs from system to system? Does that mean that we will throw the old repository metadata into a provenance bag and take only some of the information to build new AIPs? Or should we have bigger ambitions and want to map as much of the audit and provenance information as possible into the new systems?
I have always been rather skeptical about the practical usability of abstract models like the Trustworthy digital object. But should we not really strive to model some “Common AIP Exchange Format”, or to have an AIP exchange method standard that would enable fast system-to-system exchange of AIPs while preserving as much of the information as possible? I don’t know whether the TIPR (http://wiki.fcla.edu/TIPR/1) project resulted in practical implementations, but its Repository Exchange Package could be one way to look at the problem. However, this project seems to have been territorially focused and to have ended by 2011 without any further steps towards enlarging the user community.
If we look at the LTP repository landscape in 50 years' time, we can expect each AIP to have been migrated twice between different systems. Shouldn’t we have a clear and common idea about what is to be preserved in these migrations? And shouldn’t this also be explicitly described in a widely accepted standard?