Client Success Story: An Indian Odyssey
Largest Data Preservation & Migration Project Ever Undertaken in India Heralds Several World Firsts in Data Management Industry
Early in 2011, SpectrumData and its partner Samit Spectrum were awarded a contract to consolidate, catalogue and remaster the aging and at-risk seismic data holdings of one of the largest oil and gas exploration organisations in the Southern Asian region.
The project involved migrating over 110,000 cartridges of varying types and ages to modern IBM 3592 media, in duplicate. It was not only massive in scale but also highly complex, and it involved several world firsts in the data management industry. SpectrumData’s team of data recovery and migration specialists spent nearly 18 months on the job in Mumbai, India, completing one of the largest data migration projects ever undertaken in the subcontinent.
Chris Holloway, SpectrumData’s CEO, commented that “The mobilisation for this project was a significant operation, as was the continued need to import the unusual raw material for the project to keep it running on time. However, having said that, the project has run very smoothly in a reasonably complex environment thanks to a great team effort”.
“In the Data Management industry, you don’t often get the opportunity to complete a world first, so this project was a great opportunity to do that and a lot more. As with any world first, there is some level of uncertainty and risk, which can be mitigated by careful planning and testing, which is exactly what was done.”
The Use of IBM 3592 WORM Media
Until 2011, the use of WORM media for data storage was very limited in any industry. In fact, this specialised media is manufactured only on a make-to-order basis. This project alone used more than 10 times the amount of this media that had ever been consumed worldwide.
WORM stands for “Write Once, Read Many”, and is rarely used on projects of any kind. It poses some interesting risks, but delivers unparalleled protection for the data. The key risk is that if data is written to one of these tapes in error for any reason, the tape cannot be used again, which makes mistakes potentially expensive. In a project where data from, on occasion, more than 200 originals was consolidated onto one new tape in a particular order, the opportunity to write data in error was high. This was where the second world first was developed.
The Use of Integrated Web-Based Quality Control
To ensure that the project was performed correctly, SpectrumData created the first integrated web-based data migration and QC system, which fully involved the client in the project and in the decision-making process.
The QC system provided multi-stage processing, so that junior staff members could perform low-level QC and managers of the business could then sign off on the final output. No data could be output until every individual data set had been accepted by the client. If the data was not acceptable, the client could re-order the data sets, suggest changes, or reject the data sets based on other criteria. Over the 18 months of the project, the system kept the client active in the process, and this in turn produced a fantastic outcome.
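The two-stage sign-off described above can be pictured as a simple gate. The following is only an illustrative sketch, not SpectrumData's actual system: the class names, status values and method names are all hypothetical, chosen to mirror the rules stated in the text (junior QC first, manager sign-off second, no output until everything is accepted).

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    JUNIOR_PASSED = "junior_passed"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class DataSet:
    name: str
    status: Status = Status.PENDING


@dataclass
class OutputBatch:
    """A proposed output tape: an ordered list of data sets awaiting QC (hypothetical model)."""
    datasets: list = field(default_factory=list)

    def _find(self, name):
        for ds in self.datasets:
            if ds.name == name:
                return ds
        raise KeyError(name)

    def junior_pass(self, name):
        # Stage 1: low-level QC by a junior staff member.
        self._find(name).status = Status.JUNIOR_PASSED

    def manager_accept(self, name):
        # Stage 2: sign-off is only allowed once stage 1 has passed.
        ds = self._find(name)
        if ds.status is not Status.JUNIOR_PASSED:
            raise ValueError(f"{name} has not passed junior QC")
        ds.status = Status.ACCEPTED

    def reject(self, name):
        self._find(name).status = Status.REJECTED

    def reorder(self, names):
        # Client-requested re-ordering; every data set must be listed exactly once.
        if sorted(names) != sorted(ds.name for ds in self.datasets):
            raise ValueError("reorder must list every data set exactly once")
        by_name = {ds.name: ds for ds in self.datasets}
        self.datasets = [by_name[n] for n in names]

    def ready_for_output(self):
        # Nothing is written until every individual data set is accepted.
        return bool(self.datasets) and all(
            ds.status is Status.ACCEPTED for ds in self.datasets
        )
```

With write-once media, a gate of this kind matters because a batch that reaches tape cannot be corrected afterwards; a single rejected or pending data set keeps `ready_for_output()` false.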
Establishment of a Cutting Edge Migration Data Centre
To run the project, SpectrumData and the client collaborated on the design and installation of a state-of-the-art data centre in Mumbai. The installation involved more than 70 tape drives, a massive storage array, security, air conditioning, backup power, and a host of other hardware and software. All of this was shipped into Mumbai from three international locations and constructed, ready to start running media, in less than 30 days.
The basic elements of the project involved:
- Advanced cataloguing of every input tape and cartridge
- Digital photography of every tape to preserve its label details
- Sorting and barcoding of every tape into a logical order ready for migration
- Transfer of batches of data to disk array, where consolidation and comparisons were performed
- Updating of metadata catalogue based on data read vs data expected
- Formulation of an output strategy based on the consolidation of millions of files in the correct order, on a project-by-project basis
- Quality control of the proposed output before it was written
- Final output on dual 3592 WORM media
- Labelling and creation of an online ordering and tape management system to track all inputs that went to an output, and the relative file positions of every file transferred
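The last step in the list, tracking which inputs went to which output and where each file landed, amounts to keeping a manifest. The sketch below is an assumption about how such a record could be shaped; the field names and barcode values are hypothetical and are not taken from the project's actual tape management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    output_barcode: str   # hypothetical label of the new output tape
    position: int         # 1-based file position on the output tape
    input_barcode: str    # original cartridge the file was read from
    input_position: int   # file position on that original cartridge


def build_manifest(output_barcode, ordered_files):
    """Build entries from (input_barcode, input_position) pairs given in final output order."""
    return [
        ManifestEntry(output_barcode, pos, src, src_pos)
        for pos, (src, src_pos) in enumerate(ordered_files, start=1)
    ]


def inputs_for_output(manifest, output_barcode):
    """List the original cartridges that contributed to one output tape."""
    return sorted(
        {e.input_barcode for e in manifest if e.output_barcode == output_barcode}
    )


# Illustrative values only: three files from two old cartridges onto one new tape.
manifest = build_manifest("OUT0001", [("IN0042", 1), ("IN0042", 2), ("IN0007", 1)])
print(inputs_for_output(manifest, "OUT0001"))
```

Keeping both the output position and the original position in each entry means a request for a file can be answered from either direction: from an old cartridge label to its new home, or from a new tape back to its sources.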
The project involved the transfer and rearrangement of hundreds of millions of files from several Petabytes of data. A single file output in the incorrect order was grounds for rejection of the output product, so specialised tools and MD5 tracking were used to track the movement of every byte.
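As a rough illustration of MD5 tracking, the sketch below checksums files in chunks and compares a planned output order against what was actually produced. It uses Python's standard `hashlib`; the function names and the (name, checksum) list layout are assumptions made for this example, not a description of the specialised tools used on the project.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_output(expected, actual):
    """Compare two ordered lists of (name, md5) pairs and return a list of problems.

    An empty list means every file is present, in the planned order,
    with a matching checksum.
    """
    problems = []
    if len(expected) != len(actual):
        problems.append(f"expected {len(expected)} files, found {len(actual)}")
    for pos, (exp, act) in enumerate(zip(expected, actual), start=1):
        if exp[0] != act[0]:
            problems.append(f"position {pos}: expected {exp[0]}, found {act[0]}")
        elif exp[1] != act[1]:
            problems.append(f"position {pos}: checksum mismatch for {exp[0]}")
    return problems
```

Because the comparison is positional, two files swapped on the output are reported even when both checksums are individually correct, which matches the rule that a single file in the wrong order is enough to reject the output.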
The business continuity and disaster recovery requirements of our client also dictated that a second backup copy of the data be kept offsite, away from the operational copies. This standard is now commonplace, but it is very difficult and expensive to achieve when there are 100K+ media cartridges to duplicate. Remastering the data onto higher density tape media helped here: moving from 3590 to 3592 gives roughly a 50:1 reduction in the number of cartridges, resulting in a much more manageable volume of tape.
For SpectrumData and the client, the project was a massive success.