Using DVD technology for archiving astronomical data.
Benoît Pirenne (1, 2), Miguel Albrecht (1)
(1) ESO/DMD, (2) ESO/ST-ECF
Background
Due to the slow evolution
of some astrophysical phenomena, long-term preservation of observations
has always been a major concern of observatories around the world. Be it
hand drawings on paper or 19th-century glass-plate photographs, the issue
at stake is how best to preserve data for future generations.
The advent of digital imaging
and recording equipment in the second half of this century has provided
both more observations and denser data storage media. These media can
no longer be read by the unaided eye: digital recordings require specific
equipment to decipher their content. While much progress has been made in
the past decades in manufacturing long-lived data storage media, the same
is not true of the reading/writing equipment, which quickly becomes obsolete
and soon impossible to repair. This apparent contradiction between
durable media and transient reading equipment is easy to understand if
one realizes that the medium is usually "passive" whereas the reading device
is always active, with mechanical components.
Archivists must therefore reconsider
storage technology every few years: the archive content has to be
transposed from endangered media to the newest technology roughly
every 3-5 years. Another major factor pushing towards migration of data
to new technology is cost: compared with the old technology, the new one
often brings savings per unit of volume of up to an order of magnitude,
which is a strong motivation for migration.
Current Situation.
In the case of the ESO/HST
Science Archive in Garching, three different storage media have been used
and migrated between since 1988: the 2GB LMSI 12" optical disk, the
6.4GB Sony 12" optical disk and the current 0.64GB CD-R in juke boxes.
The reasons for migrating from one to the other are given in table 1 below.
Table 1: The various data storage technologies used so
far at the ESO/ST-ECF Archive Facility. Shaded areas represent the solutions
actually implemented. Units of cost represent an arbitrary monetary unit
set to 100 per GB for the most expensive solution.
| | Reason for choice/migration to | | Cost per GB with juke box |
| 2GB/vol LMSI 12" optical disk | Direct access, best of technology back then. In sync with STScI and HST archive. | | |
| 6.4GB/vol Sony 12" optical disk | Direct access, factor of 2-3 cheaper to operate, previous technology difficult to maintain. In sync with STScI and HST archive. | | |
| | Juke box allows for online, no-operator-required access, ISO standard for file system | | |
| | Much higher density, keeps direct access advantage of CD-ROMs | | |
We abandoned the 12" optical
disk in favour of the more common 5 1/4" CD-R for two reasons. On the one
hand, CD-Rs benefit from an international standard defining the way their
content should be laid out (ISO9660), which is a guarantee of durability
and multi-vendor support. On the other hand, having all the data on-line
in juke boxes finally became affordable, whereas the cost of juke boxes
for 12" optical disks was prohibitive for our
archive system (see right-most column of table 1). This last reason sealed
the fate of hardware compatibility with the HST archive at the STScI, where
the data is still on 12" optical disks in juke boxes.
However, now that we have
completed the migration to CD-R, we are faced with another concern: the
growth of the data rate. The VLT and HST instruments soon to be commissioned
will produce several TB of raw data per year. We could not practically
keep this data on CD-Rs in juke boxes without making major infrastructure
investments in storage buildings!
The solution that addresses
the density problem and keeps the advantage of the CD-R technology (direct
access medium, cheap juke box capability) is the DVD.
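To give a feel for the density argument, the following minimal sketch counts
how many volumes one terabyte of data would occupy on each medium. The
capacities are the per-volume figures quoted in this article (0.64GB for
CD-R, 4GB for the ISO9660-formatted DVD-R described in the next section).
The decimal convention 1TB = 1000GB and the function name `volumes_needed`
are assumptions made only for this illustration, and file-system overhead
is ignored.

```python
import math

# Per-volume capacities in GB, as quoted in the article.
CAPACITY_GB = {"CD-R": 0.64, "DVD-R": 4.0}


def volumes_needed(data_gb, capacity_gb):
    """Number of volumes needed to hold data_gb, ignoring file-system overhead."""
    return math.ceil(data_gb / capacity_gb)


if __name__ == "__main__":
    one_tb = 1000.0  # assumption: 1 TB = 1000 GB
    for medium, capacity in CAPACITY_GB.items():
        print(f"{medium}: {volumes_needed(one_tb, capacity)} volumes per TB")
```

Under these assumptions each terabyte needs roughly six times fewer volumes,
and hence juke box slots, on DVD-R than on CD-R.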
Digital Versatile Disk (DVD)
The DVD technology has been
a long time coming, heralded as it has been by the specialized press for a
number of years already. Various disagreements within the industry and
disputes around copyright issues have considerably slowed its
introduction. For the past few months, however, equipment to record one's
own media (the DVD-R; see table 2 for a brief description of the variants)
has been available. Our archive facility was understandably quick to
procure and test the equipment (from Pioneer Corp.) and to prepare the
necessary software to support it. Even now, little support is available.
A DVD-R can only be called such if its file system is compliant with
the UDF file system. However, software drivers supporting this format for
both read and write are hardly available. To our knowledge, only the latest
version of the MacOS operating system has genuine support for it. The Unix
world so far enjoys no support.
In order to obtain quick results
and to be as compatible as possible with our existing archive tools and
procedures, we took a pragmatic approach: we contacted the
developer of "cdrecord", a public-domain CD-R recording tool (popular on
Linux, see below), and arranged with him to extend his software for
the production of DVD-Rs as well. Within a few months, a workable system
was delivered to us. However, due to the lack of software support for the
DVD-native UDF file system, we are using the standard CD-ROM format (650MB
ISO9660) extended to 4GB. To the host computer, our "DVD-R", once written,
simply looks like an unusually large CD-ROM.
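As an illustration of what "an unusually large CD-ROM" means in practice,
the sketch below reads the size that an ISO9660 volume declares about itself.
It relies on the standard ISO9660 layout: 2048-byte sectors, a primary volume
descriptor at sector 16 carrying the identifier "CD001", the number of blocks
at byte offset 80 and the block size at byte offset 128. The function name
`iso9660_volume_size` is hypothetical, and the sketch assumes that the primary
descriptor is the first one on the volume, as is usual. It is not part of the
archive software described here.

```python
import struct

SECTOR = 2048              # sector size used by ISO9660 on CD and DVD media
PVD_OFFSET = 16 * SECTOR   # the volume descriptors start at sector 16


def iso9660_volume_size(path):
    """Return the size in bytes declared in the ISO9660 primary volume descriptor.

    path may be a disc image or a raw device node.
    """
    with open(path, "rb") as f:
        f.seek(PVD_OFFSET)
        pvd = f.read(SECTOR)
    # Descriptor type 1 with identifier "CD001" marks the primary descriptor.
    if len(pvd) < SECTOR or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no ISO9660 primary volume descriptor found")
    # Both fields are stored twice (little- and big-endian); read the
    # little-endian copies.
    (n_blocks,) = struct.unpack_from("<I", pvd, 80)
    (block_size,) = struct.unpack_from("<H", pvd, 128)
    return n_blocks * block_size
```

A 4GB volume corresponds to about two million 2048-byte blocks, which fits
easily in the 32-bit block count of the descriptor, so a volume of that size
remains representable in the same structure as a 650MB CD-ROM.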
Projects and schedules.
The most pressing and demanding
project in our archive requiring high-density storage media at the moment is
the future 2.2m telescope mosaic camera, which will be commissioned in La
Silla starting this October. If our tests and prototypes, together with
juke box support, are positive, the DVD technology will be the system of
choice for this particular archive. Also, we have started to migrate the
NTT archive from the current Sony 12" optical disks to DVD. By the time
this issue of the Messenger is distributed, we will have copied a few dozen
Sony 12" optical disks onto the new medium.
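Moving data from 6.4GB disks to smaller 4GB volumes means that the files of
one source disk have to be spread over several target volumes. The article
does not describe how this is done; the sketch below shows one common
heuristic, first-fit decreasing, purely as an illustration. The function name
`pack_files` and the file names and sizes in the usage example are
hypothetical.

```python
def pack_files(sizes, capacity):
    """Assign files to volumes with the first-fit decreasing heuristic.

    sizes: mapping of file name -> size, in the same unit as capacity.
    Returns a list of volumes, each a list of file names.
    """
    volumes = []  # each entry is [free_space, [file names]]
    # Place the largest files first; each goes on the first volume with room.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{name} does not fit on a single volume")
        for volume in volumes:
            if volume[0] >= size:
                volume[0] -= size
                volume[1].append(name)
                break
        else:
            # No existing volume has enough room: start a new one.
            volumes.append([capacity - size, [name]])
    return [names for _, names in volumes]


if __name__ == "__main__":
    # Hypothetical file sizes in MB, packed onto 4000 MB volumes.
    example = {"a.fits": 3000, "b.fits": 2500, "c.fits": 1500, "d.fits": 1000}
    print(pack_files(example, 4000))
```

First-fit decreasing does not always find the smallest possible number of
volumes, but it is simple and usually comes close.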
We still expect to have full
UDF support later in 1999. Our current experience suggests that computer
operating systems will probably identify and mount media using any of
these standards transparently. So the co-existence in the same juke box of
CD-R, DVD-R with ISO9660 and plain DVD-R with UDF should be no problem.
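Telling the two formats apart is straightforward in principle, because both
announce themselves in the same place. Starting at sector 16, each 2048-byte
descriptor carries a five-character identifier at byte offset 1: "CD001" for
ISO9660, and "BEA01", "NSR02" or "NSR03", and "TEA01" for the UDF volume
recognition sequence. The following sketch scans that area. It is a
simplified illustration, not the mechanism of any particular operating
system, and the function name `detect_formats` is hypothetical.

```python
SECTOR = 2048
# Standard identifiers that may appear in the volume recognition area.
KNOWN_IDS = {b"CD001", b"BEA01", b"NSR02", b"NSR03", b"TEA01", b"BOOT2", b"CDW02"}


def detect_formats(path, max_descriptors=64):
    """Return the set of formats ("ISO9660", "UDF") announced on a volume."""
    found = set()
    with open(path, "rb") as f:
        f.seek(16 * SECTOR)
        for _ in range(max_descriptors):
            sector = f.read(SECTOR)
            ident = sector[1:6]
            # Stop at the first sector that is not a recognised descriptor.
            if len(sector) < SECTOR or ident not in KNOWN_IDS:
                break
            if ident == b"CD001":
                found.add("ISO9660")
            elif ident in (b"NSR02", b"NSR03"):
                found.add("UDF")
    return found
```

A volume written in both formats at once reports both names, which is
consistent with the expectation that mixed media can share one juke box.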
The next step, in 1999 or 2000,
will be the gradual migration of our CD-Rs onto the new medium to save
juke box storage space, as this is still by far the largest part of the
storage cost of CD-Rs.
For more information about this
system, please contact the authors (bpirenne@eso.org or malbrech@eso.org).
Information about "cdrecord" can be obtained from Jörg Schilling (schilling@fokus.gmd.de).
The DVD-R recording device we are using is Pioneer model DVR-S101.
Table 2: The jungle of acronyms
| | Compact Disc - Read-Only Memory | Mass-produced (silver) CD-ROM |
| | | Mass-reproduced video medium |
| | | Mass-reproduced data disc (4.7, 9.4, 18.8 GB) |
| | | Re-writeable DVD (2.6 GB) |