A. Preite-Martinez and F. Ochsenbein
C.D.S., 11, rue de l'Université, 67000 Strasbourg, France
We present a compact version of the Guide Star Catalogue (GSC) that is small enough (about 300 Mb) to be kept on disk. The reduction in size by a factor of about 4 is obtained by coding each record of the original catalogue on only 96 bits, without loss of integrity. In this paper we describe the encoding method, the structure of the resulting files, and the efficient search tool applied to this compact version.
The compact GSC v1.1 was installed on the CDS Archive machine (name cdsarc, Internet number 130.79.128.5), a Sun SPARC workstation dedicated to archives of astronomical catalogues. A client/server mechanism was also installed. The client program, which allows queries from remote computers, is available for distribution.
The first set of CD-ROMs contained the Guide Star Catalog - Version 1, with an issue date of 1 June 1989. The second set of CD-ROMs contains Version 1.1 of the GSC, with an issue date of 1 August 1992. The Guide Star Catalog (GSC) was prepared by the Space Telescope Science Institute (ST ScI), 3700 San Martin Drive, Baltimore, MD 21218, USA. ST ScI is operated by the Association of Universities for Research in Astronomy, Inc. (AURA), under contract with the National Aeronautics and Space Administration (NASA).
The Guide Star Catalog is subdivided into regions, bounded in right ascension and declination, which are numbered consecutively from 0001 to 9537. Data for each region are stored in a separate file; these files are contained in subdirectories, each of which covers a 7.5 degree zone of declination.
The Guide Star Catalog is distributed as a two CD-ROM set, divided at a declination of -7.5 degrees. A description file and supporting tables are duplicated on both discs. All data files are in FITS (Flexible Image Transport System) table format.
Directory GSC contains subdirectories for the 7.5 degree zones in declination; these subdirectories in turn contain the GSC region files in FITS format for the respective zone. Additional information on the Guide Star Catalog may be found in comments in the FITS headers of the supporting tables.
The compact version was designed with the following goals:
(a) to reduce the size of the catalogue by a large factor (about 4) without losing the integrity of the catalogue;
(b) to make the encoding/decoding procedures for a single GSC region as fast as possible;
(c) to keep the process of loading the compressed version on disk and running application tools machine independent (e.g. to be run on both DEC and SUN stations);
(d) to build an efficient search tool for the compressed version of the GSC.
A GSC record, originally written on 45 bytes, is stored on disk using only 96 bits (12 bytes). Each field in the record is appropriately offset and scaled, then coded in a series of bits whose number depends on the dynamic range of the field.
The following table shows the number of bits used to encode each field of a GSC record:
| No. | GSC field | bits | range |
|---|---|---|---|
| 1 | GSC-ID | 14 | 16384 |
| 2 | RA | 22 | 4194304 |
| 3 | DEC | 19 | 524288 |
| 4 | pos-error | 9 | 512 |
| 5 | magnitude | 11 | 2048 |
| 6 | mag-error | 7 | 128 |
| 7 | mag-band (coded) | 4 | 16 |
| 8 | class. | 3 | 8 |
| 9 | plate-id (coded) | 4 | 16 |
| 10 | multiple (coded) | 1 | 2 |
Two more bits are used as spares.
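As an illustration of the bit budget in the table above (94 bits of data plus 2 spare bits), the following Python sketch packs one record into 12 bytes and unpacks it again. The order of the fields inside the 96-bit word, the big-endian byte order, and the placement of the spare bits at the end are assumptions made for this example, and the function names are hypothetical; the layout actually used in the compact GSC files may differ.

```python
# Field widths in bits, in the order of the table above.
# ASSUMPTION: packing order, big-endian bytes, spare bits last.
FIELDS = [
    ("gsc_id", 14), ("ra", 22), ("dec", 19), ("pos_error", 9),
    ("magnitude", 11), ("mag_error", 7), ("mag_band", 4),
    ("classification", 3), ("plate_id", 4), ("multiple", 1),
    ("spare", 2),
]
assert sum(width for _, width in FIELDS) == 96


def pack_record(values):
    """Pack a dict of non-negative integer codes into 12 bytes."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(12, "big")


def unpack_record(blob):
    """Inverse of pack_record: 12 bytes -> dict of integer codes."""
    if len(blob) != 12:
        raise ValueError("a coded record is exactly 12 bytes long")
    word = int.from_bytes(blob, "big")
    values = {}
    # Fields were pushed in from the left, so peel them off from the right.
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

The integer codes passed to `pack_record` are the field values after offsetting and scaling; a value that exceeds the range of its field raises an error instead of silently overflowing into the neighbouring field.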
At the beginning of each coded region, a header contains the information needed to decode it. Although we used the same encoding procedure for all regions, each region is formally independent and is decoded according to the content of its own header. The header is in ASCII; its content is the following:
| Field in header | Settings for GSC 1.1 |
|---|---|
| length of header | first 3 ch. of header |
| version | 2 (for GSC 1.1) |
| scaling factors: | |
| RA | 100,000 |
| DEC | 100,000 |
| pos-error | 10 |
| magnitude | 100 |
| mag-error | 100 |
| offsets: | |
| RA | lower RA boundary |
| DEC | lower DEC boundary |
| mag | |
| number of plates | in the region |
| plate list | plates used in the region |
Fields are separated by spaces. Additional spaces are added, if required, at the end of the header, to make its length a multiple of 4 bytes. Then the bit-encoded region follows, at an offset given by the first field in the header.
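The header description above can be sketched in Python as follows. This is only an illustration: it assumes that the fields appear in the order of the table, that the numbers are written as plain ASCII decimals, that all three offsets are present as numeric tokens, and that a value is recovered as offset + code / scale. The names `parse_header` and `decode_field` are hypothetical.

```python
def parse_header(data):
    """Split a coded region into (header dict, bit-encoded payload).

    ASSUMPTION: token order follows the header table (length, version,
    5 scaling factors, 3 offsets, number of plates, plate list).
    """
    header_len = int(data[:3].decode("ascii"))  # first 3 characters
    tokens = data[:header_len].decode("ascii").split()
    scale_names = ("ra", "dec", "pos_error", "magnitude", "mag_error")
    offset_names = ("ra", "dec", "mag")
    n_plates = int(tokens[10])
    header = {
        "length": header_len,
        "version": int(tokens[1]),
        "scale": dict(zip(scale_names, map(float, tokens[2:7]))),
        "offset": dict(zip(offset_names, map(float, tokens[7:10]))),
        "plates": tokens[11:11 + n_plates],
    }
    # The bit-encoded records start right after the padded header.
    return header, data[header_len:]


def decode_field(code, scale, offset=0.0):
    """Recover a physical value from its integer code (assumed convention)."""
    return offset + code / scale
```

With a scaling factor of 100,000 and an offset equal to the lower RA boundary of the region, an RA code of 250000 would, under this convention, decode to the boundary plus 2.5.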
The directory structure used for storing the encoded regions of the compressed GSC on disk is the same as that of the original GSC, plus additional directories for source code, executables, and auxiliary tables. The total size of the compressed version of the GSC, including these additional directories, is about 303Mb.
The compressed version of the GSC v1.1 was installed on the CDS Archive machine (name cdsarc, Internet number 130.79.128.5), a Sun workstation dedicated to archives of astronomical catalogues (see e.g. B.I.CDS 1992, 41, 65).
First, the options and parameters on the command line are parsed and checked for validity. Then the list of all the regions intersected by the search cone is built. In this phase gsc makes use of one of the auxiliary tables provided with the original catalogue (the list of regions), previously indexed on declination as part of the loading process.
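The region selection step can be sketched as below. This is not the gsc source code: it is a minimal Python illustration that assumes the region list is held in memory as tuples sorted by lower declination boundary, with all angles in degrees and `ra_lo < ra_hi` for every region. The tuple layout and the name `regions_in_cone` are hypothetical. The test is conservative: it selects every region whose bounding box may intersect the cone.

```python
import bisect
import math


def regions_in_cone(regions, ra0, dec0, radius):
    """Return ids of regions whose bounding box may intersect the cone.

    regions: list of (dec_lo, dec_hi, ra_lo, ra_hi, region_id), in degrees,
             sorted by dec_lo (hypothetical layout of the indexed table).
    ra0, dec0, radius: cone centre and radius, in degrees.
    """
    dec_min, dec_max = dec0 - radius, dec0 + radius
    if dec_max >= 90.0 or dec_min <= -90.0:
        half_width = 180.0  # the cone reaches a pole: any RA is possible
    else:
        # Largest RA offset reached by a small circle of this radius.
        ratio = math.sin(math.radians(radius)) / math.cos(math.radians(dec0))
        half_width = math.degrees(math.asin(min(1.0, ratio)))
    # Declination index: skip every region that starts above the cone.
    last = bisect.bisect_right(regions, (dec_max, math.inf))
    selected = []
    for dec_lo, dec_hi, ra_lo, ra_hi, region_id in regions[:last]:
        if dec_hi < dec_min:
            continue  # region lies entirely below the cone
        # Test the RA window and its copies shifted by 360 deg (wrap at 0h).
        for shift in (-360.0, 0.0, 360.0):
            lo = ra0 - half_width + shift
            hi = ra0 + half_width + shift
            if ra_lo <= hi and ra_hi >= lo:
                selected.append(region_id)
                break
    return selected
```

Sorting on declination lets the bisection discard most of the regions at once; the remaining candidates are then checked individually in right ascension.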
The next step is to decode the selected regions. To speed up the process, only ra and dec are decoded at this stage. Only the records with coordinates within the search cone are extracted and fully decoded. Additional manipulation of the extracted records is performed according to the options given on the command line.
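The two-pass idea described above (test the position first, fully decode only the matches) relies on an angular distance test. A minimal Python sketch is given below; it uses the haversine formula, which is one common choice and not necessarily the formula used in gsc. The function names are hypothetical, and positions are assumed to be already decoded to degrees.

```python
import math


def separation_arcmin(ra1, dec1, ra2, dec2):
    """Angular separation in arcmin between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 60.0


def select_in_cone(positions, ra0, dec0, radius_arcmin):
    """Yield (index, distance) for the positions that fall inside the cone.

    positions: iterable of (ra, dec) in degrees, i.e. the only two fields
    decoded in the first pass; the caller fully decodes the yielded indices.
    """
    for index, (ra, dec) in enumerate(positions):
        distance = separation_arcmin(ra0, dec0, ra, dec)
        if distance <= radius_arcmin:
            yield index, distance
```

The distance computed for the selection can be kept and printed with the record, so it does not need to be recomputed for the output.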
The response time (elapsed time) for a search in a field of 10 arcmin in radius is usually of the order of 0.1 to 0.3 seconds, depending on the number of regions involved and the number of objects retrieved.
In order to allow access to this compressed version of the GSC from any computer without logging in at cdsarc, a client/server mechanism was installed: the server search program is located on the cdsarc machine, and only the query (field center and radius, plus additional options) and the results (found stars) are transmitted over the network. The client can be anywhere on the Internet. The typical response time for a query as described above from the US west coast is around 5 s. The client program allowing queries from remote computers is available for distribution at CDS (contact person F.O.).
The syntax of the gsc command is the following:
gsc -options parameters
or
gsc ra dec radius[1, 2]
(in this case: ra in decimal hours, dec in decimal degrees)
Options and parameters available for the query are described in the following table.
| options | parameters | format/units |
|---|---|---|
| c | ra +/- dec | free format |
| r | radius[1, radius2] | arcmin (default radius = 10') |
| f | file-name | file with coordinates (available only for local users) |
| p | (none) | fields as in the GSC, separated by spaces (default); |
| | 1 | as above, with ra dec in string format; |
| | 2 | summary of GSC record (ID, RA, DEC, mag); |
| | 3 | as 2, with ra dec in string format. |
| h | - | prints header for output columns |
| s | [field no.] | sorts output according to specified field. Field numbers are from 1 to 12. Default field is 11 (= sort by distance). If field>0 sorting order is ascending, if field<0 order is descending. |
| n | no. of lines out | number of sorted lines in output. Max lines set by option l. Option n forces option s. |
| l | max no. of lines | size of buffer for sorted output. Default = 1000 lines. |
| m | mag1 mag2 | limits in magnitude (mag1<mag2). |
If coordinates (or file with coordinates) are not specified, ra and dec are read from standard input. Distance from centre (d in arcmin) and position angle (pa in degrees East of North) are always provided at the end of the printed record.
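For reference, the position angle of a star as seen from the field centre, measured in degrees East of North, can be computed with the standard spherical formula sketched below in Python. This is an illustrative sketch, not the gsc implementation; the function name is hypothetical and all inputs are assumed to be in degrees (an ra given in decimal hours would first be multiplied by 15).

```python
import math


def position_angle(ra0, dec0, ra, dec):
    """Position angle (degrees East of North) of (ra, dec) seen from (ra0, dec0)."""
    ra0, dec0, ra, dec = map(math.radians, (ra0, dec0, ra, dec))
    d_ra = ra - ra0
    east = math.sin(d_ra) * math.cos(dec)
    north = (math.cos(dec0) * math.sin(dec)
             - math.sin(dec0) * math.cos(dec) * math.cos(d_ra))
    # atan2 returns an angle in (-180, 180]; fold it into [0, 360).
    return math.degrees(math.atan2(east, north)) % 360.0
```

A star due North of the centre gives 0, and a star at larger right ascension on the same declination circle near the equator gives an angle close to 90.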