Sequoia 2000 January '93 retreat Postgres demo software
Mike Olson, Jim Frew

This directory contains the software that Frew and I demonstrated at
the S2K retreat at Lake Arrowhead in January of 1993. This software
manages typed large objects (AVHRR satellite images) as two-dimensional
arrays and handles georeferenced coordinate transformation (between
Lambert Azimuthal Equal Area and Latitude/Longitude for this demo).

The general scheme is that a number of .c files are compiled to object
files, which are then dynamically linked into Postgres on demand. The
.pq files are Postquel scripts that define the types and functions
supported by the .c files.

For the demo, we used a database that already contained some AVHRR
satellite images as large objects; those are not provided here, but
there are plenty of them available in the 'sequoia' database on
heel.s2k.Berkeley.EDU.

If you are reasonably familiar with POSTGRES, you should be able to use
the code supplied here to define the classes, types, and functions we
demonstrated at Lake Arrowhead. Sometime in the next few weeks, we'll
write a script that will handle initializing a database and loading
these operators automatically.

The types supported are:

ARRAY:

This is a Postgres ADT. Values are variable-length. In POSTGRES,
variable-length data items always begin with a four-byte value that is
the length of the particular datum. This length includes the four
bytes; for example, if the string "POSTGRES" were stored as a
variable-length (varlen) value, it would be represented as

    <12>POSTGRES

where the <12> is stored in the four bytes immediately preceding the
string. (Using this strategy, it is not necessary to store trailing
null bytes for character strings.)
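The varlen layout just described can be illustrated with a small C
sketch. This is not code from the demo sources: the helper names
(varlen_from_cstring, varlen_total, varlen_payload_len) are
hypothetical, and the sketch assumes the length word is a 32-bit
integer in the host's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in the demo sources): build a varlen datum
 * from a C string. The 4-byte length word counts itself, and the
 * trailing NUL of the C string is not stored. Caller frees the result. */
static unsigned char *varlen_from_cstring(const char *s)
{
    size_t payload = strlen(s);
    int32_t total = (int32_t)(sizeof(int32_t) + payload);
    unsigned char *datum = malloc((size_t)total);

    if (datum == NULL)
        return NULL;
    memcpy(datum, &total, sizeof total);        /* length word first */
    memcpy(datum + sizeof total, s, payload);   /* then the bytes, no NUL */
    return datum;
}

/* Read back the total length, which includes the 4-byte length word. */
static int32_t varlen_total(const unsigned char *datum)
{
    int32_t total;

    memcpy(&total, datum, sizeof total);
    return total;
}

/* The payload length is the total minus the length word itself. */
static int32_t varlen_payload_len(const unsigned char *datum)
{
    return varlen_total(datum) - (int32_t)sizeof(int32_t);
}
```

For the string "POSTGRES" this yields a 12-byte datum: a length word
holding 12, followed by the eight characters.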
Array values look like

    struct {
        int  arr_varlen;     /* len including arr_varlen */
        char arr_format[8];  /* format key */
        int  arr_eltsize;    /* size of an array element */
        int  arr_ndims;      /* # of dimensions */
        int  dims[];         /* size in each dimension */
        char name[];         /* name of large object */
    }

where 'dims' and 'name' are variable-length values that immediately
follow the array structure in memory or on disk. 'Name' is the name of
a large object which stores the actual array contents. 'Arr_format' is
an eight-byte format name, which the code uses to understand the format
of the large object. In the long term, we want to make this
table-driven, like the GEO_UNITS type, below. For now, the only two
formats supported are 'avhrr' and 'pgm' (which may be viewed by using
the 'xv' program).

    C code: array.c
    POSTQUEL queries to set up required classes: array.pq
    Used in: avhrr.c

geoloc:

Geoloc is a georeferenced location type in two dimensions; this
implementation ignores elevation. Geoloc is an ADT whose contents
depend on the projection stored for a given value. For all projections,
values contain

    struct {
        int    geo_varlen;   /* len including geo_varlen */
        char   geo_name[8];  /* name of projection */
        float  geo_prec;     /* precision of value */
        double geo_x;        /* x component */
        double geo_y;        /* y component */
    }

'Geo_name' is the name of the projection; this serves as a foreign key
into the GEO_UNITS table, described below. 'Geo_prec' is a 32-bit
floating point value describing the precision of the 'geo_x' and
'geo_y' coordinates; at present, we do nothing interesting with this.
(One problem with managing precision is deciding how to change it when
the conversion between projections is nonlinear.) The 'geo_x' and
'geo_y' entries are the X and Y (or R and theta) components of the
location. They're interpreted differently depending on the projection
in use.

Different projections have different amounts of additional information.
The Lambert Azimuthal Equal Area projection stores

    struct {
        geoloc        la_g;        /* the struct above */
        unsigned char la_sphcode;  /* spheroid code */
        double        la_lon0;     /* origin lon */
        double        la_lat0;     /* origin lat */
        double        la_falseE;   /* false eastings */
        double        la_falseN;   /* false northings */
    }

where (for this demo) the spheroid code is always 0 (for Clarke 1866),
origin is in the center of the conterminous US, and falseE and falseN
are both zero.

    C code: geoloc.c
    POSTQUEL queries to set up classes: geoloc.pq
    Used in: avhrr.c

GEO_UNITS:

This is a class storing function pointers for all of the different
projections we know about. We declare the Latitude/Longitude format to
be the standard representation for geoloc values. GEO_UNITS contains

    char8   gu_name,
    regproc gu_inproc,
    regproc gu_outproc,
    regproc gu_fulloutproc,
    regproc gu_fromstdproc,
    regproc gu_tostdproc

The 'gu_name' entry matches the 'geo_name' field of a geoloc value. The
rest of the entries are function pointers for managing values in that
projection. 'inproc', 'outproc', and 'fulloutproc' convert geoloc
values from string format, to incomplete string format, and to complete
string format, respectively. The idea is that under normal
circumstances, users don't want to see (for example) the precision and
state vectors for a particular geoloc value, but when moving data out
of Postgres they do need that information. Therefore, normally users
see values in abbreviated (gu_outproc) format, but should use
gu_fulloutproc when they want a complete description.

The 'gu_fromstdproc' and 'gu_tostdproc' convert values in a given
projection from and to the standard projection, respectively.

The geoloc type has in, out, fullout, tostd, and fromstd functions
defined on it. These functions automatically look up the appropriate
projection-specific function in the GEO_UNITS table and call that.
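The lookup-and-call scheme described above can be sketched in plain C.
This is an illustration only, not the code in geoloc.c: a static
in-memory array stands in for the GEO_UNITS class, C function pointers
stand in for regproc values, and the geo_value struct omits the varlen
header and precision. The projection names "latlon" and "toy", and the
"toy" conversion itself (a plain factor of two, not real map math), are
assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GEO_NAMELEN 8

/* Simplified stand-in for a geoloc value. */
typedef struct {
    char   geo_name[GEO_NAMELEN];   /* projection name, the lookup key */
    double geo_x;
    double geo_y;
} geo_value;

typedef geo_value (*geo_convert_fn)(geo_value);

/* One row of the in-memory stand-in for GEO_UNITS. */
typedef struct {
    char           gu_name[GEO_NAMELEN];
    geo_convert_fn gu_tostdproc;
    geo_convert_fn gu_fromstdproc;
} geo_units_row;

/* Store a name in an 8-byte field, NUL-padded, truncated if too long. */
static void geo_set_name(char *dst, const char *src)
{
    size_t n = strlen(src);

    if (n > GEO_NAMELEN)
        n = GEO_NAMELEN;
    memset(dst, 0, GEO_NAMELEN);
    memcpy(dst, src, n);
}

/* The standard projection converts to and from itself unchanged. */
static geo_value latlon_identity(geo_value v) { return v; }

/* Placeholder conversions for the made-up "toy" projection. */
static geo_value toy_tostd(geo_value v)
{
    v.geo_x /= 2.0;
    v.geo_y /= 2.0;
    geo_set_name(v.geo_name, "latlon");
    return v;
}

static geo_value toy_fromstd(geo_value v)
{
    v.geo_x *= 2.0;
    v.geo_y *= 2.0;
    geo_set_name(v.geo_name, "toy");
    return v;
}

static const geo_units_row geo_units[] = {
    { "latlon", latlon_identity, latlon_identity },
    { "toy",    toy_tostd,       toy_fromstd     },
};

/* Find the row whose gu_name matches; NULL if the name is unknown. */
static const geo_units_row *geo_units_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof geo_units / sizeof geo_units[0]; i++)
        if (strncmp(geo_units[i].gu_name, name, GEO_NAMELEN) == 0)
            return &geo_units[i];
    return NULL;
}

/* Generic tostd: dispatch on the value's own projection name.
 * Returns 0 on success, -1 if the projection is not registered. */
static int geoloc_tostd(const geo_value *in, geo_value *out)
{
    const geo_units_row *row = geo_units_lookup(in->geo_name);

    if (row == NULL)
        return -1;
    *out = row->gu_tostdproc(*in);
    return 0;
}

/* Generic fromstd: dispatch on the requested target projection. */
static int geoloc_fromstd(const char *target, const geo_value *std,
                          geo_value *out)
{
    const geo_units_row *row = geo_units_lookup(target);

    if (row == NULL)
        return -1;
    *out = row->gu_fromstdproc(*std);
    return 0;
}
```

In this sketch, supporting another projection means adding one row to
the table; the generic functions do not change.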
To add a new projection to the set that is currently supported, the
user need only write the in, out, fullout, tostd and fromstd functions
for it, and register it by name in the GEO_UNITS table.

    POSTQUEL queries to set up classes: geoloc.pq
    Used in: geoloc.c

AVHRR:

This is a class that stores AVHRR images. This class is an initial
implementation only. We expect to add more metadata to each image
stored. We chose abstime (the POSTGRES internal absolute time type) to
store dates here because it was available, but we expect to implement a
new date type for Sequoia that extends further into the past and the
future. Abstime values range from 1902 to 2038.

For each image, the AVHRR class contains

    abstime image_date,
    int2    band,
    geoloc  image_nw,
    geoloc  image_se,
    ARRAY   image

'Image_date' is the date associated with the image; for biweekly
composites, this may or may not be useful. 'Band' is the sensor band
(1-5) of the image. The image coordinates image_nw and image_se mark
the northwest and southeast corners of the rectangle that contains the
image. For AVHRR images, these are normally in LAZEA. The 'image'
attribute points at a large object (an ARRAY) that contains the actual
image bytes.

    C code: avhrr.c
    POSTQUEL code to set up classes: avhrr.pq

The following functions operate on these types:

in array.c:

    arrayname(ARRAY): return the name of the supplied array.

    arrayndims(ARRAY): number of dimensions in the supplied array.

    arrayformat(ARRAY): return the format string for the supplied
    array.

    arrayclip(ARRAY, format, x1, y1, x2, y2): clip the supplied array
    to the rectangle (x1, y1, x2, y2) (upper left, lower right). These
    coordinates are relative to the origin of the array. The function
    produces a new array large object; the format of the new array is
    whatever was selected by the 'format' argument to arrayclip().

in avhrr.c:

    clip_avhrr(AVHRR, format, northwest, southeast)

    Clip the AVHRR image to the rectangle described by northwest and
    southeast.
    Northwest and southeast are geolocs, and are converted to the
    native geoloc for the AVHRR image (Lambert Azimuthal Equal-Area)
    automatically. In general, coordinate transformation doesn't
    produce rectangles when converting among projections, so
    clip_avhrr computes the bounding box that completely contains the
    LAZEA area corresponding to the supplied rectangle. In order to do
    the clip, the desired coordinates are turned into indices for the
    underlying array, and the arrayclip() routine is called to do the
    real work.

    This function returns the resulting array; it does not insert a
    new tuple in the AVHRR class, although it would be easy to change
    it to do so.

In addition, the _in, _out, _fullout, _tostd, and _fromstd functions
for each of the supported map projections are defined in geoloc.c.
These functions do the work described above, in the section on
supported types.
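The bounding-box and index steps described for clip_avhrr can be
sketched as follows. This is a minimal illustration under stated
assumptions, not the code in avhrr.c: every name here is hypothetical,
the transform is a placeholder shear rather than a real map projection,
the box is estimated by sampling points along the rectangle's edges
(which approximates, but does not guarantee, complete containment for a
curved projection), and the pixel layout (square cells of size 'cell',
rows counted southward from the image's northwest corner) is assumed
because this README does not describe it.

```c
#include <assert.h>

typedef struct { double x, y; } point;
typedef struct { double xmin, ymin, xmax, ymax; } bbox;
typedef struct { long x1, y1, x2, y2; } clip_rect;
typedef point (*transform_fn)(point);

/* Placeholder transform for the example: a shear, NOT a projection. */
static point toy_shear(point p)
{
    point q;

    q.x = p.x + 0.5 * p.y;
    q.y = p.y;
    return q;
}

/* Enlarge the box so that it contains p. */
static void bbox_grow(bbox *b, point p)
{
    if (p.x < b->xmin) b->xmin = p.x;
    if (p.x > b->xmax) b->xmax = p.x;
    if (p.y < b->ymin) b->ymin = p.y;
    if (p.y > b->ymax) b->ymax = p.y;
}

/* Transform sample points along all four edges of the rectangle
 * (nw, se) and return the box enclosing the transformed points. */
static bbox transformed_bbox(point nw, point se, transform_fn f, int steps)
{
    point first = f(nw);
    bbox b;
    int i, k;

    assert(steps >= 1);
    b.xmin = b.xmax = first.x;
    b.ymin = b.ymax = first.y;
    for (i = 0; i <= steps; i++) {
        double t = (double)i / steps;
        double x = nw.x + t * (se.x - nw.x);
        double y = nw.y + t * (se.y - nw.y);
        point edge[4];

        edge[0].x = x;    edge[0].y = nw.y;   /* north edge */
        edge[1].x = x;    edge[1].y = se.y;   /* south edge */
        edge[2].x = nw.x; edge[2].y = y;      /* west edge */
        edge[3].x = se.x; edge[3].y = y;      /* east edge */
        for (k = 0; k < 4; k++)
            bbox_grow(&b, f(edge[k]));
    }
    return b;
}

/* Round toward minus/plus infinity without needing the math library. */
static long floor_to_long(double v)
{
    long n = (long)v;

    return (v < (double)n) ? n - 1 : n;
}

static long ceil_to_long(double v)
{
    long n = (long)v;

    return (v > (double)n) ? n + 1 : n;
}

/* Turn a box in image coordinates into array indices, rounding
 * outward so the index rectangle still covers the whole box. */
static clip_rect bbox_to_indices(bbox b, point image_nw, double cell)
{
    clip_rect r;

    assert(cell > 0.0);
    r.x1 = floor_to_long((b.xmin - image_nw.x) / cell);
    r.x2 = ceil_to_long((b.xmax - image_nw.x) / cell);
    r.y1 = floor_to_long((image_nw.y - b.ymax) / cell);
    r.y2 = ceil_to_long((image_nw.y - b.ymin) / cell);
    return r;
}
```

The resulting (x1, y1, x2, y2) values are what a routine such as
arrayclip() would then take as its upper-left and lower-right corners.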