PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNootteess 11999922//0077//1144 11.. IInnttrroodduuccttiioonn These are the release notes for Version 4.0 of the POSTGRES database system from UC Berkeley. The database system and its installation procedure are covered in detail in the setup document for this release, which can be found in the file $$PPOOSSTTGGRREESSHHOOMMEE//ddoocc//ppoossttggrreess--sseettuupp..mmee. Here, we cover only the most important differences from Version 3.1 and earlier versions of the system. 22.. AAiimm This release provides improved functionality over Ver- sion 3.1 in many areas of the system. In particular, sig- nificant advances have been made in POSTGRES functions and large objects. 33.. CChhaannggeess ttoo PPOOSSTTQQUUEELL 33..11.. ddeeffiinnee ffuunnccttiioonn The syntax for defining functions has been changed slightly. POSTGRES now supports functions that can return sets of values. To declare such a function you use the keyword sseettooff before the actual type name when specifying the return type: ddeeffiinnee ffuunnccttiioonn hhoobbbbiieess ((llaanngguuaaggee == ""ppoossttqquueell"",, rreettuurrnnttyyppee == sseettooff hhoobbbbiieess__rr)) aarrgg iiss ((eemmpp)) aass ""rreettrriieevvee ((hhoobbbbiieess__rr..aallll)) wwhheerree hhoobbbbiieess__rr..ppeerrssoonn == $$11..nnaammee"" The function hhoobbbbiieess is declared to return a set of instances from the class hhoobbbbiieess__rr. Had sseettooff been omitted, the rreettrriieevvee would be executed exactly one time and the first instance retrieved would be the only return value. The primary reason for forcing this dis- tinction is that POSTQUEL functions declared to return a single value can be used in a much more general fashion than functions returning a set of results. POSTGRES does not support functions or operators with sets as arguments. For example: PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo22tteess 11999922//0077//1144 ddeeffiinnee ffuunnccttiioonn aallwwaayyssoonnee ((llaanngguuaaggee == ""ppoossttqquueell"",, rreettuurrnnttyyppee == iinntt44)) aass ""rreettrriieevvee ((xx == 11))"" Calls to aallwwaayyssoonnee may be used as though they were con- stants. In contrast, if we had declared the return type to be sseettooff iinntt44, aallwwaayyssoonnee would be forbidden as an argument to functions and/or operators. POSTGRES now provides much better optimization of expensive functions. To aid the optimizer, the author of a C function can now choose to specify parameters which will be used to determine the function's "cost" in the query under consideration. For complete details see the Reference Manual page on functions. 33..22.. NNUULLLL DDeetteeccttiioonn POSTGRES now recognizes the keywords IISSNNUULLLL and NNOOTT-- NNUULLLL and allows qualifications based on whether or not an attribute is null. The general syntax in qualifica- tions is: wwhheerree _e_x_p_r_e_s_s_i_o_n IISSNNUULLLL wwhheerree _e_x_p_r_e_s_s_i_o_n NNOOTTNNUULLLL 33..33.. UUnniiffiieedd SSyynnttaaxx ffoorr FFuunnccttiioonnss aanndd AAttttrriibbuutteess If a POSTQUEL function is declared that takes instances of a class as its only argument, it may be invoked in two different ways. For example, assume that the function oovveerrppaaiidd has been declared in POSTQUEL to take an instance of the eemmpp class and return a boolean value. Then the two queries rreettrriieevvee ((eemmpp..oovveerrppaaiidd)) rreettrriieevvee ((oovveerrppaaiidd == oovveerrppaaiidd((eemmpp)))) are exactly equivalent. Similarly, attributes may be referenced in the same ways. If eemmpp has an attribute ssaallaarryy, then rreettrriieevvee ((eemmpp..ssaallaarryy)) rreettrriieevvee ((ssaallaarryy == ssaallaarryy((eemmpp)))) are equivalent. Both of these may be considered parame- terless method calls on instances of the eemmpp class. PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo33tteess 11999922//0077//1144 33..44.. BBiinnaarryy PPoorrttaallss It is now possible to get data out of POSTGRES in binary format. The query syntax is rreettrriieevvee iippoorrttaall _p_o_r_t_a_l _n_a_m_e ((_t_a_r_g_e_t _l_i_s_t)) _q_u_a_l_i_f_i_e_r_s Taking advantage of this feature should speed up lliibbppqq application programs considerably. See the Reference Manual for all of the details. 33..55.. AArrrraayyss In previous releases array indirection values were required to be integer constants. In Version 4.0 they can be any expression as long as the expression is of type int4. For example the following queries are now legal rreettrriieevvee ((ffoooo..aa[[11++22]],, ffoooo[[aarrrraayyiinndd..ii]],, ffoooo..aa)) rreettrriieevvee ((ffoooo..aa[[aarrrraayyiinndd..ii]])) wwhheerree aarrrraayyiinndd..ii <<== 33 44.. GGeenneerraall SSyysstteemm CChhaannggeess 44..11.. FFuunnccttiioonnss The big story of the Version 4.0 release is improved POSTQUEL functions. POSTQUEL function support in previ- ous releases suffered from the following limitations: +o They could take only a single argument, which had to be an instance of some class. +o They had to return a set of instances of some class. +o No run-time or declaration-time type checking was performed to verify that the arguments or return value of the function were correctly typed. POSTQUEL functions in Version 3.1 of POSTGRES were implemented using the rewrite rules system. Version 4.0 supports query language functions as first-class citi- zens, and no longer relies on the rules system to imple- ment them. As a result, the shortcomings listed above have been eliminated. There are no special restrictions on the types or the number of arguments, nor is there any restriction on the type of the result. As outlined above, the syntax for defining a POSTQUEL function is analogous to defining a PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo44tteess 11999922//0077//1144 C function. The biggest difference is that the aass clause must contain a quoted list of POSTQUEL queries. Within the body of the function, $_n_u_m_b_e_r refers to the _n_u_m_b_e_rth parameter passed in by the caller. See the Reference Manual for details. C functions have also been improved. In previous releases C functions were restricted to taking at most one "tuple" argument, that is, an argument whose type is an instance of some class. In Version 4.0, C functions can take up to eight tuple arguments.1 Finally, POSTGRES provides enhanced optimization of both C and POSTQUEL functions. In the case of C func- tions, the optimizer takes user-specified parameters on cost into account when deciding how to execute a query. POSTQUEL functions have one important limitation: system attributes (for example, ttmmiinn, ttmmaaxx, and ooiidd) of function return values and parameters cannot be refer- enced. The POSTGRES parser checks this condition when a function is defined and invoked, and prints a descrip- tive error message if the condition is violated. This limitation will be removed in a future release of the system. 44..22.. RReevviisseedd SSuuppppoorrtt ffoorr NNeesstteedd RReellaattiioonnss Version 3.1 of POSTGRES supported nested relations using tuple-valued attributes and rewrite rules. For example, given the declarations: ccrreeaattee ppeerrssoonn ((nnaammee == cchhaarr1166,, cchhiilldd == cchhaarr1166)) aaddddaattttrr ((ppaarreenntt == ppeerrssoonn)) ttoo ppeerrssoonn ddeeffiinnee rreewwrriittee rruullee ffiinndd__ppaarreenntt oonn rreettrriieevvee ttoo ppeerrssoonn..ppaarreenntt ddoo iinnsstteeaadd rreettrriieevvee ((ppeerrssoonn..aallll)) wwhheerree ppeerrssoonn..cchhiilldd == ccuurrrreenntt..nnaammee it was possible to refer to ppeerrssoonn..ppaarreenntt..nnaammee and have the correct name fetched by POSTGRES. ____________________ The 8-argument limit applies to aallll POSTGRES functions. PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo55tteess 11999922//0077//1144 In Version 4.0 of POSTGRES, rewrite rules are no longer used to support nested relations. Instead, a function should be declared that returns tuples of the appropri- ate type. For example, in Version 4.0, ccrreeaattee ppeerrssoonn ((nnaammee == cchhaarr1166,, cchhiilldd == cchhaarr1166)) ddeeffiinnee ffuunnccttiioonn ppaarreenntt ((llaanngguuaaggee == ""ppoossttqquueell"",, rreettuurrnnttyyppee == ppeerrssoonn)) aarrgg iiss ((ppeerrssoonn)) aass ""rreettrriieevvee ((ppeerrssoonn..aallll)) wwhheerree ppeerrssoonn..cchhiilldd == $$11..nnaammee"" are equivalent to the rule and attribute additions shown above. The same nested-dot notation is supported for functions as was supported for rewrite rules. As a side effect of the improved support for func- tions, it is not possible in Version 4.0 to create a relation which has an attribute that is of a complex (tuple) type. For example, the aaddddaattttrr command shown above is illegal in this release of POSTGRES, since the type of the attribute being added is ppeerrssoonn, and ppeerrssoonn is a class. Support for attributes of complex types will be reintroduced in a future release of POSTGRES. 44..33.. SSeettss POSTGRES now provides limited support for sets. Cur- rently this comes via results of POSTQUEL functions (C functions should also be able to return a set of values; this will be available in a future release). The key design choice to point out is that when two set-valued POSTQUEL functions appear in the target list of a rreettrriieevvee query, the result will be a set of "flattened" tuples. This set of tuples is the cross product of all the sets appearing the target list. For example, if you have two functions, ff and gg, where ff returns 3 results and gg returns 2, POSTGRES will return 6 tuples which make up all combinations of ff and gg. If either ff or gg retuns no results (i.e. the empty set) then we consider it to be a set of 1 element whose value is NULL. This choice was made primarily to avoid returning no results when a function in the target list returns nothing. 44..44.. LLaarrggee OObbjjeeccttss aanndd tthhee IInnvveerrssiioonn FFiillee SSyysstteemm Support for large objects in POSTGRES has been improved. Version 3.1 supported large objects stored in UNIX files. User applications could open, seek, read, and write these large objects, but only inside dynami- cally-loaded functions. In addition, changes to large objects were not transaction-protected, and no support PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo66tteess 11999922//0077//1144 for historical access was available. In Version 4.0, another large object implementation is provided. This implementation breaks large objects into "chunks," and stores each chunk in a tuple in the database. Updates are transaction-protected and time travel is available. Version 3.1 large objects ("UNIX large objects") are still supported. The new implemen- tation ("Inversion large objects") is available by spec- ifying the proper flags when an object is created. Details are provided in the Reference Manual. In addition, a rudimentary file system has been built on top of the POSTGRES Inversion large object implemen- tation. The Inversion file system provides transaction- protected access to user files, time travel on file sys- tem state, and the ability to do attribute- or content- based searches for file data. At present, no kernel interface to Inversion is available, so the Inversion file system cannot be mounted and used in the same way as the native file sys- tem. Instead, a suite of library routines and utility programs give users access to the Inversion file system. Again, see the Reference Manual for details. 44..55.. IInnddiicceess oonn SSyysstteemm CCaattaallooggss POSTGRES now has secondary indices defined on the system catalogs. Initially, we have indexed ppgg__pprroocc, ppgg__aattttrriibbuuttee, and ppgg__ttyyppee. We expect that, after pro- filing POSTGRES performance on large databases, addi- tional indices will be added in future releases. 55.. MMiisscceellllaanneeoouuss 55..11.. BBiigg BBuugg FFiixxeess We have found and fixed a number of serious bugs with multi-user operation that were present in release 3.1. One bug sometimes caused a backend to read end-of-file from the socket on start up. Another much more destruc- tive bug caused data from one database to be written to another database in certain circumstances. If the database being written to was tteemmppllaattee11, then the cor- ruption would be passed on to whoever ran a subsequent ccrreeaatteeddbb. Often ccrreeaatteeddbb would just fail. 55..22.. RReeffeerreennccee MMaannuuaall aanndd UUsseerr MMaannuuaall A serious effort was made to improve the dilapidated condition of both the Reference Manual and User Manual. PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo77tteess 11999922//0077//1144 The main focus was to root out all of the lies they con- tained, as well as documenting all of the new features. We will be paying much closer attention to bugs in these two documents. If you find something wrong in them, or something you can't understand, let us know about it (bbuugg--ppoossttggrreess@@ppoossttggrreess..BBeerrkkeelleeyy..EEDDUU). 55..33.. PPOOSSTTGGRREESS UUsseerr IIddss aanndd UUnniixx UUIIDDss The user ID of a POSTGRES registered user mmuusstt match the user's UNIX user ID. In the new release, the user ID of the POSTGRES user in //eettcc//ppaasssswwddis nnoott presumed to be anything. We now adjust the system catalog entry for the user ID at installation time. It will happen auto- matically; there is no longer any need to edit ppgg__uusseerr..hh. 55..44.. SSeeccuurriittyy Security in POSTGRES is pretty much non-existent. We are working on providing it for a future release. For now the only check POSTGRES performs is at initializa- tion time. The backend will lookup the front-end user to make sure s/he is a registered POSTGRES user before allowing the user to run any queries. Just so you are aware, it is not at all difficult for someone to circum- vent this poor excuse for a lock-out mechanism. If you really care about your data, you should make backups of your $$PPOOSSTTGGRREESSHHOOMMEE//ddaattaa directory regularly. In previous releases, ppgg__uusseerr contained entries for each of the POSTGRES implementors. In Version 4.0, the only entry is for the user named ppoossttggrreess. 55..55.. AAppppeenndd RReettuurrnnss aann OObbjjeecctt IIdd When you run the POSTQUEL aappppeenndd command from ppqqeexxeecc, the command tag that you receive has the ooiidd of the appended tuple in it. When you run aappppeenndd queries from the monitor you get output something like: AAPPPPEENNDD 220066555577 When two or more tuples are appended by the same command POSTGRES returns the invalid object ID zero. 66.. KKnnoowwnn BBuuggss There are a few known bugs that we did not fix in the current release. PPOOSSTTGGRREESS VVeerrssiioonn 44..00 RReelleeaassee NNoo88tteess 11999922//0077//1144 66..11.. IInnddiicceess aanndd tthhee IInnssttaannccee LLeevveell RRuullee SSyysstteemm The instance-level rule system essentially ignores indices, so if you are defining a rule on an indexed attribute, you should use the query rewrite rule system. 66..22.. RReettrriieevvee IInnttoo aanndd ffaaiilleedd bbaacckkeennddss If a backend fails while in the course of executing a rreettrriieevvee iinnttoo query, a spurious file, with the same name as the target class of the rreettrriieevvee iinnttoo, will be left in the database directory. This file can be safely deleted by the database administrator. 66..33.. KKnnoowwnn bbuuggss lliisstt A list of known bugs and suggested work-arounds can be anonymously FTP'd from ppoossttggrreess..BBeerrkkeelleeyy..EEDDUU (128.32.149.1). This list is kept in the file ppuubb//ppoossttggrreess--vv44rr00..bbuuggss. We will make every attempt to keep this list up to date. 77.. MMaacchhiinnee--ddeeppeennddeenntt PPrroobblleemmss 77..11.. SSPPAARRCCssttaattiioonnss rruunnnniinngg SSuunnOOSS 44..00..33 POSTGRES has been known to crash SunOS 4.0.3 on SPARCstations, due to a SunOS bug in shared memory. This bug has not been observed on SunOS 4.1 and higher, so any reports of crashes on SunOS 4.1 and higher are especially appreciated.