TThhee PPOOSSTTGGRREESS9955 UUsseerr MMaannuuaall VVeerrssiioonn 11..00 ((SSeepptteemmbbeerr 55,, 11999955)) _A_n_d_r_e_w _Y_u _a_n_d _J_o_l_l_y _C_h_e_n _(_w_i_t_h _t_h_e _P_O_S_T_G_R_E_S _G_r_o_u_p_) _C_o_m_p_u_t_e_r _S_c_i_e_n_c_e _D_i_v_._, _D_e_p_t_. _o_f _E_E_C_S _U_n_i_v_e_r_s_i_t_y _o_f _C_a_l_i_f_o_r_n_i_a _a_t _B_e_r_k_e_l_e_y ____________________ POSTGRES95 is copyright (C) 1994-5 by the Regents of the University of California. Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies. IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, IN- CIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DIS- CLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HERE- UNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. 11 ________________________________________________________________________________________________________________________ 11.. IINNTTRROODDUUCCTTIIOONN ________________________________________________________________________________________________________________________ This document is the user manual for the POSTGRES95 database management system developed at the University of California at Berkeley. POSTGRES95 is based on POSTGRES release 4.2. The POSTGRES project, led by Pro- fessor Michael Stonebraker, has been sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foun- dation (NSF), and ESL, Inc. 11..11.. WWhhaatt iiss PPOOSSTTGGRREESS?? Traditional relational database management systems (DBMSs) support a data model consisting of a collection of named relations, containing attributes of a specific type. In current commercial systems, possible types include floating point numbers, integers, character strings, money, and dates. It is commonly recognized that this model is inadequate for future data process- ing applications. The relational model successfully replaced previous models in part because of its "Spartan simplicity". However, as mentioned, this simplicity often makes the implementation of certain applications very difficult to implement. POSTGRES offers substantial additional power by incorporating the following four additional basic constructs in such a way that users can easily extend the system: classes inheritance types functions In addition, POSTGRES supports a powerful production rule system. 11..22.. AA SShhoorrtt HHiissttoorryy ooff tthhee PPOOSSTTGGRREESS PPrroojjeecctt Implementation of the POSTGRES DBMS began in 1986. The initial concepts for the system were presented in [STON86] and the definition of the initial data model appeared in [ROWE87]. The design of the rule system at that time was described in [STON87a]. The rationale and architecture of the storage manager were detailed in [STON87b]. POSTGRES has undergone several major releases since then. The first "demoware" system became operational 22 in 1987 and was shown at the 1988 ACM-SIGMOD Confer- ence. We released Version 1, described in [STON90a], to a few external users in June 1989. In response to a critique of the first rule system [STON89], the rule system was redesigned [STON90b] and Version 2 was released in June 1990 with the new rule system. Ver- sion 3 appeared in 1991 and added support for multiple storage managers, an improved query executor, and a rewritten rewrite rule system. For the most part, releases since then have focused on portability and reliability. POSTGRES has been used to implement many different research and production applications. These include: a financial data analysis system, a jet engine perfor- mance monitoring package, an asteroid tracking database, a medical information database, and several geographic information systems. POSTGRES has also been used as an educational tool at several universities. Finally, Illustra Information Technologies picked up the code and commercialized it. POSTGRES became the primary data manager for the Sequoia 2000 scientific computing project in late 1992. Furthermore, the size of the external user community nearly doubled during 1993. It became increasingly obvious that maintenance of the prototype code and sup- port was taking up large amounts of time that should have been devoted to database research. In an effort to reduce this support burden, the project officially ended with Version 4.2. 11..33.. WWhhaatt iiss PPOOSSTTGGRREESS9955?? POSTGRES95 is a derivative of the last official release of POSTGRES (version 4.2). The code is now completely ANSI C and the code size has been trimmed by 25%. There are a lot of internal changes that improve performance and code maintainability. POSTGRES95 runs about 30-50% faster on the Wisconsin Benchmark compared to v4.2. Apart from bug fixes, these are the major enhancements: +o The query language POSTQUEL has been replaced with SQL (implemented in the server). We do not support subqueries (which can be imitated with user defined SQL functions) at the moment. Aggregates have been re-implemented. We also added support for GROUP BY. The libpq interface is still available for C pro- grams. +o In addition to the monitor program, we provide a new program (psql) which supports GNU readline. +o We added a new front-end library, libpgtcl, that supports Tcl-based clients. A sample shell, pgt- clsh, provides new Tcl commands to interface tcl programs with the POSTGRES95 backend. +o The large object interface has been overhauled. We kept Inversion large objects as the only mechanism 33 for storing large objects. (This is not to be con- fused with the Inversion file system which has been removed.) +o The instance-level rule system has been removed. Rules are still available as rewrite rules. +o A short tutorial introducing regular SQL features as well as those of ours is distributed with the source code. +o GNU make (instead of BSD make) is used for the build. Also, POSTGRES95 can be compiled with an unpatched gcc (data alignment of doubles has been fixed). 11..44.. AAbboouutt TThhiiss RReelleeaassee POSTGRES95 is available free of charge. This manual describes version 1.0 of POSTGRES95. The authors have compiled and tested POSTGRES95 on the following plat- forms: +---------------------+-----------+------------------------------------+ | architecture | processor | operating system | +---------------------+-----------+------------------------------------+ |DECstation 3000 | Alpha AXP | OSF/1 2.1, 3.0, 3.2 | |DECstation 5000 | MIPS | ULTRIX 4.4 | |Sun4 | SPARC | SunOS 4.1.3, 4.1.3_U1; Solaris 2.4 | |H-P 9000/700 and 800 | PA-RISC | HP-UX 9.00, 9.01, 9.03 | |Intel | X86 | Linux 1.2.8, ELF | +---------------------+-----------+------------------------------------+ 11..55.. OOuuttlliinnee ooff TThhiiss MMaannuuaall From now on, We will use POSTGRES to mean POSTGRES95. The first part of this manual goes over some basic sys- tem concepts and procedures for starting the POSTGRES system. We then turn to a tutorial overview of the POSTGRES data model and SQL query language, introducing a few of its advanced features. Next, we explain the POSTGRES approach to extensibility and describe how users can extend POSTGRES by adding user-defined types, operators, aggregates, and both query language and pro- gramming language functions. After an extremely brief overview of the POSTGRES rule system, the manual con- cludes with a detailed appendix that discusses some of the more involved and operating system-specific proce- dures involved in extending the system. ____________________ UNIX is a trademark of X/Open, Ltd. Sun4, SPARC, SunOS and Solaris are trademarks of Sun Microsystems, Inc. DEC, DECstation, Alpha AXP and ULTRIX are trademarks of Digital Equipment Corp. PA-RISC and HP-UX are trademarks of Hewlett-Packard Co. OSF/1 is a trademark of the Open Soft- ware Foundation. 44 We assume proficiency with UNIX and C programming. 55 ________________________________________________________________________________________________________________________ 22.. PPOOSSTTGGRREESS AARRCCHHIITTEECCTTUURREE CCOONNCCEEPPTTSS ________________________________________________________________________________________________________________________ Before we continue, you should understand the basic POSTGRES system architecture. Understanding how the parts of POSTGRES interact will make the next chapter somewhat clearer. In database jargon, POSTGRES uses a simple "process- per-user" client/server model. A POSTGRES session con- sists of the following cooperating UNIX processes (pro- grams): +o A supervisory daemon process (the ppoossttmmaasstteerr), +o the user's frontend application (e.g., the ppssqqll pro- gram), and +o the one or more backend database servers (the ppoosstt-- ggrreess process itself). A single ppoossttmmaasstteerr manages a given collection of _d_a_t_a_b_a_s_e_s on a single host. Such a collection of databases is called an _i_n_s_t_a_l_l_a_t_i_o_n or _s_i_t_e. Frontend applications that wish to access a given database within an installation make calls to the LIBPQ library. The library sends user requests over the network to the ppoossttmmaasstteerr (Figure 1(a)), which in turn starts a new backend server process (Figure 1(b)) and connects the frontend process to the new server (Figure 1(c)). From that point on, the frontend process and the backend server communicate without intervention by the ppoossttmmaass-- tteerr. Hence, the ppoossttmmaasstteerr is always running, waiting for requests, whereas frontend and backend processes come and go. The LIBPQ library allows a single fron- tend to make multiple connections to backend processes. However, the frontend application is still a single- threaded process. Multithreaded frontend/backend con- nections are not currently supported in LIBPQ. One implication of this architecture is that the ppoosstt-- mmaasstteerr and the backend always run on the same machine (the database server), while the frontend application may run anywhere. You should keep this in mind, ____________________________________________________________ FFiigguurree 11. How a connection is established. ____________________________________________________________ 66 because the files that can be accessed on a client machine may not be accessible (or may only be accessed using a different filename) on the database server machine. You should also be aware that the ppoossttmmaasstteerr and ppoosstt-- ggrreess servers run with the user-id of the POSTGRES "superuser." Note that the POSTGRES superuser does not have to be a special user (e.g., a user named "post- gres"). Furthermore, the POSTGRES superuser should definitely not be the UNIX superuser, "root"! In any case, all files relating to a database should belong to this POSTGRES superuser. 77 ________________________________________________________________________________________________________________________ 33.. GGEETTTTIINNGG SSTTAARRTTEEDD WWIITTHH PPOOSSTTGGRREESS ________________________________________________________________________________________________________________________ This section discusses how to start POSTGRES and set up your own environment so that you can use frontend applications. We assume POSTGRES has already been suc- cessfully installed. (Refer to the installation notes for how to install POSTGRES.) Some of the steps listed in this section will apply to all POSTGRES users, and some will apply primarily to the site database administrator. This _s_i_t_e _a_d_m_i_n_i_s_t_r_a_- _t_o_r is the person who installed the software, created the database directories and started the ppoossttmmaasstteerr process. This person does not have to be the UNIX superuser, "root," or the computer system administra- tor. In this section, items for end users are labelled "User" and items intended for the site administrator are labelled "Admin." Throughout this manual, any examples that begin with the character ``%%'' are commands that should be typed at the UNIX shell prompt. Examples that begin with the character ``**'' are commands in the POSTGRES query lan- guage, POSTGRES SQL. 33..11.. AAddmmiinn//UUsseerr:: SSeettttiinngg UUpp YYoouurr EEnnvviirroonnmmeenntt Figure 2 shows how the POSTGRES distribution is laid out when installed in the default way. For simplicity, we will assume that POSTGRES has been installed in the directory //uussrr//llooccaall//ppoossttggrreess9955. Therefore, wherever you see the directory //uussrr//llooccaall//ppoossttggrreess9955 you should substitute the name of the directory where POSTGRES is actually installed. All POSTGRES commands are installed in the directory //uussrr//llooccaall//ppoossttggrreess9955//bbiinn. Therefore, you should add this directory to your shell _c_o_m_m_a_n_d _p_a_t_h. If you use a variant of the Berkeley C shell, such as ccsshh or ttccsshh, ____________________________________________________________ FFiigguurree 22. POSTGRES file layout. ____________________________________________________________ 88 you would add %% sseett ppaatthh == (( //uussrr//llooccaall//ppoossttggrreess9955//bbiinn $$ppaatthh )) in the ..llooggiinn file in your home directory. If you use a variant of the Bourne shell, such as sshh, kksshh, or bbaasshh, then you would add %% PPAATTHH==//uussrr//llooccaall//ppoossttggrreess9955//bbiinn::$$PPAATTHH %% eexxppoorrtt PPAATTHH to the ..pprrooffiillee file in your home directory. From now on, we will assume that you have added the POSTGRES bbiinn directory to your path. In addition, we will make frequent reference to "setting a shell vari- able" or "setting an environment variable" throughout this document. If you did not fully understand the last paragraph on modifying your search path, you should consult the UNIX manual pages that describe your shell before going any further. 33..22.. AAddmmiinn:: SSttaarrttiinngg tthhee PPoossttmmaasstteerr It should be clear from the preceding discussion that nothing can happen to a database unless the ppoossttmmaasstteerr process is running. As the site administrator, there are a number of things you should remember before starting the ppoossttmmaasstteerr. These are discussed in the section of this manual titled, "Administering POST- GRES." However, if POSTGRES has been installed by fol- lowing the installation instructions exactly as writ- ten, the following simple command is all you should need to start the ppoossttmmaasstteerr: %% ppoossttmmaasstteerr && The ppoossttmmaasstteerr occasionally prints out messages which are often helpful during troubleshooting. If you wish to view debugging messages from the postmaster, you can start it with the -d option and redirect the output to the log file: %% ppoossttmmaasstteerr --dd >>&& ppmm..lloogg && If you do not wish to see these messages, you can type %% ppoossttmmaasstteerr --SS and the ppoossttmmaasstteerr will be "S"ilent. Notice that there is no ampersand ("&") at the end of the last example. 33..33.. AAddmmiinn:: AAddddiinngg aanndd DDeelleettiinngg UUsseerrss The ccrreeaatteeuusseerr command enables specific users to access POSTGRES. The ddeessttrrooyyuusseerr command removes users and 99 prevents them from accessing POSTGRES. Note that these commands only affect users with respect to POSTGRES; they have no effect administration of users that the operating system manages. 33..44.. UUsseerr:: SSttaarrttiinngg AApppplliiccaattiioonnss Assuming that your site administrator has properly started the ppoossttmmaasstteerr process and authorized you to use the database, you (as a user) may begin to start up applications. As previously mentioned, you should add //uussrr//llooccaall//ppoossttggrreess9955//bbiinn to your shell search path. In most cases, this is all you should have to do in terms of preparation.1 If you get the following error message from a POSTGRES command (such as ppssqqll or ccrreeaatteeddbb): ccoonnnneeccttDDBB(()) ffaaiilleedd:: IIss tthhee ppoossttmmaasstteerr rruunnnniinngg aatt ''llooccaallhhoosstt'' oonn ppoorrtt ''44332222''?? it is usually because (1) the ppoossttmmaasstteerr is not run- ning, or (2) you are attempting to connect to the wrong server host. If you get the following error message: FFAATTAALL 11::FFeebb 1177 2233::1199::5555::pprroocceessss uusseerriidd ((22336600)) !!== ddaattaabbaassee oowwnneerr ((226688)) it means that the site administrator started the ppoosstt-- mmaasstteerr as the wrong user. Tell him to restart it as the POSTGRES superuser. 33..55.. UUsseerr:: MMaannaaggiinngg aa DDaattaabbaassee Now that POSTGRES is up and running we can create some databases to experiment with. Here, we describe the basic commands for managing a database. 33..55..11.. CCrreeaattiinngg aa DDaattaabbaassee Let's say you want to create a database named mmyyddbb. You can do this with the following command: %% ccrreeaatteeddbb mmyyddbb ____________________ 1 If your site administrator has not set things up in the default way, you may have some more work to do. For exam- ple, if the database server machine is a remote machine, you will need to set the PPGGHHOOSSTT environment variable to the name of the database server machine. The environment variable PPGGPPOORRTT may also have to be set. The bottom line is this: if you try to start an application program and it complains that it cannot connect to the ppoossttmmaasstteerr, you should immedi- ately consult your site administrator to make sure that your environment is properly set up. 1100 POSTGRES allows you to create any number of databases at a given site and you automatically become the _d_a_t_a_b_a_s_e _a_d_m_i_n_i_s_t_r_a_t_o_r of the database you just cre- ated. Database names must have an alphabetic first character and are limited to 16 characters in length. Not every user has authorization to become a database administrator. If POSTGRES refuses to create databases for you, then the site administrator needs to grant you permission to create databases. Consult your site administrator if this occurs. 33..55..22.. AAcccceessssiinngg aa DDaattaabbaassee Once you have constructed a database, you can access it by: +o running the POSTGRES terminal monitor programs ( mmoonniittoorr or ppssqqll) which allows you to interactively enter, edit, and execute SQL commands. +o writing a C program using the LIBPQ subroutine library. This allows you to submit SQL commands from C and get answers and status messages back to your program. This interface is discussed further in section ??. You might want to start up ppssqqll, to try out the exam- ples in this manual. It can be activated for the mmyyddbb database by typing the command: %% ppssqqll mmyyddbb You will be greeted with the following message: WWeellccoommee ttoo tthhee PPOOSSTTGGRREESS9955 iinntteerraaccttiivvee ssqqll mmoonniittoorr:: ttyyppee \\?? ffoorr hheellpp oonn ssllaasshh ccoommmmaannddss ttyyppee \\qq ttoo qquuiitt ttyyppee \\gg oorr tteerrmmiinnaattee wwiitthh sseemmiiccoolloonn ttoo eexxeeccuuttee qquueerryy YYoouu aarree ccuurrrreennttllyy ccoonnnneecctteedd ttoo tthhee ddaattaabbaassee:: mmyyddbb mmyyddbb==>> This prompt indicates that the terminal monitor is lis- tening to you and that you can type SQL queries into a workspace maintained by the terminal monitor. The ppssqqll program responds to escape codes that begin with the backslash character, "\". For example, you can get help on the syntax of various POSTGRES SQL com- mands by typing: mmyyddbb==>> \\hh Once you have finished entering your queries into the workspace, you can pass the contents of the workspace to the POSTGRES server by typing: 1111 mmyyddbb==>> \\gg This tells the server to process the query. If you terminate your query with a semicolon, the \g is not necessary. Psql will automatically process semicolon- terminated queries. To read queries from a file, say mmyyFFiillee,, instead of entering them interactively, type: mmyyddbb==>> \\ii ffiilleeNNaammee To get out of psql and return to UNIX, type mmyyddbb==>> \\qq and ppssqqll will quit and return you to your command shell. (For more escape codes, type \\hh at the mmoonniittoorr prompt.) White space (i.e., spaces, tabs and newlines) may be used freely in SQL queries. Comments are denoted by ----. Everything after the dashes up to the end of the line is ignored. 33..55..33.. DDeessttrrooyyiinngg aa DDaattaabbaassee If you are the database administrator for the database mmyyddbb, you can destroy it using the following UNIX com- mand: %% ddeessttrrooyyddbb mmyyddbb This action physically removes all of the UNIX files associated with the database and cannot be undone, so this should only be done with a great deal of fore- thought. 1122 ________________________________________________________________________________________________________________________ 44.. TTHHEE QQUUEERRYY LLAANNGGUUAAGGEE ________________________________________________________________________________________________________________________ The POSTGRES query language is a variant of SQL-3. It has many extensions such as an extensible type system, inheritance, functions and production rules. Those are features carried over from the original POSTGRES query language, POSTQUEL. This section provides an overview of how to use POSTGRES SQL to perform simple opera- tions. This manual is only intended to give you an idea of our flavor of SQL and is in no way a complete tutorial on SQL. Numerous books have been written on SQL. For instance, consult [MELT93] or [DATE93]. You should also be aware that some features are not part of the ANSI standard. In the examples that follow, we assume that you have created the mmyyddbb database as described in the previous subsection and have started psql. Examples in this manual can also be found in //uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//ttuuttoorriiaall. Refer to the README file in that directory for how to use them. To start the tutorial, do the following: %% ccdd //uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//ttuuttoorriiaall %% ppssqqll --ss mmyyddbb WWeellccoommee ttoo tthhee PPOOSSTTGGRREESS9955 iinntteerraaccttiivvee ssqqll mmoonniittoorr:: ttyyppee \\?? ffoorr hheellpp oonn ssllaasshh ccoommmmaannddss ttyyppee \\qq ttoo qquuiitt ttyyppee \\gg oorr tteerrmmiinnaattee wwiitthh sseemmiiccoolloonn ttoo eexxeeccuuttee qquueerryy YYoouu aarree ccuurrrreennttllyy ccoonnnneecctteedd ttoo tthhee ddaattaabbaassee:: jjoollllyy mmyyddbb==>> \\ii bbaassiiccss..ssqqll The \\ii command read in queries from the specified files. The --ss option puts you in single step mode which pauses before sending a query to the backend. Queries in this section are in the file bbaassiiccss..ssqqll. 44..11.. CCoonncceeppttss The fundamental notion in POSTGRES is that of a _c_l_a_s_s_, which is a named collection of object _i_n_s_t_a_n_c_e_s. Each instance has the same collection of named _a_t_t_r_i_b_u_t_e_s, and each attribute is of a specific _t_y_p_e. Furthermore, each instance has a permanent _o_b_j_e_c_t _i_d_e_n_t_i_f_i_e_r (OID) that is unique throughout the installation. Because 1133 SQL syntax refers to _t_a_b_l_e_s_, we will use the terms _t_a_b_l_e and _c_l_a_s_s interchangeably. Likewise, a _r_o_w is an _i_n_s_t_a_n_c_e and _c_o_l_u_m_n_s are _a_t_t_r_i_b_u_t_e_s_. As previously discussed, classes are grouped into databases, and a collection of databases managed by a single ppoossttmmaasstteerr process constitutes an installation or site. 44..22.. CCrreeaattiinngg aa NNeeww CCllaassss You can create a new class by specifying the class name, along with all attribute names and their types: CCRREEAATTEE TTAABBLLEE wweeaatthheerr (( cciittyy vvaarrcchhaarr((8800)),, tteemmpp__lloo iinntt,, ---- llooww tteemmppeerraattuurree tteemmpp__hhii iinntt,, ---- hhiigghh tteemmppeerraattuurree pprrccpp rreeaall,, ---- pprreecciippiittaattiioonn ddaattee ddaattee ));; Note that keywords are case-insensitive but identifiers are case-sensitive. POSTGRES SQL supports the usual SQL types int, float, real, smallint, char(N), var- char(N), date, and time. As we will see later, POST- GRES can be customized with an arbitrary number of user-defined data types. Consequently, type names are not keywords. So far, the POSTGRES ccrreeaattee command looks exactly like the command used to create a table in a traditional relational system. However, we will presently see that classes have properties that are extensions of the relational model. 44..33.. PPooppuullaattiinngg aa CCllaassss wwiitthh IInnssttaanncceess The iinnsseerrtt statement is used to populate a class with instances: IINNSSEERRTT IINNTTOO wweeaatthheerr VVAALLUUEESS ((''SSaann FFrraanncciissccoo'',, 4466,, 5500,, 00..2255,, ''1111//2277//11999944'')) You can also use the ccooppyy command to perform load large amounts of data from flat (ASCII) files. 44..44.. QQuueerryyiinngg aa CCllaassss The wweeaatthheerr class can be queried with normal relational selection and projection queries. A SQL sseelleecctt state- ment is used to do this. The statement is divided into a _t_a_r_g_e_t _l_i_s_t (the part that lists the attributes to be returned) and a _q_u_a_l_i_f_i_c_a_t_i_o_n (the part that specifies any restrictions). For example, to retrieve all the rows of wweeaatthheerr, type: SSEELLEECCTT ** FFRROOMM WWEEAATTHHEERR;; 1144 and the output should be: +--------------+---------+---------+------+------------+ |cciittyy | tteemmpp__lloo | tteemmpp__hhii | pprrccpp | ddaattee | +--------------+---------+---------+------+------------+ |SSaann FFrraanncciissccoo | 4466 | 5500 | 00..2255 | 1111--2277--11999944 | +--------------+---------+---------+------+------------+ |SSaann FFrraanncciissccoo | 4433 | 5577 | 00 | 1111--2299--11999944 | +--------------+---------+---------+------+------------+ |HHaayywwaarrdd | 3377 | 5544 | | 1111--2299--11999944 | +--------------+---------+---------+------+------------+ You may specify any aribitrary expressions in the tar- get list. For example, you can do: ** SSEELLEECCTT cciittyy,, ((tteemmpp__hhii++tteemmpp__lloo))//22 AASS tteemmpp__aavvgg,, ddaattee FFRROOMM wweeaatthheerr;; Arbitrary Boolean operators ( aanndd, oorr and nnoott) are allowed in the qualification of any query. For exam- ple, SSEELLEECCTT ** FFRROOMM wweeaatthheerr WWHHEERREE cciittyy == ''SSaann FFrraanncciissccoo'' aanndd pprrccpp >> 00..00;; +--------------+---------+---------+------+------------+ |cciittyy | tteemmpp__lloo | tteemmpp__hhii | pprrccpp | ddaattee | +--------------+---------+---------+------+------------+ |SSaann FFrraanncciissccoo | 4466 | 5500 | 00..2255 | 1111--2277--11999944 | +--------------+---------+---------+------+------------+ As a final note, you can specify that the results of a sseelleecctt can be returned in a sorted order or with dupli- cate instances removed. SSEELLEECCTT DDIISSTTIINNCCTT cciittyy FFRROOMM wweeaatthheerr OORRDDEERR BBYY cciittyy;; 44..55.. RReeddiirreeccttiinngg SSEELLEECCTT QQuueerriieess Any sseelleecctt query can be redirected to a new class SSEELLEECCTT ** IINNTTOO tteemmpp ffrroomm wweeaatthheerr;; This creates an implicit ccrreeaattee command, creating a new class tteemmpp with the attribute names and types specified in the target list of the SSEELLEECCTT IINNTTOO command. We can then, of course, perform any operations on the result- ing class that we can perform on other classes. 1155 44..66.. JJooiinnss BBeettwweeeenn CCllaasssseess Thus far, our queries have only accessed one class at a time. Queries can access multiple classes at once, or access the same class in such a way that multiple instances of the class are being processed at the same time. A query that accesses multiple instances of the same or different classes at one time is called a _j_o_i_n _q_u_e_r_y. As an example, say we wish to find all the records that are in the temperature range of other records. In effect, we need to compare the tteemmpp__lloo and tteemmpp__hhii attributes of each EEMMPP instance to the tteemmpp__lloo and tteemmpp__hhii attributes of all other EEMMPP instances.2 We can do this with the following query: SSEELLEECCTT WW11..cciittyy,, WW11..tteemmpp__lloo,, WW11..tteemmpp__hhii,, WW22..cciittyy,, WW22..tteemmpp__lloo,, WW22..tteemmpp__hhii FFRROOMM wweeaatthheerr WW11,, wweeaatthheerr WW22 WWHHEERREE WW11..tteemmpp__lloo << WW22..tteemmpp__lloo aanndd WW11..tteemmpp__hhii >> WW22..tteemmpp__hhii;; +--------------+---------+---------+---------------+---------+---------+ |cciittyy | tteemmpp__lloo | tteemmpp__hhii | cciittyy | tteemmpp__lloo | tteemmpp__hhii | +--------------+---------+---------+---------------+---------+---------+ |SSaann FFrraanncciissccoo | 4433 | 5577 | SSaann FFrraanncciissccoo | 4466 | 5500 | +--------------+---------+---------+---------------+---------+---------+ |SSaann FFrraanncciissccoo | 3377 | 5544 | SSaann FFrraanncciissccoo | 4466 | 5500 | +--------------+---------+---------+---------------+---------+---------+ In this case, both WW11 and WW22 are _s_u_r_r_o_g_a_t_e_s for an instance of the class wweeaatthheerr, and both range over all instances of the class. (In the terminology of most database systems, WW11 and WW22 are known as "range vari- ables.") A query can contain an arbitrary number of class names and surrogates.3 ____________________ 2 This is only a conceptual model. The actual join may be performed in a more efficient manner, but this is invisi- ble to the user. 3 The semantics of such a join are that the qualification is a truth expression defined for the Cartesian product of the classes indicated in the query. For those instances in the Cartesian product for which the qualification is true, POSTGRES computes and returns the values specified in the target list. POSTGRES SQL does not assign any meaning to duplicate values in such expressions. This means that POST- GRES sometimes recomputes the same target list several times -- this frequently happens when Boolean expressions are con- nected with an oorr. To remove such duplicates, you must use the sseelleecctt ddiissttiinncctt statement. 1166 44..77.. UUppddaatteess You can update existing instances using the uuppddaattee com- mand. Suppose you discover the temperature readings are all off by 2 degrees as of Nov 28, you may update the data as follow: ** UUPPDDAATTEE wweeaatthheerr SSEETT tteemmpp__hhii == tteemmpp__hhii -- 22,, tteemmpp__lloo == tteemmpp__lloo -- 22 WWHHEERREE ddaattee >> ''1111//2288//11999944;; 44..88.. DDeelleettiioonnss Deletions are performed using the ddeelleettee command: ** DDEELLEETTEE FFRROOMM wweeaatthheerr WWHHEERREE cciittyy == ''HHaayywwaarrdd'';; All weather recording belongs to Hayward is removed. One should be wary of queries of the form DDEELLEETTEE FFRROOMM _c_l_a_s_s_n_a_m_e_; Without a qualification, the ddeelleettee command will simply delete all instances of the given class, leaving it empty. The system wwiillll nnoott rreeqquueesstt ccoonnffiirrmmaattiioonn before doing this. 44..99.. UUssiinngg AAggggrreeggaattee FFuunnccttiioonnss Like most other query languages, POSTGRES supports aggregate functions. However, the current implementa- tion of POSTGRES aggregate functions is very limited. Specifically, while there are aggregates to compute such functions as the count, sum, average, maximum and minimum over a set of instances, aggregates can only appear in the target list of a query and not in the qualification ( wwhheerree clause) As an example, SSEELLEECCTT mmaaxx((tteemmpp__lloo)) FFRROOMM wweeaatthheerr;; Aggregates may also have GROUP BY clauses: SSEELLEECCTT cciittyy,, mmaaxx((tteemmpp__lloo)) FFRROOMM wweeaatthheerr GGRROOUUPP BBYY cciittyy;; 1177 ________________________________________________________________________________________________________________________ 55.. AADDVVAANNCCEEDD PPOOSSTTGGRREESS SSQQLL FFEEAATTUURREESS ________________________________________________________________________________________________________________________ Having covered the basics of using POSTGRES SQL to access your data, we will now discuss those features of POSTGRES that distinguish it from conventional data managers. These features include inheritance, time travel and non-atomic data values (array- and set- valued attributes). Examples in this section can also be found in aaddvvaannccee..ssqqll in the tutorial directory. (Refer to the introduction of the previous chapter for how to use it.) 55..11.. IInnhheerriittaannccee Let's create two classes. The ccaappiittaallss class contains state capitals which are also cities. Naturally, the ccaappiittaallss class should _i_n_h_e_r_i_t from cciittiieess. CCRREEAATTEE TTAABBLLEE cciittiieess (( nnaammee tteexxtt,, ppooppuullaattiioonn ffllooaatt,, aallttiittuuddee iinntt ---- ((iinn fftt)) ));; CCRREEAATTEE TTAABBLLEE ccaappiittaallss (( ssttaattee cchhaarr22 )) IINNHHEERRIITTSS ((cciittiieess));; In this case, an instance of ccaappiittaallss _i_n_h_e_r_i_t_s all attributes (nnaammee, ppooppuullaattiioonn, and aallttiittuuddee) from its parent, cciittiieess. The type of the attribute nnaammee is tteexxtt,, a built-in POSTGRES type for variable length ASCII strings. The type of the attribute ppooppuullaattiioonn is ffllooaatt44,, a built-in POSTGRES type for double precision floating point numbres. State capitals have an extra attribute, ssttaattee, that shows their state. In POSTGRES, a class can inherit from zero or more other classes,4 and a query can reference either all instances of a class or all instances of a class plus all of its descendants. For example, the following query finds all the cities that are situated at an attitude of 500 'ft or higher: ____________________ 4 I.e., the inheritance hierarchy is a directed acyclic graph. 1188 SSEELLEECCTT nnaammee,, aallttiittuuddee FFRROOMM cciittiieess WWHHEERREE aallttiittuuddee >> 550000;; +----------+----------+ |nnaammee | aallttiittuuddee | +----------+----------+ |LLaass VVeeggaass | 22117744 | +----------+----------+ |MMaarriippoossaa | 11995533 | +----------+----------+ On the other hand, to find the names of all cities, including state capitals, that are located at an alti- tude over 500 'ft, the query is: SSEELLEECCTT cc..nnaammee,, cc..aallttiittuuddee FFRROOMM cciittiieess** cc WWHHEERREE cc..aallttiittuuddee >> 550000;; which returns: +----------+----------+ |nnaammee | aallttiittuuddee | +----------+----------+ |LLaass VVeeggaass | 22117744 | +----------+----------+ |MMaarriippoossaa | 11995533 | +----------+----------+ |MMaaddiissoonn | 884455 | +----------+----------+ Here the ** after cciittiieess indicates that the query should be run over cciittiieess and all classes below cciittiieess in the inheritance hierarchy. Many of the commands that we have already discussed -- sseelleecctt, uuppddaattee and ddeelleettee -- support this ** notation, as do others, like aalltteerr com- mand. 55..22.. TTiimmee TTrraavveell POSTGRES supports the notion of _t_i_m_e _t_r_a_v_e_l. This fea- ture allows a user to run historical queries. For example, to find the current population of Mariposa city, one would query: SSEELLEECCTT ** FFRROOMM cciittiieess WWHHEERREE nnaammee == ''MMaarriippoossaa'';; 1199 +---------+------------+----------+ |nnaammee | ppooppuullaattiioonn | aallttiittuuddee | +---------+------------+----------+ |MMaarriippoossaa | 11332200 | 11995533 | +---------+------------+----------+ POSTGRES will automatically find the version of Mari- posa's record valid at the current time. One can also give a time _r_a_n_g_e. For example to see the past and present populations of Mariposa, one would query: SSEELLEECCTT nnaammee,, ppooppuullaattiioonn FFRROOMM cciittiieess[[''eeppoocchh'',, ''nnooww'']] WWHHEERREE nnaammee == ''MMaarriippoossaa'';; where "epoch" indicates the beginning of the system clock.5 If you have executed all of the examples so far, then the above query returns: +---------+------------+ |nnaammee | ppooppuullaattiioonn | +---------+------------+ |MMaarriippoossaa | 11220000 | +---------+------------+ |MMaarriippoossaa | 11332200 | +---------+------------+ The default beginning of a time range is the earliest time representable by the system and the default end is the current time; thus, the above time range can be abbreviated as ``[[,,]].'' 55..33.. NNoonn--AAttoommiicc VVaalluueess One of the tenets of the relational model is that the attributes of a relation are _a_t_o_m_i_c. POSTGRES does not have this restriction; attributes can themselves con- tain sub-values that can be accessed from the query language. For example, you can create attributes that are _a_r_r_a_y_s of base types. 55..33..11.. AArrrraayyss POSTGRES allows attributes of an instance to be defined as fixed-length or variable-length multi-dimensional arrays. Arrays of any base type or user-defined type can be created. To illustrate their use, we first cre- ate a class with arrays of base types. ** CCRREEAATTEE TTAABBLLEE SSAALL__EEMMPP (( nnaammee tteexxtt,, ____________________ 5 On UNIX systems, this is always midnight, January 1, 1970 GMT. 2200 ppaayy__bbyy__qquuaarrtteerr iinntt44[[]],, sscchheedduullee cchhaarr1166[[]][[]] ));; The above query will create a class named SSAALL__EEMMPP with a tteexxtt string (nnaammee), a one-dimensional array of iinntt44 (ppaayy__bbyy__qquuaarrtteerr), which represents the employee's salary by quarter and a two-dimensional array of cchhaarr1166 (sscchheedduullee), which represents the employee's weekly schedule. Now we do some IINNSSEERRTTSSs; note that when appending to an array, we enclose the values within braces and separate them by commas. If you know C, this is not unlike the syntax for initializing struc- tures. IINNSSEERRTT IINNTTOO SSAALL__EEMMPP VVAALLUUEESS ((''BBiillll'',, ''{{1100000000,, 1100000000,, 1100000000,, 1100000000}}'',, ''{{{{""mmeeeettiinngg"",, ""lluunncchh""}},, {{}}}}''));; IINNSSEERRTT IINNTTOO SSAALL__EEMMPP VVAALLUUEESS ((''CCaarrooll'',, ''{{2200000000,, 2255000000,, 2255000000,, 2255000000}}'',, ''{{{{""ttaallkk"",, ""ccoonnssuulltt""}},, {{""mmeeeettiinngg""}}}}''));; By default, POSTGRES uses the "one-based" numbering convention for arrays -- that is, an array of _n ele- ments starts with array[1] and ends with array[_n]. Now, we can run some queries on SSAALL__EEMMPP. First, we show how to access a single element of an array at a time. This query retrieves the names of the employees whose pay changed in the second quarter: ** SSEELLEECCTT nnaammee FFRROOMM SSAALL__EEMMPP WWHHEERREE SSAALL__EEMMPP..ppaayy__bbyy__qquuaarrtteerr[[11]] <<>> SSAALL__EEMMPP..ppaayy__bbyy__qquuaarrtteerr[[22]];; +------+ |nnaammee | +------+ |CCaarrooll | +------+ This query retrieves the third quarter pay of all employees: ** SSEELLEECCTT SSAALL__EEMMPP..ppaayy__bbyy__qquuaarrtteerr[[33]] FFRROOMM SSAALL__EEMMPP;; 2211 +---------------+ |ppaayy__bbyy__qquuaarrtteerr | +---------------+ |1100000000 | +---------------+ |2255000000 | +---------------+ We can also access arbitrary _s_l_i_c_e_s of an array, or _s_u_b_a_r_r_a_y_s. This query retrieves the first item on Bill's schedule for the first two days of the week. ** SSEELLEECCTT SSAALL__EEMMPP..sscchheedduullee[[11::22]][[11::11]] FFRROOMM SSAALL__EEMMPP WWHHEERREE SSAALL__EEMMPP..nnaammee == ''BBiillll'';; +-------------------+ |sscchheedduullee | +-------------------+ |{{{{""mmeeeettiinngg""}},,{{""""}}}} | +-------------------+ 2222 ________________________________________________________________________________________________________________________ 66.. EEXXTTEENNDDIINNGG SSQQLL:: AANN OOVVEERRVVIIEEWW ________________________________________________________________________________________________________________________ In the sections that follow, we will discuss how you can extend the POSTGRES SQL query language by adding: +o functions +o types +o operators +o aggregates 66..11.. HHooww EExxtteennssiibbiilliittyy WWoorrkkss POSTGRES is extensible because its operation is _c_a_t_a_- _l_o_g_-_d_r_i_v_e_n. If you are familiar with standard rela- tional systems, you know that they store information about databases, tables, columns, etc., in what are commonly known as _s_y_s_t_e_m _c_a_t_a_l_o_g_s. (Some systems call this the _d_a_t_a _d_i_c_t_i_o_n_a_r_y). The catalogs appear to the user as classes, like any other, but the DBMS stores its internal bookkeeping in them. One key difference between POSTGRES and standard relational systems is that POSTGRES stores much more information in its cata- logs -- not only information about tables and columns, but also information about its types, functions, access methods, and so on. These classes can be modified by the user, and since POSTGRES bases its internal opera- tion on these classes, this means that POSTGRES can be extended by users. By comparison, conventional database systems can only be extended by changing hard- coded procedures within the DBMS or by loading modules specially-written by the DBMS vendor. POSTGRES is also unlike most other data managers in that the server can incorporate user-written code into itself through _d_y_n_a_m_i_c _l_o_a_d_i_n_g. That is, the user can specify an object code file (e.g., a compiled ..oo file or shared library) that implements a new type or func- tion and POSTGRES will load it as required. Code writ- ten in SQL are even more trivial to add to the server. This ability to modify its operation "on the fly" makes POSTGRES uniquely suited for rapid prototyping of new applications and storage structures. 66..22.. TThhee PPOOSSTTGGRREESS TTyyppee SSyysstteemm The POSTGRES type system can be broken down in several ways. Types are divided into _b_a_s_e types and _c_o_m_p_o_s_i_t_e types. Base types are those, like iinntt44, that are implemented in a language such as C. They generally correspond to what are often known as "abstract data types"; POSTGRES 2233 can only operate on such types through methods provided by the user and only understands the behavior of such types to the extent that the user describes them. Com- posite types are created whenever the user creates a class. EEMMPP is an example of a composite type. POST- GRES stores these types in only one way (within the file that stores all instances of the class) but the user can "look inside" at the attributes of these types from the query language and optimize their retrieval by (for example) defining indices on the attributes. POSTGRES base types are further divided into _b_u_i_l_t_-_i_n types and _u_s_e_r_-_d_e_f_i_n_e_d types. Built-in types (like iinntt44) are those that are compiled into the system. User-defined types are those created by the user in the manner to be described below. 66..33.. AAbboouutt tthhee PPOOSSTTGGRREESS SSyysstteemm CCaattaallooggss Having introduced the basic extensibility concepts, we can now take a look at how the catalogs are actually laid out. You can skip this section for now, but some later sections will be incomprehensible without the information given here, so mark this page for later reference. All system catalogs have names that begin with ppgg__. The following classes contain information that may be useful to the end user. (There are many other system catalogs, but there should rarely be a reason to query them directly.) +-------------+------------------------------------+ |catalog name | description | +-------------+------------------------------------+ |ppgg__ddaattaabbaassee | databases | |ppgg__ccllaassss | classes | |ppgg__aattttrriibbuuttee | class attributes | |ppgg__iinnddeexx | secondary indices | | | | |ppgg__pprroocc | procedures (both C and SQL) | |ppgg__ttyyppee | types (both base and complex) | |ppgg__ooppeerraattoorr | operators | |ppgg__aaggggrreeggaattee | aggregates and aggregate functions | | | | |ppgg__aamm | access methods | |ppgg__aammoopp | access method operators | |ppgg__aammpprroocc | access method support functions | |ppgg__ooppccllaassss | access method operator classes | +-------------+------------------------------------+ The Reference Manual gives a more detailed explanation of these catalogs and their attributes. However, Fig- ure 3 shows the major entities and their relationships in the system catalogs. (Attributes that do not refer to other entities are not shown unless they are part of 2244 ____________________________________________________________ FFiigguurree 33. The major POSTGRES system catalogs. ____________________________________________________________ a primary key.) This diagram is more or less incomprehensible until you actually start looking at the contents of the catalogs and see how they relate to each other. For now, the main things to take away from this diagram are as fol- lows: (1) In several of the sections that follow, we will present various join queries on the system cata- logs that display information we need to extend the system. Looking at this diagram should make some of these join queries (which are often three- or four-way joins) more understandable, because you will be able to see that the attributes used in the queries form foreign keys in other classes. (2) Many different features (classes, attributes, functions, types, access methods, etc.) are tightly integrated in this schema. A simple ccrreeaattee command may modify many of these cata- logs. (3) Types and procedures6 are central to the schema. Nearly every catalog contains some reference to instances in one or both of these classes. For example, POSTGRES frequently uses type signa- tures (e.g., of functions and operators) to identify unique instances of other catalogs. (4) There are many attributes and relationships that have obvious meanings, but there are many (par- ticularly those that have to do with access methods) that do not. The relationships between ppgg__aamm, ppgg__aammoopp, ppgg__aammpprroocc,, ppgg__ooppeerraattoorr and ppgg__ooppccllaassss are particularly hard to understand and will be described in depth (in the section on interfacing types and operators to indices) after we have discussed basic extensions. ____________________ 6 We use the words _p_r_o_c_e_d_u_r_e and _f_u_n_c_t_i_o_n more or less interchangably. 2255 ________________________________________________________________________________________________________________________ 77.. EEXXTTEENNDDIINNGG SSQQLL:: FFUUNNCCTTIIOONNSS ________________________________________________________________________________________________________________________ As it turns out, part of defining a new type is the definition of functions that describe its behavior. Consequently, while it is possible to define a new function without defining a new type, the reverse is not true. We therefore describe how to add new func- tions to POSTGRES before describing how to add new types. POSTGRES SQL provides two types of functions: _q_u_e_r_y _l_a_n_g_u_a_g_e _f_u_n_c_t_i_o_n_s (functions written in SQL and _p_r_o_- _g_r_a_m_m_i_n_g _l_a_n_g_u_a_g_e _f_u_n_c_t_i_o_n_s (functions written in a compiled programming language such as C.) Either kind of function can take a base type, a composite type or some combination as arguments (parameters). In addi- tion, both kinds of functions can return a base type or a composite type. It's easier to define SQL functions, so we'll start with those. Examples in this section can also be found in ffuunnccss..ssqqll and CC--ccooddee//ffuunnccss..cc. 77..11.. QQuueerryy LLaanngguuaaggee ((SSQQLL)) FFuunnccttiioonnss ________________________________________________________________________________________________________________________ 77..11..11.. SSQQLL FFuunnccttiioonnss oonn BBaassee TTyyppeess The simplest possible SQL function has no arguments and simply returns a base type, such as iinntt44: CCRREEAATTEE FFUUNNCCTTIIOONN oonnee(()) RREETTUURRNNSS iinntt44 AASS ''SSEELLEECCTT 11 aass RREESSUULLTT'' LLAANNGGUUAAGGEE ''ssqqll'';; SSEELLEECCTT oonnee(()) AASS aannsswweerr;; +-------+ |aannsswweerr | +-------+ |11 | +-------+ Notice that we defined a target list for the function (with the name RREESSUULLTT), but the target list of the query that invoked the function overrode the function's target list. Hence, the result is labelled aannsswweerr instead of oonnee. 2266 It's almost as easy to define SQL functions that take base types as arguments. In the example below, notice how we refer to the arguments within the function as $$11 and $$22. CCRREEAATTEE FFUUNNCCTTIIOONN aadddd__eemm((iinntt44,, iinntt44)) RREETTUURRNNSS iinntt44 AASS ''SSEELLEECCTT $$11 ++ $$22;;'' LLAANNGGUUAAGGEE ''ssqqll'';; SSEELLEECCTT aadddd__eemm((11,, 22)) AASS aannsswweerr;; +-------+ |aannsswweerr | +-------+ |33 | +-------+ 77..11..22.. SSQQLL FFuunnccttiioonnss oonn CCoommppoossiittee TTyyppeess When specifying functions with arguments of composite types (such as EEMMPP), we must not only specify which argument we want (as we did above with $$11 and $$22) but also the attributes of that argument. For example, take the function ddoouubbllee__ssaallaarryy that computes what your salary would be if it were doubled. CCRREEAATTEE FFUUNNCCTTIIOONN ddoouubbllee__ssaallaarryy((EEMMPP)) RREETTUURRNNSS iinntt44 AASS ''SSEELLEECCTT $$11..ssaallaarryy ** 22 AASS ssaallaarryy;;'' LLAANNGGUUAAGGEE ''ssqqll'';; SSEELLEECCTT nnaammee,, ddoouubbllee__ssaallaarryy((EEMMPP)) AASS ddrreeaamm FFRROOMM EEMMPP WWHHEERREE EEMMPP..ddeepptt == ''ttooyy'';; +-----+-------+ |nnaammee | ddrreeaamm | +-----+-------+ |SSaamm | 22440000 | +-----+-------+ Notice the use of the syntax $$11..ssaallaarryy. Before launching into the subject of functions that return composite types, we must first introduce the _f_u_n_c_t_i_o_n notation for projecting attributes. The sim- ple way to explain this is that we can usually use the notation aattttrriibbuuttee((ccllaassss)) and ccllaassss..aattttrriibbuuttee inter- changably. ---- ---- tthhiiss iiss tthhee ssaammee aass:: ---- SSEELLEECCTT EEMMPP..nnaammee AASS yyoouunnggsstteerr FFRROOMM EEMMPP WWHHEERREE EEMMPP..aaggee << 3300 ---- SSEELLEECCTT nnaammee((EEMMPP)) AASS yyoouunnggsstteerr 2277 FFRROOMM EEMMPP WWHHEERREE aaggee((EEMMPP)) << 3300;; +----------+ |yyoouunnggsstteerr | +----------+ |SSaamm | +----------+ As we shall see, however, this is not always the case. This function notation is important when we want to use a function that returns a single instance. We do this by assembling the entire instance within the function, attribute by attribute. This is an example of a func- tion that returns a single EEMMPP instance: CCRREEAATTEE FFUUNNCCTTIIOONN nneeww__eemmpp(()) RREETTUURRNNSS EEMMPP AASS ''SSEELLEECCTT \\''NNoonnee\\''::::tteexxtt AASS nnaammee,, 11000000 AASS ssaallaarryy,, 2255 AASS aaggee,, \\''nnoonnee\\''::::cchhaarr1166 AASS ddeepptt;;'' LLAANNGGUUAAGGEE ''ssqqll'';; In this case we have specified each of the attributes with a constant value, but any computation or expres- sion could have been substituted for these constants. Defining a function like this can be tricky. Some of the more important caveats are as follows: +o The target list order must be eexxaaccttllyy the same as that in which the attributes appear in the CCRREEAATTEE TTAABBLLEE statement (or when you execute a ..** query). +o You must be careful to typecast the expressions (using ::::) very carefully or you will see the fol- lowing error: WWAARRNN::::ffuunnccttiioonn ddeeccllaarreedd ttoo rreettuurrnn ttyyppee EEMMPP ddooeess nnoott rreettrriieevvee ((EEMMPP..**)) +o When calling a function that returns an instance, we cannot retrieve the entire instance. We must either project an attribute out of the instance or pass the entire instance into another function. SSEELLEECCTT nnaammee((nneeww__eemmpp(()))) AASS nnoobbooddyy;; +-------+ |nnoobbooddyy | +-------+ |NNoonnee | +-------+ +o The reason why, in general, we must use the function syntax for projecting attributes of function return 2288 values is that the parser just doesn't understand the other (dot) syntax for projection when combined with function calls. SSEELLEECCTT nneeww__eemmpp(())..nnaammee AASS nnoobbooddyy;; WWAARRNN::ppaarrsseerr:: ssyynnttaaxx eerrrroorr aatt oorr nneeaarr "".."" Any collection of commands in the SQL query language can be packaged together and defined as a function. The commands can include updates (i.e., iinnsseerrtt, uuppddaattee and ddeelleettee) as well as sseelleecctt queries. However, the final command must be a sseelleecctt that returns whatever is specified as the function's rreettuurrnnttyyppee. CCRREEAATTEE FFUUNNCCTTIIOONN cclleeaann__EEMMPP (()) RREETTUURRNNSS iinntt44 AASS ''DDEELLEETTEE FFRROOMM EEMMPP WWHHEERREE EEMMPP..ssaallaarryy <<== 00;; SSEELLEECCTT 11 AASS iiggnnoorree__tthhiiss'' LLAANNGGUUAAGGEE ''ssqqll'';; SSEELLEECCTT cclleeaann__EEMMPP(());; +--+ |xx | +--+ |11 | +--+ 77..22.. PPrrooggrraammmmiinngg LLaanngguuaaggee FFuunnccttiioonnss ________________________________________________________________________________________________________________________ 77..22..11.. PPrrooggrraammmmiinngg LLaanngguuaaggee FFuunnccttiioonnss oonn BBaassee TTyyppeess Internally, POSTGRES regards a base type as a "blob of memory." The user-defined functions that you define over a type in turn define the way that POSTGRES can operate on it. That is, POSTGRES will only store and retrieve the data from disk and use your user-defined functions to input, process, and output the data. Base types can have one of three internal formats: +o pass by value, fixed-length +o pass by reference, fixed-length +o pass by reference, variable-length By-value types can only be 1, 2 or 4 bytes in length (even if your computer supports by-value types of other sizes). POSTGRES itself only passes integer types by value. You should be careful to define your types such that they will be the same size (in bytes) on all architectures. For example, the lloonngg type is dangerous because it is 4 bytes on some machines and 8 bytes on others, whereas iinntt type is 4 bytes on most UNIX machines (though not on most personal computers). A 2299 reasonable implementation of the iinntt44 type on UNIX machines might be: //** 44--bbyyttee iinntteeggeerr,, ppaasssseedd bbyy vvaalluuee **// ttyyppeeddeeff iinntt iinntt44;; On the other hand, fixed-length types of any size may be passed by-reference. For example, here is a sample implementation of the POSTGRES cchhaarr1166 type: //** 1166--bbyyttee ssttrruuccttuurree,, ppaasssseedd bbyy rreeffeerreennccee **// ttyyppeeddeeff ssttrruucctt {{ cchhaarr ddaattaa[[1166]];; }} cchhaarr1166;; Only pointers to such types can be used when passing them in and out of POSTGRES functions. Finally, all variable-length types must also be passed by reference. All variable-length types must begin with a length field of exactly 4 bytes, and all data to be stored within that type must be located in the mem- ory immediately following that length field. The length field is the total length of the structure (i.e., it includes the size of the length field itself). We can define the tteexxtt type as follows: ttyyppeeddeeff ssttrruucctt {{ iinntt44 lleennggtthh;; cchhaarr ddaattaa[[11]];; }} tteexxtt;; Obviously, the ddaattaa field is not long enough to hold all possible strings -- it's impossible to declare such a structure in C. When manipulating variable-length types, we must be careful to allocate the correct amount of memory and initialize the length field. For example, if we wanted to store 40 bytes in a tteexxtt structure, we might use a code fragment like this: ##iinncclluuddee ""ppoossttggrreess..hh"" ##iinncclluuddee ""uuttiillss//ppaalllloocc..hh"" ...... cchhaarr bbuuffffeerr[[4400]];; //** oouurr ssoouurrccee ddaattaa **// ...... tteexxtt **ddeessttiinnaattiioonn == ((tteexxtt **)) ppaalllloocc((VVAARRHHDDRRSSZZ ++ 4400));; ddeessttiinnaattiioonn-->>lleennggtthh == VVAARRHHDDRRSSZZ ++ 4400;; mmeemmmmoovvee((ddeessttiinnaattiioonn-->>ddaattaa,, bbuuffffeerr,, 4400));; ...... 3300 Now that we've gone over all of the possible structures for base types, we can show some examples of real func- tions. Suppose ffuunnccss..cc look like: ##iinncclluuddee <> ##iinncclluuddee ""ppoossttggrreess..hh"" //** ffoorr cchhaarr1166,, eettcc.. **// ##iinncclluuddee ""uuttiillss//ppaalllloocc..hh"" //** ffoorr ppaalllloocc **// iinntt aadddd__oonnee((iinntt aarrgg)) {{ rreettuurrnn((aarrgg ++ 11));; }} cchhaarr1166 ** ccoonnccaatt1166((cchhaarr1166 **aarrgg11,, cchhaarr1166 **aarrgg22)) {{ cchhaarr1166 **nneeww__cc1166 == ((cchhaarr1166 **)) ppaalllloocc((ssiizzeeooff((cchhaarr1166))));; mmeemmsseett((((vvooiidd **)) nneeww__cc1166,, 00,, ssiizzeeooff((cchhaarr1166))));; ((vvooiidd)) ssttrrnnccppyy((nneeww__cc1166,, aarrgg11,, 1166));; rreettuurrnn ((cchhaarr1166 **))((ssttrrnnccaatt((nneeww__cc1166,, aarrgg22,, 1166))));; }} tteexxtt ** ccooppyytteexxtt((tteexxtt **tt)) {{ //** ** VVAARRSSIIZZEE iiss tthhee ttoottaall ssiizzee ooff tthhee ssttrruucctt iinn bbyytteess.. **// tteexxtt **nneeww__tt == ((tteexxtt **)) ppaalllloocc((VVAARRSSIIZZEE((tt))));; mmeemmsseett((nneeww__tt,, 00,, VVAARRSSIIZZEE((tt))));; VVAARRSSIIZZEE((nneeww__tt)) == VVAARRSSIIZZEE((tt));; //** ** VVAARRDDAATTAA iiss aa ppooiinntteerr ttoo tthhee ddaattaa rreeggiioonn ooff tthhee ssttrruucctt.. **// mmeemmccppyy((((vvooiidd **)) VVAARRDDAATTAA((nneeww__tt)),, //** ddeessttiinnaattiioonn **// ((vvooiidd **)) VVAARRDDAATTAA((tt)),, //** ssoouurrccee **// VVAARRSSIIZZEE((tt))--VVAARRHHDDRRSSZZ));; //** hhooww mmaannyy bbyytteess **// rreettuurrnn((nneeww__tt));; }} On OSF/1 we would type: CCRREEAATTEE FFUUNNCCTTIIOONN aadddd__oonnee((iinntt44)) RREETTUURRNNSS iinntt44 AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ffuunnccss..ssoo'' LLAANNGGUUAAGGEE ''cc'';; CCRREEAATTEE FFUUNNCCTTIIOONN ccoonnccaatt1166((cchhaarr1166,, cchhaarr1166)) RREETTUURRNNSS cchhaarr1166 AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ffuunnccss..ssoo'' LLAANNGGUUAAGGEE ''cc'';; 3311 CCRREEAATTEE FFUUNNCCTTIIOONN ccooppyytteexxtt((tteexxtt)) RREETTUURRNNSS tteexxtt AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ffuunnccss..ssoo'' LLAANNGGUUAAGGEE ''cc'';; On other systems, we might have to make the filename end in ..ssll (to indicate that it's a shared library). 77..22..22.. PPrrooggrraammmmiinngg LLaanngguuaaggee FFuunnccttiioonnss oonn CCoommppoossiittee TTyyppeess Composite types do not have a fixed layout like C structures. Instances of a composite type may contain null fields. In addition, composite types that are part of an inheritance hierarchy may have different fields than other members of the same inheritance hier- archy. Therefore, POSTGRES provides a procedural interface for accessing fields of composite types from C. As POSTGRES processes a set of instances, each instance will be passed into your function as an opaque struc- ture of type TUPLE. Suppose we want to write a function to answer the query ** SSEELLEECCTT nnaammee,, cc__oovveerrppaaiidd((EEMMPP,, 11550000)) AASS oovveerrppaaiidd FFRROOMM EEMMPP WWHHEERREE nnaammee == ''BBiillll'' oorr nnaammee == ''SSaamm'';; In the query above, we can define cc__oovveerrppaaiidd as: ##iinncclluuddee ""ppoossttggrreess..hh"" //** ffoorr cchhaarr1166,, eettcc.. **// ##iinncclluuddee ""lliibbppqq--ffee..hh"" //** ffoorr TTUUPPLLEE **// bbooooll cc__oovveerrppaaiidd((TTUUPPLLEE tt,,//** tthhee ccuurrrreenntt iinnssttaannccee ooff EEMMPP **// iinntt44 lliimmiitt)) {{ bbooooll iissnnuullll == ffaallssee;; iinntt44 ssaallaarryy;; ssaallaarryy == ((iinntt44)) GGeettAAttttrriibbuutteeBByyNNaammee((tt,, ""ssaallaarryy"",, &&iissnnuullll));; iiff ((iissnnuullll)) rreettuurrnn ((ffaallssee));; rreettuurrnn((ssaallaarryy >> lliimmiitt));; }} GGeettAAttttrriibbuutteeBByyNNaammee is the POSTGRES system function that returns attributes out of the current instance. It has three arguments: the argument of type TUPLE passed into the function, the name of the desired attribute, and a return parameter that describes whether the attribute is null. GGeettAAttttrriibbuutteeBByyNNaammee will align data properly so you can cast its return value to the desired type. For example, if you have an attribute nnaammee which is of the type cchhaarr1166, the GGeettAAttttrriibbuutteeBByyNNaammee call would look like: 3322 cchhaarr **ssttrr;; ...... ssttrr == ((cchhaarr **)) GGeettAAttttrriibbuutteeBByyNNaammee((tt,, ""nnaammee"",, &&iissnnuullll)) The following query lets POSTGRES know about the cc__oovveerrppaaiidd function: ** CCRREEAATTEE FFUUNNCCTTIIOONN cc__oovveerrppaaiidd((EEMMPP,, iinntt44)) RREETTUURRNNSS bbooooll AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ffuunnccss..ssoo'' LLAANNGGUUAAGGEE ''cc'';; While there are ways to construct new instances or mod- ify existing instances from within a C function, these are far too complex to discuss in this manual. 77..22..33.. CCaavveeaattss We now turn to the more difficult task of writing pro- gramming language functions. Be warned: this section of the manual will not make you a programmer. You must have a good understanding of C (including the use of pointers and the mmaalllloocc memory manager) before trying to write C functions for use with POSTGRES. While it may be possible to load functions written in languages other than C into POSTGRES, this is often difficult (when it is possible at all) because other languages, such as FORTRAN and Pascal often do not fol- low the same "calling convention" as C. That is, other languages do not pass argument and return values between functions in the same way. For this reason, we will assume that your programming language functions are written in C. The basic rules for building C functions are as fol- lows: (1) Most of the header (include) files for POSTGRES should already be installed in //uussrr//llooccaall//ppoossttggrreess9955//iinncclluuddee (see Figure 2). You should always include --II//uussrr//llooccaall//ppoossttggrreess9955//iinncclluuddee on your cccc command lines. Sometimes, you may find that you require header files that are in the server source itself (i.e., you need a file we neglected to install in iinncclluuddee). In those cases you may need to add one or more of --II//uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//bbaacckkeenndd --II//uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//bbaacckkeenndd//iinncclluuddee --II//uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//bbaacckkeenndd//ppoorrtt//<> --II//uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//bbaacckkeenndd//oobbjj (where <> is the name of the port, e.g., aallpphhaa or ssppaarrcc)).. 3333 (2) When allocating memory, use the POSTGRES rou- tines ppaalllloocc and ppffrreeee instead of the corre- sponding C library routines mmaalllloocc and ffrreeee. The memory allocated by ppaalllloocc will be freed automatically at the end of each transaction, preventing memory leaks. (3) Always zero the bytes of your structures using mmeemmsseett or bbzzeerroo. Several routines (such as the hash access method, hash join and the sort algo- rithm) compute functions of the raw bits con- tained in your structure. Even if you initial- ize all fields of your structure, there may be several bytes of alignment padding (holes in the structure) that may contain garbage values. (4) Most of the internal POSTGRES types are declared in ppoossttggrreess..hh, so it's usually a good idea to include that file as well. (5) Compiling and loading your object code so that it can be dynamically loaded into POSTGRES always requires special flags. See Appendix A for a detailed explanation of how to do it for your particular operating system. 3344 ________________________________________________________________________________________________________________________ 88.. EEXXTTEENNDDIINNGG SSQQLL:: TTYYPPEESS ________________________________________________________________________________________________________________________ As previously mentioned, there are two kinds of types in POSTGRES: _b_a_s_e types (defined in a programming lan- guage) and _c_o_m_p_o_s_i_t_e types (instances). Examples in this section up to interfacing indices can be found in ccoommpplleexx..ssqqlland ccoommpplleexx..cc. Composite exam- ples are in ffuunnccss..ssqqll. 88..11.. UUsseerr--DDeeffiinneedd TTyyppeess ________________________________________________________________________________________________________________________ 88..11..11.. FFuunnccttiioonnss NNeeeeddeedd ffoorr aa UUsseerr--DDeeffiinneedd TTyyppee A user-defined type must always have _i_n_p_u_t and _o_u_t_p_u_t functions. These functions determine how the type appears in strings (for input by the user and output to the user) and how the type is organized in memory. The input function takes a null-delimited character string as its input and returns the internal (in memory) rep- resentation of the type. The output function takes the internal representation of the type and returns a null- delimited character string. Suppose we want to define a ccoommpplleexx type which repre- sents complex numbers. Naturally, we choose to repre- sent a complex in memory as the following C structure: ttyyppeeddeeff ssttrruucctt CCoommpplleexx {{ ddoouubbllee xx;; ddoouubbllee yy;; }} CCoommpplleexx;; and a string of the form ((xx,,yy)) as the external string representation. These functions are usually not hard to write, espe- cially the output function. However, there are a num- ber of points to remember. (1) When defining your external (string) representa- tion, remember that you must eventually write a complete and robust parser for that representa- tion as your input function! CCoommpplleexx ** ccoommpplleexx__iinn((cchhaarr **ssttrr)) {{ ddoouubbllee xx,, yy;; CCoommpplleexx **rreessuulltt;; 3355 iiff ((ssssccaannff((ssttrr,, "" (( %%llff ,, %%llff ))"",, &&xx,, &&yy)) !!== 22)) {{ eelloogg((WWAARRNN,, ""ccoommpplleexx__iinn:: eerrrroorr iinn ppaarrssiinngg rreettuurrnn NNUULLLL;; }} rreessuulltt == ((CCoommpplleexx **))ppaalllloocc((ssiizzeeooff((CCoommpplleexx))));; rreessuulltt-->>xx == xx;; rreessuulltt-->>yy == yy;; rreettuurrnn ((rreessuulltt));; }} The output function can simply be: cchhaarr ** ccoommpplleexx__oouutt((CCoommpplleexx **ccoommpplleexx)) {{ cchhaarr **rreessuulltt;; iiff ((ccoommpplleexx ==== NNUULLLL)) rreettuurrnn((NNUULLLL));; rreessuulltt == ((cchhaarr **)) ppaalllloocc((6600));; sspprriinnttff((rreessuulltt,, ""((%%gg,,%%gg))"",, ccoommpplleexx-->>xx,, ccoommpplleexx-->>yy));; rreettuurrnn((rreessuulltt));; }} (2) You should try to make the input and output functions inverses of each other. If you do not, you will have severe problems when you need to dump your data into a file and then read it back in (say, into someone else's database on another computer). This is a particularly com- mon problem when floating-point numbers are involved. To define the ccoommpplleexx type, we need to create the two user-defined functions ccoommpplleexx__iinn and ccoommpplleexx__oouutt before creating the type: CCRREEAATTEE FFUUNNCCTTIIOONN ccoommpplleexx__iinn((ooppaaqquuee)) RREETTUURRNNSS ccoommpplleexx AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ccoommpplleexx..ssoo'' LLAANNGGUUAAGGEE ''cc'';; CCRREEAATTEE FFUUNNCCTTIIOONN ccoommpplleexx__oouutt((ooppaaqquuee)) RREETTUURRNNSS ooppaaqquuee AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ccoommpplleexx..ssoo'' LLAANNGGUUAAGGEE ''cc'';; CCRREEAATTEE TTYYPPEE ccoommpplleexx (( iinntteerrnnaalllleennggtthh == 1166,, iinnppuutt == ccoommpplleexx__iinn,, oouuttppuutt == ccoommpplleexx__oouutt ));; 3366 As discussed earlier, POSTGRES fully supports arrays of base types. Additionally, POSTGRES supports arrays of user-defined types as well. When you define a type, POSTGRES automatically provides support for arrays of that type. For historical reasons, the array type has the same name as the user-defined type with the under- score character __ prepended. Composite types do not need any function defined on them, since the system already understands what they look like inside. 88..11..22.. LLaarrggee OObbjjeeccttss The types discussed to this point are all "small" objects -- that is, they are smaller than 8KB7 in size. If you require a larger type for something like a docu- ment retrieval system or for storing bitmaps, you will need to use the POSTGRES _l_a_r_g_e _o_b_j_e_c_t interface. ____________________ 7 8 * 1024 == 8192 bytes. In fact, the type must be con- siderably smaller than 8192 bytes, since the POSTGRES tuple and page overhead must also fit into this 8KB limitation. The actual value that fits depends on the machine architec- ture. 3377 ________________________________________________________________________________________________________________________ 99.. EEXXTTEENNDDIINNGG SSQQLL:: OOPPEERRAATTOORRSS ________________________________________________________________________________________________________________________ POSTGRES supports left unary, right unary and binary operators. Operators can be _o_v_e_r_l_o_a_d_e_d, or re-used with different numbers and types of arguments. If there is an ambiguous situation and the system cannot determine the correct operator to use, it will return an error and you may have to typecast the left and/or right operands to help it understand which operator you meant to use. To create an operator for adding two complex numbers can be done as follows. First we need to create a function to add the new types. Then, we can create the operator with the function. CCRREEAATTEE FFUUNNCCTTIIOONN ccoommpplleexx__aadddd((ccoommpplleexx,, ccoommpplleexx)) RREETTUURRNNSS ccoommpplleexx AASS ''$$PPWWDD//oobbjj//ccoommpplleexx..ssoo'' LLAANNGGUUAAGGEE ''cc'';; CCRREEAATTEE OOPPEERRAATTOORR ++ (( lleeffttaarrgg == ccoommpplleexx,, rriigghhttaarrgg == ccoommpplleexx,, pprroocceedduurree == ccoommpplleexx__aadddd,, ccoommmmuuttaattoorr == ++ ));; We've shown how to create a binary operator here. To create unary operators, just omit one of lleeffttaarrgg (for left unary) or rriigghhttaarrgg (for right unary). If we give the system enough type information, it can automatically figure out which operators to use. SSEELLEECCTT ((aa ++ bb)) AASS cc FFRROOMM tteesstt__ccoommpplleexx;; +----------------+ |cc | +----------------+ |((55..22,,66..0055)) | +----------------+ |((113333..4422,,114444..9955)) | +----------------+ 3388 ________________________________________________________________________________________________________________________ 1100.. EEXXTTEENNDDIINNGG SSQQLL:: AAGGGGRREEGGAATTEESS ________________________________________________________________________________________________________________________ Aggregates in POSTGRES are expressed in terms of _s_t_a_t_e _t_r_a_n_s_i_t_i_o_n _f_u_n_c_t_i_o_n_s. That is, an aggregate can be defined in terms of _s_t_a_t_e that is modified whenever an instance is processed. Some state functions look at a particular value in the instance when computing the new state (_s_f_u_n_c_1 in the ccrreeaattee aaggggrreeggaattee syntax) while others only keep track of their own internal state (_s_f_u_n_c_2). If we define an aggregate that uses only ssffuunncc11, we define an aggregate that computes a running function of the attribute values from each instance. "Sum" is an example of this kind of aggregate. "Sum" starts at zero and always adds the current instance's value to its running total. We will use the iinntt44ppll that is built into POSTGRES to perform this addition. CCRREEAATTEE AAGGGGRREEGGAATTEE ccoommpplleexx__ssuumm (( ssffuunncc11 == ccoommpplleexx__aadddd,, bbaasseettyyppee == ccoommpplleexx,, ssttyyppee11 == ccoommpplleexx,, iinniittccoonndd11 == ''((00,,00))'' ));; SSEELLEECCTT ccoommpplleexx__ssuumm((aa)) FFRROOMM tteesstt__ccoommpplleexx;; +------------+ |ccoommpplleexx__ssuumm | +------------+ |((3344,,5533..99)) | +------------+ If we define only ssffuunncc22, we are specifying an aggre- gate that computes a running function that is indepen- dent of the attribute values from each instance. "Count" is the most common example of this kind of aggregate. "Count" starts at zero and adds one to its running total for each instance, ignoring the instance value. Here, we use the built-in iinntt44iinncc routine to do the work for us. This routine increments (adds one to) its argument. CCRREEAATTEE AAGGGGRREEGGAATTEE mmyy__ccoouunntt ((ssffuunncc22 == iinntt44iinncc,, ---- aadddd oonnee bbaasseettyyppee == iinntt44,, ssttyyppee22 == iinntt44,, 3399 iinniittccoonndd22 == ''00'')) SSEELLEECCTT mmyy__ccoouunntt((**)) aass eemmpp__ccoouunntt ffrroomm EEMMPP;; +----------+ |eemmpp__ccoouunntt | +----------+ |55 | +----------+ "Average" is an example of an aggregate that requires both a function to compute the running sum and a func- tion to compute the running count. When all of the instances have been processed, the final answer for the aggregate is the running sum divided by the running count. We use the iinntt44ppll and iinntt44iinncc routines we used before as well as the POSTGRES integer division rou- tine, iinntt44ddiivv, to compute the division of the sum by the count. CCRREEAATTEE AAGGGGRREEGGAATTEE mmyy__aavveerraaggee ((ssffuunncc11 == iinntt44ppll,, ---- ssuumm bbaasseettyyppee == iinntt44,, ssttyyppee11 == iinntt44,, ssffuunncc22 == iinntt44iinncc,, ---- ccoouunntt ssttyyppee22 == iinntt44,, ffiinnaallffuunncc == iinntt44ddiivv,, ---- ddiivviissiioonn iinniittccoonndd11 == ''00'',, iinniittccoonndd22 == ''00'')) SSEELLEECCTT mmyy__aavveerraaggee((ssaallaarryy)) aass eemmpp__aavveerraaggee FFRROOMM EEMMPP;; +------------+ |eemmpp__aavveerraaggee | +------------+ |11664400 | +------------+ 4400 ________________________________________________________________________________________________________________________ 1111.. IINNTTEERRFFAACCIINNGG EEXXTTEENNSSIIOONNSS TTOO IINNDDIICCEESS ________________________________________________________________________________________________________________________ The procedures described thus far let you define a new type, new functions and new operators. However, we cannot yet define a secondary index (such as a B-tree, R-tree or hash access method) over a new type or its operators. Look back at Figure 3. The right half shows the cata- logs that we must modify in order to tell POSTGRES how to use a user-defined type and/or user-defined opera- tors with an index (i.e., ppgg__aamm, ppgg__aammoopp, ppgg__aammpprroocc and ppgg__ooppccllaassss). Unfortunately, there is no simple command to do this. We will demonstrate how to modify these catalogs through a running example: a new operator class for the B-tree access method that sorts integers in ascending absolute value order. The ppgg__aamm class contains one instance for every user- defined access method. Support for the heap access method is built into POSTGRES, but every other access method is described here. The schema is +-------------+---------------------------------------------------------------+ |aammnnaammee | name of the access method | +-------------+---------------------------------------------------------------+ |aammoowwnneerr | object id of the owner's instance in pg_user | +-------------+---------------------------------------------------------------+ |aammkkiinndd | not used at present, but set to 'o' as a place holder | +-------------+---------------------------------------------------------------+ |aammssttrraatteeggiieess | number of strategies for this access method (see below) | +-------------+---------------------------------------------------------------+ |aammssuuppppoorrtt | number of support routines for this access method (see below) | +-------------+---------------------------------------------------------------+ |aammggeettttuuppllee | procedure identifiers for interface routines to the access | |aammiinnsseerrtt | method. For example, rreeggpprroocc ids for opening, closing, and | |...... | getting instances from the access method appear here. | +-------------+---------------------------------------------------------------+ The object ID of the instance in ppgg__aamm is used as a foreign key in lots of other classes. You don't need to add a new instance to this class; all you're inter- ested in is the object ID of the access method instance you want to extend: SSEELLEECCTT ooiidd FFRROOMM ppgg__aamm WWHHEERREE aammnnaammee == ''bbttrreeee'' 4411 +----+ |ooiidd | +----+ |440033 | +----+ The aammssttrraatteeggiieess attribute exists to standardize com- parisons across data types. For example, B-trees impose a strict ordering on keys, lesser to greater. Since POSTGRES allows the user to define operators, POSTGRES cannot look at the nnaammee of an operator (eg, >> or <<) and tell what kind of comparison it is. In fact, some access methods don't impose any ordering at all. For example, R-trees express a rectangle-containment relationship, whereas a hashed data structure expresses only bitwise similarity based on the value of a hash function. POSTGRES needs some consistent way of taking a qualification in your query, looking at the operator and then deciding if a usable index exists. This implies that POSTGRES needs to know, for example, that the <<== and >> operators partition a B-tree. POSTGRES uses strategies to express these relationships between operators and the way they can be used to scan indices. Defining a new set of strategies is beyond the scope of this discussion, but we'll explain how B-tree strate- gies work because you'll need to know that to add a new operator class. In the ppgg__aamm class, the aammssttrraatteeggiieess attribute is the number of strategies defined for this access method. For B-trees, this number is 5. These strategies correspond to +----------------------+---+ |less than | 1 | +----------------------+---+ |less than or equal | 2 | +----------------------+---+ |equal | 3 | +----------------------+---+ |greater than or equal | 4 | +----------------------+---+ |greater than | 5 | +----------------------+---+ The idea is that you'll need to add procedures corre- sponding to the comparisons above to the ppgg__aammoopp rela- tion (see below). The access method code can use these strategy numbers, regardless of data type, to figure out how to partition the B-tree, compute selectivity, and so on. Don't worry about the details of adding procedures yet; just understand that there must be a set of these procedures for iinntt22, iinntt44, ooiidd, and every other data type on which a B-tree can operate. 4422 Sometimes, strategies aren't enough information for the system to figure out how to use an index. Some access methods require other support routines in order to work. For example, the B-tree access method must be able to compare two keys and determine whether one is greater than, equal to, or less than the other. Simi- larly, the R-tree access method must be able to compute intersections, unions, and sizes of rectangles. These operations do not correspond to user qualifications in SQL queries; they are administrative routines used by the access methods, internally. In order to manage diverse support routines consis- tently across all POSTGRES access methods, ppgg__aamm includes an attribute called aammssuuppppoorrtt. This attribute records the number of support routines used by an access method. For B-trees, this number is one -- the routine to take two keys and return -1, 0, or +1, depending on whether the first key is less than, equal to, or greater than the second.8 The aammssttrraatteeggiieess entry in ppgg__aamm is just the _n_u_m_b_e_r of strategies defined for the access method in question. The procedures for less than, less equal, and so on don't appear in ppgg__aamm. Similarly, aammssuuppppoorrtt is just the number of support routines required by the access method. The actual routines are listed elsewhere. The next class of interest is ppgg__ooppccllaassss.. This class exists only to associate a name with an ooiidd. In ppgg__aammoopp, every B-tree operator class has a set of pro- cedures, one through five, above. Some existing opclasses are iinntt22__ooppss, iinntt44__ooppss, and ooiidd__ooppss. You need to add an instance with your opclass name (for example, ccoommpplleexx__aabbss__ooppss) to ppgg__ooppccllaassss. The ooiidd of this instance is a foreign key in other classes. IINNSSEERRTT IINNTTOO ppgg__ooppccllaassss ((ooppccnnaammee)) VVAALLUUEESS ((''ccoommpplleexx__aabbss__ooppss''));; SSEELLEECCTT ooiidd,, ooppccnnaammee FFRROOMM ppgg__ooppccllaassss WWHHEERREE ooppccnnaammee == ''ccoommpplleexx__aabbss__ooppss'';; +------+--------------+ |ooiidd | ooppccnnaammee | +------+--------------+ |1177331144 | iinntt44__aabbss__ooppss | +------+--------------+ Note that the ooiidd for your ppgg__ooppccllaassss instance wwiillll bbee ddiiffffeerreenntt! You should substitute your value for 17314 ____________________ 8 Strictly speaking, this routine can return a negative number (< 0), 0, or a non-zero positive number (> 0). 4433 wherever it appears in this discussion. So now we have an access method and an operator class. We still need a set of operators; the procedure for defining operators was discussed earlier in this man- ual. For the ccoommpplleexx__aabbss__ooppss operator class on B- trees, the operators we require are: absolute value less-than absolute value less-than-or-equal absolute value equal absolute value greater-than-or-equal absolute value greater-than Suppose the code that implements the functions defined is stored in the file //uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//ttuuttoorriiaall//ccoommpplleexx..cc Part of the code look like this: (note that we will only show the equality operator for the rest of the examples. The other four operators are very similar. Refer to ccoommpplleexx..cc or ccoommpplleexx..ssqqll for the details.) ##ddeeffiinnee MMaagg((cc)) ((((cc))-->>xx**((cc))-->>xx ++ ((cc))-->>yy**((cc))-->>yy)) bbooooll ccoommpplleexx__aabbss__eeqq((CCoommpplleexx **aa,, CCoommpplleexx **bb)) {{ ddoouubbllee aammaagg == MMaagg((aa)),, bbmmaagg == MMaagg((bb));; rreettuurrnn ((aammaagg====bbmmaagg));; }} There are a couple of important things that are happen- ing below. First, note that operators for less-than, less-than-or- equal, equal, greater-than-or-equal, and greater-than for iinntt44 are being defined. All of these operators are already defined for iinntt44 under the names <<, <<==, ==, >>==, and >>. The new operators behave differently, of course. In order to guarantee that POSTGRES uses these new operators rather than the old ones, they need to be named differently from the old ones. This is a key point: you can overload operators in POSTGRES, but only if the operator isn't already defined for the argument types. That is, if you have << defined for (iinntt44, iinntt44), you can't define it again. POSTGRES ddooeess nnoott cchheecckk this when you define your operator, so be care- ful. To avoid this problem, odd names will be used for the operators. If you get this wrong, the access meth- ods are likely to crash when you try to do scans. The other important point is that all the operator functions return _B_o_o_l_e_a_n values. The access methods rely on this fact. (On the other hand, the support 4444 function returns whatever the particular access method expects -- in this case, a signed integer.) The final routine in the file is the "support routine" mentioned when we discussed the aammssuuppppoorrtt attribute of the ppgg__aamm class. We will use this later on. For now, ignore it. CCRREEAATTEE FFUUNNCCTTIIOONN ccoommpplleexx__aabbss__eeqq((ccoommpplleexx,, ccoommpplleexx)) RREETTUURRNNSS bbooooll AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ccoommpplleexx..ssoo'' LLAANNGGUUAAGGEE ''cc'';; Now define the operators that use them. As noted, the operator names must be unique among all operators that take two iinntt44 operands. In order to see if the opera- tor names listed below are taken, we can do a query on ppgg__ooppeerraattoorr: //** ** tthhiiss qquueerryy uusseess tthhee rreegguullaarr eexxpprreessssiioonn ooppeerraattoorr ((~~)) ** ttoo ffiinndd tthhrreeee--cchhaarraacctteerr ooppeerraattoorr nnaammeess tthhaatt eenndd iinn ** tthhee cchhaarraacctteerr && **// SSEELLEECCTT ** FFRROOMM ppgg__ooppeerraattoorr WWHHEERREE oopprrnnaammee ~~ ''^^....&&$$''::::tteexxtt;; to see if your name is taken for the types you want. The important things here are the procedure (which are the C functions defined above) and the restriction and join selectivity functions. You should just use the ones used below--note that there are different such functions for the less-than, equal, and greater-than cases. These _m_u_s_t be supplied, or the access method will crash when it tries to use the operator. You should copy the names for rreessttrriicctt and jjooiinn, but use the procedure names you defined in the last step. CCRREEAATTEE OOPPEERRAATTOORR == (( lleeffttaarrgg == ccoommpplleexx,, rriigghhttaarrgg == ccoommpplleexx,, pprroocceedduurree == ccoommpplleexx__aabbss__eeqq,, rreessttrriicctt == eeqqsseell,, jjooiinn == eeqqjjooiinnsseell )) Notice that five operators corresponding to less, less equal, equal, greater, and greater equal are defined. We're just about finished. the last thing we need to do is to update the ppgg__aammoopp relation. To do this, we need the following attributes: 4455 +------------+----------------------+ |aammooppiidd | the ooiidd of the ppgg__aamm | | | instance for B-tree | | | (== 403, see above) | +------------+----------------------+ |aammooppccllaaiidd | the ooiidd of the | | | ppgg__ooppccllaassss instance | | | for iinntt44__aabbss__ooppss (== | | | whatever you got in- | | | stead of 17314, see | | | above) | +------------+----------------------+ |aammooppoopprr | the ooiidds of the op- | | | erators for the op- | | | class (which we'll | | | get in just a | | | minute) | +------------+----------------------+ |aammooppsseelleecctt,, | cost functions. | |aammooppnnppaaggeess | | +------------+----------------------+ The cost functions are used by the query optimizer to decide whether or not to use a given index in a scan. Fortunately, these already exist. The two functions we'll use are bbttrreeeesseell,, which estimates the selectivity of the B-tree, and bbttrreeeennppaaggee,, which estimates the num- ber of pages a search will touch in the tree. So we need the ooiidds of the operators we just defined. We'll look up the names of all the operators that take two iinntt44s, and pick ours out: SSEELLEECCTT oo..ooiidd AASS ooppooiidd,, oo..oopprrnnaammee IINNTTOO TTAABBLLEE ccoommpplleexx__ooppss__ttmmpp FFRROOMM ppgg__ooppeerraattoorr oo,, ppgg__ttyyppee tt WWHHEERREE oo..oopprrlleefftt == tt..ooiidd aanndd oo..oopprrrriigghhtt == tt..ooiidd aanndd tt..ttyyppnnaammee == ''ccoommpplleexx'';; which returns: 4466 +------+---------+ |ooiidd | oopprrnnaammee | +------+---------+ |1177332211 | << | +------+---------+ |1177332222 | <<== | +------+---------+ |1177332233 | == | +------+---------+ |1177332244 | >>== | +------+---------+ |1177332255 | >> | +------+---------+ (Again, some of your ooiidd numbers will almost certainly be different.) The operators we are interested in are those with ooiidds 17321 through 17325. The values you get will probably be different, and you should substi- tute them for the values below. We can look at the operator names and pick out the ones we just added. Now we're ready to update ppgg__aammoopp with our new operator class. The most important thing in this entire discus- sion is that the operators are ordered, from less equal through greater equal, in ppgg__aammoopp. We add the instances we need: IINNSSEERRTT IINNTTOO ppgg__aammoopp ((aammooppiidd,, aammooppccllaaiidd,, aammooppoopprr,, aammooppssttrraatteeggyy,, aammooppsseelleecctt,, aammooppnnppaaggeess)) SSEELLEECCTT aamm..ooiidd,, ooppccll..ooiidd,, cc..ooppooiidd,, 33,, ''bbttrreeeesseell''::::rreeggpprroocc,, ''bbttrreeeennppaaggee''::::rreeggpprroocc FFRROOMM ppgg__aamm aamm,, ppgg__ooppccllaassss ooppccll,, ccoommpplleexx__ooppss__ttmmpp cc WWHHEERREE aammnnaammee == ''bbttrreeee'' aanndd ooppccnnaammee == ''ccoommpplleexx__aabbss__ooppss'' aanndd cc..oopprrnnaammee == ''=='';; Note the order: "less than" is 1, "less than or equal" is 2, "equal" is 3, "greater than or equal" is 4, and "greater than" is 5. The last step (finally!) is registration of the "sup- port routine" previously described in our discussion of ppgg__aamm. The ooiidd of this support routine is stored in the ppgg__aammpprroocc class, keyed by the access method ooiidd and the operator class ooiidd. First, we need to register the function in POSTGRES (recall that we put the C code that implements this routine in the bottom of the file in which we implemented the operator routines): CCRREEAATTEE FFUUNNCCTTIIOONN iinntt44__aabbss__ccmmpp((iinntt44,, iinntt44)) RREETTUURRNNSS iinntt44 AASS ''//uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//oobbjj//ccoommpplleexx..ssoo'' LLAANNGGUUAAGGEE ''cc'';; 4477 SSEELLEECCTT ooiidd,, pprroonnaammee FFRROOMM ppgg__pprroocc WWHHEERREE pprrnnaammee == ''iinntt44__aabbss__ccmmpp'';; +------+--------------+ |ooiidd | pprroonnaammee | +------+--------------+ |1177332288 | iinntt44__aabbss__ccmmpp | +------+--------------+ (Again, your ooiidd number will probably be different and you should substitute the value you see for the value below.) Recalling that the B-tree instance's ooiidd is 403 and that of iinntt44__aabbss__ooppss is 17314, we can add the new instance as follows: IINNSSEERRTT IINNTTOO ppgg__aammpprroocc ((aammiidd,, aammooppccllaaiidd,, aammpprroocc,, aammpprrooccnnuumm)) VVAALLUUEESS ((''440033''::::ooiidd,, ---- bbttrreeee ooiidd ''1177331144''::::ooiidd,, ---- ppgg__ooppccllaassss ttuuppllee ''1177332288''::::ooiidd,, ---- nneeww ppgg__pprroocc ooiidd ''11''::::iinntt22));; 4488 ________________________________________________________________________________________________________________________ 1122.. LLIIBBPPQQ ________________________________________________________________________________________________________________________ LIBPQ is the application programming interface to POST- GRES. LIBPQ is a set of library routines which allows client programs to pass queries to the POSTGRES backend server and to receive the results of these queries. This version of the documentation describes the C interface library. Three short programs are included at the end of this section to show how to write pro- grams that use LIBPQ. There are several examples of LIBPQ applications in the following directories: ....//ssrrcc//tteesstt//rreeggrreessss ....//ssrrcc//tteesstt//eexxaammpplleess ....//ssrrcc//bbiinn//ppssqqll Frontend programs which use LIBPQ must include the header file lliibbppqq--ffee..hh and must link with the lliibbppqq library. 1122..11.. CCoonnttrrooll aanndd IInniittiiaalliizzaattiioonn The following environment variables can be used to set up default environment values to avoid hard-coding database names into an application program: +o PPGGHHOOSSTT sets the default server name. +o PPGGOOPPTTIIOONNSS sets additional runtime options for the POSTGRES backend. +o PPGGPPOORRTT sets the default port for communicating with the POSTGRES backend. +o PPGGTTTTYY sets the file or tty on which debugging mes- sages from the backend server are displayed. +o PPGGDDAATTAABBAASSEE sets the default POSTGRES database name. +o PPGGRREEAALLMM sets the _K_e_r_b_e_r_o_s realm to use with POST- GRES, if it is different from the local realm. If PPGGRREEAALLMM is set, POSTGRES applications will attempt authentication with servers for this realm and use separate ticket files to avoid conflicts with local ticket files. This environment variable is only used if _K_e_r_b_e_r_o_s authentication is enabled. 1122..22.. DDaattaabbaassee CCoonnnneeccttiioonn FFuunnccttiioonnss The following routines deal with making a connection to a backend from a C program. PPQQsseettddbb 4499 Makes a new connection to a backend. PPGGccoonnnn **PPQQsseettddbb((cchhaarr **ppgghhoosstt,, cchhaarr **ppggppoorrtt,, cchhaarr **ppggooppttiioonnss,, cchhaarr **ppggttttyy,, cchhaarr **ddbbNNaammee));; If any argument is NULL, then the corresponding environment variable is checked. If the environ- ment variable is also not set, then hardwired defaults are used. _P_Q_s_e_t_d_b always returns a valid PGconn pointer. The _P_Q_s_t_a_t_u_s (see below) command should be called to ensure that a connection was properly made before queries are sent via the connection. LIBPQ programmers should be careful to maintain the PGconn abstraction. Use the accessor functions below to get at the contents of PGconn. Avoid directly referencing the fields of the PGconn structure as they are subject to change in the future. PPQQddbb returns the database name of the connection. cchhaarr **PPQQddbb((PPGGccoonnnn **ccoonnnn)) PPQQhhoosstt returns the host name of the connection. cchhaarr **PPQQhhoosstt((PPGGccoonnnn **ccoonnnn)) PPQQooppttiioonnss returns the pgoptions used in the con- nection. cchhaarr **PPQQooppttiioonnss((PPGGccoonnnn **ccoonnnn)) PPQQppoorrtt returns the pgport of the connection. cchhaarr **PPQQppoorrtt((PPGGccoonnnn **ccoonnnn)) PPQQttttyy returns the pgtty of the connection. cchhaarr **PPQQttttyy((PPGGccoonnnn **ccoonnnn)) PPQQssttaattuuss Returns the status of the connection. The status can be CONNECTION_OK or CONNECTION_BAD. CCoonnnnSSttaattuussTTyyppee **PPQQssttaattuuss((PPGGccoonnnn **ccoonnnn)) PPQQeerrrroorrMMeessssaaggee returns the error message associ- ated with the connection cchhaarr **PPQQeerrrroorrMMeessssaaggee((PPGGccoonnnn** ccoonnnn));; 5500 PPQQffiinniisshh Close the connection to the backend. Also frees memory used by the PGconn structure. The PGconn pointer should not be used after PQfinish has been called. vvooiidd PPQQffiinniisshh((PPGGccoonnnn **ccoonnnn)) PPQQrreesseett Reset the communication port with the backend. This function will close the IPC socket connection to the backend and attempt to reestablish a new connection to the same backend. vvooiidd PPQQrreesseett((PPGGccoonnnn **ccoonnnn)) PPQQttrraaccee Enables tracing of messages passed between the frontend and the backend. The messages are echoed to the debug_port file stream. vvooiidd PPQQttrraaccee((PPGGccoonnnn **ccoonnnn,, FFIILLEE** ddeebbuugg__ppoorrtt));; PPQQuunnttrraaccee Disables tracing of messages passed between the frontend and the backend. vvooiidd PPQQuunnttrraaccee((PPGGccoonnnn **ccoonnnn));; 1122..33.. QQuueerryy EExxeeccuuttiioonn FFuunnccttiioonnss PPQQeexxeecc Submit a query to POSTGRES. Returns a PGresult pointer if the query was successful or a NULL oth- erwise. If a NULL is returned, _P_Q_e_r_r_o_r_M_e_s_s_a_g_e can be used to get more information about the error. PPGGrreessuulltt **PPQQeexxeecc((PPGGccoonnnn **ccoonnnn,, cchhaarr **qquueerryy));; The PGresult structure encapsulates the query result returned by the backend. LIBPQ programmers should be careful to maintain the PGresult abstraction. Use the accessor functions described below to retrieve the results of the query. Avoid directly referencing the fields of the PGresult structure as they are subject to change in the future. PPQQrreessuullttSSttaattuuss Returns the result status of the query. _P_Q_r_e_s_u_l_t_- _S_t_a_t_u_s can return one of the following values: PPGGRREESS__EEMMPPTTYY__QQUUEERRYY,, 5511 PPGGRREESS__CCOOMMMMAANNDD__OOKK,, //** tthhee qquueerryy wwaass aa ccoommmmaanndd **// PPGGRREESS__TTUUPPLLEESS__OOKK,, //** tthhee qquueerryy ssuucccceessssffuullllyy rreettuurrnneedd ttuupplleess **// PPGGRREESS__CCOOPPYY__OOUUTT,, PPGGRREESS__CCOOPPYY__IINN,, PPGGRREESS__BBAADD__RREESSPPOONNSSEE,, //** aann uunneexxppeecctteedd rreessppoonnssee wwaass rreecceeiivveedd **// PPGGRREESS__NNOONNFFAATTAALL__EERRRROORR,, PPGGRREESS__FFAATTAALL__EERRRROORR If the result status is PGRES_TUPLES_OK, then the following routines can be used to retrieve the tuples returned by the query. PPQQnnttuupplleess returns the number of tuples (instances) in the query result. iinntt PPQQnnttuupplleess((PPGGrreessuulltt **rreess));; PPQQnnffiieellddss returns the number of fields (attributes) in the query result. iinntt PPQQnnffiieellddss((PPGGrreessuulltt **rreess));; PPQQffnnaammee returns the field (attribute) name associ- ated with the given field index. Field indices start at 0. cchhaarr **PPQQffnnaammee((PPGGrreessuulltt **rreess,, iinntt ffiieelldd__iinnddeexx));; PPQQffnnuummbbeerr returns the field (attribute) index associated with the given field name. iinntt PPQQffnnuummbbeerr((PPGGrreessuulltt **rreess,, cchhaarr** ffiieelldd__nnaammee));; PPQQffttyyppee returns the field type associated with the given field index. The integer returned is an internal coding of the type. Field indices start at 0. OOiidd PPQQffttyyppee((PPGGrreessuulltt **rreess,, iinntt ffiieelldd__nnuumm));; PPQQffssiizzee returns the size in bytes of the field associated with the given field index. If the size returned is -1, the field is a variable length field. Field indices start at 0. iinntt22 PPQQffssiizzee((PPGGrreessuulltt **rreess,, iinntt ffiieelldd__iinnddeexx));; PPQQggeettvvaalluuee returns the field (attribute) value. For most queries, the value returned by _P_Q_g_e_t_v_a_l_u_e is a null-terminated ASCII string representation 5522 of the attribute value. If the query was a result of a BBIINNAARRYY cursor, then the value returned by _P_Q_g_e_t_v_a_l_u_e is the binary representation of the type in the internal format of the backend server. It is the programmer's responsibility to cast and convert the data to the correct C type. The value returned by _P_Q_g_e_t_v_a_l_u_e points to storage that is part of the PGresult structure. One must explic- itly copy the value into other storage if it is to be used past the lifetime of the PGresult struc- ture itself. cchhaarr** PPQQggeettvvaalluuee((PPGGrreessuulltt **rreess,, iinntt ttuupp__nnuumm,, iinntt ffiieelldd__nnuumm));; PPQQggeettlleennggtthh returns the length of a field (attribute) in bytes. If the field is a _s_t_r_u_c_t _v_a_r_l_e_n_a, the length returned here does nnoott include the size field of the varlena, i.e., it is 4 bytes less. iinntt PPQQggeettlleennggtthh((PPGGrreessuulltt **rreess,, iinntt ttuupp__nnuumm,, iinntt ffiieelldd__nnuumm));; PPQQccmmddSSttaattuuss Returns the command status associated with the last query command. cchhaarr **PPQQccmmddSSttaattuuss((PPGGrreessuulltt **rreess));; PPQQooiiddSSttaattuuss Returns a string with the object id of the tuple inserted if the last query is an INSERT command. Otherwise, returns an empty string. cchhaarr** PPQQooiiddSSttaattuuss((PPGGrreessuulltt **rreess));; PPQQpprriinnttTTuupplleess Prints out all the tuples and, optionally, the attribute names to the specified output stream. The programs ppssqqll and mmoonniittoorr both use _P_Q_p_r_i_n_t_- _T_u_p_l_e_s for output. vvooiidd PPQQpprriinnttTTuupplleess(( PPGGrreessuulltt** rreess,, FFIILLEE** ffoouutt,, //** oouuttppuutt ssttrreeaamm **// iinntt pprriinnttAAttttNNaammee,,//** pprriinntt aattttrriibbuuttee nnaammeess oorr nnoott**// iinntt tteerrsseeOOuuttppuutt,, //** ddeelliimmiitteerr bbaarrss oorr nnoott??**// iinntt wwiiddtthh //** wwiiddtthh ooff ccoolluummnn,, vvaarriiaabbllee wwiiddtthh iiff 00**// ));; 5533 PPQQcclleeaarr Frees the storage associated with the PGresult. Every query result should be properly freed when it is no longer used. Failure to do this will result in memory leaks in the frontend applica- tion. vvooiidd PPQQcclleeaarr((PPQQrreessuulltt **rreess));; 1122..44.. FFaasstt PPaatthh POSTGRES provides a ffaasstt ppaatthh interface to send func- tion calls to the backend. This is a trapdoor into system internals and can be a potential security hole. Most users will not need this feature. PPGGrreessuulltt** PPQQffnn((PPGGccoonnnn** ccoonnnn,, iinntt ffnniidd,, iinntt **rreessuulltt__bbuuff,, iinntt **rreessuulltt__lleenn,, iinntt rreessuulltt__iiss__iinntt,, PPQQAArrggBBlloocckk **aarrggss,, iinntt nnaarrggss));; The _f_n_i_d argument is the object identifier of the func- tion to be executed. _r_e_s_u_l_t___b_u_f is the buffer in which to load the return value. The caller must have allo- cated sufficient space to store the return value. The result length will be returned in the storage pointed to by _r_e_s_u_l_t___l_e_n_. If the result is to be an integer value, than _r_e_s_u_l_t___i_s___i_n_t should be set to 1; otherwise it should be set to 0. _a_r_g_s and _n_a_r_g_s specify the arguments to the function. ttyyppeeddeeff ssttrruucctt {{ iinntt lleenn;; iinntt iissiinntt;; uunniioonn {{ iinntt **ppttrr;; iinntt iinntteeggeerr;; }} uu;; }} PPQQAArrggBBlloocckk;; _P_Q_f_n always returns a valid PGresult*. The resultSta- tus should be checked before the result is used. The caller is responsible for freeing the PGresult with _P_Q_c_l_e_a_r when it is not longer needed. 1122..55.. AAssyynncchhrroonnoouuss NNoottiiffiiccaattiioonn POSTGRES supports asynchronous notification via the _L_I_S_T_E_N and _N_O_T_I_F_Y commands. A backend registers its interest in a particular relation with the LISTEN com- mand. All backends listening on a particular relation will be notified asynchronously when a NOTIFY of that relation name is executed by another backend. No 5544 additional information is passed from the notifier to the listener. Thus, typically, any actual data that needs to be communicated is transferred through the relation. LIBPQ applications are notified whenever a connected backend has received an asynchronous notification. However, the communication from the backend to the frontend is not asynchronous. Notification comes piggy-backed on other query results. Thus, an applica- tion must submit queries, even empty ones, in order to receive notice of backend notification. In effect, the LIBPQ application must poll the backend to see if there is any pending notification information. After the execution of a query, a frontend may call _P_Q_N_o_t_i_f_i_e_s to see if any notification data is available from the backend. PPQQNNoottiiffiieess returns the notification from a list of unhandled notifications from the backend. Returns NULL if there are no pending notifications from the back- end. _P_Q_N_o_t_i_f_i_e_s behaves like the popping of a stack. Once a notification is returned from _P_Q_n_o_- _t_i_f_i_e_s_, it is considered handled and will be removed from the list of notifications. PPGGnnoottiiffyy** PPQQNNoottiiffiieess((PPGGccoonnnn **ccoonnnn));; The second sample program gives an example of the use of asynchronous notification. 1122..66.. FFuunnccttiioonnss AAssssoocciiaatteedd wwiitthh tthhee CCOOPPYY CCoommmmaanndd The _c_o_p_y command in POSTGRES has options to read from or write to the network connection used by LIBPQ. Therefore, functions are necessary to access this net- work connection directly so applications may take full advantage of this capability. PPQQggeettlliinnee Reads a newline-terminated line of characters (transmitted by the backend server) into a buffer _s_t_r_i_n_g of size _l_e_n_g_t_h. Like _f_g_e_t_s(3), this rou- tine copies up to _l_e_n_g_t_h-1 characters into _s_t_r_i_n_g. It is like _g_e_t_s(3), however, in that it converts the terminating newline into a null character. _P_Q_g_e_t_l_i_n_e returns EOF at EOF, 0 if the entire line has been read, and 1 if the buffer is full but the terminating newline has not yet been read. Notice that the application must check to see if a new line consists of the single character ".", which indicates that the backend server has fin- ished sending the results of the _c_o_p_y command. Therefore, if the application ever expects to receive lines that are more than _l_e_n_g_t_h-1 charac- ters long, the application must be sure to check the return value of _P_Q_g_e_t_l_i_n_e very carefully. 5555 The code in ....//ssrrcc//bbiinn//ppssqqll//ppssqqll..cc contains routines that correctly handle the copy protocol. iinntt PPQQggeettlliinnee((PPGGccoonnnn **ccoonnnn,, cchhaarr **ssttrriinngg,, iinntt lleennggtthh)) PPQQppuuttlliinnee Sends a null-terminated _s_t_r_i_n_g to the backend server. The application must explicitly send the single character "." to indicate to the backend that it has finished sending its data. vvooiidd PPQQppuuttlliinnee((PPGGccoonnnn **ccoonnnn,, cchhaarr **ssttrriinngg));; PPQQeennddccooppyy Syncs with the backend. This function waits until the backend has finished the copy. It should either be issued when the last string has been sent to the backend using _P_Q_p_u_t_l_i_n_e or when the last string has been received from the backend using _P_G_g_e_t_l_i_n_e. It must be issued or the backend may get "out of sync" with the frontend. Upon return from this function, the backend is ready to receive the next query. The return value is 0 on successful completion, nonzero otherwise. iinntt PPQQeennddccooppyy((PPGGccoonnnn **ccoonnnn));; As an example: PPQQeexxeecc((ccoonnnn,, ""ccrreeaattee ttaabbllee ffoooo ((aa iinntt44,, bb cchhaarr1166,, dd ffllooaatt88))""));; PPQQeexxeecc((ccoonnnn,, ""ccooppyy ffoooo ffrroomm ssttddiinn""));; PPQQppuuttlliinnee((ccoonnnn,, ""33<>hheelllloo wwoorrlldd<>44..55\\nn""));; PPQQppuuttlliinnee((ccoonnnn,,""44<>ggooooddbbyyee wwoorrlldd<>77..1111\\nn""));; ...... PPQQppuuttlliinnee((ccoonnnn,,""..\\nn""));; PPQQeennddccooppyy((ccoonnnn));; 1122..77.. LLIIBBPPQQ TTrraacciinngg FFuunnccttiioonnss PPQQttrraaccee Enable tracing of the frontend/backend communica- tion to a debugging file stream. vvooiidd PPQQttrraaccee((PPGGccoonnnn **ccoonnnn FFIILLEE **ddeebbuugg__ppoorrtt)) 5566 PPQQuunnttrraaccee Disable tracing started by _P_Q_t_r_a_c_e vvooiidd PPQQuunnttrraaccee((PPGGccoonnnn **ccoonnnn)) 1122..88.. UUsseerr AAuutthheennttiiccaattiioonn FFuunnccttiioonnss If the user has generated the appropriate authentica- tion credentials (e.g., obtaining _K_e_r_b_e_r_o_s tickets), the frontend/backend authentication process is handled by _P_Q_e_x_e_c without any further intervention. The fol- lowing routines may be called by LIBPQ programs to tai- lor the behavior of the authentication process. ffee__ggeettaauutthhnnaammee Returns a pointer to static space containing what- ever name the user has authenticated. Use of this routine in place of calls to _g_e_t_e_n_v(3) or _g_e_t_p_- _w_u_i_d(3) by applications is highly recommended, as it is entirely possible that the authenticated user name is nnoott the same as value of the UUSSEERR environment variable or the user's entry in _/_e_t_c_/_p_a_s_s_w_d. cchhaarr **ffee__ggeettaauutthhnnaammee((cchhaarr** eerrrroorrMMeessssaaggee)) ffee__sseettaauutthhssvvcc Specifies that LIBPQ should use authentication service _n_a_m_e rather than its compiled-in default. This value is typically taken from a command-line switch. vvooiidd ffee__sseettaauutthhssvvcc((cchhaarr **nnaammee,, cchhaarr** eerrrroorrMMeessssaaggee)) Any error messages from the authentication attempts are returned in the errorMessage argu- ment. 1122..99.. BBUUGGSS The query buffer is 8192 bytes long, and queries over that length will be silently truncated. 1122..1100.. SSaammppllee PPrrooggrraammss 5577 1122..1100..11.. SSaammppllee PPrrooggrraamm 11 //** ** tteessttlliibbppqq..cc ** TTeesstt tthhee CC vveerrssiioonn ooff LLIIBBPPQQ,, tthhee PPOOSSTTGGRREESS ffrroonntteenndd lliibbrraarryy.. ** ** **// ##iinncclluuddee <> ##iinncclluuddee ""lliibbppqq--ffee..hh"" vvooiidd eexxiitt__nniicceellyy((PPGGccoonnnn** ccoonnnn)) {{ PPQQffiinniisshh((ccoonnnn));; eexxiitt((11));; }} mmaaiinn(()) {{ cchhaarr **ppgghhoosstt,, **ppggppoorrtt,, **ppggooppttiioonnss,, **ppggttttyy;; cchhaarr** ddbbNNaammee;; iinntt nnFFiieellddss;; iinntt ii,,jj;; //** FFIILLEE **ddeebbuugg;; **// PPGGccoonnnn** ccoonnnn;; PPGGrreessuulltt** rreess;; //** bbeeggiinn,, bbyy sseettttiinngg tthhee ppaarraammeetteerrss ffoorr aa bbaacckkeenndd ccoonnnneeccttiioonn iiff tthhee ppaarraammeetteerrss aarree nnuullll,, tthheenn tthhee ssyysstteemm wwiillll ttrryy ttoo uussee rreeaassoonnaabbllee ddeeffaauullttss bbyy llooookkiinngg uupp eennvviirroonnmmeenntt vvaarriiaabblleess oorr,, ffaaiilliinngg tthhaatt,, uussiinngg hhaarrddwwiirreedd ccoonnssttaannttss **// ppgghhoosstt == NNUULLLL;; //** hhoosstt nnaammee ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggppoorrtt == NNUULLLL;; //** ppoorrtt ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggooppttiioonnss == NNUULLLL;; //** ssppeecciiaall ooppttiioonnss ttoo ssttaarrtt uupp tthhee bbaacckkeenndd sseerrvveerr **// ppggttttyy == NNUULLLL;; //** ddeebbuuggggiinngg ttttyy ffoorr tthhee bbaacckkeenndd sseerrvveerr **// ddbbNNaammee == ""tteemmppllaattee11"";; //** mmaakkee aa ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee **// ccoonnnn == PPQQsseettddbb((ppgghhoosstt,, ppggppoorrtt,, ppggooppttiioonnss,, ppggttttyy,, ddbbNNaammee));; //** cchheecckk ttoo sseeee tthhaatt tthhee bbaacckkeenndd ccoonnnneeccttiioonn wwaass ssuucccceessssffuullllyy mmaaddee **// iiff ((PPQQssttaattuuss((ccoonnnn)) ==== CCOONNNNEECCTTIIOONN__BBAADD)) {{ ffpprriinnttff((ssttddeerrrr,,""CCoonnnneeccttiioonn ttoo ddaattaabbaassee ''%%ss'' ffaaiilleedd..00,, ddbbNNaammee));; ffpprriinnttff((ssttddeerrrr,,""%%ss"",,PPQQeerrrroorrMMeessssaaggee((ccoonnnn))));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** ddeebbuugg == ffooppeenn((""//ttmmpp//ttrraaccee..oouutt"",,""ww""));; **// //** PPQQttrraaccee((ccoonnnn,, ddeebbuugg));; **// //** ssttaarrtt aa ttrraannssaaccttiioonn bblloocckk **// 5588 rreess == PPQQeexxeecc((ccoonnnn,,""BBEEGGIINN""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__CCOOMMMMAANNDD__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""BBEEGGIINN ccoommmmaanndd ffaaiilleedd00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** sshhoouulldd PPQQcclleeaarr PPGGrreessuulltt wwhheenneevveerr iitt iiss nnoo lloonnggeerr nneeeeddeedd ttoo aavvooiidd mmeemmoorryy lleeaakkss **// PPQQcclleeaarr((rreess));; //** ffeettcchh iinnssttaanncceess ffrroomm tthhee ppgg__ddaattaabbaassee,, tthhee ssyysstteemm ccaattaalloogg ooff ddaattaabbaasseess**// rreess == PPQQeexxeecc((ccoonnnn,,""DDEECCLLAARREE mmyyppoorrttaall CCUURRSSOORR FFOORR sseelleecctt ** ffrroomm ppgg__ddaattaabbaassee""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__CCOOMMMMAANNDD__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""DDEECCLLAARREE CCUURRSSOORR ccoommmmaanndd ffaaiilleedd00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} PPQQcclleeaarr((rreess));; rreess == PPQQeexxeecc((ccoonnnn,,""FFEETTCCHH AALLLL iinn mmyyppoorrttaall""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__TTUUPPLLEESS__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""FFEETTCCHH AALLLL ccoommmmaanndd ddiiddnn''tt rreettuurrnn ttuupplleess pprrooppeerrllyy00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** ffiirrsstt,, pprriinntt oouutt tthhee aattttrriibbuuttee nnaammeess **// nnFFiieellddss == PPQQnnffiieellddss((rreess));; ffoorr ((ii==00;; ii << nnFFiieellddss;; ii++++)) {{ pprriinnttff((""%%--1155ss"",,PPQQffnnaammee((rreess,,ii))));; }} pprriinnttff((""00));; //** nneexxtt,, pprriinntt oouutt tthhee iinnssttaanncceess **// ffoorr ((ii==00;; ii << PPQQnnttuupplleess((rreess));; ii++++)) {{ ffoorr ((jj==00 ;; jj << nnFFiieellddss;; jj++++)) {{ pprriinnttff((""%%--1155ss"",, PPQQggeettvvaalluuee((rreess,,ii,,jj))));; }} pprriinnttff((""00));; }} PPQQcclleeaarr((rreess));; //** cclloossee tthhee ppoorrttaall **// rreess == PPQQeexxeecc((ccoonnnn,, ""CCLLOOSSEE mmyyppoorrttaall""));; PPQQcclleeaarr((rreess));; //** eenndd tthhee ttrraannssaaccttiioonn **// rreess == PPQQeexxeecc((ccoonnnn,, ""EENNDD""));; PPQQcclleeaarr((rreess));; //** cclloossee tthhee ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee aanndd cclleeaannuupp **// PPQQffiinniisshh((ccoonnnn));; 5599 //** ffcclloossee((ddeebbuugg));; **// }} 6600 1122..1100..22.. SSaammppllee PPrrooggrraamm 22 //** ** tteessttlliibbppqq22..cc ** TTeesstt ooff tthhee aassyynncchhrroonnoouuss nnoottiiffiiccaattiioonn iinntteerrffaaccee ** ppooppuullaattee aa ddaattaabbaassee wwiitthh tthhee ffoolllloowwiinngg:: CCRREEAATTEE TTAABBLLEE TTBBLL11 ((ii iinntt44));; CCRREEAATTEE TTAABBLLEE TTBBLL22 ((ii iinntt44));; CCRREEAATTEE RRUULLEE rr11 AASS OONN IINNSSEERRTT TTOO TTBBLL11 DDOO [[IINNSSEERRTT IINNTTOO TTBBLL22 vvaalluueess ((nneeww..ii));; NNOOTTIIFFYY TTBBLL22]];; ** TThheenn ssttaarrtt uupp tthhiiss pprrooggrraamm ** AAfftteerr tthhee pprrooggrraamm hhaass bbeegguunn,, ddoo IINNSSEERRTT IINNTTOO TTBBLL11 vvaalluueess ((1100));; ** ** **// ##iinncclluuddee <> ##iinncclluuddee ""lliibbppqq--ffee..hh"" vvooiidd eexxiitt__nniicceellyy((PPGGccoonnnn** ccoonnnn)) {{ PPQQffiinniisshh((ccoonnnn));; eexxiitt((11));; }} mmaaiinn(()) {{ cchhaarr **ppgghhoosstt,, **ppggppoorrtt,, **ppggooppttiioonnss,, **ppggttttyy;; cchhaarr** ddbbNNaammee;; iinntt nnFFiieellddss;; iinntt ii,,jj;; PPGGccoonnnn** ccoonnnn;; PPGGrreessuulltt** rreess;; PPGGnnoottiiffyy** nnoottiiffyy;; //** bbeeggiinn,, bbyy sseettttiinngg tthhee ppaarraammeetteerrss ffoorr aa bbaacckkeenndd ccoonnnneeccttiioonn iiff tthhee ppaarraammeetteerrss aarree nnuullll,, tthheenn tthhee ssyysstteemm wwiillll ttrryy ttoo uussee rreeaassoonnaabbllee ddeeffaauullttss bbyy llooookkiinngg uupp eennvviirroonnmmeenntt vvaarriiaabblleess oorr,, ffaaiilliinngg tthhaatt,, uussiinngg hhaarrddwwiirreedd ccoonnssttaannttss **// ppgghhoosstt == NNUULLLL;; //** hhoosstt nnaammee ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggppoorrtt == NNUULLLL;; //** ppoorrtt ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggooppttiioonnss == NNUULLLL;; //** ssppeecciiaall ooppttiioonnss ttoo ssttaarrtt uupp tthhee bbaacckkeenndd sseerrvveerr **// ppggttttyy == NNUULLLL;; //** ddeebbuuggggiinngg ttttyy ffoorr tthhee bbaacckkeenndd sseerrvveerr **// ddbbNNaammee == ggeetteennvv((""UUSSEERR""));; //** cchhaannggee tthhiiss ttoo tthhee nnaammee ooff yyoouurr tteesstt ddaattaabbaassee**// //** mmaakkee aa ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee **// ccoonnnn == PPQQsseettddbb((ppgghhoosstt,, ppggppoorrtt,, ppggooppttiioonnss,, ppggttttyy,, ddbbNNaammee));; 6611 //** cchheecckk ttoo sseeee tthhaatt tthhee bbaacckkeenndd ccoonnnneeccttiioonn wwaass ssuucccceessssffuullllyy mmaaddee **// iiff ((PPQQssttaattuuss((ccoonnnn)) ==== CCOONNNNEECCTTIIOONN__BBAADD)) {{ ffpprriinnttff((ssttddeerrrr,,""CCoonnnneeccttiioonn ttoo ddaattaabbaassee ''%%ss'' ffaaiilleedd..00,, ddbbNNaammee));; ffpprriinnttff((ssttddeerrrr,,""%%ss"",,PPQQeerrrroorrMMeessssaaggee((ccoonnnn))));; eexxiitt__nniicceellyy((ccoonnnn));; }} rreess == PPQQeexxeecc((ccoonnnn,, ""LLIISSTTEENN TTBBLL22""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__CCOOMMMMAANNDD__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""LLIISSTTEENN ccoommmmaanndd ffaaiilleedd00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** sshhoouulldd PPQQcclleeaarr PPGGrreessuulltt wwhheenneevveerr iitt iiss nnoo lloonnggeerr nneeeeddeedd ttoo aavvooiidd mmeemmoorryy lleeaakkss **// PPQQcclleeaarr((rreess));; wwhhiillee ((11)) {{ //** aassyynncc nnoottiiffiiccaattiioonn oonnllyy ccoommee bbaacckk aass aa rreessuulltt ooff aa qquueerryy**// //** wwee ccaann sseenndd eemmppttyy qquueerriieess **// rreess == PPQQeexxeecc((ccoonnnn,, "" ""));; //** pprriinnttff((""rreess-->>ssttaattuuss == %%ss00,, ppggrreessSSttaattuuss[[PPQQrreessuullttSSttaattuuss((rreess))]]));; **// //** cchheecckk ffoorr aassyynncchhrroonnoouuss rreettuurrnnss **// nnoottiiffyy == PPQQnnoottiiffiieess((ccoonnnn));; iiff ((nnoottiiffyy)) {{ ffpprriinnttff((ssttddeerrrr,, ""AASSYYNNCC NNOOTTIIFFYY ooff ''%%ss'' ffrroomm bbaacckkeenndd ppiidd ''%%dd'' rreecceeiivveedd00,, nnoottiiffyy-->>rreellnnaammee,, nnoottiiffyy-->>bbee__ppiidd));; ffrreeee((nnoottiiffyy));; bbrreeaakk;; }} PPQQcclleeaarr((rreess));; }} //** cclloossee tthhee ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee aanndd cclleeaannuupp **// PPQQffiinniisshh((ccoonnnn));; }} 6622 1122..1100..33.. SSaammppllee PPrrooggrraamm 33 //** ** tteessttlliibbppqq33..cc ** TTeesstt tthhee CC vveerrssiioonn ooff LLIIBBPPQQ,, tthhee PPOOSSTTGGRREESS ffrroonntteenndd lliibbrraarryy.. ** tteessttss tthhee bbiinnaarryy ccuurrssoorr iinntteerrffaaccee ** ** ** ppooppuullaattee aa ddaattaabbaassee bbyy ddooiinngg tthhee ffoolllloowwiinngg:: CCRREEAATTEE TTAABBLLEE tteesstt11 ((ii iinntt44,, dd ffllooaatt44,, pp ppoollyyggoonn));; IINNSSEERRTT IINNTTOO tteesstt11 vvaalluueess ((11,, 33..556677,, ''((33..00,, 44..00,, 11..00,, 22..00))''::::ppoollyyggoonn));; IINNSSEERRTT IINNTTOO tteesstt11 vvaalluueess ((22,, 8899..0055,, ''((44..00,, 33..00,, 22..00,, 11..00))''::::ppoollyyggoonn));; tthhee eexxppeecctteedd oouuttppuutt iiss:: ttuuppllee 00:: ggoott ii == ((44 bbyytteess)) 11,, dd == ((44 bbyytteess)) 33..556677000000,, pp == ((44 bbyytteess)) 22 ppooiinnttss bboouunnddbbooxx == ((hhii==33..000000000000//44..000000000000,, lloo == 11..000000000000,,22..000000000000)) ttuuppllee 11:: ggoott ii == ((44 bbyytteess)) 22,, dd == ((44 bbyytteess)) 8899..005500000033,, pp == ((44 bbyytteess)) 22 ppooiinnttss bboouunnddbbooxx == ((hhii==44..000000000000//33..000000000000,, lloo == 22..000000000000,,11..000000000000)) ** **// ##iinncclluuddee <> ##iinncclluuddee ""lliibbppqq--ffee..hh"" ##iinncclluuddee ""uuttiillss//ggeeoo--ddeeccllss..hh"" //** ffoorr tthhee PPOOLLYYGGOONN ttyyppee **// vvooiidd eexxiitt__nniicceellyy((PPGGccoonnnn** ccoonnnn)) {{ PPQQffiinniisshh((ccoonnnn));; eexxiitt((11));; }} mmaaiinn(()) {{ cchhaarr **ppgghhoosstt,, **ppggppoorrtt,, **ppggooppttiioonnss,, **ppggttttyy;; cchhaarr** ddbbNNaammee;; iinntt nnFFiieellddss;; iinntt ii,,jj;; iinntt ii__ffnnuumm,, dd__ffnnuumm,, pp__ffnnuumm;; PPGGccoonnnn** ccoonnnn;; PPGGrreessuulltt** rreess;; //** bbeeggiinn,, bbyy sseettttiinngg tthhee ppaarraammeetteerrss ffoorr aa bbaacckkeenndd ccoonnnneeccttiioonn iiff tthhee ppaarraammeetteerrss aarree nnuullll,, tthheenn tthhee ssyysstteemm wwiillll ttrryy ttoo uussee rreeaassoonnaabbllee ddeeffaauullttss bbyy llooookkiinngg uupp eennvviirroonnmmeenntt vvaarriiaabblleess 6633 oorr,, ffaaiilliinngg tthhaatt,, uussiinngg hhaarrddwwiirreedd ccoonnssttaannttss **// ppgghhoosstt == NNUULLLL;; //** hhoosstt nnaammee ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggppoorrtt == NNUULLLL;; //** ppoorrtt ooff tthhee bbaacckkeenndd sseerrvveerr **// ppggooppttiioonnss == NNUULLLL;; //** ssppeecciiaall ooppttiioonnss ttoo ssttaarrtt uupp tthhee bbaacckkeenndd sseerrvveerr **// ppggttttyy == NNUULLLL;; //** ddeebbuuggggiinngg ttttyy ffoorr tthhee bbaacckkeenndd sseerrvveerr **// ddbbNNaammee == ggeetteennvv((""UUSSEERR""));; //** cchhaannggee tthhiiss ttoo tthhee nnaammee ooff yyoouurr tteesstt ddaattaabbaassee**// //** mmaakkee aa ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee **// ccoonnnn == PPQQsseettddbb((ppgghhoosstt,, ppggppoorrtt,, ppggooppttiioonnss,, ppggttttyy,, ddbbNNaammee));; //** cchheecckk ttoo sseeee tthhaatt tthhee bbaacckkeenndd ccoonnnneeccttiioonn wwaass ssuucccceessssffuullllyy mmaaddee **// iiff ((PPQQssttaattuuss((ccoonnnn)) ==== CCOONNNNEECCTTIIOONN__BBAADD)) {{ ffpprriinnttff((ssttddeerrrr,,""CCoonnnneeccttiioonn ttoo ddaattaabbaassee ''%%ss'' ffaaiilleedd..00,, ddbbNNaammee));; ffpprriinnttff((ssttddeerrrr,,""%%ss"",,PPQQeerrrroorrMMeessssaaggee((ccoonnnn))));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** ssttaarrtt aa ttrraannssaaccttiioonn bblloocckk **// rreess == PPQQeexxeecc((ccoonnnn,,""BBEEGGIINN""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__CCOOMMMMAANNDD__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""BBEEGGIINN ccoommmmaanndd ffaaiilleedd00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} //** sshhoouulldd PPQQcclleeaarr PPGGrreessuulltt wwhheenneevveerr iitt iiss nnoo lloonnggeerr nneeeeddeedd ttoo aavvooiidd mmeemmoorryy lleeaakkss **// PPQQcclleeaarr((rreess));; //** ffeettcchh iinnssttaanncceess ffrroomm tthhee ppgg__ddaattaabbaassee,, tthhee ssyysstteemm ccaattaalloogg ooff ddaattaabbaasseess**// rreess == PPQQeexxeecc((ccoonnnn,,""DDEECCLLAARREE mmyyccuurrssoorr BBIINNAARRYY CCUURRSSOORR FFOORR sseelleecctt ** ffrroomm tteesstt11""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__CCOOMMMMAANNDD__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""DDEECCLLAARREE CCUURRSSOORR ccoommmmaanndd ffaaiilleedd00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} PPQQcclleeaarr((rreess));; rreess == PPQQeexxeecc((ccoonnnn,,""FFEETTCCHH AALLLL iinn mmyyccuurrssoorr""));; iiff ((PPQQrreessuullttSSttaattuuss((rreess)) !!== PPGGRREESS__TTUUPPLLEESS__OOKK)) {{ ffpprriinnttff((ssttddeerrrr,,""FFEETTCCHH AALLLL ccoommmmaanndd ddiiddnn''tt rreettuurrnn ttuupplleess pprrooppeerrllyy00));; PPQQcclleeaarr((rreess));; eexxiitt__nniicceellyy((ccoonnnn));; }} ii__ffnnuumm == PPQQffnnuummbbeerr((rreess,,""ii""));; dd__ffnnuumm == PPQQffnnuummbbeerr((rreess,,""dd""));; pp__ffnnuumm == PPQQffnnuummbbeerr((rreess,,""pp""));; ffoorr ((ii==00;;ii<<33;;ii++++)) {{ pprriinnttff((""ttyyppee[[%%dd]] == %%dd,, ssiizzee[[%%dd]] == %%dd00,, ii,, PPQQffttyyppee((rreess,,ii)),, ii,, PPQQffssiizzee((rreess,,ii))));; 6644 }} ffoorr ((ii==00;; ii << PPQQnnttuupplleess((rreess));; ii++++)) {{ iinntt **iivvaall;; ffllooaatt **ddvvaall;; iinntt pplleenn;; PPOOLLYYGGOONN** ppvvaall;; //** wwee hhaarrdd--wwiirree tthhiiss ttoo tthhee 33 ffiieellddss wwee kknnooww aabboouutt **// iivvaall == ((iinntt**))PPQQggeettvvaalluuee((rreess,,ii,,ii__ffnnuumm));; ddvvaall == ((ffllooaatt**))PPQQggeettvvaalluuee((rreess,,ii,,dd__ffnnuumm));; pplleenn == PPQQggeettlleennggtthh((rreess,,ii,,pp__ffnnuumm));; //** pplleenn ddooeessnn''tt iinncclluuddee tthhee lleennggtthh ffiieelldd ssoo nneeeedd ttoo iinnccrreemmeenntt bbyy VVAARRHHDDSSZZ**// ppvvaall == ((PPOOLLYYGGOONN**)) mmaalllloocc((pplleenn ++ VVAARRHHDDRRSSZZ));; ppvvaall-->>ssiizzee == pplleenn;; mmeemmmmoovvee((((cchhaarr**))&&ppvvaall-->>nnppttss,, PPQQggeettvvaalluuee((rreess,,ii,,pp__ffnnuumm)),, pplleenn));; pprriinnttff((""ttuuppllee %%dd:: ggoott00,, ii));; pprriinnttff(("" ii == ((%%dd bbyytteess)) %%dd,,00,, PPQQggeettlleennggtthh((rreess,,ii,,ii__ffnnuumm)),, **iivvaall));; pprriinnttff(("" dd == ((%%dd bbyytteess)) %%ff,,00,, PPQQggeettlleennggtthh((rreess,,ii,,dd__ffnnuumm)),, **ddvvaall));; pprriinnttff(("" pp == ((%%dd bbyytteess)) %%dd ppooiinnttss bboouunnddbbooxx == ((hhii==%%ff//%%ff,, lloo == %%ff,,%%ff))00,, PPQQggeettlleennggtthh((rreess,,ii,,dd__ffnnuumm)),, ppvvaall-->>nnppttss,, ppvvaall-->>bboouunnddbbooxx..xxhh,, ppvvaall-->>bboouunnddbbooxx..yyhh,, ppvvaall-->>bboouunnddbbooxx..xxll,, ppvvaall-->>bboouunnddbbooxx..yyll));; }} PPQQcclleeaarr((rreess));; //** cclloossee tthhee ppoorrttaall **// rreess == PPQQeexxeecc((ccoonnnn,, ""CCLLOOSSEE mmyyccuurrssoorr""));; PPQQcclleeaarr((rreess));; //** eenndd tthhee ttrraannssaaccttiioonn **// rreess == PPQQeexxeecc((ccoonnnn,, ""EENNDD""));; PPQQcclleeaarr((rreess));; //** cclloossee tthhee ccoonnnneeccttiioonn ttoo tthhee ddaattaabbaassee aanndd cclleeaannuupp **// PPQQffiinniisshh((ccoonnnn));; }} 6655 ________________________________________________________________________________________________________________________ 1133.. LLAARRGGEE OOBBJJEECCTTSS ________________________________________________________________________________________________________________________ In POSTGRES, data values are stored in tuples and indi- vidual tuples cannot span data pages. Since the size of a data page is 8192 bytes, the upper limit on the size of a data value is relatively low. To support the stor- age of larger atomic values, POSTGRES provides a large object interface. This interface provides file- oriented access to user data that has been declared to be a large type. This section describes the implementation and the pro- grammatic and query language interfaces to POSTGRES large object data. 1133..11.. HHiissttoorriiccaall NNoottee Originally, POSTGRES 4.2 supports three standard imple- mentations of large objects: as files external to POST- GRES, as UNIX files managed by POSTGRES, and as data stored within the POSTGRES database. It causes consid- erable confusion among users. As a result, we only sup- port large objects as data stored within the POSTGRES database in POSTGRES95. Even though is is slower to access, it provides stricter data integrity and time travel. For historical reasons, they are called Inver- sion large objects. (We will use Inversion and large objects interchangeably to mean the same thing in this section.) 1133..22.. IInnvveerrssiioonn LLaarrggee OObbjjeeccttss The Inversion large object implementation breaks large objects up into "chunks" and stores the chunks in tuples in the database. A B-tree index guarantees fast searches for the correct chunk number when doing random access reads and writes. 1133..33.. LLaarrggee OObbjjeecctt IInntteerrffaacceess The facilities POSTGRES provides to access large objects, both in the backend as part of user-defined functions or the front end as part of an application using the interface, are described below. (For users familiar with POSTGRES 4.2, POSTGRES95 has a new set of functions providing a more coherent interface. The interface is the same for dynamically-loaded C func- tions as well as for . The POSTGRES large object interface is modeled after the UNIX file system interface, with analogues of _o_p_e_n(2), _r_e_a_d(2), _w_r_i_t_e(2), _l_s_e_e_k(2), etc. User func- tions call these routines to retrieve only the data of 6666 interest from a large object. For example, if a large object type called _m_u_g_s_h_o_t existed that stored pho- tographs of faces, then a function called _b_e_a_r_d could be declared on _m_u_g_s_h_o_t data. _B_e_a_r_d could look at the lower third of a photograph, and determine the color of the beard that appeared there, if any. The entire large object value need not be buffered, or even exam- ined, by the _b_e_a_r_d function. Large objects may be accessed from dynamically-loaded C functions or database client programs that link the library. POSTGRES provides a set of routines that sup- port opening, reading, writing, closing, and seeking on large objects. 1133..33..11.. CCrreeaattiinngg aa LLaarrggee OObbjjeecctt The routine OOiidd lloo__ccrreeaatt((PPGGccoonnnn **ccoonnnn,, iinntt mmooddee)) creates a new large object. The _m_o_d_e is a bitmask describing several different attributes of the new object. The symbolic constants listed here are defined in //uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//bbaacckkeenndd//lliibbppqq//lliibbppqq--ffss..hh The access type (read, write, or both) is controlled by OR ing together the bits INV_READ and INV_WRITE. If the large object should be archived -- that is, if his- torical versions of it should be moved periodically to a special archive relation -- then the INV_ARCHIVE bit should be set. The low-order sixteen bits of _m_a_s_k are the storage manager number on which the large object should reside. For sites other than Berkeley, these bits should always be zero. The commands below create an (Inversion) large object: iinnvv__ooiidd == lloo__ccrreeaatt((IINNVV__RREEAADD||IINNVV__WWRRIITTEE||IINNVV__AARRCCHHIIVVEE));; 1133..33..22.. IImmppoorrttiinngg aa LLaarrggee OObbjjeecctt To import a UNIX file as a large object, call OOiidd lloo__iimmppoorrtt((PPGGccoonnnn **ccoonnnn,, tteexxtt **ffiilleennaammee)) The _f_i_l_e_n_a_m_e argument specifies the UNIX pathname of the file to be imported as a large object. 1133..33..33.. EExxppoorrttiinngg aa LLaarrggee OObbjjeecctt To export a large object into UNIX file, call iinntt 6677 lloo__eexxppoorrtt((PPGGccoonnnn **ccoonnnn,, OOiidd lloobbjjIIdd,, tteexxtt **ffiilleennaammee)) The _l_o_b_j_I_d argument specifies the Oid of the large object to export and the _f_i_l_e_n_a_m_e argument specifies the UNIX pathname of the file. 1133..33..44.. OOppeenniinngg aann EExxiissttiinngg LLaarrggee OObbjjeecctt To open an existing large object, call iinntt lloo__ooppeenn((PPGGccoonnnn **ccoonnnn,, OOiidd lloobbjjIIdd,, iinntt mmooddee,, ......)) The _l_o_b_j_I_d argument specifies the Oid of the large object to open. The mode bits control whether the object is opened for reading INV_READ), writing or both. A large object cannot be opened before it is created. lloo__ooppeenn returns a large object descriptor for later use in lloo__rreeaadd, lloo__wwrriittee, lloo__llsseeeekk, lloo__tteellll, and lloo__cclloossee. 1133..33..55.. WWrriittiinngg DDaattaa ttoo aa LLaarrggee OObbjjeecctt The routine iinntt lloo__wwrriittee((PPGGccoonnnn **ccoonnnn,, iinntt ffdd,, cchhaarr **bbuuff,, iinntt lleenn)) writes _l_e_n bytes from _b_u_f to large object _f_d. The _f_d argument must have been returned by a previous _l_o___o_p_e_n. The number of bytes actually written is returned. In the event of an error, the return value is negative. 1133..33..66.. SSeeeekkiinngg oonn aa LLaarrggee OObbjjeecctt To change the current read or write location on a large object, call iinntt lloo__llsseeeekk((PPGGccoonnnn **ccoonnnn,, iinntt ffdd,, iinntt ooffffsseett,, iinntt wwhheennccee)) This routine moves the current location pointer for the large object described by _f_d to the new location speci- fied by _o_f_f_s_e_t. The valid values for .i whence are SSEEEEKK__SSEETT SSEEEEKK__CCUURR and SSEEEEKK__EENNDD.. 1133..33..77.. CClloossiinngg aa LLaarrggee OObbjjeecctt DDeessccrriippttoorr A large object may be closed by calling iinntt lloo__cclloossee((PPGGccoonnnn **ccoonnnn,, iinntt ffdd)) where _f_d is a large object descriptor returned by _l_o___o_p_e_n. On success, _l_o___c_l_o_s_e returns zero. On error, the return value is negative. 6688 1133..44.. BBuuiilltt iinn rreeggiisstteerreedd ffuunnccttiioonnss There are two built-in registered functions, _l_o___i_m_p_o_r_t and _l_o___e_x_p_o_r_t which are convenient for use in SQL queries. Here is an example of there use CCRREEAATTEE TTAABBLLEE iimmaaggee (( nnaammee tteexxtt,, rraasstteerr ooiidd ));; IINNSSEERRTT IINNTTOO iimmaaggee ((nnaammee,, rraasstteerr)) VVAALLUUEESS ((''bbeeaauuttiiffuull iimmaaggee'',, lloo__iimmppoorrtt((''//eettcc//mmoottdd''))));; SSEELLEECCTT lloo__eexxppoorrtt((iimmaaggee..rraasstteerr,, ""//ttmmpp//mmoottdd"")) ffrroomm iimmaaggee WWHHEERREE nnaammee == ''bbeeaauuttiiffuull iimmaaggee'';; 1133..55.. AAcccceessssiinngg LLaarrggee OObbjjeeccttss ffrroomm LLIIBBPPQQ Below is a sam- ple program which shows how the large object interface in LIBPQ can be used. Parts of the program are com- mented out but are left in the source for the readers benefit. This program can be found in ....//ssrrcc//tteesstt//eexxaammpplleess Frontend applications which use the large object inter- face in LIBPQ should include the header file lliibbppqq//lliibbppqq--ffss..hh and link with the lliibbppqq library. 6699 1133..66.. SSaammppllee PPrrooggrraamm //**-------------------------------------------------------------------------------------------------------------------------------------------------- ** ** tteessttlloo..cc---- ** tteesstt uussiinngg llaarrggee oobbjjeeccttss wwiitthh lliibbppqq ** ** CCooppyyrriigghhtt ((cc)) 11999944,, RReeggeennttss ooff tthhee UUnniivveerrssiittyy ooff CCaalliiffoorrnniiaa ** ** ** IIDDEENNTTIIFFIICCAATTIIOONN ** //uussrr//llooccaall//ddeevveell//ppgglliittee//ccvvss//ssrrcc//ddoocc//mmaannuuaall..mmee,,vv 11..1166 11999955//0099//0011 2233::5555::0000 jjoollllyy EExxpp ** **-------------------------------------------------------------------------------------------------------------------------------------------------- **// ##iinncclluuddee <> ##iinncclluuddee ""lliibbppqq--ffee..hh"" ##iinncclluuddee ""lliibbppqq//lliibbppqq--ffss..hh"" ##ddeeffiinnee BBUUFFSSIIZZEE 11002244 //** ** iimmppoorrttFFiillee -- ** iimmppoorrtt ffiillee ""iinn__ffiilleennaammee"" iinnttoo ddaattaabbaassee aass llaarrggee oobbjjeecctt ""lloobbjjOOiidd"" ** **// OOiidd iimmppoorrttFFiillee((PPGGccoonnnn **ccoonnnn,, cchhaarr **ffiilleennaammee)) {{ OOiidd lloobbjjIIdd;; iinntt lloobbjj__ffdd;; cchhaarr bbuuff[[BBUUFFSSIIZZEE]];; iinntt nnbbyytteess,, ttmmpp;; iinntt ffdd;; //** ** ooppeenn tthhee ffiillee ttoo bbee rreeaadd iinn **// ffdd == ooppeenn((ffiilleennaammee,, OO__RRDDOONNLLYY,, 00666666));; iiff ((ffdd << 00)) {{ //** eerrrroorr **// ffpprriinnttff((ssttddeerrrr,, ""ccaann''tt ooppeenn uunniixx ffiillee }} //** ** ccrreeaattee tthhee llaarrggee oobbjjeecctt **// lloobbjjIIdd == lloo__ccrreeaatt((ccoonnnn,, IINNVV__RREEAADD||IINNVV__WWRRIITTEE));; iiff ((lloobbjjIIdd ==== 00)) {{ ffpprriinnttff((ssttddeerrrr,, ""ccaann''tt ccrreeaattee llaarrggee oobbjjeecctt""));; }} lloobbjj__ffdd == lloo__ooppeenn((ccoonnnn,, lloobbjjIIdd,, IINNVV__WWRRIITTEE));; //** ** rreeaadd iinn ffrroomm tthhee UUnniixx ffiillee aanndd wwrriittee ttoo tthhee iinnvveerrssiioonn ffiillee 7700 **// wwhhiillee ((((nnbbyytteess == rreeaadd((ffdd,, bbuuff,, BBUUFFSSIIZZEE)))) >> 00)) {{ ttmmpp == lloo__wwrriittee((ccoonnnn,, lloobbjj__ffdd,, bbuuff,, nnbbyytteess));; iiff ((ttmmpp << nnbbyytteess)) {{ ffpprriinnttff((ssttddeerrrr,, ""eerrrroorr wwhhiillee rreeaaddiinngg }} }} ((vvooiidd)) cclloossee((ffdd));; ((vvooiidd)) lloo__cclloossee((ccoonnnn,, lloobbjj__ffdd));; rreettuurrnn lloobbjjIIdd;; }} vvooiidd ppiicckkoouutt((PPGGccoonnnn **ccoonnnn,, OOiidd lloobbjjIIdd,, iinntt ssttaarrtt,, iinntt lleenn)) {{ iinntt lloobbjj__ffdd;; cchhaarr** bbuuff;; iinntt nnbbyytteess;; iinntt nnrreeaadd;; lloobbjj__ffdd == lloo__ooppeenn((ccoonnnn,, lloobbjjIIdd,, IINNVV__RREEAADD));; iiff ((lloobbjj__ffdd << 00)) {{ ffpprriinnttff((ssttddeerrrr,,""ccaann''tt ooppeenn llaarrggee oobbjjeecctt %%dd"",, lloobbjjIIdd));; }} lloo__llsseeeekk((ccoonnnn,, lloobbjj__ffdd,, ssttaarrtt,, SSEEEEKK__SSEETT));; bbuuff == mmaalllloocc((lleenn++11));; nnrreeaadd == 00;; wwhhiillee ((lleenn -- nnrreeaadd >> 00)) {{ nnbbyytteess == lloo__rreeaadd((ccoonnnn,, lloobbjj__ffdd,, bbuuff,, lleenn -- nnrreeaadd));; bbuuff[[nnbbyytteess]] == '' '';; ffpprriinnttff((ssttddeerrrr,,"">>>>>> %%ss"",, bbuuff));; nnrreeaadd ++== nnbbyytteess;; }} ffpprriinnttff((ssttddeerrrr,,""00));; lloo__cclloossee((ccoonnnn,, lloobbjj__ffdd));; }} vvooiidd oovveerrwwrriittee((PPGGccoonnnn **ccoonnnn,, OOiidd lloobbjjIIdd,, iinntt ssttaarrtt,, iinntt lleenn)) {{ iinntt lloobbjj__ffdd;; cchhaarr** bbuuff;; iinntt nnbbyytteess;; iinntt nnwwrriitttteenn;; iinntt ii;; lloobbjj__ffdd == lloo__ooppeenn((ccoonnnn,, lloobbjjIIdd,, IINNVV__RREEAADD));; iiff ((lloobbjj__ffdd << 00)) {{ ffpprriinnttff((ssttddeerrrr,,""ccaann''tt ooppeenn llaarrggee oobbjjeecctt %%dd"",, lloobbjjIIdd));; 7711 }} lloo__llsseeeekk((ccoonnnn,, lloobbjj__ffdd,, ssttaarrtt,, SSEEEEKK__SSEETT));; bbuuff == mmaalllloocc((lleenn++11));; ffoorr ((ii==00;;ii<> 00)) {{ nnbbyytteess == lloo__wwrriittee((ccoonnnn,, lloobbjj__ffdd,, bbuuff ++ nnwwrriitttteenn,, lleenn -- nnwwrriitttteenn));; nnwwrriitttteenn ++== nnbbyytteess;; }} ffpprriinnttff((ssttddeerrrr,,""00));; lloo__cclloossee((ccoonnnn,, lloobbjj__ffdd));; }} //** ** eexxppoorrttFFiillee -- ** eexxppoorrtt llaarrggee oobbjjeecctt ""lloobbjjOOiidd"" ttoo ffiillee ""oouutt__ffiilleennaammee"" ** **// vvooiidd eexxppoorrttFFiillee((PPGGccoonnnn **ccoonnnn,, OOiidd lloobbjjIIdd,, cchhaarr **ffiilleennaammee)) {{ iinntt lloobbjj__ffdd;; cchhaarr bbuuff[[BBUUFFSSIIZZEE]];; iinntt nnbbyytteess,, ttmmpp;; iinntt ffdd;; //** ** ccrreeaattee aann iinnvveerrssiioonn ""oobbjjeecctt"" **// lloobbjj__ffdd == lloo__ooppeenn((ccoonnnn,, lloobbjjIIdd,, IINNVV__RREEAADD));; iiff ((lloobbjj__ffdd << 00)) {{ ffpprriinnttff((ssttddeerrrr,,""ccaann''tt ooppeenn llaarrggee oobbjjeecctt %%dd"",, lloobbjjIIdd));; }} //** ** ooppeenn tthhee ffiillee ttoo bbee wwrriitttteenn ttoo **// ffdd == ooppeenn((ffiilleennaammee,, OO__CCRREEAATT||OO__WWRROONNLLYY,, 00666666));; iiff ((ffdd << 00)) {{ //** eerrrroorr **// ffpprriinnttff((ssttddeerrrr,, ""ccaann''tt ooppeenn uunniixx ffiillee ffiilleennaammee));; }} //** ** rreeaadd iinn ffrroomm tthhee UUnniixx ffiillee aanndd wwrriittee ttoo tthhee iinnvveerrssiioonn ffiillee **// wwhhiillee ((((nnbbyytteess == lloo__rreeaadd((ccoonnnn,, lloobbjj__ffdd,, bbuuff,, BBUUFFSSIIZZEE)))) >> 00)) {{ 7722 ttmmpp == wwrriittee((ffdd,, bbuuff,, nnbbyytteess));; iiff ((ttmmpp << nnbbyytteess)) {{ ffpprriinnttff((ssttddeerrrr,,""eerrrroorr wwhhiillee wwrriittiinngg ffiilleennaammee));; }} }} ((vvooiidd)) lloo__cclloossee((ccoonnnn,, lloobbjj__ffdd));; ((vvooiidd)) cclloossee((ffdd));; rreettuurrnn;; }} vvooiidd eexxiitt__nniicceellyy((PPGGccoonnnn** ccoonnnn)) {{ PPQQffiinniisshh((ccoonnnn));; eexxiitt((11));; }} iinntt mmaaiinn((iinntt aarrggcc,, cchhaarr ****aarrggvv)) {{ cchhaarr **iinn__ffiilleennaammee,, **oouutt__ffiilleennaammee;; cchhaarr **ddaattaabbaassee;; OOiidd lloobbjjOOiidd;; PPGGccoonnnn **ccoonnnn;; PPGGrreessuulltt **rreess;; iiff ((aarrggcc !!== 44)) {{ ffpprriinnttff((ssttddeerrrr,, ""UUssaaggee:: %%ss ddaattaabbaassee__nnaammee iinn__ffiilleennaammee oouutt__ffiilleennaammee00,, aarrggvv[[00]]));; eexxiitt((11));; }} ddaattaabbaassee == aarrggvv[[11]];; iinn__ffiilleennaammee == aarrggvv[[22]];; oouutt__ffiilleennaammee == aarrggvv[[33]];; //** ** sseett uupp tthhee ccoonnnneeccttiioonn **// ccoonnnn == PPQQsseettddbb((NNUULLLL,, NNUULLLL,, NNUULLLL,, NNUULLLL,, ddaattaabbaassee));; //** cchheecckk ttoo sseeee tthhaatt tthhee bbaacckkeenndd ccoonnnneeccttiioonn wwaass ssuucccceessssffuullllyy mmaaddee **// iiff ((PPQQssttaattuuss((ccoonnnn)) ==== CCOONNNNEECCTTIIOONN__BBAADD)) {{ ffpprriinnttff((ssttddeerrrr,,""CCoonnnneeccttiioonn ttoo ddaattaabbaassee ''%%ss'' ffaaiilleedd..00,, ddaattaabbaassee));; ffpprriinnttff((ssttddeerrrr,,""%%ss"",,PPQQeerrrroorrMMeessssaaggee((ccoonnnn))));; eexxiitt__nniicceellyy((ccoonnnn));; }} rreess == PPQQeexxeecc((ccoonnnn,, ""bbeeggiinn""));; PPQQcclleeaarr((rreess));; 7733 pprriinnttff((""iimmppoorrttiinngg ffiillee //** lloobbjjOOiidd == iimmppoorrttFFiillee((ccoonnnn,, iinn__ffiilleennaammee));; **// lloobbjjOOiidd == lloo__iimmppoorrtt((ccoonnnn,, iinn__ffiilleennaammee));; //** pprriinnttff((""aass llaarrggee oobbjjeecctt %%dd..00,, lloobbjjOOiidd));; pprriinnttff((""ppiicckkiinngg oouutt bbyytteess 11000000--22000000 ooff tthhee llaarrggee oobbjjeecctt00));; ppiicckkoouutt((ccoonnnn,, lloobbjjOOiidd,, 11000000,, 11000000));; pprriinnttff((""oovveerrwwrriittiinngg bbyytteess 11000000--22000000 ooff tthhee llaarrggee oobbjjeecctt wwiitthh XX''ss00));; oovveerrwwrriittee((ccoonnnn,, lloobbjjOOiidd,, 11000000,, 11000000));; **// pprriinnttff((""eexxppoorrttiinngg llaarrggee oobbjjeecctt ttoo ffiillee //** eexxppoorrttFFiillee((ccoonnnn,, lloobbjjOOiidd,, oouutt__ffiilleennaammee));; **// lloo__eexxppoorrtt((ccoonnnn,, lloobbjjOOiidd,,oouutt__ffiilleennaammee));; rreess == PPQQeexxeecc((ccoonnnn,, ""eenndd""));; PPQQcclleeaarr((rreess));; PPQQffiinniisshh((ccoonnnn));; eexxiitt((00));; }} 7744 ________________________________________________________________________________________________________________________ 1144.. TTHHEE PPOOSSTTGGRREESS RRUULLEE SSYYSSTTEEMM ________________________________________________________________________________________________________________________ Production rule systems are conceptually simple, but there are many subtle points involved in actually using them. Consequently, we will not attempt to explain the actual syntax and operation of the POSTGRES rule system here. Instead, you should read [STON90b] to understand some of these points and the theoretical foundations of the POSTGRES rule system before trying to use rules. The discussion in this section is intended to provide an overview of the POSTGRES rule system and point the user at helpful references and examples. The "query rewrite" rule system modifies queries to take rules into consideration, and then passes the mod- ified query to the query optimizer for execution. It is very powerful, and can be used for many things such as query language procedures, views, and versions. The power of this rule system is discussed in [ONG90] as well as [STON90b]. 7755 ________________________________________________________________________________________________________________________ 1155.. AADDMMIINNIISSTTEERRIINNGG PPOOSSTTGGRREESS ________________________________________________________________________________________________________________________ In this section, we will discuss aspects of POSTGRES that are of interest to those who make extensive use of POSTGRES, or who are the site administrator for a group of POSTGRES users. 1155..11.. FFrreeqquueenntt TTaasskkss ________________________________________________________________________________________________________________________ Here we will briefly discuss some procedures that you should be familiar with in managing any POSTGRES installation. 1155..11..11.. SSttaarrttiinngg tthhee PPoossttmmaasstteerr If you did not install POSTGRES exactly as described in the installation instructions, you may have to perform some additional steps before starting the ppoossttmmaasstteerr process. +o Even if you were not the person who installed POST- GRES, you should understand the installation instructions. The installation instructions explain some important issues with respect to where POSTGRES places some important files, proper settings for environment variables, etc. that may vary from one version of POSTGRES to another. +o You _m_u_s_t start the ppoossttmmaasstteerr process with the user- id that owns the installed database files. In most cases, if you have followed the installation instructions, this will be the user "postgres". If you do not start the ppoossttmmaasstteerr with the right user- id, the backend servers that are started by the ppoossttmmaasstteerr will not be able to read the data. +o Make sure that //uussrr//llooccaall//ppoossttggrreess9955//bbiinn is in your shell command path, because the ppoossttmmaasstteerr will use your PPAATTHH to locate POSTGRES commands. +o Remember to set the environment variable PPGGDDAATTAA to the directory where the POSTGRES databases are installed. (This variable is more fully explained in the POSTGRES installation instructions.) +o If you do start the ppoossttmmaasstteerr using non-standard options, such as a different TCP port number, remem- ber to tell all users so that they can set their PPGGPPOORRTT environment variable correctly. 7766 1155..11..22.. SShhuuttttiinngg DDoowwnn tthhee PPoossttmmaasstteerr If you need to halt the ppoossttmmaasstteerr process, you can use the UNIX kkiillll(1) command. Some people habitually use the --99 or --KKIILLLL option; this should never be necessary and we do not recommend that you do this, as the ppoosstt-- mmaasstteerr will be unable to free its various shared resources, its child processes will be unable to exit gracefully, etc. 1155..11..33.. AAddddiinngg aanndd RReemmoovviinngg UUsseerrss The ccrreeaatteeuusseerr and ddeessttrrooyyuusseerr commands enable and dis- able access to POSTGRES by specific users on the host system. 1155..11..44.. PPeerriiooddiicc UUppkkeeeepp The vvaaccuuuumm command should be run on each database peri- odically. This command processes deleted instances9 and, more importantly, updates the system _s_t_a_t_i_s_t_i_c_s concerning the size of each class. If these statistics are permitted to become out-of-date and inaccurate, the POSTGRES query optimizer may make extremely poor deci- sions with respect to query evaluation strategies. Therefore, we recommend running vvaaccuuuumm every night or so (perhaps in a script that is executed by the UNIX ccrroonn(1) or aatt(1) commands). DDoo ffrreeqquueenntt bbaacckkuuppss. That is, you should either back up your database directories using the POSTGRES ccooppyy command and/or the UNIX dduummpp(1) or ttaarr(1) commands. You may think, "Why am I backing up my database? What about crash recovery?" One side effect of the POSTGRES "no overwrite" storage manager is that it is also a "no log" storage manager. That is, the database log stores only abort/commit data, and this is not enough informa- tion to recover the database if the storage medium (disk) or the database files are corrupted! In other words, if a disk block goes bad or POSTGRES happens to corrupt a database file, yyoouu ccaannnnoott rreeccoovveerr tthhaatt ffiillee. This can be disastrous if the file is one of the shared catalogs, such as ppgg__ddaattaabbaassee. 1155..11..55.. TTuunniinngg Once your users start to load a significant amount of data, you will typically run into performance problems. POSTGRES is not the fastest DBMS in the world, but many ____________________ 9 This may mean different things depending on the _a_r_c_h_i_v_e _m_o_d_e with which each class has been created. However, the current implementation of the vvaaccuuuumm command does _n_o_t per- form any compaction or clustering of data. Therefore, the UNIX files which store each POSTGRES class never shrink and the space "reclaimed" by vvaaccuuuumm is never actually reused. 7777 of the worst problems encountered by users are due to their lack of experience with any DBMS. Some general tips include: (1) Define indices over attributes that are commonly used for qualifications. For example, if you often execute queries of the form SSEELLEECCTT ** ffrroomm EEMMPP wwhheerree ssaallaarryy << 55000000 then a B-tree index on the ssaallaarryy attribute will probably be useful. If scans involving equality are more common, as in SSEELLEECCTT ** ffrroomm EEMMPP wwhheerree ssaallaarryy == 55000000 then you should consider defining a hash index on ssaallaarryy. You can define both, though it will use more disk space and may slow down updates a bit. Scans using indices are mmuucchh faster than sequential scans of the entire class. (2) Run the vvaaccuuuumm command a lot. This command updates the statistics that the query optimizer uses to make intelligent decisions; if the statistics are inaccurate, the system will make inordinately stupid decisions with respect to the way it joins and scans classes. (3) When specifying query qualfications (i.e., the wwhheerree part of the query), try to ensure that a clause involving a constant can be turned into one of the form _r_a_n_g_e___v_a_r_i_a_b_l_e _o_p_e_r_a_t_o_r _c_o_n_- _s_t_a_n_t, e.g., EEMMPP..ssaallaarryy == 55000000 The POSTGRES query optimizer will only use an index with a constant qualification of this form. It doesn't hurt to write the clause as 55000000 == EEMMPP..ssaallaarryy if the operator (in this case, ==) has a _c_o_m_m_u_t_a_- _t_o_r operator defined so that POSTGRES can rewrite the query into the desired form. How- ever, if such an operator does not exist, POST- GRES will never consider the use of an index. (4) When joining several classes together in one query, try to write the join clauses in a "chained" form, e.g., wwhheerree AA..aa == BB..bb aanndd BB..bb == CC..cc aanndd ...... Notice that relatively few clauses refer to a given class and attribute; the clauses form a 7788 linear sequence connecting the attributes, like links in a chain. This is preferable to a query written in a "star" form, such as wwhheerree AA..aa == BB..bb aanndd AA..aa == CC..cc aanndd ...... Here, many clauses refer to the same class and attribute (in this case, AA..aa). When presented with a query of this form, the POSTGRES query optimizer will tend to consider far more choices than it should and may run out of memory. (5) If you are really desperate to see what query plans look like, you can run the ppoossttmmaasstteerr with the --dd option and then run mmoonniittoorr with the --tt option. The format in which query plans will be printed is hard to read but you should be able to tell whether any index scans are being per- formed. 1155..22.. IInnffrreeqquueenntt TTaasskkss ________________________________________________________________________________________________________________________ At some time or another, every POSTGRES site adminis- trator has to perform all of the following actions. 1155..22..11.. CClleeaanniinngg UUpp AAfftteerr CCrraasshheess The ppoossttggrreess server and the ppoossttmmaasstteerr run as two dif- ferent processes. They may crash separately or together. The housekeeping procedures required to fix one kind of crash are different from those required to fix the other. The message you will usually see when the backend server crashes is: FFAATTAALL:: nnoo rreessppoonnssee ffrroomm bbaacckkeenndd:: ddeetteecctteedd iinn ...... This generally means one of two things: there is a bug in the POSTGRES server, or there is a bug in some user code that has been dynamically loaded into POSTGRES. You should be able to restart your application and resume processing, but there are some considerations: (1) POSTGRES usually dumps a core file (a snapshot of process memory used for debugging) in the database directory //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//<>//ccoorree on the server machine. If you don't want to try to debug the problem or produce a stack trace to report the bug to someone else, you can delete this file (which is probably around 10MB). (2) When one backend crashes in an uncontrolled way (i.e., without calling its built-in cleanup 7799 routines), the ppoossttmmaasstteerr will detect this situ- ation, kill all running servers and reinitialize the state shared among all backends (e.g., the shared buffer pool and locks). If your server crashed, you will get the "no response" message shown above. If your server was killed because someone else's server crashed, you will see the following message: II hhaavvee bbeeeenn ssiiggnnaalllleedd bbyy tthhee ppoossttmmaasstteerr.. SSoommee bbaacckkeenndd pprroocceessss hhaass ddiieedd uunneexxppeecctteeddllyy aanndd ppoossssiibbllyy ccoorrrruupptteedd sshhaarreedd mmeemmoorryy.. TThhee ccuurrrreenntt ttrraannssaaccttiioonn wwaass aabboorrtteedd,, aanndd II aamm ggooiinngg ttoo eexxiitt.. PPlleeaassee rreesseenndd tthhee llaasstt qquueerryy.. ---- TThhee ppoossttggrreess bbaacckkeenndd (3) Sometimes shared state is not completely cleaned up. Frontend applications may see errors of the form: WWAARRNN:: ccaannnnoott wwrriittee bblloocckk 3344 ooff mmyyccllaassss [[mmyyddbb]] bblliinndd In this case, you should kill the ppoossttmmaasstteerr and restart it. (4) When the system crashes while updating the sys- tem catalogs (e.g., when you are creating a class, defining an index, retrieving into a class, etc.) the B-tree indices defined on the catalogs are sometimes corrupted. The general (and non-unique) symptom is that aallll queries stop working. If you have tried all of the above steps and nothing else seems to work, try using the rreeiinnddeexxddbb command. If rreeiinnddeexxddbb suc- ceeds but things still don't work, you have another problem; if it fails, the system cata- logs themselves were almost certainly corrupted and you will have to go back to your backups. The ppoossttmmaasstteerr does not usually crash (it doesn't do very much except start servers) but it does happen on occasion. In addition, there are a few cases where it encounters problems during the reinitialization of shared resources. Specifically, there are race condi- tions where the operating system lets the ppoossttmmaasstteerr free shared resources but then will not permit it to reallocate the same amount of shared resources (even when there is no contention). You will typically have to run the iippcccclleeaann command if system errors cause the ppoossttmmaasstteerr to crash. If this happens, you may find (using the UNIX iippccss(1) command) that the "postgres" user has shared memory and/or semaphores allocated even though no ppoossttmmaasstteerr process is running. In this case, you should run iippcccclleeaann as the "postgres" user in order to deallocate these resources. Be warned that _a_l_l such resources owned by 8800 the "postgres" user will be deallocated. If you have multiple ppoossttmmaasstteerr processes running on the same machine, you should kill all of them before running iippcccclleeaann (otherwise, they will crash on their own when their shared resources are suddenly deallocated). 1155..22..22.. MMoovviinngg DDaattaabbaassee DDiirreeccttoorriieess By default, all POSTGRES databases are stored in sepa- rate subdirectories under //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee.10 At some point, you may find that you wish to move one or more databases to another location (e.g., to a filesystem with more free space). If you wish to move _a_l_l of your databases to the new location, you can simply: +o Kill the ppoossttmmaasstteerr. +o Copy the entire ddaattaa directory to the new location (making sure that the new files are owned by user "postgres"). %% ccpp --rrpp //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa //nneeww//ppllaaccee//ddaattaa +o Reset your PPGGDDAATTAA environment variable (as described earlier in this manual and in the installation instructions). ## uussiinngg ccsshh oorr ttccsshh...... %% sseetteennvv PPGGDDAATTAA //nneeww//ppllaaccee//ddaattaa ## uussiinngg sshh,, kksshh oorr bbaasshh...... %% PPGGDDAATTAA==//nneeww//ppllaaccee//ddaattaa;; eexxppoorrtt PPGGDDAATTAA +o Restart the ppoossttmmaasstteerr. %% ppoossttmmaasstteerr && +o After you run some queries and are sure that the newly-moved database works, you can remove the old ddaattaa directory. %% rrmm --rrff //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa To install a _s_i_n_g_l_e database in an alternate directory while leaving all other databases in place, do the fol- lowing: ____________________ 10 Data for certain classes may stored elsewhere if a non-standard storage manager was specified when they were created. Use of non-standard storage managers is an experi- mental feature that is not supported outside of Berkeley. 8811 +o Create the database (if it doesn't already exist) using the ccrreeaatteeddbb command. In the following steps we will assume the database is named ffoooo. +o Kill the ppoossttmmaasstteerr. +o Copy the directory //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//ffoooo and its contents to its ultimate destination. It should still be owned by the "postgres" user. %% ccpp --rrpp //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//ffoooo //nneeww//ppllaaccee//ffoooo +o Remove the directory //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//ffoooo: %% rrmm --rrff //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//ffoooo +o Make a symbolic link from //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee to the new direc- tory: %% llnn --ss //nneeww//ppllaaccee//ffoooo //uussrr//llooccaall//ppoossttggrreess9955//ddaattaa//bbaassee//ffoooo +o Restart the ppoossttmmaasstteerr. 1155..22..33.. UUppddaattiinngg DDaattaabbaasseess POSTGRES is a research system. In general, POSTGRES may not retain the same binary format for the storage of databases from release to release. Therefore, when you update your POSTGRES software, you will probably have to modify your databases as well. This is a com- mon occurrence with commercial database systems as well; unfortunately, unlike commercial systems, POST- GRES does not come with user-friendly utilities to make your life easier when these updates occur. In general, you must do the following to update your databases to a new software release: +o _E_x_t_e_n_s_i_o_n_s (such as user-defined types, functions, aggregates, etc.) must be reloaded by re-executing the SQL CCRREEAATTEE commands. See Appendix A for more details. +o _D_a_t_a must be dumped from the old classes into ASCII files (using the CCOOPPYY command), the new classes cre- ated in the new database (using the CCRREEAATTEETABLE com- mand), and the data reloaded from the ASCII files. +o _R_u_l_e_s and _v_i_e_w_s must also be reloaded by re- executing the various CCRREEAATTEE commands. You should give any new release a "trial period"; in particular, do not delete the old database until you are satisfied that there are no compatibility problems with the new software. For example, you do not want to discover that a bug in a type's "input" (conversion from ASCII) and "output" (conversion to ASCII) routines prevents you from reloading your data after you have 8822 destroyed your old databases! (This should be standard procedure when updating any software package, but some people try to economize on disk space without applying enough foresight.) 1155..33.. DDaattaabbaassee SSeeccuurriittyy ________________________________________________________________________________________________________________________ Most sites that use POSTGRES are educational or research institutions and do not pay much attention to security in their POSTGRES installations. If desired, one can install POSTGRES with additional security fea- tures. Naturally, such features come with additional administrative overhead that must be dealt with. 1155..33..11.. KKeerrbbeerrooss POSTGRES can be configured to use the MIT Kerberos net- work authentication system. This prevents outside users from connecting to your databases over the net- work without the correct authentication information. 1155..44.. QQuueerryyiinngg tthhee SSyysstteemm CCaattaallooggss ________________________________________________________________________________________________________________________ As an administrator (or sometimes as a plain user), you want to find out what extensions have been added to a given database. The queries listed below are "canned" queries that you can run on any database to get simple answers. Before executing any of the queries below, be sure to execute the POSTGRES vvaaccuuuumm command. (The queries will run much more quickly that way.) Also, note that these queries are also listed in //uussrr//llooccaall//ppoossttggrreess9955//ttuuttoorriiaall//ssyyssccaatt..ssqqll so use cut-and-paste (or the \\ii command) instead of doing a lot of typing. This query prints the names of all database adminstra- tors and the name of their database(s). SSEELLEECCTT uusseennaammee,, ddaattnnaammee FFRROOMM ppgg__uusseerr,, ppgg__ddaattaabbaassee WWHHEERREE uusseessyyssiidd == iinntt22iinn((iinntt44oouutt((ddaattddbbaa)))) OORRDDEERR BBYY uusseennaammee,, ddaattnnaammee;; This query lists all user-defined classes in the database. SSEELLEECCTT rreellnnaammee FFRROOMM ppgg__ccllaassss WWHHEERREE rreellkkiinndd == ''rr'' ---- nnoott iinnddiicceess aanndd rreellnnaammee !!~~ ''^^ppgg__'' ---- nnoott ccaattaallooggss aanndd rreellnnaammee !!~~ ''^^IInnvv'' ---- nnoott llaarrggee oobbjjeeccttss 8833 OORRDDEERR BBYY rreellnnaammee;; This query lists all simple indices (i.e., those that are not defined over a function of several attributes). SSEELLEECCTT bbcc..rreellnnaammee AASS ccllaassss__nnaammee,, iicc..rreellnnaammee AASS iinnddeexx__nnaammee,, aa..aattttnnaammee FFRROOMM ppgg__ccllaassss bbcc,, ---- bbaassee ccllaassss ppgg__ccllaassss iicc,, ---- iinnddeexx ccllaassss ppgg__iinnddeexx ii,, ppgg__aattttrriibbuuttee aa ---- aatttt iinn bbaassee WWHHEERREE ii..iinnddrreelliidd == bbcc..ooiidd aanndd ii..iinnddeexxrreelliidd == iicc..ooiidd aanndd ii..iinnddkkeeyy[[00]] == aa..aattttnnuumm aanndd aa..aattttrreelliidd == bbcc..ooiidd aanndd ii..iinnddpprroocc == ''00''::::ooiidd ---- nnoo ffuunnccttiioonnaall iinnddiicceess OORRDDEERR BBYY ccllaassss__nnaammee,, iinnddeexx__nnaammee,, aattttnnaammee;; This query prints a report of the user-defined attributes and their types for all user-defined classes in the database. SSEELLEECCTT cc..rreellnnaammee,, aa..aattttnnaammee,, tt..ttyyppnnaammee FFRROOMM ppgg__ccllaassss cc,, ppgg__aattttrriibbuuttee aa,, ppgg__ttyyppee tt WWHHEERREE cc..rreellkkiinndd == ''rr'' ---- nnoo iinnddiicceess aanndd cc..rreellnnaammee !!~~ ''^^ppgg__'' ---- nnoo ccaattaallooggss aanndd cc..rreellnnaammee !!~~ ''^^IInnvv'' ---- nnoo llaarrggee oobbjjeeccttss aanndd aa..aattttnnuumm >> 00 ---- nnoo ssyysstteemm aatttt''ss aanndd aa..aattttrreelliidd == cc..ooiidd aanndd aa..aattttttyyppiidd == tt..ooiidd OORRDDEERR BBYY rreellnnaammee,, aattttnnaammee;; This query lists all user-defined base types (not including array types). SSEELLEECCTT uu..uusseennaammee,, tt..ttyyppnnaammee FFRROOMM ppgg__ttyyppee tt,, ppgg__uusseerr uu WWHHEERREE uu..uusseessyyssiidd == iinntt22iinn((iinntt44oouutt((tt..ttyyppoowwnneerr)))) aanndd tt..ttyypprreelliidd == ''00''::::ooiidd ---- nnoo ccoommpplleexx ttyyppeess aanndd tt..ttyyppeelleemm == ''00''::::ooiidd ---- nnoo aarrrraayyss aanndd uu..uusseennaammee <<>> ''ppoossttggrreess'' OORRDDEERR BBYY uusseennaammee,, ttyyppnnaammee;; This query lists all left-unary (post-fix) operators. SSEELLEECCTT oo..oopprrnnaammee AASS lleefftt__uunnaarryy,, rriigghhtt..ttyyppnnaammee AASS ooppeerraanndd,, rreessuulltt..ttyyppnnaammee AASS rreettuurrnn__ttyyppee FFRROOMM ppgg__ooppeerraattoorr oo,, ppgg__ttyyppee rriigghhtt,, ppgg__ttyyppee rreessuulltt WWHHEERREE oo..oopprrkkiinndd == ''ll'' ---- lleefftt uunnaarryy aanndd oo..oopprrrriigghhtt == rriigghhtt..ooiidd aanndd oo..oopprrrreessuulltt == rreessuulltt..ooiidd 8844 OORRDDEERR BBYY ooppeerraanndd;; This query lists all right-unary (pre-fix) operators. SSEELLEECCTT oo..oopprrnnaammee AASS rriigghhtt__uunnaarryy,, lleefftt..ttyyppnnaammee AASS ooppeerraanndd,, rreessuulltt..ttyyppnnaammee AASS rreettuurrnn__ttyyppee FFRROOMM ppgg__ooppeerraattoorr oo,, ppgg__ttyyppee lleefftt,, ppgg__ttyyppee rreessuulltt WWHHEERREE oo..oopprrkkiinndd == ''rr'' ---- rriigghhtt uunnaarryy aanndd oo..oopprrlleefftt == lleefftt..ooiidd aanndd oo..oopprrrreessuulltt == rreessuulltt..ooiidd OORRDDEERR BBYY ooppeerraanndd;; This query lists all binary operators. SSEELLEECCTT oo..oopprrnnaammee AASS bbiinnaarryy__oopp,, lleefftt..ttyyppnnaammee AASS lleefftt__oopprr,, rriigghhtt..ttyyppnnaammee AASS rriigghhtt__oopprr,, rreessuulltt..ttyyppnnaammee AASS rreettuurrnn__ttyyppee FFRROOMM ppgg__ooppeerraattoorr oo,, ppgg__ttyyppee lleefftt,, ppgg__ttyyppee rriigghhtt,, ppgg__ttyyppee rreessuulltt WWHHEERREE oo..oopprrkkiinndd == ''bb'' ---- bbiinnaarryy aanndd oo..oopprrlleefftt == lleefftt..ooiidd aanndd oo..oopprrrriigghhtt == rriigghhtt..ooiidd aanndd oo..oopprrrreessuulltt == rreessuulltt..ooiidd OORRDDEERR BBYY lleefftt__oopprr,, rriigghhtt__oopprr;; This query returns the name, number of arguments (parameters) and return type of all user-defined C functions. The same query can be used to find all built-in C functions if you change the "C" to "inter- nal", or all SQL functions if you change the "C" to "postquel". SSEELLEECCTT pp..pprroonnaammee,, pp..pprroonnaarrggss,, tt..ttyyppnnaammee FFRROOMM ppgg__pprroocc pp,, ppgg__llaanngguuaaggee ll,, ppgg__ttyyppee tt WWHHEERREE pp..pprroollaanngg == ll..ooiidd aanndd pp..pprroorreettttyyppee == tt..ooiidd aanndd ll..llaannnnaammee == ''cc'' OORRDDEERR BBYY pprroonnaammee;; This query lists all of the aggregate functions that have been installed and the types to which they can be applied. ccoouunntt is not included because it can take any type as its argument. SSEELLEECCTT aa..aaggggnnaammee,, tt..ttyyppnnaammee FFRROOMM ppgg__aaggggrreeggaattee aa,, ppgg__ttyyppee tt WWHHEERREE aa..aaggggbbaasseettyyppee == tt..ooiidd OORRDDEERR BBYY aaggggnnaammee,, ttyyppnnaammee;; This query lists all of the operator classes that can be used with each access method as well as the opera- tors that can be used with the respective operator 8855 classes. SSEELLEECCTT aamm..aammnnaammee,, ooppcc..ooppccnnaammee,, oopprr..oopprrnnaammee FFRROOMM ppgg__aamm aamm,, ppgg__aammoopp aammoopp,, ppgg__ooppccllaassss ooppcc,, ppgg__ooppeerraattoorr oopprr WWHHEERREE aammoopp..aammooppiidd == aamm..ooiidd aanndd aammoopp..aammooppccllaaiidd == ooppcc..ooiidd aanndd aammoopp..aammooppoopprr == oopprr..ooiidd OORRDDEERR BBYY aammnnaammee,, ooppccnnaammee,, oopprrnnaammee;; 8866 ________________________________________________________________________________________________________________________ 1166.. RREEFFEERREENNCCEESS ________________________________________________________________________________________________________________________ [DATE93] Date, C. J. and Darwen, Hugh, A Guide to The SQL Standard, 3rd Edition, Reading, MA, June 1993. [MELT93] Melton, J. Understanding the New SQL, 1994. [ONG90] Ong, L. and Goh, J., ``A Unified Framework for Version Modeling Using Production Rules in a Database System," Electronics Research Laboratory, University of California, ERL Technical Memorandum M90/33, Berkeley, CA, April 1990. [ROWE87] Rowe, L. and Stonebraker, M., ``The POSTGRES Data Model,'' Proc. 1987 VLDB Conference, Brighton, England, Sept. 1987. [STON86] Stonebraker, M. and Rowe, L., ``The Design of POSTGRES,'' Proc. 1986 ACM-SIGMOD Confer- ence on Management of Data, Washington, DC, May 1986. [STON87a] Stonebraker, M., Hanson, E. and Hong, C.-H., ``The Design of the POSTGRES Rules System,'' Proc. 1987 IEEE Conference on Data Engineer- ing, Los Angeles, CA, Feb. 1987. [STON87b] Stonebraker, M., ``The POSTGRES Storage Sys- tem,'' Proc. 1987 VLDB Conference, Brighton, England, Sept. 1987. [STON89] Stonebraker, M., Hearst, M., and Potamianos, S., ``A Commentary on the POSTGRES Rules System,'' SIGMOD Record _1_8(3), Sept. 1989. [STON90a] Stonebraker, M., Rowe, L. A., and Hirohama, M., ``The Implementation of POSTGRES,'' IEEE Transactions on Knowledge and Data Engineer- ing _2(1), March 1990. [STON90b] Stonebraker, M. et al., ``On Rules, Proce- dures, Caching and Views in Database Sys- tems,'' Proc. 1990 ACM-SIGMOD Conference on Management of Data, Atlantic City, N.J., June 1990. 8877 ________________________________________________________________________________________________________________________ AAppppeennddiixx AA:: LLiinnkkiinngg DDyynnaammiiccaallllyy--LLooaaddeedd FFuunnccttiioonnss ________________________________________________________________________________________________________________________ After you have created and registered a user-defined function, your work is essentially done. POSTGRES, however, must load the _o_b_j_e_c_t _c_o_d_e (e.g., a ..oo file, or a shared library) that implements your function. As previously mentioned, POSTGRES loads your code at run- time, as required. In order to allow your code to be dynamically loaded, you may have to compile and link- edit it in a special way. This section briefly describes how to perform the compilation and link- editing required before you can load your user-defined functions into a running POSTGRES server. Note that _t_h_i_s _p_r_o_c_e_s_s _h_a_s _c_h_a_n_g_e_d _a_s _o_f _V_e_r_s_i_o_n _4_._2_.11 You should expect to read (and reread, and re-reread) the manual pages for the C compiler, cccc(1), and the link editor, lldd(1), if you have specific questions. In addition, the regression test suites in the directory //uussrr//llooccaall//ppoossttggrreess9955//ssrrcc//rreeggrreessss contain several work- ing examples of this process. If you copy what these tests do, you should not have any problems. The following terminology will be used below: Dynamic loading is what POSTGRES does to an object file. The object file is copied into the running POSTGRES server and the functions and variables within the file are made available to the functions within the POSTGRES process. POSTGRES does this using the dynamic loading mechanism provided by the operating system. ____________________ 11 The old POSTGRES dynamic loading mechanism required in-depth knowledge in terms of executable format, placement and alignment of executable instructions within memory, etc. on the part of the person writing the dynamic loader. Such loaders tended to be slow and buggy. As of Version 4.2, the POSTGRES dynamic loading mechanism has been rewritten to use the dynamic loading mechanism provided by the operating sys- tem. This approach is generally faster, more reliable and more portable than our previous dynamic loading mechanism. The reason for this is that nearly all modern versions of UNIX use a dynamic loading mechanism to implement shared li- braries and must therefore provide a fast and reliable mech- anism. On the other hand, the object file must be post- processed a bit before it can be loaded into POSTGRES. We hope that the large increase in speed and reliability will make up for the slight decrease in convenience. 8888 Loading and link editing is what you do to an object file in order to pro- duce another kind of object file (e.g., an exe- cutable program or a shared library). You perform this using the link editing program, lldd(1). The following general restrictions and notes also apply to the discussion below. +o Paths given to the ccrreeaattee ffuunnccttiioonn command must be absolute paths (i.e., start with "/") that refer to directories visible on the machine on which the POSTGRES server is running.12 +o The POSTGRES user must be able to traverse the path given to the ccrreeaattee ffuunnccttiioonn command and be able to read the object file. This is because the POSTGRES server runs as the POSTGRES user, not as the user who starts up the frontend process. (Making the file or a higher-level directory unreadable and/or unexecutable by the "postgres" user is an _e_x_t_r_e_m_e_l_y common mistake.) +o Symbol names defined within object files must not conflict with each other or with symbols defined in POSTGRES. +o The GNU C compiler usually does not provide the spe- cial options that are required to use the operating system's dynamic loader interface. In such cases, the C compiler that comes with the operating system must be used. UULLTTRRIIXX It is very easy to build dynamically-loaded object files under ULTRIX. ULTRIX does not have any shared- library mechanism and hence does not place any restric- tions on the dynamic loader interface. On the other hand, we had to (re)write a non-portable dynamic loader ourselves and could not use true shared libraries. Under ULTRIX, the only restriction is that you must produce each object file with the option --GG 00. (Notice that that's the numeral ``00'' and not the letter ``OO''). For example, ## ssiimmppllee UULLTTRRIIXX eexxaammppllee %% cccc --GG 00 --cc ffoooo..cc produces an object file called ffoooo..oo that can then be ____________________ 12 Relative paths do in fact work, but are relative to the directory where the database resides (which is generally invisible to the frontend application). Obviously, it makes no sense to make the path relative to the directory in which the user started the frontend application, since the server could be running on a completely different machine! 8899 dynamically loaded into POSTGRES. No additional load- ing or link-editing must be performed. DDEECC OOSSFF//11 Under DEC OSF/1, you can take any simple object file and produce a shared object file by running the lldd com- mand over it with the correct options. The commands to do this look like: ## ssiimmppllee DDEECC OOSSFF//11 eexxaammppllee %% cccc --cc ffoooo..cc %% lldd --sshhaarreedd --eexxppeecctt__uunnrreessoollvveedd ''**'' --oo ffoooo..ssoo ffoooo..oo The resulting shared object file can then be loaded into POSTGRES. When specifying the object file name to the ccrreeaattee ffuunnccttiioonn command, one must give it the name of the shared object file (ending in ..ssoo) rather than the simple object file.13 If the file you specify is not a shared object, the backend will hang! SSuunnOOSS 44..xx,, SSoollaarriiss 22..xx aanndd HHPP--UUXX Under both SunOS 4.x, Solaris 2.x and HP-UX, the simple object file must be created by compiling the source file with special compiler flags _a_n_d a shared library must be produced. The necessary steps with HP-UX are as follows. The ++zz flag to the HP-UX C compiler produces so-called "Posi- tion Independent Code" (PIC) and the ++uu flag removes some alignment restrictions that the PA-RISC architec- ture normally enforces. The object file must be turned into a shared library using the HP-UX link editor with the --bb option. This sounds complicated but is actually very simple, since the commands to do it are just: ## ssiimmppllee HHPP--UUXX eexxaammppllee %% cccc ++zz ++uu --cc ffoooo..cc %% lldd --bb --oo ffoooo..ssll ffoooo..oo ____________________ 13 Actually, POSTGRES does not care what you name the file as long as it is a shared object file. If you prefer to name your shared object files with the extension ..oo, this is fine with POSTGRES so long as you make sure that the cor- rect file name is given to the ccrreeaattee ffuunnccttiioonn command. In other words, you must simply be consistent. However, from a pragmatic point of view, we discourage this practice because you will undoubtedly confuse yourself with regards to which files have been made into shared object files and which have not. For example, it's very hard to write MMaakkeeffiillees to do the link-editing automatically if both the object file and the shared object file end in ..oo! 9900 As with the ..ssoo files mentioned in the last subsection, the ccrreeaattee ffuunnccttiioonn command must be told which file is the correct file to load (i.e., you must give it the location of the shared library, or ..ssll file). Under SunOS 4.x, the commands look like: ## ssiimmppllee SSuunnOOSS 44..xx eexxaammppllee %% cccc --PPIICC --cc ffoooo..cc %% lldd --ddcc --ddpp --BBddyynnaammiicc --oo ffoooo..ssoo ffoooo..oo and the equivalent lines under Solaris 2.x are: ## ssiimmppllee SSoollaarriiss 22..xx eexxaammppllee %% cccc --KK PPIICC --cc ffoooo..cc oorr %% ggcccc --ffPPIICC --cc ffoooo..cc %% lldd --GG --BBddyynnaammiicc --oo ffoooo..ssoo ffoooo..oo When linking shared libraries, you may have to specify some additional shared libraries (typically system libraries, such as the C and math libraries) on your lldd command line. 9911