.\" XXX standard disclaimer belongs here....
.\" $Header: RCS/introduction,v 1.2 90/07/19 15:11:49 claire Exp $
.XA 0 "Section 4 \*- Data Types, Functions, and Operators (Types)" _
.SP INTRODUCTION COMMANDS 1/16/90
.XA 1 "Introduction"
.uh OVERVIEW
.lp
In this portion of the manual, we describe the components of the query language \*(PQ, which is
available either from the terminal monitor or from an application
program via LIBPQ.  The main concepts in \*(PQ are types, functions
and rules.  In this introduction we describe each of these constructs.  Immediately
following this introduction, we discuss
the components of the \*(PQ language,
built-in types, and system types.  In the next portion of the manual 
the individual \*(PQ commands appear in alphabetical order.
.uh "KINDS OF TYPES"
.lp
\*(PP supports three kinds of types, namely 
.b base
types,
.b array
types,
and
.b composite
types.  The query language capabilities for each are different, and
we discuss them in turn.
.lp
Base types hold atomic data elements that appear to \*(PP
internals as uninterpreted byte strings.  Example base types
are integers and floating point numbers.  Indexes can 
be constructed for columns of relations containing base types and
such columns can be referenced using the conventional rel-name.column
addressing format. Moreover, functions and operators can be defined whose
operands are base types.  Lastly, base types can be added and dropped dynamically. 
.lp
There are three kinds of
.b base
types available in \*(PP.
.np
.b "Built-in types"
.br
These are data types that are used in the system catalogs.
Hence,
they must exist as \*(PP data types or the \*(PP system will not run.
Most of these types are
.q "hard wired"
into \*(PP so the system can boot.
.np
.b "System types"
.br
These are data types that are defined by the \*(PP system administrator.
They are automatically available for each data base that is created
on a \*(PP system.
The built-in and system data types can be changed by a system administrator by
making appropriate modifications to the file
.(l
\&.../files/local1_template1.bki
.)l
Each new data base automatically receives the collection of built-in and system
types specified in the above file at the time
the data base is created.
System types which are defined subsequently must 
be inserted into pre-existing data bases one-by-one as user defined types.
.br
Other template files may be constructed in files named
.(l
\&.../files/local1_template-name.bki
.)l
and then used by createdb with the -t flag.
See
.b bki
(files) and
.b createdb
(unix) for more information.
.np
.b "User types"
.br
These data types are defined dynamically by a user of a
data base.
Their scope is limited to the data base in which they are defined.
See
.b "define type"
(commands) for details on creating and using these types.
C functions, \*(PQ functions, aggregate functions, and operators can be defined for user types
using respectively the commands 
.b define
.b C function 
(commands),
.b define
.b POSTQUEL
.b function 
(commands),
.b define 
.b aggregate 
(commands),
and
.b define
.b operator 
(commands).
.lp
In addition \*(PP supports fixed and
variable length
.b arrays
of base types.
Whenever a new built-in, system or user type is constructed, \*(PP
automatically defines fixed and variable length arrays of this type as
additional types.  If B is a base type, then B[N] is an array of N instances
of B, while B[] is a variable length array of instances of B.  For example:
.(l
create emp (name = char16, age = int4, budget = int4[12], salary_history = float8[])
.)l
Here budget is an array of 12 integers while salary_history is a variable length array of
floating point numbers.  No sparse matrix techniques are applied to the storage of arrays;
rather elements are stored contiguously in a tuple.  
.lp
All operations available for base types are also available for arrays of base
types.  Moreover, conventional array addressing is automatically provided in
\*(PQ.  Hence, the i-th element of an array can be addressed as 
.(l
rel-name.column[i]
.)l
For example, the following query updates the April budget of joe:
.(l
replace emp (budget[4] = 95) where emp.name = "joe"
.)l
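.lp
As a sketch, assuming the emp relation created above and that an array
element may appear in a target list like any other column reference, the
following query would retrieve the April budget of joe:
.(l
retrieve (emp.budget[4]) where emp.name = "joe"
.)l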
.lp
There are also three kinds of \fBcomposite\fR types in \*(PQ.
.np
.b "Tuple in a specific relation"
.br
Whenever a relation is created, a type is automatically constructed
of the same name whose value is a tuple
in the indicated relation.
For example, if ``emp'' is created as a relation, then the type
\fBemp\fR
is automatically constructed.
This new type can be used in other relations, for example:
.(l
create dept (name = char16, budget = int4, mgr = \fBemp\fR)
.)l
Here the field \fBmgr\fR
is of type
.q emp
and refers to a tuple from the
.q emp
relation.
The value of the \fBmgr\fR field for each tuple is
a function which returns the type, emp.
For example, if f is a POSTQUEL function which accepts a character string
argument and returns the type, emp, then 
the following is a valid insert to dept:
.(l
append dept ( name = "toy", budget = 100000, mgr = f ("toy"))
.)l
In Version 2, only \*(PQ functions have the power to return tuples.
In the future C functions will be extended to have this capability.
.np
.b "Set of tuples in a specific relation"
.br
Whenever a relation, X,  is created, a type is automatically constructed, setof X,
whose value is a collection of tuples
in the indicated relation.
The following example illustrates this construct.
.(l
create dept (name = char16, budget = int4, emps = \fBsetof emp\fR)
.)l
Here the field \fBemps\fR
is of type
.q setof emp
and refers to a collection of tuples from the
.q emp
relation.
The value of the \fBemps\fR field for each tuple is
a function which returns the type, setof emp.
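.br
As a sketch only, suppose g is a hypothetical \*(PQ function (it is not
defined anywhere in this manual) which accepts a character string
argument and returns the type, setof emp.
By analogy with the \fBmgr\fR example above, an insert to this dept
relation might then be written:
.(l
append dept (name = "toy", budget = 100000, emps = g ("toy"))
.)l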
.np
.b "Any relation as a data type"
.br
The type
.b relation
is automatically available and allows the value of a field
in a relation to be 
an arbitrary collection of tuples from arbitrary relations.  
For example, consider the following emp relation:
.(l
create emp (name = char16, hobbies = relation)
.)l
Here, the value of hobbies for any employee is any collection of tuples from one or more
relations.  In fact, the actual value is a function which returns this type.
Assuming that f has been defined to return the relation type, the following
insert works correctly.
.(l
append to emp (name = "joe", hobbies = f("joe"))
.)l
.lp
For composite data types 
\*(PQ supports
.q "nested dot"
addressing.  Hence, the following query will find
the name of the manager of the shoe department:
.(l
retrieve (dept.mgr.name) where dept.name = "shoe"
.)l
Nested dot notation is explained in the
.b postquel
(postquel) section.
.uh "KINDS OF FUNCTIONS"
.lp
In \*(PP there are four kinds of functions that can be defined. 
.np
.b "Normal functions"
.br
Normal functions can be written either in C or in \*(PQ and then defined
to \*(PP using the 
.b define 
.b C 
.b function 
(commands) and 
.b define 
.b \*(PQ 
.b function
(commands) respectively. Normal functions take base or array types as arguments and return
base, array or composite types.
.br
Queries can include normal functions using the standard notation, e.g.:
.(l
retrieve (emp.name) where overpaid (emp.salary, emp.age)
.)l
Here, overpaid is a normal function accepting a floating point number and
an integer as arguments and returning a boolean.  Clauses in a qualification
containing normal functions cannot be optimized by \*(PP, and a sequential
scan of the associated relation will typically result.
.np
.b "Operators"
.br
Consider a normal function which takes two operands of the
same type and returns a boolean, e.g.:
.(l
retrieve (emp.name) where greater (emp.age, 25)
.)l
An operator can be associated with 
this function, say >, using the 
.b define 
.b operator 
(commands) command.
This command specifies the information that the optimizer needs to process
queries containing the operator token efficiently.  Hence, the query:
.(l
retrieve (emp.name) where emp.age > 25
.)l
can be optimized to use an age index, whereas the one with the function notation
cannot. 
.np
.b "Aggregate functions"
.br
Aggregate functions allow a \*(PP user to compute aggregates such as
count, sum and average.  Unfortunately, they do not work in Version 2.
.np
.b "Inheritable functions (methods)"
.br
If a function has a first argument which 
is of type \fBtuple\fR in some relation, then this function is inheritable.
Consider the following query:
.(l
retrieve (emp.name) where overpaid(emp)
.)l
Here overpaid takes an argument of type tuple in emp and returns a boolean.
Such functions can be written in C or POSTQUEL.  If written in C, they must access
fields in the argument tuple using special
.b accessor
.b functions
as described in the 
.b define 
.b C 
.b function 
(commands) section.  Inheritable functions
can be referenced
either using the functional notation above or using one of the column style
notations as follows:
.(l
retrieve (emp.name) where emp.overpaid
retrieve (emp.name) where emp.overpaid()
.)l
These latter notations emphasise the fact that overpaid effectively defines a new column,
overpaid, for the emp relation.  Moreover, if any relation
inherits from the emp relation, e.g., the pensionemp relation, then any inheritable functions
defined for emp are automatically defined for pensionemp.  Hence, the following
query automatically works:
.(l
retrieve (pensionemp.name) where overpaid (pensionemp)
.)l
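As an illustration, and assuming the column style notations apply to
inherited functions exactly as they do to functions defined directly on
a relation, the same query could presumably also be written:
.(l
retrieve (pensionemp.name) where pensionemp.overpaid
.)l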
Inheritable functions follow the conventions of the Common Lisp Object System (CLOS)
when a function can be inherited from multiple parents. 
.uh "RULES"
.lp
The third major concept in \*(PP is the notion of 
.b rules.
They have the form:
.(l
on condition
then do action
.)l
Rules can be used to 
.b trigger
DBMS actions, e.g.:
.(l
on update to emp.salary where emp.name = "mike"
then do replace emp (salary = new.salary) where emp.name = "joe"
.)l
When mike receives a salary adjustment, this rule propagates
the new salary on to joe.  An alternate rule which accomplishes
the same thing is:
.(l
on retrieve to emp.salary where emp.name = "joe"
then do instead retrieve (emp.salary) where emp.name = "mike"
.)l
This rule will retrieve the salary of mike in place of whatever
is stored in joe's record.  Rules can be used to assist
with the definition and maintenance of data in a table.  Moreover,
rules can sometimes be used in place of functions if the user wishes.  Hence
the following two commands have the effect of defining a column, overpaid.
.(l
add to emp (overpaid = boolean)

on retrieve to emp.overpaid
then do instead retrieve (overpaid = overpaid (current.salary, current.age))
.)l
This column will be inherited in the standard way, and the effect is the same
as an inheritable function.  The above solution allows the user
to add additional rules to further define the column, e.g.:
.(l
on update to emp.overpaid
then do ....
.)l
Such additional rules cannot be specified using the solution containing
a function definition.
