Return-Path: pg_adm@postgres.berkeley.edu
Received: by postgres.Berkeley.EDU (5.61/1.29)
	id AA04107; Fri, 1 Nov 91 13:55:51 -0800
Date: Fri, 1 Nov 91 13:55:51 -0800
Message-Id: <9111012155.AA04107@postgres.Berkeley.EDU>
From: cflatter@zia.AOC.NRAO.EDU (Chris Flatters)
Subject: thoughts on binary POSTGRES <-> client communication
To: postgres@postgres.berkeley.edu
Sender: pg_adm@postgres.berkeley.edu

My concern in a recent thread in this mail group is how to use POSTGRES
to support a scientific application that handles a large amount of data.
This involves reading the data in the database, performing computations
and updating the values of specific fields according to the results of
the computations.  After some thought, I think that I have come up
with a way to do this fairly efficiently in POSTGRES 3.0.  Since I
haven't seen this described before, I will outline the scheme here in case
somebody else finds it useful.

Suppose that we have an application that is to update a field `value' in
some relation `data'.  We can go about this in the following way.

1: retrieve the oids and values from data into a temporary relation (which
   I shall call t1).

	retrieve into t1 (data.oid, data.value)

2: copy this to the POSTGRES/client I/O stream

	copy binary t1 to stdout

   (using binary mode removes the overhead of converting to text)

3: read the data from the POSTGRES I/O stream in the client

4: compute

5: create a relation to hold the new values

   	create t2 (id = oid, value = <whatever>)

   (in this case we could delete the contents of t1 and reuse it but this
    is not normally true: for example the application may use the values
    of more than one field --- perhaps from more than one relation ---
    to calculate a new value for data.value)

6: write the new data to the POSTGRES I/O stream from the client

7: fill in t2

	copy binary t2 from stdin

8: create a new class, identical to t2

	create t3 (id = oid, value = <whatever>)

9: define a rule that updates data on an append to t3

	define instance rule upd_data is
        on append to t3
        do instead replace data(value = new.value) where data.oid = new.id

10: do an append

	append t3 (t2.all)

11: clean out the temporary relations and rules as appropriate.

I have not tried this from a client program yet but I have tested the
procedure out by hand and it does work.  In practice the recieve would
be restricted using a where clause so that the retrieved data will fit
into core memory.

I am not sure exactly what the tradeoffs are between this method and
the more straightforward method of retrieving data through a portal
and issuing queries to update values in the database.  However, I am
reasonably confident that this method will be more efficient where the
client is dealing with a large number of tuples.  At the least it allows
me to get data in and out of POSTGRES in binary form (important for
floating-point data since ASCII conversion can result in a loss of
accuracy) while retaining the ability to use POSTQUEL.

===============================================================================
Chris Flatters				| Internet: cflatter@nrao.edu
AIPS Scientific Programmer		| SPAN: NRAO::CFLATTER
P.O. Box 0				| Bitnet: cflatter@nrao
Socorro, NM 87801-0387			|
===============================================================================

