Delivery-Date: Thu, 25 Mar 93 10:37:27 -0800
Via: uk.ac.manchester.computer-science; Thu, 25 Mar 1993 18:20:03 +0000
Date: Thu, 25 Mar 93 18:01:43 GMT
From: Carole Goble <carole@computer-science.manchester.ac.uk>
Message-Id: <9303251801.AA06622@r1o.cs.man.ac.uk>
To: aoki@postgres.berkeley.edu
Subject: Re: sun problem
Cc: boonteet@computer-science.manchester.ac.uk,
        cag@computer-science.manchester.ac.uk,
        rick@computer-science.manchester.ac.uk


Paul

Okay, we have a crash.  Neil McQuaig doesn't help us as we are on BSD.
The first part of these notes has the postmaster process failing,
but we couldn't get a stack trace. We rebooted the machine and
increased the number of file descriptors. Then we hit ANOTHER problem,
where the postmaster keeps going but the postgres process exits,
without dumping core, when a REPLACE or similar is attempted.

Any help/guidance/debugging hints appreciated.


Only one user (Mr Boontee) was on, and he did the following:
----------------------------------------------------
* define rewrite rule sharon_too_much is
	on replace to emp.salary
		where current.name="sharon" and 
			new.salary > emp.salary and
			emp.name = current.manager
	do instead replace emp (salary=current.salary) where
			current.name="sharon"
\g

Query sent to backend is "define rewrite rule sharon_too_much is on replace 
to emp.salary where current.name="sharon" and new.salary > emp.salary and 
  emp.name = current.manager
do instead replace emp (salary=current.salary) 
where current.name="sharon" "
APPEND 30048DEFINE
Go 
* replace emp (salary=500) where emp.name="sharon"
\g

Query sent to backend is "replace emp (salary=500) where emp.name="sharon" "
Error: No response from the backend, exiting...


On the server:
---------------

PostMaster: fork failedprogram terminated by signal ILL (bad stack)

I typed (dbx) trace instead of (dbx) where, which
gave me a segmentation fault. Sorry, that was finger trouble.

t6% dbx postmaster
Reading symbolic information...
Read 3481 symbols
warning: core file read error: address not in data space
warning: core file read error: address not in data space
attempt to read stack failed - bad frame pointer


Processes now running on server:
--------------------------------
Note the absence of both Mr Boontee's postgres backend and the
postmaster process:

t6% ps aux
USER       PID %CPU %MEM   SZ  RSS TT STAT START  TIME COMMAND
root         0  0.0  0.0    0    0 ?  D    Mar 15  0:47 swapper
root         1  0.0  0.0   52    0 ?  IW   Mar 15  0:00 /sbin/init -
root         2  0.0  0.0    0    0 ?  D    Mar 15  0:04 pagedaemon
root       108  0.0  0.0   56    0 ?  IW   Mar 15  0:11 rpc.bootparamd
root        61  0.0  0.0   68    0 ?  IW   Mar 15  1:23 portmap
root      3429  0.0  0.0   24    0 ?  IW   Mar 20  0:00 sh -c /usr/lib/sendmail -oeq -oi r
root        68  0.0  0.0   40    0 ?  IW   Mar 15  0:00 keyserv
root        79  0.0  0.0   16    0 ?  I    Mar 15  0:00  (biod)
root        78  0.0  0.0   16    0 ?  I    Mar 15  0:00  (biod)
bin         66  0.0  0.0   36    0 ?  IW   Mar 15  0:00 ypbind
root       168  0.0  0.0   40    0 co IW   Mar 15  0:00 - std.9600 console (getty)
root        90  0.0  0.0   60    0 ?  IW   Mar 15  0:01 syslogd
root       117  0.0  0.2  108   56 ?  S    Mar 15  0:06 automount
root       101  0.0  0.0   68    0 ?  IW   Mar 15  2:55 rpc.mountd -n
root       102  0.0  0.0   28    0 ?  I    Mar 15  9:18  (nfsd)
root       104  0.0  0.0   28    0 ?  I    Mar 15  9:43  (nfsd)
root       105  0.0  0.0   48    0 ?  IW   Mar 15  0:14 rarpd -a
root       106  0.0  0.0   24    0 ?  IW   Mar 15  0:03 rarpd -a
root      6715  0.0  0.0   56    0 ?  IW   Mar 22  0:00 cron
root       111  0.0  0.0   84    0 ?  IW   Mar 15  0:00 rpc.lockd
root       112  0.0  0.0   52    0 ?  IW   Mar 15  0:00 rpc.statd
root      9931  0.0  0.1   24   24 ?  S    12:10   0:00 in.rlogind
root       146  0.0  0.0  100    0 ?  IW   Mar 15  3:51 /usr/oracle/bin/orasrv
oracle     138  0.0  0.8  268  232 ?  S    Mar 15  0:10 ora_pmon_oracle4
root       154  0.0  0.0   12    4 ?  S    Mar 15 96:16 update
root      6716  0.0  0.0   24    0 ?  IW   Mar 22  0:00 sh -c /usr/lib/sendmail -oeq -oi r
oracle     139  0.0  1.0  240  304 ?  S    Mar 15  1:46 ora_dbwr_oracle4
oracle     140  0.0  0.8  232  236 ?  S    Mar 15  1:03 ora_lgwr_oracle4
oracle     141  0.0  0.0  296    0 ?  IW   Mar 15 23:02 ora_smon_oracle4
root       157  0.0  0.2   56   60 ?  S    Mar 15  0:02 cron
root       163  0.0  0.0   48    0 ?  IW   Mar 15  0:03 inetd
oracle    9920  0.0  4.8  296 1488 ?  I    12:07   0:00 oracleoracle4 T:I,,5
root       166  0.0  0.0   52    0 ?  IW   Mar 15  0:00 /usr/lib/lpd
root      3428  0.0  0.0   56    0 ?  IW   Mar 20  0:00 cron
postgres 17338  0.0  0.0   48    0 p0 IW   Mar 17  0:00 -csh (csh)
root       173  0.0  0.0  144    0 ?  IW   Mar 15  1:19 rpc.rquotad
root     17337  0.0  0.0   24    0 ?  IW   Mar 17  0:00 in.rlogind
root      6599  0.0  0.7 2364  208 ?  S    Mar 21  0:05 -(null) To t6 (sendmail)
root      3430  0.0  0.7 3036  208 ?  S    Mar 20  0:06 -(null) To t6 (sendmail)
postgres  9936  0.0  1.4  188  420 p1 R    12:12   0:00 ps aux
oracle    9923  0.0  0.0  184    0 ?  IW   12:08   0:00 oracleoracle4 T:I,,5
root      6717  0.0  0.7 1688  208 ?  S    Mar 22  0:03 -(null) To t6 (sendmail)
root      6598  0.0  0.0   24    0 ?  IW   Mar 21  0:00 sh -c /usr/lib/sendmail -oeq -oi r
postgres  9932  0.0  0.7   48  216 p1 S    12:10   0:00 -csh (csh)
oracle    7881  0.0  0.0  424    0 ?  IW   Mar 23  0:02 oracleoracle4 T:I,,5
postgres 18119  0.0  0.0  256    0 p0 IW   Mar 17  0:00 dbx
oracle    9899  0.0  0.0  308    0 ?  IW   11:37   0:00 oracleoracle4 T:I,,5
root      6597  0.0  0.0   56    0 ?  IW   Mar 21  0:00 cron


We rebooted the server and increased the number of file descriptors.
So now we get....

Welcome to the C POSTGRES terminal monitor

Go
* retrieve (emp.all) \g

Query sent to backend is "retrieve (emp.all) "
-------------------------------------------------------------------------------------
| location    | age         | name        | salary      | manager     | dept        |
-------------------------------------------------------------------------------------
| (11,10)     | 20          | bill        | 1000        | sharon      | toy         |
-------------------------------------------------------------------------------------
| (11,10)     | 31          | carole      | 6000        | sharon      | toy         |
-------------------------------------------------------------------------------------
| (15,12)     | 25          | sharon      | 500         | sam         | shoe        |
-------------------------------------------------------------------------------------
| (10,5)      | 30          | sam         | 300         | bill        | candy       |
-------------------------------------------------------------------------------------

Go

* replace emp (salary=5000) where emp.name="sharon"
\g

Query sent to backend is "replace emp (salary=5000) where emp.name="sharon" "
NOTICE:Mar 25 17:37:58:I have been signalled by the postmaster.
NOTICE:Mar 25 17:37:58:Some backend process has died unexpectedly and possibly
NOTICE:Mar 25 17:37:58:corrupted shared memory.  The current transaction was
NOTICE:Mar 25 17:37:58:aborted, and I am going to exit.  Please resend the
NOTICE:Mar 25 17:37:58:last query. -- The postgres backend

Go
* replace emp (salary=5000) where emp.name = "sharon" \g

Query sent to backend is "replace emp (salary=5000) where emp.name = "sharon" "
r1o>

The postgres process has gone and the postmaster is still running,
but the postgres process hasn't dumped core.
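
One thing worth ruling out on our side is the resource limits the backend
inherits, since a soft core-size limit of 0 is one possible reason for a
process dying without leaving a core file (it is only a guess that this
applies here; an unwritable working directory or a clean exit() would look
the same). The sketch below is not part of POSTGRES; it assumes a Python
interpreter is available on the server, and the function names are made up
for illustration. It prints the core-size and file-descriptor limits of the
shell it is run from, which are the limits any process started from that
shell would inherit.

```python
import resource


def describe(limit):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def report_limits():
    """Return (name, soft, hard) for the two limits relevant to this report."""
    rows = []
    for name, which in (
        ("core file size", resource.RLIMIT_CORE),
        ("open file descriptors", resource.RLIMIT_NOFILE),
    ):
        soft, hard = resource.getrlimit(which)
        rows.append((name, describe(soft), describe(hard)))
    return rows


def core_dumps_disabled():
    """True if the soft core-size limit is 0, so no core file would be written."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft == 0


if __name__ == "__main__":
    for name, soft, hard in report_limits():
        print(f"{name}: soft={soft} hard={hard}")
    if core_dumps_disabled():
        print("core dumps are disabled for processes started from here")
```

Run from the same login shell that starts the postmaster, this should show
whether the descriptor increase actually took effect for that shell and
whether a core file could be written at all.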