Resent-From: POSTGRES mailing list <postman@postgres.Berkeley.EDU>
Resent-Message-Id: <199401290016.QAA27010@nobozo.CS.Berkeley.EDU>
Sender: owner-postman@postgres.Berkeley.EDU
Message-Id: <199401290016.QAA26997@nobozo.CS.Berkeley.EDU>
From: aoki@postgres.Berkeley.EDU (Paul M. Aoki)
To: Brian Holman <bkh@liblas.byu.edu>
Cc: postgres@postgres.Berkeley.EDU
Subject: Re: Postgres v4.2? OSF/1 Port? 
In-reply-to: Your message of Fri, 28 Jan 1994 15:36:15 +119303928 (MST) 
	     <9401282236.AA00690@liblas.byu.edu> 
Date: Fri, 28 Jan 1994 16:16:53 -0800
Resent-To: postgres-dist@postgres.Berkeley.EDU
Resent-Date: Fri, 28 Jan 94 16:16:54 -0800
Resent-XMts: smtp

Brian Holman <bkh@liblas.byu.edu> writes:
> Quoted from Mike Stonebraker <mike@postgres.Berkeley.EDU>
[...]

whoops, that wasn't supposed to go out on this mailing list.
i botched my mailing list mux/demux :-(  sorry 'bout that.

> >As such, we can move ahead with a schedule based on a firm 2/1 code freeze.
> Does this mean that the Postgres V4.2 is near completion?

yes.  can't imagine release will be more than a couple of weeks
after code freeze.

> Is the DEC Alpha OSF/1 port sufficiently stable to be included in 
> the next release?

i think so.  it's what i'm using these days for my development work,
anyway.

> Another question:  I'm running Postgres V4.1 on a DECStation 5000 running
> Ultrix.  I have some non-critical production databases running on it that
> log statistical information.  In other words, there are a lot of very brief
> transactions being sent to the postmaster.
> Why does the back end keep going down?  Is it the number of "postgres"
> processes running at a time?  Does shared memory just get screwed up every
> once in a while or what?  Any ideas would be appreciated.

when a backend dies unexpectedly, the postmaster detaches its 
shared resources and reinitializes from scratch.  (i don't think 
there's any good reason to actually detach the resources before 
reinitializing, but there it is.)  sometimes it fails to reattach 
(it prints out a message when this happens) and terminates.
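
to make that cycle concrete, here's a minimal sketch using the System V
shared memory calls.  this is not the actual postmaster code: the name
reinit_shared, the IPC_PRIVATE key, the 0600 mode and the message text 
are all assumptions made up for illustration.  it only shows the shape 
of "detach, throw away, recreate, reattach, complain if the reattach 
fails".

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper (not from the postgres sources): drop the old
 * segment, create a fresh one, and attach to it again.
 * On entry *shmid may be -1 and *addr may be NULL (nothing to drop).
 * Returns 0 on success, -1 if the new segment could not be set up.
 */
int reinit_shared(int *shmid, void **addr, size_t size)
{
    /* 1. detach from the old segment and mark it for removal */
    if (*addr != NULL && shmdt(*addr) == -1)
        fprintf(stderr, "shmdt failed: %s\n", strerror(errno));
    *addr = NULL;
    if (*shmid != -1 && shmctl(*shmid, IPC_RMID, NULL) == -1)
        fprintf(stderr, "shmctl(IPC_RMID) failed: %s\n", strerror(errno));
    *shmid = -1;

    /* 2. create a brand new segment */
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) {
        fprintf(stderr, "shmget failed: %s\n", strerror(errno));
        return -1;
    }

    /* 3. reattach; this is the step that can fail and end the process */
    void *p = shmat(id, NULL, 0);
    if (p == (void *) -1) {
        fprintf(stderr, "could not reattach shared memory: %s\n",
                strerror(errno));
        shmctl(id, IPC_RMID, NULL);   /* don't leave the new segment behind */
        return -1;
    }

    memset(p, 0, size);               /* start from a clean state */
    *shmid = id;
    *addr = p;
    return 0;
}
```

a caller would start with shmid = -1 and addr = NULL, call this once at
startup, and call it again after a backend dies.  a -1 return is the 
point where the caller would print its message and give up.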

i think this was partly due to the way backends neglected to shmdt 
(detach) when going down.  this would occasionally cause the postmaster 
to "leak" shared resources (which couldn't be deallocated by the OS 
because it thought a process was still attached).  i changed the 
shared memory code so that backends shmdt on the way down, and i 
didn't see this problem anymore on
ultrix 4.2a.  we upgraded the sequoia 2000 db server machine to ultrix 
4.3a and the problem seemed to go away in the old code, too -- it was
up for about two months without this problem.
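
for anyone who wants to see what "shmdt when going down" amounts to, 
here's a rough sketch.  it is not the real backend code: attach_segment,
detach_segment and attach_count are made-up names, and using atexit() 
is just one assumed way to hook the exit path.  note that atexit 
handlers only run on a normal exit, so a backend killed by a signal 
would still skip the detach, and many systems detach for you at process
exit anyway.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical: address where this process mapped the segment. */
static void *seg_addr = NULL;

/* Detach explicitly so the segment's attach count drops and the OS
 * is free to deallocate it once it has been marked for removal. */
static void detach_segment(void)
{
    if (seg_addr != NULL) {
        shmdt(seg_addr);
        seg_addr = NULL;
    }
}

/* Attach to an existing segment and arrange to detach at normal exit.
 * Returns 0 on success, -1 on failure. */
int attach_segment(int shmid)
{
    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *) -1)
        return -1;
    seg_addr = addr;
    return atexit(detach_segment) == 0 ? 0 : -1;
}

/* Number of processes currently attached to the segment, or -1 on
 * error.  A count that never returns to zero is the "leak". */
long attach_count(int shmid)
{
    struct shmid_ds ds;
    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    return (long) ds.shm_nattch;
}
```

attach_count() is only there so you can watch the count go up on 
attach and back down on detach (ipcs shows the same number in its 
NATTCH column on systems that have it).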

i think i sent that shmdt change to robert withrow, who was reporting 
something similar under SVR4, to see if that solved his problem.  i 
don't remember what his conclusion was, though.
--
  Paul M. Aoki  |  CS Div., Dept. of EECS, UCB  |  aoki@postgres.Berkeley.EDU
                |  Berkeley, CA 94720           |  ...!uunet!ucbvax!aoki