agora inbox for postgres@postgres.berkeley.edu  
help / color / mirror / Atom feed
B-tree bug
5+ messages / 3 participants
[nested] [flat]

* B-tree bug
@ 1996-04-02 15:12  Jeff Sidell <jsidell@cs.berkeley.edu>
  0 siblings, 0 replies; 5+ messages in thread

From: Jeff Sidell @ 1996-04-02 15:12 UTC (permalink / raw)
  To: mariposa_hackers@postgres.Berkeley.EDU; +Cc: legacy

When a b-tree index's scan position is marked, only the
line pointer value is saved.  If the position is restored
after the index scan has read past that page, it doesn't
work.  I noticed this while doing a merge-join with index
scans beneath it (tpc-D query #8).  I have reproduced the
bug with much smaller datasets.  The code to be changed is
in backend/executor/nodeIndexScan.  The procedures are
ExecIndexMarkPos() and ExecIndexRestrPos().

Has anyone else found this bug and possibly fixed it?

Jeff
=============================================================================
                                  JEFF SIDELL

Computer Science Division                            jsidell@cs.berkeley.edu
University of California at Berkeley    http://http.cs.berkeley.edu/~jsidell/
=============================================================================


==============================================================================
   To add/remove yourself to/from the POSTGRES mailing list: send mail with 
   the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
   If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
   a human will deal with it.  DO NOT post to the "postgres" mailing list.
==============================================================================
              URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* b-tree bug?
@ 1996-05-01 15:35  Jeff Sidell <jsidell@cs.berkeley.edu>
  0 siblings, 0 replies; 5+ messages in thread

From: Jeff Sidell @ 1996-05-01 15:35 UTC (permalink / raw)
  To: legacy; +Cc: mariposa@postgres.Berkeley.EDU

Has anyone seen this?  I noticed some weird answers and it seems there's
a problems with b-trees?!  The same query gets a different answer when using
a b-tree index scan than when using a regular sequential scan.  In the examples
below, there was a b-tree index created over LINEITEM using the following 
CREATE INDEX command:

    create index LINEITEM_SDATE on LINEITEM using btree(L_SHIPDATE int4_ops);

(and yes, L_SHIPDATE is an int4)


********************** With index scans allowed *******************************
Go
* select count(*) from LINEITEM where L_SHIPDATE >= 8767 and L_SHIPDATE < 9132;
 
Query sent to backend is "select count(*) from LINEITEM where L_SHIPDATE >= 8767
 and L_SHIPDATE < 9132;"
---------------
| count       |
---------------
| 10393       |
---------------
 
Go
* select count(*) from LINEITEM where L_SHIPDATE = 8767;
 
Query sent to backend is "select count(*) from LINEITEM where L_SHIPDATE = 8767;
"
---------------
| count       |
---------------
| 18          |
---------------
********************** With index scans forbidden
*******************************
* parkcity:jsidell 8% monitor
Welcome to the POSTGRES95 terminal monitor
 
Go

* select count(*) from LINEITEM where L_SHIPDATE >= 8767 and L_SHIPDATE < 9132;
 
Query sent to backend is "select count(*) from LINEITEM where L_SHIPDATE >= 8767
 and L_SHIPDATE < 9132;"
---------------
| count       |
---------------
| 91947       |
---------------
 
* select count(*) from LINEITEM where L_SHIPDATE = 8767;
 
Query sent to backend is "select count(*) from LINEITEM where L_SHIPDATE = 8767;
"
---------------
| count       |
---------------
| 231         |
---------------
 
Go
=============================================================================
                                  JEFF SIDELL

Computer Science Division                            jsidell@cs.berkeley.edu
University of California at Berkeley    http://http.cs.berkeley.edu/~jsidell/
=============================================================================


==============================================================================
   To add/remove yourself to/from the POSTGRES mailing list: send mail with 
   the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
   If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
   a human will deal with it.  DO NOT post to the "postgres" mailing list.
==============================================================================
              URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: b-tree bug?
@ 1996-05-01 17:15  Joe Hellerstein <jmh@cs.berkeley.edu>
  0 siblings, 1 reply; 5+ messages in thread

From: Joe Hellerstein @ 1996-05-01 17:15 UTC (permalink / raw)
  To: Jeff Sidell <jsidell@cs.berkeley.edu>; +Cc: legacy

Well, last time I checked, postgres officially assigned no semantics to
duplicates, so it's *just* possible that everything officially is working
"right" in your example, because postgres aggregates are semantically
meaningless in the face of duplicates.  But that's probably not what's going on.

Joe



-------------------------------------
Joseph M. Hellerstein
EECS Computer Science Division
University of California, Berkeley
http://www.cs.berkeley.edu/~jmh


==============================================================================
   To add/remove yourself to/from the POSTGRES mailing list: send mail with 
   the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
   If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
   a human will deal with it.  DO NOT post to the "postgres" mailing list.
==============================================================================
              URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: b-tree bug?
@ 1996-05-01 18:16  Jeff Sidell <jsidell@postgres.Berkeley.EDU>
  parent: Joe Hellerstein <jmh@cs.berkeley.edu>
  0 siblings, 1 reply; 5+ messages in thread

From: Jeff Sidell @ 1996-05-01 18:16 UTC (permalink / raw)
  To: Joe Hellerstein <jmh@cs.berkeley.edu>; +Cc: legacy

> 
> Well, last time I checked, postgres officially assigned no semantics to
> duplicates, so it's *just* possible that everything officially is working
> "right" in your example, because postgres aggregates are semantically
> meaningless in the face of duplicates.  But that's probably not what's going on.
> 

I'm not sure what the official semantics for aggregates/aggregates-over-
duplicates are, but unless I'm missing something, there's a bug:
The aggregate node takes tuples passed in to it from the subplan
and calls the appropriate function for each such tuple.  Count() simply
adds one to a running count.  The fact that we're getting different
counts for an underlying index scan and a sequential scan indicates
that they are retrieving different numbers of tuples. 

Semantically, "select count(*) from LINEITEM where L_SHIPDATE >= D1
and L_SHIPDATE < D2" means "Tell me how many tuples in the LINEITEM
table have a shipdate D such that D1 <= D <= D2".  Not "how many
*unique* tuples" or "how many tuples with unique shipdates".

Jeff

==============================================================================
   To add/remove yourself to/from the POSTGRES mailing list: send mail with 
   the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
   If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
   a human will deal with it.  DO NOT post to the "postgres" mailing list.
==============================================================================
              URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/



^ permalink  raw  reply  [nested|flat] 5+ messages in thread

* Re: b-tree bug?
@ 1996-05-01 18:22  Joe Hellerstein <jmh@cs.berkeley.edu>
  parent: Jeff Sidell <jsidell@postgres.Berkeley.EDU>
  0 siblings, 0 replies; 5+ messages in thread

From: Joe Hellerstein @ 1996-05-01 18:22 UTC (permalink / raw)
  To: Jeff Sidell <jsidell@postgres.Berkeley.EDU>; +Cc: legacy

> I'm not sure what the official semantics for aggregates/aggregates-over-
> duplicates are, but unless I'm missing something, there's a bug:
> The aggregate node takes tuples passed in to it from the subplan
> and calls the appropriate function for each such tuple.  Count() simply
> adds one to a running count.  The fact that we're getting different
> counts for an underlying index scan and a sequential scan indicates
> that they are retrieving different numbers of tuples. 

I believe you're right that there's a bug.

> 
> Semantically, "select count(*) from LINEITEM where L_SHIPDATE >= D1
> and L_SHIPDATE < D2" means "Tell me how many tuples in the LINEITEM
> table have a shipdate D such that D1 <= D <= D2".  Not "how many
> *unique* tuples" or "how many tuples with unique shipdates".

Actually, the last time I checked the number of duplicate values
returned by postgres queries was officially non-deterministic (this
was Postquel semantics).  For example, if the plan was
sorting/hashing/index-probing anyway it could remove dups, but if it
wasn't it could leave them there for efficiency.  An unhappy corrolary
to this semantics is that two identical tuples may or may not get
treated separately in an aggregate, and you're not supposed to care.

In point of fact, I doubt this is what's going on in your example, but
it's just possible.  And one of my pet peeves.

Joe

==============================================================================
   To add/remove yourself to/from the POSTGRES mailing list: send mail with 
   the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
   If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
   a human will deal with it.  DO NOT post to the "postgres" mailing list.
==============================================================================
              URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/



^ permalink  raw  reply  [nested|flat] 5+ messages in thread


end of thread, other threads:[~1996-05-01 18:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
1996-04-02 15:12 B-tree bug Jeff Sidell <jsidell@cs.berkeley.edu>
1996-05-01 15:35 b-tree bug? Jeff Sidell <jsidell@cs.berkeley.edu>
1996-05-01 17:15 Re: b-tree bug? Joe Hellerstein <jmh@cs.berkeley.edu>
1996-05-01 18:16 ` Re: b-tree bug? Jeff Sidell <jsidell@postgres.Berkeley.EDU>
1996-05-01 18:22   ` Re: b-tree bug? Joe Hellerstein <jmh@cs.berkeley.edu>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox