agora inbox for postgres@postgres.berkeley.edu
help / color / mirror / Atom feedFrom: Joerg Stadler <joerg@itap.physik.uni-stuttgart.de>
To: postgres@postgres.Berkeley.EDU
Subject: Beginners questions
Date: Tue, 10 Oct 1995 13:31:01 +0100
Message-ID: <199510101231.NAA03056@titania> (raw)
Hello postgres users,
my task is to implement a (insitiution wide) database that handles
BibTeX citation entries. Here are some BibTex examples:
=========================== examples ====================================
TechReport{dongarra93,
author = "Dongarra, Jack J.",
title = "Performance of Various Computers Using Standar
Linear Equations Software",
institution = "Computer Science Department, University of Tennessee
and Mathematical Sciences Section, Oak Ridge
National Laboratory",
year = 1993,
key = "Ma Sonstiges",
note = "per email zu erhalten von netlib@ornl.gov"
}
@BOOK{wall90,
AUTHOR = {Larry Wall and Randall Schwartz},
EDITOR = {O'Reilly },
PUBLISHER = {O'Reilly},
TITLE = {Programming perl},
YEAR = {1990},
KEYWORD = {Ma Sonstiges}
}
@ARTICLE{black54,
AUTHOR = {Black,, P.J.},
JOURNAL = {Acta. Cryst.},
KEY = {Qk Approximanten},
NUMBER = {8},
PAGES = {1955},
TITLE = {{T}he {S}tructure of {FeAl$_{3}$}},
YEAR = {1954}
}
=========================== examples end ===============================
I somehow found out about postgres and decided to give it a
try.
Now, as seen from the examples, the BibTeX format is field-oriented,
but you can't tell in advance how many different fields there
are. This depends on the type of the citation, and also users sre
allowed to define thier own fields.
Since I can't map BibTeX field names directly to postgres attributes,
I came up with the following database structure:
create ENTRIES ( citekey=text, // Unique key for each citation: e.g. Wall1990b
author =text, // Name of first author: e.g. Wall
year =int4, // Year of publication: e.g. 1990
uid =int4, // Who added this citation?
ctype =text ) // ARTICLE, BOOK, ....
create BIBDATA ( citekey=text, // See above
field =text, // BibTeX field name: AUTHOR, TITLE etc.
data =text ) // BibTeX field's contents
There is one entry in the ENTRIES relation for every citation and
there is one entry in the BIBDATA relation for every field in a given
citation.
I wrote some pgperl scripts that add, delete, replace and retrieve
(given a citekey) citations from this database. This works quite
well.
But one (most important) question remains: How do I formulate
efficient queries that search multiple data fields in BIBDATA?
If I just wan't to know about one field, everything works fine. I do:
retrieve (ENTRIES.citekey)
where BIBDATA.citekey=ENTRIES.citekey
and BIBDATA.field="AUTHOR" and BIBDATA.data~"Wall"
Adding a second search condition works, but is terribly slow:
retrieve (ENTRIES.citekey)
from B1, B2 in BIBDATA
where B1.citekey=ENTRIES.citekey
and B2.citekey=ENTRIES.citekey
and B1.field="AUTHOR" and B1.data~"Wall"
and B2.field="TITLE" and B2.data~"Programming"
Three conditions don't work at all: The query
retrieve (ENTRIES.citekey)
from B1, B2, B3 in BIBDATA
where B1.citekey=ENTRIES.citekey
and B2.citekey=ENTRIES.citekey
and B3.citekey=ENTRIES.citekey
and B1.field="AUTHOR" and B1.data~"Wall"
and B2.field="TITLE" and B2.data~"Programming"
and B3.field="YEAR" and B3.data~"1990"
takes forever and gives a (wrong) null result.
I'm no expert in database systems and this maybe too naive an
approach.
My questions are:
Is there a better (more efficient) way to formulate these
queries? How can I have queries with more than two search conditions?
Is there a better way to store BibTeX databases in postgres?
I'm grateful for any hints.
Greetings from southern Germany
joerg
J"org Stadler joerg@itap.physik.uni-stuttgart.de
Inst. f. Theoretische und Angewandte Physik Phone: 0711 / 685 - 5268
Universit"at Stuttgart Pfaffenwaldring 57 Room 6.346
70550 Stuttgart finger for pgp public key
==============================================================================
To add/remove yourself to/from the POSTGRES mailing list: send mail with
the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU".
If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
a human will deal with it. DO NOT post to the "postgres" mailing list.
==============================================================================
URL: http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: postgres@postgres.berkeley.edu
Cc: joerg@itap.physik.uni-stuttgart.de
Subject: Re: Beginners questions
In-Reply-To: <199510101231.NAA03056@titania>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox