agora inbox for postgres@postgres.berkeley.edu
help / color / mirror / Atom feedFrom: Paul M. Aoki <aoki@cs.Berkeley.EDU>
To: Kai Petzke <wpp@marie.physik.tu-berlin.de>
Cc: goli@plains.NoDak.edu
Cc: postgres@postgres.Berkeley.EDU
Subject: hash functions [was: New type and pg_operator.]
Date: Fri, 12 Aug 94 01:35:13 -0700
Message-ID: <199408120835.BAA13813@faerie.CS.Berkeley.EDU> (raw)
In-Reply-To: <9407160941.AA23797@marie.physik.tu-berlin.de>
wpp@marie.physik.tu-berlin.de (Kai Petzke) writes:
> > > unless you use arcane order-preserving hash functions,
> > What are those order preserving hash functions?
> Hash functions calculate a short "checksum" from a datum.
> But because of the data reduction, it is in general impossible to
> preserve the ordering of the data.
41. CONFERENCE PAPER
Fox, E.A.; Qi Fan Chen; Daoud, A.M.; Heath, L.S.
Order preserving minimal perfect hash functions and information
retrieval.
IN: Proceedings of the 13th International Conference on Research and
Development in Information Retrieval. (Proceedings of the 13th
International Conference on Research and Development in Information
Retrieval, Brussels, Belgium, 5-7 Sept. 1990). Edited by: Vidick, J.L. New
York, NY, USA: ACM, 1989. p. 279-312.
Pub type: Practical; Theoretical or Mathematical.
Abstract: Rapid access to information is essential for a wide variety of
retrieval systems and applications. Hashing has long been used when the
fastest possible direct search is desired, but is generally not
appropriate when sequential or range searches are also required. The paper
describes a hashing method, developed for collections that are relatively
static, that supports both direct and sequential access. Indeed, the
algorithm described gives hash functions that are optimal in terms of time
and hash table space utilization, and that preserve any a priori ordering
desired. Furthermore, the resulting order preserving minimal perfect has
functions (OPMPHFs) can be found using space and time that is on average
linear in the number of keys involved.
i happened to run across this while searching for "linear hashing"
references. whew. i was thinking i was totally nuts for thinking
such a thing existed. after some thought, i also dimly recall that
there were also some later papers on the subject from fox's group in
the ACM SIGIR Forum as well.
(which is not to say that anything else i have ever said, here or
anywhere else, is either right or sensible.)
--
Paul M. Aoki | University of California at Berkeley
aoki@CS.Berkeley.EDU | Dept. of EECS, Computer Science Division (#1776)
| Berkeley, CA 94720-1776
==============================================================================
To add/remove yourself to/from the POSTGRES mailing list: send mail with
the subject line ADD or DEL to "postgres-request@postgres.Berkeley.EDU"
If this fails, send mail to "post_questions@postgres.Berkeley.EDU" and
a human will deal with it. DO NOT post to the "postgres" mailing list.
==============================================================================
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: postgres@postgres.berkeley.edu
Cc: aoki@cs.Berkeley.EDU, wpp@marie.physik.tu-berlin.de, goli@plains.NoDak.edu
Subject: Re: hash functions [was: New type and pg_operator.]
In-Reply-To: <199408120835.BAA13813@faerie.CS.Berkeley.EDU>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox