Efficient passage ranking for document databases

M. Kaszkiel, J. Zobel, and R. Sacks-Davis

Queries to text collections are resolved by ranking the documents in the collection and returning the highest-scoring documents to the user. An alternative retrieval method is to rank passages, that is, short fragments of documents, a strategy that can improve effectiveness and identify relevant material in documents that are too large for users to consider as a whole. However, ranking of passages can considerably increase retrieval costs. In this paper we explore alternative query evaluation techniques, and develop new techniques for evaluating queries on passages. We show experimentally that, appropriately implemented, effective passage retrieval is practical in limited memory on a desktop machine. Compared to passage ranking with adaptations of current document ranking algorithms, our new "DO-TOS" passage ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.
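
As a rough illustration of what ranking passages rather than whole documents means, the sketch below scores fixed-length, overlapping word windows and ranks each document by its best window. It is not the DO-TOS algorithm and does not reflect the paper's implementation. The window length, the step, the log term-frequency score, and all function names are assumptions chosen for the example.

```python
import math
from collections import Counter


def passages(words, length=50, step=25):
    """Yield (start, window) pairs of overlapping fixed-length word windows.

    The window length and step are illustrative assumptions.
    """
    if len(words) <= length:
        yield 0, words
        return
    for start in range(0, len(words) - length + step, step):
        yield start, words[start:start + length]


def score(window, query_terms):
    """Toy similarity: sum of log(1 + term frequency) over the query terms."""
    counts = Counter(window)
    return sum(math.log(1 + counts[term]) for term in query_terms)


def rank_by_best_passage(docs, query, length=50, step=25):
    """Rank documents by the score of their highest-scoring passage.

    Returns a list of (doc_id, best_score, passage_start), best first.
    """
    query_terms = query.lower().split()
    results = []
    for doc_id, text in docs.items():
        words = text.lower().split()
        scored = [(score(window, query_terms), start)
                  for start, window in passages(words, length, step)]
        # max() keeps the earliest window when several share the top score.
        best_score, best_start = max(scored, key=lambda pair: pair[0])
        results.append((doc_id, best_score, best_start))
    # Highest score first; ties broken by document identifier.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


docs = {
    "a": "passage ranking " + "filler " * 100,
    "b": "filler " * 100 + "passage ranking passage ranking",
}
ranked = rank_by_best_passage(docs, "passage ranking")
```

Every window of every document is scored here, which hints at why passage ranking can cost considerably more than document ranking: the number of scored units grows with document length rather than with the number of documents.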
