Conference proceedings contribution, 2010, ENG
Tonellotto N.; Macdonald C.; Ounis I.
CNR-ISTI, Pisa, Italy; Department of Computing Science, University of Glasgow, Glasgow, UK; Department of Computing Science, University of Glasgow, Glasgow, UK
Modern retrieval approaches do not rely solely on single-term weighting models when ranking documents; proximity weighting models, which assign high scores to pairs of query terms that co-occur close to each other in a document, are also in common use. Adopting these proximity weighting models can add computational overhead when documents are scored, reducing the efficiency of the retrieval process. In this paper, we discuss the integration of proximity weighting models into efficient dynamic pruning strategies. In particular, we propose to modify document-at-a-time strategies to include proximity scoring without any modifications to pre-existing index structures. Our resulting two-stage dynamic pruning strategies consider only single query terms during first-stage pruning, but can terminate the proximity scoring of a document early if it can be shown that the document will never be retrieved. We empirically examine the efficiency benefits of our approach using a large Web test collection of 50 million documents and 10,000 queries from a real query log. Our results show that our proposed two-stage dynamic pruning strategies are considerably more efficient than the original strategies, particularly for queries of 3 or more terms.
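The early-termination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy in-memory index (document id mapped to term positions), the scoring functions `term_score` and `pair_score`, and the constant `PAIR_UPPER_BOUND` are all assumptions made for illustration, and first-stage pruning over posting lists is omitted. The sketch only shows how proximity scoring of a document can be abandoned once the document can no longer enter the current top k.

```python
import heapq
from itertools import combinations

# Assumption: pair_score() below never exceeds this value.
PAIR_UPPER_BOUND = 1.0


def term_score(positions):
    """Hypothetical single-term score: the term frequency."""
    return float(len(positions))


def pair_score(pos_a, pos_b):
    """Hypothetical proximity score in (0, 1]: inverse of the minimum distance.

    Two distinct terms never share a position, so the distance is at least 1.
    """
    return 1.0 / min(abs(a - b) for a in pos_a for b in pos_b)


def two_stage_top_k(index, query, k, prune=True):
    """Rank documents one at a time, scoring single terms first, then pairs.

    index maps doc_id -> {term: [positions]}. With prune=True, proximity
    scoring of a document stops as soon as its partial score plus the upper
    bound of the unscored pairs cannot exceed the current top-k threshold.
    """
    terms = sorted(set(query))
    pairs = list(combinations(terms, 2))
    heap = []  # min-heap of (score, doc_id) holding the current top k

    for doc_id, postings in index.items():
        threshold = heap[0][0] if len(heap) == k else float("-inf")

        # Stage 1: single-term scores only.
        score = sum(term_score(postings[t]) for t in terms if t in postings)

        # Stage 2: proximity scores, with early termination.
        remaining = PAIR_UPPER_BOUND * len(pairs)
        abandoned = False
        for a, b in pairs:
            if prune and score + remaining <= threshold:
                abandoned = True  # cannot beat the k-th best score
                break
            remaining -= PAIR_UPPER_BOUND
            if a in postings and b in postings:
                score += pair_score(postings[a], postings[b])
        if abandoned:
            continue

        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > threshold:
            heapq.heapreplace(heap, (score, doc_id))

    return sorted(heap, reverse=True)
```

Because a document is abandoned only when its best possible final score cannot exceed the threshold, the pruned run returns the same ranking as an exhaustive run with `prune=False`, provided the assumed upper bound really holds for the proximity function in use.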
SIGIR 2010 - Workshop on Large Scale Distributed Search, pp. 31–35, Geneva, Switzerland, July 2010
Information Search and Retrieval, Information Retrieval, Search Engines
ISTI – Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"
CNR authors
External IDs
CNR OAI-PMH: oai:it.cnr:prodotti:92088