Improving relevance feedback-based query expansion by the use of a weighted word pairs approach

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano
2015

Abstract

In this article, the use of a new term extraction method for query expansion (QE) in text retrieval is investigated. The new method expands the initial query with a structured representation made of weighted word pairs (WWP) extracted from a set of training documents (relevance feedback). Standard text retrieval systems can handle a WWP structure through custom Boolean weighted models. We experimented with both the explicit and pseudorelevance feedback schemas and compared the proposed term extraction method with others in the literature, such as KLD and RM3. Evaluations have been conducted on a number of test collections (Text REtrieval Conference [TREC]-6, -7, -8, -9, and -10). Results demonstrated that the QE method based on this new structure outperforms the baseline.
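
As a rough illustration of how a WWP structure could be passed to a retrieval system as a weighted Boolean query, the sketch below joins the initial query terms with the highest-weighted word pairs. It is not the authors' implementation: the function name `expand_query`, the `top_k` cutoff, the example weights, and the Lucene-like `(a AND b)^weight` output format are all assumptions made for this example, and the step that extracts and weights the pairs from the feedback documents is not shown.

```python
from typing import Iterable, List, Tuple

# (word_1, word_2, weight): hypothetical representation of one weighted word pair
WeightedPair = Tuple[str, str, float]


def expand_query(initial_terms: Iterable[str],
                 pairs: Iterable[WeightedPair],
                 top_k: int = 3) -> str:
    """Join the initial terms with the top_k heaviest pairs as boosted AND clauses."""
    # Keep only the top_k pairs, ordered by descending weight.
    ranked = sorted(pairs, key=lambda p: p[2], reverse=True)[:top_k]
    clauses: List[str] = list(initial_terms)
    for w1, w2, weight in ranked:
        # Assumed Lucene-like syntax: a grouped AND clause with a boost factor.
        clauses.append(f"({w1} AND {w2})^{weight:.2f}")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # Made-up pairs and weights, for illustration only.
    example_pairs = [
        ("query", "feedback", 0.8),
        ("term", "weight", 0.2),
        ("relevance", "feedback", 0.5),
    ]
    print(expand_query(["query", "expansion"], example_pairs, top_k=2))
```

Keeping the original terms as plain clauses and adding the pairs as boosted clauses means the expansion can only add evidence to the initial query; whether a given engine accepts this exact syntax would need to be checked against its query parser.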
Files in this record:

134 Colace Pre-Print.pdf (open access)
Description: © 2015 ASIS&T. Published online 21 March 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.23331. Publisher link: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.23331
Type: Pre-print (manuscript submitted to the publisher, before peer review)
License: Creative Commons
Size: 502.52 kB
Format: Adobe PDF

Colace Francesco 4-134 DEFINITIVO.pdf (not available)
Type: Publisher's version (published version with the publisher's layout)
License: Not public, private/restricted access
Size: 632.3 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11386/4650932