Improving relevance feedback-based query expansion by the use of a weighted word pairs approach
COLACE, Francesco; DE SANTO, Massimo; GRECO, Luca; NAPOLETANO, Paolo
2015
Abstract
This article investigates a new term extraction method for query expansion (QE) in text retrieval. The method expands the initial query with a structured representation made of weighted word pairs (WWP) extracted from a set of training documents (relevance feedback). Standard text retrieval systems can handle a WWP structure through custom Boolean weighted models. We experimented with both explicit and pseudo-relevance feedback schemes and compared the proposed term extraction method with others in the literature, such as KLD and RM3. Evaluations were conducted on several test collections (Text REtrieval Conference [TREC]-6, -7, -8, -9, and -10). Results showed that the QE method based on this new structure outperforms the baseline.

File | Size | Format | |
---|---|---|---|
134 Colace Pre-Print.pdf (open access). Type: Pre-print (manuscript submitted to the publisher, prior to peer review). License: Creative Commons. Description: © 2015 ASIS&T. Published online 21 March 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.23331. Publisher link: https://asistdl.onlinelibrary.wiley.com/doi/epdf/10.1002/asi.23331 | 502.52 kB | Adobe PDF | |
Colace Francesco 4-134 DEFINITIVO.pdf (not available). Type: Publisher's version (published version with the publisher's layout). License: Not public, private/restricted access | 632.3 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.