Title
REINA at WebCLEF 2006 : Mixing fields to improve retrieval
Author(s)
Subject
Web pages retrieval
Information retrieval
Web search
Combining fields
USAL subject
Information retrieval
Publication date
2006
Citation
Zazo Rodríguez, A.F., García Figuerola, L.C., Alonso Berrocal, J.L., Zazo Rodríguez, A.F. and Rodríguez Vázquez de Aldana, E. (2006). REINA at WebCLEF 2006 : Mixing fields to improve retrieval. In Nardi, A., Peters, C. and Vicedo, J.L. (Eds.) "WORKING NOTES CLEF 2006 workshop, 20-22 September, Alicante, Spain"
Abstract
This paper describes the participation of the REINA Research Group of the University of Salamanca in WebCLEF 2006. The task in which we participated this year is the Monolingual Mixed Task in Spanish. To select the Spanish web pages of the EuroGov collection, the whole collection was processed with a language guesser to find pages in Spanish. All pages in the .es domain were also pre-selected. Our focus this year is to test pre-retrieval ways of mixing fields or elements of information in web pages, and to test the retrieval capacity of these fields. In retrieval systems based on the vector space model, terms from several sources can be mixed in a single index by operating on the term frequency in the document, provided a tf x idf weighting scheme is used. The BODY field is the most powerful from the point of view of retrieval, but the ANCHORS of backlinks add a considerable improvement. META fields, by contrast, contribute little to the improvement in retrieval.
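The field-mixing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field multipliers in FIELD_WEIGHTS, the toy DOCS collection, the whitespace tokenizer, the log(N/df) form of idf, and the cosine ranking are all assumptions chosen for illustration. The only point it tries to show is that scaling a term's frequency by a per-field multiplier before tf x idf weighting lets several fields share a single index.

```python
import math
from collections import Counter

# Hypothetical field multipliers: the abstract does not report actual values.
FIELD_WEIGHTS = {"body": 1.0, "anchors": 2.0, "meta": 0.5}

# Toy collection invented for this sketch (not taken from EuroGov).
DOCS = {
    "d1": {"body": "tax office madrid"},
    "d2": {"body": "city council", "anchors": "tax office"},
    "d3": {"body": "weather report"},
}


def mixed_tf(fields, weights=FIELD_WEIGHTS):
    """Merge the terms of all fields into one weighted term-frequency map."""
    tf = Counter()
    for name, text in fields.items():
        weight = weights.get(name, 0.0)
        for term in text.lower().split():
            tf[term] += weight  # each occurrence counts `weight` times
    return tf


def build_index(docs, weights=FIELD_WEIGHTS):
    """Return (tf x idf document vectors, idf table) for one mixed index."""
    tfs = {doc_id: mixed_tf(fields, weights) for doc_id, fields in docs.items()}
    df = Counter()
    for tf in tfs.values():
        df.update(t for t, f in tf.items() if f > 0)  # one count per document
    idf = {t: math.log(len(docs) / n) for t, n in df.items()}
    vectors = {
        doc_id: {t: f * idf[t] for t, f in tf.items() if f > 0}
        for doc_id, tf in tfs.items()
    }
    return vectors, idf


def cosine(q, d):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm = math.sqrt(sum(w * w for w in q.values())) * math.sqrt(
        sum(w * w for w in d.values())
    )
    return dot / norm if norm else 0.0


def rank(query, vectors, idf):
    """Return document ids sorted by decreasing similarity to the query."""
    q_tf = Counter(query.lower().split())
    q = {t: f * idf[t] for t, f in q_tf.items() if t in idf}
    return sorted(vectors, key=lambda d: cosine(q, vectors[d]), reverse=True)
```

With these assumed weights, rank("tax office", *build_index(DOCS)) places d2 first, because its anchor terms count double; with all multipliers set to 1.0, d1 comes first instead. This only mirrors the direction of the effect described in the abstract, not its measured size.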
Description
Describes the participation of the REINA Research Group of the University of Salamanca in the WebCLEF 2006 forum. This year the group participates with work on the Spanish Monolingual Mixed Subtask.