DOI: 10.1145/2983323.2983750
research-article

Word Vector Compositionality based Relevance Feedback using Kernel Density Estimation

Published: 24 October 2016

ABSTRACT

A limitation of standard information retrieval (IR) models is that the notion of term compositionality is restricted to pre-defined phrases and term proximity. Standard text-based IR models provide no easy way of representing semantic relations between terms that are not necessarily phrases, such as the equivalence relationship between 'osteoporosis' and the terms 'bone' and 'decay'. To alleviate this limitation, we introduce a relevance feedback (RF) method that makes use of word embedding vectors. We leverage the fact that adding word embedding vectors yields a semantic composition of the corresponding terms; e.g., adding the vectors for 'bone' and 'decay' yields a vector that is likely to be close to the vector for 'osteoporosis'. Our proposed RF model incorporates semantic relations by exploiting term compositionality with embedded word vectors. We develop the model as a generalization of the relevance model (RLM). Our experiments demonstrate that our word-embedding-based RF model significantly outperforms the RLM on standard TREC test collections, namely the TREC 6, 7, 8 and Robust ad hoc collections and the TREC 9 and 10 WT10G collections.
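The compositionality idea in the abstract can be illustrated with a small sketch. The abstract does not give the estimation formula, so everything below is an assumption made for illustration: the 3-dimensional vectors in `EMB` are invented toy values rather than trained embeddings, and `gaussian_kernel`, `kde_score` and the bandwidth `sigma` are hypothetical names standing in for a generic Gaussian kernel density estimate, not the paper's actual model. The sketch adds the vectors for 'bone' and 'decay', checks that the sum lies closer to 'osteoporosis' than to an unrelated word, and then scores candidate feedback terms by the average kernel value around the query-term vectors and their composition.

```python
import math

# Toy 3-d embeddings, invented for illustration only (not from a trained model).
EMB = {
    "bone": [0.9, 0.1, 0.0],
    "decay": [0.1, 0.9, 0.0],
    "osteoporosis": [0.7, 0.7, 0.1],
    "guitar": [0.0, 0.1, 0.9],
}


def add(u, v):
    """Element-wise vector addition, used here as the composition operator."""
    return [a + b for a, b in zip(u, v)]


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def gaussian_kernel(u, v, sigma=0.5):
    """Gaussian kernel on the distance between unit-normalized vectors.

    The bandwidth sigma is a hypothetical setting chosen for this toy data.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(normalize(u), normalize(v)))
    return math.exp(-d2 / (2 * sigma ** 2))


def kde_score(candidate, pivots, sigma=0.5):
    """Average kernel value of a candidate term around the pivot vectors."""
    vec = EMB[candidate]
    return sum(gaussian_kernel(vec, p, sigma) for p in pivots) / len(pivots)


# Compose the two query terms by vector addition.
composed = add(EMB["bone"], EMB["decay"])

# Pivots: the individual query terms plus their composition (an assumption).
pivots = [EMB["bone"], EMB["decay"], composed]
scores = {w: kde_score(w, pivots) for w in ("osteoporosis", "guitar")}

if __name__ == "__main__":
    print("cosine(composed, osteoporosis):", cosine(composed, EMB["osteoporosis"]))
    print("cosine(composed, guitar):", cosine(composed, EMB["guitar"]))
    print("kernel density scores:", scores)
```

With these toy values the composed vector is far closer to 'osteoporosis' than to 'guitar', and the kernel score ranks the two candidates the same way. In a real setting the pivots and candidates would come from trained embeddings and from terms in the feedback documents.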

    • Published in
      CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
      October 2016
      2566 pages
      ISBN:9781450340731
      DOI:10.1145/2983323

      Copyright © 2016 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      CIKM '16 paper acceptance rate: 160 of 701 submissions, 23%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.