Wikipedia-based query phrase expansion in patent class search

Cited 21 time in webofscience Cited 23 time in scopus
  • Hit : 738
  • Download : 4
Relevance feedback methods generally suffer from topic drift caused by word ambiguities and synonymous uses of words. Topic drift is an important issue in patent information retrieval as people tend to use different expressions describing similar concepts causing low precision and recall at the same time. Furthermore, failing to retrieve relevant patents to an application during the examination process may cause legal problems caused by granting an existing invention. A possible cause of topic drift is utilizing a relevance feedback-based search method. As a way to alleviate the inherent problem, we propose a novel query phrase expansion approach utilizing semantic annotations in Wikipedia pages, trying to enrich queries with phrases disambiguating the original query words. The idea was implemented for patent search where patents are classified into a hierarchy of categories, and the analyses of the experimental results showed not only the positive roles of phrases and words in retrieving additional relevant documents through query expansion but also their contributions to alleviating the query drift problem. More specifically, our query expansion method was compared against relevance-based language model, a state-of-the-art query expansion method, to show its superiority in terms of MAP on all levels of the classification hierarchy.
Publisher
SPRINGER
Issue Date
2014-10
Language
English
Article Type
Article
Keywords

INFORMATION-RETRIEVAL; LEXICAL COHESION; TERMS

Citation

INFORMATION RETRIEVAL, v.17, no.5-6, pp.430 - 451

ISSN
1386-4564
DOI
10.1007/s10791-013-9233-4
URI
http://hdl.handle.net/10203/192763
Appears in Collection
CS-Journal Papers(저널논문)
Files in This Item
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 21 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0