Generating unambiguous URL clusters from web search

Smith, Gavin; Donner, Christoph; Hooijmaijers, Dennis; Truran, Mark; Goulding, James; Ashman, Helen; Brailsford, T.

Authors

Gavin Smith

Christoph Donner

Dennis Hooijmaijers

Mark Truran

James Goulding

Helen Ashman



Abstract

This paper reports on the generation of unambiguous clusters of URLs from clickthrough data from the MSN search query log (the RFP 2006 dataset). Selections (clickthroughs) by a user from a single query can be assumed to have some semantic relevance to one another, and the URLs coselected in this way can be aggregated to form single-sense clusters. When the graphs for a single term separate into distinct clusters, the semantics of the distinct clusters can be interpreted as disambiguated sets of URLs. This principle had been tested on smaller, more constrained datasets previously, and this paper reports findings from applying a method based on the principle to the RFP 2006 dataset. The paper evaluates the proposed coselection method for single-sense clusters against two other methods, with varying parameters. The evaluation is done both with a human evaluation to determine the quality of the clusters generated by the methods, and by a simple "edit distance" analysis to assess the content difference of the methods. The main questions addressed are i) whether it is feasible to generate single-sense / sense-coherent clusters, and ii) whether, in a closed world, it would be feasible to discover ambiguous terms. The experimentation showed that sense-coherent clusters could be generated and further indicated that ambiguous terms could be detected from observing small overlap between large clusters. Copyright 2009.
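
The coselection principle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `coselection_clusters`, the `(query, clicked_urls)` input shape, and the use of plain connected components as the clustering step are all assumptions made for illustration, and the example URLs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coselection_clusters(sessions):
    """Group URLs per query term by coselection (illustrative sketch only).

    sessions: iterable of (query, clicked_urls) pairs, one pair per
    query instance issued by a user (assumed input shape).
    Returns {query: [set_of_urls, ...]}, one set per connected component.
    """
    # One undirected graph per query term: URL -> set of coselected URLs.
    graphs = defaultdict(lambda: defaultdict(set))
    for query, urls in sessions:
        unique = set(urls)
        for url in unique:
            graphs[query][url]  # register the node even if clicked alone
        for a, b in combinations(sorted(unique), 2):
            graphs[query][a].add(b)
            graphs[query][b].add(a)

    clusters = {}
    for query, adjacency in graphs.items():
        seen = set()
        components = []
        for start in adjacency:
            if start in seen:
                continue
            # Depth-first traversal collects one connected component.
            component = set()
            stack = [start]
            while stack:
                node = stack.pop()
                if node in component:
                    continue
                component.add(node)
                stack.extend(adjacency[node] - component)
            seen |= component
            components.append(component)
        clusters[query] = components
    return clusters


# Hypothetical clickthrough records for one ambiguous term.
example = [
    ("jaguar", ["cars.example/a", "cars.example/b"]),
    ("jaguar", ["cars.example/b", "cars.example/c"]),
    ("jaguar", ["zoo.example/x", "zoo.example/y"]),
]
result = coselection_clusters(example)
```

In this toy input the graph for the term separates into two components, which corresponds to the case the abstract describes as distinct clusters for a single term. A real query log would need thresholds and noise handling that this sketch leaves out.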

Citation

Smith, G., Brailsford, T., Donner, C., Hooijmaijers, D., Truran, M., Goulding, J., & Ashman, H. (2009). Generating unambiguous URL clusters from web search. https://doi.org/10.1145/1507509.1507514

Conference Name Proceedings of Workshop on Web Search Click Data, WSCD'09
Conference Location Barcelona, Spain
Start Date Feb 9, 2009
End Date Feb 11, 2009
Acceptance Date Jan 2, 2009
Publication Date Jul 14, 2009
Deposit Date Sep 25, 2018
Publicly Available Date Sep 25, 2018
Peer Reviewed Peer Reviewed
Pages 28-34
ISBN 9781605584348
DOI https://doi.org/10.1145/1507509.1507514
Public URL https://uwe-repository.worktribe.com/output/998752
Publisher URL https://doi.org/10.1145/1507509.1507514
Additional Information This is the accepted version of the paper. The final version can be found online at https://doi.org/10.1145/1507509.1507514
Title of Conference or Conference Proceedings Proceedings of the 2009 workshop on Web Search Click Data
