Gavin Smith
Generating unambiguous URL clusters from web search
Smith, Gavin; Donner, Christoph; Hooijmaijers, Dennis; Truran, Mark; Goulding, James; Ashman, Helen; Brailsford, T.
Authors
Christoph Donner
Dennis Hooijmaijers
Mark Truran
James Goulding
Helen Ashman
Tim Brailsford Tim.Brailsford@uwe.ac.uk
Professor of Computer Science
Abstract
This paper reports on the generation of unambiguous clusters of URLs from clickthrough data from the MSN search query log (the RFP 2006 dataset). Selections (clickthroughs) made by a user from a single query can be assumed to have some semantic relevance to one another, and the URLs coselected in this way can be aggregated to form single-sense clusters. When the graphs for a single term separate into distinct clusters, the semantics of the distinct clusters can be interpreted as disambiguated groups of URLs. This principle had been tested on smaller, more constrained datasets previously, and this paper reports findings from applying a method based on the principle to the 2006 dataset. The paper evaluates the proposed coselection method for generating single-sense clusters against two other methods, with varying parameters. The evaluation is done both with a human evaluation to determine the quality of the clusters generated by the methods, and by a simple "edit distance" analysis to measure the content difference of the methods. The main questions addressed are i) whether it is feasible to generate single-sense / sense-coherent clusters, and ii) whether, in a closed world, it would be feasible to discover ambiguous terms. The experimentation showed that sense-coherent clusters were generated and further indicated that ambiguous terms could be detected from observing small overlap between large clusters. Copyright 2009.
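The coselection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record schema `(query_instance_id, term, url)`, the function name `coselection_clusters`, and the choice of connected components as the clustering step are all assumptions made for illustration, and the URLs in the usage example are invented.

```python
from collections import defaultdict
from itertools import combinations


def coselection_clusters(click_log):
    """Group URLs per query term by coselection (illustrative sketch only).

    click_log: iterable of (query_instance_id, term, url) tuples.
               This schema is hypothetical, not the RFP 2006 format.
    Returns {term: [set_of_urls, ...]}, largest cluster first.
    """
    # Collect the URLs clicked within each single query instance.
    clicks = defaultdict(set)
    for instance_id, term, url in click_log:
        clicks[(instance_id, term)].add(url)

    # Build one undirected coselection graph per term:
    # two URLs are linked if they were clicked from the same query instance.
    graphs = defaultdict(lambda: defaultdict(set))
    for (_, term), urls in clicks.items():
        for url in urls:
            graphs[term][url]  # register the node even if clicked alone
        for a, b in combinations(urls, 2):
            graphs[term][a].add(b)
            graphs[term][b].add(a)

    # Assumption: treat each connected component as one candidate cluster.
    result = {}
    for term, adjacency in graphs.items():
        seen, clusters = set(), []
        for start in adjacency:
            if start in seen:
                continue
            stack, component = [start], set()
            seen.add(start)
            while stack:
                node = stack.pop()
                component.add(node)
                for neighbour in adjacency[node]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
            clusters.append(component)
        result[term] = sorted(clusters, key=len, reverse=True)
    return result


# Usage with made-up click records for the term "jaguar".
log = [
    (1, "jaguar", "cars.example/a"),
    (1, "jaguar", "cars.example/b"),
    (2, "jaguar", "cars.example/b"),
    (2, "jaguar", "cars.example/c"),
    (3, "jaguar", "zoo.example/x"),
    (3, "jaguar", "zoo.example/y"),
]
clusters = coselection_clusters(log)["jaguar"]
```

In this toy log the graph for "jaguar" separates into two components, one per group of coselected URLs. How the paper itself forms, filters, or compares clusters may differ from this sketch.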
Presentation Conference Type | Conference Paper (Published) |
---|---|
Conference Name | Proceedings of Workshop on Web Search Click Data, WSCD'09 |
Start Date | Feb 9, 2009 |
End Date | Feb 11, 2009 |
Acceptance Date | Jan 2, 2009 |
Publication Date | Jul 14, 2009 |
Deposit Date | Sep 25, 2018 |
Publicly Available Date | Sep 25, 2018 |
Peer Reviewed | Peer Reviewed |
Pages | 28-34 |
ISBN | 9781605584348 |
DOI | https://doi.org/10.1145/1507509.1507514 |
Public URL | https://uwe-repository.worktribe.com/output/998752 |
Publisher URL | https://doi.org/10.1145/1507509.1507514 |
Additional Information | This is the accepted version of the paper. The final version can be found online at https://doi.org/10.1145/1507509.1507514. Title of Conference or Conference Proceedings: Proceedings of the 2009 workshop on Web Search Click Data |
Contract Date | Sep 25, 2018 |
Files
Generating_unambiguous_URL_clusters_from_Web_searc.pdf
(380 Kb)
PDF