A knowledge-based approach to Information Extraction for semantic interoperability in the archaeology domain

Vlachidis, Andreas; Tudhope, Douglas

Authors

Douglas Tudhope



Abstract

© 2015 ASIS&T. The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain-oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources, and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.

Citation

Vlachidis, A., & Tudhope, D. (2016). A knowledge-based approach to Information Extraction for semantic interoperability in the archaeology domain. Journal of the Association for Information Science and Technology, 67(5), 1138-1152. https://doi.org/10.1002/asi.23485

Journal Article Type: Article
Acceptance Date: Jan 1, 2015
Publication Date: May 1, 2016
Deposit Date: Feb 6, 2018
Publicly Available Date: Feb 6, 2018
Journal: Journal of the Association for Information Science and Technology
Electronic ISSN: 2330-1643
Publisher: Association for Information Science and Technology (ASIS&T)
Peer Reviewed: Peer Reviewed
Volume: 67
Issue: 5
Pages: 1138-1152
DOI: https://doi.org/10.1002/asi.23485
Keywords: knowledge-based approach, information extraction, semantic interoperability, archaeology domain
Public URL: https://uwe-repository.worktribe.com/output/844942
Publisher URL: http://dx.doi.org/10.1002/asi.23485
