A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain
Vlachidis, Andreas; Tudhope, Douglas
Authors
Andreas Vlachidis Andreas.Vlachidis@uwe.ac.uk
Senior Lecturer in Computer Science
Douglas Tudhope
Abstract
The paper presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. Performance is evaluated via the Gold Standard method. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection and Word Sense Disambiguation. It uses hand-crafted rules and terminological resources to associate contextual abstractions with classes of the standard ontology (ISO 21127:2006) CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH, together with concepts from English Heritage thesauri and glossaries.
Relation Extraction performance benefits from a syntax-based definition of relation extraction patterns derived from domain-oriented corpus analysis. The evaluation also shows a clear benefit from the use of assistive NLP modules relating to word sense disambiguation, negation detection and noun phrase validation, together with controlled thesaurus expansion.
The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include the recognition of relevant entities using shallow-parsing NLP techniques driven by a complementary use of ontological and terminological domain resources, and the empirical derivation of context-driven relation extraction rules for the recognition of semantic relationships from phrases of unstructured text. The semantic annotations have proven capable of supporting semantic query, document study and cross-searching via the ontology framework.
Citation
Vlachidis, A., & Tudhope, D. (2015). A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain. Journal of the Association for Information Science and Technology, 67(5), 1138-1152. https://doi.org/10.1002/asi.23485
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jan 1, 2015 |
Publication Date | Jan 1, 2015 |
Journal | Journal of the Association for Information Science and Technology |
Publisher | Association for Information Science and Technology (ASIS&T) |
Peer Reviewed | Peer Reviewed |
Volume | 67 |
Issue | 5 |
Pages | 1138-1152 |
DOI | https://doi.org/10.1002/asi.23485 |
Keywords | knowledge-based approach, information extraction, semantic interoperability, archaeology domain |
Public URL | https://uwe-repository.worktribe.com/output/844942 |
Publisher URL | http://dx.doi.org/10.1002/asi.23485 |
Files
AKnowledgeBasedApproachtoInformationExtraction_Revised_V5.pdf
(885 Kb)
PDF