A study of semantic integration across archaeological data and reports in different languages

Binding, Ceri; Tudhope, Douglas; Vlachidis, Andreas


Authors

Ceri Binding

Douglas Tudhope

Andreas Vlachidis

Abstract

© The Author(s) 2018. This study investigates the semantic integration of data extracted from archaeological datasets with information extracted via natural language processing (NLP) across different languages. The investigation follows a broad theme relating to wooden objects and their dating via dendrochronological techniques, including types of wooden material, samples taken, and wooden objects including shipwrecks. The outcomes are an integrated RDF dataset and an associated interactive research demonstrator, a query builder application. The semantic framework combines the CIDOC Conceptual Reference Model (CRM) with the Getty Art and Architecture Thesaurus (AAT). The NLP, data cleansing and integration methods are described in detail together with illustrative scenarios from the web application Demonstrator. Reflections and recommendations from the study are discussed. The Demonstrator is a novel SPARQL web application, with CRM/AAT-based data integration. Functionality includes the combination of free-text and semantic search with browsing on semantic links, and thesaurus query expansion over hierarchical and associative relationships. Queries concern wooden objects (e.g. samples of beech wood keels), optionally from a given date range, with automatic expansion over AAT hierarchies of wood types and specialised associative relationships. Following a ‘mapping pattern’ approach (via the STELETO tool) ensured validity and consistency of all RDF output. The user is shielded from the complexity of the underlying semantic framework by a query builder user interface. The study demonstrates the feasibility of connecting information extracted from datasets and grey literature reports in different languages, and of semantic cross-searching of the integrated information. The semantic linking of textual reports and datasets opens new possibilities for integrative research across diverse resources.

Citation

Binding, C., Tudhope, D., & Vlachidis, A. (2019). A study of semantic integration across archaeological data and reports in different languages. Journal of Information Science, 45(3), 364-386. https://doi.org/10.1177/0165551518789874

Journal Article Type: Article
Acceptance Date: Jun 28, 2018
Online Publication Date: Jul 31, 2018
Publication Date: Jun 1, 2019
Deposit Date: Jul 2, 2018
Publicly Available Date: Jul 2, 2018
Journal: Journal of Information Science
Print ISSN: 0165-5515
Electronic ISSN: 1741-6485
Publisher: SAGE Publications
Peer Reviewed: Yes
Volume: 45
Issue: 3
Pages: 364-386
DOI: https://doi.org/10.1177/0165551518789874
Keywords: digital archaeology, semantic technologies, multilingual NLP, CIDOC-CRM
Public URL: https://uwe-repository.worktribe.com/output/873244
Publisher URL: https://doi.org/10.1177/0165551518789874
Additional Information: ©2018. Reprinted by permission of SAGE Publications
