A study of semantic integration across archaeological data and reports in different languages
Authors
Binding, Ceri; Tudhope, Douglas; Vlachidis, Andreas
Abstract
© The Author(s) 2018. This study investigates the semantic integration of data extracted from archaeological datasets with information extracted via natural language processing (NLP) across different languages. The investigation follows a broad theme relating to wooden objects and their dating via dendrochronological techniques, covering types of wooden material, samples taken and wooden objects such as shipwrecks. The outcomes are an integrated RDF dataset and an associated interactive research demonstrator in the form of a query builder application. The semantic framework combines the CIDOC Conceptual Reference Model (CRM) with the Getty Art and Architecture Thesaurus (AAT). The NLP, data cleansing and integration methods are described in detail together with illustrative scenarios from the web application Demonstrator. Reflections and recommendations from the study are discussed. The Demonstrator is a novel SPARQL web application, with CRM/AAT-based data integration. Functionality includes the combination of free text and semantic search with browsing on semantic links, and thesaurus query expansion over hierarchical and associative relationships. Queries concern wooden objects (e.g. samples of beech wood keels), optionally from a given date range, with automatic expansion over AAT hierarchies of wood types and specialised associative relationships. Following a ‘mapping pattern’ approach (via the STELETO tool) ensured the validity and consistency of all RDF output. The user is shielded from the complexity of the underlying semantic framework by a query builder user interface. The study demonstrates the feasibility of connecting information extracted from datasets and grey literature reports in different languages and of semantic cross-searching of the integrated information. The semantic linking of textual reports and datasets opens new possibilities for integrative research across diverse resources.
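The hierarchical query expansion mentioned in the abstract can be illustrated with a minimal sketch. This is not the Demonstrator's implementation, which the abstract describes as a SPARQL application over CRM/AAT data. The toy hierarchy, the records, and the names `NARROWER`, `RECORDS`, `expand_concept` and `search` are all hypothetical, and the concept labels are not real AAT identifiers.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy hierarchy: broader concept -> narrower concepts.
# Labels are illustrative only and are NOT real AAT identifiers.
NARROWER: Dict[str, List[str]] = {
    "wood": ["hardwood", "softwood"],
    "hardwood": ["beech", "oak"],
    "softwood": ["pine"],
}

# Hypothetical records: (object label, material concept, dated year).
RECORDS: List[Tuple[str, str, int]] = [
    ("keel sample", "beech", 1450),
    ("plank", "oak", 1620),
    ("mast fragment", "pine", 1580),
    ("anchor", "iron", 1500),
]


def expand_concept(concept: str, narrower: Dict[str, List[str]]) -> Set[str]:
    """Return the concept plus all of its transitive narrower concepts."""
    expanded: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue  # guard against cycles or repeated branches
        expanded.add(current)
        stack.extend(narrower.get(current, []))
    return expanded


def search(material: str,
           date_range: Optional[Tuple[int, int]] = None) -> List[str]:
    """Find records whose material falls under the given concept,
    optionally restricted to an inclusive (start, end) year range."""
    concepts = expand_concept(material, NARROWER)
    results: List[str] = []
    for label, record_material, year in RECORDS:
        if record_material not in concepts:
            continue
        if date_range is not None and not (date_range[0] <= year <= date_range[1]):
            continue
        results.append(label)
    return results


print(search("wood", (1400, 1600)))  # ['keel sample', 'mast fragment']
print(search("hardwood"))            # ['keel sample', 'plank']
```

A query for the broad concept "wood" therefore also matches records tagged with narrower wood types, while the unrelated "iron" record is excluded. In a SPARQL setting a comparable expansion is commonly expressed with a property path over the thesaurus hierarchy, though the abstract does not state which mechanism the Demonstrator uses.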
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 28, 2018 |
Online Publication Date | Jul 31, 2018 |
Publication Date | Jun 1, 2019 |
Deposit Date | Jul 2, 2018 |
Publicly Available Date | Jul 2, 2018 |
Journal | Journal of Information Science |
Print ISSN | 0165-5515 |
Electronic ISSN | 1741-6485 |
Publisher | SAGE Publications |
Peer Reviewed | Peer Reviewed |
Volume | 45 |
Issue | 3 |
Pages | 364-386 |
DOI | https://doi.org/10.1177/0165551518789874 |
Keywords | digital archaeology, semantic technologies, multilingual NLP, CIDOC-CRM |
Public URL | https://uwe-repository.worktribe.com/output/873244 |
Publisher URL | https://doi.org/10.1177/0165551518789874 |
Additional Information | ©2018. Reprinted by permission of SAGE Publications |
Contract Date | Jul 2, 2018 |
Files
Archaeology-integration-JISauthorversion.pdf
(2.8 Mb)
PDF
Archaeology-integration-JISauthorversion.docx
(1.4 Mb)
Document