From Natural Language to Ontology Population in the Cultural Heritage Domain. A Computational Linguistics-based approach.

Maria Pia di Buono, Mario Monteleone


Abstract
This paper presents ongoing Natural Language Processing (NLP) research based on Lexicon-Grammar (LG) and aimed at improving knowledge management in the Cultural Heritage (CH) domain. We intend to demonstrate how our language formalization technique can be applied to both processing and populating a domain ontology. We also use NLP techniques for text extraction and mining to fill information gaps and improve access to cultural resources. The Linguistic Resources (LRs, i.e. electronic dictionaries) we built can be used in structuring effective Knowledge Management Systems (KMSs). To apply the classes and properties defined by the Conseil International des Musées (CIDOC) Conceptual Reference Model (CRM) to Parts of Speech (POS), we use Finite State Transducers/Automata (FSTs/FSA) and their variables, built in the form of graphs. FSTs/FSA are also used to analyse corpora in order to retrieve recursive sentence structures, in which combinatorial and semantic constraints identify properties and denote relationships. In addition, FSTs/FSA are used to match our electronic dictionary entries (ALUs, or Atomic Linguistic Units) to RDF subjects, objects and predicates (SKOS Core Vocabulary). This matching of linguistic data to RDF and their translation into SPARQL/SeRQL path expressions allow the use of ALUs to process natural-language queries.
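The matching step described in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: a regular expression stands in for one path of an FST graph, a small Python dict stands in for the electronic dictionary, and every entry, class label, predicate name and the `ex:` namespace below are hypothetical placeholders rather than actual CIDOC CRM or SKOS identifiers.

```python
import re

# Hypothetical mini electronic dictionary: ALU -> illustrative class label.
# The "ex:" labels are placeholders, not real CIDOC CRM identifiers.
DICTIONARY = {
    "doric column": "ex:ArchitecturalElement",
    "temple of hera": "ex:Monument",
    "paestum": "ex:Place",
}

# Each regex stands in for one FST path; the paired string is the
# (hypothetical) RDF predicate that the path denotes.
PATTERNS = [
    (re.compile(r"^(?P<subj>.+?) is part of (?P<obj>.+?)\.?$", re.IGNORECASE),
     "ex:formsPartOf"),
    (re.compile(r"^(?P<subj>.+?) is located in (?P<obj>.+?)\.?$", re.IGNORECASE),
     "ex:hasLocation"),
]

DETERMINER = re.compile(r"^(the|a|an)\s+", re.IGNORECASE)


def normalize(phrase):
    """Lowercase a phrase and drop a leading English determiner."""
    return DETERMINER.sub("", phrase.strip()).lower()


def sentence_to_triple(sentence):
    """Return (subject, predicate, object) when both ALUs are known, else None."""
    for pattern, predicate in PATTERNS:
        match = pattern.match(sentence.strip())
        if match is None:
            continue
        subj = normalize(match.group("subj"))
        obj = normalize(match.group("obj"))
        # The dictionary lookup plays the role of the lexical constraint.
        if subj in DICTIONARY and obj in DICTIONARY:
            return (subj, predicate, obj)
    return None


def triple_to_sparql(triple):
    """Build a SPARQL SELECT query string from a matched triple."""
    subj, predicate, obj = triple
    return (
        "PREFIX ex: <http://example.org/ch#>\n"
        "SELECT ?s ?o WHERE {\n"
        f"  ?s a {DICTIONARY[subj]} .\n"
        f"  ?o a {DICTIONARY[obj]} .\n"
        f"  ?s {predicate} ?o .\n"
        "}"
    )


if __name__ == "__main__":
    triple = sentence_to_triple("The Doric column is part of the Temple of Hera.")
    if triple is not None:
        print(triple)
        print(triple_to_sparql(triple))
```

In this sketch a sentence yields a triple only when both of its arguments are dictionary entries, which loosely mirrors the idea that lexical and combinatorial constraints decide which properties a sentence structure expresses.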
Anthology ID:
L14-1538
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3661–3666
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/686_Paper.pdf
Cite (ACL):
Maria Pia di Buono and Mario Monteleone. 2014. From Natural Language to Ontology Population in the Cultural Heritage Domain. A Computational Linguistics-based approach. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3661–3666, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
From Natural Language to Ontology Population in the Cultural Heritage Domain. A Computational Linguistics-based approach. (di Buono & Monteleone, LREC 2014)