The paper describes the main challenges faced and the solutions adopted within the project DASI - Digital Archive for the study of pre-Islamic Arabian inscriptions. In particular, it discusses the methodological and technological issues that emerged in the conversion from a domain-specific, text-based digital edition of an epigraphic corpus to an objective-driven archive for the study and dissemination of inscriptions in different languages and scripts. With a view to keeping pace with, and possibly fostering reflection on, best practices in the community of digital epigraphers beyond each specific cultural or linguistic domain, special attention is devoted to: the modelling of data and encoding (XML annotation vs database approach; the conceptual model for the valorization of the material aspect of the epigraph; the textual encoding for critical editions); interoperability (pros and cons of compliance with standards; harmonization of metadata; openness; semantic interoperability); lexicography (tools for under-resourced languages; translations).
Hesperia. Banco de datos de lenguas paleohispánicas (BDHesp) and AELAW. Ancient European Languages and Writings are two closely linked projects that share a general aim: cataloguing the documents written in the ancient languages of Europe (8th cent. BCE–5th cent. CE), excluding Latin, Greek, and Phoenician. Although the two projects are closely linked, BDHesp has a track record of twenty years, while AELAW has been active for only two and a half years. In this paper, which focuses especially on BDHesp, we summarize the problems that arose during the encoding of Palaeohispanic languages, written in multiple writing systems and their variants, and the solutions adopted. We also present the promising tools that have been developed in BDHesp to make significant progress in our understanding of Palaeohispanic languages and writings. Lastly, we introduce the AELAW network and its two databases, its aims, and what we intend to accomplish in the future.
The Ramses Project aims to build an electronic corpus that brings together all the surviving texts in Late Egyptian (New Kingdom and Third Intermediate Period). Far from being limited to the digital entry of the documents, the corpus will be enriched with a series of ecdotic and linguistic annotations intended to allow a finer understanding of the language of the Ramessides.
This paper reviews the experience of the Ramses Project in constructing a richly annotated corpus of Late Egyptian that comprised 300,000 words in 2011 (and is expected to grow to more than 1 million words in the coming years). During the first five years of the project, this corpus has been encoded in hieroglyphic script, translated into French or English, and annotated with part-of-speech information, lemmatization, and morphological analysis. The methodology and working tools that have been developed to build this corpus are discussed here, and future developments are presented.
This paper reports on the construction-based Treebank currently under development in the framework of the Ramses Project, which aims at building a multifaceted annotated corpus of Late Egyptian texts. We describe the specifications that have been implemented and we introduce the syntactic formalism and the related representation format that are used for the syntactic annotation. Furthermore, the annotation scheme is discussed with particular attention paid to its evolutionary nature. Finally, we explain the methods as well as the annotating tool, called SyntaxEditor; we conclude by addressing the question of forthcoming developments, especially the search engine and a context-sensitive parser.
Digital epigraphy has made great strides toward interoperability and data integration over the last two decades, and Linked Data approaches are now taking advantage of the spatial information associated with inscriptions for new search and visualization tools. The ability to search across epigraphic collections by time, and especially by relative chronologies, lags behind. The PeriodO project has created a Linked Data gazetteer of structured period definitions that facilitates translation between absolute dates and relative chronologies, creating new possibilities for the interoperability of epigraphic collections and their connection with archaeological databases.
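As a rough illustration of what translating between absolute dates and relative chronologies can look like in practice, the following minimal sketch models a tiny in-memory gazetteer of period definitions. It does not reproduce the PeriodO data model or API: the class `PeriodDefinition`, the functions `periods_overlapping` and `date_range_for`, and the two placeholder periods are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PeriodDefinition:
    """One structured period definition (hypothetical shape, not PeriodO's schema)."""
    label: str
    region: str
    start_year: int  # negative values stand for years BCE
    end_year: int


# Placeholder entries for illustration only; they are not taken from any real dataset.
GAZETTEER = [
    PeriodDefinition("Example Period A", "Example Region", -800, -500),
    PeriodDefinition("Example Period B", "Example Region", -500, -200),
]


def periods_overlapping(start: int, end: int,
                        gazetteer: List[PeriodDefinition] = GAZETTEER) -> List[PeriodDefinition]:
    """Absolute dates -> relative chronology: periods whose span overlaps [start, end]."""
    return [p for p in gazetteer if p.start_year <= end and p.end_year >= start]


def date_range_for(label: str,
                   gazetteer: List[PeriodDefinition] = GAZETTEER) -> Optional[Tuple[int, int]]:
    """Relative chronology -> absolute dates: the year range of a named period, if known."""
    for p in gazetteer:
        if p.label == label:
            return (p.start_year, p.end_year)
    return None
```

Under these assumptions, an inscription dated between 600 and 550 BCE would be matched with `periods_overlapping(-600, -550)`, while a record that carries only a period label could be given a searchable year range with `date_range_for`.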
This paper introduces Ramses, a database of Late Egyptian texts, currently under development at the University of Liège (Belgium). Ramses sets out to be a new and powerful research tool. Its main applications are linguistically and philologically orientated. After a general overview of the structure of the database, the search engines are described in some detail.
First official presentation of the "Ramses Project", a richly annotated corpus of Late Egyptian [Paper submitted in 2008/2009]