EpiHub - Open Digital Epigraphy Hub

Aioanei, Andrei C., Regine R. Hunziker-Rodewald, Konstantin M. Klein, and Dominik L. Michels. 2024. “Deep Aramaic: Towards a Synthetic Data Paradigm Enabling Machine Learning in Epigraphy.” PLOS ONE 19 (4): e0299297. doi:10.1371/journal.pone.0299297.

Epigraphy is witnessing a growing integration of artificial intelligence, notably through its subfield of machine learning (ML), especially in tasks like extracting insights from ancient inscriptions. However, scarce labeled data for training ML algorithms severely limits current techniques, especially for ancient scripts like Old Aramaic. Our research pioneers an innovative methodology for generating synthetic training data tailored to Old Aramaic letters. Our pipeline synthesizes photo-realistic Aramaic letter datasets, incorporating textural features, lighting, damage, and augmentations to mimic real-world inscription diversity. Despite minimal real examples, we engineer a dataset of 250 000 training and 25 000 validation images covering the 22 letter classes in the Aramaic alphabet. This comprehensive corpus provides a robust volume of data for training a residual neural network (ResNet) to classify highly degraded Aramaic letters. The ResNet model demonstrates 95% accuracy in classifying real images from the 8th century BCE Hadad statue inscription. Additional experiments validate performance on varying materials and styles, proving effective generalization. Our results validate the model’s capabilities in handling diverse real-world scenarios, proving the viability of our synthetic data approach and avoiding the dependence on scarce training data that has constrained epigraphic analysis. Our innovative framework elevates interpretation accuracy on damaged inscriptions, thus enhancing knowledge extraction from these historical resources.

Arkhipov, Ilya, Dominique Charpin, Christian Gaubert, and Nele Ziegler. 2024. “The Old Babylonian Glossary of the ARCHIBAB Text Corpus: Results and Prospects.” Revue d’assyriologie et d’archéologie Orientale 118 (1): 91–102. https://shs.cairn.info/revue-revue-d-assyriologie-et-d-archeologie-orientale-2024-1-page-91.

Bodel, John, Jonathan Prag, and Charlotte Roueché. 2024. “Open Scholarship: Epigraphic Corpora in the Digital Age.” In L’épigraphie Au XXIe Siècle. Actes Du XVIe Congrès International d’Épigraphie Grecque et Latine, Bordeaux, 29 Août-02 Septembre 2022, edited by Pierre Fröhlich and Milagros Navarro Caballero, 91–117. Bordeaux: Ausonius.

Bordreuil, Etienne, Valérie Matoïan, and Jan Tavernier, eds. 2024. Administrations et Pratiques Comptables Au Proche-Orient Ancien: Actes Du Colloque International, Tenu a Louvain-La-Neuve, Les 21-22 Fevrier 2019. Peeters Publishers. doi:10.2307/jj.20261347.

Cobanoglu, Yunus, Jussi Laasonen, Fabian Simonjetz, Ilya Khait, Sophie Cohen, Zsombor Földi, Aino Hätinen, et al. 2024. “Transliterated Cuneiform Tablets of the Electronic Babylonian Library Platform.” Journal of Open Humanities Data 10 (1). doi:10.5334/johd.148.

This work presents a corpus of transliterated cuneiform tablets from the Electronic Babylonian Library (eBL) platform, including a public API endpoint to download the latest version of the data, and a Python library to parse the transliterations in ATF format. As of the time of writing, the constantly growing dataset contains around 25,000 tablets with over 350,000 lines of transliterated text. This dataset is a sizeable addition to open-source cuneiform data and a major milestone for research within the fields of cuneiform studies and NLP.
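
As a rough illustration of what parsing ATF transliterations involves, the sketch below reads a simplified ATF-like text and groups numbered lines by tablet surface. It is not the eBL Python library and does not call the eBL API; the sample text, the `parse_atf` function, and the handling of `&`, `@`, and `#` lines are assumptions made for illustration only.

```python
import re

# Hypothetical sample in simplified ATF-like notation (not taken from the eBL corpus).
SAMPLE = """\
&P000001 = Example tablet
@obverse
1. a-na be-li2-ia
2. qi2-bi2-ma
# a comment line
@reverse
1'. um-ma ...
"""

# A text line is assumed to be: line number (optionally primed), a dot, then the signs.
LINE_RE = re.compile(r"^(\d+'?)\.\s+(.*)$")


def parse_atf(text):
    """Return (surface, line_number, tokens) triples from ATF-like text."""
    surface = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, header lines (&) and comment lines (#).
        if not line or line.startswith(("#", "&")):
            continue
        # Lines starting with @ are treated as surface markers (obverse, reverse).
        if line.startswith("@"):
            surface = line[1:]
            continue
        match = LINE_RE.match(line)
        if match:
            rows.append((surface, match.group(1), match.group(2).split()))
    return rows


if __name__ == "__main__":
    for surface, number, tokens in parse_atf(SAMPLE):
        print(surface, number, tokens)
```

Real ATF has many more constructs (state lines, damage markers, language shifts), so a dedicated parser such as the one described in the article is the appropriate tool for actual corpus work.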

Delvigne, Damien, and Charles Doyen. 2024. “Collecter et étudier les poids antiques et médiévaux: le projet pondera online.” In Administrations et Pratiques Comptables Au Proche-Orient Ancien: Actes Du Colloque International, Tenu a Louvain-La-Neuve, Les 21-22 Fevrier 2019, edited by Etienne Bordreuil, Valérie Matoïan, and Jan Tavernier, 207–22. Louvain-la-Neuve. doi:10.2307/jj.20261347.10.

Fendel, Victoria Beatrix. 2024. “The I.Sicily Sketch Engine Corpus.” Journal of Open Humanities Data 10: 56. doi:10.5334/johd.258.

Grunewald, Susan, and Ruth Mostern. 2024. “Working with Named Places: How and Why to Build a Gazetteer.” Edited by Yann Ryan. Programming Historian, no. 13. University of Sussex. doi:10.46430/phen0117.

A digital gazetteer records information associated with specific places. This lesson teaches you how to create a gazetteer from a historical text, using the Linked Places Delimited (LP-TSV) format.
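
To make the idea of a delimited gazetteer concrete, the following sketch writes a tiny tab-separated file with Python's standard `csv` module. The column subset (`id`, `title`, `title_source`, `start`, `lon`, `lat`) is an assumption based on a recollection of the LP-TSV specification and should be checked against the current version; the place records and the `to_lp_tsv` helper are hypothetical.

```python
import csv
import io

# Assumed subset of LP-TSV columns; verify against the current specification.
COLUMNS = ["id", "title", "title_source", "start", "lon", "lat"]

# Hypothetical place records, for illustration only.
PLACES = [
    {"id": "ex-1", "title": "Exampleton",
     "title_source": "Hypothetical Chronicle (1850)",
     "start": "1850", "lon": "10.0", "lat": "45.0"},
    {"id": "ex-2", "title": "Sampleville",
     "title_source": "Hypothetical Chronicle (1850)",
     "start": "1851", "lon": "11.5", "lat": "44.2"},
]


def to_lp_tsv(rows, columns=COLUMNS):
    """Serialize place records as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_lp_tsv(PLACES))
```

Keeping one row per place and one column per attribute is what lets such a file be edited in a spreadsheet and later uploaded to a gazetteer platform.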

Heřmánková, Petra, Brian Ballsun-Stanton, and Ray Laurence. 2024. “FAIR Turn in Epigraphy: Low Barrier Pathways to Quantitative and Reproducible Research in Latin Epigraphy.” In CHR 2024. Computational Humanities Research 2024. Proceedings of the Computational Humanities Research Conference 2024 Aarhus, Denmark, December 4-6, 2024, edited by Wouter Haverals, Marijn Koolen, and Laure Thompson, 3834:649–61. CEUR Workshop Proceedings. CEUR. https://ceur-ws.org/Vol-3834/paper4.pdf.

The application of FAIR (Findable, Accessible, Interoperable, Reusable) principles can revolutionise the epigraphic discipline by facilitating quantitative and reproducible research. Despite the richness of Latin inscriptions, the lack of low-barrier tools for accessing and analysing these datasets has hindered large-scale studies and the uptake of FAIR and Open Science principles in ancient studies. The LatEpig v2.0 tool addresses this gap by enabling researchers to programmatically access the Epigraphic Database Clauss-Slaby and generate reproducible research following state-of-the-art standards. The main aim of LatEpig is to democratise data access and enhance research potential without requiring advanced technical skills. A case study on ‘viator’ inscriptions exemplifies the tool’s utility, illustrating spatial and temporal trends in inscriptions addressing messengers and travellers across the Roman Empire. LatEpig shows that the development of similar tools is crucial for advancing FAIR and Open Science practices in the Humanities, ensuring that substantial investments in digital resources are fully realised.

Jakacki, Diane K. 2024. “Review: World Historical Gazetteer.” Reviews in Digital Humanities 5 (5). PubPub. doi:10.21428/3e88f64f.6bceb2bf.

Palladino, Chiara, and Tariq Yousef. 2024. “Development of Robust NER Models and Named Entity Tagsets for Ancient Greek.” In Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024, edited by Rachele Sprugnoli and Marco Passarotti, 89–97. Torino, Italia: ELRA and ICCL. https://aclanthology.org/2024.lt4hala-1.11/.

This contribution presents a novel approach to the development and evaluation of transformer-based models for Named Entity Recognition and Classification in Ancient Greek texts. We trained two models with annotated datasets by consolidating potentially ambiguous entity types under a harmonized set of classes. Then, we tested their performance with out-of-domain texts, reproducing a real-world use case. Both models performed very well under these conditions, with the multilingual model being slightly superior to the monolingual one. In the conclusion, we emphasize current limitations due to the scarcity of high-quality annotated corpora and to the lack of cohesive annotation strategies for ancient languages.

Prag, Jonathan, and Valentina Mignosa. 2024. “I.Sicily, Crossreads e l’approccio digitale ai documenti epigrafici dall’area elima.” In Conflitto e cultura civica nella storia della Sicilia antica: tra stasis e homonoia, edited by Carmine Ampolo, Rossella Giglio, Anna Magnetto, and Maria Cecilia Parra, 77–94. Pisa.

Prag, Jonathan, and Alfredo Tosques. 2024. “I.Sicily as a Tool for the Study of Roman Sicily: An Experiment in Institutional Annotation.” Gerión. Revista de Historia Antigua 42 (Esp.): 73–91. doi:10.5209/geri.95520.

The study of Roman Sicily is well established and has a long tradition, with the two most authoritative epigraphic corpora, CIL X (1883) and IG XIV (1890), dating to the late 19th century. While I.Sicily was conceived to offer easy and up-to-date access to the ever-growing but increasingly scattered epigraphic evidence of Sicily, its digital nature also enables the adoption of new approaches and the pursuit of novel research questions. The open-access dataset has recently been expanded to include institutional annotations, which hold great promise for research, particularly in fields that rely on extensive and detailed datasets, such as administrative and onomastic history (prosopographic annotation will follow). This paper aims to demonstrate both the potential and the limitations of a digitally annotated dataset as a tool for historical research, through a preliminary case study on the practice of dedications to the Roman emperor in Sicily. Recent scholarship suggests that provincial subjects also contributed to shaping the notion of emperorship and the expectations around it, which were not only imposed from above. The data-driven approach facilitated by an annotated corpus is well suited to the new bottom-up perspective, but it is not without methodological pitfalls, which will be highlighted in this paper.

Prager, Christian M., Katja Diederichs, Antje Grothe, Nikolai Grube, Guido Krempel, Mallory Matsumoto, Tobias Mercer, Cristina Vertan, and Elisabeth Wagner. 2024. “IDIOM: A Digital Research Environment for the Documentation and Study of Maya Hieroglyphic Texts and Language.” In Writing from Invention to Decipherment, edited by Silvia Ferrara, Barbara Montecchi, and Miguel Valério, 227–51. Oxford: Oxford University Press. doi:10.1093/oso/9780198908746.003.0012.

The Maya hieroglyphic script, an indigenous graphic notation system in the Americas, presents a formidable decipherment challenge. Approximately 40 per cent of its approximately one thousand known signs remain elusive owing to limited comprehension of the Classic Mayan language. Spanning modern-day south-eastern Mexico, Guatemala, Belize, and western Honduras, the Classic Maya civilization left over ten thousand inscriptions, primarily detailing the lives of political elites. The ‘Text Database and Dictionary of Classic Mayan’ project endeavours to unveil the script’s mysteries via an online text database and dictionary at https://classicmayan.org. Collaborative digital humanities methodologies and tools empower insights into the Maya’s cultural and historical legacy. The project catalogues inscribed artefacts and images in the virtual research environment TextGrid and the ‘Maya Image Archive (MIA)’, enhancing accessibility and collaboration. It further converts Maya hieroglyphic texts into machine-readable XML/TEI format and employs a novel sign classification framework. A new linguistic tool facilitates linguistic analysis and translation, enriching our understanding of Classic Mayan language and culture. Furthermore, the project compiles a vast repository of digitized Maya culture-related images and textual data, accessible online. As of 2024, it focuses on hieroglyphic texts from specific regions, with ongoing transliteration, transcription, and linguistic analysis. This digital approach not only facilitates dynamic Maya script research but also offers a platform for comprehensive source material evaluation and publication.

Rossi, Irene. 2024. “Corpus of South Arabian Inscriptions Bibliography.” Bibliographical dataset. Zotero. doi:10.5281/zenodo.14274417.

Rossi, Irene, and Chiara Salvador. 2024. “An Observatory of Epigraphic Resources on the Web: The Open Digital Epigraphy Hub.” Archeologia e Calcolatori 35 (2): 503–23. doi:10.19282/ac.35.2.2024.51.

The Open Digital Epigraphy Hub (EpiHub) is an open access digital platform developed to streamline accessibility and organization of resources in digital epigraphy. Created within the Humanities and Cultural Heritage Italian Open Science Cloud (H2IOSC), EpiHub addresses the fragmented landscape of digital epigraphic resources, which span disciplines like linguistics, philology, and archaeology. Offering a comprehensive catalogue of national and international resources – such as datasets, digital tools, geographical and chronological gazetteers, dictionaries, and text-processing software – EpiHub structures these assets through descriptive metadata to facilitate discoverability and usability for researchers and practitioners across diverse cultural and temporal scopes. The platform’s flexible back-end architecture supports efficient data management and real-time updates to enhance front-end accessibility, organizing resources by thematic collections and allowing advanced searches based on specific epigraphic needs, such as language, geographic region, or historical period. Emphasizing FAIR principles, EpiHub standardizes metadata and controlled vocabularies to foster broader interoperability and data reuse across research projects. Integrated with related H2IOSC resources, including H-SeTIS and DHeLO, EpiHub aims to become a central resource, continuously enriched to support collaboration and innovation within the digital epigraphy community.

Salomon, Corinna. 2024. “Lexicon Leponticum – Concept and Implementation.” In Cisalpine Celtic Literacy – Proceedings of the International Symposium Maynooth 23–24 June 2022, edited by Corinna Salomon and David Stifter. Hagen.

Scarpa, Erica, and Riccardo Valente. 2024. “ICONCLASS.” National Research Council. doi:10.71795/JACA-ND06.

Serzhykivna, Tamrazian Amest. 2024. “A Corpus-Based Approach to Identifying Top-Level Terms for Developing a SKOS Vocabulary for Ukrainian Epigraphy.” Слобожанський Науковий Вісник. Серія: Філологія, no. 7: 58–65. doi:10.32782/philspu/2024.7.9.

The aim of the study is to identify and classify top-level terms in Ukrainian epigraphy in order to develop a SKOS vocabulary that will support the categorization, organization, and retrieval of epigraphic inscriptions. The study fills a significant gap in the digital humanities, where Ukrainian epigraphic heritage has been underrepresented. V. Karniienko's “Corpus of Graffiti of St. Sophia of Kyiv” (Корпус графіті Софії Київської), which presents Ukrainian epigraphic heritage in a structured format, was chosen as the basis for describing epigraphic artefacts in Ukrainian scholarship. The study relies on comparative methods and a detailed corpus-linguistic analysis of how terms are used in context in scholarly discourse. The Corpus of Graffiti of St. Sophia of Kyiv is a well-founded choice because of its structure and its systematic approach to describing epigraphic monuments. An important element of the study is the analysis of the structure and content of Karniienko's works for the development of a standardized SKOS vocabulary for Ukrainian epigraphic inscriptions. The study proposes a systematic approach to developing a SKOS vocabulary for Ukrainian epigraphy that is integrated with existing frameworks such as the EAGLE vocabulary for Greco-Roman artefacts. This is not only a technical but also a strategic step, ensuring broader applicability and interoperability and allowing Ukrainian inscriptions to be studied alongside those of other cultures. The results contribute to a more integrated and accessible digital representation of epigraphic heritage, which not only enriches the global digital humanities landscape but also secures due attention and scholarly recognition for Ukraine's rich epigraphic heritage. The study emphasizes the importance of digital tools and corpus analysis for the development of the digital humanities, and of digital epigraphy in particular.
Developing a comprehensive SKOS vocabulary for Ukrainian epigraphy will make it possible to integrate these vocabularies with existing frameworks such as the EAGLE vocabulary for Greco-Roman artefacts, ensuring broader applicability and interoperability and allowing Ukrainian inscriptions to be studied alongside inscriptions across languages and cultures.

Sonik, Karen, and Dahlia Shehata. 2024. “Mesopotamian Literature: Issues, Theories, and Methods of Sumerian and Akkadian Narrative Analysis.” In Contemporary Approaches to Mesopotamian Literature. How to Tell a Story, edited by Dahlia Shehata and Karen Sonik, 11–98. Leiden; Boston: BRILL. doi:10.1163/9789004697577_003.
