2025 Proceedings
DLfM '25: Proceedings of the 12th International Conference on Digital Libraries for Musicology
From Pixels to Paleography: A Dual-Pathway Neural Network for Neume Script Classification
- Kyrie Bouressa and Ichiro Fujinaga
POSSUMM employs a Siamese neural network architecture with dual-pathway analysis: one branch classifies script styles in a full page context, while the other isolates and analyzes individual neume forms using modified word-spotting techniques. This combination of macro-level classification with micro-level feature recognition provides advantages over traditional single-branch models, such as standard convolutional neural networks, achieving 96% accuracy overall and 93% accuracy for minority notation traditions, even with fragmentary or damaged inputs.
For uncertain cases, the system implements a Bayesian query-by-example protocol that gives users the data they need to assist the model's classification in an efficient, user-guided decision process, typically requiring just 3–5 interactions to achieve high-confidence classification. This human-in-the-loop approach bridges the gap between fully automated systems and expert knowledge, creating a framework where specialist paleographic expertise can be democratized across cultural heritage institutions.
POSSUMM enables improved cataloging of under-described fragments, correction of misclassifications, and cross-collection discovery by notation type. POSSUMM’s architecture is designed for extensibility and integration with linked data frameworks (such as Digital Scriptorium, Mapping Manuscript Migrations, or Wikidata), advancing both technical accuracy and the broader goals of digital musicology through the transformation of specialized knowledge into machine-actionable metadata.
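The Bayesian query-by-example loop described in this abstract can be illustrated with a minimal sketch. It is not the POSSUMM implementation: the class names, the likelihood values, the 0.95 confidence threshold, and the function names are all assumptions made for illustration. Each user answer is modelled as a likelihood over candidate script classes, and the posterior is updated until one class is confident enough.

```python
from typing import Dict, Iterable, Tuple


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return the posterior over script classes after one user answer."""
    unnormalized = {cls: prior[cls] * likelihood[cls] for cls in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the answer has zero likelihood under every class")
    return {cls: value / total for cls, value in unnormalized.items()}


def classify_with_queries(
    prior: Dict[str, float],
    answers: Iterable[Dict[str, float]],
    threshold: float = 0.95,  # hypothetical confidence threshold
) -> Tuple[str, float, int]:
    """Apply user answers one at a time until one class reaches the threshold.

    Returns (best class, its posterior probability, number of answers used).
    """
    posterior = dict(prior)
    used = 0
    for likelihood in answers:
        if max(posterior.values()) >= threshold:
            break  # already confident, no further interaction needed
        posterior = bayes_update(posterior, likelihood)
        used += 1
    best = max(posterior, key=posterior.get)
    return best, posterior[best], used


# Hypothetical model output for an uncertain fragment.
prior = {"script_a": 0.5, "script_b": 0.3, "script_c": 0.2}
# Each answer: P(user confirms the shown example | class), values invented.
answer = {"script_a": 0.9, "script_b": 0.2, "script_c": 0.1}
best, confidence, used = classify_with_queries(prior, [answer, answer, answer])
```

With these invented numbers the loop stops after two answers, because the posterior for `script_a` already exceeds the threshold before the third answer is consulted.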
(Digital) Philology for/of Multiple Creative Processes: Considering Notation, Recordings, and Digital Editions
- Joshua Neumann
A Multimodal Dataset of Greek Folk Music
- Anna Maria Christodoulou and Olivier Lartillot
Drafting the Landscape of Computational Musicology Tools: A Survey-Based Approach
- Jorge Junior Morgado Vega, Sachin Sharma and Federico Simonetta
Since the 1950s, musicology has been increasingly impacted by computational tools in various ways, from systematic analysis approaches to the modeling of creativity. This article presents a comprehensive assessment of the current state of computational musicology tools based on survey data collected from practitioners in the field. Through a structured questionnaire, we gathered information on tool usage patterns, common analytical tasks, user satisfaction levels, data characteristics, and prioritized features across four distinct domains: symbolic music, music-related imagery, audio, and lyrics. Our findings reveal significant gaps between current tooling capabilities and user needs, highlighting some limitations of these tools across all domains. This assessment contributes to the ongoing dialogue between tool developers and music scholars, aiming to enhance the effectiveness and accessibility of computational methods in musicological research.
The IRMA Dataset: A Structured Audio–MIDI Corpus for Iranian Classical Music
- Sepideh Shafiei and Shapour Hakam
Performance Configuration Analysis in Portuguese Traditional Music: A Computational Approach
- Nawaraj Khatri and Gilberto Bernardes
We present an analysis of performance configurations in Portuguese traditional music, using computational methods to process field recordings from the A Música Portuguesa A Gostar Dela Própria (MPAGDP) archive. Our approach employs YOLOv11s ("You Only Look Once"), a computer vision system that can detect and count performers in archival footage, allowing us to automatically classify performances into meaningful categories: solo, duo, small ensemble, and large ensemble. This computational classification method processed over 8000 field recordings with 96% accuracy, enabling systematic examination of performance contexts that would be time-consuming through manual analysis. Our analysis reveals significant relationships between performance configuration and musical practice across Portuguese traditions. Solo performers, comprising 48% of vocal recordings, predominantly appear in narrative and poetic traditions requiring individual expression. Large ensembles (21%) maintain collective practices like polyphonic singing traditions. The geographic distribution shows regional traits: Alentejo features large-ensemble singing traditions, while northern regions favor solo performances. The temporal analysis traces how traditional forms maintain continuity through specific performance configurations, while contemporary adaptations emerge primarily in small group formats, illuminating the social dimensions of musical transmission and adaptation in Portuguese traditional music.
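The step from performer counts to configuration categories can be sketched as follows. This is an illustration only: the abstract does not state the size thresholds or how per-frame detections are aggregated, so the cut-off `SMALL_ENSEMBLE_MAX`, the use of the high median across frames, and both function names are assumptions. The detector itself is not called here; per-frame counts are taken as given.

```python
from statistics import median_high
from typing import Sequence

# Hypothetical cut-off: the abstract does not say where "small" ends.
SMALL_ENSEMBLE_MAX = 5


def classify_configuration(performer_count: int) -> str:
    """Map a performer count to one of the four categories named above."""
    if performer_count < 1:
        raise ValueError("at least one performer is required")
    if performer_count == 1:
        return "solo"
    if performer_count == 2:
        return "duo"
    if performer_count <= SMALL_ENSEMBLE_MAX:
        return "small ensemble"
    return "large ensemble"


def classify_recording(frame_counts: Sequence[int]) -> str:
    """Classify a recording from per-frame detection counts.

    The high median is an assumed aggregation rule; it damps frames where
    a performer is briefly occluded or a bystander is briefly detected.
    """
    if not frame_counts:
        raise ValueError("no frames to classify")
    return classify_configuration(median_high(frame_counts))
```

For example, `classify_recording([1, 1, 2, 1])` yields `"solo"`, since a single frame with a second detected person does not change the median.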
The Cancionero de Miranda Edition: Leveraging Open Source Technologies for Multi-Modal Music Publication
- Fernando Herrera de Las Heras
The primary contribution is a flexible publication pipeline that generates tailored editions for different audiences (performers, musicologists, casual readers) across multiple media formats (paper and digital screens). We evaluate and integrate various open source technologies, some of them not typically found in music publishing workflows, such as audio synthesis using virtual singer databanks and an immersive visualization player with an auto-scrolling score, developed specifically for this project.
The paper details our MEI encoding methodology, the customization architecture that adapts content presentation based on user needs, and the technical challenges overcome in creating a cohesive system from disparate open source components. In line with digital preservation best practices, all developed tools are released as reusable open source components that can process standard MEI-encoded music files and markdown text, enabling similar multi-modal editions of other musical collections.
Our findings demonstrate how digital library technologies can enhance access to musical heritage while simultaneously serving the specialized needs of both scholarly and performance communities.
The Polyphonic Audio to Roman Corpus
- Thiago Poppe, Luisa Lopes and Flavio Figueiredo
Roman Numeral Analysis (RNA) is a method for representing chords based on their scale degree and function within a tonal context. This task is particularly challenging, as the classification of a chord can vary depending on the surrounding musical context. The difficulty increases further when RNA is applied to real polyphonic audio recordings, due to the presence of multiple timbres, background noise, ambience effects, and human expressiveness. Despite recent progress, the current literature lacks a large-scale, Music Information Retrieval (MIR)-friendly dataset—one that includes real audio, RNA labels, and well-defined training and evaluation splits—spanning diverse artists, genres, and song complexities. In this paper, we fill this gap by introducing the Polyphonic Audio to Roman Corpus (PARC), a MIR-friendly polyphonic audio to RNA dataset with metadata. We also adapt the current state-of-the-art Deep Learning models, originally focused on symbolic music, to evaluate PARC. To evaluate model performance, not only do we employ standard classification metrics, but we additionally propose a novel equivalence-aware evaluation framework that accounts for inherent ambiguities in RNA labeling (e.g., a C major chord can be seen as I in the key of C major or as III in the key of A minor). The PARC dataset and our evaluations provide valuable insights and open new directions towards RNA on real polyphonic audio recordings.
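The ambiguity mentioned in this abstract (I in C major versus III in A minor) can be made concrete with a small sketch. It is not the evaluation framework proposed in the paper: it assumes diatonic triads only, natural minor, unaltered tonics, and treats two labels as equivalent when they realize the same root and quality. All names below are illustrative.

```python
from typing import List, Tuple

Label = Tuple[str, str, str]  # (tonic, mode, roman numeral), e.g. ("C", "major", "I")

NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
SCALES = {
    "major": [0, 2, 4, 5, 7, 9, 11],
    "minor": [0, 2, 3, 5, 7, 8, 10],  # natural minor, a simplifying assumption
}
DEGREES = {"I": 0, "II": 1, "III": 2, "IV": 3, "V": 4, "VI": 5, "VII": 6}


def realize(tonic: str, mode: str, numeral: str) -> Tuple[int, str]:
    """Return (root pitch class, triad quality) for a diatonic label.

    Upper-case numerals are read as major triads, lower-case as minor.
    Accidentals, inversions and sevenths are deliberately not handled.
    """
    degree = DEGREES[numeral.upper()]
    root = (NOTE_TO_PC[tonic] + SCALES[mode][degree]) % 12
    quality = "major" if numeral.isupper() else "minor"
    return root, quality


def equivalent(a: Label, b: Label) -> bool:
    """Two labels are equivalent if they denote the same sounding triad."""
    return realize(*a) == realize(*b)


def accuracy(predictions: List[Label], references: List[Label],
             equivalence_aware: bool = False) -> float:
    """Fraction of correct labels, optionally accepting equivalent ones."""
    if len(predictions) != len(references) or not references:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for pred, ref in zip(predictions, references)
        if pred == ref or (equivalence_aware and equivalent(pred, ref))
    )
    return hits / len(references)
```

Under strict matching, predicting `("A", "minor", "III")` for a reference of `("C", "major", "I")` counts as an error; with `equivalence_aware=True` it counts as correct, since both denote a C major triad.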
Annotation of digital music notation documents: surveying needs for a generalised implementation
- Kevin Page, David Lewis and Laurent Pugin
The ability to annotate music notation documents offers a powerful affordance to musicologists using digital libraries, and supports the organisation and discovery of annotated sources within a music digital library. In this paper we first assess the current state of the art for annotating digital scores, then report on a survey conducted into existing uses and future needs elicited from the music library community. Analysing the survey results, we distinguish between extensions which might provide generalised annotation services for music notation software, versus application-specific interfaces and visualisations using such annotation services. Drawing upon the Web Annotation Model, we frame this distinction in terms of annotation targets and bodies, whereby specialist or customised bodies might utilise common shared mechanisms to address targets. We demonstrate the value of the latter by, for the first time, implementing support for annotation targets in the popular and widely used Verovio open source music engraving software, adding visual indications for enumerations and ranges encoded using the MEI <annot> element, which can be manipulated in the resultant SVG. We conclude that common mechanisms for specifying and implementing annotation targets are not only possible, but a practical and useful foundation for music digital library tools and infrastructure.
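The target/body distinction drawn from the Web Annotation Model can be shown with a small sketch that builds an annotation as a JSON-LD-style dictionary. The score URL, annotation IRI, and element identifiers are hypothetical, and addressing MEI elements by URL fragment is an assumption made for illustration rather than the mechanism implemented in the paper.

```python
import json
from typing import Dict, List


def make_annotation(annotation_id: str, score_url: str,
                    element_ids: List[str], comment: str) -> Dict:
    """Build a Web Annotation whose targets are elements of an encoded score.

    Targets are expressed as plain IRIs with a fragment per element id;
    the body is a simple textual comment.
    """
    if not element_ids:
        raise ValueError("an annotation needs at least one target")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        "target": [f"{score_url}#{xml_id}" for xml_id in element_ids],
    }


# Hypothetical identifiers, for illustration only.
annotation = make_annotation(
    "https://example.org/annotations/1",
    "https://example.org/scores/example.mei",
    ["note-001", "note-002"],
    "Possible scribal correction.",
)
serialized = json.dumps(annotation, indent=2)
```

The same list of targets could be paired with a different, application-specific body, which is the separation of concerns the abstract describes.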
Knowing when to stop: insights from ecology for building catalogues, collections, and corpora
- Jan Hajič Jr. and Fabian C. Moss
Collaborative workflows for encoding, validating, and publishing a multimodal digital edition
- David M. Weigl, Olja Janjuš, Reinier de Valk, Ilias Kyriazis, Julia Jaklin, Stefan Rosmer, Silas Bischoff, Henning Burghoff, Martina Bürgermeister, Christoph Steindl, Andreas Rauber and Kateryna Schöning
German lute tablature (GLT), once widespread throughout central Europe in the 15th and 16th centuries, has remained underexplored in scholarship and largely abandoned in performance practice, in part due to its significantly greater complexity when compared to other tablature types. ANON is an international, interdisciplinary research project assembling a comprehensive multimodal digital edition of surviving GLT sources. The project's team unites scholars with backgrounds in musicology, music performance, German language and literature studies, music informatics, library and information studies, and Web science, with correspondingly heterogeneous research priorities, data formats, and software tools. Developing a research data infrastructure and corresponding workflows to provide effective and sustainable support for this collaboration is an important focus of the project. Here, we present an overview of the technologies, tools, environments, and workflows serving to fulfill the varied stakeholder requirements underlying our multimodal digital edition.
Sustainable Archiving of Music Databases through RDF and NLQ2SPARQL
- Ichiro Fujinaga
Modelling Musical Meaning: A Semantically Enriched Corpus from Nineteenth-Century Spanish Music Lexicons
- Teresa Cascudo García-Villaraco, David Ferreiro Carballo and Arturo de Las Casas Escolar
The corpus integrates ontological modelling (subject–predicate–object triples) and manual annotation of conceptual relations and coreferential structures. It organises musical knowledge into categories, such as material sound entities, musical forms, performance practices, and reception. Through semantic enrichment, the system supports complex SPARQL-style queries and enables diachronic tracking of terminological shifts and conceptual change in music theory discourse.
We focus here on a pilot case study centred on the term ‘acorde’ (chord) in Carlos José Melcior’s Diccionario enciclopédico de la música (1859). We develop a preliminary ontological model from the entry, which captures both explicit and implicit conceptual structures. The model is then compared to the corresponding definitions in María Luisa Lacal’s (1899) and Jaime Pahissa’s (1929) dictionaries, highlighting semantic variation over time. This case illustrates how nineteenth-century lexicographic voices, often considered peripheral, encode complex relationships among, for instance, acoustics, perception, notation, and theoretical classification.
This work contributes to digital musicology by bridging historical music discourse and semantic web technologies. Beyond digitisation, it proposes new interfaces for interaction with historical sources: structured conceptual navigation, alignment with existing music ontologies, and potential integration into digital library systems.
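The subject–predicate–object modelling and diachronic querying described in this abstract can be sketched with plain tuples. The predicate names (`definesTerm`, `publishedIn`) and the helper functions are hypothetical; only the three dictionaries, their dates, and the term 'acorde' come from the abstract, and no content of the definitions is modelled here.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicates; sources and dates are those named in the abstract.
TRIPLES: List[Triple] = [
    ("Melcior_1859", "definesTerm", "acorde"),
    ("Lacal_1899", "definesTerm", "acorde"),
    ("Pahissa_1929", "definesTerm", "acorde"),
    ("Melcior_1859", "publishedIn", "1859"),
    ("Lacal_1899", "publishedIn", "1899"),
    ("Pahissa_1929", "publishedIn", "1929"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def sources_by_year(triples: List[Triple], term: str) -> List[Tuple[str, int]]:
    """List the sources defining a term in chronological order."""
    dated = []
    for source, _, _ in match(triples, p="definesTerm", o=term):
        for _, _, year in match(triples, s=source, p="publishedIn"):
            dated.append((source, int(year)))
    return sorted(dated, key=lambda pair: pair[1])
```

Calling `sources_by_year(TRIPLES, "acorde")` returns the three dictionaries in order of publication, which is the kind of ordering a diachronic comparison of definitions would start from. A real triple store queried with SPARQL would replace the list and the `match` helper.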
Curating a Public Carnatic Music Dataset: Scalable Extraction of Ragam, Shruti, and Talam Metadata for Computational Musicology
- Sanjay Natesan and Homayoon Beigi
We introduce a novel, publicly available corpus of South Indian Carnatic music, which—for the first time—spans 172 distinct ragams (melodic frameworks) and 676 curated concert recordings, segmented into more than 11,219 audio clips. Each clip is annotated with its shruti (tonal center) and talam (metrical cycle) as Linked-Data entities, enabling automatic interoperability with established Music Information Retrieval (MIR) ontologies. The dataset was assembled through a hybrid pipeline that combines web-scale harvesting of YouTube concerts, automated signal processing for quality control, and expert-in-the-loop validation. To address inconsistencies in crowdsourced metadata, we introduce a pragmatic taxonomy that reconciles regional performance practices with canonical musicological literature. Case studies in automatic ragam recognition and comparative talam analysis illustrate how the resource advances computational musicology, cross-cultural MIR, and data quality assessment in digital libraries. This dataset is released under an open license at https://www.kaggle.com/datasets/sanjaynatesan/carnatic-song-database and will be updated as the resource grows.
MuNG Studio: Annotation Tool for Music Notation Graph
- Jiří Mayer, Filip Jebavý, Markéta Herzánová Vlková, Martina Dvořáková, Pavel Pecina and Jan Hajič Jr.
This paper introduces MuNG Studio, a new annotation tool for the Music Notation Graph (MuNG) format. MuNG is a high-detail graphical annotation format designed for Optical Music Recognition (OMR) tasks, originally proposed for the MUSCIMA++ dataset in 2017. MUSCIMA++ had a significant impact on the OMR community; however, most subsequent datasets made little use of the full MuNG format. This was likely due to the lack of user-friendly tools supporting the format. MUSCIMarker, the original tool used to annotate the MUSCIMA++ dataset in the MuNG format, is now obsolete and can no longer be installed. The new MuNG Studio seeks to provide an easy-to-install web-based viewer and editor for the MuNG format, with the goal of expanding and supporting the growing ecosystem around MuNG.
Smashcima: Full-Page Handwritten Music Document Synthesizer
- Jiří Mayer, Pavel Pecina and Jan Hajič Jr.
Accompaniment in America: A Minimal-Computing Digital Collection for Hybrid Musicological Publication
- Chanda Vanderhart, David Wögerbauer and David M. Weigl
Despite the increasing digitization of music scholarship, historical musicology has been slow to adopt hybrid or multimodal approaches to research dissemination. This paper presents Accompaniment in America, a hybrid digital and print publication that bridges this gap by integrating traditional musicological scholarship with an open-access, minimal computing digital collection.
The project – comprising a mini-monograph and a multifaceted digital companion – examines the institutionalization of collaborative piano in North America while demonstrating how lightweight, sustainable digital frameworks can enhance humanities research and publication without alienating traditional scholars or requiring extensive institutional resources.
The collection, built by a team of two, leverages open-source tools (GitHub, Zenodo), hosts archival materials, interactive visualizations, and pedagogical resources, and adheres to FAIR principles (Findable, Accessible, Interoperable, Reusable). Designed for longevity and low technical and financial overhead, it serves as both a living archive and a model for minimal computing in musicology, addressing challenges of accessibility, copyright, and resource limitations. By pairing QR codes in the print edition with hyperlinked digital content, the project fosters engagement across analog and digital scholarly ecosystems.
We argue that such minimal computing approaches can both enrich traditional musicological dissemination and democratize access to data and digitized archival material while preserving the comfort levels and preferences of traditional music scholars. The paper concludes with reflections on the technical, legal, and ethical challenges of small-scale hybrid publishing, including copyright barriers and the need for interdisciplinary collaboration. This case study offers actionable insights for scholars, computer scientists, librarians, and archivists seeking to integrate digital libraries into cultural heritage projects with limited budgets and infrastructure.