Overview
This project aims to study artificial audio waveform synthesis and the musical features decisive for human (re)cognition of music, using yodel as a case study.
So far, autoregressive machine-learning models for music generation in the audio waveform domain have been able to imitate timbre but have failed to reproduce the large-scale structure of human compositions. Music synthesis in symbolic (sheet) notation has produced more structurally coherent results, but it is poorly suited to the many musical genres, such as yodel, that are often unavailable in written form.
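To make the autoregressive waveform approach concrete, the following is a minimal sketch of the idea: a model trained to predict each audio sample from the samples that precede it. It assumes PyTorch and 8-bit quantized audio; the architecture and its parameters are illustrative toy choices, not the models used in this project.

```python
import torch
import torch.nn as nn

class CausalWaveModel(nn.Module):
    """Toy autoregressive waveform model: predicts the next audio sample
    from a window of past samples (WaveNet-style, heavily simplified)."""

    def __init__(self, n_levels=256, channels=64, kernel_size=2, layers=6):
        super().__init__()
        self.embed = nn.Embedding(n_levels, channels)
        # Dilated causal convolutions: each layer doubles the receptive field.
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size, dilation=2**i)
            for i in range(layers)
        )
        self.out = nn.Conv1d(channels, n_levels, 1)

    def forward(self, x):
        # x: (batch, time) integer sample indices in [0, n_levels)
        h = self.embed(x).transpose(1, 2)  # (batch, channels, time)
        for conv in self.convs:
            pad = (conv.kernel_size[0] - 1) * conv.dilation[0]
            # Left-padding only, so each output depends on past samples alone.
            h = torch.relu(conv(nn.functional.pad(h, (pad, 0))))
        return self.out(h)  # (batch, n_levels, time) logits

# Training step on a quantized waveform: shift the target by one sample
# so the model learns p(x_t | x_{<t}).
model = CausalWaveModel()
wave = torch.randint(0, 256, (1, 4000))  # stand-in for 8-bit quantized audio
logits = model(wave[:, :-1])
loss = nn.functional.cross_entropy(logits, wave[:, 1:])
loss.backward()
```

The limitation noted above follows directly from this setup: the receptive field of such a model covers milliseconds of audio, while musical form unfolds over minutes.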
We plan to conduct a combined investigation of generative models and musical signatures in natural yodel, building on recent breakthroughs in both fields. We will apply some of the latest deep-learning techniques to audio synthesis, which are expected to substantially improve the generation of musical structure and to suggest fruitful connections for music analysis. Additionally, we will work toward a new understanding of musical qualities by steering the artificial generation of samples, discriminating them from human works, and pinpointing the differences between machine classification and human cognition of music.
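One way to make the discrimination step concrete is a baseline classifier over summary audio features. The sketch below assumes librosa and scikit-learn; the file lists human_clips and generated_clips are hypothetical placeholders for the actual corpora, and MFCC statistics are just one illustrative feature choice.

```python
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def clip_features(path):
    """Summarize a clip as mean/std of MFCCs (a common timbre descriptor)."""
    audio, sr = librosa.load(path, sr=16000, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical placeholder lists of audio file paths.
human_clips = ["human_001.wav", "human_002.wav"]
generated_clips = ["gen_001.wav", "gen_002.wav"]

X = np.stack([clip_features(p) for p in human_clips + generated_clips])
labels = np.array([0] * len(human_clips) + [1] * len(generated_clips))

clf = LogisticRegression(max_iter=1000)
# Cross-validated accuracy: how separable human and machine samples are.
print(cross_val_score(clf, X, labels, cv=5))
```

Features on which such a classifier succeeds while human listeners do not, or vice versa, are natural candidates for the machine-versus-human comparison described above.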