Post Summary
I am collecting SVP files for an open-source research dataset. I hope this dataset can benefit everyone interested in singing voice technologies and push this field that we all love even further. These files should only be used with their creators' consent, so I’m asking for your help: please submit as many SVP files as you would like to share! Accompanying lyrics and a generated wav file are also much appreciated.
Who am I?
My name is Silas Antonisen, a 26-year-old researcher at the University of Granada in Spain. I am pursuing a PhD in music information retrieval, with a deep focus on singing voices. That means I want to work on improving systems ranging from automatic lyrics transcription to singing voice synthesis.
My Previous Work
I love Japanese pop/rock music and wanted to make my own with Synthesizer V. However, after laying down some chords, I realized that I can’t really write my own Japanese lyrics. That is why my first scientific article in this research field, published just a few months ago, is called “PolySinger: Singing-Voice to Singing-Voice Translation From English to Japanese”. It presents an open-source system for translating your English songs into Japanese. If you would like to read more about this work, listen to some samples, or find the code so you can try it yourself, please visit the project page at: PolySinger: Singing-Voice to Singing-Voice Translation From English to Japanese
My New Research
One of the major challenges in making voicebanks is annotating singing data for training a neural network, as this is a difficult and time-consuming task. I want to investigate the possibility of automatically annotating singing data with high accuracy. Generally, this would require a lyrics transcription system and/or a phoneme alignment system, plus a pitch/vocal-melody detection system, but these systems are difficult to train because there is a lack of annotated open-source data. I believe that in this age of generative AI we can leverage generated content to build new systems. My hypothesis is that SVS has come to a point where it sounds very natural and humanlike, and as such, the data surrounding the generated singing (the notes, lyrics and phonemes used as input) should be of high enough quality to serve as annotations.
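As a rough illustration of the idea above, here is a minimal Python sketch of how note-level inputs could be turned into frame-level pitch labels for training a melody detection system. Everything in it is an assumption made for illustration: the `Note` class, its field names and the `frame_labels` helper are hypothetical and do not reflect the actual SVP file layout or any finished pipeline. It also uses only the written note pitch, whereas the pitch curve actually rendered by the synthesizer would include transitions and vibrato.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical note record; the field names are illustrative only."""
    onset_s: float      # note start time in seconds
    duration_s: float   # note length in seconds
    midi_pitch: int     # MIDI note number (69 = A4)
    lyric: str          # syllable or word sung on this note


def midi_to_hz(midi_pitch: int) -> float:
    """Convert a MIDI note number to frequency in Hz (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def frame_labels(notes, hop_s=0.01):
    """Return one pitch label (Hz) per frame; 0.0 marks frames with no note."""
    total_s = max((n.onset_s + n.duration_s for n in notes), default=0.0)
    n_frames = int(round(total_s / hop_s))
    f0 = [0.0] * n_frames
    for note in notes:
        start = int(round(note.onset_s / hop_s))
        end = int(round((note.onset_s + note.duration_s) / hop_s))
        for i in range(start, min(end, n_frames)):
            f0[i] = midi_to_hz(note.midi_pitch)
    return f0


# Two made-up notes, labelled at a coarse 100 ms hop for readability.
notes = [Note(0.0, 0.5, 69, "la"), Note(0.5, 0.5, 71, "la")]
labels = frame_labels(notes, hop_s=0.1)
```

Paired with the generated wav file, labels like these could act as training targets without any manual annotation, which is the point of collecting the inputs alongside the audio.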
Crowdsourcing an Open-Source SVP Dataset
I want to create a large-scale, high-quality dataset for research in singing voice technology, e.g., lyrics transcription, melody extraction and ultimately automatic annotation of singing voices for the creation of voicebanks. To do so, I am trying to collect generated singing voices alongside their inputs (notes, lyrics, phonemes, parameters, etc.). This dataset will be curated and tested in several applications for the publication of a journal paper, and it will be completely open-sourced, so you can gain access to the dataset and my trained models as well! If you would like to participate in this project, please attach your SVP files (lyrics and wav files are also appreciated) to this thread or reach out to me at my university email: santon@ugr.es
Thank you so much for showing interest in this project, and may we together evolve the field of singing voice synthesis! If you want to know more about me, feel free to visit my webpage: https://antonisen.dev/
(Btw, if anyone from the Synthesizer V team is reading along… yes, I am also looking for an internship)