
Analysis and detection of singing techniques
in repertoires of J-POP solo singers

Yuya Yamamoto, Juhan Nam, Hiroko Terasawa

Paper on arXiv | Conference page

Abstract of the paper

In this paper, we focus on singing techniques within the scope of music information retrieval research. We investigate how singers use singing techniques using real-world recordings of famous solo singers in Japanese popular music songs (J-POP). First, we built a new dataset of singing techniques. The dataset consists of 168 commercial J-POP songs, and each song is annotated using various singing techniques with timestamps and vocal pitch contours. We also present descriptive statistics of singing techniques on the dataset to clarify what and how often singing techniques appear. We further explored the difficulty of the automatic detection of singing techniques using previously proposed machine learning techniques. In the detection, we also investigate the effectiveness of auxiliary information (i.e., pitch and distribution of label duration), not only providing the baseline. The best result achieves 40.4% at macro-average F-measure on nine-way multi-class detection. We provide the annotation of the dataset and its detail on the appendix website (this site). https://yamathcy.github.io/ISMIR2022J-POP/
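As a reading aid for the reported metric, the macro-average F-measure is the unweighted mean of the per-class F-measures, so rare techniques count as much as frequent ones. The sketch below is a minimal illustration under stated assumptions, not the evaluation code used in the paper: the label names are hypothetical placeholders, the inputs are assumed to be aligned per-unit label sequences, and a class with no true or predicted instances is scored as 0.

```python
def macro_f_measure(y_true, y_pred, labels):
    """Unweighted mean of per-class F-measures (macro average)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a class that never occurs and is never predicted scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
labels = ["vibrato", "bend", "none"]
y_true = ["vibrato", "vibrato", "bend", "none"]
y_pred = ["vibrato", "bend", "bend", "none"]
score = macro_f_measure(y_true, y_pred, labels)
```

Because every class contributes equally, a low score on a single rarely used technique lowers the macro average noticeably, which is one reason this metric is a demanding one for imbalanced label sets.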

Dataset “COSIAN”

Description

We built a new dataset named COSIAN (a COllection of SInging voice ANnotation) to conduct the analysis. COSIAN is a collection of annotations for Japanese popular (J-POP) songs, focusing on the singing style and expression of famous solo singers.

It consists of 168 songs by 42 singers (21 female and 21 male). Each singer is represented by four songs that differ in mood from one another.

What is the motivation?

Understanding the singing voice better

The basic idea of this work is to analyze singers’ characteristics by clarifying how they render a song. One straightforward way to do this is to annotate the presence of singing techniques, which are produced by fluctuating pitch, timbre, and so on. However, no such dataset existed, so we decided to build one.

Metadata

The metadata consists of a song list, which contains the following information:

Annotations

(CAUTION) Audio files are not included below!

-> If you want the annotation files, access here and request permission. The annotations are for research purposes only.

The request should include the following; otherwise, it will be rejected.


Because of copyright issues, we do not provide raw audio tracks. Instead, we provide links to music streaming services for each song in COSIAN.

Annotation procedure

We used Sonic Visualiser to annotate the singing techniques, with the help of both sound playback and visualization of spectrograms and pitchgrams.
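Timestamped region annotations like these are often converted to frame-level labels before training a detector. The sketch below shows one way to do that, under assumptions that are not taken from the dataset: each region is a hypothetical `(start_sec, end_sec, label)` tuple, the hop size is 10 ms, and unannotated frames receive a placeholder `"none"` label. The actual COSIAN file layout may differ.

```python
def regions_to_frames(regions, n_frames, hop_sec=0.01, background="none"):
    """Expand (start_sec, end_sec, label) regions into one label per frame."""
    frames = [background] * n_frames
    for start_sec, end_sec, label in regions:
        # Round to the nearest frame boundary and clip to the valid range.
        first = max(0, round(start_sec / hop_sec))
        last = min(n_frames, round(end_sec / hop_sec))
        for i in range(first, last):
            frames[i] = label
    return frames


# Hypothetical annotation: one technique from 0.02 s to 0.05 s.
example_regions = [(0.02, 0.05, "vibrato")]
example_frames = regions_to_frames(example_regions, n_frames=8)
```

If regions overlap, later entries in the list overwrite earlier ones in this sketch; a multi-label setup would need one label track per technique instead.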

Annotated singing techniques

Overview

Examples of each singing technique

Data statistics

Detected examples

These examples were automatically detected by the Focal-GT model. Note that the videos are only samples of the audio clips; for the task itself, we used audio from the CD recordings.

Good examples

#1: Sakura / Ikimono gakari
Video clip 1:30-1:36 Label (Upper: ground truth label, lower: detected labels)
#2: Omoi ga kasanaru sono mae ni / Ken Hirai
Video clip 2:38-2:45 Label (Upper: ground truth label, lower: detected labels)

Bad examples

We confirmed that one common type of mis-detection occurs in regions that are too short or where the labels switch frequently.
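One way to inspect such cases is to group a frame-level label sequence into contiguous regions and list the ones shorter than a chosen length. The sketch below is only a diagnostic aid under stated assumptions, not part of the paper's method: the per-frame labels, the `"none"` background label, and the `min_frames` threshold are all hypothetical.

```python
from itertools import groupby


def frames_to_regions(frames):
    """Group consecutive identical labels into (start_frame, end_frame, label)."""
    regions = []
    start = 0
    for label, run in groupby(frames):
        length = len(list(run))
        regions.append((start, start + length, label))  # end_frame is exclusive
        start += length
    return regions


def short_regions(regions, min_frames, background="none"):
    """Return non-background regions shorter than min_frames."""
    return [r for r in regions if r[2] != background and r[1] - r[0] < min_frames]


# Hypothetical detector output, one label per frame.
predicted = ["none", "vibrato", "none", "bend", "bend", "bend", "bend", "none"]
all_regions = frames_to_regions(predicted)
flagged = short_regions(all_regions, min_frames=2)
```

Comparing the flagged regions in the detected labels against those in the ground truth gives a rough sense of how much of the error comes from very brief or rapidly alternating segments.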

#1: Readymade / Ado
Video clip 0:50-0:55 Label (Upper: ground truth label, lower: detected labels)
#2: Honey / L'Arc~en~Ciel
Video clip 0:14-0:20 Label (Upper: ground truth label, lower: detected labels)

Contact

If you have any questions about the paper, please contact the first author, Yuya. We also accept issues in the GitHub repository.

License

COSIAN contains copyrighted material. We share COSIAN with researchers under the following conditions:

University of Tsukuba and KAIST shall not be held liable for any errors in the content of COSIAN or for any damage arising from the use of COSIAN. The COSIAN administrator may update these conditions of use at any time.

Citation

Cite the ISMIR 2022 paper.

@inproceedings{yamamoto2022analysis,
         author = {Yamamoto, Yuya and Nam, Juhan and Terasawa, Hiroko},
         title = {Analysis and Detection of Singing Techniques in Repertoires of J-POP solo singers},
         booktitle = {Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR)},
         year = {2022}
}