Controllable deep melody generation via hierarchical music representation

International Society for Music Information Retrieval Conference

Published November 8, 2021

Shuqi Dai, Zeyu Jin, Celso Gomes, Roger B. Dannenberg

Recent advances in deep learning have expanded the possibilities for music generation, but generating a customizable full-length piece of music with consistent long-term structure remains a challenge. This paper introduces a hierarchical music representation called Music Frameworks and a multi-step generative process that creates a full-length melody guided by long-term repetitive structure, chord, melodic contour, and rhythm constraints. We first generate rhythm and basic melody using two separate transformer-based networks; then a third transformer-based network generates the final melody auto-regressively, conditioned on the basic melody, rhythm, and chords. To customize the output, one can alter the chords, basic melody, or rhythm, and our networks regenerate the melody accordingly. Evaluations demonstrate the effectiveness of our method at writing a completely new melody and rhythm given chords. A listening test shows that melodies generated by our method are rated as good as or better than human-composed music from the POP909 dataset about half the time.
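The multi-step pipeline described above can be sketched as three staged functions: one for rhythm, one for the basic (skeleton) melody, and one that realizes the final melody auto-regressively from both. The sketch below is illustrative only, not the authors' implementation: the trained transformer networks are replaced with deterministic placeholder rules, and the `SCALE_ROOT` chord-to-pitch table is an assumption made for the example.

```python
# Placeholder mapping from chord symbols to MIDI root pitches
# (an assumption for this sketch; the paper uses learned models).
SCALE_ROOT = {"C": 60, "F": 65, "G": 67, "Am": 69}

def generate_rhythm(chords, beats_per_chord=2):
    """Stage 1a: stand-in for the rhythm transformer.
    Emits one onset per beat for each chord."""
    return [[1] * beats_per_chord for _ in chords]

def generate_basic_melody(chords):
    """Stage 1b: stand-in for the basic-melody transformer.
    Uses each chord's root as the skeleton pitch."""
    return [SCALE_ROOT[c] for c in chords]

def generate_melody(basic_melody, rhythm, chords):
    """Stage 2: stand-in for the final melody transformer.
    Generates notes left to right, conditioning each note on the
    previously generated output (the auto-regressive step)."""
    melody = []
    for pitch, onsets in zip(basic_melody, rhythm):
        for step, onset in enumerate(onsets):
            if not onset:
                continue
            prev = melody[-1] if melody else pitch
            # First beat of a chord states the skeleton pitch;
            # later beats interpolate toward the previous note.
            melody.append(pitch if step == 0 else (pitch + prev) // 2)
    return melody

if __name__ == "__main__":
    chords = ["C", "G", "Am", "F"]
    rhythm = generate_rhythm(chords)
    basic = generate_basic_melody(chords)
    melody = generate_melody(basic, rhythm, chords)
    print(melody)
```

Because each stage's output is an explicit intermediate representation, a user can swap in a hand-edited rhythm or basic melody before the final stage, which is the customization mechanism the abstract describes.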

Research Area: Audio