Generating Piano Music with Transformers: A Comparative Study of Scale, Data, and Metrics
Jonathan Lehmkuhl, Ábel Ilyés-Kun, Nico Bremes, Cemhan Kaan Özaltan, Frederik Muthers, Jiayi Yuan
Although a variety of transformers have been proposed for symbolic music generation in recent years, there has been little comprehensive study of how specific design choices affect the quality of the generated music. In this work, we systematically compare different datasets, model architectures, model sizes, and training strategies for the task of symbolic piano music generation. To support model development and evaluation, we examine a range of quantitative metrics and analyze how well they correlate with human judgments collected through listening studies. Our best-performing model, a 950M-parameter transformer trained on 80K MIDI files from diverse genres, produces outputs that are often rated as human-composed in Turing-style listening surveys.