Generative Modeling of Bach-Style Symbolic Music: A Comparative Study of Autoregressive, Latent-Variable, and Adversarial Approaches

Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital - Artificial Intelligence & Machine Learning, Generative Music · Depth: Expert, quick

Summary

A comparative study evaluated three generative model families for Bach-style symbolic piano music, all trained on a shared MIDI corpus. Kyuil Lee, Dezhi Yu, and Yongkang Huang compared autoregressive LSTMs with attention, latent-variable models (recurrent VAEs and vector-quantized VAEs, or VQ-VAEs), and generative adversarial networks (GANs). The autoregressive LSTM with attention produced the most musically coherent samples. VQ-VAEs mitigated posterior collapse and produced more structured outputs than conventional recurrent VAEs. The adversarial approach captured local pitch patterns but was difficult to train and generalized less reliably to Bach's style. The comparison shows that each paradigm has distinct strengths and limitations for symbolic music generation.
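
To make the posterior-collapse point concrete: a VQ-VAE replaces the continuous Gaussian latent of a standard VAE with a lookup into a discrete codebook, so the decoder always receives one of a fixed set of vectors. The following is a minimal NumPy sketch of that quantization step only. The function name `quantize`, the toy codebook, and the two-dimensional latents are illustrative assumptions, not details taken from the study, and training (codebook updates, straight-through gradients) is omitted.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each encoder output to its nearest codebook vector.

    z_e:      (T, D) continuous encoder outputs, one per time step
    codebook: (K, D) embedding vectors (learned in a real model; fixed here)
    Returns (indices, z_q) with shapes (T,) and (T, D).
    """
    # Squared Euclidean distance between every output and every code: (T, K)
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)   # discrete latent codes
    z_q = codebook[indices]          # quantized vectors passed to the decoder
    return indices, z_q

# Toy values, assumed purely for illustration.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z_e = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]])
indices, z_q = quantize(z_e, codebook)
print(indices)  # [0 1 2]
```

Because the latent is a discrete code under a uniform prior, the KL term of the objective is a constant rather than a quantity the optimizer can drive to zero by ignoring the latent, which is one commonly cited reason VQ-VAEs are less prone to posterior collapse.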

Key takeaway

Machine learning engineers building symbolic music generation systems should, on the evidence of this study, start with an autoregressive LSTM with attention when musical coherence is the priority. If a structured latent representation matters more, consider a vector-quantized VAE, which mitigated posterior collapse in the comparison. Be cautious with generative adversarial networks when reliable stylistic generalization is required: in this study they were hard to train and generalized less reliably to Bach's style.

Key insights

Autoregressive LSTMs excel in musical coherence for Bach-style generation, while VQ-VAEs improve structure over recurrent VAEs.

Method

The study compared autoregressive LSTMs with attention, recurrent VAEs, VQ-VAEs, and GANs on a shared MIDI corpus for Bach-style music, evaluating coherence, latent representations, and stylistic generation.
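
As a rough illustration of what "LSTM with attention" means at a single generation step, the sketch below computes attention weights of the current hidden state over earlier hidden states and forms a context vector. The summary does not say which attention variant the authors used, so scaled dot-product attention is an assumption here, as are the names `attend`, `query`, and `memory` and the random toy data. The LSTM itself and the output layer over note tokens are left out.

```python
import numpy as np

def attend(query, memory):
    """Scaled dot-product attention of one state over past hidden states.

    query:  (D,)   current hidden state
    memory: (T, D) hidden states from earlier time steps
    Returns (weights, context) with shapes (T,) and (D,).
    """
    scores = memory @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()            # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ memory                # weighted sum of past states
    return weights, context

# Toy data, assumed for illustration: 16 past steps, 8-dimensional states.
rng = np.random.default_rng(0)
memory = rng.normal(size=(16, 8))
query = memory[3] * 4.0  # hypothetical state that resembles step 3
weights, context = attend(query, memory)
```

In an autoregressive model the context vector would typically be combined with the current hidden state before predicting the next note, which gives the model direct access to earlier material such as a repeated motif.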

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.