ARTFEED — Contemporary Art Intelligence

Break-the-Beat! Model Synthesizes Drum Audio from MIDI

digital · 2026-05-16

Researchers have developed 'Break-the-Beat!', a model that renders drum MIDI as audio in the timbre of a reference audio clip. It is built by fine-tuning a pre-trained text-to-audio model with a content encoder and a hybrid conditioning mechanism, and it targets the lack of specific control in drum loop creation. The researchers also constructed a new dataset of paired target-reference drum audio from existing datasets. In experiments, the model generates high-quality audio that follows high-resolution drum patterns.

Key facts

  • Break-the-Beat! is a model for MIDI-to-drum audio synthesis.
  • It renders drum MIDI in the timbre of a reference audio clip.
  • The model is built by fine-tuning a pre-trained text-to-audio model.
  • It employs a content encoder and a hybrid conditioning mechanism.
  • A new dataset of paired target-reference drum audio was created.
  • Experiments demonstrate high-quality drum audio generation.
  • The model addresses polyphonic, percussive drum synthesis.
  • Existing approaches to drum loop creation, such as working from one-shot samples, require non-trivial effort.
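
The summary does not describe how the model represents its MIDI input. As a rough illustration of what a "high-resolution drum pattern" could look like before it reaches a content encoder, the sketch below rasterizes drum note events onto a fine time grid with one row per drum. The function name `rasterize`, the 100 frames-per-second rate, and the three-drum subset are assumptions made for this example, not details from the paper; the note numbers follow the standard General MIDI percussion map.

```python
from typing import Dict, List, Tuple

# Subset of the General MIDI percussion map (note number -> drum name).
# The choice of these three drums is an assumption for illustration.
DRUM_LANES: Dict[int, str] = {36: "kick", 38: "snare", 42: "closed_hat"}

# One drum hit: (onset in seconds, MIDI note number, velocity 1..127).
DrumEvent = Tuple[float, int, int]


def rasterize(
    events: List[DrumEvent],
    duration_s: float,
    frames_per_second: int = 100,  # hypothetical grid resolution
) -> List[List[float]]:
    """Return a [drum][frame] grid of onset strengths in the range 0..1."""
    lanes = sorted(DRUM_LANES)
    n_frames = int(round(duration_s * frames_per_second))
    grid = [[0.0] * n_frames for _ in lanes]
    for onset, note, velocity in events:
        if note not in DRUM_LANES:
            continue  # ignore drums outside the illustrative subset
        frame = int(round(onset * frames_per_second))
        if 0 <= frame < n_frames:
            row = lanes.index(note)
            # Keep the strongest hit if two events land on the same frame.
            grid[row][frame] = max(grid[row][frame], velocity / 127.0)
    return grid


if __name__ == "__main__":
    pattern = [(0.0, 36, 127), (0.0, 42, 64), (0.25, 42, 64), (0.5, 38, 100)]
    grid = rasterize(pattern, duration_s=1.0)
    for note, row in zip(sorted(DRUM_LANES), grid):
        hits = [i for i, v in enumerate(row) if v > 0]
        print(DRUM_LANES[note], "onset frames:", hits)
```

Because each drum has its own row, simultaneous hits (such as the kick and hi-hat at time zero) are kept separate, which is one simple way to preserve the polyphony the model is said to handle.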

Entities

Institutions

  • arXiv
