SCHmUBERT

An implementation of the absorbing-state discrete diffusion model from https://github.com/samb-t/unleashing-transformers, applied to symbolic music.
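As a rough intuition for how generation works: the forward process progressively replaces tokens with a dedicated mask (absorbing) token, and a transformer is trained to reverse this by predicting the original tokens. Below is a minimal sampling sketch; all names are illustrative, not this repository's actual API, and only the mask id 90 is taken from the GUI legend further down.

import numpy as np

MASK = 90        # mask token id (matches the GUI legend below)
SEQ_LEN = 256    # illustrative sequence length
VOCAB = 91       # illustrative vocabulary size

def sample(denoiser, steps=100, rng=np.random.default_rng(0)):
    """Unconditional sampling: start fully masked, reveal tokens step by step."""
    x = np.full(SEQ_LEN, MASK, dtype=np.int64)
    for t in range(steps, 0, -1):
        logits = denoiser(x)                          # assumed shape (SEQ_LEN, VOCAB)
        logits[:, MASK] = -np.inf                     # never sample the mask token itself
        probs = np.exp(logits - logits.max(-1, keepdims=True))
        probs /= probs.sum(-1, keepdims=True)
        cand = np.array([rng.choice(VOCAB, p=p) for p in probs])
        # reveal each still-masked position with probability 1/t, so the
        # sequence is fully unmasked by the time t reaches 1
        reveal = (x == MASK) & (rng.random(SEQ_LEN) < 1.0 / t)
        x[reveal] = cand[reveal]
    return x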

Samples

Samples in MIDI format can be found in the samples folder. You can also explore them in your browser (open the link in a new tab if the page is not found).

Installation

I run my experiments in Python 3.10, with all dependencies managed by Conda.

conda env create -f env.yml

Note that for all experiments, a soundfont file called 'soundfont.sf2' (not included) must be located in the root directory of the project.

Prepare Dataset

I use the Lakh MIDI Dataset to train the models. For loading, preprocessing, and extracting melodies and trios from the MIDI files, I adapted the pipelines Magenta implemented for their MusicVAE. To prepare the dataset, run:

python prepare_data.py --root_dir=/path/to/lmd_full --target data/lakh_trio.npy --mode trio --bars 64
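After preparation, a quick sanity check with numpy is possible; the exact array layout depends on --mode and --bars, so the shape comment below is an assumption:

import numpy as np
data = np.load("data/lakh_trio.npy")
print(data.shape, data.dtype)  # e.g. (num_sequences, sequence_length, num_tracks)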

Train

I use visdom to log the training progress and periodically show samples.
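The visdom server is started with its bundled module (by default it serves its dashboard on http://localhost:8097):

python -m visdom.server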

With visdom running, train the model with, for example:

python train.py --dataset data/lakh_trio.npy --bars 64 --batch_size 64 --tracks trio --model conv_transformer

So far, I have obtained the best results with the conv_transformer model using a single 1D convolutional layer with a width of 4. Pay attention to the steps_per_eval parameter, which defaults to 10000. A single evaluation step is more computationally expensive than 10000 training steps, so you may want to increase this value if you do not need that many evaluations.
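For example, assuming the parameter is passed on the command line like the others, evaluation can be made less frequent with:

python train.py --dataset data/lakh_trio.npy --bars 64 --batch_size 64 --tracks trio --model conv_transformer --steps_per_eval 50000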

Evaluate

To evaluate the framewise self-similarity metric on the samples generated by a model, run:

python evaluate.py --mode unconditional|infilling|self
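The framewise self-similarity metric fits Gaussians to the note statistics of overlapping frames and compares adjacent frames via the overlapping area (OA) of their distributions, in the spirit of Mittal et al. (2021). A minimal numerical sketch over pitches only follows; the frame and hop sizes are assumptions, and evaluate.py may differ in detail:

import numpy as np
from scipy.stats import norm

def overlapping_area(mu1, s1, mu2, s2, n=2048):
    # numerical overlap of two Gaussian pdfs: integrate min(pdf1, pdf2)
    lo = min(mu1 - 4 * s1, mu2 - 4 * s2)
    hi = max(mu1 + 4 * s1, mu2 + 4 * s2)
    grid = np.linspace(lo, hi, n)
    p = np.minimum(norm.pdf(grid, mu1, s1), norm.pdf(grid, mu2, s2))
    return p.sum() * (grid[1] - grid[0])

def framewise_self_similarity(pitches, frame=64, hop=32):
    # Gaussian statistics of each sliding frame, then OA of adjacent frames
    frames = [pitches[i:i + frame] for i in range(0, len(pitches) - frame + 1, hop)]
    stats = [(np.mean(f), np.std(f) + 1e-6) for f in frames]
    return np.array([overlapping_area(*stats[i], *stats[i + 1])
                     for i in range(len(stats) - 1)])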

Sample

For sampling, I hacked together a rudimentary GUI using nicegui.

python sample.py --load_step 140000 --bars 64 --tracks trio --model conv_transformer

The GUI supports:

  • visualizing samples (melody = red, bass = blue, drums = black); the y position indicates pitch height; special pitch values: 0 = pause, 1 = note off, 90 = mask (see the decoding sketch after this list)
  • adjusting the number of sampling steps (slider in the Upload expansion area)
  • diffusing from left to right ('=>') or vice versa ('<=')
  • copying from left to right ('>') or vice versa; only masked positions are overwritten
  • sampling unconditionally (select 'A' in the central toggle to diffuse All samples, a batch of 8, instead of the Selected one)
  • uploading MIDI or MusicXML pieces for conditioning
  • masking whole tracks (LM = Left Melody, RD = Right Drums, ...)
  • masking an area selected with the mouse (mask button at the bottom)
  • playback, with a cursor indicating the exact position in the left and right visualizations
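Based on the legend above, a token stream can be interpreted roughly as follows; this is a sketch, and the mapping of the remaining values onto pitches is an assumption:

def describe_token(token: int) -> str:
    # special values from the GUI legend; everything else is drawn at y = token
    if token == 0:
        return "pause"
    if token == 1:
        return "note off"
    if token == 90:
        return "mask"
    return f"pitch (plotted at height {token})"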

Model Weights

Model weights for the Conv_Transformer EMA model trained on the Lakh MIDI Dataset can be obtained here. Extract the 'logs' folder to the project root and set load_step, model, ... accordingly (250000, conv_transformer, ...).
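With the extracted logs folder in place, sampling with the pretrained weights would then be launched as:

python sample.py --load_step 250000 --bars 64 --tracks trio --model conv_transformer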
