Lo-fi music has rather simple characteristics (short loops of chord progressions, simple melodies, no dynamics, etc.), which makes it an easy target for computational music generation. We obtained a dataset of thousands of songs, each containing a chord progression, melodies, and other musical parameters, and trained two VAE models (Lofi2Lofi and Lyrics2Lofi) to learn a latent space of musical parameters to sample from. These musical parameters are:
- Chords: an integer sequence encoding the chord progression in Roman numeral notation (0-8; 0 = rest, 8 = end)
- Melodies: an integer sequence of eight notes per chord, given as scale degrees over two octaves (0-15; 0 = rest)
- Tempo: a continuous value in [0, 1] that indicates the tempo and can be scaled to BPM
- Key: an integer between 1 and 12 denoting the musical key as a chromatic scale degree
- Mode: an integer between 1 and 7 corresponding to one of the seven Greek modes
- Valence: a continuous value in [0, 1] that denotes musical positiveness
- Energy: a continuous value in [0, 1] that denotes a perceptual measure of intensity and activity
A sample can thus be represented in JSON as follows:

```json
{
  "title": "#338934052871945450670672",
  "key": 3,
  "mode": 1,
  "bpm": 75,
  "energy": 0.501,
  "valence": 0.381,
  "chords": [6, 7, 1, 4, 5, 1],
  "melodies": [
    [0, 6, 6, 6, 5, 6, 5, 6],
    [5, 6, 2, 0, 2, 0, 2, 0],
    [0, 5, 0, 5, 0, 5, 0, 5],
    [0, 6, 6, 6, 6, 6, 6, 6],
    [6, 5, 5, 5, 5, 5, 5, 2],
    [0, 5, 0, 0, 0, 0, 0, 0]
  ]
}
```
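For illustration, here is a minimal Python sketch (not the project's actual code) that decodes such a sample into human-readable terms. The function and lookup tables are hypothetical; in particular, the assumption that key 1 maps to C and mode 1 to Ionian is only our reading of the encoding described above.

```python
# Hypothetical decoder for the integer encoding described above.
# Assumptions: key 1 = C, mode 1 = Ionian (standard Greek-mode ordering).
KEYS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MODES = ["Ionian", "Dorian", "Phrygian", "Lydian", "Mixolydian", "Aeolian", "Locrian"]
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]

def decode_sample(sample: dict) -> str:
    key = KEYS[sample["key"] - 1]           # 1-12 -> chromatic scale degree
    mode = MODES[sample["mode"] - 1]        # 1-7  -> Greek mode
    chords = []
    for c in sample["chords"]:
        if c == 0:
            chords.append("rest")           # 0 marks a rest
        elif c == 8:
            break                           # 8 marks the end of the progression
        else:
            chords.append(NUMERALS[c - 1])  # 1-7 -> Roman numeral scale degree
    return f"{key} {mode}, {sample['bpm']} BPM: {' - '.join(chords)}"

sample = {"key": 3, "mode": 1, "bpm": 75, "chords": [6, 7, 1, 4, 5, 1]}
print(decode_sample(sample))  # D Ionian, 75 BPM: VI - VII - I - IV - V - I
```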
We trained two VAEs, Lofi2Lofi and Lyrics2Lofi.
Lofi2Lofi is a symmetrical VAE. Each dataset sample is encoded in the same format as the output.
The decoder architecture is easiest to understand when the LSTMs are unrolled: melodies are conditioned on the current chord, and each note is conditioned on the previous notes. In music, context is very important, and we hope to reflect that in our model.
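To make the conditioning scheme concrete, here is a simplified PyTorch sketch of a hierarchical decoder. It is not the repository's actual model: the layer sizes, names, fixed chord count, and greedy note feedback are illustrative assumptions. The key point is that each note step receives both the current chord's hidden state and the embedding of the previously generated note.

```python
# Sketch of a hierarchical decoder: a chord LSTM unrolled over the latent
# vector z, and a melody LSTM that generates eight notes per chord,
# conditioned on the chord state and the previous note. Illustrative only.
import torch
import torch.nn as nn

class HierarchicalDecoder(nn.Module):
    def __init__(self, z_dim=128, hidden=256, n_chords=9, n_notes=16, notes_per_chord=8):
        super().__init__()
        self.notes_per_chord = notes_per_chord
        self.chord_lstm = nn.LSTMCell(z_dim, hidden)
        self.chord_out = nn.Linear(hidden, n_chords)
        self.note_embed = nn.Embedding(n_notes, 32)
        # the melody cell sees the chord hidden state plus the previous note
        self.melody_lstm = nn.LSTMCell(hidden + 32, hidden)
        self.note_out = nn.Linear(hidden, n_notes)

    def forward(self, z, n_chord_steps=8):
        batch = z.size(0)
        h_c = c_c = torch.zeros(batch, self.chord_lstm.hidden_size)
        chords, melodies = [], []
        for _ in range(n_chord_steps):
            h_c, c_c = self.chord_lstm(z, (h_c, c_c))       # unrolled chord step
            chords.append(self.chord_out(h_c))
            # eight notes conditioned on this chord's hidden state
            h_m = c_m = torch.zeros(batch, self.melody_lstm.hidden_size)
            prev_note = torch.zeros(batch, dtype=torch.long)  # start from rest
            for _ in range(self.notes_per_chord):
                inp = torch.cat([h_c, self.note_embed(prev_note)], dim=-1)
                h_m, c_m = self.melody_lstm(inp, (h_m, c_m))
                logits = self.note_out(h_m)
                melodies.append(logits)
                prev_note = logits.argmax(dim=-1)            # greedy feedback
        return torch.stack(chords, dim=1), torch.stack(melodies, dim=1)
```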
Lyrics2Lofi is an asymmetrical VAE that takes lyrics as input. We initially hoped to turn input text into lo-fi music, but preliminary results show that text embeddings simply cannot provide enough information about the music itself, leading to poor validation performance.
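To illustrate the asymmetry, the sketch below shows one way such an encoder could map a pre-computed lyrics embedding to the latent distribution, with the decoder side remaining the musical decoder described above. The embedding dimension and all names here are assumptions, not the repository's code.

```python
# Hypothetical asymmetric encoder: a lyrics embedding goes in, a latent
# sample (plus its mean and log-variance for the KL term) comes out.
import torch
import torch.nn as nn

class LyricsEncoder(nn.Module):
    def __init__(self, embed_dim=768, hidden=256, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(embed_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, lyrics_embedding):
        h = self.net(lyrics_embedding)
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: sample z ~ N(mu, sigma^2)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar
```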
First, follow the instructions in the `dataset` folder to build the dataset.
To run Lofi2Lofi:
- Run `lofi2lofi_train.py`
To run Lyrics2Lofi:
- Run `make_embeddings` inside `embeddings.py` to build the `embeddings.npy` file.
- Run `lyrics2lofi_train.py`