OpenAI releases Jukebox, a machine learning framework that generates music


OpenAI recently launched Jukebox, a model that generates music with singing in the raw audio domain. Raw audio produces very long sequences, so Jukebox first shortens them with an autoencoder: a multiscale VQ-VAE compresses the audio into discrete codes, and autoregressive Transformers then model those codes.
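
To make the "compress to discrete codes" step more concrete, here is a minimal sketch of vector quantization, the operation at the heart of a VQ-VAE. It is not Jukebox's code: the codebook values, the 2-dimensional vectors, and the function names `quantize` and `dequantize` are all assumptions made for illustration. In a real VQ-VAE the codebook is learned together with the encoder and decoder.

```python
# Toy vector quantization: each continuous vector is replaced by the index
# of its nearest codebook entry. All values here are made up for illustration.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(latents, codebook):
    """Return one codebook index (a discrete code) per latent vector."""
    codes = []
    for vec in latents:
        distances = [squared_distance(vec, entry) for entry in codebook]
        codes.append(distances.index(min(distances)))
    return codes

def dequantize(codes, codebook):
    """Map discrete codes back to their codebook vectors."""
    return [codebook[i] for i in codes]

# Hypothetical 4-entry codebook and three encoder outputs.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

codes = quantize(latents, codebook)
print(codes)                        # [0, 3, 2]
print(dequantize(codes, codebook))  # the quantized approximation of latents
```

The sequence of integer codes is much shorter and simpler than the raw waveform, which is what makes it practical to model with an autoregressive Transformer.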

Given a genre, artist, and lyrics as input, Jukebox outputs a new music sample produced from scratch. The model can generate audio pieces that are multiple minutes long, with recognizable singing in natural-sounding voices, which pushes generative music models beyond what they could previously do. Please listen to the Jukebox-generated country song listed at the end of this article.
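
The idea of generating codes one at a time, conditioned on metadata such as genre, artist, and lyrics, can be sketched as a toy sampling loop. This is only an illustration under loud assumptions: `toy_next_code_weights` is a hypothetical stand-in for a trained Transformer, and the vocabulary size, the conditioning dictionary, and the weighting rule are invented for the example and have nothing to do with Jukebox's actual implementation.

```python
import random

def toy_next_code_weights(history, conditioning, vocab_size):
    """Hypothetical stand-in for a trained Transformer.

    Returns unnormalized weights over the next code, given the codes
    generated so far and the conditioning information.
    """
    # Derive a deterministic offset from the conditioning text (toy rule).
    shift = sum(len(str(value)) for value in conditioning.values()) % vocab_size
    preferred = (history[-1] + 1 + shift) % vocab_size if history else shift
    return [5.0 if code == preferred else 1.0 for code in range(vocab_size)]

def sample_codes(length, conditioning, vocab_size=8, seed=0):
    """Autoregressive sampling: each new code depends on all previous ones."""
    rng = random.Random(seed)
    codes = []
    for _ in range(length):
        weights = toy_next_code_weights(codes, conditioning, vocab_size)
        next_code = rng.choices(range(vocab_size), weights=weights, k=1)[0]
        codes.append(next_code)
    return codes

# Hypothetical conditioning inputs.
conditioning = {"genre": "country", "artist": "unknown", "lyrics": "la la la"}
codes = sample_codes(16, conditioning)
print(codes)
```

In the real system, the sampled codes would then be decoded back to audio by the VQ-VAE decoder; here they are just integers.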


# Required: Sampling
conda create --name jukebox python=3.7.5
conda activate jukebox
conda install mpi4py=3.0.3
conda install pytorch=1.4 torchvision=0.5 cudatoolkit=10.0 -c pytorch
git clone
cd jukebox
pip install -r requirements.txt
pip install -e .

# Required: Training
conda install av=7.0.01 -c conda-forge 
pip install ./tensorboardX
# Optional: Apex for faster training with fused_adam
conda install pytorch=1.1 torchvision=0.3 cudatoolkit=10.0 -c pytorch
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex




Sample Explorer:

Video Paper Summary by Luca