
Riley Wong

Generated Final Fantasy MIDI files, visualized


Generating Jazz Music with an LSTM Recurrent Neural Network

February 25, 2019

Last week was my first week at the Recurse Center! I’m having so much fun lol. While here, I’m exploring creative applications of deep learning.

As my starter project, I wanted to generate jazz music using a neural network. LSTM stands for Long Short-Term Memory; it is a type of recurrent neural network designed to process sequences. You can think of it as a network with a short-term memory that can still learn long-term dependencies.
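
To make the "memory" idea a bit more concrete, here is a minimal sketch of a single LSTM step on scalar values, written in plain Python. It is not the Keras implementation used for this project; the `lstm_step` function, the `params` layout, and the weights below are all made up for illustration. It only shows the standard gate equations: a forget gate decides how much of the old cell state to keep, and an input gate decides how much new information to write.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to (input weight, recurrent weight, bias).
    Illustrative layout only, not how Keras stores its weights.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the candidate to write
    g = gate("candidate", math.tanh)   # new information proposed for the cell
    o = gate("output", sigmoid)        # how much of the cell state to expose

    c = f * c_prev + i * g             # updated long-term cell state
    h = o * math.tanh(c)               # updated short-term hidden state
    return h, c

# Hypothetical weights: forget gate held open, input gate held shut,
# so the cell state should survive many steps almost unchanged.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}

h, c = 0.0, 1.0
for step in range(50):
    h, c = lstm_step(float(step % 3), h, c, params)
```

With the forget gate near 1 and the input gate near 0, the cell state `c` stays close to its starting value across all 50 steps, which is the mechanism that lets an LSTM carry information over long sequences.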

Using this tutorial as a starting point, I trained an LSTM model on two datasets: Final Fantasy music (conveniently provided by the tutorial, which let me focus on building the model rather than finding data), and Herbie Hancock jazz music (my original goal!).

Here are the results:

Final Fantasy

For this composition, I generated a bunch of MIDI files using the model, picked 3 I liked, set them to different instruments, and composited them into one piece.

Herbie Hancock

For these, I didn’t compose or edit the songs much. After generating a few MIDI files, I picked some I liked, and set each one individually to an instrument I thought sounded nice.

My favorite parts are 0:22-0:45 in the Wurlitzer piece, and the first 5 seconds of the Vibraphone piece.


Honestly, I’m quite happy with the results! This was my first time working with music data. I also had access to the Recurse Center’s GPU cluster, which made this project possible. I’ve pushed my code up to this GitHub repo for reference.

Project components:

  • Learning about the MIDI file format and how to encode it using the Python Music21 library

  • Finding MIDI files for my training data

  • Getting set up on the GPU cluster (and using screen so I don’t disconnect and interrupt my training session when I leave for the day!)

  • Training the LSTM model using Keras, saving the weights as I go. Learned from a friend: if you have access to a GPU, you’ll want to use CuDNNLSTM rather than LSTM layers to cut down on training time! Generating doesn’t take that long, but it would speed that up as well.

  • Generating music using the LSTM model (same architecture, loaded with the most recent weights file). Using the first 100 notes as input, predict the next note. Shift the input window forward by one note, and repeat. Stop whenever you feel your song is long enough lol. The songs in this post are about 250 notes each.

  • Opening up the MIDI files in GarageBand so I can play them with various fun instruments and sounds :-)
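
The windowing and generation steps in the list above can be sketched roughly like this. This is plain Python rather than my actual Keras code: `predict_next` is a hypothetical stand-in for whatever the trained model returns, and the toy note list and the helper names (`build_vocab`, `make_training_windows`, `generate`) are made up for illustration. Only the window size of 100 and the song length of about 250 notes come from the description above.

```python
SEQUENCE_LENGTH = 100  # input window size described above

def build_vocab(notes):
    """Map each distinct note name to an integer id."""
    return {name: idx for idx, name in enumerate(sorted(set(notes)))}

def make_training_windows(notes, vocab, seq_len=SEQUENCE_LENGTH):
    """Pair every window of seq_len notes with the note that follows it."""
    ids = [vocab[name] for name in notes]
    inputs, targets = [], []
    for start in range(len(ids) - seq_len):
        inputs.append(ids[start:start + seq_len])
        targets.append(ids[start + seq_len])
    return inputs, targets

def generate(seed, predict_next, length):
    """Predict one note, slide the window forward by one, repeat."""
    window = list(seed)
    output = []
    for _ in range(length):
        next_id = predict_next(window)
        output.append(next_id)
        window = window[1:] + [next_id]  # drop the oldest note, append the new one
    return output

# Toy data standing in for notes parsed out of MIDI files.
toy_notes = ["C4", "E4", "G4", "B4"] * 50
vocab = build_vocab(toy_notes)
inputs, targets = make_training_windows(toy_notes, vocab)

def fake_predict(window):
    """Hypothetical stand-in for the trained model: steps to the next id."""
    return (window[-1] + 1) % len(vocab)

song = generate(inputs[0], fake_predict, length=250)
```

In the real project, `predict_next` would run the window through the trained network and pick a note from its output; everything else about the loop stays the same.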

There are a number of possible extensions from here. For example, right now the rhythm is pretty straightforward, as each note is offset from the previous one by a fixed 0.5 seconds. A possible extension is to encode the rhythms of the training data (doable, since a MIDI file is essentially a series of notes plus time offsets). Another extension is to add some music rules, e.g. counterpoint, harmony, consonance, etc., which I would have to research more. It would also be really interesting to train a network on multiple instrumental parts, such as an orchestral score, where different instruments would have musical relationships or dependencies with one another.
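
As a rough sketch of the rhythm idea (not something I've implemented), one option is to fold the time gap since the previous note into each token, so the model learns pitch and timing together. The `pitch@delta` token format and all function names here are assumptions made up for illustration.

```python
def encode_with_rhythm(events):
    """Turn (pitch, start_time) pairs into 'pitch@delta' tokens,
    where delta is the gap since the previous note started."""
    tokens = []
    prev_start = 0.0
    for pitch, start in events:
        tokens.append(f"{pitch}@{start - prev_start:g}")
        prev_start = start
    return tokens

def decode_with_rhythm(tokens):
    """Rebuild (pitch, start_time) pairs from 'pitch@delta' tokens."""
    events = []
    current = 0.0
    for token in tokens:
        pitch, delta = token.rsplit("@", 1)
        current += float(delta)
        events.append((pitch, current))
    return events

def decode_fixed_step(pitches, step=0.5):
    """The current approach: every note lands a fixed step after the last."""
    return [(pitch, index * step) for index, pitch in enumerate(pitches)]

# Hypothetical phrase with uneven timing.
phrase = [("C4", 0.0), ("E4", 0.25), ("G4", 1.0), ("C5", 1.5)]
tokens = encode_with_rhythm(phrase)
restored = decode_with_rhythm(tokens)
flattened = decode_fixed_step([pitch for pitch, _ in phrase])
```

Here `restored` gets the original timing back, while `flattened` spaces every note evenly, which is what the generated songs in this post do. The tradeoff is a larger vocabulary, since each pitch now appears once per distinct time gap.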

Sheet Music

For fun, I used an online converter to generate sheet music for piano from the output MIDI files :-)

Here’s the first song (Wurlitzer Electric):

jazz_03_sheet_1.png
jazz_03_sheet_2.png.png

Here’s the second song (Vibraphone):

jazz_02_sheet_1.png
jazz_02_sheet_2.png
Tags music, generative music, generative art, lstm, neural network, neural networks, python, machine learning

Riley Wong © 2014 · contact