Here you can find information about Sam’s ResM Thesis at Plymouth University, titled: “Sonic Analysis for Machine Learning: Multi-Layer Perceptron Training using Spectrograms” and related materials.
This thesis presents efforts to lay the foundations for an Artificial-Intelligence musical compositional system conceived on similar principles to DeepDream, a revolutionary computer vision process. This theoretical system would be designed to engage in stylistic feature transfer between existing musical pieces, and eventually to compose original music either autonomously or in collaboration with human musicians and composers. In this thesis, construction of the analysis and feature recognition systems necessary for this long-term goal is achieved through the use of neural networks.
Originally, DeepDream came about as a way of visualising the weights inside neural network layers – matrices of variables containing the data that determines what information the network has learned – to aid the understanding and troubleshooting of networks that have been trained to classify images. This approach spawned an unexpectedly artistic process whereby feature recognition could be used to alter images in a dreamlike fashion, akin to seeing shapes in clouds.
The proposed musical version of this process involves analysing sound files and generating spectrograms – pictures of the sound that could be manipulated in much the same ways as regular images. As described in this thesis, a sizeable bank of sound samples has been gathered – of individual musical notes from a selection of instruments – in pursuit of this application of the DeepDream architecture.
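To make the idea of a spectrogram as a "picture of the sound" concrete, here is a minimal sketch in Python using only the standard library. It is not the analysis pipeline used in the thesis: the test tone, the `spectrogram` function, the frame size and the hop length are all assumptions chosen for illustration, and the naive Fourier transform is written for readability rather than speed.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, n_samples):
    """Generate a pure tone standing in for a recorded instrument note."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds magnitudes for bins 0..frame_size//2."""
    # A Hann window tapers each frame to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform of one windowed frame.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical example: a 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
tone = sine_wave(1000, SAMPLE_RATE, 512)
spec = spectrogram(tone)

# Index of the strongest frequency bin in each time frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Each inner list is one column of the "picture": time runs along the outer list and frequency along the inner one. With these assumed settings each bin spans 125 Hz, so the tone's energy concentrates in a single bin in every frame.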
These samples are curated, edited and analysed to produce spectrograms that make up a dataset for neural network training. Using the Python programming language and its machine learning library ‘scikit-learn’, a rudimentary deep learning system is constructed to be trained on the sample spectrograms and learn to classify them. Once this is complete, additional tests are performed to determine the validity and effectiveness of the approach.
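As a rough illustration of this kind of training step, the sketch below fits a multi-layer perceptron with scikit-learn's `MLPClassifier`. It does not reproduce the thesis code or data: the synthetic feature vectors, the two class labels, the layer size and the solver are all assumptions made so that the example is self-contained.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

N_BINS = 32        # assumed number of frequency bins per feature vector
N_PER_CLASS = 100  # assumed number of samples per instrument class


def fake_spectrum(peak_bin):
    """Synthetic stand-in for one spectrogram row: low noise plus one strong bin."""
    row = rng.random(N_BINS) * 0.1
    row[peak_bin] += 1.0
    return row


# Two hypothetical instrument classes whose energy sits in different bins.
X = np.array([fake_spectrum(5) for _ in range(N_PER_CLASS)]
             + [fake_spectrum(20) for _ in range(N_PER_CLASS)])
y = np.array([0] * N_PER_CLASS + [1] * N_PER_CLASS)

# Hold back a quarter of the data to check how well the classifier generalises.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# A small perceptron with one hidden layer; lbfgs suits small datasets.
clf = MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                    max_iter=500, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Real spectrograms of instrument notes are far less cleanly separable than this toy data, so held-out accuracy on a separate test split is the figure worth watching when judging such a classifier.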
The additional materials folder contains the audio files of the samples the system was trained on (including all additional tests), the CSV files generated from these samples, and the Python & Max/MSP code used during the project.
You can download the thesis itself here
Download the additional materials folder here (more than 1 GB in size)