diff --git a/README.md b/README.md index 8bfdb75..5b186bd 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ We show examples on how to perform the following parts of the Deep Learning work This demo is implemented as a MATLAB project and will require you to open the project to run it. The project will manage all paths and shortcuts you need. There is also a significant data copy required the first time you run the project. -## Part 1 - Data Preparation +## Part 1 - Data Preparation ([View on Browser](./markdown_view/Part01_DataPreparation.md)) This example shows how to extract the set of acoustic features that will be used as inputs to the LSTM Deep Learning network. @@ -23,7 +23,7 @@ To run: 1. Open MATLAB project Aircompressorclassification.prj 2. Open and run Part01_DataPreparation.mlx -## Part 2 - Modeling +## Part 2 - Modeling ([View on Browser](./markdown_view/Part02_Modeling.md)) This example shows how to train LSTM network to classify multiple modes of operation that include healthy and unhealthy signals. @@ -31,7 +31,7 @@ To run: 1. Open MATLAB project Aircompressorclassification.prj 2. Open and run Part02_Modeling.mlx -## Part 3 - Deployment +## Part 3 - Deployment ([View on Browser](./markdown_view/Part03_Deployment.md)) This example shows how to generate optimized c++ code ready for deployment. diff --git a/markdown_view/Part01_DataPreparation.md b/markdown_view/Part01_DataPreparation.md new file mode 100644 index 0000000..d6b6175 --- /dev/null +++ b/markdown_view/Part01_DataPreparation.md @@ -0,0 +1,256 @@ +# Air Compressor Data Classification +# Part 1: Data Preparation + + +Copyright 2020 The MathWorks, Inc. + + + + +![image_0.png](Part01_DataPreparation_images/image_0.png) + + +# Configuration + + +Click on the checkboxes below to choose options for how to run this script. + + + +```matlab:Code +doFeatureExtraction = false; % if unchecked, this will save time by loading previous results +``` + + + +Make sure we run this as a project. + + + +```matlab:Code +try + prj = currentProject; +catch + open("Aircompressorclassification.prj"); + OpenPart1; + prj = currentProject; +end +``` + +# Create Datastore + + +The recorded data is sorted by subfolder. Each subfolder contains over 200 recordings of the labeled state. + + + + +We create our datastore by providing the input data folder and specifying that the source of the labels is the name of the subfolders. + + + +```matlab:Code +dataFolder = 'AirCompressorData'; +ads = audioDatastore(dataFolder,'IncludeSubfolders',true,'LabelSource','foldernames'); +``` + + + +We then reset the random number generator (for consistent results) and shuffle the data. + + + +```matlab:Code +rng(3); +ads = shuffle(ads); +``` + +# Split Into Training and Validation Sets + + +Split the data into training and validation by doing a 90% training, 10% validation split. The `countEachLabel` command will show us how many samples of data belong to each category in the dataset. 
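+
+
+As an optional sanity check before splitting (a minimal sketch; `ads` is the shuffled datastore created above), the same command can be run on the full datastore to confirm that all eight classes are evenly represented:
+
+
+
+```matlab:Code
+% Optional sanity check (sketch): label counts for the full datastore
+% before the 90/10 split performed below.
+countEachLabel(ads)
+```
+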
+ + + +```matlab:Code +[adsTrain,adsValidation] = splitEachLabel(ads,0.9,0.1); +countEachLabel(adsTrain) +``` + +| |Label|Count| +|:--:|:--:|:--:| +|1|Bearing|203| +|2|Flywheel|203| +|3|Healthy|203| +|4|LIV|203| +|5|LOV|203| +|6|NRV|203| +|7|Piston|203| +|8|Riderbelt|203| + + +```matlab:Code +countEachLabel(adsValidation) +``` + +| |Label|Count| +|:--:|:--:|:--:| +|1|Bearing|22| +|2|Flywheel|22| +|3|Healthy|22| +|4|LIV|22| +|5|LOV|22| +|6|NRV|22| +|7|Piston|22| +|8|Riderbelt|22| + +# Data Preparation +## Human Insight + + +The data we are working with are time-series recordings of acoustics from different parts of an air compressor. As such, there are strong relationships between samples in time. + + + +```matlab:Code +sampleData = read(adsTrain); +sampleDataCategory = adsTrain.Labels(1); +plot(1:numel(sampleData), sampleData); +xlabel("Sample number"); +ylabel("Amplitude"); +title("Class: " + string(sampleDataCategory)); +``` + + +![figure_0.png](Part01_DataPreparation_images/figure_0.png) + + + +Listen to a sample of the audio if desired. + + + +```matlab:Code +% sound(sampleData,16000); +``` + + + +Because of this, we should be able to use a type of recurrent neural network (RNN) as a model for the data. The type of RNN we will eventually select is a bi-directional long short term memory (LSTM) network. However, before we can get to modeling, it's important to prepare the data adequately. Oftentimes, it is best to transform or extract features from 1-dimensional signal data in order to increase a model's representative power, as we see in the diagram below: + + + + +![image_1.png](Part01_DataPreparation_images/image_1.png) + + + + +In this part of the workflow, we will focus on how to engineer features from the original data that will aid in the model's ability to classify the inputs. + + +# Generate Training Features + + +The next step is to extract the set of acoustic features that will be used as inputs to the network. + + + + +The Audio Toolbox provides a set of Spectral Descriptor features that are commonly used as inputs to deep learning networks. + + + + +We can extract the features with individual functions, or we can simplify the workflow and use a single object called audioFeatureExtractor to do it all at once. + + + +```matlab:Code +trainingFeatures = cell(1,numel(adsTrain.Files)); +windowLength = 512; +overlapLength = 0; + +aFE = audioFeatureExtractor('SampleRate',16e3, ... + 'Window',hamming(windowLength,'periodic'),... + 'OverlapLength',overlapLength,... + 'spectralCentroid',true, ... + 'spectralCrest',true, ... + 'spectralDecrease',true, ... + 'spectralEntropy',true,... + 'spectralFlatness',true,... + 'spectralFlux',false,... + 'spectralKurtosis',true,... + 'spectralRolloffPoint',true,... + 'spectralSkewness',true,... + 'spectralSlope',true,... + 'spectralSpread',true); + +if doFeatureExtraction + reset(adsTrain); + index = 1; + tic; + while hasdata(adsTrain) + data = read(adsTrain); + trainingFeatures{index} = extract(aFE,data); + index = index + 1; + end + fprintf('Extraction took %f seconds.\n',toc); +else + load("TrainingFeatures.mat"); + disp("Training data features loaded.") +end +``` + + +```text:Output +Extraction took 31.829615 seconds. +``` + +# Normalize Training Features + + +Networks will often train better when normalized. Calculate the mean and standard deviation and normalize each element of the training feature set. 
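+
+
+As a reminder of what this does, here is a self-contained sketch on synthetic data (illustration only, not part of the project code): each feature column is centered by its mean and scaled by its standard deviation. The real loop below additionally transposes each matrix so that rows become features and columns become time steps, the layout the sequence network expects.
+
+
+
+```matlab:Code
+% Minimal sketch on synthetic data: column-wise z-score normalization,
+% the same operation applied to the real feature matrices below.
+dummy = randn(100,10)*5 + 3;        % 100 frames x 10 features
+mu = mean(dummy);                   % 1 x 10 per-feature means
+sigma = std(dummy);                 % 1 x 10 per-feature standard deviations
+dummyNorm = (dummy - mu)./sigma;    % implicit expansion handles each column
+```
+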
+ + + +```matlab:Code +allTrainingFeatures = cat(1,trainingFeatures{:}); +M = mean(allTrainingFeatures); +S = std(allTrainingFeatures); + +for index = 1:numel(adsTrain.Files) + trainingFeatures{index} = ((trainingFeatures{index} - M)./S).'; +end +``` + +# Generate and Normalize Validation Features + + +Repeat the feature extraction for the validation features. Perform the normalization inside the loop. + + + +```matlab:Code +validationFeatures = cell(1,numel(adsValidation.Files)); + +if doFeatureExtraction + index = 1; + tic; + while hasdata(adsValidation) + data = read(adsValidation); + validationFeatures{index} = extract(aFE,data); + validationFeatures{index} = ((validationFeatures{index} - M) ./ S).'; + index = index + 1; + end + fprintf('Validation Extraction took %f seconds.\n',toc); +else + load("ValidationFeatures.mat"); +end +``` + + +```text:Output +Validation Extraction took 3.328731 seconds. +``` + diff --git a/markdown_view/Part01_DataPreparation_images/figure_0.png b/markdown_view/Part01_DataPreparation_images/figure_0.png new file mode 100644 index 0000000..b7516f8 Binary files /dev/null and b/markdown_view/Part01_DataPreparation_images/figure_0.png differ diff --git a/markdown_view/Part01_DataPreparation_images/image_0.png b/markdown_view/Part01_DataPreparation_images/image_0.png new file mode 100644 index 0000000..9744c5d Binary files /dev/null and b/markdown_view/Part01_DataPreparation_images/image_0.png differ diff --git a/markdown_view/Part01_DataPreparation_images/image_1.png b/markdown_view/Part01_DataPreparation_images/image_1.png new file mode 100644 index 0000000..c690f9b Binary files /dev/null and b/markdown_view/Part01_DataPreparation_images/image_1.png differ diff --git a/markdown_view/Part02_Modeling.md b/markdown_view/Part02_Modeling.md new file mode 100644 index 0000000..500975d --- /dev/null +++ b/markdown_view/Part02_Modeling.md @@ -0,0 +1,199 @@ +# Air Compressor Data Classification +# Part 2: Train and Evaluate a Model + + +Copyright 2020 The MathWorks, Inc. + + + + +![image_0.png](Part02_Modeling_images/image_0.png) + + +# Configuration + + +Click on the checkboxes below to choose options for how to run this script. + + + +```matlab:Code +doTraining = false; +doTesting = true; +``` + + + +Make sure we run this as a project. + + + +```matlab:Code +try + prj = currentProject; +catch + open("Aircompressorclassification.prj"); + OpenPart2; + prj = currentProject; +end +``` + +# Load Data + + +Load data that was preprocessed in the previous section. + + + +```matlab:Code +load("TrainingFeatures.mat"); +load("ValidationFeatures.mat"); +reloadDatastore; +``` + +# Define Network + + +Now that we have extracted features from our signal, we will define a long short term memory (LSTM) deep neural network. + + + + +![image_1.png](Part02_Modeling_images/image_1.png) + + + + +Use an LSTM network. An LSTM layer learns long-term dependencies between time steps of time series or sequence data. The first lstmlayer will have 100 hidden units and output the sequence data. Then a dropout layer will be used to reduce probability of overfitting. The second lstmlayer will output just the last step of the time sequence. + + + +```matlab:Code +layers = [ ... 
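+    % Layer stack: one input row per spectral feature, two 100-unit LSTM
+    % layers (sequence output, then last time step only) with dropout in
+    % between, and a fully connected + softmax classification head for the
+    % 8 operating-condition classes.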
+ sequenceInputLayer(size(trainingFeatures{1},1)) + lstmLayer(100,"OutputMode","sequence") + dropoutLayer(0.1) + lstmLayer(100,"OutputMode","last") + fullyConnectedLayer(8) + softmaxLayer + classificationLayer]; +``` + +# Define Network Hyperparameters + +```matlab:Code +miniBatchSize = 32; +validationFrequency = floor(numel(trainingFeatures)/miniBatchSize); +options = trainingOptions("adam", ... + "MaxEpochs",50, ... + "MiniBatchSize",miniBatchSize, ... + "Plots","training-progress", ... + "Verbose",false, ... + "Shuffle","every-epoch", ... + "LearnRateSchedule","piecewise", ... + "LearnRateDropFactor",0.1, ... + "LearnRateDropPeriod",20,... + 'ValidationData',{validationFeatures,adsValidation.Labels}, ... + 'ValidationFrequency',validationFrequency); +``` + +# Train The Network + + +This network takes about 100 seconds to train on an NVIDIA RTX 2080 GPU. + + + +```matlab:Code +if doTraining + airCompNet = trainNetwork(trainingFeatures,adsTrain.Labels,layers,options); +else + load("TrainedModel.mat"); +end +``` + + +![figure_0.png](Part02_Modeling_images/figure_0.png) + +# Test The Network + + +Now that the network has been trained, we can test it on the validation data. + + + +```matlab:Code +if doTesting + validationResults = classify(airCompNet,validationFeatures); +else + load("ValidationResults.mat"); +end +``` + + + +View the confusion chart for the test results: + + + +```matlab:Code +cm = confusionchart(validationResults,adsValidation.Labels); +``` + + +![figure_1.png](Part02_Modeling_images/figure_1.png) + + + +View the overall accuracy percentage of the validation and test results: + + + +```matlab:Code +accuracy = sum(validationResults == adsValidation.Labels) / numel(validationResults); +disp("Accuracy: " + accuracy * 100 + "%") +``` + + +```text:Output +Accuracy: 96.0227% +``` + +### Visualize LSTM Activations + +\hfill \break + + +```matlab:Code +X = trainingFeatures{1}; +sequenceLength = size(X,2); +idxLayer = 2; +features = zeros(100,sequenceLength); + +if doTesting + for i = 1:sequenceLength + features(:,i) = cell2mat(activations(airCompNet,X(:,i),idxLayer)); + [net, YPred(i)] = classifyAndUpdateState(airCompNet,X(:,i)); %#ok + end +else + load("LSTMActivations.mat"); +end + +``` + + + +Visualize the first 40 hidden units using a heatmap. 
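+
+
+Optionally, each hidden unit's activations can first be rescaled to [0,1] so that units with a small dynamic range remain visible (a minimal sketch; the heatmap below plots the raw activations):
+
+
+
+```matlab:Code
+% Optional sketch: row-wise rescaling of the 100 x sequenceLength activation
+% matrix computed above; eps guards against constant rows.
+fMin = min(features,[],2);
+fMax = max(features,[],2);
+featuresScaled = (features - fMin) ./ (fMax - fMin + eps);
+```
+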
+ + + +```matlab:Code +heatmap(features(1:40,1:40)); +xlabel("Time Step") +ylabel("Hidden Unit") +title("LSTM Activations") +``` + + +![figure_2.png](Part02_Modeling_images/figure_2.png) + diff --git a/markdown_view/Part02_Modeling_images/figure_0.png b/markdown_view/Part02_Modeling_images/figure_0.png new file mode 100644 index 0000000..ef1cc66 Binary files /dev/null and b/markdown_view/Part02_Modeling_images/figure_0.png differ diff --git a/markdown_view/Part02_Modeling_images/figure_1.png b/markdown_view/Part02_Modeling_images/figure_1.png new file mode 100644 index 0000000..5b07bc5 Binary files /dev/null and b/markdown_view/Part02_Modeling_images/figure_1.png differ diff --git a/markdown_view/Part02_Modeling_images/figure_2.png b/markdown_view/Part02_Modeling_images/figure_2.png new file mode 100644 index 0000000..b2f9c9d Binary files /dev/null and b/markdown_view/Part02_Modeling_images/figure_2.png differ diff --git a/markdown_view/Part02_Modeling_images/image_0.png b/markdown_view/Part02_Modeling_images/image_0.png new file mode 100644 index 0000000..1d094ee Binary files /dev/null and b/markdown_view/Part02_Modeling_images/image_0.png differ diff --git a/markdown_view/Part02_Modeling_images/image_1.png b/markdown_view/Part02_Modeling_images/image_1.png new file mode 100644 index 0000000..c690f9b Binary files /dev/null and b/markdown_view/Part02_Modeling_images/image_1.png differ diff --git a/markdown_view/Part03_Deployment.md b/markdown_view/Part03_Deployment.md new file mode 100644 index 0000000..dcd4a02 --- /dev/null +++ b/markdown_view/Part03_Deployment.md @@ -0,0 +1,282 @@ +# Air Compressor Data Classification +# Part 3: Deployment + + +Copyright 2020 The MathWorks, Inc. + + + + +![image_0.png](Part03_Deployment_images/image_0.png) + + +# Configuration + + +Click on the checkboxes below to choose options for how to run this script. + + + +```matlab:Code +doCodeGeneration = false; +``` + + + +Make sure we run this as a project. + + + +```matlab:Code +try + prj = currentProject; +catch + open("Aircompressorclassification.prj"); + OpenPart3; + prj = currentProject; +end +``` + +# Load Data + + +Load data that was preprocessed in a previous section. + + + +```matlab:Code +load("TrainingFeatures.mat"); +load("ValidationFeatures.mat"); +load("ValidationResults.mat"); +load("Metrics.mat"); +reloadDatastore; +``` + +# Deploying to Embedded System +## Create Functions to Process Data in a Streaming Loop + + +Once we have a trained network with satisfactory performance, it may be desirable to apply the network to test data in a streaming fashion. + + + + +There are many additional considerations that must be taken into account to make the system work in real world embedded system. + + + + +For example, + + + + - The rate or interval at which classification can be performed with accurate results + - The size of the network in terms of generated code (program memory) and weights (data memory) + - The efficiency of the network in terms of computation speed + + + +In MATLAB, we can mimic how the network will be deployed when used in hardware on a real embedded system and begin to answer these important questions. + + +## Streaming Feature Extraction + + +First, we will create a new function that does the feature extraction step in a streaming fashion. It will accept one frame of data and output the features for that frame. 
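+
+
+As an illustration of what such a per-frame extractor can look like, here is a minimal sketch (the function name and body are assumptions for illustration only, not the shipped implementation; the real function, `extractFeatures.m`, is opened next):
+
+
+
+```matlab:Code
+% Illustrative sketch only -- the project's real implementation lives in
+% extractFeatures.m. A streaming extractor takes one 512-sample frame plus
+% the training statistics M and S and returns one normalized feature
+% vector, matching the spectral descriptors chosen in Part 1.
+function f = extractFeaturesSketch(frame,M,S)
+    persistent aFE
+    if isempty(aFE)
+        aFE = audioFeatureExtractor('SampleRate',16e3, ...
+            'Window',hamming(512,'periodic'),'OverlapLength',0, ...
+            'spectralCentroid',true,'spectralCrest',true, ...
+            'spectralDecrease',true,'spectralEntropy',true, ...
+            'spectralFlatness',true,'spectralKurtosis',true, ...
+            'spectralRolloffPoint',true,'spectralSkewness',true, ...
+            'spectralSlope',true,'spectralSpread',true);
+    end
+    f = ((extract(aFE,frame) - M) ./ S).';
+end
+```
+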
+
+
+```matlab:Code
+edit extractFeatures.m
+```
+
+## Combined Streaming Feature Extraction and Classification
+
+
+Next, create a function that combines the feature extraction and deep learning classification. This is the function that we will generate code for.
+
+
+
+```matlab:Code
+edit streamingClassifier.m
+```
+
+# **Test Streaming Loop**
+
+
+Next, we test the combined feature extraction and classification function in a streaming loop.
+
+
+
+
+We will stream audio one frame at a time. This represents the system as it would be deployed in a real-time embedded system. We can measure the timing and accuracy of the streaming implementation.
+
+
+
+
+We'll stream in an amount of data equivalent to five audio files. At a time interval equal to the length of each file, we'll evaluate the output of the classifier. At the conclusion, we'll verify that the classification results match the non-streaming test we ran above.
+
+
+
+```matlab:Code
+% Build a signal source using N audio files from the validation set
+load('TrainedModel.mat')
+clear functions;
+resetState(airCompNet);
+reset(adsValidation);
+N = 5;
+labels = categories(ads.Labels);
+numLabels = numel(labels);
+
+% Create a dsp.SignalSource so we can read the audio in a streaming fashion
+hopLength = 512;
+audioSource = dsp.SignalSource('SamplesPerFrame',hopLength);
+
+% Label counter variable
+j = 1;
+
+% Pre-allocate array to store results
+streamingResults = categorical(zeros(N,1));
+
+% Create AudioLoopTimer object
+framesPerFile = size(validationFeatures{1},2);
+at = audioexample.AudioLoopTimer(framesPerFile*N,hopLength,16e3);
+
+% BEGIN initialization time measurement
+ticInit(at)
+
+% Setup streaming loop
+while(j < N+1)
+
+    % Read one audio file and put it in the source buffer
+    data = read(adsValidation);
+    release(audioSource);
+    audioSource.Signal = data;
+
+    % Setup feature vector
+    features = zeros(size(validationFeatures{1}));
+
+    % Setup scores vector
+    scores = zeros(numLabels,framesPerFile);
+
+    % Inner loop over frames
+    for i = 1:framesPerFile
+
+        ticLoop(at) % BEGIN loop timing measurement
+
+        % Get a frame of audio data
+        x = audioSource();
+
+        % Apply streaming classifier function and store score
+        [scores(:,i),features(:,i)] = streamingClassifier(x,M,S);
+
+        tocLoop(at) % END loop timing measurement
+    end
+
+    % Store class result for that file
+    [~, result] = max(scores(:,end), [], 1);
+    streamingResults(j) = categorical(labels(result));
+
+    % Plot scores to compare over time
+    classNames = string(airCompNet.Layers(end).Classes);
+    figure;
+    lines = plot(scores'); %#ok<*NASGU>
+    xlim([1 framesPerFile])
+    legend("Class " + classNames,'Location','northwest')
+    xlabel("Time Step")
+    ylabel("Score")
+    str = ["File" j "Prediction Scores Over Time Steps. Predicted Class:" char(streamingResults(j))];
+    title(str);
+
+    j = j + 1;
+end
+```
+
+
+![figure_0.png](Part03_Deployment_images/figure_0.png)
+
+
+![figure_1.png](Part03_Deployment_images/figure_1.png)
+
+
+![figure_2.png](Part03_Deployment_images/figure_2.png)
+
+
+![figure_3.png](Part03_Deployment_images/figure_3.png)
+
+
+![figure_4.png](Part03_Deployment_images/figure_4.png)
+
+
+### Measure Accuracy of Streaming Test
+
+
+We can now compare the results from the streaming version of the classifier with the non-streaming results. They should be identical. 
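+
+
+As a quick visual check, the two sets of predictions can also be listed side by side before computing the error rate below (a minimal sketch):
+
+
+
+```matlab:Code
+% Optional sketch: streaming vs. non-streaming predictions for the N files
+% processed above, shown side by side.
+table(streamingResults,validationResults(1:N), ...
+    'VariableNames',{'Streaming','NonStreaming'})
+```
+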
+ + + +```matlab:Code +testError = mean(validationResults(1:N) ~= streamingResults); +disp("Error between streaming classifier and non-streaming: " + testError*100 + "%") +``` + + +```text:Output +Error between streaming classifier and non-streaming: 0% +``` + +## Generate Code for ARM Cortex-A using MATLAB Coder Support Package + + +Now we can create a code generation configuration for our function. We will generate code using the ARM Compute Library to produce code that can be deployed to a Raspberry Pi or another type of ARM Cortex-A device. + + + + +Requires MATLAB Coder Interface for Deep Learning Libraries Support Package. + + + + +Create a configuration object for a library: + + + +```matlab:Code +if doCodeGeneration + cfg = coder.config('lib'); + cfg.GenCodeOnly = true; + cfg.GenerateMakefile = false; + cfg.TargetLang = 'C++'; + dlcfg = coder.DeepLearningConfig('arm-compute'); + dlcfg.ArmArchitecture = 'armv7'; + dlcfg.ArmComputeVersion = '19.02'; + cfg.DeepLearningConfig = dlcfg; + cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM Cortex'; +``` + + + +Generate code: + + + +```matlab:Code + codegen -config cfg streamingClassifier -args {single(ones(256,1)),single(ones(1,10)),single(ones(1,10))} -d arm_compute -report + OpenCodegenReport +else + OpenCodegenReport +end +``` + + +```text:Output +Warning: Removed 'D:\work\aircompressorclassification\arm_compute\interface' from the MATLAB path for this MATLAB session. + See 'doc path' for more information. +Warning: Removed 'D:\work\aircompressorclassification\arm_compute\examples' from the MATLAB path for this MATLAB session. + See 'doc path' for more information. +Warning: Removed 'D:\work\aircompressorclassification\arm_compute\html' from the MATLAB path for this MATLAB session. + See 'doc path' for more information. +Code generation successful: View report +``` + diff --git a/markdown_view/Part03_Deployment_images/figure_0.png b/markdown_view/Part03_Deployment_images/figure_0.png new file mode 100644 index 0000000..0a6e11a Binary files /dev/null and b/markdown_view/Part03_Deployment_images/figure_0.png differ diff --git a/markdown_view/Part03_Deployment_images/figure_1.png b/markdown_view/Part03_Deployment_images/figure_1.png new file mode 100644 index 0000000..f2544f3 Binary files /dev/null and b/markdown_view/Part03_Deployment_images/figure_1.png differ diff --git a/markdown_view/Part03_Deployment_images/figure_2.png b/markdown_view/Part03_Deployment_images/figure_2.png new file mode 100644 index 0000000..25ab8ed Binary files /dev/null and b/markdown_view/Part03_Deployment_images/figure_2.png differ diff --git a/markdown_view/Part03_Deployment_images/figure_3.png b/markdown_view/Part03_Deployment_images/figure_3.png new file mode 100644 index 0000000..1843a38 Binary files /dev/null and b/markdown_view/Part03_Deployment_images/figure_3.png differ diff --git a/markdown_view/Part03_Deployment_images/figure_4.png b/markdown_view/Part03_Deployment_images/figure_4.png new file mode 100644 index 0000000..b0c0bad Binary files /dev/null and b/markdown_view/Part03_Deployment_images/figure_4.png differ diff --git a/markdown_view/Part03_Deployment_images/image_0.png b/markdown_view/Part03_Deployment_images/image_0.png new file mode 100644 index 0000000..78cc380 Binary files /dev/null and b/markdown_view/Part03_Deployment_images/image_0.png differ