Introduction: Arduino Music Notes Detector
Detecting musical notes from an audio signal is difficult, especially on an Arduino with its limited memory and processing power. A note is generally not a pure sine wave, which complicates detection: the frequency transform of an instrument's sound usually contains multiple harmonics of the note being played, and every instrument has its own signature combination of harmonics. In this code, I tried to write a program that covers as many instruments as possible. You may refer to the attached video, in which I tested various instruments, various tones generated by a keyboard, and even vocals. Detection accuracy varies from instrument to instrument. For some instruments (e.g. piano) it is accurate within a limited range (200-500 Hz), while for others (e.g. harmonica) the accuracy is low.
This code makes use of a previously developed FFT code called EasyFFT.
The video above demonstrates the code with various instruments as well as vocals.
Supplies
- Arduino Nano/Uno or above
- Microphone module for Arduino
Step 1: Algorithm for Note Detection
As mentioned in the previous step, the detection is difficult due to the presence of multiple frequencies in the audio samples.
The program works in the following flow:
1. Data acquisition:
- This section takes 128 samples of audio data. The spacing between two samples (which sets the sampling frequency) depends on the frequency range of interest. In this case, the time between two samples is used to apply the Hann window function and to accumulate the values for the average and RMS amplitude calculation. The code also does a rough zero shift by subtracting 500 from the analogRead value. This value works well for a typical case and can be changed if required. Further, some delay is added to bring the sampling frequency to around 1200 Hz. With a 1200 Hz sampling frequency, frequencies up to 600 Hz can be detected.
for(int i=0;i<128;i++)
{
  a=analogRead(Mic_pin)-500;              // rough zero shift
  sum1=sum1+a;                            // for the average value
  sum2=sum2+a*a;                          // for the RMS value
  a=a*(sin(i*3.14/128)*sin(i*3.14/128));  // Hann window
  in[i]=4*a;                              // scaling for float to int conversion
  delayMicroseconds(195);                 // based on operation frequency range
}
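To see what the windowing line in the loop does, the sketch below isolates the weight sin^2(pi*i/128) that is applied to each sample. The names `hannWeight` and `kSamples` are hypothetical helpers introduced only for illustration; they are not part of the original code. It is written as plain C++ so it can be tried off the board.

```cpp
#include <math.h>

const int kSamples = 128;  // window length used in this tutorial

// Weight applied to sample i: sin^2(pi * i / N), which is the Hann window.
// hannWeight is an illustrative helper, not a function from the original sketch.
float hannWeight(int i) {
  float s = sin(i * 3.14159265 / kSamples);
  return s * s;
}
```

The weight is 0 at the first sample, rises to about 1 in the middle of the block (i = 64), and falls back towards 0 at the end, so the edges of each 128-sample block contribute very little to the FFT.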
2. FFT:
Once the data is ready, the FFT is performed using EasyFFT. The EasyFFT function is modified to fix the FFT size at 128 samples and to reduce memory consumption. The original EasyFFT function is designed to handle up to 1024 samples (with a compatible board), while we only need 128. This code reduces memory consumption by around 20% compared to the original EasyFFT function.
Once the FFT is done, the code returns the 5 most dominant frequency peaks for further analysis. These frequencies are arranged in descending order of amplitude.
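The peak selection described above could be sketched as follows. This is not EasyFFT's actual implementation, only one plausible way to pick the strongest local maxima from a magnitude spectrum; `topPeaks`, `kBins`, and `kPeaks` are hypothetical names, and the spectrum is assumed to be the first 64 magnitude bins of a 128-point FFT.

```cpp
const int kBins = 64;   // assumed: usable magnitude bins of a 128-point FFT
const int kPeaks = 5;   // number of peaks kept for further analysis

// Fills outHz with up to kPeaks local maxima of mag[], strongest first,
// and returns how many peaks were found. Illustrative helper only.
int topPeaks(const float mag[kBins], float samplingHz, float outHz[kPeaks]) {
  float outAmp[kPeaks];
  int count = 0;
  for (int b = 1; b < kBins - 1; b++) {
    // keep only bins that are strictly higher than both neighbours
    if (mag[b] <= mag[b - 1] || mag[b] <= mag[b + 1]) continue;
    // find the slot that keeps the list sorted by descending amplitude
    int pos = count;
    while (pos > 0 && outAmp[pos - 1] < mag[b]) pos--;
    if (pos >= kPeaks) continue;  // weaker than all peaks already kept
    int last = (count < kPeaks) ? count : kPeaks - 1;
    for (int k = last; k > pos; k--) {
      outAmp[k] = outAmp[k - 1];
      outHz[k] = outHz[k - 1];
    }
    outAmp[pos] = mag[b];
    outHz[pos] = b * samplingHz / 128.0f;  // bin index to frequency in Hz
    if (count < kPeaks) count++;
  }
  return count;
}
```

Each bin index is converted to a frequency by multiplying it with the bin width (sampling frequency divided by 128), so the returned list holds frequencies in Hz, ordered from the strongest peak to the weakest.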
3. For every peak, the code determines the possible note associated with it. This code only scans up to 1200 Hz. The detected note is not necessarily the one at the frequency with the maximum amplitude.
All frequencies are mapped to values between 0 and 255. First the octave is detected: for example, 65.4 Hz to 130.8 Hz is one octave and 130.8 Hz to 261.6 Hz is another. Within every octave, frequencies are mapped from 0 to 255, starting at C and ending at C'.
if(f_peaks[i]>1040){f_peaks[i]=0;}
// else-if chain, so a value already mapped to 0-255 is not mapped a second time
if(f_peaks[i]>=65.4 && f_peaks[i]<=130.8) {f_peaks[i]=255*((f_peaks[i]/65.4)-1);}
else if(f_peaks[i]>=130.8 && f_peaks[i]<=261.6) {f_peaks[i]=255*((f_peaks[i]/130.8)-1);}
else if(f_peaks[i]>=261.6 && f_peaks[i]<=523.25){f_peaks[i]=255*((f_peaks[i]/261.6)-1);}
else if(f_peaks[i]>=523.25 && f_peaks[i]<=1046) {f_peaks[i]=255*((f_peaks[i]/523.25)-1);}
else if(f_peaks[i]>=1046 && f_peaks[i]<=2093) {f_peaks[i]=255*((f_peaks[i]/1046)-1);}
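The same octave folding can also be written as a loop instead of one branch per octave. The sketch below is an alternative formulation, not the author's code: `octavePosition` is a hypothetical name, and it assumes each octave starts at exactly twice the previous one (so the fourth octave starts at 523.2 Hz rather than the 523.25 Hz used above).

```cpp
// Maps a frequency to its position inside its octave on a 0..255 scale,
// where 0 is C and 255 is the C one octave higher.
// Returns -1 for frequencies outside the 65.4..1040 Hz range handled here.
// octavePosition is an illustrative helper, not part of the original code.
int octavePosition(float hz) {
  if (hz < 65.4f || hz > 1040.0f) return -1;
  float base = 65.4f;                         // lowest C considered
  while (hz >= base * 2.0f) base *= 2.0f;     // climb to the octave containing hz
  return (int)(255.0f * (hz / base - 1.0f));  // same formula as the branches above
}
```

Because only the ratio to the octave's starting frequency matters, the same note played one octave higher lands on the same 0..255 value.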
The values in the NoteV array are then used to assign a note to each mapped frequency.
byte NoteV[13]={8,23,40,57,76,96,116,138,162,187,213,241,255};
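One way to read the NoteV table is as the upper boundary of each note on the 0..255 scale: a mapped value belongs to the first entry that is greater than or equal to it. The sketch below assumes that interpretation, which the text above does not spell out. `noteIndex`, `noteName`, and `kNoteNames` are hypothetical names, and `unsigned char` stands in for Arduino's `byte` so the code compiles off the board.

```cpp
// Same values as the NoteV array above.
const unsigned char NoteV[13] = {8, 23, 40, 57, 76, 96, 116, 138, 162, 187, 213, 241, 255};

// Note numbering used in this tutorial: C=0, C#=1, D=2, D#=3, and so on.
const char *const kNoteNames[12] = {"C", "C#", "D", "D#", "E", "F",
                                    "F#", "G", "G#", "A", "A#", "B"};

// Returns the note number (0..11) for a 0..255 octave position, or -1 if out of range.
// Assumption: each NoteV entry is the upper boundary of one note.
int noteIndex(int position) {
  if (position < 0) return -1;
  for (int k = 0; k < 13; k++) {
    if (position <= NoteV[k]) return k % 12;  // entry 12 is the C of the next octave
  }
  return -1;  // position above 255
}

// Returns the printable name for a note number, or "?" if it is not 0..11.
const char *noteName(int index) {
  return (index >= 0 && index < 12) ? kNoteNames[index] : "?";
}
```

For example, 440 Hz maps to about 173 on the 0..255 scale of the octave starting at 261.6 Hz, which falls between the boundaries 162 and 187 and therefore gives note number 9 (A).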
4. After calculating the note for every frequency, it may be the case that multiple frequencies suggest the same note. To give an accurate output, the code also considers these repetitions. It adds up the values for each note based on amplitude order and repetitions, and picks the note with the maximum total.
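The selection in step 4 can be pictured as a small voting scheme. The sketch below is only an illustration under assumed weights (5 for the strongest peak down to 1 for the weakest); the original code may weight the peaks differently. `pickNote` is a hypothetical name.

```cpp
// notes[i] is the note number (0..11) suggested by the i-th strongest peak,
// or -1 if that peak was discarded.
// Returns the note with the highest score, or -1 if no peak voted.
// The weights 5,4,3,2,1 are an assumption made for this illustration.
int pickNote(const int notes[5]) {
  int score[12] = {0};
  for (int i = 0; i < 5; i++) {
    if (notes[i] >= 0 && notes[i] < 12) {
      score[notes[i]] += 5 - i;  // stronger peaks count more, repeats add up
    }
  }
  int best = -1;
  int bestScore = 0;
  for (int n = 0; n < 12; n++) {
    if (score[n] > bestScore) {
      bestScore = score[n];
      best = n;
    }
  }
  return best;
}
```

With the votes {E, A, A, none, none}, E scores 5 while A scores 4 + 3 = 7, so A is picked even though the strongest single peak suggested E. This is how a repeated note can outweigh the peak with the maximum amplitude.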
Step 2: Application
Using the code is straightforward; however, there are several limitations to keep in mind while using it. The code can be copied as-is and used for note detection. The points below need to be considered while using it.
1. Pin Assignment:
The pin assignment needs to be modified based on where the microphone is attached. For my experiment, I kept it on analog pin 7.
void setup() {
  Serial.begin(250000);
  Mic_pin = A7;
}
2. Microphone sensitivity:
Microphone sensitivity needs to be adjusted so that a waveform with good amplitude is generated. Most microphone modules come with a sensitivity setting. Select a sensitivity such that the signal is neither too small nor clipped due to excessive amplitude.
3. Amplitude threshold:
This code activates only if the signal amplitude is high enough. This threshold needs to be set manually by the user. Its value depends on the microphone sensitivity as well as the application.
if(sum2-sum1>5){ . .
In the above code, sum2 gives the RMS value while sum1 gives the mean value, so the difference between these two values gives the amplitude of the sound signal. In my case, it works properly with a threshold of around 5.
4. By default, this code prints the detected note. However, if you plan to use the note for some other purpose, use the number assigned to it directly: for example C=0, C#=1, D=2, D#=3, and so on.
5. If the instrument has higher frequencies, the code may give false output. The maximum frequency is limited by the sampling frequency, so you may experiment with the delay value below to get the optimum output. The code below uses a delay of 195 microseconds, which may be tweaked. This will affect the overall execution time.
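The amplitude check can be sketched as a standalone function. It assumes that the mean is the sum of the zero-shifted samples divided by the sample count and that the RMS is the square root of the mean of their squares; the snippet above does not show these steps, so treat them as assumptions. `amplitudeEstimate` is a hypothetical name.

```cpp
#include <math.h>

// Returns RMS minus mean of the zero-shifted samples, the quantity that is
// compared against the threshold. Illustrative helper, not the original code.
// Assumes count is greater than zero.
float amplitudeEstimate(const int *samples, int count, int zeroOffset) {
  float sum1 = 0.0f;  // running sum for the mean
  float sum2 = 0.0f;  // running sum of squares for the RMS
  for (int i = 0; i < count; i++) {
    float a = (float)(samples[i] - zeroOffset);  // rough zero shift
    sum1 += a;
    sum2 += a * a;
  }
  float mean = sum1 / count;
  float rms = sqrt(sum2 / count);
  return rms - mean;
}
```

A silent input that sits at the zero offset gives 0, while a signal swinging 20 counts to either side of the offset gives 20, which would pass a threshold of 5.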
{
  a=analogRead(Mic_pin)-500;              // rough zero shift
  sum1=sum1+a;                            // for the average value
  sum2=sum2+a*a;                          // for the RMS value
  a=a*(sin(i*3.14/128)*sin(i*3.14/128));  // Hann window
  in[i]=4*a;                              // scaling for float to int conversion
  delayMicroseconds(195);                 // based on operation frequency range
}
6. This code only works up to a frequency of 2000 Hz. By eliminating the delay between samples, a sampling frequency of around 3-4 kHz can be obtained.
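When tweaking the delay, it can help to measure the sampling frequency actually achieved rather than estimate it. The sketch below is a minimal example of the arithmetic; `samplingHz` and `maxDetectableHz` are hypothetical helper names. On the board, the elapsed time could be obtained by calling Arduino's `micros()` before and after the 128-sample loop and passing the difference.

```cpp
// Sampling frequency in Hz, given how long a block of samples took to acquire.
// Returns 0 if the elapsed time is zero. Illustrative helper only.
float samplingHz(unsigned long elapsedMicros, int sampleCount) {
  if (elapsedMicros == 0) return 0.0f;
  return sampleCount * 1000000.0f / elapsedMicros;
}

// Highest frequency that can be detected at a given sampling frequency
// (half the sampling frequency).
float maxDetectableHz(float fs) {
  return fs / 2.0f;
}
```

For example, if 128 samples take 100000 microseconds, the sampling frequency is 1280 Hz and frequencies up to 640 Hz can be detected. A shorter delay raises both numbers.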
Precautions:
- As mentioned in the EasyFFT tutorial, the FFT uses a large share of the Arduino's memory. If your program also needs to store other values, it is recommended to use a board with more memory.
- This code may work well for one instrument or vocalist and badly for another. Accurate real-time detection is not possible due to computational limitations.
Step 3: Summary
Note detection is computationally intensive work, and getting real-time output is very difficult, especially on an Arduino. This code gives around 6.6 samples per second (with the 195 microsecond delay added). It works well with the piano and some other instruments.
I hope this code and tutorial are helpful in your music-related projects. In case of any doubt or suggestion, feel free to comment or message.
In an upcoming tutorial, I will modify this code for music chord detection, so stay tuned.