For feedback, enquiries, questions, or comments, contact us at innovationmerge@gmail.com
What?, Why?, How?
Software Required:
- Python 3.6
Network Requirements
- Internet to download packages
Implementation
- pyAudioAnalysis provides the feature_extraction() function, which by default extracts 68 short-term features per frame.
- 34 short-term features
- 34 delta features (one for each short-term feature)
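To make the idea of a delta feature concrete, here is a minimal sketch that computes frame-to-frame differences for a small, made-up feature matrix. It assumes the delta of the first frame is set to zero; the helper name `delta_features` is hypothetical and is not part of pyAudioAnalysis.

```python
def delta_features(features):
    """Return frame-to-frame differences for a features-by-frames matrix.

    `features` is a list of rows: one row per feature, one column per frame.
    Assumption: the delta of the first frame is 0.
    """
    deltas = []
    for row in features:
        delta_row = [0.0]
        # Each delta is the current frame value minus the previous one
        for prev, cur in zip(row, row[1:]):
            delta_row.append(cur - prev)
        deltas.append(delta_row)
    return deltas


# Made-up values: 2 features observed over 4 frames
base = [
    [1.0, 2.0, 4.0, 7.0],
    [5.0, 5.0, 3.0, 3.0],
]
print(delta_features(base))
```

Stacking the base rows on top of their delta rows gives a matrix with twice as many rows as base features.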
Reference
Install pyAudioAnalysis
git clone https://github.com/tyiannak/pyAudioAnalysis.git
cd pyAudioAnalysis
pip install -r requirements.txt
pip install -e .
Troubleshooting if error
- Error : ImportError: failed to find libmagic. Check your installation
- Solution : pip install python-magic-bin==0.4.14
- Issue resolved link
short-term Feature Extraction
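from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
import matplotlib.pyplot as plt
import cv2

# Fs is the sampling rate in Hz, x holds the audio samples
[Fs, x] = audioBasicIO.read_audio_file("data/doremi.wav")
# Window of 50 ms and step of 25 ms, both converted to samples.
# F is a (features x frames) matrix, f_names lists the feature names.
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)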
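The window and step passed above are expressed in samples, not seconds. The sketch below shows that arithmetic and estimates how many frames to expect. It assumes a frame is produced only while a full window still fits in the signal, which may differ slightly from the library's exact behaviour; `frame_layout` is a hypothetical helper, and the 16 kHz one-second signal is an example, not the properties of doremi.wav.

```python
def frame_layout(fs, n_samples, window_sec=0.050, step_sec=0.025):
    """Convert window/step durations to samples and count full frames."""
    window = int(round(window_sec * fs))
    step = int(round(step_sec * fs))
    if n_samples < window:
        return window, step, 0
    # One frame at position 0, then one more for every full step that fits
    n_frames = (n_samples - window) // step + 1
    return window, step, n_frames


# Example: one second of audio sampled at 16 kHz
window, step, n_frames = frame_layout(fs=16000, n_samples=16000)
print(window, step, n_frames)  # 800 400 39
```

Under these assumptions, the number of columns in F should be close to `n_frames`.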
short-term Feature(f_names[4]) ID 5 - Spectral Spread
- Spectral spread measures the average spread of the spectrum in relation to its centroid.
- It indicates how the energy of the audio signal is distributed around the centroid.
- Large spectral spread: noise-like signals
- Low spectral spread: tonal sounds
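The contrast between tonal and noise-like signals can be sketched on two hand-made magnitude spectra. This is an illustration of the general formula (magnitude-weighted standard deviation around the centroid), not pyAudioAnalysis's own code, which may normalize the values differently. The function name and the bin-to-frequency mapping are assumptions.

```python
import math


def spectral_centroid_and_spread(magnitudes, fs):
    """Centroid and spread (both in Hz) of a one-sided magnitude spectrum."""
    n = len(magnitudes)
    # Assumption: bin k sits at frequency k * (fs / 2) / n
    freqs = [k * (fs / 2) / n for k in range(n)]
    total = sum(magnitudes)
    if total == 0:
        return 0.0, 0.0
    centroid = sum(f * m for f, m in zip(freqs, magnitudes)) / total
    variance = sum(((f - centroid) ** 2) * m
                   for f, m in zip(freqs, magnitudes)) / total
    return centroid, math.sqrt(variance)


fs = 8000
tonal = [0.0] * 64
tonal[16] = 1.0        # all energy in a single bin
noisy = [1.0] * 64     # energy spread evenly over all bins

print(spectral_centroid_and_spread(tonal, fs))
print(spectral_centroid_and_spread(noisy, fs))
```

The single-peak spectrum has a spread of zero, while the flat spectrum has a spread of more than 1 kHz.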
Reference
short-term Feature(f_names[5]) ID 6 - Spectral Entropy
- Spectral entropy measures how uniformly the spectral energy of each frame is distributed. It can also be used to capture distinct spectral peaks.
- The higher the entropy, the more uniform the distribution.
- This feature is used in the study "Spectral entropy indicates electrophysiological and hemodynamic changes in drug-resistant epilepsy".
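As a rough sketch of the idea, the snippet below splits a magnitude spectrum into sub-bands, normalizes the sub-band energies so they sum to 1, and takes their Shannon entropy. The number of sub-bands (8) and the function name are assumptions made for this example; the library's implementation may use different settings.

```python
import math


def spectral_entropy(magnitudes, n_bands=8):
    """Shannon entropy (in bits) of normalized sub-band energies."""
    band_size = len(magnitudes) // n_bands
    energies = [
        sum(m * m for m in magnitudes[b * band_size:(b + 1) * band_size])
        for b in range(n_bands)
    ]
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies]
    # Empty bands contribute nothing to the entropy
    return sum(-p * math.log2(p) for p in probs if p > 0)


flat = [1.0] * 64          # uniform spectrum
peaked = [0.0] * 64
peaked[10] = 1.0           # one distinct spectral peak

print(spectral_entropy(flat))    # 3.0
print(spectral_entropy(peaked))  # 0.0
```

With 8 sub-bands the maximum possible value is log2(8) = 3 bits, reached by the flat spectrum.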
Reference
short-term Feature(f_names[6]) ID 7 - Spectral Flux
- Spectral flux is a measure of the change between the normalized magnitudes of two adjacent frames. It is calculated by comparing the spectrum of one frame against the spectrum of the previous frame.
- This feature is used to distinguish music from speech signals. A higher rate of change indicates music.
- It is used in the study "Speech/Audio Signal Classification Using Spectral Flux Pattern Recognition".
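A minimal sketch of this comparison: each frame's magnitudes are normalized to sum to 1, and the flux is the sum of squared differences between the two normalized spectra. The function name and the toy frames are assumptions for illustration only.

```python
def spectral_flux(magnitudes, prev_magnitudes):
    """Sum of squared differences between two normalized magnitude spectra."""
    total = sum(magnitudes)
    prev_total = sum(prev_magnitudes)
    # Normalizing removes the effect of overall loudness
    cur = [m / total for m in magnitudes] if total > 0 else list(magnitudes)
    prev = ([m / prev_total for m in prev_magnitudes]
            if prev_total > 0 else list(prev_magnitudes))
    return sum((c - p) ** 2 for c, p in zip(cur, prev))


frame_a = [0.0, 4.0, 0.0, 0.0]
frame_b = [0.0, 2.0, 0.0, 0.0]   # same shape as frame_a, only quieter
frame_c = [0.0, 0.0, 0.0, 4.0]   # energy moved to a different bin

print(spectral_flux(frame_b, frame_a))  # 0.0
print(spectral_flux(frame_c, frame_a))  # 2.0
```

Because of the normalization, a pure volume change gives zero flux, while a change in spectral shape does not.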
Reference
short-term Feature(f_names[7]) ID 8 - Spectral Rolloff
- Spectral rolloff is the frequency below which a given percentage of the magnitude distribution of the spectrum is concentrated. pyAudioAnalysis uses 90%; other tools often use 85%.
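The rolloff point can be found by accumulating the spectrum bin by bin until the chosen fraction of the total is reached. The sketch below works on squared magnitudes (energy) and returns a frequency in Hz; both choices, as well as the function name, are assumptions for this example and may not match the library's output scale.

```python
def spectral_rolloff(magnitudes, fs, fraction=0.90):
    """Frequency (Hz) below which `fraction` of the spectral energy lies."""
    n = len(magnitudes)
    energies = [m * m for m in magnitudes]
    threshold = fraction * sum(energies)
    cumulative = 0.0
    for k, e in enumerate(energies):
        cumulative += e
        if cumulative >= threshold:
            # Assumption: bin k sits at frequency k * (fs / 2) / n
            return k * (fs / 2) / n
    return 0.0


flat = [1.0] * 16   # toy spectrum with equal energy in every bin
print(spectral_rolloff(flat, fs=8000))                 # 3500.0
print(spectral_rolloff(flat, fs=8000, fraction=0.85))  # 3250.0
```

Lowering the fraction moves the rolloff point to a lower frequency.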
Reference
short-term Feature(f_names[8]-f_names[20]) ID 9-21 - MFCCs
- Mel-Frequency Cepstral Coefficients (MFCCs) describe the overall shape of the spectral envelope.
- These features are based on the Fourier transform. After taking the Fourier transform of an analysis window, the magnitude spectrum is passed through a Mel filterbank whose bandwidths vary to mimic the human ear, i.e. small bandwidth at low frequencies and large bandwidth at high frequencies.
- The output energy of each filter is log-transformed, and the MFCCs are obtained by taking the Discrete Cosine Transform of these outputs.
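The last two steps (log compression and DCT), together with the growing filter bandwidths, can be sketched in plain Python. This is a simplified illustration: it uses one common mel formula, evenly spaced mel band edges, and an unnormalized DCT-II, all of which are assumptions. pyAudioAnalysis builds its own filterbank and may scale the coefficients differently.

```python
import math


def hz_to_mel(freq_hz):
    """One common mel-scale formula (an assumption; variants exist)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(fs, n_filters):
    """Edge frequencies (Hz) of filters spaced evenly on the mel scale."""
    top = hz_to_mel(fs / 2)
    return [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]


def dct_ii(values):
    """Unnormalized DCT-II."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n)
            for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_filterbank(energies, n_coeffs=13):
    """Log-compress the filterbank energies, then take the DCT."""
    eps = 1e-10  # avoids log(0) for empty filters
    log_energies = [math.log(e + eps) for e in energies]
    return dct_ii(log_energies)[:n_coeffs]


edges = mel_band_edges(fs=16000, n_filters=10)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print([round(w) for w in widths])           # widths grow with frequency
print(mfcc_from_filterbank([2.0] * 20)[:3])
```

For a perfectly flat set of filter energies, only the first coefficient is non-zero (up to rounding), which reflects that the higher coefficients describe the shape of the envelope rather than its level.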
Reference
short-term Feature(f_names[21]-f_names[32]) ID 22-33 - Chroma Vector
- A 12-element representation of the spectral energy, where the bins represent the 12 equal-tempered pitch classes of western-type music (semitone spacing).
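To illustrate how spectral energy ends up in 12 pitch-class bins, the sketch below maps each frequency to its nearest semitone relative to A0 (27.5 Hz) and folds all octaves together. The reference frequency, the bin order starting at A, and the function names are assumptions for this example; the library's binning details may differ.

```python
import math

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def pitch_class(freq_hz, ref_hz=27.5):
    """Index 0..11 of the nearest equal-tempered pitch class (0 = A)."""
    semitones = round(12 * math.log2(freq_hz / ref_hz))
    return semitones % 12   # fold all octaves onto one


def chroma_vector(freqs_hz, magnitudes):
    """Normalized energy per pitch class."""
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f > 0:           # skip the DC bin, log2(0) is undefined
            chroma[pitch_class(f)] += m * m
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma


# Three A notes in different octaves plus one C (about 261.63 Hz)
freqs = [220.0, 440.0, 880.0, 261.63]
mags = [1.0, 1.0, 1.0, 1.0]
chroma = chroma_vector(freqs, mags)
print(dict(zip(PITCH_CLASSES, chroma)))
```

The three A notes land in the same bin regardless of octave, so that bin holds 75% of the energy and the C bin holds the remaining 25%.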
Reference
short-term Feature(f_names[33]) ID 34 - Chroma Deviation
- The standard deviation of the 12 chroma coefficients.
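A small sketch of this computation, assuming the population standard deviation is meant. The two chroma vectors are made up, and `chroma_deviation` is a hypothetical helper name.

```python
import statistics


def chroma_deviation(chroma):
    """Population standard deviation of the 12 chroma coefficients."""
    return statistics.pstdev(chroma)


single_note = [1.0] + [0.0] * 11   # all energy in one pitch class
flat_chroma = [1.0 / 12] * 12      # energy shared equally by all classes

print(chroma_deviation(single_note))
print(chroma_deviation(flat_chroma))
```

A clear single pitch gives a high deviation, while energy spread evenly over all pitch classes gives a deviation of zero.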
Reference
Clone and Run the project
Demo
64 Audio Features Demo