What?
- Feature extraction from raw data is an important step in understanding the relationships present in it.
- Feature extraction and analysis should be done before building any machine learning model. Structured data is easier to analyze than unstructured data such as audio.
- Audio data cannot be understood directly with ordinary media tools. This article explains how to extract features from audio data and how to interpret them.
Why?
- With audio feature extraction, we can perform machine learning tasks such as audio classification, audio recommendation and prediction.
- For example, in the thesis "Feature Selection and Analysis for Standard Machine Learning Classification of Audio Beehive Samples", audio features of bee buzzing sounds are used to determine the health of a hive.
How?
- This article explains how to extract features of audio using an open-source Python Library called pyAudioAnalysis.
- pyAudioAnalysis performs audio feature extraction in two stages:
  - Short-term feature extraction: This splits the input signal into short-term windows (frames) and computes a number of features for each frame. This process leads to a sequence of short-term feature vectors for the whole signal.
  - Mid-term feature extraction: This extracts a number of statistics (e.g. mean and standard deviation) over each short-term feature sequence.
- pyAudioAnalysis is licensed under the Apache License and is available on GitHub (pyAudioAnalysis).
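The two stages described above can be sketched in plain Python. This is only an illustration written from scratch, not pyAudioAnalysis code: the helper names (`split_into_frames`, `short_term_energy`, `mid_term_stats`) and the toy signal are assumptions, and a single mid-term window covering the whole signal is used for simplicity.

```python
import math

def split_into_frames(signal, window, step):
    """Short-term stage: overlapping frames of `window` samples, advancing by `step`."""
    frames = []
    start = 0
    while start + window <= len(signal):
        frames.append(signal[start:start + window])
        start += step
    return frames

def short_term_energy(frame):
    """One illustrative short-term feature: mean squared amplitude of a frame."""
    return sum(s * s for s in frame) / len(frame)

def mid_term_stats(values):
    """Mid-term stage: mean and population standard deviation of a feature sequence."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

# Toy signal (hypothetical): quiet first half, louder second half
signal = [0.1, -0.1] * 4 + [0.5, -0.5] * 4
frames = split_into_frames(signal, window=4, step=2)
energies = [short_term_energy(f) for f in frames]   # one value per frame
mean_energy, std_energy = mid_term_stats(energies)  # summary of the sequence
print(len(frames), mean_energy, std_energy)
```

The short-term stage yields one feature value per frame, and the mid-term stage condenses that sequence into a few statistics.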
Software Required:
- Python 3.6
Network Requirements
- Internet to download packages
Implementation
- pyAudioAnalysis has a feature_extraction() function that extracts a total of 64 short-term features:
- 34 short-term features
- 30 delta features
Reference
Install pyAudioAnalysis
git clone https://github.com/tyiannak/pyAudioAnalysis.git
cd pyAudioAnalysis
pip install -e .
Troubleshooting
- Error : ImportError: failed to find libmagic. Check your installation
- Solution : pip install python-magic-bin==0.4.14
- Issue resolved link
Short-term Feature Extraction
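from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
import matplotlib.pyplot as plt

# Read the sampling rate (Fs) and the samples (x) of the audio file
Fs, x = audioBasicIO.read_audio_file("data/doremi.wav")
# 50 ms window with a 25 ms step, both given in samples.
# F holds one row per feature and one column per frame; f_names holds the feature names.
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)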
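The window and step arguments in the call above are given in samples, which is why the durations are multiplied by Fs. The sketch below works through that arithmetic. The `frame_layout` helper and the 16 kHz rate are hypothetical, and the frame count assumes that only frames fitting entirely inside the signal are kept, so the library's exact count may differ.

```python
def frame_layout(sample_rate, window_sec, step_sec, num_samples):
    """Convert window and step from seconds to samples and count the full frames."""
    window = round(window_sec * sample_rate)
    step = round(step_sec * sample_rate)
    if num_samples < window:
        return window, step, 0
    # Assumption: a frame is kept only if it fits entirely inside the signal
    num_frames = 1 + (num_samples - window) // step
    return window, step, num_frames

# One second of audio at a hypothetical 16 kHz sampling rate
window, step, num_frames = frame_layout(16000, 0.050, 0.025, 16000)
print(window, step, num_frames)
```

With a step of half the window length, consecutive frames overlap by 50%.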
Short-term Feature (f_names[0]) ID 1 - ZCR
- ZCR (Zero Crossing Rate) is the rate of sign changes of the signal within a particular frame.
- In other words, it is the rate at which the signal changes from positive to zero to negative, or from negative to zero to positive.
- Low ZCR values correspond to lower-frequency portions of the signal, and high values to higher-frequency portions.
- Zero crossing rate features are used to distinguish noise, silence and speech. One application is Voice Activity Detection (VAD), i.e., determining whether human speech is present in an audio segment.
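As a rough illustration of the definition above, the zero crossing rate of a single frame can be computed as follows. This is a from-scratch sketch, not the library's source: `zero_crossing_rate` and the two toy frames are assumptions. A passage through zero counts as half a crossing on each side, so positive to zero to negative counts as one crossing.

```python
def zero_crossing_rate(frame):
    """Number of sign changes in the frame divided by the number of sample pairs."""
    signs = [(v > 0) - (v < 0) for v in frame]  # +1, 0 or -1 per sample
    # Each full sign flip contributes |difference| = 2, hence the division by 2
    crossings = sum(abs(b - a) for a, b in zip(signs, signs[1:])) / 2
    return crossings / (len(frame) - 1)

slow_frame = [1, 1, 1, 1, -1, -1, -1, -1]   # changes sign once (low frequency)
fast_frame = [1, -1] * 4                    # changes sign at every sample (high frequency)
print(zero_crossing_rate(slow_frame), zero_crossing_rate(fast_frame))
```

The slowly varying frame produces a much lower value than the rapidly alternating one, matching the link between ZCR and frequency content.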
Reference
Short-term Feature (f_names[1]) ID 2 - Energy
- The energy of an audio signal indicates the strength of the signal. It is the sum of squares of the signal values, normalized by the frame length.
- This feature helps divide the audio signal into four energy-based regions: noise, low, medium and high.
- Each signal can be annotated with these regions, and the annotations can be used by machine learning algorithms.
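A minimal sketch of the energy computation and of the four-region labeling described above. The function names and, in particular, the threshold values are hypothetical and chosen only for illustration. Suitable thresholds depend on the recording and are not specified in this article.

```python
def frame_energy(frame):
    """Sum of squared sample values, normalized by the frame length."""
    return sum(s * s for s in frame) / len(frame)

def energy_region(energy, thresholds=(0.001, 0.01, 0.1)):
    """Label an energy value; the default thresholds are hypothetical."""
    noise_max, low_max, medium_max = thresholds
    if energy < noise_max:
        return "noise"
    if energy < low_max:
        return "low"
    if energy < medium_max:
        return "medium"
    return "high"

# Toy frames (hypothetical amplitudes in the range -1.0 .. 1.0)
toy_frames = [[0.0] * 4, [0.05, -0.05] * 2, [0.2, -0.2] * 2, [0.5, -0.5] * 2]
labels = [energy_region(frame_energy(f)) for f in toy_frames]
print(labels)
```

Labels like these could serve as the per-frame annotations mentioned above.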
Reference
Short-term Feature (f_names[2]) ID 3 - Entropy of Energy
- The Entropy of Energy describes the time-domain distribution of the audio signal's energy. It can also be interpreted as a measure of abrupt changes.
- This feature is used in audio steganalysis.
- Figure: Entropy of Energy Output (Source: iNNovationMerge)
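One common way to compute this feature is to split a frame into sub-blocks, express each block's energy as a fraction of the frame's total energy, and take the Shannon entropy of those fractions. The sketch below follows that idea. The `energy_entropy` helper, the block count and the toy frames are assumptions, not the library's exact implementation.

```python
import math

def energy_entropy(frame, num_blocks=4):
    """Shannon entropy (in bits) of the energy distribution across sub-blocks."""
    block_len = len(frame) // num_blocks
    frame = frame[:block_len * num_blocks]  # drop samples that do not fill a block
    total = sum(s * s for s in frame)
    if total == 0:
        return 0.0
    entropy = 0.0
    for i in range(num_blocks):
        block = frame[i * block_len:(i + 1) * block_len]
        p = sum(s * s for s in block) / total  # this block's share of the energy
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy

steady = [0.5, -0.5] * 4                # energy spread evenly over the frame
burst = [0.0] * 6 + [1.0, -1.0]         # all energy in one abrupt burst
print(energy_entropy(steady), energy_entropy(burst))
```

Evenly spread energy gives the highest entropy, while an abrupt burst gives a low value, which is why the feature can signal sudden changes.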
Reference
Short-term Feature (f_names[3]) ID 4 - Spectral Centroid
- Spectral Centroid indicates where the center of gravity of a sound's spectrum is located. It is computed as the weighted mean of the frequencies present in the audio, with their magnitudes as weights.
- A low centroid value means the spectral energy is concentrated in the low-frequency range.
- If the frequency content of a piece of music stays the same throughout, the spectral centroid stays around the same value.
- This feature is used in texture classification.
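The weighted-mean definition above can be sketched with a naive DFT, which is slow but adequate for a short frame. This is an illustrative computation in Hz, not the library's code, which may normalize the value differently. The helper names, the 8 kHz sampling rate and the test tones are assumptions.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0 .. N/2 (fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean of the bin frequencies, in Hz."""
    mags = magnitude_spectrum(frame)
    freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total

def tone(freq, sample_rate, num_samples):
    """Pure sine tone used as a test signal."""
    return [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(num_samples)]

sample_rate = 8000  # hypothetical sampling rate
low = spectral_centroid(tone(500, sample_rate, 64), sample_rate)
high = spectral_centroid(tone(2000, sample_rate, 64), sample_rate)
print(low, high)
```

For a pure tone the centroid sits at the tone's frequency, so the 500 Hz tone gives a lower centroid than the 2000 Hz tone.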
Reference
Other Features
- Spectral Spread, Spectral Entropy, Spectral Flux, Spectral Rolloff, MFCCs, Chroma Vector and Chroma Deviation features will be covered in the next part.
Clone and Run the project
Demo
64 Audio Features Demo