Audio Feature based Monotone Detection and Affect Analysis for Teachers
Journal
2022 IEEE Region 10 Symposium, TENSYMP 2022
Date Issued
2022-01-01
Author(s)
T S, Ashwin
Rajendran, Ramkumar
Abstract
Dependence on e-learning platforms has intensified in the teaching-learning process in the current technological era and post-pandemic situation. MOOCs and other online environments play a vital role in delivering knowledge to learners through video lectures and presentations. These lectures must be highly effective, since the communication is one-way most of the time. Monotony is defined as a lack of variation in tone or pitch, which significantly harms the teaching-learning process: it becomes challenging to hold the listeners' interest. In this study, we tested speech audio with features from three domains (time, frequency, and time-frequency): amplitude envelope, zero-crossing rate, root-mean-square energy, Mel spectrograms, and MFCCs. The feature values were compared against proposed thresholds to obtain visualisation images, and based on this comparison the monotone teachers were identified. Annotators further manually verified these results. To the best of our knowledge, no existing work deals with affect analysis of monotone speakers. The monotone-detection results were used for further testing of the audio data to classify speech emotions using the trained weights of the YAMNet and TRILL deep learning architectures, trained on the RAVDESS, IEMOCAP, and PsychExp databases. We observed that the non-monotone teachers' speech was classified with positive emotions such as happiness and joy, whereas the monotone teachers' speech carried more negative emotions such as sadness, disgust, fear, and anger.
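The abstract's threshold-based approach can be illustrated with a minimal sketch. The paper uses five features and manually verified thresholds; the sketch below covers only one time-domain feature, root-mean-square energy, and the specific frame sizes and the coefficient-of-variation threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def rms_energy(frame):
    # Root-mean-square energy of one audio frame.
    return np.sqrt(np.mean(frame ** 2))

def is_monotone(signal, frame_len=2048, hop=512, cv_threshold=0.2):
    # Frame the signal, then measure how much the per-frame RMS energy
    # varies. Low variation across frames suggests monotone delivery.
    # cv_threshold is a hypothetical value, not the paper's threshold.
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len, hop)]
    rms = np.array([rms_energy(f) for f in frames])
    # Coefficient of variation: std normalised by mean energy.
    cv = rms.std() / (rms.mean() + 1e-9)
    return bool(cv < cv_threshold)

# Synthetic check: a flat 220 Hz tone reads as monotone, while the same
# tone with a slow amplitude modulation does not.
sr = 16000
t = np.linspace(0, 2.0, 2 * sr, endpoint=False)
flat = np.sin(2 * np.pi * 220 * t)
varied = np.sin(2 * np.pi * 220 * t) * (0.2 + np.abs(np.sin(2 * np.pi * t)))
```

In a real pipeline the same coefficient-of-variation test would be applied per feature (e.g. to framewise zero-crossing rate or MFCC trajectories), with thresholds validated against human annotation as described in the abstract.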
Subjects