Improving Multi-Modal Emotion Recognition using Entropy-based Fusion and Pruning-based Network Architecture Optimization
Improving Speaker-Independent Speech Emotion Recognition using Dynamic Joint Distribution Adaptation
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition
Generalization of Self-Supervised Learning-based Representations for Cross-Domain Speech Emotion Recognition
Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer
Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting
CLAP4Emo: ChatGPT-Assisted Speech Emotion Retrieval with Natural Language Supervision
EMOCONV-Diff: Diffusion-based Speech Emotion Conversion for Non-Parallel and in-the-Wild Data
Large Language Model-based Emotional Speech Annotation using Context and Acoustic Feature for Speech Emotion Recognition
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
Customising General Large Language Models for Specialised Emotion Recognition Tasks
RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion Recognition
Zero Shot Audio to Audio Emotion Transfer with Speaker Disentanglement
TRUST-SER: On the Trustworthiness of Fine-Tuning Pre-Trained Speech Embeddings for Speech Emotion Recognition
STYLECAP: Automatic Speaking-Style Captioning from Speech based on Speech and Language Self-Supervised Learning Models
Frame-Level Emotional State Alignment Method for Speech Emotion Recognition
Gradient-based Dimensionality Reduction for Speech Emotion Recognition using Deep Networks
Disentanglement Network: Disentangle the Emotional Features from Acoustic Features for Speech Emotion Recognition
Balancing Speaker-Rater Fairness for Gender-Neutral Speech Emotion Recognition
Prompting Audios using Acoustic Properties for Emotion Representation
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
A Robust Pitch-Fusion Model for Speech Emotion Recognition in Tonal Languages
Modeling Intrapersonal and Interpersonal Influences for Automatic Estimation of Therapist Empathy in Counseling Conversation
Towards Improving Speech Emotion Recognition using Synthetic Data Augmentation from Emotion Conversion
Emohrnet: High-Resolution Neural Network based Speech Emotion Recognition
Fine-Grained Disentangled Representation Learning for Multimodal Emotion Recognition
Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis
Adaptive Speech Emotion Representation Learning based on Dynamic Graph
Enhancing Two-Stage Finetuning for Speech Emotion Recognition using Adapters
Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition
Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
Dynamic Speech Emotion Recognition using a Conditional Neural Process
MS-SENet: Enhancing Speech Emotion Recognition through Multi-Scale Feature Fusion with Squeeze-and-Excitation Blocks
GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition
Multi-Source Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition
Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion Recognition
Multi-Modal Emotion Recognition using Multiple Acoustic Features and Dual Cross-Modal Transformer
Speech Relationship Learning for Cross-Corpus Speech Emotion Recognition
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation
MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction
Improving Domain Generalization in Speech Emotion Recognition with Whisper
Comparing Data-Driven and Handcrafted Features for Dimensional Emotion Recognition
Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations
MCM-CSD: Multi-Granularity Context Modeling with Contrastive Speaker Detection for Emotion Recognition in Real-Time Conversation