Electrotechnical and Computer Engineering

Vol. 39 No. 09 (2024): Proceedings of Faculty of Technical Sciences

MULTIMODAL EMOTION RECOGNITION USING COMPRESSED GRAPH NEURAL NETWORKS

  • Тијана Ђуркић
DOI: https://doi.org/10.24867/28BE43Djurkic
Submitted: September 6, 2024
Published: September 6, 2024

Abstract

This paper first describes an algorithm for emotion recognition from audio, text, and video using graph neural networks. It then presents the results of compressing the model, a step needed so that large models can run on smaller devices.
