Modality-based Factorization for Multimodal Fusion

Elham J. Barezi, Pascale Fung


Abstract
We propose a novel method, Modality-based Redundancy Reduction Fusion (MRRF), for understanding and modulating the relative contribution of each modality in multimodal inference tasks. This is achieved by obtaining an (M+1)-way tensor that captures the high-order relationships between M modalities and the output layer of a neural network model. Applying a modality-based tensor factorization method, which adopts different factors for different modalities, removes information present in one modality that can be compensated for by other modalities, with respect to the model outputs. This helps to understand the relative utility of the information in each modality. In addition, it leads to a less complicated model with fewer parameters, and it can therefore act as a regularizer that helps avoid overfitting. We have applied this method to three different multimodal datasets in sentiment analysis, personality trait recognition, and emotion recognition. We are able to recognize the relationships and relative importance of different modalities in these tasks, and we achieve a 1% to 4% improvement on several evaluation measures compared to the state-of-the-art for all three tasks.
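
The factorization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Tucker-style factorization (one factor matrix per modality plus a small core tensor), and every name, dimension, and rank below is hypothetical and chosen only for illustration. The sketch checks that fusing through the factors gives the same output as fusing through the full (M+1)-way weight tensor they reconstruct, and compares the parameter counts.

```python
import numpy as np


def fuse(weights, inputs):
    """Contract a weight tensor of shape (d_1, ..., d_M, d_out) with M vectors."""
    out = weights
    for x in inputs:
        # Each step consumes the leading axis, i.e. one modality.
        out = np.tensordot(x, out, axes=([0], [0]))
    return out  # shape (d_out,)


def fuse_factorized(core, factors, out_factor, inputs):
    """Fuse via per-modality factors (assumed Tucker-style, hypothetical API)."""
    # Project each modality into its own low-rank space: (d_m,) -> (r_m,).
    projected = [U.T @ x for U, x in zip(factors, inputs)]
    hidden = fuse(core, projected)  # shape (r_out,)
    return out_factor @ hidden      # shape (d_out,)


def reconstruct(core, factors, out_factor):
    """Rebuild the full (M+1)-way weight tensor from core and factors."""
    W = core
    for U in factors:
        # Contract the leading rank axis with U; the new d_m axis goes last.
        W = np.tensordot(W, U, axes=([0], [1]))
    # The remaining leading axis is r_out; contract it with the output factor.
    return np.tensordot(W, out_factor, axes=([0], [1]))


# Illustrative sizes only: three modalities and one output layer.
rng = np.random.default_rng(0)
dims, ranks, d_out, r_out = (6, 5, 4), (3, 2, 2), 3, 2

core = rng.normal(size=ranks + (r_out,))
factors = [rng.normal(size=(d, r)) for d, r in zip(dims, ranks)]
out_factor = rng.normal(size=(d_out, r_out))
inputs = [rng.normal(size=d) for d in dims]

y_fact = fuse_factorized(core, factors, out_factor, inputs)
y_full = fuse(reconstruct(core, factors, out_factor), inputs)

n_full = int(np.prod(dims)) * d_out
n_fact = core.size + sum(U.size for U in factors) + out_factor.size
print(np.allclose(y_fact, y_full), n_full, n_fact)
```

Because each modality has its own rank in this sketch, shrinking one rank compresses only that modality, which is one way to read the abstract's "different factors for different modalities".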
Anthology ID:
W19-4331
Volume:
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
260–269
URL:
https://aclanthology.org/W19-4331
DOI:
10.18653/v1/W19-4331
Cite (ACL):
Elham J. Barezi and Pascale Fung. 2019. Modality-based Factorization for Multimodal Fusion. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 260–269, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Modality-based Factorization for Multimodal Fusion (Barezi & Fung, RepL4NLP 2019)
PDF:
https://aclanthology.org/W19-4331.pdf
Data
IEMOCAP