
AN ANALYSIS ON LEARNING OF VISUAL QUESTION ANSWERING USING MULTI-MEDIA COMPREHENSION ALGORITHM (MMCQA) IN NATURAL LANGUAGE PROCESSING


Dr S Venkata Lakshmi1, M Therasa2, Karthik Elangovan3, S Sharanyaa4

Abstract

Vision-and-language learning has become an active area of in-depth research in Artificial Intelligence (AI). Problems such as Graphical Question Answering (GQA) have recently drawn attention across AI, natural language processing (NLP), and computer vision. Here, we present the task of the Multimedia Machine Modal Comprehension Question Answering Algorithm (MMCQA), which focuses on answering multimodal questions posed over text, figures, and images. The accompanying dataset contains around twelve thousand lessons and more than thirty-six thousand multimodal questions drawn from the science curriculum. Our study shows that a significant portion of the questions requires interpreting text and figures together and performing reasoning over them, indicating that our data are considerably more complex than those of earlier studies and standard question-answering datasets. Finally, we propose a method based on a dual LSTM with both spatial and temporal attention and demonstrate experimentally that it compares favourably with standard GQA methods.
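To illustrate the kind of architecture the abstract refers to, the following is a minimal sketch of a dual-LSTM question-answering model with spatial attention over image-region features and temporal attention over text tokens. All module names, dimensions, and the fusion scheme are illustrative assumptions for exposition only; the paper's MMCQA implementation may differ.

```python
# Minimal sketch (PyTorch): dual LSTMs with spatial attention over image
# regions and temporal attention over text tokens. Dimensions and fusion
# are assumptions, not the authors' exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualLSTMAttentionQA(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512,
                 region_dim=2048, num_answers=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # One LSTM encodes the question, another encodes the lesson text.
        self.question_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.context_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.region_proj = nn.Linear(region_dim, hidden_dim)
        # Attention scorers: spatial (image regions) and temporal (text steps).
        self.spatial_att = nn.Linear(hidden_dim * 2, 1)
        self.temporal_att = nn.Linear(hidden_dim * 2, 1)
        self.classifier = nn.Linear(hidden_dim * 3, num_answers)

    def forward(self, question_ids, context_ids, region_feats):
        # question_ids: (B, Tq), context_ids: (B, Tc),
        # region_feats: (B, R, region_dim) precomputed image-region features.
        _, (q_h, _) = self.question_lstm(self.embed(question_ids))
        q = q_h[-1]                                              # (B, H)
        ctx_out, _ = self.context_lstm(self.embed(context_ids))  # (B, Tc, H)
        regions = self.region_proj(region_feats)                 # (B, R, H)

        # Spatial attention: weight image regions by relevance to the question.
        q_exp = q.unsqueeze(1).expand(-1, regions.size(1), -1)
        s_w = F.softmax(self.spatial_att(torch.cat([regions, q_exp], -1)), dim=1)
        img_vec = (s_w * regions).sum(dim=1)                     # (B, H)

        # Temporal attention: weight text time steps by relevance to the question.
        q_exp = q.unsqueeze(1).expand(-1, ctx_out.size(1), -1)
        t_w = F.softmax(self.temporal_att(torch.cat([ctx_out, q_exp], -1)), dim=1)
        txt_vec = (t_w * ctx_out).sum(dim=1)                     # (B, H)

        # Fuse question, attended text, and attended image to score answers.
        return self.classifier(torch.cat([q, txt_vec, img_vec], dim=-1))
```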
