Enabling Cognitive Visual Question Answering

Originally published on August 30, 2018 by SingularityNet.IO

Exploring a hybrid approach to visual question answering through deeper integration of OpenCog and a Vision Subsystem.

Introduction

Let us imagine a scenario in which Sophia, the social humanoid robot, is asked a simple question by someone:

“Sophia, is it raining?”

If Sophia says “yes” to the question, does she know why she gave that answer? In other words, how does Sophia answer the question?

Answering questions about visual scenes, a task known as Visual Question Answering (VQA), comes naturally to humans. However, current state-of-the-art VQA models leave much to be desired.

One of the control systems used to operate Sophia is OpenCog, a cognitive architecture. OpenCog operates over a knowledge base represented as a hypergraph called Atomspace. For Sophia to accurately answer questions about visual scenes, the content of those scenes needs to be made accessible to OpenCog.

In an earlier research article, we noted that the simplest way to achieve this is to process images with a Deep Neural Network (DNN) and insert the resulting image descriptions into Atomspace. One example of such a DNN is YOLO, which describes an image with a set of labeled bounding boxes.
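To make the idea concrete, here is a minimal sketch of how a set of labeled bounding boxes might be turned into Atomese-style facts. It is an illustration under stated assumptions, not the pipeline used for Sophia: the Detection class, the "detected-in" predicate, the box naming scheme and the confidence threshold are all hypothetical, and the code only builds text expressions rather than calling the OpenCog API or a real detector.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One labeled bounding box, as a YOLO-style detector might report it.

    Hypothetical container: the field names are assumptions for this sketch.
    """
    label: str
    x: float
    y: float
    width: float
    height: float
    confidence: float


def detection_to_atomese(image_id: str, index: int, det: Detection) -> str:
    """Render one detection as Atomese-style text (assumed schema).

    The box becomes a node that inherits from its label and is tied to the
    image by a hypothetical "detected-in" predicate. Coordinates are left
    out to keep the sketch short.
    """
    box = f"{image_id}-box-{index}"
    return "\n".join([
        f'(InheritanceLink (ConceptNode "{box}") (ConceptNode "{det.label}"))',
        f'(EvaluationLink (PredicateNode "detected-in") '
        f'(ListLink (ConceptNode "{box}") (ConceptNode "{image_id}")))',
    ])


def describe_image(image_id: str, detections: List[Detection],
                   min_confidence: float = 0.5) -> List[str]:
    """Keep confident detections and render each one as a text fact."""
    return [
        detection_to_atomese(image_id, i, det)
        for i, det in enumerate(detections)
        if det.confidence >= min_confidence
    ]


if __name__ == "__main__":
    boxes = [
        Detection("umbrella", 10, 20, 50, 60, 0.91),
        Detection("person", 40, 30, 80, 200, 0.34),  # dropped: low confidence
        Detection("car", 200, 150, 120, 70, 0.78),
    ]
    for fact in describe_image("img-1", boxes):
        print(fact)
```

A description of this kind records only which objects were detected and where, so a knowledge base filled this way can be searched by label but holds nothing that states whether it is raining.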

Although such a simple approach can be useful for semantic image retrieval, it will not be sufficient for answering arbitrary visual questions.
