Toshiba Corporation has developed a Visual Question Answering (VQA) AI that it describes as the world's most accurate and adaptable. The AI can distinguish not only people and objects in photographs but also colors, shapes, appearances, and background elements. Toshiba presented the technology at ICANN 2021. The AI learns the information needed to handle a wide range of questions and answers, overcoming the long-standing difficulty of answering questions about the position and appearance of people and objects. It can be applied to a variety of purposes without customization.
In evaluations on a public dataset containing a large volume of images and text, the VQA AI correctly answered 66.25 percent of questions without pre-learning and 74.57 percent with pre-learning. For example, the AI can locate a worker standing in a particular place, a task that requires recognizing the person, the position, and attributes such as shape and color. It can also be used to find specific scenes in broadcast and surveillance video recordings.
In Japan, worsening labor shortages are projected in the coming years, and a similar trend is emerging in other advanced countries. The COVID-19 pandemic has exacerbated the problem, making it more important than ever to safeguard worker safety and reduce on-site management burdens. Artificial intelligence (AI) is one option that is rapidly being adopted in manufacturing facilities. The global AI market, which includes software, hardware, and services, is predicted to reach $327.5 billion in 2021, up 16.4% from the previous year, and $554.3 billion by 2024.
Using AI for such inspections requires a determination function that defines how the AI should recognize each inspection item. When checking for headgear, for example, the AI must learn how to identify a hat and how to decide whether a person is wearing it, and this must be done for every item inspected. A workplace needs the flexibility to change inspection items quickly, but this is difficult with existing AI because of the time required to set up and update the determination function.
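As a rough illustration of why per-item determination functions are costly to maintain, the sketch below hand-codes one for the headgear example. Everything in it is hypothetical: the `Detection` type, the `head_region` heuristic, and the `DETERMINATION_FUNCTIONS` registry are assumptions made for illustration, not Toshiba's implementation or any real library's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# (x_min, y_min, x_max, y_max); image y grows downward.
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    """Hypothetical output of an object detector: a label and a bounding box."""
    label: str
    box: Box


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def head_region(person: Box) -> Box:
    """Assumed heuristic: the top quarter of a person box is the head."""
    x1, y1, x2, y2 = person
    return (x1, y1, x2, y1 + 0.25 * (y2 - y1))


def wearing_headgear(detections: List[Detection]) -> bool:
    """Hand-written rule: every detected person has a hat over the head region."""
    people = [d for d in detections if d.label == "person"]
    hats = [d for d in detections if d.label == "hat"]
    return bool(people) and all(
        any(overlaps(head_region(p.box), h.box) for h in hats) for p in people
    )


# One hand-written function per inspection item.
DETERMINATION_FUNCTIONS: Dict[str, Callable[[List[Detection]], bool]] = {
    "headgear": wearing_headgear,
}


def inspect(item: str, detections: List[Detection]) -> bool:
    """Look up and run the determination function for one inspection item."""
    return DETERMINATION_FUNCTIONS[item](detections)


scene = [Detection("person", (0, 0, 10, 40)), Detection("hat", (2, 0, 8, 5))]
headgear_ok = inspect("headgear", scene)
```

In this sketch, asking `inspect("gloves", scene)` fails with a `KeyError` until someone writes and registers a new rule, which is the setup and update cost the paragraph above describes.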
Toshiba's new AI satisfies the need for flexibility while also delivering what the company reports as the world's highest accuracy in answering questions. Its ability to recognize people, objects, and image backgrounds, together with the extensive data it has learned from, allows it to quickly process image features and pre-learned questions to derive the correct answer. After learning a large collection of images, questions, and answers covering the presence of people and objects, as well as information such as their position and status, the AI can select an appropriate answer to a question from roughly 3,000 answer patterns. The AI is highly adaptable: inspection items can be added, or the system can be adjusted to a different situation, simply by supplying new question phrases through an "Image and Question" method.
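One common way to frame answering from a fixed set of answer patterns is as classification: a model scores every candidate answer for a given image and question, and the highest-scoring one is returned. The sketch below shows only that final selection step, under stated assumptions. The four-entry `ANSWERS` list stands in for the roughly 3,000 patterns, the scores are made up, and the source does not say that Toshiba's model works this way internally.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Toy stand-in for the fixed answer set (the article mentions roughly 3,000).
ANSWERS: List[str] = ["yes", "no", "red", "helmet"]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(
    scores: Sequence[float], answers: Sequence[str] = ANSWERS
) -> Tuple[str, float]:
    """Return the most probable answer and its probability."""
    if len(scores) != len(answers):
        raise ValueError("need exactly one score per candidate answer")
    probs = softmax(scores)
    best = max(range(len(answers)), key=probs.__getitem__)
    return answers[best], probs[best]


# Hypothetical scores a model might produce for one image and question.
answer, confidence = pick_answer([0.1, 2.0, -1.0, 0.3])

# With a question-driven system, inspection items are just question strings,
# so adding an item means adding a phrase rather than writing new code.
inspection_questions: Dict[str, str] = {
    "headgear": "Is the person wearing a helmet?",
}
inspection_questions["walkway"] = "Is there an object on the path?"
```

The design point is the last two statements: a new inspection item is a new entry in `inspection_questions`, not a new hand-written rule.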
The AI can determine whether an object is on a path, whether a person is standing in a designated area, and whether an object is present at all. Using the AI to monitor safety at manufacturing sites is expected to increase workplace safety, reduce supervisors' workloads, and contribute to improved work styles.
In a performance evaluation on a globally used public benchmark dataset, Toshiba achieved accuracy scores of 66.25 percent without pre-learning and 74.57 percent with pre-learning, the highest values reported to date. By comparison, existing methods scored 65.88 percent and 74.00 percent, respectively.
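To put the reported margins in perspective, the short calculation below subtracts the existing-method scores from Toshiba's, using only the four figures quoted above. The dictionary layout and the name `gain_in_points` are illustrative choices.

```python
# Accuracy figures (percent) as quoted in the article.
results = {
    "without pre-learning": {"toshiba": 66.25, "existing": 65.88},
    "with pre-learning": {"toshiba": 74.57, "existing": 74.00},
}


def gain_in_points(toshiba: float, existing: float) -> float:
    """Absolute improvement in percentage points, rounded to two decimals."""
    return round(toshiba - existing, 2)


gains = {setting: gain_in_points(**scores) for setting, scores in results.items()}

for setting, gain in gains.items():
    print(f"{setting}: +{gain:.2f} percentage points")
# without pre-learning: +0.37 percentage points
# with pre-learning: +0.57 percentage points
```

The gains are 0.37 and 0.57 percentage points, so the improvement over existing methods is modest in absolute terms.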
The new AI's adaptability makes it well suited to searching for specific moments in broadcast content, for particular conditions or people in footage from drive recorders (dashcams) and surveillance cameras, and for previous near-misses in comparable situations.
Toshiba plans to continue system development and accuracy improvements in fiscal 2023, with the goal of incorporating AI technology into safety monitoring systems.