Fujitsu and OIST Begin Joint R&D
Recently, a variety of successful applications have put the spotlight on reinforcement learning, in which a computer acquires an action-selection policy suited to its environment through trial and error, based on rewards for certain actions. With reinforcement learning techniques to date, however, the designer had to specify the information of interest beforehand, and the learning process had to be redone from scratch for each problem, limiting applicability in the real world.
In this joint research, the partners will look at how the human brain learns, and incorporate those mechanisms into reinforcement learning algorithms, with the goal of producing an artificial intelligence (AI) with human-like applied skills to tackle a wide range of real-world problems.
Machine learning, which builds systems that perform a variety of tasks by learning from data, has also advanced in practical terms in areas such as image and speech recognition, and now forms the core of AI technology. One particularly appealing subcategory is reinforcement learning, in which the computer acquires an action-selection policy adapted to an environment through trial and error, based on rewards for certain actions.
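The trial-and-error loop described above can be illustrated with tabular Q-learning, one common reinforcement learning algorithm (the article does not name a specific method, so the algorithm, the toy chain environment, and all parameters here are assumptions for illustration):

```python
import random

N_STATES = 5
ACTIONS = [0, 1]  # 0 = move left, 1 = move right

def step(state, action):
    """Hypothetical toy environment: reward is given only at the right end."""
    if action == 1:
        if state == N_STATES - 1:
            return 0, 1.0, True  # goal reached: reset, reward, episode done
        return state + 1, 0.0, False
    return max(state - 1, 0), 0.0, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Acquire an action-selection policy through trial and error."""
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # value estimate per state/action
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore at random with probability epsilon,
            # otherwise exploit the current policy.
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Update the value estimate from the observed reward.
            q[state][action] += alpha * (
                reward + gamma * max(q[next_state]) - q[state][action])
            state = next_state
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]
print(policy)  # the learned policy moves right in every state: [1, 1, 1, 1, 1]
```

Note that nothing in the code tells the agent which features matter or what the right answer is; the policy emerges from rewards alone, which is what makes the approach appealing, and why retraining from scratch for every new environment is its main practical cost.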
The human brain is capable of learning applied skills: it can select what is important from different kinds of information, apply past learning to new problems, and choose among behaviors suited to a particular situation, or those with a greater degree of certainty and safety. For example, a person in a crowd can instantly identify the people or obstacles they need to watch out for, depending on the direction they wish to take, and avoid collisions. A person who already knows how to play chess can also generally pick up shogi (a Japanese game similar to chess) quickly. Moreover, a good player can make an appropriate choice according to the situation, depending on whether a standard move should be played or a move based on deeper thought is required. Existing reinforcement learning techniques, however, need a designer to specify the information of interest beforehand, and need to retrain for every problem, which limits their applicability in the real world.
Moving forward, OIST and Fujitsu Laboratories will begin work on the problems of handling massive volumes of input data, and of selecting actions when multiple policies learn in parallel, including policies that flexibly adapt to changes in the environment and more conservative ones. Fujitsu Laboratories aims to build on the results of this joint research to develop AI solutions for real-world applications, such as ICT system management and energy management. Computers will thereby be able to acquire policies adjusted to their environments more efficiently, without needing manual setup or adjustment.
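One plausible reading of action selection across parallel policies is an ensemble that falls back to a conservative action when its members disagree or their value estimates diverge. The following sketch is purely illustrative; the function name, the threshold, and the fallback scheme are all assumptions, not the partners' actual design:

```python
def select_action(q_tables, state, safe_action, threshold=0.5):
    """Choose an action from several policies learned in parallel.

    q_tables: one Q-table per policy, each indexed as q[state][action].
    Falls back to safe_action (a hypothetical conservative default)
    when the policies disagree or their estimates spread too widely.
    """
    # The greedy action each parallel policy would take in this state.
    greedy = [max(range(len(q[state])), key=lambda a: q[state][a])
              for q in q_tables]
    if len(set(greedy)) > 1:
        return safe_action  # policies disagree: respond conservatively
    # Policies agree; check how consistent their value estimates are.
    values = [q[state][greedy[0]] for q in q_tables]
    spread = max(values) - min(values)
    return greedy[0] if spread < threshold else safe_action

# Two policies that agree confidently pick the greedy action.
print(select_action([[[0.0, 1.0]], [[0.0, 0.9]]], state=0, safe_action=0))  # 1
# Disagreeing policies fall back to the conservative default.
print(select_action([[[1.0, 0.0]], [[0.0, 1.0]]], state=0, safe_action=0))  # 0
```

The design choice here mirrors the article's framing: a flexible policy is trusted when the evidence is consistent, and a more certain, safer behavior is preferred otherwise.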