OREANDA-NEWS. Hitachi, Ltd. today announced that it has developed a technology that analyzes huge volumes of text data on issues that are subject to debate, and presents, in English, reasons and grounds for either affirmative or negative opinions on those issues. This technology focuses on values such as health, economics and public safety, which are considered important to people and communities when expressing opinions, and uses correlations between those values and relevant issues in society to identify reasons and grounds with a high degree of reliability from among large volumes of news articles. By using multiple viewpoints, it is able to present reasons and grounds without bias toward a single perspective.

This is a basic technology that will contribute to artificial intelligence capable of logical dialogue between humans and computers. The technology could be applied in future systems that analyze the contents of company documents, published reports or electronic medical records in order to form opinions and generate data to support decision making.

In recent years, with the evolution of analysis technologies and information & telecommunication technologies such as the Internet, attention has turned to technologies that analyze "Big Data" - which is generated every day by various sensors and POS systems - and identify valuable information. At the same time, there has been increasing demand for effective use of data such as company documents, published reports and electronic medical records to create additional value and support management decisions. However, development has been difficult because of the technological challenge of extracting correlations between issues and values, as described above, from huge volumes of text data.

In 2014, Hitachi developed a technology that extracts specified information from electronic medical records (e.g., illnesses and affected areas) with a high degree of accuracy. Building on this technology, Hitachi has now developed a new technology that analyzes large volumes of news articles about a given topic and presents highly reliable reasons and grounds for opinions in English.

When giving reasons or grounds for opinions on a question that is subject to debate, people are assumed to draw on their own respective viewpoints. Hitachi focused on values such as health, economics and public safety, which are considered important to people and communities, and created a "Value Dictionary" that systematically organizes those values based on a database that records affirmative and negative opinions regarding a large number of discussion topics. Specifically, a list of values that serve as a basis for decision making by people or communities was defined, and the system extracts words that show a strong relationship to those values based on their frequency of use in the database, designating each word as either "positive" or "negative" in relation to the value.

Furthermore, the values and relevant words were systematically arranged by assigning a score according to "importance" based on the frequency of use. For example, in the case of the value "Health," the relations with words, such as "exercise" which is positive, and "disease" and "obesity" which are negative, were systematically arranged.
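
As an illustration only, the Value Dictionary described above could be sketched as follows. The article does not give Hitachi's actual data structures or scoring formula, so the names `Entry` and `build_value_dictionary`, the sample counts, and the use of relative frequency as the "importance" score are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word: str
    polarity: str      # "positive" or "negative" in relation to the value
    importance: float  # assumed here to be relative frequency of use


def build_value_dictionary(labeled_uses):
    """Build {value: [Entry, ...]} from (value, word, polarity) occurrences.

    Each tuple stands for one use of a word in the opinion database.
    Entries are sorted from most to least important.
    """
    counts = Counter(labeled_uses)

    # Total number of occurrences recorded for each value.
    totals = Counter()
    for (value, _word, _polarity), n in counts.items():
        totals[value] += n

    dictionary = {}
    for (value, word, polarity), n in counts.items():
        entry = Entry(word, polarity, n / totals[value])
        dictionary.setdefault(value, []).append(entry)

    for entries in dictionary.values():
        entries.sort(key=lambda e: e.importance, reverse=True)
    return dictionary


# Hypothetical occurrence counts for the "Health" example in the text.
uses = (
    [("Health", "exercise", "positive")] * 3
    + [("Health", "disease", "negative")] * 2
    + [("Health", "obesity", "negative")]
)
value_dictionary = build_value_dictionary(uses)
```

With these made-up counts, "exercise" becomes the highest-scoring word for "Health", followed by "disease" and "obesity".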

The system identifies the types of values involved in the issues mentioned in the sentences of large volumes of news articles, and creates a database expressing whether those issues have positive or negative effects on those values. For example, from an article stating that "Noise is harmful to health," it is determined that the issue of "noise" has the negative effect of suppressing the value "Health," and this information is stored in the database. Using this method, the system created approximately 250 million metadata records (issue - value correlation data) from around 9.7 million news articles.
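
The "Noise is harmful to health" example can be made concrete with a toy extractor. This is a minimal sketch, not the method Hitachi used: the cue phrases in `EFFECT_CUES`, the `VALUE_NAMES` table, and the `IssueValueRecord` type are hypothetical, and a real system would need far more robust language analysis than simple phrase matching.

```python
from typing import NamedTuple, Optional


class IssueValueRecord(NamedTuple):
    issue: str
    value: str
    effect: str  # "promotes" or "suppresses"


# Hypothetical cue phrases linking an issue to its effect on a value.
EFFECT_CUES = {
    "is harmful to": "suppresses",
    "is bad for": "suppresses",
    "is good for": "promotes",
    "improves": "promotes",
}

# Hypothetical surface forms for a few values named in the article.
VALUE_NAMES = {
    "health": "Health",
    "the economy": "Economics",
    "public safety": "Public Safety",
}


def extract_record(sentence: str) -> Optional[IssueValueRecord]:
    """Return an issue-value record for '<issue> <cue> <value>' sentences."""
    text = sentence.strip().rstrip(".").lower()
    for cue, effect in EFFECT_CUES.items():
        issue, sep, rest = text.partition(f" {cue} ")
        value = VALUE_NAMES.get(rest.strip())
        if sep and issue and value:
            return IssueValueRecord(issue.strip(), value, effect)
    return None


record = extract_record("Noise is harmful to health.")
```

Here `record` pairs the issue "noise" with the value "Health" and the effect "suppresses", which mirrors the example in the text. Sentences that match no cue phrase yield `None`.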

The system uses this huge volume of metadata, together with the Value Dictionary described above, to select multiple values that correlate strongly with a given topic from among the many news articles. By searching all of the news articles for sentences that contain one of these values, the system extracts sentences that could potentially serve as reasons or grounds for agreement or disagreement with the topic in question.
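
A rough sketch of this selection step is given below, under stated assumptions: correlation strength is approximated by how often a value co-occurs with the topic in the metadata, and a sentence "contains a value" when it includes a dictionary word for that value. The function names, the "casino" topic, and all sample data are hypothetical.

```python
from collections import Counter


def top_values(topic, metadata, k=2):
    """Pick the k values that appear most often with the topic in the metadata.

    metadata: iterable of (issue, value, effect) tuples.
    """
    counts = Counter(value for issue, value, _effect in metadata if issue == topic)
    return [value for value, _n in counts.most_common(k)]


def candidate_sentences(sentences, values, value_words):
    """Keep sentences containing at least one word tied to a selected value."""
    wanted = {w for v in values for w in value_words.get(v, ())}
    hits = []
    for sentence in sentences:
        # Crude tokenization: lowercase, drop basic punctuation, split on spaces.
        tokens = set(sentence.lower().replace(".", " ").replace(",", " ").split())
        if tokens & wanted:
            hits.append(sentence)
    return hits


# Hypothetical metadata and dictionary words for an example topic.
metadata = (
    [("casino", "Economics", "promotes")] * 3
    + [("casino", "Public Safety", "suppresses")] * 2
    + [("casino", "Health", "suppresses")]
    + [("noise", "Health", "suppresses")]
)
value_words = {
    "Economics": {"jobs", "revenue"},
    "Public Safety": {"crime"},
    "Health": {"disease"},
}
sentences = [
    "Casinos create jobs in the region.",
    "Crime rose after the casino opened.",
    "The casino opened in May.",
]

selected = top_values("casino", metadata, k=2)
candidates = candidate_sentences(sentences, selected, value_words)
```

Selecting several values rather than only the strongest one is what lets the later steps present grounds from more than one viewpoint.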

The sentences extracted using the Value Dictionary and the metadata are scored based on the source of the quote, the numerical evidence and the rhetorical expressions they contain, in order to estimate whether each sentence has a strong correlation with the specified topic and value. By processing all of the sentences that could potentially serve as reasons or grounds for opinions and evaluating their scores, it is possible to select and present reliable grounds.
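
The scoring step might look something like the sketch below. The article names the three kinds of evidence but not how they are detected or weighted, so the cue lists, the equal default weights, and the function names `score_sentence` and `rank` are assumptions made purely for illustration.

```python
import re

# Hypothetical cue lists; the article does not say how each feature is detected.
SOURCE_CUES = ("according to", "study", "report")
RHETORICAL_CUES = ("therefore", "because", "as a result")


def score_sentence(sentence, weights=(1.0, 1.0, 1.0)):
    """Score a sentence on source attribution, numbers, and rhetorical cues."""
    text = sentence.lower()
    has_source = any(cue in text for cue in SOURCE_CUES)
    has_number = re.search(r"\d", text) is not None
    has_rhetoric = any(cue in text for cue in RHETORICAL_CUES)
    w_source, w_number, w_rhetoric = weights
    # Booleans count as 0 or 1, so each present feature adds its weight.
    return w_source * has_source + w_number * has_number + w_rhetoric * has_rhetoric


def rank(sentences, top_n=3):
    """Return the top_n sentences, highest score first."""
    return sorted(sentences, key=score_sentence, reverse=True)[:top_n]


examples = [
    "The casino opened.",
    "According to a 2014 report, revenue rose 12% because of tourism.",
    "Crime rose by 5 cases.",
]
best = rank(examples, top_n=1)
```

In this toy data the sentence that cites a report, includes figures and gives a causal connective scores highest, which is the kind of sentence the article describes as reliable grounds.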