30.07.2024, 16:14

YandexGPT Experimental reaches the top of the LLM Arena rating

Science and engineering

Source: OREANDA-NEWS

OREANDA-NEWS. Yandex is preparing an update to the YandexGPT family, according to LLM Arena, an open crowdsourcing platform for evaluating large language models in Russian. Yandex has confirmed that it is working on a new, more powerful version of its base language model.

The model, called YandexGPT Experimental, ranked at the top of the LLM Arena rating, on par with GPT-4o, GPT-4 Turbo and Claude 3.5 Sonnet. The LLM Arena rating evaluates how well models answer questions in Russian.

The LLM Arena platform was launched by independent developers from the Russian ML community. The service gives users free access to various large language models (LLMs). In return, users choose which model, in their opinion, gives the best answer. From the collected user ratings, the authors of the service build a model leaderboard that allows the models to be compared with each other.
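The article does not say how LLM Arena turns user votes into a leaderboard. As a rough illustration only, the sketch below assumes an Elo-style update, a common choice for ranking from pairwise comparisons. The model names, the starting rating, the step size `K`, and the vote format are all hypothetical and are not taken from the service.

```python
from collections import defaultdict

K = 32          # assumed update step size (not from the article)
BASE = 1000.0   # assumed starting rating for every model


def expected_score(r_a, r_b):
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def build_leaderboard(votes):
    """Build a ranking from pairwise votes.

    votes: iterable of (model_a, model_b, winner),
    where winner is "a", "b", or "tie".
    Returns a list of (model, rating) sorted from best to worst.
    """
    ratings = defaultdict(lambda: BASE)
    for model_a, model_b, winner in votes:
        score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
        exp_a = expected_score(ratings[model_a], ratings[model_b])
        delta = K * (score_a - exp_a)
        # Zero-sum update: what one model gains, the other loses.
        ratings[model_a] += delta
        ratings[model_b] -= delta
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical votes: each tuple is one user comparison of two answers.
votes = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "tie"),
    ("model-x", "model-z", "a"),
]

for name, rating in build_leaderboard(votes):
    print(f"{name}: {rating:.1f}")
```

In this sketch a model that wins against a higher-rated opponent gains more points than one that beats a lower-rated opponent, so the ranking depends on the strength of the opponents as well as on the raw number of wins.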

The logic and operating principle of the service were borrowed from LMSYS Chatbot Arena, one of the most reputable benchmarks abroad.

Unlike its foreign counterpart, LLM Arena focuses on the Russian language and includes Russian LLMs such as YandexGPT, GigaChat, Saiga, and Vikhr. The authors of the service said they want to create an objective, open and up-to-date LLM benchmark for Russian.

In the future, the service intends to add a multimodal arena and to make the benchmark the standard reference for the Russian market.

There are already several LLM benchmarks in Russia, such as rulm-sbs2, MERA and Arena-Hard-Auto. Unlike these existing benchmarks, LLM Arena does not evaluate models automatically with another, stronger model, or on the basis of private closed tests. It relies on live evaluations by real users, which makes the benchmark more objective.

