Over the past week, media outlets such as The New York Times have actively discussed how high-quality data is collected for training artificial intelligence. The reports indicate that leading AI companies, including OpenAI and Google, face legal and ethical problems as they try to expand their training datasets.
According to the reports, OpenAI, the company behind the GPT-4 model, used its audio transcription model Whisper to transcribe more than a million hours of YouTube video for model training, despite doubts about the legality of doing so. Greg Brockman, President of OpenAI, personally took part in collecting the data, which prompted debate about how far the company can stretch the "fair use" of information.
In response to the accusations, representatives of OpenAI and Google emphasized that their companies use a variety of data sources, including publicly available ones, and are also exploring the use of synthetic data. Nevertheless, Google acknowledged that it also uses YouTube content to train its models, which, according to its representatives, is consistent with its agreements with content creators on the platform.
Of particular interest is a change to Google's privacy policy, which is reportedly aimed at expanding the ways the company can use consumer data.
* Meta and its products are designated as extremist, and their activities are banned in the Russian Federation.