Alibaba’s Qwen2.5-VL AI Revolutionizes Computing

The Chinese company Alibaba has released an updated version of its artificial intelligence model, Qwen2.5-VL. The system not only handles images and video better but, more importantly, can also interact directly with computers and smartphones.

The developers offer the model in three sizes: 3, 7 and 72 billion parameters. Each version can analyze a wide range of visual content, from simple pictures to complex diagrams, graphs and interfaces. Notably, comparable device-control capabilities are otherwise available only to OpenAI Pro subscribers, who pay $200 a month for the Operator mode.

The Alibaba team said the new version is noticeably better at recognizing images. The system can now identify elements from films, television shows and various products, which significantly expands what it can do with media content.

Qwen2.5-VL has also learned to work with long videos, more than an hour in length. It can find the right moment in a recording and determine exactly what is happening there.

When the Qwen team tested its 72-billion-parameter model, it outperformed the well-known Gemini 2.0 Flash, GPT-4o and Claude 3.5 Sonnet systems on tasks that require understanding documents, diagrams and videos.

The developers continue to improve the system, and it will soon learn to reason more logically. In the meantime, anyone can try Qwen2.5-VL: just download the company's chat app or install the model through the Hugging Face platform.

Since the system was built to meet the requirements of Chinese regulators, it operates within certain limits. For example, if you ask it to create images of politicians, whether Xi Jinping, Joe Biden or Donald Trump, it will respond with an error message.

As TechCrunch writes, the chatbot will also not discuss contentious political topics such as Xi Jinping's mistakes.
