On October 20, 2023, researchers from several universities and EleutherAI, an organization known for its open models, presented Llemma, an open large language model (LLM) designed specifically to solve mathematical problems.
Llemma outperforms other leading mathematical models, including Google's Minerva, and provides a reliable platform for further research. Although Llemma does not solve mathematical tasks perfectly, it marks an important step in the development of specialized models and may stimulate AI research in new directions.
Llemma was built on Code Llama, an adaptation of the open Llama 2 model fine-tuned on code-specific datasets. The researchers developed two versions of the model: one with 7 billion parameters and one with 34 billion. Both were further fine-tuned on Proof-Pile-2, a dataset created by the researchers that consists of scientific papers, web data with mathematical content, and mathematical code.
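For readers who want to try the model themselves, here is a minimal sketch of loading and querying Llemma with the Hugging Face transformers library. The repository name "EleutherAI/llemma_7b" is an assumption based on the announcement; check the official release for the exact identifier.

```python
# A minimal sketch of loading Llemma via Hugging Face transformers.
# The Hub identifier "EleutherAI/llemma_7b" is assumed from the release
# announcement; verify it against the official repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/llemma_7b"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Problem: Compute the derivative of x^3 + 2x.\nSolution:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```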
In their experiments, the researchers found that Llemma outperforms all known open models on mathematical benchmarks. Llemma is also capable of tool use and formal theorem proving without additional fine-tuning; for example, it can call computational tools, such as a Python interpreter, to solve mathematical problems.
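The sketch below illustrates the general tool-use pattern described above: the model is prompted to answer with a Python snippet, which is then executed to obtain the final result. This is a generic illustration of the technique, not the evaluation harness the researchers used.

```python
# A simplified illustration of the "model writes Python, harness runs it"
# pattern. This is a generic sketch, not the authors' actual pipeline.
import subprocess
import textwrap

def run_generated_code(code: str) -> str:
    """Execute model-generated Python in a subprocess and capture stdout."""
    # NOTE: in practice, model-generated code should be sandboxed first.
    result = subprocess.run(
        ["python", "-c", code], capture_output=True, text=True, timeout=10
    )
    return result.stdout.strip()

# Suppose the model produced the following snippet for the question
# "What is the sum of the first 100 positive integers?"
generated = textwrap.dedent("""
    total = sum(range(1, 101))
    print(total)
""")

print(run_generated_code(generated))  # prints 5050
```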
The researchers released all of their assets, including the 7-billion- and 34-billion-parameter models, the Proof-Pile-2 dataset, and code to reproduce their experiments. According to the researchers, Llemma is the first open model to match the performance of state-of-the-art closed models.
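Since the dataset is part of the public release, a few lines suffice to inspect it. In this sketch, the repository name "EleutherAI/proof-pile-2" and the "text" field are assumptions; consult the release notes for the actual identifiers and schema.

```python
# A minimal sketch of streaming the released Proof-Pile-2 data with the
# Hugging Face datasets library. The repository name and field name are
# assumptions; check the official release for the exact layout.
from datasets import load_dataset

dataset = load_dataset("EleutherAI/proof-pile-2", split="train", streaming=True)

# Inspect the first few documents without downloading the full corpus.
for i, example in enumerate(dataset):
    print(example["text"][:200])  # field name "text" is assumed
    if i >= 2:
        break
```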
They expressed the hope that Llemma and Proof-Pile-2 will serve as a useful foundation for future work on understanding language model generalization, studying the limits of domain-specific language models, and improving the mathematical capabilities of language models.
More broadly, Llemma is part of a wider effort to develop LLMs that specialize in a particular field, demonstrating that even smaller models can deliver significant results when trained on better and larger datasets.