Apple has broken with its tradition of secrecy by releasing an open generative AI language model called OpenELM. According to the company, the model outperforms comparable LLMs trained on public data, and the release signals a more transparent approach to AI research at Apple.
According to Apple, OpenELM is 2.36% more accurate than OLMo, the model released in February by the Allen Institute for AI (AI2), while using only half as many tokens for pre-training. The performance gap is small, but the release shows that Apple intends to compete seriously in this field and may be aiming for a leading position in the future.
What sets OpenELM apart is that Apple has published not only the model but also a full set of tools for training and evaluating it. Until now, companies have typically shared only model weights and inference code; Apple instead provides everything needed to train and evaluate the model on public data.
OpenELM uses a technique called layer-wise scaling to distribute parameters unevenly across the layers of the transformer instead of giving every layer the same size. Apple says this improves accuracy on standard benchmark tasks.
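To make the idea concrete, here is a minimal sketch of how such a per-layer parameter allocation could be computed. It assumes the attention width and the feed-forward width are each scaled by a factor that grows linearly from the first layer to the last. The function name `layerwise_scaling`, the `alpha` and `beta` ranges, and the dimensions are illustrative assumptions, not values taken from Apple's release.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return a per-layer config with a layer-dependent number of
    attention heads and feed-forward width.

    alpha and beta are (min, max) scaling ranges; both are
    hypothetical values chosen for illustration only.
    """
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; 0 = first layer, 1 = last.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        # Linearly interpolate the two scaling factors.
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        # Attention heads scale with a, feed-forward width with b.
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = round(b * d_model)
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs


if __name__ == "__main__":
    for cfg in layerwise_scaling(num_layers=4, d_model=1024, head_dim=64):
        print(cfg)
```

In this sketch the early layers are narrower and the later layers wider, so the same total parameter budget is spent differently than in a uniform transformer where every layer has identical dimensions.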
OpenELM was trained on large public datasets that draw on sources such as GitHub, Wikipedia, and StackExchange. Apple also included code for converting the models to its MLX library, so they can run directly on Apple devices without sending data to cloud services for processing.
Despite its higher accuracy, OpenELM runs slower than comparable earlier models because it uses an unoptimized implementation of the RMSNorm normalization layer. Apple plans further optimization to improve the model’s speed in future iterations.
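For reference, RMSNorm (root mean square layer normalization) rescales each input vector by the root mean square of its elements and then applies a learned per-element gain. The sketch below is a plain, unoptimized Python version written for illustration; the function name `rms_norm` and the `eps` value are assumptions, and this is not Apple's implementation.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Apply RMSNorm to a single vector.

    x      -- list of floats (one token's hidden state)
    weight -- list of floats, the learned per-element gain
    eps    -- small constant for numerical stability (assumed value)
    """
    # Mean of the squared elements.
    mean_sq = sum(v * v for v in x) / len(x)
    # Reciprocal of the root mean square.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Normalize, then scale by the learned gain.
    return [v * inv_rms * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

A straightforward version like this computes the reduction, the scaling, and the gain as separate steps for every token, which is the kind of overhead that optimized implementations typically reduce by fusing those steps into a single pass.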
Apple advises caution when using OpenELM: the model is released without safety guarantees and may generate inaccurate or harmful output. By contrast, Microsoft recently withdrew its WizardLM 2 model from public access because it did not comply with the company’s new safety policies.