Apple has publicly released four open-source language models that it says deliver improved accuracy for queries, a contribution that could aid the development of future AI models.
As the tech industry races forward with AI development, Apple continues to offer glimpses into the technology it is working on. In its latest public release, Apple has published a quartet of open-source models.
Referred to as Open-source Efficient Language Models, or OpenELM, the pre-trained and instruction-tuned models are hosted on the collaborative platform Hugging Face. Hugging Face is used to host AI models, to train them, and to work with others on improvements.
OpenELM is not a single model but a family of open large language models (LLMs), released alongside the complete framework Apple used to train and evaluate them.
The four OpenELM models use a "layer-wise scaling strategy" that allocates parameters non-uniformly across the layers of the transformer model for increased accuracy, according to the Model Card for the releases.
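The Model Card does not spell out the exact per-layer settings, but the idea can be sketched in a few lines of Python. In this minimal, hypothetical illustration, the attention and feed-forward widths are interpolated linearly from the first layer to the last; the `alpha` and `beta` ranges here are invented for demonstration and are not Apple's actual values.

```python
def layer_wise_scaling(num_layers, d_model, head_dim,
                       alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return per-layer width settings, interpolated from first to last layer."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)            # 0.0 at the first layer, 1.0 at the last
        a = alpha[0] + (alpha[1] - alpha[0]) * t  # attention width multiplier
        b = beta[0] + (beta[1] - beta[0]) * t     # feed-forward width multiplier
        configs.append({
            "layer": i,
            "num_heads": max(1, round(a * d_model / head_dim)),
            "ffn_dim": round(b * d_model),
        })
    return configs

# Early layers end up narrower, later layers wider, instead of a uniform width.
for cfg in layer_wise_scaling(num_layers=4, d_model=768, head_dim=64):
    print(cfg)
```

The payoff of this approach is spending a fixed parameter budget where it helps most, rather than giving every layer the same width.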
The models were pre-trained using the CoreNet library. Apple provided both pre-trained and instruction-tuned models using 270 million, 450 million, 1.1 billion, and 3 billion parameters.
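For those who want to try the checkpoints, a typical way to load one from Hugging Face is sketched below. This is an illustrative snippet rather than Apple's official sample code: it assumes the repository IDs follow the `apple/OpenELM-<size>` naming used on the Hub, that the custom modeling code is pulled in via `trust_remote_code`, and that you have access to the gated Llama 2 tokenizer the models reuse.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The OpenELM checkpoints ship custom modeling code, so trust_remote_code
# is required; the repo ID below follows the naming used on the Hub.
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M",  # smallest of the four pre-trained sizes
    trust_remote_code=True,
)

# Assumption based on the model card: OpenELM reuses the Llama 2 tokenizer,
# which is gated behind a license agreement on Hugging Face.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Apple released four open-source models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Swapping in one of the `-Instruct` variants would be the natural choice for chat-style prompting rather than plain text completion.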
The pre-training dataset combined a subset of Dolma v1.6, RefinedWeb, a deduplicated version of PILE, and a subset of RedPajama, resulting in approximately 1.8 trillion tokens.
In a related paper released on Tuesday, the researchers behind the project say that the reproducibility and transparency of large language models are "crucial for advancing open research." Openness also helps ensure the trustworthiness of results and allows for investigations into model biases and risks.
As for accuracy, the paper explains that with a parameter budget of roughly one billion parameters, OpenELM achieves a 2.36% improvement in accuracy over the comparable OLMo model while requiring half as many pre-training tokens.
The authors of the models and the papers include Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal, and Mohammad Rastegari.
The release of the models' source code is the latest move by Apple to publicly share its progress in AI and machine learning.
This isn't Apple's first public AI release. In October, it shared Ferret, an open-source LLM that improved how a model can analyze an image.
In April, a new version of Ferret added the ability to parse data points within an app's screenshot and to generally understand how the app functions.
There have also been papers released about generative AI animation tools and the creation of AI avatars.
WWDC in June is expected to include quite a few advancements in AI for Apple's products.
1 Comment
Releasing the OpenELM models is the smart thing to do development-wise. Hopefully this means Apple will develop and release any bridge/networking software needed for multiple big-memory Macs to work together, like a couple of Mac Ultras sitting next to each other sharing the load, if a developer chooses to do so.
The beauty of the OpenELM models long term is being able to leverage all the Apple Silicon devices (Mac, iPhone, iPad, Apple Vision Pro, and Apple Watch), along with the Apple OS ecosystems that already work well together.