Microsoft’s multibillion-dollar investment in OpenAI shows just how expensive training capable AI models can be. A team of researchers at Stanford, however, has managed it on a budget small enough that AI businesses could plausibly emerge out of garages.
How Stanford Achieved Low-Cost AI Training with Maximum Results
The researchers started from LLaMA 7B, a language model released by Mark Zuckerberg’s Meta, which played a key role in their success. Fittingly, it is one of the smallest and most economical large language models currently available.
Pretrained on a vast corpus of text, the model has solid basic language capabilities, but out of the box it falls well short of what ChatGPT can do.
The researchers therefore used OpenAI’s GPT through its API to generate additional instruction/output pairs, starting from 175 existing human-written pairs. The API let them draw on the model behind the chatbot to write training data for their own model.
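In rough outline, that data-generation step looks like the sketch below. The prompt wording, batch size, file names, and model name are illustrative assumptions, not the Stanford team’s actual script, and the parser shown is a deliberately naive stand-in.

```python
# Sketch of self-instruct-style data generation via the OpenAI API.
# Assumptions (not from the article): file names, prompt wording, and the
# "gpt-3.5-turbo" model used here as a stand-in for the davinci-class model.
import json
import random

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The 175 human-written seed pairs, e.g. [{"instruction": ..., "output": ...}, ...]
with open("seed_tasks.json") as f:
    seed_tasks = json.load(f)

def build_prompt(examples, n_new=20):
    """Show a few seed pairs and ask the model for n_new more in the same format."""
    shown = "\n\n".join(
        f"Instruction: {ex['instruction']}\nOutput: {ex['output']}" for ex in examples
    )
    return (
        "Here are examples of instructions paired with outputs:\n\n"
        f"{shown}\n\n"
        f"Write {n_new} new, diverse instruction/output pairs in the same format."
    )

def parse_pairs(text):
    """Very naive parser: split the completion back into instruction/output pairs."""
    pairs = []
    for block in text.split("Instruction:")[1:]:
        if "Output:" in block:
            instr, out = block.split("Output:", 1)
            pairs.append({"instruction": instr.strip(), "output": out.strip()})
    return pairs

generated = []
while len(generated) < 52_000:
    prompt = build_prompt(random.sample(seed_tasks, 3))
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in; filtering/deduplication is omitted
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
        max_tokens=2048,
    )
    generated.extend(parse_pairs(resp.choices[0].message.content))

with open("alpaca_style_data.json", "w") as f:
    json.dump(generated, f)
```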
The LLaMA model was then fine-tuned on this generated dataset: 52,000 instruction-following examples, produced quickly in batches of 20 per request, for around $500 in API costs.
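A minimal sketch of that fine-tuning step is shown below, assuming the Hugging Face transformers stack. The prompt template, hyperparameters, and file paths are illustrative assumptions, not the team’s exact training recipe.

```python
# Minimal supervised fine-tuning sketch (assumed Hugging Face stack; the prompt
# template and hyperparameters are illustrative, not the Stanford team's recipe).
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "path/to/llama-7b"  # LLaMA 7B weights obtained from Meta
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

def to_text(example):
    # Turn each instruction/output pair into a single training string.
    return {
        "text": (
            "Below is an instruction that describes a task. "
            "Write a response that completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['output']}{tokenizer.eos_token}"
        )
    }

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

dataset = load_dataset("json", data_files="alpaca_style_data.json", split="train")
dataset = dataset.map(to_text)
dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="alpaca-ft",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("alpaca-ft")
```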
The fine-tuning itself ran on eight 80 GB A100 GPUs in the cloud and finished in about three hours, at a compute cost of less than $100.
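As a back-of-the-envelope check on that figure, assuming a cloud rate of roughly $4 per A100-hour (an assumed price, not one quoted in the article):

```python
# Rough compute-cost check; the ~$4/GPU-hour rate is an assumed cloud price.
gpus = 8
hours = 3
usd_per_gpu_hour = 4.0
print(gpus * hours * usd_per_gpu_hour)  # 96.0 -> consistent with "less than $100"
```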
Even though the process used to create Alpaca, the resulting model, was not optimized, the researchers report that in tests across various domains it matched, and at times surpassed, the performance of the model behind ChatGPT. They add that further improvements should be possible by using GPT-4, OpenAI’s latest model.
Source: @IntEngineering