An implementation of model- and data-parallel GPT-3-like models using the mesh-tensorflow library.
If you just want to play around with our pre-trained models, we strongly recommend you try out the Hugging Face Transformers integration. It is by far the easiest way to use our existing models without extra effort.
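As a rough sketch of that route, a pre-trained GPT-Neo checkpoint can be loaded through the Transformers `pipeline` API. This assumes the `transformers` package and a backend such as PyTorch are installed; the model name shown is one of the checkpoints published on the Hugging Face Hub, and weights are downloaded on first use.

```python
def generate(prompt: str, model_name: str = "EleutherAI/gpt-neo-125M") -> str:
    """Return a single sampled continuation of `prompt` using a
    pre-trained GPT-Neo checkpoint from the Hugging Face Hub."""
    # Lazy import so the helper can be defined even before
    # transformers is installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    out = generator(prompt, max_length=50, do_sample=True)
    return out[0]["generated_text"]


if __name__ == "__main__":
    # Downloads the (smallest) 125M-parameter checkpoint on first run.
    print(generate("EleutherAI is"))
```

Larger checkpoints (e.g. `EleutherAI/gpt-neo-1.3B` or `gpt-neo-2.7B`) can be substituted for the model name at the cost of more memory and download time.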
Training and inference are officially supported on TPUs and should also work on GPUs. This repository will be (mostly) archived as we shift focus to our GPU-specific repo, GPT-NeoX.
Note that while GPT-Neo can technically run a training step at 200B+ parameters, it is very inefficient at those scales.
Among other factors, the easy availability of large numbers of GPUs made them the natural platform for continued development, which spurred the move to GPT-NeoX.
Evaluations of GPT-2 and GPT-3 using our evaluation harness do not always match the values reported in their respective papers. We are investigating the discrepancy and would greatly appreciate feedback from anyone who can help us test the eval harness.