The architecture that put OpenAI on the map for revolutionary machine learning models, and one that has inspired many others, may soon see a new iteration. GPT-3 is an autoregressive model that uses self-supervised deep neural networks to perform Natural Language Processing tasks like no other.
Let us look at where this network, which set the standard for generalization in machine learning and inspired models like DALL-E and DALL-E 2, might be heading, and how its successor could outperform it.
The GPT models are a series of autoregressive language models powered by deep learning, capable of generating fluent text from a prompt at the click of a button. GPT-3 set a precedent for such autoregressive models and continues to inspire newer ones.
GPT-3 was shown to the public in May 2020, almost two years ago. It arrived a year after GPT-2, which in turn followed a year after the original GPT paper was published. Given the state of the world and the blow that in-office research and development took, we might have the next iteration right around the corner.
Going by OpenAI CEO Sam Altman’s interviews over the past year, a likely window for the upcoming model is somewhere around Q1 next year. Here are some of the things we at NimbleBox think might be coming with the new model.
GPT-4 is the much-discussed and anticipated successor to GPT-3, expected to outperform existing models and open more and more doors for generalization in machine learning. For example, DALL-E, built on an architecture derived from GPT-3, showed how strong caption-based image generation could be when measured against models like Stable Diffusion.
If GPT-4 follows the trend and generates even more human-like text than GPT-3, we may see a model capable of producing video and audio on its own, cutting costs in productions where hiring artists is a steep line item in the budget.
According to Altman, GPT-4 will not be much bigger than GPT-3 in terms of sheer parameter count, putting it somewhere in the ballpark of 180 to 280 billion parameters. That is still lighter than the heaviest models out there, such as Gopher from DeepMind and Megatron-Turing NLG from Microsoft and Nvidia.
The model is said to stay at this size because many companies have realized that bigger is not always better. Making the most of a smaller model is what we need now, at least until IoT devices and deployment nodes are powerful enough to run instances on their own.
When dealing with terabytes’ worth of model weights and training data, there is a tradeoff between cost and accuracy. GPT-3, for all its impressive milestones, was famously trained only once, and has been retrained only a handful of times since its inception. Here we might see one more fundamental difference between the two models, if OpenAI gives us the ability to train, or at least tune and optimize, the model for better results.
Since GPT-4 is aiming for a relatively smaller network than others in the market, we might be looking at a denser model than GPT-3 to compensate in feature-extraction and encoding capacity.
We might also be looking at a more user-oriented and understanding model, better aligned with the user’s intent when asked for a specific write-up, script, joke, or anything else.
These are our predictions for the upcoming model, and they may well hold given the market situation and where generality in Artificial Intelligence is heading. Let us now look at what a model with these predicted features could achieve for the general public, if public access is granted.
OpenAI has been known to limit the public’s access to GPT-3 and, initially, DALL-E 2, for various ethical and legal reasons, and simply because of the power these models hold and the compute they take to run.
Altman’s Twitter feed has showcased the power of these models in a rather humorous way, but it begs the question of whether we will get access to GPT-4, and whether that access will come through the token-based system used for GPT-3 or open access as with DALL-E 2.
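For context, GPT-3’s token-based access works through a paid REST API rather than released model weights: you send a prompt and are billed per token consumed and generated. Below is a minimal sketch of how such a request is assembled, assuming the Completions endpoint and bearer-token authentication that OpenAI documents for GPT-3; the model name and prompt here are purely illustrative.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt: str,
                             model: str = "text-davinci-002",
                             max_tokens: int = 64) -> urllib.request.Request:
    """Assemble (but do not send) a GPT-3 Completions request.

    Billing is token-based: both the prompt tokens and up to
    `max_tokens` of generated text count toward usage.
    """
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            # A per-account API key gates access to the model.
            "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', 'sk-...')}",
        },
        method="POST",
    )


def extract_text(response_body: bytes) -> str:
    """Pull the generated text out of a Completions JSON response."""
    return json.loads(response_body)["choices"][0]["text"]
```

Sending the request with `urllib.request.urlopen(...)` returns JSON whose first choice holds the generated text. Whether GPT-4 keeps this same metered interface is exactly the open question above.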
With an outstanding new model arriving every month, we are inching toward a truly general AI. Progress is taking gradual baby steps, both expensive and challenging, yet it makes everyone wonder when a machine will finally pass the Turing test, and whether it will come from OpenAI, DeepMind, or a wildcard lab.
In this article, we have talked about the upcoming GPT-4 model from OpenAI and shared our predictions for it. If you enjoyed this and want more news about Artificial Intelligence and Machine Learning, check out more such articles from our publication.
Want to unlock $350K in cloud credits and take your ML efforts to the next level? NimbleBox.ai is here to help. We’ll help you blast through model deployment 4x faster and reduce your headache of infra management by 80%. NimbleBox makes it a breeze to develop and deploy ML models in production.
Want to learn more? Let’s discuss how NimbleBox can support your ML project.