What Can We Expect From GPT-5?

The moment we’ve all been waiting for - GPT-5 and the impressive capabilities of its predecessor, GPT-4. Image by Editor

It can feel very difficult to keep up with how fast AI and technology are moving. Every week or month something new drops, and now you’re here learning something new, again!

This time it is GPT-5.

GPT-4 was released in March 2023, and ever since, everybody has been waiting for the release of GPT-5. Siqi Chen tweeted on March 27th that “gpt5 is scheduled to complete training this December.” However, OpenAI CEO Sam Altman walked that back at an MIT event in April: asked whether OpenAI was training GPT-5, he said, “We are not and won’t for some time.”

So that clarifies that. However, some experts have suggested that OpenAI will release GPT-4.5, an intermediate release between GPT-4 and GPT-5, by Q3/Q4 of 2023. Improvements are continuously being made to current models, and these could accumulate into a GPT-4.5 release. Many expect GPT-4.5 to bring multimodal capability, which was already demonstrated in the GPT-4 developer livestream in March 2023.

Although there are high expectations for GPT-5, GPT-4 still needs to iron out some of its creases. For example, GPT-4’s inference time is high, and it is computationally expensive to run. There are other challenges too, such as getting access to the GPT-4 APIs.

Although there is work to do, what we can say is that each GPT release has pushed the boundaries of what AI technology is capable of. AI enthusiasts are excited to explore GPT-5’s potentially groundbreaking features.

So what features can we expect from GPT-5? Let’s find out.

Reduced Hallucination

This is all about trust, the main reason many users still hesitate to rely on AI models. For example, GPT-4 scored 40% higher than GPT-3.5 on OpenAI’s internal factual evaluations, across all nine categories shown in the image below. OpenAI also reports that GPT-4 is 82% less likely to respond to requests for disallowed content, and 40% more likely to produce factual responses, than GPT-3.5.

As new releases continue to improve on current challenges, it is rumoured that GPT-5 will reduce hallucination to less than 10%, which would make LLMs far more trustworthy.

Image by OpenAI

Compute Efficiency

As stated earlier, GPT-4 is very computationally expensive, at $0.03 per 1K prompt tokens, compared with GPT-3.5-turbo’s $0.002 per 1K tokens. That is a big difference. GPT-4 is rumoured to have been trained with around one trillion parameters, and the infrastructure needed to train and serve a model at that scale is reflected in the cost.
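
To put those March 2023 list prices in perspective, here is a quick back-of-the-envelope comparison. This is a rough sketch only: it uses prompt-side pricing, while real bills also depend on completion tokens and the context size you choose.

```python
# Back-of-the-envelope API cost comparison at the March 2023 list prices.
# Prompt-side pricing only; completion tokens are billed separately.
GPT4_USD_PER_1K = 0.03     # GPT-4 (8K context), per 1K prompt tokens
GPT35_USD_PER_1K = 0.002   # gpt-3.5-turbo, per 1K tokens

tokens = 1_000_000         # say, one million tokens of prompts

gpt4_cost = tokens / 1000 * GPT4_USD_PER_1K
gpt35_cost = tokens / 1000 * GPT35_USD_PER_1K

print(f"GPT-4:   ${gpt4_cost:,.2f}")    # $30.00
print(f"GPT-3.5: ${gpt35_cost:,.2f}")   # $2.00
print(f"GPT-4 costs {gpt4_cost / gpt35_cost:.0f}x more")  # 15x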

Google’s PaLM 2 model, by contrast, is reported to use only 340 billion parameters while still performing efficiently. If OpenAI plans to compete with Google’s PaLM 2, it will need to look into reducing both the cost and the parameter count of GPT-4’s successor, all whilst maintaining performance.

Another aspect to look into is inference time, that is, the time it takes a deep learning model to produce predictions on new data. The more features and plugins are added to GPT-4, the more strain is put on compute efficiency. Developers are already complaining to OpenAI that the GPT-4 APIs frequently stop responding, which forces them to fall back to GPT-3.5.

Taking all of that into consideration, we can expect OpenAI to overcome these challenges with a GPT-5 release that is smaller, cheaper and more efficient.

Multi-Sensory

In the run-up to GPT-4’s release, a lot of people were excited about its multimodal capabilities. Although multimodality has not been fully rolled out in GPT-4 yet, this is where GPT-5 may be the star of the show and become truly multimodal.

Not only can we expect it to deal with images and text, but also audio, video, temperature, and more. Sam Altman stated in an interview: “I’m very excited to see what happens when we can do video, there’s a lot of video content in the world. There are a lot of things that are much easier to learn with a video than text.”

Increasing the types of data the model can handle would make conversations more dynamic and interactive. Multimodal capability may also be the fastest route to artificial general intelligence (AGI).

Long-Term Memory

GPT-4’s maximum context length is 32 thousand tokens, which was impressive at release. But with the world releasing model after model, we now have models such as Story Writer that can handle 65 thousand tokens.

To keep up with the current competition, we can expect GPT-5 to introduce a longer context length, allowing users to have AI friends that can remember their persona and history for years.
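
Until longer windows arrive, a common workaround is to trim the conversation so it fits the model’s context budget. Here is a minimal sketch using the tiktoken tokenizer; the sliding-window approach and the trim_history helper are illustrative, not how OpenAI handles context internally.

```python
import tiktoken

MAX_CONTEXT = 32_000  # GPT-4-32K's advertised context window

enc = tiktoken.encoding_for_model("gpt-4")

def trim_history(messages: list[str], budget: int = MAX_CONTEXT) -> list[str]:
    """Keep only the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk newest to oldest
        n = len(enc.encode(msg))
        if used + n > budget:
            break                       # older messages are dropped
        kept.append(msg)
        used += n
    return list(reversed(kept))         # restore chronological order
```

Anything trimmed away is simply forgotten, which is exactly the limitation a longer context length in GPT-5 would relieve.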

Improved Contextual Understanding

As GPT-5 will be a large language model (LLM), the first thing we can expect is an enhanced ability to understand context. If we merge this with the point above about long-term memory, GPT-5 could have the potential to maintain context over long conversations. As a user, you would get more tailored and meaningful responses that stay consistent with your requirements.
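
It helps to remember how context works with chat models today: the model is stateless, and the client resends the whole conversation on every call. Here is a minimal sketch using the openai Python package’s v0.27-style ChatCompletion API; the chat() helper and model choice are illustrative.

```python
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# The model keeps no state between calls: "memory" is just the message
# history we resend each time, so it is capped by the context window.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(
        model="gpt-4",  # or "gpt-3.5-turbo" if GPT-4 access is limited
        messages=history,
    )
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply
```

A GPT-5 that maintains context over much longer conversations would shift more of this bookkeeping off the developer.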

With this comes a more advanced understanding of language, and of emotion, a core component of natural communication. Better contextual understanding could allow GPT-5 to respond more empathetically and produce appropriate replies that keep the conversation engaging.

Wrapping it up

There is more to find out about the potential capabilities of GPT-5, and we won’t be able to learn more until closer to the release. This article looked at the current challenges that GPT-4 and GPT-3.5 face, and how OpenAI can overcome these hurdles to produce a high-performing GPT-5.
Nisha Arya is a Data Scientist, freelance technical writer, and Community Manager at KDnuggets. She is particularly interested in providing Data Science career advice and tutorials, along with theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence can benefit the longevity of human life. A keen learner, she seeks to broaden her tech knowledge and writing skills, whilst helping guide others.