Grok V1.5

By Martin Thomas, 15 April, 2024

Enter Grok!

X is charging forward in the conversational AI space with an early access program for its first-generation multimodal large language model.

As of mid-April 2024, when this post was written, Grok 1.5 Vision is expected to become available soon to early testers and existing Grok users.

Grok aims to challenge the existing players with competitive capabilities in:

  • multi-disciplinary reasoning
  • understanding documents
  • science diagrams
  • charts
  • screenshots
  • photographs

Real-World

On its Grok 1.5 Vision Preview page (here), X highlights one differentiator in particular: Grok's spatial understanding of the real physical world, compared with GPT-4V, Claude 3 Sonnet, Claude 3 Opus and Gemini Pro 1.5.

One highlighted example shows Grok writing Python code from a hand-drawn flowchart of a simple guess-the-number game.
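To give a feel for the kind of program involved, here is a minimal sketch of a guess-the-number game in Python. It is our own illustration, not the code Grok produced, and the names pick_target, check_guess and play, as well as the 1 to 10 range, are assumptions made for this example.

```python
import random


def pick_target(low=1, high=10):
    """Pick the secret number (the 1-10 range is an assumption for this sketch)."""
    return random.randint(low, high)


def check_guess(guess, target):
    """Compare one guess with the target and return a hint."""
    if guess < target:
        return "Too low"
    if guess > target:
        return "Too high"
    return "Correct"


def play(target, guesses):
    """Loop over the guesses, printing a hint for each, and stop once one is correct."""
    hints = []
    for guess in guesses:
        hint = check_guess(guess, target)
        print(f"{guess}: {hint}")
        hints.append(hint)
        if hint == "Correct":
            break
    return hints
```

Calling play(pick_target(), [5, 8, 6]) runs one round with a fixed list of guesses. For interactive play, the guesses could instead come from a small generator that wraps input().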

Another highlight is RealWorldQA, a new benchmark designed to evaluate the basic real-world spatial understanding of multimodal models.

Into the Community!

X has released RealWorldQA to the community under the CC BY-ND 4.0 license.

You can download the dataset here...

More coming

The journey is only beginning. Much more is expected in the near future, including improvements across modalities such as images, audio and video.

Sign in

You can sign in to Grok with your X account here...

 

If you would like us to help with your digital journey, please contact us here...
