The Text to Video AI Race

By Martin Thomas, 20 May, 2024

Text to video is a supercharged AI race.

The race has many players, including two of the biggest names in AI more broadly: Google and OpenAI.

  • Google has Veo.
  • OpenAI has Sora.

Both models have, as you would expect, their strengths.

Recent test examples show the progress these tools are making as well as areas they need to improve upon.

Some of the output may still not appear as 'natural' as you might like, but wow, these tools are improving and growing in capability every day.

Veo from Google DeepMind, see here
 

  • The videos generated are of very high quality.
  • There is strong creative control for the prompter.
  • Cinematic effects can be well controlled through the prompts.

At the time of writing, creators may be able to use some of the latest features through VideoFX, a new experimental tool at labs.google.


Their waitlist is open here...
 

See an example here...

The prompt used was: A lone cowboy rides his horse across an open plain at beautiful sunset, soft light, warm colors
 


Sora see here...

  • Long prompts are what distinguish Sora.
  • It can generate complex scenes.
  • It can simulate the physics of our real world.

See an example here:


The debate over text to video will no doubt continue.

What is its major use? How will it affect creative jobs? Will it handle the complex nature of scene creation?

And how will success be judged?

I expect success will be judged in the end by the consumer, not by some technical coding analysis of the best tool. Remember VHS and Betamax, if you are old enough? :)

Success will possibly revolve not only around the ability of these AI tools to create 'realistic' video but also around how they integrate into existing technologies and the processes involved in rapid video creation.

Success will also, of course, revolve around the UX of these tools. How easy is it to use the tool? How easy is it to set up the tool?

And, of course, what are the costs versus the benefits?

Let's also not forget the prompts when comparing outputs - an analysis of what is best may in the end also involve what was the best prompt!


There are many other tools, existing and emerging, that play either directly in this space or in a closely related one, such as:

Synthesia.io see here...
Known for its ease of use, avatars and high quality of output.

Pictory.ai see here...
Reported strengths are ease of use with social media and the ability to consume long-form content.

RunwayML see here...
A strong tool for artists and designers, with a powerful creative suite.

Fliki see here...
An affordable alternative.

Colossyan see here...
Wide range of use cases, with API access.

LTX-Studio see here...

See an example here...
 

The recommendation for anyone interested in AI in general, and in video creation in particular, is to watch this space as often as you can!

Why watch frequently? This is a race, and the 'athletes' are well past the starting buzzer and running flat out, creating as they go and opening new possibilities at a rapid rate.

The positions of these competitors may change, but one thing is for sure: what they are creating will affect you one way or another if you are interested in video.

So it is best to get on top of this now, and of what it might mean for you and your business.

 

If you would like us to help you with your digital journey and capabilities, please let us know here...

 
