Key Generative AI Models

By Martin Thomas, 21 February, 2024

AI is still 'all the rage' and keeps expanding, bringing new and exciting opportunities and capabilities.

Generative AI utilises Large Language Models (LLMs).

These models have revolutionized, and continue to revolutionize, the way we interact with digital content, automate tasks, and generate new information.

In this blog, we'll explore the main categories of LLMs and generative AI models currently available in the marketplace, providing a brief overview of each category and highlighting some examples within them.

 

LLMs are the 'generative models' that form the foundation of how the AI works, answers questions and learns.

However, not all LLMs are the same: they can take different approaches to how they work and learn.

As these models develop, they are starting to be classified into certain types of generative model. Examples are:

  • Unimodal models
  • Multimodal models
  • Proprietary models
  • Open-source models

Some of this categorisation refers to the functioning of the model, some to the approach the model takes to its performance, and some to access to the model or its ownership.

As AI becomes more sophisticated, models will increasingly fall into more than one of these evolving categories, and the distinctions may become hazier.

Here are some key categories of AI model that it is useful to know:

Text Generation Models

Simply put, these generate coherent and contextually relevant text as an output based on input prompts. The text could be an answer to a question or some creatively written content.

This makes them very versatile for applications such as content creation and conversational agents.

Examples would be: 

  • GPT-4 (OpenAI): The latest iteration of the Generative Pre-trained Transformer, GPT-4, is a powerhouse in text generation, capable of producing highly coherent and contextually accurate text across a broad range of topics and formats.
  • LaMDA (Google): Short for Language Model for Dialogue Applications, LaMDA is designed to engage in free-flowing conversations on a seemingly endless number of topics, with a strong emphasis on maintaining context and generating human-like responses.
  • Jurassic-1 (AI21 Labs): This model stands out for its ability to understand and generate text in a wide variety of languages and domains, offering tools for content creation, summarization, and more.
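Under the hood, text generation models of this kind work autoregressively: at each step the model scores candidate next tokens given the context so far, and one is sampled. The toy bigram table below is an assumption made purely for illustration; real models like GPT-4 use transformer networks trained on vast corpora.

```python
import random

# Toy bigram "language model": for each word, a distribution over the
# next word. Purely illustrative -- real LLMs condition on the whole
# context, not just the previous token.
BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(prompt, max_tokens=4, seed=0):
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = BIGRAMS.get(tokens[-1])
        if dist is None:          # no known continuation: stop
            break
        words = list(dist)
        weights = [dist[w] for w in words]
        tokens.append(rng.choices(words, weights=weights)[0])
    return " ".join(tokens)

print(generate("the"))
```

However small, this captures the core loop of text generation: score, sample, append, repeat.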

Translation Models

Focused on breaking down language barriers, these models provide high-quality translations by understanding the context and nuances of the source text. They are essential for global communication and content localization.

  • DeepL Translator: Renowned for its ability to produce translations that often surpass those of its competitors in terms of accuracy and fluency, DeepL leverages advanced deep learning techniques.
  • Google Neural Machine Translation (GNMT): This system uses a large neural network for end-to-end learning of translations, known for its ability to translate between a significant number of languages while maintaining contextual meaning.
  • Facebook’s M2M-100: A multilingual model capable of translating directly between many language pairs without relying on English as an intermediary, enhancing the quality and efficiency of translations.
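The advantage of direct many-to-many translation, as in M2M-100, can be sketched with a deliberately tiny example: pivoting through English can collapse distinctions the target language needs. The word tables below are assumptions chosen only to show the effect.

```python
# Pivoting French -> English -> Spanish loses the tu/vous formality
# distinction, because English has only "you". A direct French ->
# Spanish pair preserves it. Toy dictionaries, for illustration only.
FR_TO_EN = {"tu": "you", "vous": "you"}
EN_TO_ES = {"you": "tú"}                    # pivot step loses formality
FR_TO_ES = {"tu": "tú", "vous": "usted"}    # direct pair keeps it

def pivot_translate(word):
    return EN_TO_ES[FR_TO_EN[word]]

def direct_translate(word):
    return FR_TO_ES[word]

print(pivot_translate("vous"))   # formality lost
print(direct_translate("vous"))  # formality preserved
```

Real translation models are neural networks, not dictionaries, but the information-loss argument for avoiding an English pivot is the same.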

Video Generation Models

These can generate video either from scratch or from prompts. They can create a whole video or change an existing one. This is a rapidly evolving category at the cutting edge of AI research.

An example that is exciting those interested in the video sector is the new Sora from OpenAI.

Other examples would include:

  • PixelRNN (Pixel Recurrent Neural Network): an advanced generative model developed to generate images pixel by pixel, and particularly well suited to handling sequences of data.
  • Video Vision Transformer (ViViT): a model that applies the Vision Transformer (ViT) architecture to video analysis. It processes video by dividing it into frames and then further into patches, treating these patches as tokens similar to words in a sentence. This approach allows ViViT to capture both spatial and temporal information within videos, making it highly effective for tasks like video classification and action recognition.
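The patch-to-token idea behind ViViT can be made concrete with a little arithmetic: cut the clip into frames, each frame into fixed-size patches, and every patch becomes one token. The frame and patch sizes below are illustrative assumptions, not ViViT's actual configuration.

```python
# Count the tokens a ViViT-style model would see for a video clip:
# one token per spatial patch, per frame.
def count_video_tokens(frames, height, width, patch=16):
    if height % patch or width % patch:
        raise ValueError("frame size must be divisible by the patch size")
    patches_per_frame = (height // patch) * (width // patch)
    return frames * patches_per_frame

# A 32-frame 224x224 clip with 16x16 patches:
print(count_video_tokens(32, 224, 224))  # 32 * 14 * 14 = 6272
```

This is why video transformers are expensive: token count grows with both spatial resolution and clip length.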

Reinforcement Learning Models

This is a category of generative AI model built around a trial-and-error feedback loop, which helps with very complex tasks and the associated learning.

Example: 

  • Proximal Policy Optimization (PPO) with a critic: a reinforcement learning approach that combines PPO's policy optimization with a value-function critic. PPO, developed by OpenAI, is an advanced policy gradient method designed for training deep neural networks to make decisions in environments with high-dimensional input spaces, such as robotics, gaming, and simulated environments for AI research.

Style Transfer Models

Designed for images, this category enables artistic styles to be transferred from one image to another.

Examples are: 

  • NightCafe AI is an online platform that leverages artificial intelligence to enable users to create digital art through techniques like neural style transfer and text-to-image generation. It stands out for its user-friendly interface and accessibility, making it possible for individuals without any background in art or computer science to generate unique and compelling artworks.
  • Neural Style Transfer is an innovative technique in the field of computer vision and artificial intelligence that allows the stylistic elements of one image (the "style image") to be applied to the content of another image (the "content image"), effectively creating a new image that merges the two. This technique is based on the understanding that the deep neural networks used for image processing can separate and recombine the content and style of natural images.
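The "separate content from style" idea rests on two losses: content loss compares feature maps directly, while style loss compares Gram matrices, which capture which features co-occur regardless of where they appear. The tiny two-channel "feature maps" below are assumptions for illustration; real systems use deep CNN features.

```python
# Gram matrix: pairwise dot products between feature channels.
# It records feature co-occurrence, discarding spatial layout --
# which is exactly what "style" means in neural style transfer.
def gram(features):
    # features: list of channels, each a flat list of activations
    n = len(features)
    return [[sum(a * b for a, b in zip(features[i], features[j]))
             for j in range(n)] for i in range(n)]

def content_loss(gen, content):
    return sum((g - c) ** 2 for gc, cc in zip(gen, content)
               for g, c in zip(gc, cc))

def style_loss(gen, style):
    gg, gs = gram(gen), gram(style)
    return sum((a - b) ** 2 for ra, rb in zip(gg, gs)
               for a, b in zip(ra, rb))

content = [[1.0, 0.0], [0.0, 1.0]]
style   = [[0.5, 0.5], [0.5, 0.5]]
print(content_loss(content, content))  # identical content: zero loss
print(style_loss(content, style))      # different style: positive loss
```

Style transfer then optimises the generated image to drive a weighted sum of these two losses down at once.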

Graph Generation Models

These are designed for creating graph structures and network models.

Examples include: 

  • GraphRNN (Graph Recurrent Neural Network) is a deep learning model designed specifically for generating graphs. Introduced by Jiaxuan You, Rex Ying, Xiang Ren, William Hamilton, and Jure Leskovec in a paper presented at the International Conference on Machine Learning (ICML) in 2018, GraphRNN addresses the challenge of modeling the complex, variable-sized structures of graphs, which are ubiquitous in real-world data, from social networks and molecular structures to transportation networks.
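GraphRNN's key move is to turn graph generation into a sequence problem: nodes are added one at a time, and for each new node the model emits a sequence of edge decisions back to earlier nodes. In the sketch below the learned RNN is replaced by a fixed edge probability, an assumption that keeps the example self-contained while preserving the node-by-node, edge-by-edge shape of the process.

```python
import random

# Sequential graph generation in the spirit of GraphRNN: add nodes one
# at a time; for each new node, decide edge-by-edge whether to connect
# it to each earlier node. A real GraphRNN would condition each decision
# on the graph so far via an RNN; here a fixed probability stands in.
def generate_graph(num_nodes, edge_prob=0.5, seed=0):
    rng = random.Random(seed)
    edges = []
    for new_node in range(1, num_nodes):
        for earlier in range(new_node):      # edge decisions, in order
            if rng.random() < edge_prob:
                edges.append((earlier, new_node))
    return edges

print(generate_graph(4))  # list of (earlier, later) edges
```

Framing graphs as sequences is what lets standard sequence models handle the variable-sized structures the paper highlights.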

Conditional Generative Models

This category of models is designed to generate data based on specific attributes.

Examples are: 

  • Conditional Generative Adversarial Networks (Conditional GANs or cGANs) are an extension of the basic Generative Adversarial Network (GAN) framework, introduced to enable the generation of targeted outputs based on conditional inputs. This conditional model allows for the generation of more specific and controlled outputs, as opposed to the unsupervised nature of standard GANs that generate outputs from random noise without any direct control over the type of generated content.
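The conditioning mechanism itself is simple to sketch: the generator's input is the noise vector concatenated with an encoding of the condition (here a class label), so the same network can be steered to produce a chosen class instead of sampling blindly. The one-hot encoding and vector sizes below are illustrative assumptions.

```python
import random

# Build a conditional-GAN-style generator input: random noise plus a
# one-hot label. The generator network (not shown) would learn to map
# this combined vector to an output of the requested class.
def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def generator_input(noise_dim, label, num_classes, seed=0):
    rng = random.Random(seed)
    noise = [rng.gauss(0, 1) for _ in range(noise_dim)]
    return noise + one_hot(label, num_classes)   # condition appended

z = generator_input(noise_dim=8, label=3, num_classes=10)
print(len(z))  # 18 = 8 noise dims + 10 label dims
```

The discriminator receives the same label alongside each sample, so both halves of the GAN are judged against the requested condition.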

 

Understanding the type or category of an AI model, how a particular model overlaps these categories, and how it may therefore be useful to your business application is an important 'watch this space' area that will enable you to get the best from AI in the future.

 

If you would like us to help you with your digital solution then please speak to us.

 
