Tavus Introduces Phoenix-3, Raven-0, and Sparrow-0: A Family of Models Powering the First AI Agents That Truly See, Hear, and Engage in Real-Time, Face-to-Face Interaction

First-of-their-kind AI models reverse-engineer human likeness, perception, and cadence to create lifelike conversational video AI that is nearly indistinguishable from human interaction

Leading AI startups and Fortune 500 companies alike use Tavus to create novel experiences that entertain, educate, solve problems, show empathy, and much more

For Editorial Contact:
Leigh Disher
leigh@gmkcommunications.com

Tavus, a leading AI research company backed by Sequoia, announced today the launch of three groundbreaking AI models that set new industry benchmarks for human-AI interaction. With Phoenix-3 (the first full-face AI rendering model), Raven-0 (the first AI perception model that sees and reasons like a human), and Sparrow-0 (a state-of-the-art conversational turn-taking model), Tavus redefines AI-human interaction by delivering the building blocks for hyper-realistic, capable, and emotionally aware video agents.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20250306296766/en/

These models, impressive on their own, become even more powerful when they work together to create the next iteration of Tavus’ groundbreaking Conversational Video Interface (CVI), unlocking the power of human conversation at infinite scale. With CVI, developers can bring AI agents to life, making them seem human. This marks a step-function change in how we will interact with computers.

The shift is significant because it brings the ease of face-to-face conversation, in which humans quickly compress complex layers of emotional, verbal, tonal, and visual information, to communication with machines that understand nuance and respond naturally. By seeing, hearing, and understanding intent at human speed, CVI creates a natural dialogue and bridges the gap between cognition and technology like never before.

"We believe effective conversations have solved problems since the beginning of time – they’ve prevented wars, inspired revolutions, and sparked love. But as technology has advanced, we prioritized efficiency over connection, replacing real conversations with scripted chatbots and robotic AI that feel cold and impersonal, and don't lead to a resolution,” said Tavus CEO Hassaan Raza. “But there’s a reason why we connect best through face-to-face conversations; they’re natural, expressive, and full of information and subtle cues that build trust and understanding. With our new models and cognitive architecture, we’re marrying the EQ of face-to-face conversations with the IQ and efficiency of AI. We’re not just generating talking heads—we’re building an operating system for AI that feels genuinely human, understands expressions and emotions, and responds naturally, creating a true sense of presence. It’s the blueprint for a new kind of agent that transforms human-computer interaction.”

Tavus’ breakthrough technology unlocks new possibilities across industries, from virtual sales assistants and customer service representatives to emotionally intelligent AI agents in healthcare, education, and training environments. Users at companies like CVS, Alibaba, and Deloitte are already using Tavus’ technology, while breakout startups such as Delphi and Mercor rely on Tavus as the backbone of their AI-powered video experiences.

Meet the Family

Phoenix is Tavus’ foundational model for Gaussian-diffusion animation, continuously evolving to push the boundaries of AI-driven human expression. Now in its third evolution, Phoenix-3 is the most advanced full-face animation model ever created, capable of cloning any individual’s likeness with unprecedented accuracy and fidelity. Every muscle in the face can stretch, wrinkle, and shift to express emotion as naturally as a human’s. Whether responding with surprise, amusement, frustration, or curiosity, Phoenix-3 analyzes conversation context to generate real-time facial micro-expressions that evolve naturally, making interactions feel truly alive. This is the first AI model capable of continuous, full-face animation and persistent identity preservation, unlocking a new frontier of realism for digital humans.

Raven-0 gives AI real perception, allowing it to see and interpret the world like a person would. Instead of taking static snapshots, it watches continuously, understanding actions, text, and surroundings in real time. This means, for example, that an AI tutor can notice when a student looks confused and adjust its explanation, or a support assistant can follow along on your screen, pointing out exactly what to do. Even a simple smile isn’t just detected; it’s understood, distinguishing politeness from real happiness. Raven-0 elevates each dialogue, embedding visual context and awareness into every interaction so that conversations truly resonate on a human level.

Sparrow-0 fixes what most AI gets wrong: timing. Today’s AI still treats conversations like a game of hot potato, either cutting in too soon or leaving you hanging in awkward silence. Sparrow-0 makes AI conversations feel natural by understanding the rhythm of speech: when to talk, when to pause, and when to listen. Instead of reacting to filler words like “uh” or waiting for a long silence, it picks up on tone, pacing, and context to jump in at the right moment. This means AI won’t awkwardly interrupt mid-thought or leave long gaps before replying. Whether in a fast-paced discussion or a thoughtful exchange, it adjusts in real time, making conversations flow smoothly, sometimes even better than human-to-human dialogue. Built on a custom transformer model, Sparrow-0 eliminates the robotic feel of typical AI, setting a new standard for seamless interaction and outperforming all competitors on early benchmark evals.

To show how lifelike and responsive AI can be within CVI when these three models work together, Tavus created a demo experience that features Charlie, an AI researcher who can search the internet, generate images, follow along via screenshare for technical support, and even role-play different scenarios, highlighting the depth of interaction enabled by these new advancements. Try the demo on the Tavus homepage.

For customers interested in leveraging the future of conversational AI, Phoenix-3, Raven-0, and Sparrow-0 are now available.

About Tavus

Tavus is a market-leading generative AI video research company building foundational models and operating systems for human-AI interaction. Inspired by the human brain, Tavus' cognitive architecture enables you to build hyper-realistic AI agents that see, listen, and respond, bringing the human touch to digital experiences at scale. Its AI models and APIs power virtual humans for real-time conversations and lifelike video generation, transforming industries like education, healthcare, recruiting, marketing, sales, financial services, and more. Tavus' technology is used by Fortune 500 companies and innovative startups alike to create AI-driven experiences that feel truly engaging and interactive. Headquartered in San Francisco, Tavus is backed by Sequoia Capital, Scale Venture Partners, Y Combinator, HubSpot, and other leading investors.