Key takeaways (TL;DR)
- Live interactive video calls with photorealistic AI Humans deliver engaging user experiences.
- Low latency (~2 seconds) and high-definition (HD) video quality meet strict performance requirements.
- Set to launch with 125 AI humans and poised to scale to thousands of users.
Company snapshot
Industry: Conversational AI platform
Size / stage: Early-stage startup
Use case: Real-time AI Human video calls
The challenge
Delphi is a young company with a bold vision: to let people interact face-to-face with AI humans—lifelike digital versions of real individuals—through live video. The team wanted to enable a “FaceTime with your AI twin” experience on their platform, allowing users to hold natural, real-time conversations with AI replicas of themselves or others.
Achieving this meant solving several hard problems at once. Delphi needed to stream convincing, high-resolution video of AI humans that could respond almost instantly to user inputs. They also had to synchronize each avatar’s speech with a custom-cloned voice and map realistic facial expressions in real time. All of this had to work seamlessly within Delphi’s app and existing voice AI infrastructure.
With a high-profile launch approaching (including a major media story and even a Times Square billboard), the pressure was on to find a technology partner that could bring this interactive vision to life—without compromising on quality or reliability.
Why they chose Tavus
After exploring various video generation tools, Delphi’s team found that most options fell short of their requirements. Some platforms could create basic talking avatars but suffered from uncanny visuals or slow, non-interactive output, while others offered higher fidelity but were prohibitively expensive to scale up to hundreds of AI Humans. Tavus stood out as the clear solution.
The Tavus platform met Delphi’s key criteria: photorealistic AI humans with accurate lip-sync and expressions, delivered via an API that supports real-time interaction. Crucially, Tavus also integrated easily with Delphi’s existing voice-cloning setup, allowing the AI humans to speak in each user’s unique voice via Delphi’s ElevenLabs engine.
Equally important was performance: Tavus demonstrated low-latency response times (sub-1-second), ensuring that conversations with AI humans feel immediate.
By partnering with Tavus, Delphi gained a reliable way to deploy live AI video calls at scale on a tight timeline, without sacrificing quality or exceeding their budget.
Closing thought
By leveraging Tavus, Delphi is poised to deliver lifelike, real-time AI human conversations at scale, turning an ambitious vision of next-generation communication into an imminent reality.