A patient wakes at 2 am with a question about the medication she started yesterday. The discharge instructions don't answer what she's feeling right now, so instead of calling the ER or waiting until Monday, she may stop taking it. A follow-up call that could have caught a complication never happened.

Healthcare leaders know conversational AI belongs in these gaps, and that what patients need at 2 am is presence: the sense that someone is actually paying attention. The harder part is buying that presence without compromising what the Health Insurance Portability and Accountability Act (HIPAA) requires.

This guide is written for the three people in the buying room: the product lead, the engineering lead, and the compliance officer evaluating AI Personas for patient-facing workflows.

What are HIPAA-compliant healthcare conversational AI platforms?

HIPAA-compliant healthcare conversational AI platforms are real-time audio or video systems designed to conduct patient-facing conversations while meeting the legal and technical requirements set by HIPAA for handling patient health information.
They combine conversational capabilities such as speech recognition, language understanding, and response generation with the administrative, physical, and technical safeguards the law requires of any system that creates, stores, or transmits that data.

In practice, that means signed Business Associate Agreements with every vendor in the data path, audited access controls, encryption for data in motion and at rest, and clear escalation rules when a conversation crosses into clinical territory. These platforms handle protocol-driven workflows such as intake, medication adherence, post-discharge follow-up, and chronic condition check-ins, while routing anything requiring a clinician's judgment to a human.

What HIPAA requires from a conversational AI platform

HIPAA has three rules that matter most:

  • The Privacy Rule sets national standards for protecting individually identifiable health information, known as Protected Health Information (PHI).
  • The Security Rule protects the electronic subset, ePHI, through required safeguards.
  • The Breach Notification Rule requires notification following a breach of unsecured PHI.

PHI is health information tied to a specific person: a name paired with a diagnosis, a medication list linked to an address, a symptom described in a conversation where the patient has identified themselves.

Any vendor that creates, receives, maintains, or transmits PHI on behalf of a covered entity must sign a Business Associate Agreement (BAA) before touching patient data. Using a cloud service provider to process ePHI without a BAA violates federal regulations.

The Security Rule organizes requirements into administrative safeguards, physical safeguards, and technical safeguards, all of which must account for data in motion during live conversational AI sessions carrying PHI in real time.

Why conversational AI raises compliance questions legacy tools don't

Traditional Electronic Health Record (EHR) systems typically handle PHI as structured records governed by access controls and auditability. Conversational AI changes that model in six ways.

Real-time audio and video capture can create PHI before later processing steps occur. Large language model (LLM) components can generate and infer PHI, creating risk in ways traditional systems do not. Transcripts stored for model improvement can create additional privacy and retention risk.

Persistent Memory carrying context across sessions introduces retention and cross-session contamination risks. Retrieval-Augmented Generation pipelines use probabilistic retrieval rather than deterministic access controls. Integrations with EHR systems via Fast Healthcare Interoperability Resources (FHIR) APIs can pull structured clinical records into conversational applications, where access control must be managed through mechanisms such as SMART on FHIR and OAuth 2.0.

Real-time capture, model inference, transcript retention, Persistent Memory, probabilistic retrieval, and FHIR-based EHR access define the evaluation framework for buyers.
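That FHIR access path ultimately reduces to a scope check before any EHR read. A minimal sketch, using the SMART on FHIR v1 scope convention (`context/ResourceType.permission`); the function name and deny-by-default behavior are illustrative, not a specific vendor's API:

```python
# Hedged sketch: enforcing SMART on FHIR scopes before a conversational
# application reads EHR data. Scope strings follow the SMART v1 pattern
# <context>/<ResourceType>.<permission>, e.g. "patient/MedicationRequest.read".

def scope_allows(granted_scopes: set[str], resource_type: str, permission: str) -> bool:
    """Return True if any granted SMART scope covers the requested access."""
    for scope in granted_scopes:
        try:
            context, rest = scope.split("/", 1)
            res, perm = rest.split(".", 1)
        except ValueError:
            continue  # malformed scope: deny by default
        if context not in ("patient", "user", "system"):
            continue
        if res in ("*", resource_type) and perm in ("*", permission):
            return True
    return False

granted = {"patient/MedicationRequest.read", "patient/Observation.read"}
print(scope_allows(granted, "MedicationRequest", "read"))   # True
print(scope_allows(granted, "MedicationRequest", "write"))  # False
```

In a real deployment, the granted scopes come from the OAuth 2.0 token issued at SMART launch, and the check runs server-side before every FHIR request the conversational application makes.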

What to look for in a HIPAA-compliant healthcare conversational AI platform

Four evaluation areas separate platforms that can hold PHI safely from those that cannot: how data is handled, whether a BAA is in scope, who has access and how that access is logged, and how the platform handles clinical escalation. Walk through each in a proof of concept before signing.

Data handling and storage

A patient describes a miscarriage history during video intake. That utterance is PHI the moment it's captured, and every downstream handling decision matters.

Verify where that audio stream is stored, how long it persists, and whether any copy is used beyond the immediate clinical interaction. Confirm data residency: which regions and cloud providers host patient data, along with the retention policy and whether retention windows are configurable per customer.

Most critically, confirm in writing whether any patient data is used for model training or product improvement. If the answer isn't an explicit contractual prohibition, the compliance officer will have follow-up questions.

BAA availability and scope

Your pilot deploys a medication-adherence AI Persona to 200 patients. If the pilot tier isn't covered by the BAA, those 200 conversations just created unprotected PHI.

Will the vendor sign a BAA, and which product tier does it cover? Some vendors restrict BAA availability to enterprise pricing, leaving pilots uncovered.

Ask which subprocessors are in scope and whether BAA obligations flow down to them, including cloud infrastructure and LLM inference providers.

Access controls and audit logging

A patient discloses substance use during intake, and six months later, an audit asks who accessed that transcript. Your logs must answer that question at the session level.

Role-based access must govern who views session recordings, accesses transcripts, and modifies the AI's clinical knowledge base. Audit logs must capture every PHI access event, exportable and compatible with your Security Information and Event Management (SIEM) infrastructure.
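What session-level answerability implies, sketched minimally: every PHI access becomes one structured, exportable event. The field names and integrity digest here are illustrative assumptions, not a SIEM standard:

```python
# Hedged sketch of a session-level PHI access record, emitted as a JSON line
# for SIEM export. Field names are illustrative, not a required schema.
import json
import hashlib
from datetime import datetime, timezone

def phi_access_event(actor: str, role: str, action: str, session_id: str) -> str:
    """Build one PHI-access audit record as a JSON line."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "role": role,              # role-based access context
        "action": action,          # e.g. "transcript.view"
        "session_id": session_id,  # session-level granularity
    }
    # Integrity digest over the canonical record so tampering is detectable.
    event["digest"] = hashlib.sha256(
        json.dumps(event, sort_keys=True).encode()
    ).hexdigest()
    return json.dumps(event)

line = phi_access_event("nurse.ramirez", "care_team", "transcript.view", "sess-4821")
print(json.loads(line)["action"])  # transcript.view
```

The point of the sketch is the shape of the question an auditor asks six months later: who, in what role, took what action, against which session, and can the log prove it hasn't been altered.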

Clinical-language guardrails and escalation

A patient mentions unexpected chest tightness in a post-discharge conversation. The platform needs to keep the conversation inside scope, detect triggers that require a human clinician, and make the escalation path clear. The practical question is what happens next: does the conversation pause, route to a nurse line, or log the event and continue?

Three platform questions to ask a vendor before buying

  • For the compliance officer: Will you sign a BAA, and which plan tier does it cover? What is logged, where, for how long, who has access, and how are transcripts and session data handled for model improvement? What is the vendor's current SOC 2 posture and audit cadence?
  • For the engineering lead: Can we bring our own LLM, clinical Knowledge Base, and EHR integration? What's the concurrency ceiling and uptime SLA? Which subprocessors handle ePHI, and do BAA obligations flow down to each?
  • For the product lead: What happens when the platform detects a clinical escalation trigger? What's the latency for patient-facing workflows? Can we white-label the experience?

If a vendor can't answer those questions directly, treat that as a warning.

Where conversational video fits in a healthcare workflow today

Conversational video is well-suited to protocol-driven, high-volume healthcare interactions where presence helps and independent clinical judgment is not required. In those workflows, patients often follow through more readily when the interaction feels attended to.

Acute clinical decision-making, emergency triage, and any conversation requiring a licensed clinician's judgment belong on a human escalation path. Stating that boundary explicitly answers the compliance officer's concern about scope creep.

The healthcare conversations conversational AI is already holding well

Healthcare teams are already using conversational AI in several structured, high-volume workflows.

  • Patient intake: Patients can share sensitive symptoms conversationally instead of working through paper forms, often surfacing details a checkbox never captures.
  • Pre-procedure preparation: These conversations walk patients through what to expect before surgery or imaging and confirm understanding more actively than a PDF alone.
  • Post-visit education: The AI can reinforce what the clinician covered days later, when the patient is home, trying to remember whether to take medication with food.
  • Medication guidance: The AI handles dosing and side-effect questions while Objectives and Guardrails prohibit dose-change recommendations.
  • Post-discharge follow-up: Structured check-ins flag clinical deterioration for human review. An AI video agent detecting hesitation or visible distress during a wound-care question can escalate before the patient minimizes the concern.
  • Chronic-condition check-ins: Regular outreach helps maintain contact with patients managing diabetes, hypertension, or heart failure, track symptom trajectories, and escalate when thresholds are crossed.
  • Symptom triage: These conversations can guide patients toward the right level of care without practicing medicine.

Healthcare teams get the most from these workflows when the path is structured, volume is high, and escalation boundaries are explicit.

The platform layer behind a conversation that stays inside clinical policy

Video-based conversational AI introduces biometric data, ambient audio, and visual signals that expand both the PHI boundary and the platform's capacity for clinically relevant perception. That expansion is what makes presence possible in a patient conversation, and it's also what requires a full-stack platform, not a face on top of an LLM.

Tavus provides real-time conversational video infrastructure as a full-stack platform across four pillars: perception, intelligence, personality, and rendering. The Conversational Video Interface (CVI) is the pipeline that wires those pillars together into a single real-time session.

Within that session, the LLM layer reasons over clinical policy, Knowledge Base content, and Objectives and Guardrails to produce responses that stay within scope. It decides what to say next, routes content, and commits or discards generated responses based on updated signals from the other models in the loop.

  • Raven-1, the multimodal perception system, fuses audio and visual signals into a continuous understanding of the patient's state. A patient who says "I understand" while looking away with a tightening expression is communicating something different from one who says it with direct eye contact, and Raven-1 catches the mismatch rather than processing the audio and visual channels in parallel.
  • Sparrow-1, the conversational flow model, governs when the AI Persona speaks, waits, or holds the floor open. That timing matters in healthcare because patients don't describe symptoms in neat linear sequences. They ramble, circle back, and trail off, and the platform can hold the floor open instead of interrupting, then respond when a human listener would.
  • Phoenix-4, the real-time facial behavior engine, renders the AI Persona's visual response with emotional micro-expressions at 40fps and 1080p.

In practice, a patient in a post-discharge conversation trails off mid-sentence, and her expression tightens. Raven-1 fuses the audio hesitation with the visual tension, catching the mismatch between what she's saying and how she's saying it. The LLM layer evaluates that fused signal against clinical policy, matches an escalation trigger, and decides what to say next, while Sparrow-1 holds the timing to give her space.

When she doesn't continue, the AI Persona acknowledges her concern, explains that it's connecting her with a nurse, and logs the event. Phoenix-4 renders that response with the facial behavior a patient would expect from someone actually paying attention.

HIPAA compliance is available on Tavus Enterprise plans, and the platform holds SOC 2 (Service Organization Control 2) certification.

The platform features that keep healthcare conversations safe and grounded

When evaluating AI platforms that are HIPAA compliant, these are the platform features worth testing in a proof of concept:

  • Knowledge Base: Grounds every response in approved clinical content using retrieval-augmented generation (RAG) with ~30ms retrieval speed, so a patient asking about metformin side effects gets a clinician-aligned answer. Currently supports English-language content, which should be factored into multilingual deployment planning.
  • Guardrails: Defines what the AI Persona can answer, what it defers, and what triggers escalation. A patient mentioning new chest pain or a worsening post-op wound crosses a named Guardrail and is routed to a human clinician; a patient asking how to store a medication stays inside the conversation. This is the single most important feature for compliance evaluation.
  • Persistent Memory: A returning patient picks up where the last conversation ended instead of repeating their history. Memory is scoped per participant, so cross-session context stays correctly associated.
  • Objectives: Holds conversations accountable to measurable outcomes: confirm medication understanding, complete pre-procedure education, and verify post-discharge instructions are clear.
  • Function Calling: Lets the AI Persona take action mid-conversation: log a symptom update, schedule a follow-up, or page a clinician on the escalation path.

For healthcare teams, Knowledge Base, Guardrails, Objectives, Persistent Memory, and Function Calling matter most when the goal is grounded responses and a clear escalation path.
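A minimal sketch of what Function Calling dispatch looks like in practice; the tool names, argument shapes, and fail-closed handling of unrecognized calls are illustrative assumptions, not the Tavus schema:

```python
# Hedged sketch of mid-conversation function calling. Tool names and the
# dispatch shape are illustrative, not a specific platform's API.

def log_symptom(session_id: str, symptom: str) -> dict:
    """Record a symptom update against the session (placeholder handler)."""
    return {"logged": True, "session_id": session_id, "symptom": symptom}

def page_clinician(session_id: str, reason: str) -> dict:
    """Page the on-call clinician on the escalation path (placeholder handler)."""
    return {"paged": True, "session_id": session_id, "reason": reason}

TOOLS = {"log_symptom": log_symptom, "page_clinician": page_clinician}

def dispatch(call: dict) -> dict:
    """Route a model-emitted tool call to its handler, failing closed otherwise."""
    handler = TOOLS.get(call.get("name"))
    if handler is None:
        return {"error": "unknown tool"}  # deny-by-default for unrecognized calls
    return handler(**call.get("arguments", {}))

result = dispatch({"name": "page_clinician",
                   "arguments": {"session_id": "sess-4821", "reason": "chest tightness"}})
print(result["paged"])  # True
```

The compliance-relevant design choice is the fail-closed branch: a tool call the platform doesn't recognize should produce an auditable error, never a best-effort guess.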

Where compliance meets presence

The patient on the other end of a HIPAA-compliant conversation is the one awake at 2 am, the one trying to remember what the clinician said three days ago, the one navigating recovery alone. What she needs is the feeling that someone is actually in the room with her, which is what presence at 2 am looks like when a clinic is closed.

Compliance is what makes that conversation legally possible; presence is what makes it worth having. The platforms worth evaluating treat both as part of the same system.

See it for yourself. Book a demo.