AI recruiting: how video agents handle screening at scale

Written by Tavus Team

Publish date: May 28, 2026

Most high-volume hiring efforts break down at the screening stage. A recruiter can only run so many live conversations in a week, even when hundreds or thousands of candidates are waiting. Consider a recruiter at a national retail chain who opens her Monday morning to 4,200 new applications for 300 seasonal positions. She'll spend most of the week on the same fifteen-minute screening call, repeated hundreds of times, asking the same questions and listening for the same signals. Most candidates will never get that call at all.

Those who do will wait days or weeks to hear back, and by then many will have already accepted another offer. This is the screening bottleneck AI recruiting is meant to address. AI is already widely deployed for sourcing, scheduling, and resume scoring, where the work is structured and the inputs are mostly text.

The screening conversation itself is harder to automate. It depends on a human recruiter meeting a candidate face-to-face and gauging presence, communication style, and cultural alignment. AI humans, full-stack systems that see, hear, understand, and respond in real-time conversations, are starting to do that screening work directly.

Tavus is the human computing company building these AI humans. Conversational video AI brings them into the screening process, where they conduct live, two-way interviews with candidates at any hour in 42+ languages, with a stronger visual presence than text- and voice-only tools.

The state of AI recruiting in high-volume hiring

AI adoption in HR has grown by roughly two-thirds in a single year. Forty-three percent of organizations now use AI for HR tasks, up from 26% in 2024, according to SHRM's 2025 Talent Trends. Recruiting is one of the top functions driving that growth.

In most recruiting deployments today, AI handles upstream work such as sourcing matches, drafting outreach messages, parsing resumes, and scheduling. Human recruiters retain the final decisions on candidates, while live screening conversations remain mostly human, mostly slow, and mostly the bottleneck.

High-volume hiring is where that bottleneck shows first. Gartner identifies high-volume, low-complexity roles such as retail, customer service, and drivers as a leading use case for AI-first recruiting. Adoption in those verticals is already outpacing trust: only 26% of job applicants believe AI will evaluate them fairly. Low trust in an AI-led evaluation process can push candidates to drop out before finishing it.

Inside the AI recruiting workflow

AI touches nearly every stage of hiring today, though the depth and maturity vary significantly. The upstream stages, where inputs are structured and outputs are ranked lists, have the most established tooling. Mid-funnel and downstream stages like screening, scheduling, and offers are where the technology is still maturing.

Sourcing and matching candidates

Machine learning models analyze talent pools to identify passive candidates before a role is posted. Skills-based matching has replaced keyword-based resume filtering, and platforms aggregate profiles from multiple sources into unified candidate views.

Screening and assessment

AI is commonly used to generate interview questions, filter resumes, and handle initial candidate interactions. At companies like Medtronic, those first interactions run through conversational AI in the form of text chat interfaces.

Scheduling and offer management

Automated scheduling tools coordinate availability across candidates and interviewers. AI-generated offer letters and onboarding workflows are emerging and remain less mature.

Limits of text-only AI recruiting tools

Text-based screening chat interfaces are fast, but they can contribute to candidate drop-off during screening. Candidates may disengage from hiring processes when AI-led interviews feel impersonal or unclear, and the interview stage is often where hiring funnels lose candidates.

Text-based AI moves candidates through a process but captures less of what recruiters are evaluating. Candidates adapt by gaming text screens with keyword-stuffed responses, distorting the signal the screening is designed to capture. Candidates also want nonverbal cues, such as a head nod or a facial expression, to gauge how the conversation is going. Without those cues, the experience feels unsettling.

Video agents and the AI recruiting screening problem

Interviewers pay attention to nonverbal cues during screening. Tone, expression, hesitation, and timing all factor into how a candidate's answer lands, and none of those signals come through in text alone. Video agents bring those signals into screening and run a consistent conversation with every candidate.

Reading nonverbal signals at the interview stage

Interview studies have found that nonverbal characteristics, including eye contact, head movement, and professional appearance, can shape interview outcomes. Nonverbal cues matter in employment interviews, and a text- or voice-only screen captures none of them.

The screening conversation becomes a live, face-to-face interaction in which the AI video agent interprets the candidate's tone, expression, and hesitation as a unified signal.

A candidate joins a scheduled interview. Raven-1, the multimodal perception system, fuses the candidate's vocal tone with their facial expression, catching the mismatch between confident words and nervous body language. That perception, kept no more than 300ms stale, flows to the large language model (LLM) intelligence layer, which reasons about pacing and decides to offer an encouraging prompt before the next question.

Conducting structured interviews at scale

When a candidate trails off mid-answer and restarts from a different angle, Sparrow-1, Tavus's audio-native conversational flow model, recognizes the continuation and holds the floor open. With a 55ms median floor-prediction latency, 100% precision, and zero interruptions on the benchmark, Sparrow-1 doesn't cut in with the next question when a silence threshold is crossed. It waits at the moment a human interviewer would.
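For contrast, the fixed silence threshold described above can be sketched in a few lines. This is a minimal illustration of the naive baseline only, not of how Sparrow-1 works. The 20ms frame size, the function name, and the example timings are all assumptions made for the illustration.

```python
from typing import Optional, Sequence

FRAME_MS = 20  # assumed audio frame size; hypothetical, not a Tavus parameter


def naive_end_of_turn(frames: Sequence[bool], silence_threshold_ms: int) -> Optional[int]:
    """Return the time in ms at which a fixed silence threshold declares
    the speaker finished, or None if the threshold is never crossed.

    `frames` holds one flag per frame: True means speech was detected.
    """
    frames_needed = silence_threshold_ms // FRAME_MS
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return (i + 1) * FRAME_MS
    return None


# Illustrative answer: 1000ms of speech, an 800ms pause while the candidate
# regroups, another 1000ms of speech, then 1200ms of trailing silence.
answer = [True] * 50 + [False] * 40 + [True] * 50 + [False] * 60

cut_in_at = naive_end_of_turn(answer, silence_threshold_ms=700)
patient_at = naive_end_of_turn(answer, silence_threshold_ms=1000)
```

With these made-up timings, the 700ms threshold fires at 1700ms, which is 100ms before the candidate resumes at 1800ms. Raising the threshold to 1000ms survives the pause but responds at 3800ms, a full second after the answer actually ended at 2800ms. That trade-off between interrupting and lagging is the one a fixed threshold cannot escape.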

Industrial-organizational psychology research shows that structured interviews are among the strongest selection procedures for predicting job performance. They require standardized questions asked of every candidate, scored against an established rubric. Unstructured interviews, despite being rated highest for perceived effectiveness by hiring managers, are among the worst predictors of actual on-the-job performance, as summarized in Harvard Business Review.

Running structured interviews consistently across thousands of candidates requires trained interviewers applying rubrics uniformly, with independent scoring and no inter-interviewer discussion before evaluation. That is the operational gap an AI human is built to close.

An AI human for candidate screening is designed to deliver a consistent interview flow. Every candidate gets the same question flow, the same evaluation criteria, and the same opportunity to demonstrate their thinking.
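As a rough sketch of what "same questions, same criteria" can mean in data terms, the example below fixes a question set and averages independent reviewer scores per question. The question text, the 1 to 5 scale, and every name here are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    prompt: str
    min_score: int = 1
    max_score: int = 5


# Hypothetical question set: every candidate is asked exactly these, in order.
INTERVIEW = (
    Question("availability", "Which shifts can you reliably work this season?"),
    Question("conflict", "Tell me about a time you handled an upset customer."),
)


def score_candidate(ratings: dict[str, list[int]]) -> dict[str, float]:
    """Average independent reviewer scores per question, then overall.

    `ratings` maps a question id to the scores each reviewer gave on their own.
    Raises ValueError if a question is unscored or a score is off the scale.
    """
    result: dict[str, float] = {}
    for q in INTERVIEW:
        scores = ratings.get(q.qid)
        if not scores:
            raise ValueError(f"no scores recorded for question {q.qid!r}")
        for s in scores:
            if not q.min_score <= s <= q.max_score:
                raise ValueError(f"score {s} is outside the scale for {q.qid!r}")
        result[q.qid] = sum(scores) / len(scores)
    result["overall"] = sum(result.values()) / len(INTERVIEW)
    return result


summary = score_candidate({"availability": [5, 3], "conflict": [3, 3]})
```

Rejecting incomplete or off-scale ratings up front is a design choice in this sketch: a candidate's record is only comparable to others if every question was scored on the same scale.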

For recruiting teams that need this consistency to operate inside their existing hiring stack, the requirement is infrastructure rather than a finished product. The Conversational Video Interface (CVI) exposes the API-first infrastructure that recruiting platforms build on, with configurable persona behavior, structured question flows, and real-time rubric scoring via Function Calling, which submits evaluations directly to the applicant tracking system (ATS) during the conversation.
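One way to picture rubric scoring through function calling is a tool definition plus a handler that validates the model's arguments before anything reaches the ATS. The sketch below uses the JSON Schema style that is common for LLM function calling. The tool name, its fields, and the returned record are assumptions for illustration and are not taken from the CVI documentation.

```python
import json

# Hypothetical tool definition; not the actual CVI or ATS schema.
SUBMIT_SCORE_TOOL = {
    "type": "function",
    "function": {
        "name": "submit_rubric_score",
        "description": "Record a rubric score for one interview question.",
        "parameters": {
            "type": "object",
            "properties": {
                "candidate_id": {"type": "string"},
                "question_id": {"type": "string"},
                "score": {"type": "integer", "minimum": 1, "maximum": 5},
                "evidence": {"type": "string"},
            },
            "required": ["candidate_id", "question_id", "score", "evidence"],
        },
    },
}


def handle_tool_call(name: str, arguments_json: str) -> dict:
    """Validate a tool call and build the record an ATS client would send."""
    if name != SUBMIT_SCORE_TOOL["function"]["name"]:
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    required = SUBMIT_SCORE_TOOL["function"]["parameters"]["required"]
    missing = [field for field in required if field not in args]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    score = args["score"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError("score must be an integer from 1 to 5")
    return {
        "candidate_id": args["candidate_id"],
        "question_id": args["question_id"],
        "score": score,
        "evidence": args["evidence"],
        "status": "pending_human_review",  # the recruiter still decides
    }


record = handle_tool_call(
    "submit_rubric_score",
    '{"candidate_id": "c-101", "question_id": "conflict", "score": 4, '
    '"evidence": "Described de-escalating a refund dispute."}',
)
```

Validating on the receiving side matters because the arguments are generated by a model: the schema guides the model, but only the handler can guarantee what is written to the hiring record.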

Building responsible AI recruiting

The regulatory environment for AI in hiring is moving fast, and enterprise teams deploying AI recruiting need infrastructure that keeps pace. Bias audits, candidate notifications, and disclosure requirements are already in force in some U.S. jurisdictions and are being phased in across the EU. Most teams also keep human reviewers in the loop by design, with AI handling the conversation and recruiters retaining the decision.

Bias auditing and explainability

The U.S. Equal Employment Opportunity Commission (EEOC) explicitly identifies AI-driven hiring tools as a priority enforcement area through 2028 in its Strategic Enforcement Plan. NYC's Local Law 144 requires independent bias audits annually, public disclosure of audit results, and candidate notification at least ten business days before an AI tool evaluates them. The EU AI Act phases in requirements for high-risk AI systems through 2026-2027.
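A central figure in this kind of bias audit is the impact ratio: each group's selection rate divided by the selection rate of the most-selected group. The sketch below computes it under simple assumptions. The group labels and counts are invented for illustration, and the 0.8 cutoff is the conventional four-fifths rule of thumb, not a threshold stated in this article.

```python
def impact_ratios(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Compute impact ratios from per-group (selected, total) counts.

    The selection rate is selected / total. The impact ratio is a group's
    selection rate divided by the highest selection rate of any group.
    """
    rates = {
        group: selected / total
        for group, (selected, total) in outcomes.items()
        if total > 0
    }
    if not rates:
        raise ValueError("no group has any applicants")
    best = max(rates.values())
    if best == 0:
        raise ValueError("no candidates were selected in any group")
    return {group: rate / best for group, rate in rates.items()}


# Invented counts, for illustration only.
ratios = impact_ratios({"group_a": (120, 400), "group_b": (90, 400)})

# Flag groups below the four-fifths rule of thumb for closer review.
flagged = sorted(group for group, ratio in ratios.items() if ratio < 0.8)
```

Here group_a is selected at 30% and group_b at 22.5%, so group_b's impact ratio is 0.75 and it is flagged. A flag of this kind is a prompt for investigation by the independent auditor, not a legal conclusion.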

Within the CVI platform, Objectives and Guardrails set measurable completion criteria for each screening conversation and define compliance boundaries that determine when the AI escalates to a human reviewer. In a regulated hiring context, Guardrails can enforce that the AI human stays within approved question domains, flag potential accommodation needs, and route edge cases to human recruiters before a determination is made.
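The routing logic described above can be sketched as a small decision function. This is an assumption-laden illustration, not the CVI Guardrails configuration format: the class, the domain names, and the keyword list are hypothetical, and simple substring matching stands in for whatever detection a production system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningGuardrails:
    approved_domains: frozenset  # question topics the interviewer may cover
    accommodation_terms: tuple   # phrases hinting at an accommodation need


def route_turn(rules: ScreeningGuardrails, question_domain: str, candidate_text: str) -> str:
    """Decide what happens next for one conversational turn."""
    if question_domain not in rules.approved_domains:
        return "block_question"
    lowered = candidate_text.lower()
    if any(term in lowered for term in rules.accommodation_terms):
        return "escalate_to_recruiter"
    return "continue"


# Hypothetical rules for a seasonal retail screen.
rules = ScreeningGuardrails(
    approved_domains=frozenset({"availability", "experience", "customer_service"}),
    accommodation_terms=("accommodation", "interpreter", "screen reader"),
)

normal = route_turn(rules, "availability", "I can work weekends.")
escalated = route_turn(rules, "availability", "I may need an interpreter on site.")
blocked = route_turn(rules, "family_status", "")
```

Checking the question domain before anything else reflects the priority in the text: an out-of-scope question should never be asked, regardless of what the candidate says.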

Keeping recruiters in the loop

In many hiring funnels, candidates encounter automated resume filters or text chat interfaces instead of live screening conversations. Most candidates in a 4,200-applicant queue get a form rejection without speaking to anyone. Many teams still expect human judgment to remain central even as AI handles more of the workflow.

Recruiting platforms building on CVI can configure webhook callbacks that deliver conversation transcripts, perception analysis, and rubric scores to human recruiters for review. The AI human runs the screening conversation, and recruiters review the outputs and make the hiring decision.
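On the receiving side, a webhook handler mostly needs to validate the payload and queue it for a person. The sketch below shows that step as a pure function. The field names are assumptions that mirror the three outputs named above and do not reflect the actual CVI callback schema.

```python
import json

# Hypothetical payload fields; the real callback schema may differ.
REQUIRED_FIELDS = ("conversation_id", "transcript", "perception_analysis", "rubric_scores")


def build_review_item(raw_body: bytes) -> dict:
    """Turn a webhook body into a review-queue item for a human recruiter."""
    payload = json.loads(raw_body)
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"webhook payload is missing fields: {missing}")
    return {
        "conversation_id": payload["conversation_id"],
        "transcript": payload["transcript"],
        "perception_analysis": payload["perception_analysis"],
        "rubric_scores": payload["rubric_scores"],
        # Left empty on purpose: only a recruiter fills these in.
        "decision": None,
        "decided_by": None,
    }


body = json.dumps({
    "conversation_id": "conv-42",
    "transcript": [{"role": "candidate", "text": "I can start Monday."}],
    "perception_analysis": {"summary": "calm, engaged"},
    "rubric_scores": {"availability": 5},
}).encode("utf-8")

item = build_review_item(body)
```

Leaving the decision fields empty in the queued item makes the division of labor explicit in the data: the automated side supplies evidence, and the hiring decision is recorded only when a recruiter makes it.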

AI recruiting with conversational video agents

Tavus's CVI brings four components together for AI recruiting: Sparrow-1 predicts conversational timing (when to speak, when to listen), Raven-1 fuses audio and visual signals to interpret the candidate, an LLM layer reasons about what to say next, and Phoenix-4 renders the response in real time.

On top of that stack, Memories, Knowledge Base, Guardrails, Objectives, and Function Calling shape what the AI Persona remembers across interviews, which role criteria it grounds its answers in, which topics it stays away from, which outcomes it drives toward, and which systems it can call.

The four layers run in a closed loop in real time, with combined system latency under 600ms from the moment a candidate finishes speaking to the moment the AI human starts responding.
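A quick back-of-the-envelope check shows how the figures quoted in this article relate to that 600ms target. The sketch assumes that floor prediction and knowledge retrieval sit one after another on the response path, which the article does not state, and it treats the quoted median and approximate values as fixed costs. The result is a rough illustration, not a measured breakdown.

```python
TARGET_MS = 600  # combined system latency target quoted in the article

# Figures quoted in the article. Treating them as serial, fixed costs
# on the response path is an assumption made for this sketch.
KNOWN_STAGES_MS = {
    "floor_prediction": 55,     # Sparrow-1 median floor-prediction latency
    "knowledge_retrieval": 30,  # approximate Knowledge Base retrieval time
}


def remaining_budget_ms(target_ms: int, stages_ms: dict[str, int]) -> int:
    """Return the time left for all other stages after the known ones."""
    return target_ms - sum(stages_ms.values())


left_for_reasoning_and_rendering = remaining_budget_ms(TARGET_MS, KNOWN_STAGES_MS)
```

Under those assumptions, 515ms of the 600ms would remain for the LLM's reasoning and the rendering of the response.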

Phoenix-4, the real-time facial behavior engine, draws on more than ten controllable emotional states to render active listening behavior throughout: nodding, responsive micro-expressions, and attentional cues while the candidate speaks. The model is full-duplex, keeping the AI human's listening face continuously visible, so the candidate does not see a frozen avatar mid-answer.

The Knowledge Base, a retrieval system that uses retrieval-augmented generation (RAG) with ~30ms retrieval speed, grounds the AI human's responses in the company's actual job descriptions, leveling guides, and benefits documentation. That speed, up to 15x faster than typical RAG systems, lets the AI human cite a benefits clause or salary band without breaking the conversation's rhythm. The Knowledge Base currently supports English-language documents, which is worth factoring in for global hiring teams.

With Memories, the AI human carries context across sessions, so a candidate who returns for a second-round conversation does not have to repeat what they have already shared. Memories can be tied to a candidate, a session, or a shared context, such as a hiring panel, which keeps each candidate's evolving record separate from anyone else's interview thread.
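The scoping idea can be illustrated with a small in-memory store keyed by scope and identifier. This is a hypothetical sketch of the concept, not the Memories API: the class, the method names, and the scope labels are invented for the example.

```python
from collections import defaultdict


class ScopedMemory:
    """Keep facts separated by (scope, scope_id), e.g. ("candidate", "c-101")."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], list[str]] = defaultdict(list)

    def remember(self, scope: str, scope_id: str, fact: str) -> None:
        self._facts[(scope, scope_id)].append(fact)

    def recall(self, scope: str, scope_id: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored record.
        return list(self._facts.get((scope, scope_id), []))


memory = ScopedMemory()

# First-round conversation with one candidate.
memory.remember("candidate", "c-101", "Prefers evening shifts.")
# Context shared by a hiring panel, stored under its own scope.
memory.remember("panel", "seasonal-2026", "Priority is weekend coverage.")

second_round_context = memory.recall("candidate", "c-101")
other_candidate_context = memory.recall("candidate", "c-202")
```

Because the key includes both the scope and the identifier, a returning candidate's context comes back intact while another candidate's lookup returns nothing, which is the separation the paragraph above describes.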

The future of AI recruiting

Every candidate who joins a screening call wants the same thing: to feel seen, heard, and given a fair shot. The recruiter who once gave them fifteen minutes of focused attention is where that feeling has always come from.

In high-volume hiring, that fifteen minutes has been reserved for the few who make it through the filters. AI humans can give it back to every candidate who applies, which is where employer brand and funnel retention begin. That is presence at scale.

See it for yourself. Book a demo.

Frequently asked questions about AI recruiting

What is AI recruiting?

AI recruiting is the application of artificial intelligence across the hiring pipeline, from sourcing and matching candidates to screening, interviewing, scheduling, and offer management. It includes machine learning-powered resume scoring and conversational AI that conducts live screening interviews through text chat interfaces, voice agents, or real-time video agents, each of which captures a different level of candidate signal.

Can AI recruiting reduce hiring bias?

Structured interviews have been shown to reduce racial bias when evaluators are trained using scoring rubrics. The risk is that AI systems can also inherit and amplify biases from historical hiring data. Responsible deployment requires independent bias audits, ongoing monitoring, and clear escalation paths to human reviewers.

Will AI replace human recruiters?

Many recruiting deployments use AI for initial screening, qualification checks, and scheduling, which are often interactions candidates already experience through text chat interfaces, automated forms, or no live conversation at all. The pattern most recruiting teams describe is AI extending what a human recruiter can reach, not replacing the recruiter.

What's the difference between AI recruiting and recruiting automation?

Recruiting automation follows predefined rules: if a resume contains certain keywords, move it to the next stage. AI recruiting applies machine learning and NLP to interpret unstructured information, learn from patterns, and make contextual recommendations. An AI human conducting a live screening interview perceives tone, expression, and hesitation as an integrated signal, then adapts its follow-up questions in real time.
