How to Tell If a Candidate Used ChatGPT in a Video Interview

Something felt off, but the recruiter couldn't say what. The candidate answered every question fluently — structured, confident, almost rehearsed. But when pushed to elaborate on one detail, they looped back to the same talking points. No new information came out. The answer was complete, but the thinking wasn't there.

That's what AI-assisted interview fraud looks like in practice. It doesn't look like cheating. That's the entire point.

Why AI assistance is hard to spot

Traditional cheating left obvious evidence — notes in frame, eyes darting to the side, an earpiece with an audible voice. AI assistance is subtler. The candidate is on camera, answering in their own voice, looking at the lens. What's happening is off-screen.

A tool like ChatGPT, running on a second monitor, a phone propped nearby, or a hidden browser tab, generates the answer. The candidate reads it aloud, rephrasing slightly. To a human reviewer watching in real time, it can look like composed, well-organized thinking, especially if the candidate is practiced at it.

The behavioral fingerprints

1. Structural perfection across every answer

AI-generated responses tend to follow a predictable pattern: a brief framing statement, three supporting points, a neat summary. Human answers to complex questions are messier — they start in the middle, self-correct, build toward a conclusion. When every answer across a 40-minute interview follows the same three-part structure regardless of the question's complexity, that's a signal worth noting.

2. Lateral eye movement with a reading rhythm

Genuine thinking often involves looking away: up and to one side, or briefly at the ceiling. People externalize cognitive effort. Reading involves a different pattern: the eyes track left-to-right at a consistent rhythm, at a fixed focal distance, then snap back to the start of the next line. If a candidate's gaze moves laterally during the pause before answering, particularly if the movement has a steady cadence and keeps restarting from the same side, they may be reading, not retrieving.
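As a rough illustration of the reading pattern described above, here is a minimal sketch that counts left-to-right sweeps ending in a sharp jump back to the left. It assumes a gaze tracker has already produced horizontal gaze positions normalized to 0..1 at a fixed sampling rate. The function name and both thresholds are hypothetical and would need tuning against real recordings.

```python
def count_reading_sweeps(xs, min_run=4, return_jump=0.2):
    """Count rightward runs of at least min_run samples that end in a large leftward jump.

    xs: horizontal gaze positions, normalized 0..1 (assumed input format).
    min_run, return_jump: illustrative thresholds, not calibrated values.
    """
    sweeps = 0
    run = 1  # length of the current rightward run, in samples
    for prev, cur in zip(xs, xs[1:]):
        if cur > prev:
            run += 1
        else:
            # A long steady sweep followed by a big jump left looks like a line return.
            if prev - cur >= return_jump and run >= min_run:
                sweeps += 1
            run = 1
    return sweeps


reading_like = [0.2, 0.3, 0.4, 0.5, 0.6, 0.2, 0.3, 0.4, 0.5, 0.6, 0.2]
wandering = [0.5, 0.4, 0.6, 0.55, 0.3, 0.7]
print(count_reading_sweeps(reading_like), count_reading_sweeps(wandering))
```

A high sweep count is only a prompt to rewatch the moment. Candidates glance at notes, a second camera feed, or their own video tile for innocent reasons.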

3. Near-zero latency before structured responses

A genuine expert takes a moment before answering a complex question — not because they don't know the material, but because they're choosing how to frame it. When a candidate receives a difficult question and begins responding at full pace within one or two seconds, with immediate structure and no false starts, that timing anomaly is worth flagging. The answer didn't need to be constructed; it arrived ready-formed.
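The timing check above can be sketched in a few lines. This assumes each question-and-answer turn has already been timestamped and that someone has marked which questions were difficult. The Turn fields and the two-second threshold are illustrative assumptions, not an established standard.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    question_end: float  # seconds into the recording when the interviewer stops speaking
    answer_start: float  # seconds into the recording when the candidate starts speaking
    difficult: bool      # hypothetical label, e.g. assigned by the interviewer


def fast_hard_answers(turns, max_latency=2.0):
    """Return (index, latency) for difficult questions answered within max_latency seconds."""
    flagged = []
    for i, turn in enumerate(turns):
        latency = turn.answer_start - turn.question_end
        if turn.difficult and latency <= max_latency:
            flagged.append((i, round(latency, 2)))
    return flagged


turns = [
    Turn(question_end=10.0, answer_start=11.2, difficult=True),
    Turn(question_end=60.0, answer_start=64.5, difficult=True),
    Turn(question_end=120.0, answer_start=120.8, difficult=False),
]
print(fast_hard_answers(turns))
```

Only the first turn is flagged here: the second took over four seconds, and the third was quick but not a difficult question.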

4. Inability to elaborate when asked

Ask a follow-up: "Can you say more about the second point you mentioned?" A candidate who genuinely knows the material will expand in a new direction — they'll add an example, a counterpoint, a caveat from experience. A candidate who read an AI-generated answer will often restate the same points, possibly more briefly. The depth doesn't increase because the depth was never there.

The tell isn't the answer itself. It's what happens when the answer runs out. Genuine expertise has more behind it. AI-generated expertise ends exactly where the generated text ended.
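One crude way to put a number on "the depth doesn't increase" is to measure how much of a follow-up answer is new vocabulary. The sketch below assumes transcripts of both answers are available. The novelty function is a hypothetical helper, and plain word overlap is a stand-in for a real semantic comparison.

```python
import re


def novelty(original, follow_up):
    """Fraction of distinct words in follow_up that do not appear in original."""
    def words(text):
        return set(re.findall(r"[a-z']+", text.lower()))

    first, second = words(original), words(follow_up)
    if not second:
        return 0.0
    return len(second - first) / len(second)


original = "we cached the results and reduced latency"
restated = "we cached the results"
expanded = "the cache broke during a deploy"
print(novelty(original, restated), novelty(original, expanded))
```

A restated answer scores near zero, while an answer that moves into a new example or caveat scores much higher. Short answers make the ratio noisy, so it is best read alongside the video rather than on its own.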

5. Keyboard or typing sounds in the audio

If a candidate is typing during their own verbal responses — not taking notes, but entering text into something — the audio track will often contain it. Soft keystrokes, the faint tap of a touchscreen, or the click of a phone placed near a microphone are subtle but detectable. In an interview context, typing during a verbal answer is anomalous behavior that deserves a second look.
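Assuming an audio pipeline has already produced keystroke timestamps and the intervals in which the candidate is speaking (both hypothetical inputs here), the overlap check itself is simple:

```python
def keystrokes_during_speech(keystroke_times, speech_intervals):
    """Return keystroke timestamps (seconds) that fall inside any (start, end) speech interval."""
    return [
        t
        for t in keystroke_times
        if any(start <= t <= end for start, end in speech_intervals)
    ]


keystrokes = [5.0, 12.3, 40.1]
candidate_speech = [(10.0, 20.0), (35.0, 45.0)]
print(keystrokes_during_speech(keystrokes, candidate_speech))
```

The keystroke at 5.0 seconds is ignored because the candidate was not speaking at that point. Typing while listening is ordinary note-taking, and typing mid-answer is the part that deserves a second look.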

Why manual review doesn't scale

A careful human reviewer watching a single recording can notice some of these patterns. But most hiring teams are working through 20, 50, or 100 interviews in a single hiring round. The signals that catch AI assistance (reading cadence in eye movement, structural consistency across all answers, response timing relative to question complexity) require the kind of focused, technical attention that degrades with volume and fatigue.

They also require something most review workflows don't have: a way to see patterns across the whole session, not just individual moments. A single three-part answer is plausibly a coincidence. Seven three-part answers in a row is a pattern. Spotting the pattern requires seeing all seven at once, compared against a baseline — not watching them one at a time over the course of a morning.
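To make the "seven in a row" idea concrete, here is a minimal sketch that finds the longest run of identically structured answers in a session. It assumes each answer has already been given a structure label by a reviewer or an upstream classifier. The label names are hypothetical.

```python
from itertools import groupby


def longest_streak(labels):
    """Return (label, length) of the longest run of identical consecutive labels."""
    best = (None, 0)
    for label, group in groupby(labels):
        length = sum(1 for _ in group)
        if length > best[1]:
            best = (label, length)
    return best


session = ["three_part"] * 7 + ["narrative"]
print(longest_streak(session))
```

The streak length only means something against a baseline, such as how long the same streak typically runs in interviews that raised no concerns.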

What changes when you analyze systematically

When an entire recording is analyzed against a consistent set of signals — gaze patterns, response timing, audio activity, structural analysis of each answer — it becomes possible to surface not just individual anomalies but a coherent picture across the session. Each finding points to a moment on video, so it can be reviewed rather than merely asserted.

That's the difference between "something felt off" and "here's what was off, at 11:47 and 22:03 and 35:29, and here's what each one looks like."
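A timestamped summary like that can be assembled from a list of findings. The sketch below is a generic illustration, not a description of how any particular product works. It assumes each finding is a (seconds, signal) pair, and the signal names are made up.

```python
def format_findings(findings):
    """Sort (seconds, signal) pairs by time and render each as 'MM:SS signal'."""
    lines = []
    for seconds, signal in sorted(findings):
        minutes, secs = divmod(int(seconds), 60)
        lines.append(f"{minutes:02d}:{secs:02d} {signal}")
    return lines


findings = [(1323, "typing during answer"), (707, "reading-like gaze"), (2129, "fast structured answer")]
for line in format_findings(findings):
    print(line)
```

Keeping the timestamp attached to every flag is what lets a reviewer jump to the moment and judge it for themselves.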

See these signals detected automatically

HireBetter analyzes every interview recording and surfaces each flag with a timestamp and reviewable clip — so you can verify it, not just trust it.
