Have you ever listened back to a recording of yourself and thought, that doesn’t sound like me?
Maybe it was a voicemail, a video call, or a voice note you sent without thinking. And when you heard it played back, something felt off. You found yourself wondering whether that’s really what you sound like to everyone else.
Now imagine hearing that voice for the first time. In an interview, with no context, no history, and no relationship to fall back on. Just your voice, landing in a room full of judgment.
That’s what your interviewer hears.
We spend so much time preparing what we’re going to say. We rehearse answers. We refine our stories. We memorise accomplishments and practise our closing lines. And all of that matters. But something most of us never stop to consider is how we say those things.
And this might matter more than we think it does.
Research by Albert Mehrabian, Professor of Psychology at UCLA, is often cited but rarely understood in its full context. He found that when there is inconsistency between what someone says and how they say it, people overwhelmingly trust the non-verbal signals. Specifically, 38% of how such a mixed message is received comes down to vocal qualities (tone, pitch, pace, and volume), while only 7% is attributed to the actual words. The remaining 55% belongs to visual cues: body language, facial expressions, and posture.
Nearly 40% of how you are being judged in an interview is not about the words you choose. It’s about how your voice carries them.
The signals you don’t know you’re sending
Here is something I’ve noticed after sitting across from thousands of candidates over the years.
The ones who don’t land aren’t usually the ones who gave the wrong answers. They’re the ones whose voices quietly undermined the right ones. It shows up in ways most people never think about.
Speed. When nerves hit, pace accelerates. Words start tumbling into one another, sentences lose their shape, and the interviewer stops absorbing what’s being said because they’re working too hard just to keep up. Research from the University of Michigan’s Institute for Social Research found that people who speak at a moderate pace, roughly 3.5 words per second, are perceived as significantly more credible and persuasive than those who rush. Too fast reads as anxious, too slow reads as unsure. The middle ground reads as someone who trusts what they’re saying enough to let the words breathe.
Filler words. “Um.” “Like.” “You know.” “Sort of.” We all use them. But in an interview setting, they accumulate. A study conducted by Quantified Communications, a firm specialising in leadership communication analytics, found that the average professional uses approximately one filler word for every minute of speech. In high-pressure settings, that number climbs sharply. And the impact is measurable. Their research showed that reducing filler words by even a modest amount can increase a speaker’s perceived competence by up to 28%. That’s not a small margin. That’s the difference between sounding like someone who knows their field and someone who might.
Uptalk. This is the rising intonation at the end of a statement that makes it sound like a question. “I led a team of twelve?” “We delivered the project under budget?” It’s subtle, but interviewers register it immediately. A study published in PLOS ONE found that speakers who used uptalk were consistently rated as less confident, less authoritative, and less hirable, even when the content of their answers was identical to those delivered with downward, declarative intonation.
You could be saying exactly the right thing and still sounding like you’re asking for permission to say it.
Why this matters more now than ever before
The way we interview has fundamentally changed. Since 2020, video and phone interviews have become the norm rather than the exception. A report from Indeed found that 82% of employers now use virtual interviews as part of their hiring process. Many use them as the primary screening method.
What that means, practically, is that for a significant number of candidates, the first impression isn’t made in a room. It’s made through a screen. Or worse, through a phone line where there’s no body language to lean on at all. In those settings, your voice isn’t just part of the equation. It is the equation.
And yet, almost nobody prepares for it.
Candidates will spend hours researching a company, tailoring a CV, choosing the right outfit for a video call, and then give virtually no thought to the instrument that carries every single one of their answers into the room.
The voice-confidence loop
Your voice doesn’t just reflect your confidence. It actively shapes it.
Research from Harvard Business School, building on the work of social psychologist Amy Cuddy, explored how physical and vocal behaviours create feedback loops that influence internal states. When you speak slowly and deliberately, your nervous system begins to settle. When you drop your pitch slightly and allow pauses between thoughts, your brain interprets those cues as signals of safety. You start to feel more in control, not because the nerves have disappeared, but because your voice is telling your body that you’ve got this.
The opposite is equally true. When your voice races and climbs, your body follows. Adrenaline spikes. Thoughts scatter. And the interviewer watches it happen in real time, usually without understanding exactly what they’re seeing, only knowing that something feels slightly off.
This isn’t about faking a voice that isn’t yours. It’s about understanding that the voice you bring into a high-pressure room is often not your real voice at all. It’s your stress voice. And it doesn’t represent you well.
So what can you actually do about this?
Record yourself. Pick one of your prepared interview answers. Say it out loud. Record it on your phone. And then play it back. Do not judge yourself; just listen. Notice where you speed up. Notice where the filler words cluster. Notice whether your statements land like statements or float upward into questions.
Then do it again. Slower. With pauses where the full stops should be.
You will feel the difference before you hear it.
Breathe before you speak, not while you speak. One of the simplest and most overlooked techniques in interview preparation is learning to take a breath before answering a question rather than rushing to fill the silence. That pause, which feels like an eternity to you, reads as thoughtfulness to the person across the table. Research on perceived leadership presence consistently shows that individuals who pause before responding are rated as more senior, more measured, and more trustworthy than those who answer immediately.
Warm up your voice. This sounds strange until you consider that singers, actors, and broadcasters all do it before they perform. Your voice is a physical instrument. It benefits from being warmed up. Five minutes of humming, reading aloud, or simply having a conversation with someone before your interview can make a remarkable difference to your vocal range, clarity, and steadiness. You wouldn’t start a marathon without stretching, so please don’t walk into an interview with a cold voice.
A word for the people on the other side of the table
If you’re a hiring manager reading this, I’d ask you to consider something. A nervous voice is not a weak voice. A candidate who speaks quickly in the first three minutes may well be someone who communicates with extraordinary clarity once they feel safe. A person whose tone wobbles at the start might be the steadiest presence on your team six months from now.
The voice you hear in an interview is a stress response, not a character trait.
And if we’re honest about the data, and 38% of how we assess someone’s credibility really is driven by vocal tone, then we owe it to ourselves and to the people sitting across from us to interrogate that instinct rather than simply trust it. The best hire you ever make might be the one who didn’t sound perfect in the first five minutes but had everything you needed underneath.


