Top-Down vs. Bottom-Up: The Battle to Understand Speech

by Kathi Mestayer

(This article also appears on the “Hearing, Health and Technology Matters” site.)

Top-down grabs wheel, runs into ditch

There are two different ways in which we make sense out of speech – “top-down” and “bottom-up.”

I first read about this in The Language Instinct by Steven Pinker, in which he describes how we take sound input and make sense of it as speech. Let’s start with the top-down system, since it’s the one we’re more aware of.

Top-down auditory processing is more, shall we say, thoughtful, using tools like context (dinner table, board meeting, classroom), expectations (past experience, person speaking), and nonverbal cues (facial expressions, body language).  It considers those factors, along with the speech sounds, and does its best to interpret what was said.

The bottom-up system, on the other hand, makes a lightning-fast best guess based on the raw sound data. Period. No consideration of context or those other complicating, time-consuming factors. As a result, bottom-up attempts can be comically wrong, like the misheard lyrics of The Battle Hymn of the Republic, “he is trampling out the vintage where the great giraffes are stored.” A thoughtful, deliberative system, like top-down, would not report those lyrics, especially if you’ve heard that song a thousand times. But bottom-up’s job is to get you an interpretation really, really fast. No editor, no proofreader… just a hip shot.

But sometimes a fast reaction is needed, because we’re busy using up top-down capacity with things like multitasking or making a tough decision.  In those cases, our brains opt for the bottom-up mode and hope for the best.

So, who makes the decision about delegating tasks to the slow road or the express lane?  Our brains do, usually without consulting us, and that’s how we end up on the receiving end of ‘great giraffes’ or ‘national pelvic’ radio.

Top-down and bottom-up, toe-to-toe

Because it has a few more milliseconds to work with, it’s natural that top-down, the more deliberative process, is correct far more often.  But not always.

Take the other night in a noisy restaurant, when my brain handed the task to top-down, assuming that it would be in a better position to tell us what was being said. I was sitting with two friends who were chatting away, when the waitress came up to me and asked, “Are you ready to order?”

“Yes,” I answered.  Then she turned and walked away.

Hmm, what just happened? I sat there for a while, puzzled.

A few minutes later, the waitress came back to our table, and said, “Are you ready to order now?”

“Didn’t you ask us that the last time you were here?”

“No, I asked if you needed a little more time.”

At that point, top-down started whirring away, figuring things out. Putting the pieces back together, I see that my top-down system took over as the head interpreter, elbowed bottom-up out of the way, and made the call based on what it expected the waitress to say as she approached the table – were we ready to order?  Nice try.

In so doing, top-down completely ignored what speech sounds were available in that noisy space (which, in fairness, were pretty garbled). Bottom-up would have given me, “did you betty the border?” And bottom-up plus top-down would probably have gotten it right. But top-down, in this case, was about as helpful as the great giraffes. Of course, bottom-up gets a kick out of this. He’s usually the one who gets things wrong. It’s not as amusing as hearing ‘canned spinach’ instead of ‘the king’s speech,’ but it is a new kind of lapse.

Good thing I’m not a control freak. Now, I have two different kinds of mistakes to look out for.  I’m just batting clean-up.  Who’s on first?


Kathi Mestayer writes for Hearing Health Magazine and Be Hear Now, and serves on the Board of the Virginia Department for the Deaf and Hard-of-Hearing. She has kindly contributed this article to Every Girl’s Dream, but it also appears on the “Hearing, Health and Technology Matters” site.

In this photo she is using her iPhone with a neckloop, audio jack, and t-coils, which connect her to FaceTime, VoiceOver, turn-by-turn navigation, stereo music and movies, and output from third-party apps, including games, audiobooks, and educational programs.

