
Why the Grammys sound amazing and sometimes go wrong

The Grammys’ audio coordinator explains why there’s no substitute for live performance


Michael Abbott has overseen audio at the Grammys for decades. It’s an extremely high-pressure position, so I expected Abbott to have the personality to match. But when I met him in the hallways of a trade show last month, I immediately noticed that he carries a cool, laissez-faire demeanor. It’s disarming, which made it easy to forget how elite his resume is — at least until our conversation started. After pleasantries were exchanged, the first thing Abbott said caught me off guard. “I have a reputation preceding me,” he chuckled. “I’ve done so many amazing things, and have been involved in some debacles as well.”

Not only has Abbott mixed and produced audio for shows like The Voice and Shark Tank, but he’s also done presidential debates and awards shows, such as the ESPYs, the Country Music Awards, and the Oscars. These are some of the most prestigious broadcasts on television, and it can be easy to forget how much work goes into what you hear at home. While millions are riveted to the visual drama and eye candy happening on-screen, it’s Abbott’s job to run a team that juggles hundreds of audio channels, microphones, and sometimes thousands of inputs across multiple stages, all to make sure that you hear every voice and instrument with clarity and precision.

As a member of the Recording Academy (and a Governor on the Chicago Chapter Board), I wanted to know what goes into making the Grammys sound great ahead of this Sunday’s show. Here’s what happens behind the scenes to make the audio for music’s biggest night run as smoothly as possible.

This interview has been edited and condensed for clarity.

How far in advance do artists plan their Grammy performances?

A lot of the performances are not defined until they almost show up onstage. All of the artists are trying to make their own statement and their own impact on it. Some acts don’t even want to tell you what they’re doing ahead of time. After artists have rehearsed, we don’t allow changes because we have to lock down what we’re doing. Making sure everyone from lighting to assistant directors to sound is on the same page is a big deal.

There are tens of millions of people watching the Grammys, and artists can sell 300,000 units after the show. It’s a whole different market model now, though. It’s all streaming, so you don’t have the same element of exposure. But it’s a career-defining moment when you get nominated for a Grammy and it’s on TV. It’s timeless for your career.

“Sam Smith came in and said, ‘Oh we want 40 strings.’”

You’re the audio coordinator for the Grammys. What does that mean?

Mixing is 5 percent of what I do. I interface with all the artists, reps, and engineers for the show. I’m kind of a fulcrum. There’s the bands and the producers telling me what they want to do, and then there’s trying to settle on what we can do.

Artists have grandiose ideas, and all of these ideas will be fluttered about, but we have to think about the capability of the room, the stage, the timing of the segment, and how long it takes to set it up. All of these are factors to be taken into consideration prior to showing up on site. Then, when you get on site, there’s still a lot of minutiae and details to get worked out. Like we can’t have cables running up and down a stage if there are set pieces coming in and out during the performance.

What’s an example of a performance where an artist had a grand vision that had to be mitigated?

Sam Smith came in and said, “Oh we want 40 strings.” There were all these string players and singers in an arc around the stage all playing to each other. I have a limited amount of time to set that up. Do we use clip-on mics that are catching the sound of the violin? Or do we put up ribbon mics that catch multiple players at once? They wanted mics up there so it looked live, even though some of it was pre-recorded.

My first inclination is that it has to be failure-proof. In this case, people are not going to be able to tell whether it's live or not. You need to pre-record those strings, which is something we would never normally do, but in this instance the complexity of the content exceeded what we could pull off, so we tried to pre-record them. Well, they also wanted them live. We ended up using the ribbon mic, which has a figure-8 polar pattern.

Another act wanted 12 background vocals, and for them all to have microphones, and they wanted them live. I said, “You know, they’re moving all over the stage and the performance is going to get marginalized. I think you need to pre-record them.” And we did. Then they showed up and said they wanted them live, and I said we can’t do it.

I’m the guy that has to make these decisions based on what I know. An artist’s rep might have a sense of what they want, but I know in execution, we can’t pull it off. I can’t get 12 vocal mics on top of all the other mics working at the same time because it’s just too complicated.

I have to mitigate expectations to a degree, but I also have to embrace them. At the end of the day, as they leave the Grammys, I want them to think they got the best of what they wanted to come across on the show. If they go away happy, I’ve done my job. Producers will send me gifts. I’ve gotten some very, very nice headphones.

Nice headphones… a lot of them?

You know those Rubbermaid bins?

Oh, come on. Filled with headphones?

Yeah. I probably have four of those. It’s stupid how many headphones I get. It’s great! I love it! I use Audio-Technicas. I like those the best, and on planes, I prefer Bose.

What would you say is the most challenging circumstance for you? Is it when you have to focus on one voice at a time?

My observation is, when you have a piano up on a stage and that's your only instrumentation (in the case of, say, Adele, where we had a problem with the piano mic), you have nowhere to go.

Now, in that case, we had both the MIDI voice module and also mics on the strings, and the strings were the problem. But musically, you’d rather hear the actual strings. There’s a difference in piano sounds from a Bösendorfer, which is like an earthquake with its low register, to a Steinway, which is traditionally kind of a dark-sounding piano, to a Yamaha which is more bright and strident, to a Fazioli. All these things have very different tones and timbres, and the live performance makes a difference.

So do you not have any redundancy for Grammy performances?

We have redundancy. In each of our systems — we have two — there is redundant playback. It triggers seamlessly because we take the timecode out of the primary machine, and we feed it to a gated input in the switch. So if the first machine stops, it automatically switches to the second machine.

That’s standard operating procedure for all our music playbacks and all our video clip playbacks. There is a large amount of failure mitigation in place because of all the failures we’ve experienced before, not only in this show but in other shows. We have methodologies, but when you combine humans and technology, it’s inevitable something will fail.
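The timecode-gated failover Abbott describes can be sketched as a simple state machine. This is an illustrative model only, not the actual broadcast switcher; the class name and the stall threshold are invented for the example.

```python
# Illustrative sketch of timecode-gated playback failover (names invented).
# The switch watches timecode from the primary playback machine; if frames
# stop advancing, it routes output to the backup machine, which is chasing
# the same timecode.

class FailoverSwitch:
    def __init__(self, stall_threshold=2):
        self.active = "primary"
        self.last_timecode = None
        self.stalled_frames = 0
        self.stall_threshold = stall_threshold  # frozen frames tolerated

    def feed_timecode(self, frame):
        """Called once per frame with the primary machine's timecode."""
        if self.active != "primary":
            return self.active
        if frame == self.last_timecode:      # timecode frozen: primary stopped
            self.stalled_frames += 1
            if self.stalled_frames >= self.stall_threshold:
                self.active = "backup"       # seamless switch to backup
        else:
            self.stalled_frames = 0
        self.last_timecode = frame
        return self.active

sw = FailoverSwitch()
sw.feed_timecode(1)   # advancing timecode: stays on "primary"
sw.feed_timecode(2)
sw.feed_timecode(2)   # frozen once: still "primary"
assert sw.feed_timecode(2) == "backup"  # frozen again: switched
```

Because the backup is chasing the same timecode, the handover point in the audio is the same on both machines, which is what makes the switch seamless.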

Given that there is going to be failure, why would you not ask everyone to provide a pre-recorded version of their set?

It would admit that there’s going to be failure, first off. There is a value in failure mitigation. For example, we have one spare microphone, and if the mic dies, the stage manager is the only one who has that microphone, and he knows when to go out and hand it to somebody.

“Isn’t this what being live is all about? It’s the train wreck.”

I was doing a show many years ago where I had Chubby Checker onstage, and he started clapping with the microphone in his hand. All of a sudden the battery went flying out the back, and two people with mics came out from stage left and two people with mics from stage right, and I was like, “Noooo!” That was a learning experience.

I think you lose the moment. Isn’t this what being live is all about? It’s the train wreck. I mean it’s the basis of reality television. The reason people watch reality television is because they want to see a meltdown. At some point, people watching a music show want to see a failure.

We had a failure with Metallica and Lady Gaga in 2017, and I got raked over the coals for that. But you know what they do now? CBS shows that clip as a promo for this year’s show. It shows Gaga and James Hetfield singing together on one mic! Now, had his mic not been disconnected by a dancer on-stage, that moment wouldn’t have happened. That’s the live moment you can’t get on a sterile, pre-recorded track.

So you know, to a degree, how things are going to run, and you have to allow for improvisational moments.

You put processes in place and hope for the best. There’s something about live television that can’t be replicated in a Netflix-streamed show or even a comedy show that they’ve recorded and edited.

Grammys audio by the numbers

4 — performance areas

4 — outdoor trucks for handling audio

9 — audio workstations total

60 — 53-foot trailers bringing equipment in and out

70 — people handling audio during the live telecast

192 — audio tracks handled in the outdoor trucks

350 — microphones used

1,800 — microphone inputs used during the week

3,300 — microphone cross patches done over the week

*Some figures estimated

It sounds like there are a lot of moving parts. How many stages are there?

There are the main left and right stages. And then there’s a center dish stage that’s very small, about 12 feet in diameter, and a passerelle downstage. So there are four performance areas in total. And all of those have multiple cables going to them during setup in part to provide scalability when we get into rehearsals. We have to be prepared when we’re building the show to make sure we have enough connectivity with all of those positions, which we do.

What’s the transition process like between performances on different stages?

First, you have to get everything off the stage that finished. That takes three to five minutes. And then you have to put things back up on the stage, which is another three to five minutes. It’s another one to two minutes to check everything once it’s set up.

So each transition takes about 10 to 15 minutes. But then we also do transitions between stages. It’s a very fast-paced show in the sense that a lot of set changes are going on. It’s heavily matrixed, and you have to be very cognizant of what else is going on at the same time so you don’t do things like inadvertently pull patches.

How many people are on your team?

I’ve got 46 people that work with me split up over nine workstations. Everything from comms to RF coordination to signal connectivity to the mix positions. On top of that, there are probably another 22 stagehands.

What are consoles?

You’ve probably seen a photo of a console before, maybe in a recording studio. It’s an expansive board with rows of sliding faders, knobs, and buttons that have a wide variety of purposes. Larger, professional boards — like the kind Abbott uses — give you direct physical control over almost any aspect of the audio, which is necessary when you’re working with a lot of individual tracks at a quick pace, like at the Grammys.

Abbott started on analog consoles, but he has moved to digital. Analog consoles tend to have limited mixing features. When used in a large-scale setting, an analog console might have to be supplemented with a bunch of outboard gear that provides effects and advanced routing options. Digital consoles have signal processing chips instead of analog circuits, which gives you increased freedom with configuration, along with robust tools for processing, automation, and more.

A console snapshot is a saved configuration of how a digital console was previously set up. It’s a template that allows you to store and recall the state of a console down to volume levels, EQs, and effect settings. Abbott saves snapshots after every show he does so he can recall all his setup work instantly, instead of having to start from scratch every time.
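A console snapshot boils down to serializing channel state so it can be recalled later. A minimal sketch, assuming a JSON file as the storage format; the field names here are illustrative and not taken from any real console.

```python
# Minimal sketch of a console "snapshot": serialize channel state to a file
# so a previous show's setup can be recalled instead of rebuilt by hand.
# Field names are illustrative, not from any real console.
import json

def save_snapshot(channels, path):
    """channels: {name: {"fader_db": float, "eq": {...}, "mute": bool}}"""
    with open(path, "w") as f:
        json.dump(channels, f, indent=2)

def recall_snapshot(path):
    with open(path) as f:
        return json.load(f)

show = {
    "podium_mic": {"fader_db": -6.0, "eq": {"hpf_hz": 100}, "mute": False},
    "piano_L":    {"fader_db": -3.5, "eq": {"hpf_hz": 40},  "mute": False},
}
save_snapshot(show, "grammys_foh.json")
assert recall_snapshot("grammys_foh.json") == show  # round-trips exactly
```

Real consoles store far more per channel (routing, dynamics, effects), but the save-and-recall workflow is the same idea.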

Where is everyone located?

There are two NEP Denali trucks outside that are 53-foot expandos. Those are the audio and video production facilities. They’re called Summit A and Summit B. Then there are two trucks where all the music mix is being done at the top of the ramp. The rest is inside the arena — front of house, on the sides, and so on.

Why are all of these trucks outside?

There’s no room. There are typically anywhere from 55 to 60 53-foot trailers bringing equipment in and out, plus all the bands’ equipment for a show that’s about three and a half hours long. There are 18-20 performances — some bands bring in all their gear, and some we get from an outside company — so we have to keep a path to get things in and out.

So how does audio get from inside the theater to the trucks?

Everything goes upstage right to an analog split system. That then gets cross patched into the front of house sound system and stage monitors. That’s a DiGiCo console platform. They utilize the OPTOCORE fiber network for all that signal connectivity. It’s analog in and then digital out to the PA speakers. And on each stage is a fallback mixer that has about 120 channels. Then the same splitter system sends the music up to two mix trucks that have 168 channels each. It’s the same basic signal, just split and shipped in the Multichannel Audio Digital Interface (MADI) protocol. Then everything is shipped by fiber to the Summit production truck, which handles all the microphones, the audience mics, the RF mics, everything.

How many channels, inputs, and outputs are you managing in total?

We have 24 MADI, 64 channel paths, 128 AES, 192 analog. That’s 192 channels in, 192 channels out. And then there are five stage boxes that have over 400 channels in and 80 channels out. That’s just one truck. I haven’t gone to the music mix. I haven’t gone to the front of house. Just one example: there are roughly 1,800 inputs that the mics are distributed to. It’s massive. That’s why you have people who have done it 15 years in a row.

Dani Deahl speaking with the Grammys’ audio coordinator Michael Abbott at NAMM.
Photo by Vlad Savov / The Verge

Is it a very small pool of people who do audio for these high-profile events?

It’s a very specialized niche, and yes, the people who are consistently doing the shows, it’s a small group. The current crop of people I work with for the Grammys have a combined 500 years of live broadcast experience.

Is that because of the level of expertise that’s needed, or is it because you can’t have any fuck-ups and you need people you trust?

“It takes a certain personality to work under stress and not fold up like a cheap beach chair.”

Well first off, there is always going to be failure. You can’t create a 100 percent no-failure situation. It’s impossible. Like I said, any time there’s a human and technology involved, there’s going to be a potential for failure no matter what.

Everybody knows their place. They all know the rundown, the timeline, the schedule, and what’s expected. They show up on time. Of course, their skill sets are essential for a show of this magnitude. But it takes a certain personality to work under stress and not fold up like a cheap beach chair.

One of the things I do is tell people, I want you to put your hand on the console for 30 seconds. When you raise it off, if it’s wet, you shouldn’t be doing this. You’re not in the right state of mind to do this because you’re stressed. When you’re mixing a live show, you’re in a zen state of mind, and you know what you need to do. It’s all very structured and disciplined.

What does it mean?

MADI protocol — MADI (Multichannel Audio Digital Interface) allows you to send up to 64 channels of audio through a fiber optic or coaxial cable over long distances. It supports audio formats up to 24-bit/192kHz without lossy compression. This gives you a simple and robust data stream that can be handled with a single cable.
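Since a single MADI link tops out at 64 channels, bigger channel counts get split across multiple links, which is where figures like the "24 MADI" mentioned in the interview come from. A quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope MADI capacity math: one MADI link carries up to
# 64 audio channels, so larger counts are split across multiple links.
import math

MADI_CHANNELS_PER_LINK = 64

def madi_links_needed(channel_count):
    """Minimum number of MADI links to carry a given channel count."""
    return math.ceil(channel_count / MADI_CHANNELS_PER_LINK)

# 24 links, as in the production truck, can carry up to 1,536 channels:
print(24 * MADI_CHANNELS_PER_LINK)   # 1536
print(madi_links_needed(192))        # 3 links cover 192 analog channels
```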

Dante — Dante is a combination of software, hardware, and network protocols that delivers uncompressed, multichannel, digital audio distribution over a standard Ethernet network with almost no latency. It allows for audio, control, and other data to not just exist on the same network, but be instantly configured and routed to different devices.

Redundancy — Redundancy means keeping backups of key instruments and pieces of equipment on hand to ensure a live performance can continue in case something goes awry. Much of this is now automated, so if something fails, the backup can very easily take over.

Patching — Audio patch bays are essentially switchboards for quickly and easily directing signal flow. A patch is a unique point in the patch bay where an engineer can decide how to route inputs and outputs. Cross patches can connect disparate audio interfaces, consoles from different systems, and radio frequency transmissions.
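Conceptually, a cross patch is just an entry in a routing table mapping an input point to one or more output points. A toy model (all the point names below are invented for illustration):

```python
# Toy model of a patch bay: a cross patch is an entry in a routing table
# mapping an input point to output points. Point names are illustrative.

class PatchBay:
    def __init__(self):
        self.routes = {}            # input point -> set of output points

    def cross_patch(self, src, dst):
        self.routes.setdefault(src, set()).add(dst)

    def outputs_for(self, src):
        return self.routes.get(src, set())

bay = PatchBay()
# One stage split channel feeding front of house, monitors, and a mix truck,
# mirroring the analog split system described earlier in the interview:
bay.cross_patch("stage_split_ch12", "foh_console_ch12")
bay.cross_patch("stage_split_ch12", "monitor_console_ch12")
bay.cross_patch("stage_split_ch12", "mix_truck_madi1_ch12")
assert len(bay.outputs_for("stage_split_ch12")) == 3
```

Multiply this by the article's 3,300 cross patches over the week and the need for careful bookkeeping becomes obvious.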

Ribbon microphone — A ribbon mic is a type of dynamic microphone that has a thin strip of metal suspended in a magnetic field. They are bidirectional (giving them a figure 8 polar pattern), which means they can pick up sound well from the front and the back, but not so well from the sides. Abbott used these with Sam Smith between his background performers to catch multiple voices and instruments using fewer microphones.
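The ideal bidirectional pattern follows gain = cos(θ), where θ is the angle off the front axis: full sensitivity at the front and rear, a null at the sides. A minimal sketch of that ideal pattern (not a model of any specific microphone):

```python
# The figure-8 pickup of an ideal bidirectional (ribbon) mic follows
# gain = cos(theta): full sensitivity on-axis front (0 deg) and rear
# (180 deg, with inverted polarity), and a null at the sides (90 deg).
import math

def figure8_gain(theta_degrees):
    """Relative sensitivity at an angle from the front axis."""
    return math.cos(math.radians(theta_degrees))

assert abs(figure8_gain(0) - 1.0) < 1e-9    # front: full pickup
assert abs(figure8_gain(90)) < 1e-9         # side: null
assert abs(figure8_gain(180) + 1.0) < 1e-9  # rear: full pickup, inverted
```

The side nulls are why one ribbon mic placed between two rows of players can cover both rows while rejecting sound arriving from elsewhere on the stage.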

How has the technology changed for doing audio for live shows?

When I was working for a network in the ’80s, a lot of this technology didn’t exist. We were using analog consoles. We started using digital consoles around 1994, but they were still connected by analog paths. In 2011, we moved to digital consoles connected by a fiber ring array.

The expectations are that you can do things quicker because of the automation and digital capabilities. With a lot of the shows I did, we used to get an “ESU” day, which is “everybody sets up” or “everyone shows up” for eight hours. Then we’d do a run-through and go home. And now we come in at 2 o’clock in the afternoon and start rehearsals immediately because they know that we can program the consoles.

Depending on the type of show, I sometimes take the initiative to come in with a basic template. Like with the ESPY awards, I’ll come in and recall the snapshot from last year and start building from there. I’m doing the music, I’m doing the production mix, I’m doing the recording. It’s a fairly complex show, and the amount of time it would take me to build it again would be... all day. But then sometimes there are problems where you come back a year later and question the decisions you made. You have to backwards-interrogate to see what you’re doing and make sure you’re not leaving a mine for yourself later on.

I use spreadsheets extensively and databases. I’ve got databases for plans going back 30 years. I have hard copies, but they’re also on my computer with all the specs, all the mics used. It’s empirical data that helps me anticipate needs.

What are your favorite speakers?

I’ve been doing this since the ‘70s, and I have specific companies that are part of my audio DNA. I call it audio DNA because these are the tools that I’ve used for years and years. I’m a big fan of JBL. I have JBL in my home. I listen to JBL as much as I can because I know what it sounds like, and the great thing about JBL is they developed speaker systems that all utilize the same type of high-frequency horn. They have such a smooth response. The physics of horns are very complicated. The sweetness and the air that comes from the JBL horns, it’s the best.

The Grammys are generally held in the same space every year. Do you still have to tune the speakers every time?

The sound system is about the same every year, due to weight restrictions, set elements, and everything else. But this year, we’re using a new type of JBL speaker, the A12. Last year, we used the VTX 25. So the processors require different settings, and we’re better off starting from scratch.

But the procedures are the same. We set up a variety of mics to measure reverberation in the room. You set mics in about 10 different locations in the venue and do measurements, so the front row has the same acoustical energy as 10 rows back, as 20 rows back, and so on. The goal is to have a smooth response so every seat’s covered.

There’s a lot of challenge in the room because there’s a hard surface at the back. We try to steer the energy away from that, into just the seats. In the VIP boxes around the perimeter, there’s glass, and you get some artifacts from bounce and reflection at high frequencies. Any big room is a challenge. The low frequencies bounce around and high frequencies reflect pretty heavily. There are a lot of trade-offs with how we lay things out. We try to place speakers where we get the least amount of artifacts.

So yeah, in a perfect world, if the front of house console snapshot from two years ago is called back up, it should be 90 percent of the way there. Down to the EQ from the podium mics, handheld mics, and so on. To that degree.

How long does it take to set up the system?

We go in on Monday to start loading in. We continue on Tuesday into Wednesday. Then we go into rehearsals for three days, and the show is Sunday.

What is tuning an audio system?

The goal of tuning a sound system is to have the same quality of sound no matter where you are within a venue — essentially, trying to match the acoustic characteristics of a space with the speakers you are using. Ideally, you want to get as close as possible to having the same levels, frequency response, and clarity for every seat.

Certain things that can be changed to affect sound include location and direction of speakers, levels, driver polarity, phase and delay, and filter / EQ.

Audio engineers often use software like SMAART to help them analyze the frequency responses throughout a venue and make changes accordingly. Often, pink noise is blasted through speakers, which is then picked up by reference mics set up at different spots within a venue and measured using the software to make adjustments. Pink noise is used instead of white noise because it’s closer to how the human ear and brain perceive sound, which is in terms of octaves, not absolute frequency.
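The "equal energy per octave" property of pink noise falls out of shaping a flat (white) spectrum so power falls off as 1/f. A synthesis sketch, assuming NumPy; this is a toy to illustrate the spectrum shape, not how Smaart works internally.

```python
# Sketch of why pink noise suits room tuning: scaling a flat spectrum's
# amplitude by 1/sqrt(f) gives power ~ 1/f, i.e. equal energy per octave,
# matching how the ear perceives sound. Assumes NumPy; a synthesis toy.
import numpy as np

def pink_noise(n_samples, rng=np.random.default_rng(0)):
    bins = np.arange(1, n_samples // 2 + 1)
    mag = 1.0 / np.sqrt(bins)                     # amplitude ~ 1/sqrt(f)
    phase = rng.uniform(0, 2 * np.pi, bins.size)  # random phases
    spectrum = np.concatenate(([0], mag * np.exp(1j * phase)))
    return np.fft.irfft(spectrum, n=n_samples)

x = pink_noise(2 ** 16)
spec = np.abs(np.fft.rfft(x)) ** 2
# Energy in each octave band [f, 2f) is roughly constant:
octaves = [spec[f:2 * f].sum() for f in (64, 128, 256, 512)]
assert max(octaves) / min(octaves) < 1.1
```

White noise, by contrast, has equal energy per hertz, so each successive octave (which spans twice as many hertz as the last) carries twice the energy, and it sounds hissy and treble-heavy.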

Can you give me some raw numbers for the Grammys? How many mics, how many speakers, how many instruments?

Microphones vary from show to show. I think on this one, it’s in the neighborhood of about 350. For the microphones, there are 3,300 cross patches of inputs done over a five-day period. We use roughly 48 RF microphones, and then 24 transmitters with 75 receiver belt packs.

There are several PA system clusters, 16 deep of JBL A12s. There are four main front clusters, then there are four delay clusters — to make up for the sound traveling to the back of the house — and then there are some ring satellite speakers up high around the top seating area.

Then there are probably another two flown clusters of dual 18-inch subwoofers. Probably eight front fill speakers across the front. For the stage there are 75 stage monitors for the artist to hear their voices and instruments.

Are you using the theater’s existing sound system?

We don’t use the theater’s existing system because it’s not arrayed where our seats are. The speaker system they have in there is for when they have basketball and hockey, and we seat people on the floor. The way their speakers are physically hung doesn’t fit our seating plan. We have to bring in everything new. Their speakers are flown high enough that we’re not impacted by them being in the way.

What technologies are coming down the road for live broadcast audio?

The next challenge is going to be Dante. The producers won’t pay for the price point needed to do the transition and / or have the manpower.

Michael Abbott backstage at The Grammys.
Photo: The Recording Academy

What is Dante?

Well, we started off sending audio with copper wires that we termed as analog. It’s warm and fuzzy. You can feel it. It’s hot. Now, we have fiber, which is 12 pieces of glass with light shooting through it that can handle a myriad of signals at once. And now we’re moving to IP addressing, which is what Dante is.

We’re in the networking stage of connectivity through signals. And that is a whole level of management of skill sets that you have to have in place on top of the existing firmament that we use. My producers don’t want to pay for those people. I have a hard time getting them to understand that we have to use digital RF microphones as opposed to analog because of the Federal Communications Commission reallocating the RF spectrum by selling off the bandwidth that we use.

What does the FCC have to do with the type of equipment you use onstage?

The FCC regulates the use of RF. Anything that’s broadcast over the airwaves, they regulate. They’ve auctioned off bandwidth to the telephone and communications companies, which diminishes the frequencies that we can use for our microphones. I think it was two years ago, it first hit the brick wall where we realized we can’t run as much RF simultaneously as we’d like to because the frequencies aren’t available.

It just continues. They put deadlines on when certain bandwidths are going to be cut off. I believe 600 MHz is dead to us. It’s been allocated to public utilities for wireless broadband services. 4G and 5G networks require bandwidth to get these devices to work. So the FCC is allocating these frequency bands to be auctioned off for billions of dollars.

Do you think that all of this effort is actually worth it? You would, by the sound of it, save so many headaches if you were going to pre-record all this stuff.

Music is tribal. From the Stone Age to African drums to Roman amphitheaters that were built so you didn’t need amplification to Gothic cathedrals where the Gregorian chants were long and sustained because of the amount of reverberation in the room to this day and age where people still connect with all of those same things. If we pre-recorded it, and you played it back as an album, there’d be no soul. It wouldn’t appeal to people.

If you just play a track and the artist sings to it, there’s no spontaneity. There’s no emotion. It’s milquetoast. A live performance is so unpredictable from night to night to night. You’re always going to have nuanced differences, and it’s never going to be the same.

There’s nothing like the visceral thrill of mixing live music. When I used to work live tours with almost 200,000 people, it was one of the most invigorating things to hear all these people clap and applaud at the end of the song. Thousands of people all dancing and bouncing at the same time is a tribal experience. You can only do that with live music.

Disclosure: Dani Deahl is a voting member of the Recording Academy and has an unpaid position as Governor on the Recording Academy’s Chicago Chapter Board.

Update February 10th, 11:58AM: Adjusted some figures on equipment used during rehearsals and live broadcast of the Grammys.