There is little debate that Web Audio is cool. Take for example Stepkit by Brent Jackson (embedded below).
It's definitely a fun toy to play with, but most of us probably couldn't think of how this might be relevant to our jobs. When I presented 8-bit game music with the Web Audio API at last year's Fluent Conference, I readily admitted that it was intended to be purely fun rather than practical.
I recently explored how to add audio to web apps, but I think the bigger problem isn't that web developers are unsure how to add audio to their apps; it's that they don't think they should add it at all. In this article, I'd like to make the case that you should consider audio when designing your web application's user interface.
This article is based on my presentation at Fluent 2015.
Examples of Audio in UI
Once you begin to think about it, audio is actually very common in UIs. Let's look at some examples of where audio usage is common and then we'll explore why it is used.
The most obvious example is in game UI. In the example below from the Watch Dogs game, you can see that we're not just talking about music, but the interaction with menus and transitions all include audio effects.
Many, if not most, mobile apps include some level of audio effects for both interactions and notifications. When you pull to refresh, or trigger certain actions, you often get some form of audio feedback. For example, here are the audio effects settings for the Twitter app on Android that allow you to enable/disable sound effects.
Many, though by no means all, desktop applications have some level of audio-enhanced interactions or notifications. Skype may be among the first that come to mind (perhaps not entirely positively), as it plays sound effects when a connection is made, a person signs on, a message is received, etc. Slack is another one I use often that has a number of audio effects, mostly associated with notifications. Outlook, as shown below, adds sounds to a number of events that occur within the app.
The Web is (mostly) Silent
So, desktop apps, mobile apps and games all include audio within their UIs but the web is mostly silent. In fact, I suspect most developers don't even consider audio at all when designing their UI.
In part, I think this is for two reasons. The first is that we only had access to the Web Audio API relatively recently. However, browser support is now very broad.
The second reason is that I think there is a miserable legacy for audio on the web starting with MIDI and moving on to annoying Flash intros. Developers have, at times, abused the audio options available to them (case in point) and this has left many people with a bad taste in their mouths.
Why Apps Include Audio
If you think about the types of applications that include audio above, they tend to do so for a variety of reasons.
Sound Communicates Atmosphere
This is the reason that most game UIs use audio - it helps to create the atmosphere that the game is trying to represent. Looking back at the earlier example of Watch Dogs, it is conveying a slightly futuristic, perhaps slightly dystopian, atmosphere with its choice of sounds.
If you think about it another way, science fiction movies and television often depict computers that respond with all sorts of sounds when the user interacts with them. This isn't intended to reflect real life - in reality, this would likely be annoying - but the sounds are part of what makes the scene feel futuristic.
We've all seen Star Trek etc and are used to the idea of 'high tech' machines beeping, whether to alert us or simply as user feedback devices when activated. Similarly many cellphones default to beeping on user actions eg entering txt etc. But it is rare to see it used on a website, not that anyone wants to be 'beeping' on every click, but there is the potential to add to the character & feel of a site via use of sound...
- Tim Prebble
This use-case probably doesn't apply frequently to web apps. However, there are brands that have sound as a key part of their brand identity. This can be taken into account when choosing the right audio to add to your UI for other reasons, which we'll discuss below. Even if your app does not have a brand identity to adhere to, it's important to consider the kind of atmosphere your sound choices may create.
Sound Communicates When You're Distracted
It's important to keep in mind what we mean by distracted here. We don't exclusively mean when you aren't paying any attention to the application, though that can be important too. Sometimes, "distracted" simply means that you are paying attention to some other aspect of the application than the part that I need you to notice at this moment.
Being a mostly visual medium, web application UIs tend to rely entirely on visual cues to communicate important information and changes. A new news item may glow briefly as it appears on a list or slide in using some form of animation. The goal here is to get your attention on a part of the app that may have changed or contain important new information.
However, when you are not looking at the application at all - say, for instance, you're on another tab or outside of the browser entirely - we have very limited visual options to gain the user's attention.
Audio is particularly useful when there is no screen or when looking at the screen is not possible or not desirable (such as when users want to multitask).
- Karen Kaushansky
Sound Conveys Meaning
This is key, and goes beyond conveying the atmosphere we discussed earlier. The distracted use-case is probably the easiest to grasp, but using sound to convey or enhance meaning is, in my opinion, the most important when used properly. To illustrate what I mean, let's first look at an (admittedly silly) example.
The sounds in this scene were intentionally unrealistic; however, I bet you could have closed your eyes and still known, more or less, what was going on. This may seem unimportant to you until you consider the fact that, to take just one example, no one actually makes a springy noise when they jump. Yet, you hear that springy noise and you think "jump." Why? Because that sound conveys a meaning, and that meaning was learned and reinforced over time.
In much the same way, if you design your use of sound with a good deal of thought, the users of your web application can learn the meaning and importance of different sounds.
Gaver (1986) investigated representational earcons, which he called auditory icons. His auditory icons are caricatures of naturally occurring sounds such as bumps, scrapes, or even files hitting mailboxes.
- Meera M. Blattner, Denise A. Sumikawa, and Robert M. Greenberg (1989)
The quote above discusses the idea of "earcons," which are effectively sound icons. Think about what an icon is: a very simple, abstract image that conveys a lot of meaning. For instance, the much-maligned "hamburger menu" is just a few simple lines and yet we generally know that it represents some sort of menu or navigation.
Earcons are similar in that they are simple but can convey a lot of meaning. However, unlike what the above quote implies, they do not need to be realistic sounds. In fact, the meaning of an earcon does not need to be obvious the first time, as its meaning can be learned over time (though this doesn't mean you shouldn't consider your sound choices carefully).
[Earcons] are audio messages used in the user-computer interface to provide information and feedback to the user about computer entities.
- Karen Kaushansky
Thinking About Sound
As we've seen, sound can be used to achieve a number of important goals within a UI that cannot simply be done visually. However, if you choose to consider adding sound to your application, it's important to do so very carefully. Much like overusing any sort of UI element (like too many animations or too many fonts), overusing or poorly using audio can be painful to your users - perhaps even more so.
Choose Your Sounds Carefully
This rule applies to several aspects of your sounds:
- Implied meaning - while, as we discussed earlier, the meaning of sounds can be learned, some sounds come with an implied meaning - for instance, a rising pitch usually implies a rising value or dissonant sounds usually imply some form of error.
- Avoid being annoying - overuse of audio can be annoying, but so can the wrong sounds - beware of harsh sounds, especially if they will be used frequently (I'm looking at you, Slack!).
"Would the addition of sound to this event provide redundant information and therefore better feedback? Or would this sound merely be superfluous and/or annoying?"
- Victor Lombardi
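To make the idea of implied meaning concrete, here is a minimal sketch of how a rising or falling pitch, and a dissonant error cue, could be described as data and then scheduled with the Web Audio API. All of the names here (`Earcon`, `valueChangeEarcon`, `errorEarcon`, `playEarcon`, `ToneContext`) and the specific notes and durations are my own assumptions for illustration, not something taken from a library or from the examples discussed in this article.

```typescript
// A sketch only: every name and note choice here is an illustrative assumption.

// An earcon is a short series of steps; each step is a set of
// frequencies (in Hz) that sound together.
interface Earcon {
  steps: number[][];
  stepSeconds: number;
}

// Equal-temperament frequency for a MIDI note number (69 = A4 = 440 Hz).
function midiToHz(note: number): number {
  return 440 * Math.pow(2, (note - 69) / 12);
}

// Rising arpeggio for a non-negative change, falling arpeggio for a decrease.
function valueChangeEarcon(delta: number): Earcon {
  const rising = [60, 64, 67]; // C4, E4, G4 (a major triad)
  const notes = delta >= 0 ? rising : rising.slice().reverse();
  return { steps: notes.map((n) => [midiToHz(n)]), stepSeconds: 0.08 };
}

// Two notes a tritone apart (C4 and F#4), sounded together, as a dissonant error cue.
function errorEarcon(): Earcon {
  return { steps: [[midiToHz(60), midiToHz(66)]], stepSeconds: 0.25 };
}

// The small subset of the Web Audio API this sketch relies on.
interface ToneContext {
  currentTime: number;
  destination: unknown;
  createOscillator(): {
    frequency: { value: number };
    connect(target: unknown): unknown;
    start(when?: number): void;
    stop(when?: number): void;
  };
  createGain(): { gain: { value: number }; connect(target: unknown): unknown };
}

// Schedules one oscillator per frequency; returns how many were scheduled.
function playEarcon(ctx: ToneContext, earcon: Earcon, volume = 0.1): number {
  let scheduled = 0;
  earcon.steps.forEach((chord, index) => {
    const startAt = ctx.currentTime + index * earcon.stepSeconds;
    chord.forEach((hz) => {
      const osc = ctx.createOscillator();
      const gain = ctx.createGain();
      osc.frequency.value = hz;
      gain.gain.value = volume; // keep UI sounds quiet
      osc.connect(gain);
      gain.connect(ctx.destination);
      osc.start(startAt);
      osc.stop(startAt + earcon.stepSeconds);
      scheduled++;
    });
  });
  return scheduled;
}
```

In a browser, something like `playEarcon(new AudioContext(), valueChangeEarcon(1))` would be the intended use, ideally in response to a user action, since browsers commonly hold back audio until the user has interacted with the page. Whether these particular notes sound pleasant rather than annoying is exactly the kind of thing you would need to test with real users.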
Plan for Silence
You cannot guarantee that your users will always be able to hear audio, so make sure you never communicate information only via sound - have the sound enhance a visual cue. Even if the user doesn't have sound disabled, they could be using a browser that doesn't fully support the Web Audio API.
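One way to honor this rule in code is to treat the visual cue as mandatory and the sound as an optional extra. The sketch below assumes a hypothetical `deliverCue` helper and a hypothetical user preference flag (`soundEnabled`); the only browser detail it leans on is that the Web Audio API is exposed as `AudioContext`, or as the prefixed `webkitAudioContext` in some older WebKit-based browsers.

```typescript
// A sketch only: the names below are illustrative assumptions.

type AudioContextCtor = new () => unknown;

// Looks for the standard constructor, then the older WebKit-prefixed one.
// Pass `window` in a browser; returns null when neither is available.
function findAudioContextCtor(scope: any): AudioContextCtor | null {
  const ctor = scope ? scope.AudioContext || scope.webkitAudioContext : null;
  return typeof ctor === "function" ? ctor : null;
}

interface Cue {
  showVisual: () => void; // required: carries the actual information
  playSound?: () => void; // optional: only reinforces the visual cue
}

// Always shows the visual cue; adds sound only when allowed and possible.
function deliverCue(
  cue: Cue,
  soundEnabled: boolean,
  audioSupported: boolean
): "visual+sound" | "visual-only" {
  cue.showVisual();
  if (!soundEnabled || !audioSupported || !cue.playSound) {
    return "visual-only";
  }
  try {
    cue.playSound();
    return "visual+sound";
  } catch (err) {
    // A failed sound must never hide the information itself.
    return "visual-only";
  }
}
```

In a browser you might compute `audioSupported` once with `findAudioContextCtor(window) !== null` and read `soundEnabled` from wherever your app stores its settings. The important design choice is the ordering: the visual cue happens first and unconditionally, so nothing is lost when the sound cannot play.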
The following is from the Apple iOS Human Interface Guidelines, but the same rules generally apply to desktop web apps:
Users switch their devices to silent when they want to:
- Avoid being interrupted by unexpected sounds, such as phone ringtones and incoming message sounds
- Avoid hearing sounds that are the byproducts of user actions, such as keyboard or other feedback sounds, incidental sounds, or app startup sounds
- Avoid hearing game sounds that are not essential to using the game, such as sound effects and soundtracks
Hopefully I've shown that there's a case to be made for including audio in your UI. Not all apps need audio, but I think it should be considered whenever you plan the UI of a web app - even if you ultimately decide that it doesn't make sense for that particular app.
However, perhaps you are still unsure of situations where you might use audio. So, I've put together a list of some ideas I had, though it is by no means comprehensive. Some of these were discussed in my prior article, Adding Audio to Web Apps, which covers the actual implementation details. You can also find even more examples in my GitHub repository.
Here are just some of my own ideas to get your (better) ideas flowing (note: if I have an example of this in my repository, it will be linked):
- Page updates/status
- Indicate changes to a page.
- Redo/Undo - again, as a reinforcer of an action we don't ever want our users to do accidentally.
- Time limit - for instance, for a session timeout or a purchase that has a time limit (for example, tickets). This and other examples can be used in conjunction with the Page Visibility API to ensure that the warning noise only occurs when the page is not visible to the user.
- Enable or disable specific functionality - another example of reinforcing an important action.
- Multiple pushes modify a value - again, especially useful if the button and the value are not in close proximity.
- Loading/loaded state of asynchronous content - useful for long loading processes, this can reinforce a visual loading cue and can also make the user aware that a process has finished even if the visual cue is offscreen or the user is distracted.
- On error or standard notification - this is a good case where the meaning of a sound can be learned - for example an error notification will have a different sound than a purely informational notification.
- Data/email received - for notifying the user of important information being received - if a lot of these will occur, it should be combined with the Page Visibility API so that the user only hears these when the page is not visible.
- Motion/Gesture based interactions - as non-pointer-based interaction becomes more common, so do inadvertent actions and thus the importance of reinforcing certain actions with sound.
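Several of the ideas above mention combining sound with the Page Visibility API so that the user only hears an alert when the page is not visible. A minimal sketch of that gating might look like the following. The helper name `createHiddenPageAlert` is hypothetical, and the minimum gap between sounds is an extra assumption of mine (to keep a burst of events from becoming a burst of noise), not something the API provides.

```typescript
// A sketch only: the names below are illustrative assumptions.

// The one property of `document` this sketch reads (Page Visibility API).
interface VisibilitySource {
  hidden: boolean;
}

// Returns a function that plays the alert only while the page is hidden,
// and at most once per `minGapMs` milliseconds.
function createHiddenPageAlert(
  doc: VisibilitySource,
  play: () => void,
  minGapMs: number,
  now: () => number = Date.now
): () => boolean {
  let lastPlayedAt = -Infinity;
  return function alertIfHidden(): boolean {
    if (!doc.hidden) {
      return false; // the user can see the page, so a visual cue is enough
    }
    const t = now();
    if (t - lastPlayedAt < minGapMs) {
      return false; // too soon after the previous sound
    }
    lastPlayedAt = t;
    play();
    return true;
  };
}
```

In a browser you would pass the real `document` as the first argument and call the returned function whenever a notification arrives; the visual update itself should happen regardless of what this function returns.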
I'd love to hear your ideas for using sound in web applications. After my Fluent session, I heard from a number of people who gave me fantastic examples of where these very sorts of use cases came up in their apps. If you have any of your own, please feel free to share in the comments.
Header image courtesy of Iwan Gabovitch