A path with choice: What we learned from designing an inclusive audio guide

Desi Gonzalez, The Andy Warhol Museum, USA


How can museums design inclusive experiences that respond to real user needs, are sustainable, and fit into the museum’s larger digital strategies and workflows? In the fall of 2016, The Andy Warhol Museum launched Out Loud, an inclusive audio guide that is available for all visitors but designed especially with visually impaired audiences in mind. At the heart of the process was a user-centered design and development approach that went beyond just the technology and instead touched upon every point of a visitor’s journey.

Keywords: audio guide, accessibility, inclusive design, mobile, user research

Pop art is for everyone. I don’t think art should only be for the select few, I think it should be for the mass of American people. —Andy Warhol, from Mike Wrenn, Andy Warhol: In His Own Words, 15


In October 2016, The Andy Warhol Museum in Pittsburgh launched Out Loud, an inclusive audio guide designed for all visitors, but with a particular focus on visitors who are blind or have low vision. Out Loud’s release was the culmination of an extensive eight-month user-centered design process in which we worked closely and iteratively with Pittsburgh community members who have visual impairments. The result is an iOS-based audio guide that uses Bluetooth low-energy beacons to push out content based on location; implements a “smart” audio player, breaking audio stops into modular chunks of content that are dynamically reordered based on a user’s needs or preferences; and provides a mix of voices and tones, interweaving interpretive content with content designed for accessibility. This paper outlines the process of designing and developing an audio guide that serves visitors across abilities and suggests best practices for designing inclusive mobile applications and audio guides for museum contexts. Always keeping our museum’s namesake and focal point in mind, The Warhol team believes that Out Loud gets us closer to achieving Warhol’s wish to make Pop art for everyone.

About The Andy Warhol Museum

The Warhol is the largest single-artist museum in the United States. Its permanent collection galleries present the life and art of Andy Warhol chronologically from top to bottom: the seventh floor displays works and ephemera from his birth through the early Pop years; the sixth floor represents the 1960s; and so on. The Warhol also presents contemporary artists and performers whose work demonstrates the legacy of, or is in dialogue with, that of Warhol.

One of the four Carnegie Museums of Pittsburgh (CMP), The Warhol is a member of a consortium of cultural institutions. Each museum has a unique identity and structure but shares resources and ideas with its sister institutions.

Accessibility and inclusion at The Warhol

Central to the development of the app is the spirit that designing for inclusion makes a better experience for all. Out Loud is one of a larger series of accessibility initiatives that The Warhol has undertaken over the last several years. The audio guide was preceded by a pilot project developed in the summer of 2014 in conjunction with the special exhibition Halston and Warhol: Silver and Suede; the resulting proof-of-concept iOS app used Bluetooth beacons to deliver visual descriptions based on location. Other recent accessibility initiatives at The Warhol include a series of sensory-friendly events for teens and adults, an FM assistive listening system in the museum’s theater, and tactile reproductions featuring raised line and textured reproductions of key collection works. When possible, the Out Loud team endeavored to integrate these other accessibility initiatives with the audio guide—for example, the tactile reproductions are accompanied by narrated audio that guides visitors through a work’s composition, and we use the audio guide to inform visitors about accessible offerings such as the theater assistive listening system.

A man and a woman stand in the galleries of the Warhol, sharing an audio guide device. The man on the left holds a white cane in one hand and leans over to touch a white tactile reproduction with his other hand.
Figure 1: visitors use Out Loud while experiencing the tactile reproductions installed on the 7th floor of The Warhol

Background research

When you think about a visual art museum, you don’t necessarily think about a blind person coming into this venue. —User research interview, 2016

As one of our user-experts shared during an interview session, members of the visually impaired community often feel that a visual arts museum—emphasis on the visual—isn’t a space for them. However, several museums and nonprofit organizations have long worked to make art and material culture accessible to people who are blind or low vision.

One of the more common offerings is the visual description tour (alternatively called an “audio description” or “verbal description” tour), which can either be an in-person, docent-led pre-scheduled tour or a mobile experience for independent visitor use. A few museums offer visual description mobile and audio guide tours, including the Museum of Modern Art, the National September 11 Memorial & Museum, the San Francisco Museum of Modern Art, and the J. Paul Getty Museum. In virtually all cases, the description tours are siloed experiences, separate from other interpretive audio content the museum may offer. The American Council of the Blind’s Audio Description Project maintains a comprehensive, crowd-sourced list of both docent-led and mobile audio description tours at museums, parks, and exhibits (http://www.acb.org/adp/museums.html).

In recent years, a handful of cultural institutions have taken on more ambitious mobile projects that make collection objects accessible to visitors with visual impairments. Launched in 2012, Access American Stories is an English/Spanish bilingual crowd-sourced audio application that accompanied the American Stories exhibition at the Smithsonian’s National Museum of American History. Users were invited to record their own description of an object or listen to others’ descriptions (http://origin.apps.usa.gov/access-american-stories.shtml). More recently, Digita11y is a cross-institutional project funded by the Institute of Museum and Library Services with the goal of producing a reusable toolkit that museums can use to build accessible applications (https://www.digita11y.org). Like Access American Stories, Digita11y is exploring the concept of crowd-sourcing to populate descriptions. At the heart of the Digita11y platform is Roundware, an open-source audio framework that allows users to contribute their own content and then plays back layers of audio in a soundscape based on the location where the audio was recorded (http://roundware.org).

The Canadian Museum for Human Rights (CMHR), a museum in Winnipeg, Canada, that opened its doors in 2014, was designed with inclusion and accessibility at its core. This early mandate for inclusive design led the CMHR team to incorporate many features within the museum that make content accessible to visitors with visual impairments. These include a universal keypad that allows visitors to control screens and audio throughout the exhibitions; a mobile application that delivers content (including audio descriptions) via Bluetooth low-energy beacons; and universal access points, such as a tactile cane-detectable floor strip, that indicate when accessible content is available on the mobile guide. As the team behind CMHR described, inclusive design is about “designing and developing with the consideration of all abilities from the outset. The inclusive design approach will ensure the museum experience is not only accessible for all ages and abilities, but is enriching and satisfying for all. It is not a design style, but an orientation to design” (Wyman et al., 2016). This approach, which considers inclusive design as central to the experience instead of as an additional layer, would become a guiding principle in the development of Out Loud.

Process and project management

When The Warhol team began working on this project in January 2016, we recognized that we had a lot to learn about designing technologies for audiences with visual impairments. We saw the project as an opportunity to become experts in Web and mobile accessibility and set museum-wide standards for inclusive design in digital projects. With this in mind, we decided to adopt agile and user-centered approaches informed heavily by research on best practices for mobile accessibility.

First coined in 2001 in the Manifesto for Agile Software Development (Beck et al., 2001), “agile” is used in the technology industry to refer to a loose set of principles centered around designing and developing software products in iterative cycles. Agile is often contrasted with the waterfall model of project development, in which stakeholders sequentially define requirements, come up with a concept and design, and finally build the project. Agile methodologies allow for more flexibility and adaptability along the way, with the final product shaped by testing. Unlike with the waterfall method, an agile process might result in a product that is completely different from its initial requirements.

Another set of philosophies core to the development of Out Loud is user-centered design and its cousins, human-centered design and design thinking. While scholars, designers, and companies may differ on the distinctions between these three terms and whether they prescribe particular methodologies, all three approaches share a commitment to starting the design process from a position of understanding real human needs. User-centered design entails identifying needs by going straight to the source (in other words, talking to the intended audience), designing based on the findings of this initial research, and continually testing designs with users to improve the product. Out Loud is the result of a user-centered approach: Before we created a single wireframe or wrote a line of code, we talked to community members with visual impairments about what constitutes successful museum and technology experiences. Below, we describe the particular methodologies we adopted in our process.

Teams and collaborators

The team structure and the many collaborators and contractors who contributed to Out Loud were essential to our agile and user-centered design approach.

The core project team consisted of six staff members. Three were full-time staff members of The Warhol:

  • A project manager who directed the ultimate vision of the audio guide, was responsible for making sure it fit within our existing digital ecosystem and museum workflows, and led user experience research;
  • An education specialist familiar with the collection and how we interpret objects;
  • An accessibility liaison who has experience working with populations with disabilities.

We also partnered with the Carnegie Museums of Pittsburgh’s Innovation Studio, an in-house design and development laboratory shared among the four museums within the CMP consortium. There were three core team members from the CMP Innovation Studio:

  • A lead software developer
  • A junior software developer
  • An audio producer

There were many benefits to working with the Innovation Studio instead of hiring contractors for technical development and audio production. As an in-house agency, the Innovation Studio understood our challenges as a museum and was concerned about the long-term sustainability of the product. Knowing that they might be able to reuse some of the technologies at the other three CMP museums, they were and continue to be more invested in the maintenance and viability of Out Loud. Additionally, we collaborated closely as peers, moving away from a traditional client-contractor relationship.

The core team carried out day-to-day project responsibilities, meeting regularly over the course of half a year. On a monthly basis, we met with an advisory team. The advisory team consisted of stakeholders and experts in the field who provided feedback and support throughout the project. The four advisory team members included The Warhol’s director of development, The Warhol’s director of visitor services, the head of the CMP Innovation Studio, and an outside advisor who founded Conversant Labs, a startup devoted to building voice-based technologies for users with visual impairments.

We also worked with consultants and community partners to develop Out Loud. Prime Access Consulting, an agency dedicated to inclusive design technology with a specialty in museums and public spaces, authored and produced the visual descriptions and tactile guided experiences for the audio guide. Prime Access also provided technical feedback on mobile accessibility. Additionally, we worked with artist and technologist David James Whitewolf of Tactile Reproductions LLC to design and fabricate touchable adaptations of Warhol artworks.

Perhaps most importantly, user-experts and local disability groups played a critical role in Out Loud. Over the course of seven months, we worked with three Pittsburgh-area residents who served as our primary user testers and co-designers. These user-experts varied in degree and type of visual impairment—including full blindness, low vision, and colorblindness—as well as in level of comfort with technology. We also welcomed organizations such as the Pittsburgh Accessibility Meetup and Blind and Vision Rehabilitation Services of Pittsburgh for testing sessions and to assist with outreach and marketing. Ultimately, our users were just as much designers as the core team: their experiences and feedback directly shaped Out Loud.

Project cycle

Out Loud development consisted of four overlapping stages: research, design, content development, and technical development. We held monthly advisory team meetings starting with a concept meeting in mid-March. We worked backwards from the advisory team meeting dates, using these gatherings to test prototypes. During the second half of the project, we would conduct outside usability tests with our user-experts one week before the advisory team meeting, and then present findings to our advisors.

This graphic represents the overlapping phases in the Out Loud project development cycle—research, design, content development, and technical development—from January through July 2016
Figure 2: timeline for Out Loud project cycle

This project management approach was beneficial for a number of reasons. Advisory team meetings became milestones with hard deadlines, keeping project momentum from languishing. Additionally, engaging advisors in the same usability tests as our user-experts fostered empathy for users. Finally, advisors were able to glimpse into the many decisions and skills needed to develop an audio guide and therefore could better advocate for the project when it came to organizational buy-in and fundraising.

Research phase

Our research phase consisted of three main lines of inquiry: mapping the landscape of accessible museum technology for audiences with visual impairments; investigating best practices in accessible mobile design and development; and interviewing user-experts who are blind or have low vision about their needs. Of these three, the interviews proved the most essential in formulating the design principles that would, in turn, drive Out Loud’s design. Through these interviews, we wanted to gain an understanding of how our user-experts use technology, how they experience museums and cultural institutions, and what kind of audio experiences they consume. (The full list of interview questions is available online at http://bit.ly/OutLoudAppendixA). While the initial research phase outlined in the chart above seems to have an end date, the Out Loud team strongly believes that user research is never done. Even after the audio guide’s launch, we continually seek feedback to inform future development.

Design and usability testing

After the initial research stage, we began creating mockups and prototypes of both audio content and app interfaces. Armed with an agile mindset, we recognized that we were building things that we might ultimately throw away. Early app designs were paper prototypes, and we often tested these via informal sessions with Warhol staff members.

During the latter half of the design and development cycle, we moved toward building interactive, iOS-based prototypes. We conducted two formal usability sessions of these prototypes with our user-experts and offered compensation in exchange for their participation. Our first session focused on testing the “Everything” navigation view (now called the “Stories” view), which displays all of the available audio, as well as two prototypes for the audio player. Our second session focused on the “near me” functionality that delivers audio based on location, “The Warhol” navigation view that displays information about the museum and amenities, and using the audio guide in the galleries with different types of headphones. (Usability testing session protocols are available online at http://bit.ly/OutLoudAppendixB and http://bit.ly/OutLoudAppendixC.)

Through this process, we developed best practices for conducting usability tests, with a specific focus on audiences with visual impairments.

  • Clearly define goals for the usability session: Don’t try to test everything. We designed tasks specifically around features that we wanted to test; targeting the user study this way allowed us to reach concrete conclusions on how to move forward with our designs. At times we prototyped two versions of a feature and conducted A/B testing.
  • Clearly define researcher roles: We assigned one facilitator to each user tester; the facilitator’s role was to guide the participant through tasks and ask questions. We also had an observer take copious notes on interactions and dialogue. Relieved of documentation responsibilities, the facilitator could concentrate fully on interacting with the tester. When working with users with visual impairments, especially when moving around in public spaces, it can also be helpful to assign a staff member to serve as a sighted guide.
  • Create a natural user environment: When possible, emulate an experience similar to what it would be like to be at the museum. For example, when we asked our user-experts to test the “near me” functionality of the audio guide in the galleries, we had a staff member accompany each participant as a sighted guide. The role, however, was not merely to help the user navigate the space; the staff member used the audio guide as well, as if the tester and the sighted guide had come to visit the museum together.
  • Ask first: Always ask before you help a user tester, touch the screen, or take a device away. This is a great rule for any usability test, but doubly important when working with users with visual impairments. It is both a sign of courtesy and a way to empower participants during a usability test, putting the “expert” in the term user-expert.
  • Speak as you usually would: When working with users who are blind or have low vision, don’t worry about using language like “see” and “look,” as in “Do you see what I mean?” As the author of Just Ask: Integrating Accessibility Throughout Design explains, “People who are blind often make comments such as, ‘I can’t find what I’m looking for,’ and ‘I don’t see it on this [Web] page’” (Henry, 2007).

In late July, as we were nearing a beta release of Out Loud, we invited members of the community to attend an accessibility meetup event at The Warhol to test the audio guide. The Pittsburgh Area Accessibility Meetup is a community group, organized through the website Meetup.com, that tends to meet on a monthly basis to socialize and explore a shared interest in accessible design and technology. The members of this community represent a wide range in age, background, and ability. While this event was less formal than the two previous usability sessions, it became another essential moment to gather feedback. At this session, the feedback we received was focused on the hardware, device set-up, and the logistics of customizing the device. When we soft-launched the audio guide at the Leadership Exchange in Arts and Disability conference in August 2016, we used the opportunity to refine the process of preparing devices for on-site visitors.

Design principles

The Warhol team took CMHR’s approach to heart when developing Out Loud. Rather than considering accessibility as an additional layer or a siloed experience, we considered the needs of visitors across abilities as we made design decisions. In developing Out Loud, we were committed to the following: 1) providing a visual description for every object represented on the audio guide; 2) integrating visual descriptions with interpretive content; and 3) creating an audio guide designed for all visitors across abilities. The final commitment means that we offer features that may accommodate visitors with disabilities beyond blindness or low vision, such as including transcripts of all text, offering messaging for visitors with sensory disorders, and providing neck loops that visitors can borrow to use with their T-coil hearing aids.

While these commitments were a good start, we still had many decisions to make when developing Out Loud. To aid us in these efforts, we synthesized learnings from our research phase into design principles. Our user-experts all had some degree of visual impairment, but we found that most of these learnings could apply to all visitors regardless of ability. These five principles would serve as guideposts at every point of the design process.

Provide a path with a choice

Users want choice and self-direction in their museum experience. One interviewee told us that she prefers “a little more independence” during a visit: “I don’t want to feel like I’m being babysat.” However, navigating a new space is difficult for visually impaired audiences. One of our interviewees expressed a feeling of self-consciousness when in museum spaces, with particular concern for the safety of the artwork: “I don’t want to be the kid that puts his hand in the canvas.” Our interviewees reported wanting some guidance through their museum visit while retaining the freedom to customize it based on their interests.

Build a social experience

The majority of our visually impaired visitors come to museums with a companion. “I usually go to museums with other people,” a user-expert shared, “it’s a social thing.” Another noted that he “can’t remember going to a museum by [himself],” even before he lost his vision. All of our interviewees see museums as social spaces and commented that they like to discuss the artwork they are experiencing with a friend. As one person explained, “I want to talk to someone immediately about the artwork I’m seeing.”

Use industry standards for interactions

Technologies like iOS VoiceOver are powerful tools that some of our user-experts use every day. Other interviewees were newer to technology or adjusting to a new disability and were still learning to use these tools. We wanted to design an experience that reinforced industry standards and natural interactions for our audiences. Our interviewees encouraged us to find inspiration in simple apps that they use on a daily basis, such as Apple’s Podcasts app.

Even apart from accessibility, adhering to common patterns in interaction design can be useful. One interviewee explained that when she opens an app, she uses VoiceOver, Apple’s screen reader, to get a sense for the layout: “I expect a back button in the top left corner, the navigation on the bottom,” and for audio or video, “a play/pause button in the center of the screen.” Another user-expert shared, “I don’t like when an app requires too much backward and forward,” emphasizing that accessing content via too many screens or clicks can be a frustrating experience.

Keep physical interactions to a minimum

Visually impaired visitors use guide dogs, canes, or friends to move around the museum. One of our user-experts described that, when she is in a public space, she is likely holding the leash of her service dog in one hand and the arm of a friend in the other. Another interviewee who uses a white cane explained that when it comes to interacting with a mobile device, “I like things that I can do with one hand.” Additionally, sound is an important sense when navigating a new space, and therefore audio should be kept to a minimum while visitors are moving. Our user-experts had strong opinions about headphone styles, preferring earbuds to over-the-ear headphones because the former can be used on just one ear while allowing the other ear to listen for environmental cues.

Provide a mix of voices and tones

Every visitor comes to the museum with a unique set of perspectives and tastes, and therefore will be interested in learning about different things. While one of our interviewees said she likes “hearing about the history around a work of art more than how it’s made,” another preferred to learn about “the details—the medium, the style, the process to make it.” Additionally, our interviewees favor human-sounding voices with a personal touch. One user-expert noted that he preferred human voices to synthesized speech, and especially enjoyed museum content delivered via a natural, conversational tone: “It’s got to sound good.” Several of our interviewees are voracious consumers of podcasts and television and thus expect high-quality, engaging audio experiences.

Many of the above design principles are embodied in a quote from one of our interviewees. A good museum experience would allow her to be “engaged, social, experiencing art in a way that was expressly made for me: it allows me to be independent.” These design principles would come to inform every decision with regard to content development, technology, interface design, and hardware and logistics, as outlined in the following sections.

Content development

I like when [museum and cultural offerings] are not a children’s thing and not a thing about disability. —User research interview, 2016

In designing Out Loud, we were determined not to silo accessible content apart from other interpretive content. Instead of creating “a thing about disability,” we hoped the audio guide would be a seamless, inclusive educational experience regardless of a user’s ability. More generally, our user research demonstrated that all visitors—whether they have a visual impairment or not—bring different perspectives to the museum experience and therefore want the ability to learn about Warhol in ways that best fit their interests.

With these learnings in mind, we designed an audio guide content strategy that breaks away from the traditional, several-minute-long audio file per artwork, and instead uses modular chunks of audio content—“chapters”—grouped into larger “stories.” Each story features a representative object from the museum’s collection and therefore an associated visual description, but the overall “story” might address a theme, series of artworks, or time period in Warhol’s life. This modular structure allows visitors to jump around to content that reflects their interests and dive deeper if they’d like to learn more. Modularity also mitigates an internal challenge for museum staff: collection objects often get rotated out of the galleries, whether for conservation purposes or because of travel to other venues, and the modular audio blocks associated with themes or series allow us to better repurpose audio content. When, for example, Big Torn Campbell’s Soup Can (Pepper Pot) was replaced with another example of an early hand-painted Campbell’s soup can, we were able to reuse several of the existing audio chapters and only needed to record a new visual description and introduction.
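The chapter-and-story structure described above can be sketched as a small data model. The following Swift sketch is illustrative only—the type and property names are assumptions, not the Out Loud codebase—but it shows how modular chapters grouped into a story might be dynamically reordered, for example to surface the visual description first for a visitor who requests it:

```swift
// Hypothetical model of Out Loud's modular content: chapters grouped
// into stories, reorderable per user preference. Names are illustrative.
struct Chapter {
    enum Kind { case introduction, visualDescription, interpretive, archival }
    let title: String
    let kind: Kind
}

struct Story {
    let title: String
    var chapters: [Chapter]

    // Return chapters in playback order. A visitor who wants visual
    // descriptions up front hears them first; everyone else gets the
    // default order starting with the introduction.
    func orderedChapters(visualDescriptionFirst: Bool) -> [Chapter] {
        guard visualDescriptionFirst else { return chapters }
        let descriptions = chapters.filter { $0.kind == .visualDescription }
        let rest = chapters.filter { $0.kind != .visualDescription }
        return descriptions + rest
    }
}
```

Because each chapter is a self-contained unit rather than one long audio file, rotating an object out of the galleries only requires replacing the affected chapters, as in the Pepper Pot example above.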

Content designed specifically for visitors with visual impairments

Out Loud offers two kinds of audio content designed specifically for visitors with visual impairments: visual descriptions and the guided narratives that accompany the tactile reproductions. We worked with Prime Access Consulting to produce both of these content types.

Visual descriptions translate the formal elements of a work of art into a detailed verbal narrative. Interviews with user-experts revealed that, while a vivid and well-crafted visual description is an important and necessary entry point to understanding an object and related information about the work, the description isn’t why they are at a museum. “Visual description is super important, but it can’t be too long or boring,” one interviewee explained. “Just quickly describe what it is so I can get to the good stuff”—that is, the interpretive content. With this in mind, we aim to keep visual descriptions between 30 and 60 seconds long when possible.

Out Loud also works in conjunction with tactile reproductions of selected iconic Warhol artworks. These durable, large-scale raised line and textured reproductions, created using a computer numerical controlled (CNC) router and digital imaging software, allow visitors to learn about Warhol works through the sense of touch. The audio guide offers narrative recordings—what we call “guided tactile experiences”—that lead visitors through the experience of the reproductions.

While the visual descriptions and guided tactile experiences were both designed specifically as tools to aid individuals who are blind or have low vision in understanding what a work of art looks like, The Warhol team strongly believes that these are tools that can encourage sighted visitors to engage in close looking and careful observation.

Interpretive content

Every Out Loud “story” includes a short introduction, a visual description of the representative object, and several chapters of interpretive content. The interpretive content is diverse, covering topics as broad as biographical information or anecdotes from Warhol’s life, the historical and social context within which Warhol was working, art historical analysis or a curatorial perspective, and explanations of Warhol’s process. We also include clips of archival recordings of Warhol’s friends and family members. Within the story “Pop Products: Campbell’s Soup,” for example, users can listen to a visual description of Big Torn Campbell’s Soup Can (Pepper Pot), learn about Warhol’s early tension between abstract expressionism and pop art styles, explore Warhol’s interest in consumer products, or listen to Warhol’s friend Leila Davies Singelis describe the moment when Warhol telephoned to tell her about his newest subject matter, soup cans. The interpretive content takes on a conversational tone; apart from the visual and tactile descriptions, none of our audio recordings were scripted. Each story includes perspectives from multiple voices; we recorded interviews with people as diverse as curators, educators, publications staff, and Warhol’s nephew.

Technology

We implemented several technology solutions that make Out Loud and the museum experience accessible to visitors who are blind or have low vision, including supporting iOS accessibility features such as Dynamic Type and VoiceOver, and using beacons to push out audio content based on a user’s location.
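As a rough illustration of the location logic, the sketch below models ranged beacon readings as plain values and picks the nearest known beacon to decide which audio to surface. In the shipping app this would be driven by Core Location's beacon-ranging callbacks; the identifiers and the beacon-to-story mapping here are assumptions for the sake of example:

```swift
// Hypothetical sketch of "near me" logic: map the nearest Bluetooth
// low-energy beacon to a story identifier. Values are illustrative.
struct BeaconReading {
    let major: Int
    let minor: Int
    let accuracy: Double // estimated distance in meters; negative means unknown
}

// Assumed lookup table from beacon (major-minor) to a story identifier.
let storyForBeacon: [String: String] = [
    "7-1": "pop-products-campbells-soup",
    "7-2": "early-drawings"
]

// Pick the nearest beacon with a valid distance estimate and return
// the story to surface in the "near me" view, if any.
func nearestStory(from readings: [BeaconReading]) -> String? {
    let valid = readings.filter { $0.accuracy >= 0 }
    guard let nearest = valid.min(by: { $0.accuracy < $1.accuracy }) else { return nil }
    return storyForBeacon["\(nearest.major)-\(nearest.minor)"]
}
```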

Designing an accessible iOS application

Technologies like screen readers have opened new possibilities for people with visual impairments. With more recent versions of iOS, Apple has baked in many accessibility features that allow its devices to be customized to users’ needs and preferences. There are many built-in iOS accessibility features that accommodate a spectrum of abilities, but we focused on supporting two particular features: VoiceOver and Dynamic Type.

Dynamic Type is a feature that allows users to set preferred text sizes. When Dynamic Type is properly implemented, titles, headlines, labels, body text, and captions adjust at different scales according to the content type, so that body text might magnify at a greater rate than a headline in order to improve readability while prioritizing more important content.
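This per-style scaling can be sketched as a small pure function. The base sizes and growth multipliers below are purely illustrative assumptions for the sake of the example, not Apple's actual Dynamic Type tables:

```typescript
// Hypothetical sketch of Dynamic Type-style scaling: body text grows
// faster than headlines as the user's preferred size increases.
// Base sizes and multipliers are illustrative, not Apple's values.
type TextStyle = "headline" | "body" | "caption";

const BASE_SIZE: Record<TextStyle, number> = {
  headline: 17,
  body: 15,
  caption: 12,
};

// How aggressively each style responds to the user's size preference.
const GROWTH_RATE: Record<TextStyle, number> = {
  headline: 0.5, // headlines scale modestly
  body: 1.0,     // body text scales fully, prioritizing readability
  caption: 0.75,
};

// `userStep` is the user's preferred size offset in points
// (0 = default; positive = larger text requested).
function scaledFontSize(style: TextStyle, userStep: number): number {
  return BASE_SIZE[style] + userStep * GROWTH_RATE[style];
}
```

With these assumed rates, a user who asks for much larger text sees body copy grow past the headline size, which is the readability-first behavior described above.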

VoiceOver is Apple’s screen reader functionality. A screen reader is an audio interface that translates what we visually see on a screen into speech, using a synthesized computer voice. With VoiceOver, users can use a series of gestures to navigate the screen. VoiceOver allows users to change the voice, currently offering 10 options in U.S. English and many more in other accents, in both female and male varieties. Users can also change the rate of speech according to their preference. (During our research phase, we discovered that some of our community partners used a VoiceOver speaking rate of 100%—double the speed of average speech—a rate which, for many laypeople, is indecipherable. One of our interviewees mentioned that she uses such an accelerated speed for two reasons: one, it’s more efficient to navigate her device that way; and two, she used it for privacy—when she is using VoiceOver in public, it prevented others from hearing what she was consuming, whether a text message or a long-form article.)

The following is a list of learnings from designing for VoiceOver, but is in no way an exhaustive list of best practices.

  • Structure matters: Screen reader users aurally scan a page or screen the way a sighted user might visually scan for certain headlines or words, jumping from heading to heading and from hyperlink to hyperlink. The semantic structure of a website or an application—and style elements within it—already provides a hierarchy that can help both sighted and non-sighted users navigate through a screen. As the Carnegie Museum of Pittsburgh’s Web Accessibility Guidelines state, “Many screen readers have functionality that allows a user to select to read only the headings on the page, or only the links. Giving precedence to the way headings and links are written is a significant way we can make browsing the web easier for these users” (The Studio at Carnegie Museums of Pittsburgh, 2016). In this sense, a solid codebase that follows modern conventions is already a big step toward designing an accessible and screen reader-compatible experience.
  • When crafting VoiceOver attributes, order and consistency are key: Like many screen readers, VoiceOver incorporates attributes that provide information on what users are accessing or interacting with. Attributes include the following: state (“transcript closed”), label (a short word or phrase that describes a control or view, such as “play”), role, ordinality (“two of seven”), and hint (“double tap to open transcript”). Our consultant, Sina Bahram of Prime Access Consulting, advises that the standard heuristic for ordering VoiceOver attributes is first state, then label, role, ordinality, hint, and finally global message. Maintaining this hierarchy provides a consistent and predictable experience throughout the application.
  • Announce state changes: Avoid automatic state changes; it is always best to give users agency in deciding when something starts and stops. If something does happen automatically within the app, use VoiceOver labels to communicate it. In Out Loud, when a user selects a story, the story opens the player view and automatically plays the audio; we feel this is mitigated because the VoiceOver label indicates that selecting a story will play audio.
  • When in doubt, take it out: If you aren’t going to announce an interaction or change of state, don’t include it or change it. For example, in one of our iterations of the player view that displays the full list of chapters within a story, we had built an interaction so that if users had one transcript expanded and subsequently selected a different transcript to open, the first transcript would collapse. The problem for VoiceOver users, however, was that they would hear the new transcript open but would not be informed that the former had closed. We decided to allow multiple transcripts to be toggled open at one time. Another instance when we had to make a decision about automated state changes was with the autoplay feature between chapters within a story. If a user does not have VoiceOver enabled, Out Loud’s default is to automatically play the following chapter five seconds after the previous one ends. When a user turns VoiceOver on, however, we changed the default so that Out Loud will not autoplay the next chapter, but rather requires the user to press play to start the audio. (In both cases, users can toggle the autoplay feature on or off, thus overriding our defaults.) We made this decision based on both best practice and on findings during user testing: all of our participants relying on VoiceOver to navigate through the app preferred to choose when to play the next audio chapter, while testers who were not using VoiceOver functionality reported preferring autoplay.
  • Pay attention to language: Certain nomenclature, artwork titles, and stylistic conventions don’t translate well to speech, so considering language for VoiceOver labels and menu items can be crucial. In one instance, we featured an artwork titled Nosepicker I: Why Pick on Me. We were surprised to find that VoiceOver pronounced the first part of the title as “Noh-suh-pick-er Eye” instead of “Nose Picker One.” To resolve this issue, we created a new data field in the code that would allow us to override the display title with our own VoiceOver label that was formatted to account for these aberrations.
  • Make use of Magic Tap: Magic Tap is a kind of wildcard gesture in VoiceOver that Apple Developer describes as performing “the most-intended action.” App developers can assign a special functionality that will be activated by using a two-finger double tap anywhere on the screen, even if VoiceOver’s focus is not on the button that will be activated. Many seasoned VoiceOver users expect Magic Tap to be implemented on an application. We implemented Magic Tap to play and pause audio. (When we tested an early prototype of the audio player with one of our user-experts, she instinctively used Magic Tap to pause the audio, and was pleased to discover that the gesture worked as she intended.)
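The attribute-ordering heuristic above lends itself to a simple sketch. The function below is illustrative only (it is not Apple's API; in a real iOS app these values map to properties like `accessibilityLabel` and `accessibilityHint`), but it shows how a fixed canonical order keeps announcements predictable:

```typescript
// Illustrative sketch: composing a screen reader announcement using the
// attribute order our consultant recommends — state, label, role,
// ordinality, hint, then global message. Not an actual Apple API.
interface VoiceOverAttributes {
  state?: string;         // e.g. "transcript closed"
  label?: string;         // e.g. "play"
  role?: string;          // e.g. "button"
  ordinality?: string;    // e.g. "two of seven"
  hint?: string;          // e.g. "double tap to open transcript"
  globalMessage?: string; // app-wide announcement, if any
}

function composeAnnouncement(a: VoiceOverAttributes): string {
  // Drop missing attributes while preserving the canonical order.
  return [a.state, a.label, a.role, a.ordinality, a.hint, a.globalMessage]
    .filter((part): part is string => Boolean(part))
    .join(", ");
}
```

Whatever subset of attributes a control defines, it is always announced in the same relative order, which is the consistency the heuristic is after.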

“Near me” and Bluetooth low-energy beacons

The “Near me” section of Out Loud allows onsite users to find audio content based on their location within the museum. We use low-energy Bluetooth beacons to do so. For VoiceOver users, the screen reader announces whenever new stories are detected while on the “Near me” view.

When The Warhol prototyped an accessible audio guide in the summer of 2014, staff tested beacons as a mechanism to push content based on location. This prototype taught us a lot about how to use—and how not to use—beacons in our 2016 Out Loud implementation. In the summer of 2014, we associated one beacon with each artwork, and the audio guide would present the audio with the strongest signal. Often, beacons were placed behind or on the inside frame of an artwork. User testing sessions revealed that this deployment was rather noisy, with the app often switching back and forth between artworks that the device was detecting even when the user was not moving through the space.

In implementing beacons for Out Loud, we looked to the learnings from the summer 2014 prototype and spoke to museum colleagues who have deployed beacons at their institutions. Inspired by conversations with a staff representative from the Guggenheim, who had recently deployed a “near me” functionality on the Guggenheim app, we decided to drop the one-to-one beacon/artwork correlation in favor of using beacons to cover regions of the galleries. In practice, this meant that we aimed for precision at the gallery or room level, rather than the artwork level. With this approach, we treat beacons as infrastructure, installing them to cover an entire floor so that the beacons don’t need to be moved when objects are rotated off the floor. Another benefit of this strategy was that we were less likely to put valuable artwork in danger by attaching beacons to the artwork’s inner frame. One drawback to this implementation is that we are not able to be as precise about an object’s proximity to a user. We decided that, for the first version of Out Loud, we were willing to sacrifice precision because our user-experts with visual impairments indicated that they would most likely visit the museum with a companion who would help them navigate through the space. In future versions of Out Loud, we hope to improve the precision of our beacon deployment.
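The room-level approach can be sketched as a lookup from beacons to gallery regions, with the strongest-signal beacon deciding which region is “near.” The beacon IDs and region names below are hypothetical, and a production implementation would also smooth signal readings over time:

```typescript
// Hypothetical sketch of room-level beacon resolution: each beacon maps
// to a gallery region rather than a single artwork, and the region whose
// beacon has the strongest signal wins. IDs and regions are illustrative.
interface BeaconSighting {
  beaconId: string;
  rssi: number; // signal strength in dBm; values closer to 0 are stronger
}

const BEACON_REGION: Record<string, string> = {
  "b-701": "floor7-entrance",
  "b-702": "floor7-entrance",      // several beacons can share one region
  "b-703": "floor7-silver-clouds",
};

function currentRegion(sightings: BeaconSighting[]): string | null {
  let best: BeaconSighting | null = null;
  for (const s of sightings) {
    if (!(s.beaconId in BEACON_REGION)) continue; // ignore unknown beacons
    if (best === null || s.rssi > best.rssi) best = s;
  }
  return best ? BEACON_REGION[best.beaconId] : null;
}
```

Because multiple beacons can map to the same region, the app no longer flip-flops between adjacent artworks the way the one-beacon-per-artwork prototype did.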

Interface design: a case study

We iterated on all parts of the audio guide’s interface, from the introductory tutorial to museum amenity information. In this section, I take an in-depth look at the evolution of one element of the application: the learning player, which presents three to six modular audio chapters within one story. The order of the chapters is dynamically rearranged as the audio guide learns about a user’s preferences or needs. If a user has VoiceOver enabled, the visual description audio file moves to the top of the list, right after the short introduction, so that users with visual impairments can get a sense of an artwork before diving into deeper content about it. Additionally, each chapter is assigned a category—such as historical context, artistic process, or archival material—on the backend, allowing Out Loud to tailor a path based on user interests. For example, if a user fully listens to chapters assigned the “historical context” category but always skips archival audio, the app will move the historical context audio higher up in the queue.
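The reordering logic can be sketched as a pure function over the chapter list. The field names and the exact scoring below are illustrative assumptions, not the production Out Loud data model, but they capture the two rules described above: promote the visual description for VoiceOver users, and sort remaining chapters by how often the user finishes chapters of the same category:

```typescript
// Hypothetical sketch of the learning player's reordering: the intro
// stays first, the visual description is promoted when VoiceOver is on,
// and remaining chapters sort by the user's per-category completion rate.
interface Chapter {
  title: string;
  category: "intro" | "visual-description" | "historical-context"
          | "artistic-process" | "archival";
}

function orderChapters(
  chapters: Chapter[],
  voiceOverOn: boolean,
  completionRate: Record<string, number>, // 0..1, learned per category
): Chapter[] {
  const intro = chapters.filter(c => c.category === "intro");
  const visual = chapters.filter(c => c.category === "visual-description");
  const rest = chapters
    .filter(c => c.category !== "intro" && c.category !== "visual-description")
    .sort((a, b) =>
      (completionRate[b.category] ?? 0) - (completionRate[a.category] ?? 0));
  // Promote the visual description right after the intro for VoiceOver
  // users; otherwise leave it later in the list (placement is illustrative).
  return voiceOverOn ? [...intro, ...visual, ...rest]
                     : [...intro, ...rest, ...visual];
}
```

A production version would also weight recency and skips, but the shape of the decision is the same: a deterministic reorder driven by accessibility state plus observed listening behavior.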

Early designs: inspiration from unlikely places

Once we settled on the concept of modular audio chapters within a story, we knew we had to design an audio player interface that would best surface the content. We found inspiration in two places: “card”-based interaction design and the Quartz news app.

Cards are a popular UI pattern found in websites and mobile applications such as Pinterest, Google, and Facebook. Moving away from fully designed, static Web pages, these platforms use cards—self-contained rectangles that present content and interactions to a user—to communicate quick stories and serve as an entry point for more information. In the card-based prototype, the entire screen represents one story, displaying the story title and basic information about the representative object at the top of the screen. The center of the screen surfaces one card at a time that represents either the currently playing audio or reveals other chapters a user can listen to. While an audio chapter plays, the card contains audio player controls; after the chapter finishes playing, a new card replaces it, now displaying options of available related audio.

A wireframe of an early card-based design of the app. The entire screen represents the "Overview" story, while a card in the center of the screen displays audio controls.
Figure 3: card-based prototype: audio controls


Wireframe of an early card-based prototype. The screen is currently on the "Flowers" story, with a card in the center of the screen displaying options of more audio files the user can listen to.
Figure 4: card-based prototype: related audio

Our second concept was inspired by the Quartz news app, which “delivers you the headlines as if you’re having a text chat with your best friend who just happens to also be a knowledgeable journalist” (Lagace, 2016). On an interface that resembles a WhatsApp or iMessage chain, Quartz reveals short bursts of witty and concise takes on current events. Upon opening the app, it delivers a short “teaser” for a news item. Users can then select a button that reveals more content, or otherwise skip to the next major story. We were intrigued by Quartz’s conversational tone and the way the application slowly revealed a news story, allowing users to dive deeper if they so desired. In our Quartz-inspired prototype, the app reveals one audio chapter at a time. When the chapter finishes, buttons emerge with options for chapters that a user has not yet listened to.

In this screen shot of a mobile application, we see a thumbnail of an artwork named "Flowers" at the top of the screen and audio controls at the bottom of the screen. In the middle of the screen are two transcripts of an audio file, with buttons below showing each transcript indicating what other content is available.
Figure 5: screenshot of Quartz-inspired prototype

We initially created paper prototypes of both design directions and tested them with our advisory team. However, to effectively test these concepts with users with visual impairments, we decided to code interactive prototypes with basic VoiceOver commands implemented. Our developers opted to build the applications using the React Native framework and a Redux architecture (React Native, 2015). This technology decision allowed us to prototype different components without having to fundamentally change large parts of the application for the final design.
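Because Redux models all state transitions as pure functions, swapping one prototype's components for another's is largely a matter of re-rendering the same state. A minimal, hypothetical sketch of what a player reducer in this architecture might look like (the action names and state shape are illustrative, not the actual Out Loud code):

```typescript
// Minimal Redux-style reducer sketch for an audio player. Because the
// reducer is a pure function of (state, action), different prototype UIs
// can share it unchanged. Names and shape are illustrative.
interface PlayerState {
  currentChapter: string | null;
  playing: boolean;
}

type PlayerAction =
  | { type: "PLAY_CHAPTER"; chapter: string }
  | { type: "PAUSE" }
  | { type: "CHAPTER_ENDED" };

const initialState: PlayerState = { currentChapter: null, playing: false };

function playerReducer(
  state: PlayerState = initialState,
  action: PlayerAction,
): PlayerState {
  switch (action.type) {
    case "PLAY_CHAPTER":
      return { currentChapter: action.chapter, playing: true };
    case "PAUSE":
      return { ...state, playing: false };
    case "CHAPTER_ENDED":
      // Keep the chapter cued up but stopped; autoplay logic decides what
      // happens next.
      return { ...state, playing: false };
    default:
      return state;
  }
}
```

Centralizing state like this is what let the team test the card-based and Quartz-inspired designs against the same underlying player behavior.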

We A/B tested the card-based and Quartz-inspired designs, asking user testers to interact with each design and then compare the experiences. Reception to both was mixed. Users enjoyed the concept of the modular audio content that allowed them to delve deeper into Warhol’s life and art. However, they felt that the audio guide provided too much of a path and not enough of a choice: they never could see an overview of all of the available audio content. Additionally, on both prototypes, actionable buttons changed location on the screen from chapter to chapter, which was a confusing experience for users with visual impairments who are attempting to get a handle on a new interface.

Intermediate designs: shifting gears

The A/B usability test for our first two design concepts proved to be a humbling experience. We learned that unconventional interface designs, while exciting to us as interaction designers, could be very difficult for visitors with visual impairments. We knew we’d have to give up on the idea of the “slow reveal” of our modular audio chapters in favor of surfacing all of the available content at once.

It was back to the drawing board. Our next prototype listed all available chapters on one screen. A user could scan the list of chapters and select what he or she wanted to listen to; the audio player controls—still in the card format from our first design—would then appear under the chapter heading. The title of the chapter simultaneously serves as a button that plays the audio and as a progress bar that visually indicates how much of the audio has played. While we were excited that this new design neatly surfaced all of the content within a story, we were not happy that the audio player controls were not in a consistent location for each chapter.

This mock-up of an application screen represents one audio guide "story" consisting of multiple audio chapters. The chapters are represented by horizontal bars stretching across the screen, with the title of the chapter written on the bar. The "introduction" bar is filled in entirely in green, where as the "historical context" bar is only green for the first fifth of the bar, while the rest is gray. Under the "historical context" bar is a set of audio controls.
Figure 6: wireframe of an early list-view design

The following design maintained the list of chapters and the progress bar, but now moved the audio player controls to a static bar at the bottom of the screen, above the navigation tabs. We felt like we were on track, but the static audio player elements—which included a thumbnail of the representative image, story and chapter titles, play/pause, backward and forward buttons, and audio speed—felt visually busy and inconsistently applied. We iterated on several bottom audio players before landing on the final design.

In this wireframe of the audio guide, the "chapters" within the Blotted Line Technique story are listed on the screen. At the bottom of the screen is a bar containing the audio controls. The audio control section contains a small artwork thumbnail on the left hand side, the title of the story and chapter next to it, and play, back, forward, and speed control buttons on the right-hand side.
Figure 7: wireframe of a second list-view player, with audio controls at the bottom of the screen


In this wireframe of the audio player, the audio controls are contained at the bottom of the screen. The controls include the name of the chapter in the center of the screen, a play button on the left, and a next button on the right. Controls to rewind 5 seconds and change the audio speed are staggered in between the previous three items, located slightly below them.
Figure 8: wireframe of a third list-view player, with audio controls at the bottom of the screen

Final design

In the end, we landed on a design that reveals all audio chapters on the screen at once, so that a user could easily scan the options—whether visually or with VoiceOver—and find consistent audio controls that persist at the bottom of the screen.

Screen shot of the final Out Loud learning player design. The player is currently playing the introduction to the Pop Products story.
Figure 9: screenshot of the final Out Loud learning player design
Screen shot of the final Out Loud learning player design, in which the introduction to the Pop Products story has completed and a user can select to play the following chapter.
Figure 10: screenshot of the final Out Loud learning player design
Screen shot of the final Out Loud learning player design, in which the user has listened to all of the chapters and can now choose to close the audio player
Figure 11: screenshot of the final Out Loud learning player design
Screen shot of the final Out Loud "Stories" view, with the audio player controls persisting at the bottom of the screen.
Figure 12: screenshot of the final Out Loud “Stories” view, with the audio controls persisting at the bottom of the screen

In the final design of the learning player, users see all chapters within a story on one screen. A blue progress bar under each chapter title indicates how much of that chapter has been played. The audio control buttons are at the bottom of the screen, above the navigation bar. The audio control bar persists on the screen even when a user navigates to another part of the app, allowing a user to explore the app while listening to audio. When users are not on the current story screen, tapping on the title of the story and chapter within the audio controls will take them back to the full story learning player view.

When a chapter ends, the audio control area cues up the following chapter; a user can select to play the following chapter (or, if autoplay is turned on, the chapter will automatically start after five seconds). The buttons on the player all remain in the same place, but the inactive buttons (such as changing the audio speed) are grayed out. For screen reader users, VoiceOver indicates to a user that the icons are not actionable by listing the state as “dimmed.” When a user completes the final audio chapter in the story, he or she has the option to close the audio controls.
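The autoplay rule described above and in the earlier VoiceOver section reduces to a small decision function. The signature is a hypothetical sketch, but the logic matches the behavior described: an explicit user toggle always wins, and otherwise the default depends on whether VoiceOver is on:

```typescript
// Hedged sketch of the autoplay default: with VoiceOver on, wait for the
// user to press play; either default can be overridden by an explicit
// autoplay toggle. Signature is illustrative.
function shouldAutoplayNext(
  voiceOverOn: boolean,
  userOverride: boolean | null, // null = user has not toggled the setting
): boolean {
  if (userOverride !== null) return userOverride; // explicit choice wins
  return !voiceOverOn; // default: autoplay only when VoiceOver is off
}
```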

Hardware and logistics

As we prepared to launch Out Loud to the public in October 2016, we tested various hardware options and refined the logistics for offering the audio guide onsite. In addition to the device set-up, headphones, and other accessory considerations outlined below, we conducted training with frontline staff on how to use Out Loud with accessibility features and tips for engaging with visitors with disabilities.

Device set-up for onsite usage

Because iOS devices are customizable to individuals’ needs, we encourage visitors to download the app on their own device. For visitors without iOS devices or who prefer not to download the app, we offer iPod Touch devices to borrow for free on-site at the museum. Frontline staff are trained to offer an explanation of accessibility features on the audio guide to interested museum visitors. If a user requests certain accessibility features, the staff member can then adjust the device to their needs. Device setup can include the following features:

  • Guided Access: Guided Access is a feature on iOS devices that temporarily restricts users to a single app, preventing users from accessing other applications, and preventing other applications from interrupting the user’s experience. We turn on Guided Access for all visitors who borrow a device, regardless of the visitor’s request for accessibility features.
  • VoiceOver: Frontline staff are trained in navigating the audio guide using basic VoiceOver gestures and can offer a quick tutorial in using the application with the iOS screenreader.
  • Shortcuts: iOS allows users to establish accessibility shortcuts that can be accessed by triple-clicking the home screen button. At The Warhol, we have set up our devices with triple-click shortcuts for Guided Access and VoiceOver.
  • Dynamic Type: Dynamic Type does not have an accessibility shortcut and therefore must be adjusted via the “Settings” app in order to change app text size. Unfortunately, users cannot adjust Dynamic Type within the app, but for future versions of Out Loud we are entertaining solutions that would allow users to control text size in-app.

Headphones and other accessories

Usability sessions and pilot events like the Accessibility Meetup were great opportunities to prototype how we deploy headphones with Out Loud devices. We allow and encourage visitors to use their own. For those who do not have headphones or prefer to borrow ours, we supply visitors with over-the-ear headphones for general use. However, we maintain a stash of disposable earbuds that we offer upon request, a decision we based on user research findings that showed that visitors with visual impairments prefer earbuds. Unfortunately, we found that it would not be financially viable to offer disposable earbuds to every visitor who borrows a device.

During our usability sessions, we had also tested bone conduction headphones that a user can connect to a device wirelessly using Bluetooth. Bone conduction headphones, such as those produced by Aftershokz, conduct sound to the inner ear through the bones of the skull and therefore do not cover the ear canal. Bone conduction transmission can be used by individuals with certain kinds of hearing loss. We were excited about the possibility of offering bone conduction headphones both to visitors with hearing loss and to visitors with visual impairments who rely on environmental sound to navigate through spaces. During our testing sessions with users who are blind, we received positive feedback on the use of bone conduction headphones. Ultimately, however, we decided against offering these headphones because they are costly, starting at around $100 a pair.

In addition to headphones, we offer various accessories for visitors who borrow a device in-house. All of our audio devices come with a lanyard, a feature requested by our testers with visual impairments who often have their hands occupied with a cane or a sighted guide dog. We also offer headphone splitters so that two visitors can share one device. Finally, for visitors who are hearing impaired, we offer neck loops that allow the iOS device to be compatible with a telecoil hearing aid.


We publicly launched Out Loud on October 25, 2016, to positive feedback and strong interest from the local community and press. The launch was a culmination of over half a year of carefully considering and crafting the entire visitor journey, from when a user first requests an audio guide at the front desk to how users engage with the tactile reproductions in the galleries. Out Loud is also the result of a project approach that involved many stakeholders and user-experts whose feedback and expertise were invaluable in this process. Since October, we have turned our focus to life after launch, addressing challenges such as how to maintain fresh content as objects are rotated out of the galleries. In spring 2017, we will be conducting a thorough evaluation of Out Loud, the findings of which will inform future development and expansion of Out Loud to all seven floors of the museum.

The development of Out Loud was a learning process for The Warhol. Adopting agile and user-centered design processes forced us out of our comfort zones. When it came to developing accessible mobile experiences, our team began as novices and had to ramp up quickly. Building an understanding of our users’ needs and preferences through conversation, co-design, and testing became our key to unlocking this knowledge. Soliciting feedback from users was a humbling experience, forcing us to let go of preconceived notions or concepts we were attached to in favor of what real-world user testing revealed. As we endeavored to create an educational “path with a choice” for all of our visitors, regardless of ability, we, too, grew greatly during this journey.


Out Loud would not have been possible without the dedication and support of countless colleagues and friends. I’d first like to thank the dream team behind the audio guide: Nicole Delezon and Leah Morelli, my education and accessibility gurus throughout the whole journey, as well as Ruben Niculcea, Jeff Inscho, and Sam Ticknor of the CMP Innovation Studio for their tireless work on technical development, audio production, and putting together a snazzy video featuring our process. We couldn’t have built Out Loud without our consultants, community partners, and co-designers: Ann Lapidus, Gabe McMorland, Brian Rutherford, Chris Maury of Conversant Labs, and Sina Bahram of Prime Access Consulting—this audio guide is for you! The Out Loud advisory team—Karen Lautanen, Danielle Linzer, Jeff Inscho, and Chris Maury—met with us monthly to provide invaluable feedback on and support for the project. A million thanks to Jessica Beck, Jose Diaz, Abby Franzen-Sheehan, Grace Marston, and Donald Warhola for lending their knowledge and voices to the audio guide. Many thanks to Sarah Outhwaite at the Guggenheim for sharing her knowledge on beacon wrangling.

Accessibility initiatives at The Andy Warhol Museum are generously supported by Allegheny Regional Asset District, The Edith L. Trees Charitable Trust, and the FISA Foundation in honor of Dr. Mary Margaret Kimmel.


Beck, K., M. Beedle, A. van Bennekum, A. Cockburn, W. Cunningham, M. Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries, J. Kern, B. Marick, R. C. Martin, S. Mellor, K. Schwaber, J. Sutherland, & D. Thomas. (2001). Manifesto for Agile Software Development. Consulted January 29, 2017. Available http://agilemanifesto.org/

Henry, S.L. (2007). “Interacting with People with Disabilities.” Just Ask: Integrating Accessibility Throughout Design. Accessed February 2, 2017. Available http://www.uiaccess.com/accessucd/interact.html

Lagace, M. (2016). “Quartz ‘texts’ you the news with new app.” Android Central. Consulted January 17, 2017. Available http://www.androidcentral.com/quartz-offers-news

React Native. (2015). “Getting Started with React Native and Redux.” Use React Native. Accessed January 23, 2017. Available http://www.reactnative.com/getting-started-with-react-native-and-redux/

The Studio at Carnegie Museums of Pittsburgh. (2016). Web Accessibility Guidelines v1.0. Consulted January 28, 2017. Available http://web-accessibility.carnegiemuseums.org/foundations/semantic/

Wyman, B., C. Timpson, S. Gillam, & S. Bahram. (2016). “Inclusive design: From approach to execution.” MW2016: Museums and the Web 2016. Published February 24, 2016. Consulted February 23, 2017. Available http://mw2016.museumsandtheweb.com/paper/inclusive-design-from-approach-to-execution/


Access American Stories. (n.d.). USA.gov. Consulted February 1, 2017. Available http://origin.apps.usa.gov/access-american-stories.shtml

Apple Developer. (n.d.). iOS Human Interface Guidelines. Consulted February 14, 2017. Available https://developer.apple.com/ios/human-interface-guidelines

Apple Developer. (n.d.). View Controller Programming Guide for iOS. Consulted February 14, 2017. Available https://developer.apple.com/library/content/featuredarticles/ViewControllerPGforiPhoneOS/SupportingAccessibility.html

The Audio Description Project: Audio Description at Museums, Parks, and Exhibits. (n.d.) American Council of the Blind. Last updated February 2017. Consulted February 15, 2017. Available http://www.acb.org/adp/museums.html.

Burgund, Halsey. (n.d.). Roundware. Consulted February 2, 2017. Available http://roundware.org/

Digita11y. (n.d.). Consulted January 2, 2017. Available https://www.digita11y.org

Wrenn, M. (1991). Andy Warhol: In His Own Words. Omnibus Press, London.


Cite as:
Gonzalez, Desi. "A path with choice: What we learned from designing an inclusive audio guide." MW17: MW 2017. Published March 1, 2017.
