Week five is the first week of the XNA 4.0 workshop where I'll step away from doing an overview of the "textbook" chapters, and instead focus on the weekly topics themselves.
In Week five we take a look at XNA 4.0's Audio/Video features. If you DO want to read the chapters in the "textbook", check out chapters 7 and 12. But in truth, you'll find everything you need to know below, or in the external resources provided at the bottom.
XNA's Audio/Video support can be broken up into a few smaller categories:
- Playing 2D Audio
- Playing 3D Audio
- Accessing Media
- Recording Audio
- Playing Video
Playing 2D Audio
The first of the five categories above, "Playing 2D Audio", can itself be broken up into two categories:
- Custom Audio Playback
- Simple Audio Playback
Custom Audio Playback

Prior to XNA 3.0 the only method for Audio playback was the Custom Audio Playback method. This method entails using the XACT Framework (Cross-Platform Audio Creation Tool) to combine wave files into sounds and then bind sounds to cues to be triggered at run-time. The benefit of the XACT method of creating sounds is that it cleanly separates the job of audio designer/composer from that of programmer. Engineers are freed from needing to know what sounds to play and when. All they need to do is use the in-game API to trigger cues, and whatever nifty effects have been created by the audio engineer will play. Additionally, with the XACT method sounds can be cleanly grouped into categories, with volume control and other settings being configurable at a categorical level. This is great for allowing users to modify the ambient sound volume, or music volume, while leaving the sound effect volume alone.
The main downside to the XACT method is that it doesn't go through the traditional content pipeline XNA developers are used to. The XACT tool instead creates project files which reference external wave files, and then these XACT project files are added as a resource to the XNA Content Pipeline. This necessarily changes the way sounds are visualized in the Content folder, and also how they're accessed in game. To access the sound cues, programmers are required to instantiate an AudioEngine along with WaveBank and SoundBank objects, and then call AudioEngine.Update regularly so the engine can process audio data.
The main classes involved in the Custom Audio Playback method are:
- AudioCategory
- AudioEngine
- Cue
- SoundBank
- WaveBank
All of the above classes can be found in the Microsoft.Xna.Framework.Audio namespace.
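To make the XACT flow above concrete, here is a minimal sketch of a game that loads an XACT project, turns down one category, and triggers a cue. The file names (GameAudio.xgs, Wave Bank.xwb, Sound Bank.xsb) and the cue name "Explosion" are assumptions for illustration. Yours will match whatever your XACT project and banks are called.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

public class XactAudioGame : Game
{
    GraphicsDeviceManager graphics;
    AudioEngine audioEngine;
    WaveBank waveBank;
    SoundBank soundBank;

    public XactAudioGame()
    {
        graphics = new GraphicsDeviceManager(this);
        Content.RootDirectory = "Content";
    }

    protected override void LoadContent()
    {
        // These files are produced when the XACT project is built.
        // The names below are hypothetical placeholders.
        audioEngine = new AudioEngine(@"Content\GameAudio.xgs");
        waveBank = new WaveBank(audioEngine, @"Content\Wave Bank.xwb");
        soundBank = new SoundBank(audioEngine, @"Content\Sound Bank.xsb");

        // Volume is set per category, so music can be turned down
        // without touching the sound effects.
        AudioCategory music = audioEngine.GetCategory("Music");
        music.SetVolume(0.5f);

        // Trigger a cue by name. What actually plays is up to
        // whoever authored the cue in the XACT tool.
        soundBank.PlayCue("Explosion");
    }

    protected override void Update(GameTime gameTime)
    {
        // The engine needs a regular update to process audio data.
        audioEngine.Update();
        base.Update(gameTime);
    }
}
```

Note that the programmer only ever refers to the cue by name. Swapping the wave files or adding variations happens entirely inside the XACT project.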
Simple Audio Playback
Beginning with XNA 3.0 Microsoft introduced the SoundEffect class. SoundEffect is an in-game asset type similar to Texture, SpriteFont, etc... which maps one-to-one with an on-disk sound file. SoundEffect objects can be associated with sound files of different types, including WAV, WMA, or MP3.
Unlike with the XACT Framework, there's no good way to randomize which sound will play when you call SoundEffect.Play. There's also no way to make it cross-fade between multiple sounds, and no good way to make multiple waves play at the same time or with time delays. This level of customization requires the use of the XACT Framework.
What you do get with the SoundEffect class is the ability to load an audio file with one line of code, and play it with another. It's quick, easy, and follows the usual method of getting assets into the game.
In addition to just being able to "fire and forget" sound effects using the SoundEffect class, engineers can also get access to a SoundEffectInstance object of an associated SoundEffect, and then pause, play, or resume playback while being able to adjust things like pan, pitch, and volume.
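Here is a rough sketch of both styles: a fire-and-forget SoundEffect, and a SoundEffectInstance that is looped, adjusted, and paused when the game loses focus. The asset names "Laser" and "Engine" are made up for the example and assume matching sound files in the Content project.

```csharp
using System;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

public class SimpleAudioGame : Game
{
    GraphicsDeviceManager graphics;
    SoundEffect laser;
    SoundEffectInstance engine;

    public SimpleAudioGame()
    {
        graphics = new GraphicsDeviceManager(this);
        Content.RootDirectory = "Content";
    }

    protected override void LoadContent()
    {
        // One line to load, one line to play ("fire and forget").
        laser = Content.Load<SoundEffect>("Laser");
        laser.Play();

        // An instance gives control over a single playing sound.
        engine = Content.Load<SoundEffect>("Engine").CreateInstance();
        engine.IsLooped = true;   // must be set before Play
        engine.Volume = 0.8f;     // 0.0 (silent) to 1.0 (full)
        engine.Pitch = -0.25f;    // -1.0 (octave down) to 1.0 (octave up)
        engine.Pan = 0.5f;        // -1.0 (left) to 1.0 (right)
        engine.Play();
    }

    protected override void OnDeactivated(object sender, EventArgs args)
    {
        // Pause the loop while the window is in the background.
        if (engine != null && engine.State == SoundState.Playing)
            engine.Pause();
        base.OnDeactivated(sender, args);
    }

    protected override void OnActivated(object sender, EventArgs args)
    {
        // Pick up where we left off once the window has focus again.
        if (engine != null && engine.State == SoundState.Paused)
            engine.Resume();
        base.OnActivated(sender, args);
    }
}
```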
For the vast majority of games out there the SoundEffect API will work just fine. Also note that the XACT Framework isn't available for Windows Phone 7. So if your plan is to hit the Windows Phone Marketplace, SoundEffect and its instance class are your only options.
In addition to the SoundEffectInstance class which was introduced in XNA 3.0, Microsoft introduced the DynamicSoundEffectInstance class in XNA 4.0. This class allows you to work directly with audio buffers to enable the dynamic creation of audio at run-time, and can also be used to implement streaming of audio from disk.
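As a sketch of what "working directly with audio buffers" looks like, the hypothetical ToneGenerator class below fills 16-bit mono buffers with a sine wave and hands them to a DynamicSoundEffectInstance whenever it asks for more. The class name, the 44.1 kHz sample rate, and the 100 ms buffer length are my own choices for the example. It also assumes it runs inside an XNA Game, which pumps the BufferNeeded event for you.

```csharp
using System;
using Microsoft.Xna.Framework.Audio;

public class ToneGenerator
{
    const int SampleRate = 44100;   // samples per second (assumed)
    const int BytesPerSample = 2;   // 16-bit PCM

    readonly DynamicSoundEffectInstance instance;
    readonly int bufferSize;
    readonly double frequency;
    double phase;

    public ToneGenerator(double frequencyHz)
    {
        frequency = frequencyHz;
        instance = new DynamicSoundEffectInstance(SampleRate, AudioChannels.Mono);

        // Size of 100 ms of audio in this instance's format.
        bufferSize = instance.GetSampleSizeInBytes(TimeSpan.FromMilliseconds(100));

        // Raised when the instance is running low on queued data.
        instance.BufferNeeded += (sender, e) => SubmitNextBuffer();
    }

    public void Start()
    {
        // Queue a little audio up front so playback doesn't starve.
        SubmitNextBuffer();
        SubmitNextBuffer();
        instance.Play();
    }

    public void Stop()
    {
        instance.Stop();
    }

    void SubmitNextBuffer()
    {
        // A fresh array per submission keeps the sketch simple.
        byte[] buffer = new byte[bufferSize];

        for (int i = 0; i + 1 < buffer.Length; i += BytesPerSample)
        {
            // Quarter volume sine wave, stored little-endian.
            short sample = (short)(Math.Sin(phase) * short.MaxValue * 0.25);
            buffer[i] = (byte)(sample & 0xFF);
            buffer[i + 1] = (byte)((sample >> 8) & 0xFF);

            phase += 2.0 * Math.PI * frequency / SampleRate;
            if (phase > 2.0 * Math.PI)
                phase -= 2.0 * Math.PI;
        }

        instance.SubmitBuffer(buffer);
    }
}
```

Usage would be along the lines of new ToneGenerator(440.0).Start() from LoadContent. Streaming from disk follows the same shape, except the buffers are filled from a file instead of Math.Sin.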
The main classes involved in the Simple Audio Playback method are:
- DynamicSoundEffectInstance
- SoundEffect
- SoundEffectInstance
All of the above classes can be found in the Microsoft.Xna.Framework.Audio namespace.
Playing 3D Audio
Whether you're using custom sounds or the simple sound effect API, there will come a time when you may want to work with 3D (positional) sound. Positional sound works the same as normal audio, except that the volume will rise/lower and interpolate between speakers based on the origin of a sound relative to the listener. This allows you to easily create effects where audio sounds as though it's coming from the left or right of you. I have never tried using the 3D Sound API with more than two speakers, so I'm not sure how well it works with 5.1 and 7.1 speaker setups.
To use 3D audio with XACT, simply call Cue.Apply3D, passing in a listener and an emitter. Similarly, to use 3D audio with the SoundEffect API, call SoundEffectInstance.Apply3D.
The main classes involved in 3D Audio are:
- AudioEmitter (used to set data for the origin)
- AudioListener (used to set data for the listener)
All of the above classes can be found in the Microsoft.Xna.Framework.Audio namespace.
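A small sketch of positional audio with the SoundEffect API follows. PositionalSound is a hypothetical helper of my own, not an XNA class. It assumes you pass in an already loaded SoundEffect and call Update once per frame with the current positions.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

public class PositionalSound
{
    readonly AudioListener listener = new AudioListener();
    readonly AudioEmitter emitter = new AudioEmitter();
    readonly SoundEffectInstance instance;

    public PositionalSound(SoundEffect effect)
    {
        instance = effect.CreateInstance();
        instance.IsLooped = true;
    }

    public void Start(Vector3 listenerPosition, Vector3 emitterPosition)
    {
        // Apply the 3D settings once before starting playback.
        Update(listenerPosition, emitterPosition);
        instance.Play();
    }

    public void Update(Vector3 listenerPosition, Vector3 emitterPosition)
    {
        // The listener is usually the camera or player,
        // the emitter is wherever the sound originates.
        listener.Position = listenerPosition;
        emitter.Position = emitterPosition;

        // Recalculates volume and speaker balance for the new positions.
        instance.Apply3D(listener, emitter);
    }
}
```

With the listener at the origin and the emitter moved along the X axis, the sound should drift toward one speaker. The XACT version has the same shape, with Cue.Apply3D in place of SoundEffectInstance.Apply3D.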
Accessing Media
Up until this point I've mostly been assuming engineers will be accessing their audio through the Content Pipeline, using either XACT projects or sound files which are loaded at run-time and then played. However, beginning with XNA 3.0, when Microsoft added (and subsequently removed) support for the Zune, they also added access to your system's media library through the MediaLibrary class. This includes all the songs, albums, cover art, and playlists in your audio collection, and also the pictures and videos (more on this later) in your Libraries.
These assets are all accessed using intuitively named classes in the Microsoft.Xna.Framework.Media namespace in combination with the MediaPlayer class, another addition in XNA 3.0.
To use the MediaPlayer class just call one of the static functions such as MediaPlayer.Play and pass in the Song you want. You can filter the song you want by using the MediaLibrary.Songs collection, or if you want more control you can first access the albums in your library through the MediaLibrary.Albums property, etc...
With XNA 4.0 also comes a nifty new method on the Song class called Song.FromUri. This method allows you to play a song from the web or from file, but does not work on the Xbox 360. It also doesn't support spaces in the path name, so make sure that doesn't happen or you'll get an exception.
The main classes involved in accessing audio from the media library are:
- Album, AlbumCollection
- Artist, ArtistCollection
- Genre, GenreCollection
- MediaLibrary
- MediaPlayer
- MediaQueue
- Playlist, PlaylistCollection
- Song, SongCollection
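To show how these classes fit together, here is a minimal sketch of a hypothetical JukeBox helper that plays the first song or first album it finds in the media library, or a song from a URI. The class name, the song name "Streamed", and the use of a relative path are assumptions for the example.

```csharp
using System;
using Microsoft.Xna.Framework.Media;

public class JukeBox : IDisposable
{
    // Keep the library alive for as long as its songs are in use.
    readonly MediaLibrary library = new MediaLibrary();

    public bool PlayFirstSong()
    {
        SongCollection songs = library.Songs;
        if (songs.Count == 0)
            return false;            // nothing in the library

        MediaPlayer.Volume = 0.75f;
        MediaPlayer.IsRepeating = true;
        MediaPlayer.Play(songs[0]);
        return true;
    }

    public bool PlayFirstAlbum()
    {
        AlbumCollection albums = library.Albums;
        if (albums.Count == 0)
            return false;

        // Passing a SongCollection queues the whole album.
        MediaPlayer.Play(albums[0].Songs);
        return true;
    }

    public void PlayFromFile(string relativePath)
    {
        // Not available on Xbox 360, and the path must not contain spaces.
        Song song = Song.FromUri("Streamed", new Uri(relativePath, UriKind.Relative));
        MediaPlayer.Play(song);
    }

    public void Dispose()
    {
        library.Dispose();
    }
}
```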
Recording Audio
Being able to play audio is all well and good, but we also need to be able to record or stream microphone audio. Beginning with XNA 4.0 Microsoft has added the Microphone class to the Microsoft.Xna.Framework.Audio namespace.
Using the Microphone.Default or Microphone.All properties developers can gain access to one or more Microphone objects hooked up to their Phone, PC, or Xbox 360. Once you have a reference to the Microphone, it's as easy as calling Microphone.Start and Microphone.Stop in order to record audio. Then, using the Microphone.GetData method (in response to BufferReady events) developers can gather the audio as a byte[] to use in combination with the DynamicSoundEffectInstance class discussed previously to play back the audio.
There are of course other properties on the Microphone class to help you identify important information such as the SampleRate and BufferDuration, and to determine whether it's a headset or not, but you get the basic idea. Start, gather buffers, stop.
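The record-then-play-back loop described above can be sketched roughly as follows. VoiceRecorder is a hypothetical class of my own. It assumes a default microphone may or may not be present, and that it is used inside an XNA Game so the BufferReady event gets raised.

```csharp
using System;
using System.IO;
using Microsoft.Xna.Framework.Audio;

public class VoiceRecorder
{
    readonly Microphone microphone = Microphone.Default;
    readonly MemoryStream recorded = new MemoryStream();
    byte[] buffer;

    public bool Start()
    {
        if (microphone == null)
            return false;            // no microphone attached

        // Ask for data in 100 ms chunks and size a buffer to match.
        microphone.BufferDuration = TimeSpan.FromMilliseconds(100);
        buffer = new byte[microphone.GetSampleSizeInBytes(microphone.BufferDuration)];

        microphone.BufferReady += OnBufferReady;
        microphone.Start();
        return true;
    }

    void OnBufferReady(object sender, EventArgs e)
    {
        // Copy the captured audio and append it to what we have so far.
        int bytesRead = microphone.GetData(buffer);
        recorded.Write(buffer, 0, bytesRead);
    }

    public void StopAndPlayBack()
    {
        if (microphone == null)
            return;

        microphone.Stop();
        microphone.BufferReady -= OnBufferReady;

        if (recorded.Length == 0)
            return;                  // nothing was captured

        // Microphone data is mono, so play it back the same way.
        DynamicSoundEffectInstance playback =
            new DynamicSoundEffectInstance(microphone.SampleRate, AudioChannels.Mono);
        playback.SubmitBuffer(recorded.ToArray());
        playback.Play();
    }
}
```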
As an aside I should note that recording audio is meant to record audio for use in a game. If you just want to use a microphone to communicate with your friends over Xbox Live, that comes for free.
If you're taking advantage of the networking API (Microsoft.Xna.Framework.Net) you can call LocalNetworkGamer.EnableSendVoice to control whether your voice is transmitted to a given remote gamer. With the addition of several read-only properties such as NetworkGamer.HasVoice, NetworkGamer.IsTalking, and NetworkGamer.IsMutedByLocalUser, developers can add interface elements to indicate who can talk, who is talking, and whether they're muted.
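As a rough illustration of how those read-only properties might drive an interface element, here is a small hypothetical helper that turns a gamer's voice state into a label you could draw next to their name. The VoiceStatus class and the label text are my own invention.

```csharp
using Microsoft.Xna.Framework.Net;

public static class VoiceStatus
{
    // Builds a short status string for a gamer in a network session.
    public static string Describe(NetworkGamer gamer)
    {
        if (!gamer.HasVoice)
            return gamer.Gamertag + ": no headset";

        if (gamer.IsMutedByLocalUser)
            return gamer.Gamertag + ": muted";

        return gamer.IsTalking
            ? gamer.Gamertag + ": talking"
            : gamer.Gamertag + ": silent";
    }
}
```

In a lobby screen you would typically loop over NetworkSession.AllGamers each frame and draw the result of Describe for every entry.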
Playing Video
And finally we come to playing video. Starting with XNA 3.1 Microsoft introduced the VideoPlayer class, located in Microsoft.Xna.Framework.Media. This class allows you to play videos (obviously).
To use it you simply:
- Construct an instance of the VideoPlayer class
- Call VideoPlayer.Play(video)
Then, each frame, call VideoPlayer.GetTexture and it will return the current frame of the video as a Texture2D object. This is great because it means it can be used with SpriteBatch to draw your video as an animated sprite, or applied to a 3D quad to be rotated, stretched, and then projected into any 3D universe. It's great for adding video to signs, soda vending machines, etc...
Initially the only way to gain access to a Video was through the Media Library (discussed previously), but in XNA 4.0 Microsoft added Video as an asset type, allowing you to Content.Load any WMV file via the Content Pipeline.
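Putting the steps above together, a minimal sketch of full-screen video playback looks something like this. The asset name "Intro" is an assumption and stands in for a WMV file added to the Content project.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;
using Microsoft.Xna.Framework.Media;

public class VideoGame : Game
{
    GraphicsDeviceManager graphics;
    SpriteBatch spriteBatch;
    Video video;
    VideoPlayer player;

    public VideoGame()
    {
        graphics = new GraphicsDeviceManager(this);
        Content.RootDirectory = "Content";
    }

    protected override void LoadContent()
    {
        spriteBatch = new SpriteBatch(GraphicsDevice);

        // Video is a regular content asset in XNA 4.0.
        video = Content.Load<Video>("Intro");

        player = new VideoPlayer();
        player.Play(video);
    }

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.Black);

        // GetTexture is only valid while the player is not stopped.
        if (player.State != MediaState.Stopped)
        {
            Texture2D frame = player.GetTexture();

            // Stretch the current frame over the whole viewport.
            spriteBatch.Begin();
            spriteBatch.Draw(frame, GraphicsDevice.Viewport.Bounds, Color.White);
            spriteBatch.End();
        }

        base.Draw(gameTime);
    }
}
```

Because the frame is just a Texture2D, the same texture could instead be assigned to an effect and drawn on a quad in a 3D scene.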
External References
MSDN Documentation:
AppHub Education Catalog