
As a corollary to my last post, here’s an analogy for you:

The difference between 16-bit (CD-quality) and 24-bit (“HD”) audio is 8 bits, and 2**8 = 256. That means that 24-bit audio can provide up to 256 times the amplitude “depth” of 16-bit.

Let’s compare audio bit depth with water depth. The surface of the water is analogous to unity gain, 0 dB (digital full scale, or 0 dBFS). Unity gain is the loudest representation of sound a digital audio signal can provide. When multiple successive samples hit unity gain, the signal isn’t describing anything during that time. It’s just being loud. Severely brickwall-limited audio can have lots of samples whose value is 0 dB. Think of this audio as a swimmer who stays on the water’s surface. They always swim on the top of the water, regardless of the depth.

Let’s say 16-bit audio is like a part of an ocean where there’s a 16-foot-deep coral reef. If a swimmer holds their breath, they can swim down below the surface and explore a bit, and find lots of neat stuff. There’s probably more interesting stuff under the surface of the water than at the top. When audio waveforms steer clear of unity gain (i.e., by not being clipped), more of their original resolution is present. They can be heard more accurately, just like diving below the surface of the water helps the diver see the objects that are underwater more accurately.

Now, let’s say 24-bit audio is water that’s 256 times deeper than that 16-foot section: 4,096 feet deep, or a bit more than three-quarters of a mile. Again, the swimmer could stay on the surface, but they’ll be able to find much more life underneath. Putting on some scuba gear or deep-sea-diving equipment, they’ll be able to go even deeper than when they hold their breath, and potentially find lots more stuff. When audio waveforms don’t hit unity gain and can use at least some of the extra resolution that 24 bits provide vs. 16, they can be played back with even more accuracy. However, if the swimmer (in the case of recorded music, the mastering engineer) stays near the water’s surface, they won’t find much.
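For anyone who wants to check the arithmetic behind the analogy, here is a minimal Python sketch. The function names are mine, made up for illustration, and the dynamic-range figure uses the usual rule of thumb of roughly 6.02 dB per bit for linear PCM, which the analogy above doesn’t depend on.

```python
import math


def quantization_levels(bits: int) -> int:
    """Number of distinct sample values a linear PCM word of `bits` bits can hold."""
    return 2 ** bits


def dynamic_range_db(bits: int) -> float:
    """Approximate theoretical dynamic range: 20 * log10(2**bits), about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)


# How many times more amplitude steps does 24-bit have than 16-bit?
ratio = quantization_levels(24) // quantization_levels(16)
print(ratio)        # 256

# Depth of the "24-bit" water if the "16-bit" reef is 16 feet deep.
print(16 * ratio)   # 4096

for bits in (16, 24):
    print(bits, round(dynamic_range_db(bits), 1))
```

Those extra steps only matter if the music actually uses them, which is the whole point of the swimmer.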

Mastering engineers need to learn to swim below the water’s surface before the music industry moves to 24-bit formats by default.

nin-theslip-24-96-wav-replaygain, originally uploaded by aharden.

Which of these tracks is not like the others? Brian Gardner at Bernie Grundman Mastering in LA was the mastering engineer for these 24-bit tracks, some of which get ReplayGain adjustments beyond -10 dB. That’s kind of like saying you have a 24-foot-deep pool when in reality you dug a 24-foot-deep hole and filled it with 10 feet of concrete before putting water in it.

I sent my Gmail account a 53 kB AMR sound file (about 1 minute of audio) from my phone and when it showed up in my Gmail inbox the attachment was a ~512 kB WAV file (mono, 8-bit). I checked my phone to make sure it hadn’t auto-converted the file before sending it and it says it didn’t. This means Gmail could be a handy part of an AMR-to-podcast solution. Besides its conversion and email gateway roles, it would serve as a handy data archive.
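The WAV size is about what you’d expect for uncompressed audio. Here is a rough Python sketch of the estimate. Note the assumption: I didn’t record the sample rate of the converted file, so the 8000 Hz below is a guess based on narrowband AMR’s usual rate, and the helper name is made up for illustration.

```python
def pcm_size_bytes(seconds: float, sample_rate_hz: int,
                   bits_per_sample: int, channels: int) -> int:
    """Size of the raw PCM audio data, ignoring the small WAV header."""
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


# ASSUMPTION: 8000 Hz sample rate; the post only states mono and 8-bit.
size = pcm_size_bytes(seconds=60, sample_rate_hz=8000,
                      bits_per_sample=8, channels=1)
print(size)          # 480000 bytes for exactly one minute
print(size / 1024)   # 468.75 KiB

# Rough ratio between the WAV and AMR sizes reported above.
print(round(512 / 53, 1))
```

Under that assumption a clip slightly longer than a minute lands right around 512 kB, so the attachment looks like a plain decode of the AMR with nothing fancy going on.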

New York Times, August 8, 2007: Slip in CD Sales Adds to Warner Losses:

Warner Music’s share price has fallen more than 50 percent this year as sales of recorded music struggle against digital music and piracy.

Electronic Musician, September 1, 2003: Masters on Mastering:

“It’s a losing battle for musicality,” [award-winning mastering engineer Bob] Ludwig laments. “To me, it’s a fact that highly compressed music is tiring to the ear and doesn’t make you want to listen to something over and over again. Could this be one of the reasons for the record industry’s demise?

“The problem is that many artists, producers, and A&R people are very short-sighted,” he continues. “If you take a new recording and compare eight bars of a piece that’s been mastered by four different engineers, often the loudest one sounds immediately the most impressive to the listener. Hardly anyone listens to 40 or 50 minutes of the whole recording and decides how the total musical experience was for them. Radio play used to be an excuse, but levels now are radically high, and it can be proven that the high levels make them more difficult to broadcast. Just ask Bob Orban, who makes many of the compressors used in FM stations around the world.”

The industry needs to own up to the sonic quality of its product and quit blaming new technologies and its customers for its losses.

Scott and I recently started recording weekly “Zubritsky’s Corner” podcasts, both to talk about sports in general and to prepare for the third full season of the BDFL podcast. This week was a watershed moment in our podcasting history. After I suggested that Scott and I could greatly improve the quality of the ‘cast by having each of us record our side of the Skype call with separate mics at CD-quality and then mixing the results together, he quickly purchased a decent recording kit. After setting it up with Audacity and sending me a test WAV, we were all set.

I haven’t sung the praises of REAPER in a while. It’s software I purchased last year that’s been my main audio recording/editing tool ever since. I used it to record my side of last Friday’s conversation; the signal chain was my venerable Radio Shack mic plugged into my Mackie 1202-VLZ mixer, which was monitored by my M-Audio Audiophile 24/96 card. It was very easy to monitor my recording level in real time with REAPER. Once we were done I saved out the new REAPER project. The next morning I received Scott’s recording and after about 15 minutes in REAPER I’d cut, cued, and panned our conversation. In another 15 minutes I’d pulled in our intro/outro music (Brad’s “Look and Feel Years Younger”), spliced it in with fades, set all the channel levels and applied the excellent W1 Limiter to the mix. Then it was simple work to render the project as a FLAC and hand it off to Foobar2000 for tagging and MP3 conversion. The results are here. Compare it to our podcast from the previous week. To my ears it’s a dramatic improvement. What do you think?

I’ve been following Justin Frankel’s REAPER project ever since Brad pointed it out. It recently went 1.0; it’s up to v1.06 now. It’s a well-conceived multitrack audio/MIDI editor for Windows that’s highly configurable. I’ve barely scratched its surface using it to edit, master, and render the last few BDFL podcasts. However, I know it’s for real, and at $40 for the non-commercial license, it’s a steal. Recommended. It’s unrestricted shareware, which means you can completely try before you buy. It’s too bad more audio software (especially plugins) isn’t like that.

A note if you buy: the RegNow service’s “download protection” add-on, which is offered when you make the purchase, appears to be useless. As long as you keep your issued key in a safe place, you can download the software from the REAPER site and activate it as opposed to relying on RegNow’s service.

I’m going to add another audio interface to my basement computer (which powers ICYG and Manolas’ color printing, among other things) since that’s been the best place for me to record the BDFL podcasts. Adding another mic-in/line-out combo will allow me to route Skype calls through a separate input on my UB802 mixer, and that means I can more easily control its level in real time. It will also simplify my signal routing a bit.

I’ve decided to go PCI vs. USB because I want to use a card that supports ASIO (low-latency I/O that routes around Windows’ audio mixer) out of the box and I don’t necessarily need the tactile control of levels that an external USB interface would provide. I was thinking about the Creative Audigy SE, which is cheap but apparently doesn’t offer ASIO support. Then I saw the Audigy 4 package, which includes a remote control and would probably allow me to get full-res DVD-Audio playback on my main upstairs PC. I could take out my M-Audio Revolution 5.1 card that’s attached to my Klipsch surround setup and move it downstairs to handle the recording duties. The Revo is a great card (with ASIO support) but doesn’t provide full-resolution DVD-Audio playback with WinDVD or PowerDVD.

I think I’ll go get the Audigy 4 today and I’ll blog about its pros/cons later. It’s been quite a while since I purchased a Creative sound card; my last was an AWE64 Gold. It’s still in use in my son’s computer.

Update: Got the Audigy 4, put it in, and installed all the included software (yes, I’m brave). The only thing I played with so far has been the DVD-Audio player’s rendering of my Yes “Fragile” DVD-A and it was superb. The included remote is compact and has a USB receiver; it’s convenient to use. I’ll have to look around and see if it’s truly locked to the Creative apps or if I can use it elsewhere (FB2K?) as well.

I’ve had a chance to compare some six-channel, 24-bit, 96kHz musical content encoded in MLP and FLAC. Remember this complaint? Well, let’s just say the mix sounds great on my M-Audio/Klipsch 5.1 setup regardless of the format. My comparison points:

  • FLAC at compression level 8 required 1.42% more disk space than the MLP.
  • The FLAC content can be played at full resolution with Foobar2000.
  • The MLP content can be played at 16-bit/48kHz resolution in WinDVD. (Thank you, DVD Forum, for deciding that my hardware isn’t worthy of full-resolution playback of content I’ve purchased. References here.)
  • The FLAC content can be easily transcoded to other formats for use with other software or devices I own.

Combine this excellent DVD-Audio reference page with this instructive page that includes a link to this archive, which contains a Win32 port of mkisofs, and you too can figure out how to make a correctly-laid-out image of a DVD-Audio disc on Windows. (Given the appropriate source files.)

Nerd alert!

After popping the DVD-V disc of Deadwing SE into my computer, I noticed that the DTS 5.1 version of the album played back more quietly than the PCM stereo version. I had already noticed that the CD version of the album was mastered with some pretty aggressive limiting, although unlike some other CDs, I don’t notice any distortion or crackling as a result, just a lack of headroom. To me, this translates into how much the music is allowed to breathe. But to be honest, the quieter passages of the album are given a decent amount of headroom in the stereo mix.

So, like any other audio nerd, I wanted to see if I could rip the DTS track, mix it down to stereo, and apply just enough tweaking to get a version of the album on CD that had more dynamic range than the official stereo mix.

I’d never even considered listening to DTS audio outside of DVD playback, but I knew that DVD Decrypter supported raw stream ripping. I used it to rip the full DTS streams off the DVD-V disc as .dts files, which Foobar2000 can play with the foo_dts plugin. Since I didn’t have any experience mixing down 5.1 to stereo, I found two FB2K plugins that could do it for me: ATSurround and Channel Mixer. I ended up using Channel Mixer to do the mixdown since it supported a “stereoimage width” parameter and level controls for center, rear, and subwoofer mixdown. I used the 1.25 setting on the stereoimage width, 0.9 for center and rear mixdown, and 1.0 for subwoofer mixdown. I also applied a little low-end EQ (-5dB@55Hz, -2dB@77Hz, -1dB@110Hz) to tame the bass. Playing back the DTS with these settings produced a stereo mix that sounded pretty decent and peaked at about -4.5dB. I took that into Sound Forge and used the Wave Hammer plugin to limit at 6dB and maximize. This trimmed the peaks while still leaving plenty of headroom.
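To give a feel for what a mixdown like this does to each sample, here is a small Python sketch. It is not Channel Mixer’s actual algorithm, which I haven’t seen; the function names, the channel order, and the mid/side approach to stereo width are all assumptions for illustration. Only the gain values (0.9, 0.9, 1.0), the 1.25 width, and the -4.5 dB peak come from the settings above.

```python
def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def downmix_frame(fl, fr, c, lfe, sl, sr,
                  center_gain=0.9, rear_gain=0.9, lfe_gain=1.0):
    """Fold one 5.1 sample frame down to stereo (hypothetical formula).

    Center and LFE are fed equally to both sides; each rear channel
    goes to the side it sits on.
    """
    left = fl + center_gain * c + rear_gain * sl + lfe_gain * lfe
    right = fr + center_gain * c + rear_gain * sr + lfe_gain * lfe
    return left, right


def widen(left, right, width=1.25):
    """Scale the side (L-R) signal to change stereo width (assumed mid/side method)."""
    mid = (left + right) / 2
    side = (left - right) / 2 * width
    return mid + side, mid - side


# One made-up frame: front left/right, center, LFE, surround left/right.
left, right = downmix_frame(0.10, 0.20, 0.30, 0.0, 0.05, 0.0)
left, right = widen(left, right)
print(left, right)

# A peak of -4.5 dB as a fraction of full scale (roughly 0.6).
print(round(db_to_linear(-4.5), 3))
```

Summing channels like this is also how phase trouble creeps in: anything that appears in more than one channel with a slight timing difference can partly cancel when folded together.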


Waveforms of the initial crash of the title track of Porcupine Tree’s “Deadwing”. From top to bottom: my rejiggered DTS downmix, the official stereo mix from CD, and the official stereo mix from the LPCM track of the DVD-A (which is 48kHz, 24-bit, but has similar amplitude).

Deadwing “Official Mix” (23 sec)
Deadwing “Alex’s DTS Downmix” (23 sec)

Listening to my downmixed version compared to the official stereo mix, the difference in volume is quite apparent. Also, I can hear phase problems with the drums in certain areas, and, of course, the panning of many of the musical elements is different. The dynamic range of my version is greater, but because of the discrepancies I can’t say I prefer it to the official stereo mix. I’d like an official stereo mix that had more headroom, but I guess that war has already been lost. Louder is better, right?