Cogitations of a Semi-Pro Wordsmith: July 2016

Friday, July 29, 2016

Independent Audiobook Creation Episode 2: Basic Terminology

I just uploaded the second episode of my series on Independent Audiobook Creation, where I go over some of the basic terminology that might be unfamiliar to writers looking to narrate their own audiobooks.

Next video (should be up Monday evening, MST) will show the hardware setup I'm using, and go over some of the options for hardware.

Let me know what you think!

Thursday, July 28, 2016

Intro video for Independent Audiobook Creation

The first video in my Independent Audiobook Creation series is up. It's a touch over 5 minutes long, and just gives a little background about where I'm going with the process.

Let me know what you think!

Tomorrow's episode 2 will cover basic terminology & sound concepts, primarily those that are used at ACX.

Episode 3 will cover my hardware setup, and suggest changes you might make in your own setup, and I intend to have that video up by Monday evening (MST).

Episode 4 will cover the software toolchain, and cover the recording and initial editing steps I use.

Episode 5 will cover the filters and engineering manipulations I use to enhance and prepare the audio files for submission.

And, finally, Episode 6 will be a basic summary of episodes 2–5, and wrap up the process.

Hope you enjoy them, and that I'm able to help at least one of you in your process!

Monday, July 25, 2016

Refined Audiobook Toolchain & Workflow

Welcome back to this installment of my personal journey toward narrating and producing the audiobooks of my own published books Hooray for Pain! and With It or in It. Today, I am going to detail the steps that I've come up with, hopefully in a non-technical enough way that they will be reasonably easy to follow and understand for anyone who is trying this for the first time.

As I mentioned in my last entry, I wanted to try to reduce the number of applications in my toolchain, as well as simplify the process in other ways. My aim was to try to get to the point where I could use only the free tool Audacity. I have found, however, that Audacity has some interface issues that make it very difficult for me to use for the editing stage of the process, and so I have to record & edit the audio in Sound Studio before moving into Audacity for most of the filtering.

To make this a little simpler, I basically see this as a three-step process:

Record the audio
Edit out the mistakes
Filter & enhance the remainder

Let me break it down a little bit more.

First, record the audio. Have your microphone, software, book (or Kindle or whatever), studio, hot tea, water, and such all ready before you sit down to record. Hang up the "do not disturb" signs, or whatever else you need to do. Then, record your narration. If you make a small mistake while recording (such as skipping over or mumbling a word), make a sharp noise, and start over from the most recent natural pause. I snap my fingers in front of the mic; other people clap. Whatever you choose, it should leave a distinctive mark in the audio software's waveform display, which makes the mistake easier to spot later when editing. Record as long as you are comfortable, and then stop. I always make sure to record a few seconds of the studio in as near to silence as I can manage, both before and after the narration. This is called "room tone" and serves several purposes, one of which is giving you clean silence to paste in later wherever a pause needs to be lengthened.

Second, once a recording is complete, edit out mistakes. By "complete," of course, I mean once you've finished recording a particular section, chapter, poem, or whatever. You can wait until you've finished recording every section, or you can edit each one as soon as you stop recording; it's really up to you. The only caveat is that for ACX (and for your listeners, even if you don't or can't use ACX for your audiobook publishing), it is important that the entire finished product has a similar sound throughout, so replicating the conditions of recording is vital. Having the finished product meet certain technical requirements (such as ACX's requirements) will help, but if you record several chapters in one location, and then change locations for others, your listeners might notice the difference in quality and background.

In any event, sit down with your editing software and listen to the finished product. Delete any segments where you made mistakes—this is where those snaps or claps come in handy—until you have only the correct narration of your work, with all of the reading mistakes edited out. Then, listen to it again from beginning to end. If you have edited out any mistakes, sometimes it's hard to tell exactly how long a pause should be until you hear it in playback. If you find that a pause is too long, you can always edit it out; conversely, if you've recorded a few seconds of silent room tone you can copy a few dozen milliseconds of it, and use the "paste" function to insert a little extra pause where needed.
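For the curious, the "paste some room tone into a pause" trick can be sketched in a few lines of Python. This is only an illustration of the idea, not what any editor does internally: the function name `insert_room_tone`, the list-of-numbers representation of the audio, and the sample values are all my own assumptions.

```python
def insert_room_tone(track, room_tone, at_index, duration_ms, sample_rate=44100):
    """Return a copy of track with duration_ms of room tone pasted in at at_index.

    track and room_tone are plain lists of samples (hypothetical representation).
    """
    n = int(sample_rate * duration_ms / 1000)  # milliseconds -> number of samples
    if n > len(room_tone):
        raise ValueError("not enough recorded room tone for that duration")
    # Everything before the cut point, then the borrowed silence, then the rest.
    return track[:at_index] + room_tone[:n] + track[at_index:]


# Toy example: a 10-sample "narration", lengthened by 50 ms at a 1 kHz sample rate.
narration = [1.0] * 10
silence = [0.0] * 100
lengthened = insert_room_tone(narration, silence, at_index=5, duration_ms=50,
                              sample_rate=1000)
print(len(narration), "->", len(lengthened), "samples")
```

The reason to paste recorded room tone rather than pure digital silence is that the background of the inserted pause then matches the rest of the recording.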

Third, once you have a correct narration, with the spacing and pauses and everything else in their proper place, you will want to filter & enhance the sound file. Exactly what, how, and how much will depend on your specific recording environment, software, microphone, and so forth, but here are some tools common to many software packages to consider (all are named using the conventions in Audacity where applicable, although most of them are pretty common names for pretty common sound concepts):

Noise Reduction
Compression
Limiting
Click Removal
Ducking
Equalization (or Treble/Bass reduction or increase)
Echo or Reverb

Some of these are worth using on every recording, and the order in which you apply them matters, at least in theory. Also, most of these filters/effects/modifications are applied to the entire file (the exception is Ducking, which I explain below).

So, based on my experimentation and output results so far, this is my workflow. Please note that this has not yet been submitted to ACX, and even if it had I can't guarantee your results will be the same, but this setup should at least get you on the right track.

I always do a noise reduction first. That keeps later processing steps from amplifying the noise along with the narration, which helps keep the noise floor as low as possible. For ACX, the noise floor can be at most -60 decibels (dB), so removing the noise early makes sense to me. It also makes the next step easier, since the compressor has a much quieter floor to work against.
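To see why the order matters, here is a rough Python sketch of what amplification does to background noise. The room tone here is a made-up faint 60 Hz hum, and the helper names (`rms_db`, `apply_gain`) are my own; the only number taken from above is the -60 dB limit.

```python
import math

ACX_NOISE_FLOOR_DB = -60.0  # the limit mentioned above


def rms_db(samples):
    """RMS level of float samples (full scale = 1.0), in dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def apply_gain(samples, gain_db):
    """Amplify every sample by gain_db decibels."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


# Hypothetical room tone: a faint 60 Hz hum, one second at 44.1 kHz.
room_tone = [0.001 * math.sin(2 * math.pi * 60 * n / 44100) for n in range(44100)]

before = rms_db(room_tone)
after = rms_db(apply_gain(room_tone, 6.0))  # e.g. 6 dB of make-up gain later on

print(f"before gain: {before:.1f} dB, passes: {before <= ACX_NOISE_FLOOR_DB}")
print(f"after +6 dB: {after:.1f} dB, passes: {after <= ACX_NOISE_FLOOR_DB}")
```

In this toy case the hum starts out a few dB under the limit and ends up a few dB over it, purely because a later step made everything louder. Any gain applied after this point raises the noise by the same number of decibels as the voice.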

Next, I run a compressor on the file. I use the following settings:

My Compressor settings in Audacity

After entering those settings, I click OK to start the process, which only takes a few seconds. In this context, compression essentially brings the valleys up and the peaks down, so that the overall range of sound levels (called the dynamic range) is tighter; that is, it squeezes the dynamic range together a bit. This allows the sound to be amplified more without distortion or clipping (both of which would sound horrible in this context). With the "Make-up gain" checkbox ticked, the compressor also amplifies the resulting audio, increasing its overall signal level. The noise floor slider is something you will likely want to play around with. I have found that in my setup, -50 dB ensures that all of my vocals are picked up (and not clipped), without triggering on other noises that might be present (like the A/C kicking on, for example). Yours might be different, but I suggest setting it no lower than the ACX noise floor of -60 dB, and you will likely want it higher than that.
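If it helps to see the "peaks down, valleys up" idea in numbers, here is a deliberately simplified Python sketch. It is not Audacity's compressor: it works one sample at a time with no attack, release, or noise floor setting, and the threshold and ratio values are placeholders I picked for the example, not my actual settings.

```python
import math


def compress(samples, threshold_db=-12.0, ratio=2.0, makeup=True):
    """Very simplified per-sample compressor (illustration only)."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > 0:
            level_db = 20 * math.log10(mag)
            if level_db > threshold_db:
                # Above the threshold, keep only 1/ratio of the overshoot.
                target_db = threshold_db + (level_db - threshold_db) / ratio
                mag = 10 ** (target_db / 20)
        out.append(math.copysign(mag, s))
    if makeup:
        # Make-up gain: scale everything so the loudest peak sits at 0 dB.
        peak = max((abs(s) for s in out), default=0.0)
        if peak > 0:
            out = [s / peak for s in out]
    return out


# One quiet sample (-20 dB) and one loud sample (0 dB).
quiet, loud = compress([0.1, 1.0])
print(f"quiet: 0.1 -> {quiet:.3f}, loud: 1.0 -> {loud:.3f}")
```

Before compression the loud sample is ten times the quiet one; afterwards it is only about five times as large, and the quiet sample has roughly doubled. That narrower gap is the reduced dynamic range.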

When the compressor finishes, the resulting sound file is quite a bit louder, but the peaks of it (at 0 dB) are too loud for ACX (which requires a max of -3 dB). So my next step is to run a Limiter on the file:

My Limiter settings in Audacity

What a limiter does kind of depends on the type selected. A "limit" (hard or soft) will compress any sound peaks that go above the "Limit to" setting so that they don't breach that Limit; a "clip" will cut the sound off if it goes higher than that Limit. By setting a "soft" limit, as the sound approaches the Limit it will be progressively diminished, more diminished the higher it goes. It will start to diminish those peaks before it reaches the Limit, which results in more softly rounded peaks. A "hard" limit will not diminish the peak until it breaches that Limit, and then will compress it, resulting in a more flattened peak. The Clip settings will distort the sound, especially the "hard" clipping, and I don't recommend it for this purpose (though, you may find through your own experimentation that it works for your situation). The soft limit gently rounds off the peaks, and at -3.1 dB will keep them below the threshold for ACX.
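The difference between flattening a peak and rounding it off can be sketched in Python. To be clear about the assumptions: this is not how Audacity's Limiter is implemented. A real limiter adjusts gain over time, while this toy works on individual samples, and the `tanh` curve is just one convenient way to get a gradual, rounded approach to the limit.

```python
import math


def db_to_linear(db):
    """Convert a level in dB (0 dB = full scale) to a linear amplitude."""
    return 10 ** (db / 20)


def hard_clip(samples, limit_db=-3.1):
    """Chop off anything beyond the limit, leaving flat-topped peaks."""
    limit = db_to_linear(limit_db)
    return [max(-limit, min(limit, s)) for s in samples]


def soft_limit(samples, limit_db=-3.1):
    """Ease samples toward the limit so peaks are rounded, never exceeding it."""
    limit = db_to_linear(limit_db)
    return [limit * math.tanh(s / limit) for s in samples]


limit = db_to_linear(-3.1)
peaks = [0.2, 0.6, 0.9, -1.0]
clipped = hard_clip(peaks)
rounded = soft_limit(peaks)
for original, c, r in zip(peaks, clipped, rounded):
    print(f"{original:+.2f} -> clip {c:+.3f}, soft {r:+.3f}")
```

With the clip, every sample under the limit is untouched and every sample over it lands on exactly the same value, which is where the distortion comes from. With the soft curve, louder samples are reduced progressively more, so the shape of the peak survives in a squashed form.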

Next, I run the Click Removal, which helps to eliminate any stray lip smacking or sharp breath noises I might have made while recording.

At this point, I stop and listen to the entire piece from beginning to end. If I hear any lip-smacks or breathing noise that the Click Removal didn't catch, I will stop and select a few milliseconds before and after the noise, and then duck the audio. Essentially, what that means is that on only the selected audio, I decrease its amplitude so that it is no longer audible on normal listening. (Technically, ducking is when you decrease the sound of a given track below that of other audio present, but the process is the same whether there is other audio present or not, so that's what I'm going with.)
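In code terms, what I'm calling ducking is about as simple as audio processing gets. Here is a minimal Python sketch, assuming the audio is just a list of samples; the function name `duck` and the -20 dB figure are placeholders of mine, not a setting from any particular tool.

```python
def duck(samples, start, end, reduction_db=-20.0):
    """Return a copy with samples[start:end] turned down by reduction_db."""
    gain = 10 ** (reduction_db / 20)  # -20 dB is a factor of 0.1
    return [s * gain if start <= i < end else s for i, s in enumerate(samples)]


# Toy example: turn down samples 3, 4, and 5 of a ten-sample clip.
clip = [0.5] * 10
ducked = duck(clip, start=3, end=6)
print(ducked)
```

Only the selected stretch changes; everything on either side is left exactly as recorded. In practice you would pick the start and end a few milliseconds on either side of the noise, as described above.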

Once I've gotten through the entire file and removed as much of the distracting breathing and lip noise as I can identify, I give it another listen. This time, I'm listening for the balance of treble & bass tone. If it sounds too tinny, I'll use the equalizer to adjust the relative balance of bass, mid tones, and treble sounds. Conversely, if it's too bass-rich, I'll equalize in the other direction. Some software will have specialized equalizer filters for Bass or Treble Enhance (or Reduce), and you can experiment with those to hear how it sounds.

In addition, there's reverberation (reverb) to consider. Initially, I was using a tool in Final Cut called a doubler, which essentially plays a duplicate of the selected sound a few milliseconds later, adding some depth to the sound. Audacity's Echo effect does basically the same thing. The essential difference between echo (or doubler) and reverb is that echo simply duplicates the sound, with a delay (how long between the original and the echoed sound) and a decay (how much of the sound is carried into each repetition). Any decay below 1 will reduce the sound with each echo, and a very small decay (such as 0.05, say) will result in the sound dying out quickly. Reverb, by contrast, uses mathematics to simulate what the sound might be like if it were bouncing off of different surfaces in rooms of different sizes; you specify the room size (as a numeric value, not in square feet), and there are a lot more options to consider. I have experimented with both and have found that a straight echo with a delay of 0.05 seconds and a decay of 0.05 gives my vocal recordings more richness without the obviousness of reverberation. However, I encourage you to experiment and find your own settings (and share them below if you are inspired to do so)!

One thing to keep in mind: if you decide you want to use echo or reverb, I recommend either doing it before you apply the limiter, or setting your original limiter a bit lower (say, -3.5 or -3.8 dB). The echo or reverb effect may increase the peaks, and if so it may pop you over the ACX submission threshold.
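Here is a small Python sketch of a delay-and-decay echo, which also shows how an echo can nudge a peak upward. I'm assuming a simple feedback design, where each output sample is the input plus a decayed copy of the output from one delay earlier; I can't promise that is exactly what Audacity or Final Cut do internally. The tiny sample rate is only there to keep the numbers readable.

```python
def echo(samples, sample_rate, delay_s=0.05, decay=0.05):
    """Feedback echo: out[n] = in[n] + decay * out[n - delay]."""
    delay = max(1, round(delay_s * sample_rate))  # delay in samples
    out = []
    for n, s in enumerate(samples):
        fed_back = decay * out[n - delay] if n >= delay else 0.0
        out.append(s + fed_back)
    return out


# A single click, at a toy sample rate of 100 Hz (so 0.05 s = 5 samples).
impulse_response = echo([1.0] + [0.0] * 14, sample_rate=100)
print([round(x, 4) for x in impulse_response])

# Two equal peaks exactly one delay apart: the echo of the first
# lands on top of the second and makes it taller.
stacked = echo([0.7, 0.0, 0.0, 0.0, 0.0, 0.7], sample_rate=100)
print(round(stacked[5], 3))
```

In the first example the click repeats every five samples, each repeat being 0.05 times the previous one, which is why a small decay dies out so quickly. In the second, a peak that started at 0.7 ends up a little higher, which is the situation the limiter advice above is meant to cover.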

Finally, the last thing I do is run the ACX check on the file. Here is the output from that for my latest recording file:

Output from the ACX Check analysis

As you might expect, this tool is very handy, but it is not a guarantee that the file will pass ACX once submitted; it just gives you a sense of where the file stands. I believe that this will probably work fine most of the time, however, so I am definitely using it at least as a baseline.
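To give a sense of what a check like this looks at, here is a rough Python sketch that measures the same three things: peak, RMS, and noise floor. This is not the ACX Check plugin's code. The -3 dB peak and -60 dB noise floor limits are the ones discussed above; the RMS window of -23 to -18 dB is my understanding of ACX's published requirement and isn't covered in this post, so treat it as an assumption. The test tones are invented.

```python
import math


def to_db(amplitude):
    """Linear amplitude (full scale = 1.0) to dB."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def acx_style_check(narration, room_tone):
    """Measure peak, RMS, and noise floor, and compare against the limits."""
    peak_db = to_db(max(abs(s) for s in narration))
    rms_db = to_db(rms(narration))
    floor_db = to_db(rms(room_tone))
    return {
        "peak_db": peak_db, "peak_ok": peak_db <= -3.0,
        "rms_db": rms_db, "rms_ok": -23.0 <= rms_db <= -18.0,  # assumed range
        "noise_floor_db": floor_db, "noise_floor_ok": floor_db <= -60.0,
    }


# Stand-ins for real audio: a steady 220 Hz tone and a faint 60 Hz hum.
rate = 44100
narration = [0.15 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
room_tone = [0.0005 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate)]

report = acx_style_check(narration, room_tone)
for name, value in report.items():
    print(name, value if isinstance(value, bool) else round(value, 1))
```

Here the noise floor is measured from a separate stretch of room tone, which is one reason to record some at the start and end of every session.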

Now, at this point sometimes there are special effects that need to be considered. For example, in one part of With It or in It, I included a paraphrased transmission that was sent over military radio frequencies to our unit on a gunnery range. After recording the audio, I saved the file as an AIFF file and imported it into Final Cut (since I couldn't find a plug-in for Audacity that performed this effect), where I applied the "Car Radio" filter to just that section of the recording. It made that section sound like it was being transmitted over the radio, which was exactly the effect I was looking for. Once done, I exported that file back out as an AIFF and opened it in Audacity to re-check with the ACX Check tool, just to make sure I hadn't messed up the floor or peak levels (mostly; this section was only a few seconds and was unlikely to affect the RMS level significantly).

Once this part has been completed, I recommend sending this file to a trusted friend (or a solid beta reader/listener, if you have one willing to help) for a second set of ears. It especially helps if they are familiar with your book, but this is not strictly necessary. Ask them for their feedback on the overall sound, any distortion, clipping, or distractions that may be present that you missed, and overall flow. Sometimes, you will discover that you thought you cut out a mistake but didn't, or that you accidentally skipped a word, sentence, or paragraph (or, accidentally cut them out).

Finally, when you are ready to submit to ACX (if you are going this route), you'll need to save the finished file as an MP3 file, with a bit rate of at least 192 kbps, Constant Bit Rate, and as a mono file if you can (or a stereo if you can't, but do not create a "joint stereo" if your software gives you that option).

So, that's where I am right now with this workflow. During the rest of the week (in addition to other things) I will be working on getting up those videos I discussed before, which hopefully will help elucidate any unclear areas. Let me know what you think by leaving a comment below!

CloudAge™ Author news: Scrivener for iOS is finally released!

After a many-year-long wait, Literature and Latte has released the iOS version of their outstanding writing software, Scrivener. Although I have not yet had a chance to purchase & use it—when I do, I will bring you a full review, of course!—all of the reviews I've run across have been very positive.

According to L&L, the software will sync (through Dropbox) with your macOS, Windows, and all iOS devices (those that it can run on, naturally), as long as the macOS or Windows versions are the latest (macOS: version 2.8, Windows: version 1.9.5). It requires iOS 9.0 or higher, and will run on the iPhone albeit with a few features missing (corkboard, for example).

If you've been thinking about transitioning to Scrivener, and have been waiting for the iOS version (maybe you have a desktop at home and bought an iPad instead of a laptop), then now seems like the perfect time to do so!

If you're already using it, share your experiences below and let everyone know how it's working for you. After I've had a chance to grab a copy myself, I will blog on some of the things that are likely to be most important, such as how it works as a daily-use writing tool, sharing for edits (with other Scrivener users, for example), research & writing, and preparing for submissions & publication.

On an only tangentially related topic, I am still working on getting my workflow perfected for my audiobook, and am still aiming for a blog entry tomorrow on that subject. Stay tuned!

Saturday, July 23, 2016

Audiobook Workflow & Tools, updated

So, I'm still waiting for feedback from ACX, but even if (as I suspect) their feedback is that the files do not meet their technical requirements, I do have some updates for everyone based on my trials and tribulations (if you haven't followed along, check out the first few entries in this topic posted over the last few days).

First, a slight update to the hardware. Instead of the Røde VideoMic Me, I am now using an Audio-Technica ATR-55 that I found for a steeply discounted price in a secondhand store. This is a dynamic shotgun mono microphone, with a tunable pickup pattern. It has three settings for pickup: off, cardioid, and hypercardioid (labelled "tele" on this microphone).

For the novice, the pickup pattern describes, in essence, the shape of the directions in which the microphone "hears." Without getting into too much technical detail, the "cardioid" pattern refers to a pickup pattern shaped somewhat like a human heart (for an excellent explanation, as well as an equally excellent diagram, check out this answer from music.stackexchange.com). Shotgun mics have a construction that tends to emphasize sounds to the front of the microphone (and, to a lesser extent, the back), although they do also take in sounds from the sides (off-axis, if you think of the on-axis as running along the length of the microphone). Essentially, what happens is that the microphone takes the sounds that come in off-axis and de-emphasizes them in favor of the sounds that come down the on-axis. Turning on the "tele" setting on the ATR-55 basically makes that cardioid shape thinner, and stretches it down the on-axis a little bit farther to the front (and a little bit to the back, as well).
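For anyone who likes numbers, the textbook versions of these patterns can be written as a one-line formula, sketched here in Python. The big caveat: these are the idealized cardioid and hypercardioid shapes, and I'm assuming they are a fair stand-in for the ATR-55's two settings. A real shotgun microphone behaves in a more complicated way, and I have not measured mine.

```python
import math

# Idealized first-order pattern: sensitivity = a + (1 - a) * cos(angle),
# where 0 degrees is straight ahead along the microphone's on-axis.
CARDIOID = 0.5
HYPERCARDIOID = 0.25


def pickup(angle_deg, a):
    """Relative sensitivity at angle_deg for a pattern with parameter a."""
    return a + (1 - a) * math.cos(math.radians(angle_deg))


for angle in (0, 90, 180):
    print(f"{angle:3d} deg  cardioid {pickup(angle, CARDIOID):+.2f}  "
          f"hypercardioid {pickup(angle, HYPERCARDIOID):+.2f}")
```

Both patterns are at full sensitivity straight ahead. At the sides the hypercardioid picks up half as much as the cardioid, which is the "thinner" shape, and directly behind it has a small lobe of its own (the negative sign just means the signal is inverted), which is the "little bit to the back."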

I have also been doing some deliberate digging to find ways to not have to use Final Cut Pro X—which is still an excellent tool, don't get me wrong. It's just that it's really for video editing, and eats a lot of RAM & CPU time on my computer as a result. It may also not be your best bet, if all you want to do is narrate your own book, at home, because it is a $299 investment. It is really, really awesome, though, so if you already have it for some reason it is a great tool.

So I looked into some ways I could streamline by trying to have Sound Studio and Audacity do more work, possibly using Final Cut only for one or two steps (if at all) that couldn't be done in the other two. I have not completed this process, but I believe that I may be able to only use Audacity, and skip Sound Studio and Final Cut Pro X. When I finalize this process, I'll give more details. I am still planning to do a series of short videos as well, covering these topics:

Basic Sound Terminology and Concepts

This video will be a few minutes long, and will explain some of the terms used in sound editing & engineering, for the writer (who may not care other than the fact they need to have a basic grasp in order to self-publish audiobooks)

Hardware Setup

This will likely detail my specific setup, and point out where you might make changes or use different equipment (and why)

Software Toolchain and Workflow

This will show the tools I use, and then follow a single sound file from recording, to editing, to filtering & other manipulations, and finally to saving in the final format for submission.

These tools will be for Macintosh (since that's what I have and use), but will be conceptually identical on PCs running Windows or Linux—especially with Audacity, since it works on all of these platforms—and should be easy to understand in general.

I intend to have this workflow nailed down by about Monday evening (Mountain Standard Time, since I live in Arizona). That will likely result in a blog on the details Tuesday, and having the videos up starting probably Friday, 29 JUL 2016.

Thursday, July 21, 2016

Updated Audiobook Progress

First, the bad news: I still haven't heard back from ACX about the sample files that I uploaded, but I am reasonably sure they aren't going to meet their technical requirements.

And that's actually good news. This is because I found some tools that helped me determine, with reasonable confidence, what ACX is looking for, and whether those files meet their criteria or not. And they don't, which means I need to re-record them, but that's actually okay.

It's okay, because I have learned a few things that are streamlining my audiobook recording process.

I hadn't been paying as much attention to the exact mic set up as I should have, such as its exact position, distance from my face when speaking, etc. Now, I have a quite precise measurement and arrangement for these, which has made the recordings much more consistent.
I found a specific tool for analyzing against ACX's published specifications: the ACX Check plugin for Audacity. As they note in the wiki entry for the plugin, using this is not a guarantee that ACX will accept your audio, but it does provide excellent information about some of the technical requirements (noise floor, peak, and RMS levels specifically). You can find out where you stand. Fixing it, that's another show (sorry, Alton). (I do intend to do a couple of videos on how to use Final Cut Pro X's audio tools, and Audacity, to create ACX-ready audio files.)
I have been experimenting with the filters in Final Cut Pro X, Audacity, and Sound Studio, and have come up with a pretty good workflow for that, which I will detail in an upcoming blog entry.

So, all in all, the progress has been good even if the output hasn't yet been validated (and probably won't be until I re-record those portions I'm reasonably sure will not be validated). And, in a couple of days, I'll walk you through my updated workflow.
