In case you haven’t heard, I’m creating my own audiobook for my summer launch of Forewarned. I started the process in January, and I’m just now putting on the finishing touches, almost 5 months later. So why would I (or anyone!) decide to do something so daunting?
The truth is, Amazon’s KDP is making it hard to say no to AI-generated audiobooks. If you’ve recently checked your dashboard, you’ll have seen the message: Add audiobook with virtual voice. And: Your eBook is eligible for an audiobook. Unless it isn’t, which is the case with my second edition of Best Kept Secrets. That’s because my publisher shared the rights to the first edition with Blackstone Publishing (Audible) and they’re still selling that audiobook on Amazon.
My point is, it’s SO EASY to take the easy way out.
But have you heard an AI audiobook? My community reports that the virtual voice lacks emotion and expression. I hear you can choose the tone of the voice (female, male, gruff, sweet), but it’s lackluster. Most of all, it lacks the ability to give meaning to the words.
I thought I could do better. So before embarking on this project, I checked a few boxes.
I majored in recording and engineering at Berklee College of Music. I knew a little bit about getting a good quality sound.
I had an 8-Track recording studio for a few years in Boulder, CO. (ditto above)
I’ve been recording my piano students for fifteen years using Audacity. I knew editing in this software was easy; I’d done it before. Easy-peasy.
I’ve had practice reading books and acting on stage. I was in the Verona Area Community Theater for about 10 years, and though I played supporting roles, I learned a lot about intonation and vocal expression.
I read ALL of the Harry Potter books to my kids. Trust me, by the time the last book came out, my kids were old enough to read it themselves. My daughter was a senior in high school. But by then, it was just the way we did things. Mom reads the Harry Potter books with all the voices. (Yep. I did that. LOL!)
I thought recording would only take a couple weeks.
Okay, I was clearly mistaken about that last thing. The point is, I had the confidence. I had the know-how, and all we needed was a great vocal mic.

We started recording in mid-January. We used Audacity software and my new laptop. We made a new file out of each chapter, and decided to file the chapters in groups of 12. There are 74 chapters, the epilogue, and the intro and closing remarks. But within a week, we had a problem.
Only about 60% sounded good even before editing. The recordings kept having patches of muffled or squeaky spots. It was peaking and distorted at times. We couldn’t figure out why. Eventually we learned that when OneDrive updated, and it updates often, it stole power from the app and from the microphone. I had too many apps on my computer for it to work. Even when we shut down OneDrive, we had issues.
We bought a new laptop to compensate for the storage and did not connect it to Chrome or the internet. Then we re-recorded the first 13 chapters.
We were in business again, but it was very slow going. I was able to record about 12 chapters a week. But then I found post-production to be the most time-consuming part of the process. I didn’t realize the number of times I’d clear my throat, cough, swallow, repeat a phrase to clarify the expression, repeat a phrase because I’d read it wrong, stop to take a drink of water, pause for too long, not pause long enough… and the list goes on. All those little things have to be edited out of the recording.

And did I mention leveling the peaks? Oh boy. When those characters get excited or angry, the levels go off the charts and the sound distorts a little. The peaks across the entire recording have to stay at or below -3 dB or ACX won’t take it. The first time I listened back, I removed all the breath sounds and repeated phrases. I deleted long stretches of silence between sentences and other extraneous sounds. The second time through the entire recording, I leveled all the peaks and tagged the places that I needed to re-record. Ugh, there were up to 10 re-recorded snippets per chapter. Punching them in took time, but I got it down to a science.
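For anyone curious what "leveling the peaks" means in numbers, here is a minimal Python sketch of a peak check. It is only an illustration, not what Audacity does under the hood: the function names and the sample values are made up, and the ceiling is passed in as a parameter, so plug in whatever figure your distributor's current spec gives.

```python
import math

def peak_dbfs(samples):
    """Return the loudest peak in dB relative to full scale.

    Assumes samples are floats between -1.0 and 1.0, where 1.0 is
    full scale (0 dB). This is a hypothetical helper for illustration.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # pure silence has no measurable peak
    return 20 * math.log10(peak)

def passes_peak_ceiling(samples, ceiling_db):
    """True if no sample is louder than the given ceiling (in dB)."""
    return peak_dbfs(samples) <= ceiling_db

# Made-up sample values: a calm line and a shouted line.
quiet_line = [0.0, 0.25, -0.5, 0.1]
shouted_line = [0.0, 0.9, -1.0, 0.4]

ceiling = -3.0  # assumed ceiling for illustration; check your platform's spec
print(round(peak_dbfs(quiet_line), 2), passes_peak_ceiling(quiet_line, ceiling))
print(round(peak_dbfs(shouted_line), 2), passes_peak_ceiling(shouted_line, ceiling))
```

The shouted line hits full scale, so it fails the check and would need to be brought down before the file is submitted.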
At the end of the day, the first 3 12-chapter files are each about 2 hours long. That’s 36 chapters. Somewhere around the middle of the book, the chapters get shorter. So the next 2 12-chapter files were each about an hour and a half long. The last file, chapters 60 through the closing remarks, was again about 2 hours long. The total length in real time is about 11 hours. If you’re like me, when you listen to an audiobook you’ll speed it up a little. Most of us read faster than that anyway.
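If you want to check my math, the running time adds up like this. It is a quick sketch using the approximate file lengths above; the 1.25x playback speed is just an assumed example of "a little" faster, not a number I measured.

```python
# Approximate length of each file in hours, in recording order:
# three 2-hour files, two 1.5-hour files, and the final 2-hour file.
file_hours = [2.0, 2.0, 2.0, 1.5, 1.5, 2.0]

total_hours = sum(file_hours)

def listening_hours(hours, speed):
    """Time needed to hear `hours` of audio at a given playback speed."""
    return hours / speed

assumed_speed = 1.25  # assumption: a typical slightly faster playback setting

print(total_hours)
print(round(listening_hours(total_hours, assumed_speed), 1))
```

At that assumed speed, the roughly 11 hours of audio takes a little under 9 hours to hear.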
The eleven hours of recorded material took me five months to create. Between trips to see the grandkids and a bathroom remodel, there were interruptions. And in April, I knew that if I wanted to finish it before the book launch, I had to put aside all my other writing and focus solely on this project.
I have one last thing to add to the recording, and the grandkids are coming to visit again. So next week I’ll record the Chopin Trois Etude no. 1 for the intro and the closing. The piece has relevance to my character Daphne. Throughout the book she wants to practice, but the ruined old piano at the lake house is too out of tune and has too many dead keys. I want my readers to hear the piece so they know how it sounds.
Today I’m telling everyone, “I’ll never do this again.” But like the pain of childbirth, you forget. And now that I’ve streamlined my process, it will likely take much less time. Especially now that I know what I’m in for.
On the other hand, AI is improving all the time. It’s possible that by the time I want another audiobook, virtual voices will be able to emote. Will it be the same as a human voice? Will it be as nuanced? I still have my doubts. And I don’t think for a moment that it will be easy. Even with an AI recording, you have to listen and make sure it pronounces the words correctly. Because AI doesn’t know everything.
A coming-of-age story for every generation! I read this in two sittings. It was a bit like watching a train wreck - I could not look away. I so wanted to mother and lecture those children! Absolutely amazing. The characters were so real - like watching my siblings make all the stupid mistakes they made when we were young. Phillips expertly weaves the uncertainties of puberty, friendship, and acceptance into characters whose secrets conflict and collide in a deadly climax. The summer of 1976 setting comes alive, nostalgic in its innocence and heartbreakingly accurate in its crumbling family values, sucking the reader in and never letting go.
~ Sharon Lynn, Award-winning author of A Cotswold Crimes Mystery series
Tracey — My hat is off to YOU!
Wow, what an ordeal. You've convinced me that it's not something I want to tackle.