Creating an audiobook used to mean booking a recording studio, hiring voice actors, scheduling engineers, and spending weeks (and thousands of dollars) in post-production. A single finished hour of audiobook narration costs between $5,000 and $15,000 in a traditional studio — and that's for a single voice, in a single language.
AI has changed this entirely. With Notevibes, you can upload your manuscript, let AI detect every speaking character in your book, assign each one a unique voice, generate professional-quality narration for every chapter, and publish your finished audiobook — all in a single sitting.
This guide covers every step in detail, with mockups of the actual Notevibes interface so you know exactly what to expect. Whether you're a self-published author, a publisher looking to expand your catalog, or a content creator exploring audio, this is the complete playbook.
What You'll Need
Your book file
EPUB, Kindle (.mobi/.azw3), PDF, DOCX, or plain text
A Notevibes account
Free to sign up, plans start at $19/month
A few minutes
Upload to finished audiobook in one afternoon
That’s it
No studio, no actors, no audio engineering skills needed
Upload Your Manuscript
Start by navigating to the Audiobook Narration page and clicking “Create Your Audiobook.” You'll see a drag-and-drop upload area that accepts five book formats:
Drop your book file here
or click to browse
EPUB
.epub
Kindle
.mobi / .azw3
PDF
.pdf
Word
.docx
Plain Text
.txt
EPUB — The gold standard. Notevibes reads the EPUB table of contents, detects chapters automatically, and preserves the full document structure. If your book has chapter headings, they'll be extracted perfectly.
Kindle (.mobi / .azw3) — Direct upload with chapters preserved. Book metadata (title, author, publisher) is extracted automatically.
PDF — AI-powered text extraction with layout detection. Handles multi-column layouts, detects and skips page numbers, headers, and footers. Converts tables to spoken descriptions.
Word (.docx) — Heading styles (H1, H2, H3) are automatically converted to chapter boundaries.
Plain Text (.txt) — Use custom chapter markers (like --- or Chapter N) for flexible delimiter support.
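To make the plain-text option concrete, here is a minimal sketch of how chapter markers like --- or Chapter N could be used to split a .txt manuscript. It is an illustration only, not Notevibes' actual parser: the function name split_chapters, the marker pattern, and the "Untitled" fallback title are all assumptions.

```python
import re

# Hypothetical marker pattern: a line of three or more dashes,
# or a line that starts with "Chapter <number>".
CHAPTER_MARKER = re.compile(r"^(?:-{3,}|Chapter\s+\d+.*)$", re.IGNORECASE)


def split_chapters(text: str) -> list[dict]:
    """Split plain text into chapters at marker lines."""
    chapters = []
    title, body = "Untitled", []

    def flush():
        # Store the chapter collected so far, skipping empty ones.
        content = "\n".join(body).strip()
        if content:
            chapters.append(
                {"title": title, "text": content, "words": len(content.split())}
            )

    for line in text.splitlines():
        stripped = line.strip()
        if CHAPTER_MARKER.match(stripped):
            flush()
            # A "Chapter N" line doubles as the title; a bare "---" does not.
            title = "Untitled" if stripped.startswith("-") else stripped
            body = []
        else:
            body.append(line)
    flush()
    return chapters


sample = "Chapter 1\nHello world.\n---\nSecond part here.\nChapter 2 The End\nBye."
for chapter in split_chapters(sample):
    print(chapter["title"], chapter["words"])
```

One limitation of this sketch: a body line that happens to begin with "Chapter 3" would also be treated as a marker, so a real importer would likely need stricter rules.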
The Audiobook Workspace
Once your book is uploaded and parsed, you'll land in the audiobook workspace. This is your command center for the entire production process. On the left, you'll see a chapter sidebar listing every chapter detected from your book:
Chapter 1 — The Beginning
3,420 words
Chapter 2 — Old Friends
4,180 words
Chapter 3 — The Letter
2,870 words
Chapter 4 — Into the Garden
5,210 words
Chapter 5 — The Crossing
3,950 words
Each chapter shows:
- Chapter title (editable — click to rename)
- Word count for estimating narration length
- Status indicator: green (audio generated), orange (content changed since last generation), gray (not yet generated)
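If you want a rough idea of what those word counts mean in listening time, a quick calculation like the one below can help. The pace of 150 words per minute is an assumed typical narration speed, not a Notevibes figure, and the function name is hypothetical. The word counts are the ones from the sample sidebar above.

```python
WORDS_PER_MINUTE = 150  # assumed narration pace, not a Notevibes figure

# Word counts from the sample chapter sidebar.
chapters = {
    "The Beginning": 3420,
    "Old Friends": 4180,
    "The Letter": 2870,
    "Into the Garden": 5210,
    "The Crossing": 3950,
}


def estimate_minutes(word_count: int, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate narration length in minutes from a word count."""
    return word_count / wpm


total_words = sum(chapters.values())
for title, words in chapters.items():
    print(f"{title}: about {estimate_minutes(words):.0f} min")
print(f"Whole book: about {estimate_minutes(total_words) / 60:.1f} hours")
```

Actual length depends on the voice, the delivery directions, and how much dialogue the chapter contains, so treat the result as a ballpark.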
You can reorder, duplicate, delete, and create new chapters directly from the sidebar. The main content area shows a rich text editor for the selected chapter — powered by Tiptap — where you can edit your text, format with bold/italic/underline, and add bullet lists.
At the top of the workspace, you'll find the toolbar with the main voice selector, generate button, download options, and the characters panel toggle:
Aoede
en-US · Female
The toolbar gives you quick access to:
- Main voice selector — sets the default narrator voice for the entire book. Click to change it. This voice is used for all narration paragraphs unless overridden by character detection.
- Characters button — opens the character detection panel (covered in the “Detect Characters with AI” section below). The badge shows how many characters have been detected.
- Credits balance — your remaining generation credits.
- Generate button — click to generate audio. A dropdown lets you choose between generating the current chapter or the full book.
- Download button — export options including per-chapter MP3, merged full book, and ZIP archive.
Detect Characters with AI
This is where the magic happens. Click the “Detect Characters” button, and Notevibes AI scans your entire manuscript to identify every speaking character. The AI doesn't just find names — it understands narrative roles, counts dialogue lines, maps every paragraph to the right speaker, and automatically detects your book's genre to shape the narration style. No manual preset selection needed — the AI figures out whether your book is a thriller, romance, fantasy, or memoir and adapts the delivery accordingly.
For each detected character, the AI determines:
- Name and any aliases used in the text
- Role classification: narrator, protagonist, antagonist, mentor, sidekick, supporting, or minor
- Gender (for voice matching)
- Dialogue line count and chapter appearances
- Narrative arc and character evolution across the book
- A suggested AI voice based on the character’s attributes
- Per-paragraph delivery directions (“whispered, trembling”, “cold, controlled”) generated from the scene context
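One way to picture the result of detection is as a small record per character. The sketch below models the attributes listed above in Python. The class and field names are assumptions made for illustration, and the sample values (alias, line counts) are invented; only the role names and the Elena/Kore and Marcus/Orus pairings come from this guide.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Role(Enum):
    NARRATOR = "narrator"
    PROTAGONIST = "protagonist"
    ANTAGONIST = "antagonist"
    MENTOR = "mentor"
    SIDEKICK = "sidekick"
    SUPPORTING = "supporting"
    MINOR = "minor"


@dataclass
class Character:
    """Hypothetical shape of one detected character."""
    name: str
    role: Role
    gender: str
    aliases: list[str] = field(default_factory=list)
    dialogue_lines: int = 0
    chapters: list[int] = field(default_factory=list)
    suggested_voice: Optional[str] = None

    def matches(self, mention: str) -> bool:
        # Case-insensitive match against the name and every alias.
        names = [self.name, *self.aliases]
        return mention.casefold() in (n.casefold() for n in names)


def find_character(cast: list[Character], mention: str) -> Optional[Character]:
    """Return the first character whose name or alias matches, else None."""
    return next((c for c in cast if c.matches(mention)), None)


# Invented sample data for illustration.
cast = [
    Character("Elena", Role.PROTAGONIST, "female", aliases=["Lena"],
              dialogue_lines=42, chapters=[1, 2, 3], suggested_voice="Kore"),
    Character("Marcus", Role.SUPPORTING, "male", suggested_voice="Orus"),
]
print(find_character(cast, "lena").suggested_voice)
```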
The characters panel is fully editable. Rename characters, change their roles, swap voices, or remove false detections. When you're happy with the lineup, click “Apply to All Chapters” to propagate the voice assignments across your entire book.
The AI also auto-detects your book's genre (fiction, thriller, romance, fantasy, sci-fi, memoir, etc.) and uses it to shape character portraits, scene illustrations, and the overall visual theme if you later publish with Storybook Mode. There's no manual genre or narration style selector to worry about — the AI reads your text and adapts automatically.
Assign Unique Voices
Each character needs a distinct voice. The AI suggests voices automatically based on gender, role, and personality, but you can override any assignment. Click on a character's voice to open the voice selector:
Aoede
Female · en-US
Kore
Female · en-US
Leda
Female · en-US
Orus
Male · en-US
Puck
Male · en-US
Charon
Male · en-US
550+ voices across 57 languages
The voice selector offers 550+ AI voices organized by:
- Gender: Male, Female, or All
- Type: All, Starred (your favorites), Recent (last 5 used), Natural (Chirp3-HD), Standard
- Language: 57 languages including regional accents (American, British, Australian, Indian, Scottish, Nigerian, and more)
- Preview: Click the play button on any voice card to hear a sample before selecting
The latest Chirp3-HD voices are the most natural-sounding, with 30 unique voice profiles named after astronomical objects: Aoede, Kore, Orus, Puck, Leda, Charon, and more. Each one supports all 57 languages natively — meaning the same voice can narrate your book in English, French, Japanese, or Arabic with authentic accents.
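The filters in the voice selector behave like a simple query over the voice catalog. As a rough sketch, assuming a hypothetical Voice record and a filter_voices helper (neither is a documented Notevibes API), filtering the six sample voices shown above might look like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Voice:
    name: str
    gender: str    # "female" or "male"
    language: str  # e.g. "en-US"
    kind: str      # "natural" (Chirp3-HD) or "standard"


# The six sample voices from the selector mockup.
CATALOG = [
    Voice("Aoede", "female", "en-US", "natural"),
    Voice("Kore", "female", "en-US", "natural"),
    Voice("Leda", "female", "en-US", "natural"),
    Voice("Orus", "male", "en-US", "natural"),
    Voice("Puck", "male", "en-US", "natural"),
    Voice("Charon", "male", "en-US", "natural"),
]


def filter_voices(voices, gender: str = "all", kind: str = "all",
                  language: Optional[str] = None) -> list[Voice]:
    """Keep voices matching every filter; "all" or None disables a filter."""
    return [
        v for v in voices
        if gender in ("all", v.gender)
        and kind in ("all", v.kind)
        and language in (None, v.language)
    ]


print([v.name for v in filter_voices(CATALOG, gender="male")])
```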
Per-Paragraph Voice Control
After character detection, every paragraph in your book gets assigned to a specific voice. The paragraph gutter — the left-side overlay in the editor — gives you a visual map of who's speaking where:
The rain hammered against the window. She knew, even before opening the letter, that everything was about to change.
Reflective, slow — building tension
The envelope was thin — just a single sheet folded in thirds. No return address. Her name in handwriting she hadn’t seen in seventeen years.
Whispered, trembling
“Wait,” she whispered, pressing the paper flat on the table. “This can’t be right. Unless... unless he knew all along.”
Cold, controlled
“You weren’t supposed to find that.” His voice came from the doorway, flat and measured. “Not yet.”
The paragraph gutter shows:
- Voice avatar: A visual indicator of which character is speaking this paragraph. Click to change the voice.
- Speaker name: The character name displayed below the avatar.
- Sync status: Green dot = audio matches current text. Orange dot = text has changed since audio was generated.
- Play button: Click to hear this specific paragraph. The currently playing paragraph is highlighted.
- Scene direction: When Adaptive Delivery is enabled, each paragraph shows an italicized delivery cue (e.g., “Whispered, trembling” or “Cold, controlled”).
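The sync dot can be understood as a comparison between the paragraph as it is now and the paragraph as it was when audio was last generated. Here is a minimal sketch of that idea using a content fingerprint. The function names and the choice of SHA-256 are assumptions, and the gray state is borrowed from the chapter sidebar described earlier.

```python
import hashlib
from typing import Optional


def fingerprint(text: str, voice: str) -> str:
    """Hash the inputs that affect the audio: the text and the assigned voice."""
    return hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()


def sync_status(text: str, voice: str, generated: Optional[str]) -> str:
    """Return "gray", "green", or "orange" for one paragraph.

    `generated` is the fingerprint stored when audio was last produced,
    or None if the paragraph has never been generated.
    """
    if generated is None:
        return "gray"
    return "green" if fingerprint(text, voice) == generated else "orange"


stored = fingerprint("Wait, she whispered.", "Kore")
print(sync_status("Wait, she whispered.", "Kore", stored))     # unchanged
print(sync_status("Wait! she whispered.", "Kore", stored))     # text edited
print(sync_status("Wait, she whispered.", "Leda", stored))     # voice swapped
```

Including the voice in the fingerprint means a voice swap marks the paragraph stale just like a text edit does, which matches the behavior described in this guide.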
Generate Audio
Click the “Generate” button in the toolbar to start producing your audiobook. A dropdown gives you two options:
Generate Chapter
Generate audio for the current chapter only. Great for testing voice assignments before committing to the whole book.
Generate Full Book
Generate all chapters in parallel. This is the fastest way to produce your complete audiobook.
Character Voices
Detect characters and assign unique voices
Adaptive Delivery
Scene-aware vocal directions per paragraph
When Character Voices is enabled, clicking generate will first run AI character detection (if characters haven't been detected yet), then generate audio with per-character voice assignments. When Adaptive Delivery is on, the AI computes scene-aware delivery directions for each paragraph before synthesis.
After clicking generate, the progress dialog opens and shows real-time status for each chapter:
The Midnight Garden
2 of 5 chapters complete
Credits used
12,480
Behind the scenes, Notevibes processes each chapter through a multi-stage pipeline: paragraphs are grouped into voice segments, chunked for the TTS engine, and synthesized with per-character voices and delivery directions. For dialogue-heavy scenes, the engine batches consecutive short dialogue segments from two speakers into a single multi-speaker request, which produces more natural conversational flow with fewer audio stitching artifacts.
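The pipeline stages described above can be sketched in a few small functions. This is only an illustration of the general technique (group by voice, chunk by length, batch two-speaker exchanges), not the actual Notevibes engine. The size limits, the function names, and the request dictionaries are all assumptions.

```python
from itertools import groupby

MAX_CHARS = 400            # hypothetical per-request size limit
SHORT_SEGMENT_CHARS = 200  # hypothetical cutoff for a "short" segment


def group_segments(paragraphs):
    """Merge consecutive (voice, text) paragraphs that share a voice."""
    return [
        (voice, " ".join(text for _, text in run))
        for voice, run in groupby(paragraphs, key=lambda p: p[0])
    ]


def chunk(text, limit=MAX_CHARS):
    """Split text on word boundaries into chunks of at most `limit`
    characters (a single longer word still becomes its own chunk)."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > limit and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def batch_dialogue(segments, short=SHORT_SEGMENT_CHARS):
    """Turn runs of consecutive short segments from exactly two voices
    into one multi-speaker request; everything else stays single-voice."""
    requests, run = [], []

    def flush():
        voices = {v for v, _ in run}
        if len(run) >= 2 and len(voices) == 2:
            requests.append({"type": "multi", "turns": list(run)})
        else:
            requests.extend(
                {"type": "single", "voice": v, "text": t} for v, t in run
            )
        run.clear()

    for voice, text in segments:
        is_short = len(text) <= short
        voices_if_added = {v for v, _ in run} | {voice}
        if not is_short or len(voices_if_added) > 2:
            flush()  # the current run cannot absorb this segment
        if is_short:
            run.append((voice, text))
        else:
            requests.append({"type": "single", "voice": voice, "text": text})
    flush()
    return requests
```

In this sketch any short segment can join a batch, including short narration. A real engine would probably distinguish quoted dialogue from narration before batching.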
Preview & Edit
After generation completes, every paragraph becomes playable. Click the play button in the paragraph gutter to hear individual paragraphs, or use the audio player bar at the bottom to play through the entire chapter.
The editing workflow is non-destructive:
- Edit text freely — the orange sync indicator tells you which paragraphs need re-generation
- Change a character’s voice — only the paragraphs using that voice are marked stale
- Adjust delivery directions — toggle Adaptive Delivery on/off or edit individual directions
- Regenerate selectively — you can regenerate a single chapter without touching the rest
- The generated audio for unchanged chapters is preserved and cached
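To see why a voice change only touches part of the book, it helps to think of each chapter as a list of per-paragraph voice assignments. The sketch below finds the paragraphs and chapters affected when one character's voice changes. The data layout and function name are assumptions for illustration.

```python
def stale_after_voice_change(chapters: dict[str, list[str]],
                             changed_voice: str) -> dict[str, list[int]]:
    """Map each affected chapter to the indexes of paragraphs that
    used `changed_voice` and therefore need new audio."""
    stale = {}
    for title, paragraph_voices in chapters.items():
        hits = [i for i, v in enumerate(paragraph_voices) if v == changed_voice]
        if hits:
            stale[title] = hits
    return stale


# Invented sample: the voice assigned to each paragraph, per chapter.
book = {
    "The Beginning": ["Aoede", "Aoede", "Kore"],
    "Old Friends": ["Aoede", "Orus", "Kore", "Orus"],
    "The Letter": ["Aoede", "Aoede"],
}

affected = stale_after_voice_change(book, "Orus")
print(affected)              # paragraphs to mark orange
print(sorted(affected))      # chapters worth regenerating
```

Chapters that do not appear in the result keep their cached audio untouched.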
Generate Illustrations
Once your audio is generated, you can bring your audiobook to life visually. Notevibes generates two types of AI illustrations: character portraits and scene illustrations. These power Storybook Mode when you publish your book.
Narrator
Aoede
Elena
Kore
Marcus
Orus
Old Keeper
Puck
Rinn
Leda
Character portraits are generated using your book's character descriptions, genre, and role classifications. The AI applies genre-specific visual styles:
Fantasy
Painterly illustration, golden and amber lighting, detailed textures
Sci-Fi
Sharp digital art, cool blue and neon accents, futuristic rim lighting
Romance
Soft focus, warm golden hour lighting, delicate and emotional
Thriller
Noir-influenced, high contrast, chiaroscuro lighting, moody shadows
Horror
Dark, desaturated palette with eerie undertones and dramatic shadows
Children's
Bright, colorful illustration with soft rounded shapes and warmth
The forest clearing
¶1–¶3
Through the archway
¶4–¶7
The stone fountain
¶8–¶12
Scene illustrations are generated at key narrative moments in each chapter. The AI uses the paragraph text, delivery directions, and character information to create contextual illustrations. These appear on the left page in Storybook Mode's Illustrated layout, creating a book-spread experience where listeners see the scene while hearing the narration.
Download Your Audiobook
Once you're happy with the narration, click the download button. Three export options are available:
Download Chapter MP3
Current chapter as a single MP3 file
Full Book (Merged MP3) · Popular
All chapters combined into one file
All Chapters (ZIP) · Recommended
Individual MP3 per chapter: Chapter_01.mp3, Chapter_02.mp3, ...
Audio Quality (auto-applied)
Format
192 kbps CBR MP3
Sample Rate
44.1 kHz
Loudness
-20 dB RMS (inside the -23 to -18 dB range that ACX requires for Audible)
Peak Level
≤ -3 dB
Every export is broadcast-quality. Notevibes handles volume normalization, peak limiting, and format encoding automatically — no audio engineering tools or post-processing needed. The files are ready to upload directly to Audible, Google Play Books, Apple Books, or any other audiobook distributor.
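If you are curious what the loudness and peak targets mean in practice, the sketch below measures RMS and peak level for a list of samples and applies a simple gain to hit the -20 dB RMS target without exceeding the -3 dB peak ceiling. It assumes floating-point samples in the range -1.0 to 1.0 and uses plain gain reduction where a production tool would use a proper limiter. The function names are hypothetical and this is not Notevibes' mastering chain.

```python
import math

TARGET_RMS_DB = -20.0    # loudness target from the export settings
PEAK_CEILING_DB = -3.0   # peak ceiling from the export settings


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms_db(samples) -> float:
    return to_db(math.sqrt(sum(s * s for s in samples) / len(samples)))


def peak_db(samples) -> float:
    return to_db(max(abs(s) for s in samples))


def normalize(samples, target_rms_db=TARGET_RMS_DB,
              peak_ceiling_db=PEAK_CEILING_DB):
    """Scale toward the target RMS, but never push the peak past the ceiling."""
    if not any(samples):
        return list(samples)  # empty or silent input: nothing to scale
    gain_db = target_rms_db - rms_db(samples)
    headroom_db = peak_ceiling_db - peak_db(samples)
    gain = 10 ** (min(gain_db, headroom_db) / 20)
    return [s * gain for s in samples]


# One second of a 440 Hz tone at half of full scale, sampled at 44.1 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
mastered = normalize(tone)
print(round(rms_db(mastered), 2), round(peak_db(mastered), 2))
```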
Publish Your Audiobook
You have two publishing paths: publish directly on the Notevibes Library for instant availability with rich features, or export and distribute to external platforms.
Publish on Notevibes Library
When you generate a book, it's automatically added to your library as a private draft. To publish publicly:
1. Open the book in your Library and fill in metadata: author, description, genre, language, tags
2. Optionally generate AI character portraits and scene illustrations
3. Enable Read Along (word-level sync) for interactive reading
4. Click “Publish” to make the book available on the Notevibes Library
5. Share the public URL with your audience
Published books on Notevibes come with Storybook Mode, Read Along, character portraits, scene illustrations, and listener analytics included — features you won't find on any other audiobook platform.
Export to External Platforms
Download your audio files and distribute to any major platform:
Audible
The world’s largest audiobook marketplace
Google Play Books
Reach Android and Google Home users
Apple Books
iPhone, iPad, and Mac listeners
Spotify
Audiobooks on the streaming platform
Kobo
190+ countries worldwide
Findaway Voices
Wide distribution to 40+ retailers
Full Commercial Rights
All paid Notevibes plans include full commercial rights to generated audio. No royalty splits to Notevibes, no per-sale fees, no platform lock-in. Your audiobook, your revenue. Publish and sell on any platform worldwide.
Sell Your Audiobook
Coming Soon · Direct sales powered by Stripe
We're building direct audiobook sales into Notevibes. Set your own price, accept payments through Stripe, and keep your revenue. No middleman, no revenue share to Notevibes. We handle hosting, delivery, and the listening experience — you keep the earnings.
Set your price
You decide what your audiobook is worth
Stripe checkout
Trusted, global payment processing
Instant payouts
Revenue goes directly to your Stripe account
No revenue share
Notevibes takes $0 from your sales
Your audiobook + Storybook Mode + Read Along + Analytics
Buyers get the full Notevibes listening experience. You get a storefront without building one.
What Your Listeners Experience
Publishing on the Notevibes Library gives your audience three ways to experience your audiobook — features you won't find on any other platform. These are powered by the character portraits, scene illustrations, and audio sync you generated in the previous steps.
Storybook Mode
Transforms your audiobook into a visual reading experience. Listeners see beautifully typeset pages with the narration synced to the text — a hybrid of audiobook and illustrated storybook.
The forest was silent save for the distant call of a nightingale. Elena pressed her palm against the ancient oak, feeling the bark rough beneath her fingers.
“We shouldn't have come here,” Marcus whispered from behind the thornwall. “Not tonight.”
She didn't answer. Instead, she stepped forward, through the archway of twisted branches, into the moonlit clearing.
Genre-specific themes
Fantasy gets parchment and gold. Romance gets soft pinks. Thrillers get noir contrast. Nine themes are applied automatically based on the detected genre.
Three layout modes
Illustrated (scene + text spread), Classic (portrait badges inline), Cinematic (visual novel style).
Page-turn animations
Slide, flip, or none. Drop caps, ornamental dividers, and chapter title pages.
Character portraits & scenes
Your generated portraits float beside character dialogue. Scene illustrations fill the left page.
Read Along
Word-level highlight sync that follows the narration in real time. Each word lights up as it's spoken, making it easy to follow along.
The rain hammered against the window pane. She knew, even before opening the letter, that everything was about to change.
The envelope was thin — just a single sheet folded in thirds. No return address. Her name in handwriting she hadn't seen in seventeen years.
“Wait,” she whispered, pressing the paper flat on the table. “This can't be right.”
3-level highlighting
Active word, surrounding sentence, and current paragraph — each with its own highlight intensity.
Click to jump
Click any word to jump the narration there. Auto-advance between chapters.
Accessibility
Valuable for language learners, readers with dyslexia, and anyone who prefers multi-modal reading.
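Word-level sync comes down to one lookup: given the playback time, which word is being spoken? Assuming each word has a known start time (the timings below are invented, and the function names are hypothetical), a binary search answers that quickly, and click-to-jump is the reverse lookup.

```python
from bisect import bisect_right
from typing import Optional

# Invented timings: (word, start time in seconds), sorted by start time.
timed_words = [("The", 0.0), ("rain", 0.4), ("hammered", 0.9), ("against", 1.3)]
word_starts = [start for _, start in timed_words]


def active_word_index(starts: list[float], t: float) -> Optional[int]:
    """Index of the word being spoken at time t, or None before the first word."""
    i = bisect_right(starts, t) - 1
    return i if i >= 0 else None


def seek_time_for_word(starts: list[float], index: int) -> float:
    """Click to jump: the playback position for a clicked word."""
    return starts[index]


i = active_word_index(word_starts, 1.0)
print(timed_words[i][0])                   # word highlighted at 1.0 s
print(seek_time_for_word(word_starts, 1))  # where playback jumps for "rain"
```

The same index can then drive the sentence and paragraph highlights by looking up which sentence and paragraph contain that word.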
Listener Analytics
Use the built-in analytics dashboard to track how listeners engage with every chapter of your audiobook.
12,847
Total Plays
3,291
Unique Listeners
68%
Completion Rate
The dashboard also includes geographic insights (listener locations by country), device and behavior data (mobile vs. desktop, session length, chapter drop-off points), and daily aggregation for trend analysis.
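For a sense of how numbers like completion rate and chapter drop-off can be derived, here is a small sketch. It assumes each listening session records the last chapter the listener reached. That data shape, the function names, and the sample sessions are all invented for illustration and say nothing about how Notevibes computes its dashboard.

```python
from collections import Counter


def completion_rate(last_chapter_reached: list[int], total_chapters: int) -> float:
    """Share of sessions that reached the final chapter."""
    finished = sum(1 for c in last_chapter_reached if c >= total_chapters)
    return finished / len(last_chapter_reached)


def drop_off_by_chapter(last_chapter_reached: list[int], total_chapters: int) -> Counter:
    """Count the sessions that stopped at each chapter before the last one."""
    return Counter(c for c in last_chapter_reached if c < total_chapters)


# Invented sample: last chapter reached in eight sessions of a 5-chapter book.
sessions = [5, 5, 2, 3, 5, 1, 2, 5]
print(f"{completion_rate(sessions, 5):.0%}")
print(drop_off_by_chapter(sessions, 5).most_common(1))
```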
Traditional Studio vs. AI Narration
Traditional Studio
$5,000 – $15,000+ per finished hour
2–6 weeks turnaround
Schedule actors, engineers, studio time
Re-records cost extra per session
One language per production
Notevibes AI · Recommended
Starting at $19/month
One afternoon — upload to finished audiobook
Upload your book and click generate
Unlimited revisions included
57 languages from the same manuscript
A 10-hour audiobook produced in a traditional studio would cost between $50,000 and $150,000, take weeks of studio time, and be locked to a single language. With Notevibes, the same audiobook can be created in an afternoon, revised unlimited times, and published in 57 languages — all from a single manuscript.
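The studio figure is simple multiplication of the per-hour range quoted in this guide. A quick sketch, using a hypothetical helper name, lets you plug in your own book length:

```python
def studio_cost_range(finished_hours: float,
                      low_rate: int = 5_000,
                      high_rate: int = 15_000) -> tuple[float, float]:
    """Low and high studio cost, using the per-finished-hour range from this guide."""
    return finished_hours * low_rate, finished_hours * high_rate


low, high = studio_cost_range(10)
print(f"${low:,.0f} to ${high:,.0f}")
```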
Frequently Asked Questions
How long does it take to create an audiobook?
Most books go from upload to finished audiobook in a single afternoon. Upload takes seconds, character detection runs in under a minute, and audio generation processes all chapters in parallel. A 10-chapter novel typically generates in 5–15 minutes.
Can AI really detect characters in my novel?
Yes. The AI scans your entire manuscript, identifies every speaking character by name, assigns narrative roles (protagonist, antagonist, narrator, mentor, sidekick, supporting, minor), counts dialogue lines, tracks chapter appearances, and even detects narrative arc and character evolution. You can review and override every assignment.
What if I want to change a voice after generating?
Fully supported. Change any character’s voice in the characters panel, and the affected paragraphs are marked stale (orange indicator). Regenerate only the changed chapters — the rest of your audiobook is preserved.
Can I use this for non-fiction books?
Absolutely. Character detection is optional: for non-fiction, you can use a single narrator voice for the entire book, and the AI adapts its delivery to the detected genre, so there is no preset to choose. The AI handles tables, citations, and multi-column layouts.
Do I need audio editing skills?
No. Notevibes handles all audio engineering automatically: volume normalization, peak limiting, and noise floor management. The exported files are broadcast-quality and ready to upload to any platform with zero post-processing.
Can I publish in multiple languages?
Yes. The voice library offers 550+ voices across 57 languages, and each Chirp3-HD voice supports all 57 natively. Upload your translated manuscript and generate with the same (or different) voices. The voices produce authentic accents in every supported language.
What’s the difference between Storybook Mode and Read Along?
Read Along adds word-level highlight sync to the text display. Storybook Mode is a full visual experience with illustrated pages, genre themes, character portraits, scene illustrations, drop caps, and page-turn animations. Storybook Mode includes Read Along’s highlighting capabilities.
Do I own the rights to my AI-generated audiobook?
Yes. All paid plans include full commercial rights. No royalty splits to Notevibes, no per-sale fees, no platform lock-in. Publish and sell anywhere.
Ready to Create Your Audiobook?
Upload your manuscript, let AI do the heavy lifting, and publish a professional audiobook — all in one sitting. No studio, no actors, no audio engineering.
Create Your Audiobook
550+ voices · 57 languages · Full commercial rights · Broadcast-quality audio