Complete Guide

How to Create an Audiobook with Notevibes

From manuscript to published audiobook in one afternoon. This is the complete guide to turning your book into an AI audiobook with Notevibes, with UI mockups, tips, and everything you need to publish on Audible, Apple Books, and Spotify.

18 min read · 10 steps · 550+ AI voices · 72 languages

Creating an audiobook used to mean booking a recording studio, hiring voice actors, scheduling engineers, and spending weeks (and thousands of dollars) in post-production. A single finished audiobook can cost between $5,000 and $15,000 in a traditional studio, and that's for a single voice, in a single language.

AI audiobooks have changed this entirely. With Notevibes, you can upload your manuscript, let AI detect every speaking character in your book, assign each one a unique voice, generate professional audiobook narration for every chapter, and publish the finished book, all in a single sitting.

This guide covers every step in detail, with mockups of the actual Notevibes interface so you know exactly what to expect. Whether you're a self-published author, a publisher looking to expand your catalog, or a content creator exploring audio, this is the complete playbook.

Why Create an Audiobook?

The audiobook market is worth over $7 billion and growing at 25% year over year. More than half of US adults have listened to an audiobook, and the number keeps climbing. Audiobooks reach people that print and ebooks don't: commuters, runners, parents multitasking, and anyone who prefers listening over reading.

For authors and publishers, turning a book into an audiobook is an additional revenue stream from a manuscript you've already written. No new content is needed, just a different format. And with AI narration removing the traditional barriers of cost and time, there's no reason to leave this audience unserved.

Audiobooks also expand accessibility. Listeners with visual impairments, dyslexia, or learning disabilities gain access to your work. Language learners use audiobooks alongside text to improve comprehension. A single manuscript can reach a dramatically wider audience as an audiobook.

And the risk is near zero. If you already have a published book, an audiobook is just another distribution channel for the same content. With AI narration starting at $19/month, you can test the entire workflow for less than the cost of a single book cover. Generate a sample chapter, listen to it, decide if it's worth finishing. If it is, the full audiobook takes an afternoon. If not, you're out $19, not $5,000.

$7B+: market size, growing 25% YoY

50%+: US adults who have listened to an audiobook

72: languages supported by Notevibes

3 Ways to Create an Audiobook

Before diving into the step-by-step guide, it helps to understand your options. There are three main approaches: the DIY audiobook route, hiring a narrator, or using AI. Each comes with different trade-offs:

Self-Record

Full creative control
Authentic author voice
Requires equipment ($500–$2,000)
Time-consuming (weeks to months)
Audio editing skills needed
Single voice only

Cost: $500–$2,000 (equipment)

Timeline: Weeks to months

Hire a Narrator

Professional quality
Emotional depth and range
$200–$400 per finished hour
2–6 weeks turnaround
Revisions cost extra
One language per narrator

Cost: $1,400–$2,800 per 7-hour book

Timeline: 2–6 weeks

AI Narration (Notevibes)

Recommended
550+ voices, 72 languages
Multi-voice character narration
One afternoon from upload to done
Unlimited revisions included
No audio engineering needed
Full commercial rights

Cost: Starting at $19/month

Timeline: One afternoon

AI narration offers the fastest, most affordable path to a published audiobook, with multi-voice character support that even traditional studios struggle to match at scale. The rest of this guide walks you through the AI approach using Notevibes, step by step.

What You'll Need

Your book file

EPUB, Kindle (.mobi/.azw3), PDF, DOCX, or plain text

A Notevibes account

Free to sign up, plans start at $19/month

A few minutes

Upload to finished audiobook in one afternoon

That’s it

No studio, no actors, no audio engineering skills needed

Prepare Your Manuscript for Audio

A book written for the page doesn't always work for the ear. Before you upload, a few edits will save you hours of cleanup later.

  • Visual references: Replace "see figure 3" or "the chart below" with a spoken description. Listeners can’t see your diagrams.
  • Tables and data: Convert tables to spoken summaries. "Revenue grew from $2M to $5M between 2022 and 2024" works better than a grid of numbers.
  • Footnotes and endnotes: Decide if they should be read inline, moved to a "notes" chapter at the end, or cut entirely. Most listeners prefer inline.
  • URLs and links: Read them out only if essential. Otherwise replace with "visit our website" or cut them. Nobody wants to hear "h-t-t-p-s colon slash slash."
  • Complex sentences: Read your longest sentences out loud. If you run out of breath, split them. Audio needs shorter, more direct phrasing.
  • Abbreviations: Decide if "Dr." should be read as "Doctor" and "US" as "United States." Be consistent throughout.
  • Pronunciation guide: Make a list of unusual names, places, and made-up words with their intended pronunciation. Fantasy novels and non-fiction with technical terms especially need this.
Tip: You don't need a separate manuscript. Make these edits in a copy of your book file (EPUB, DOCX, or TXT) before uploading. Notevibes also has a full text editor, so you can fix things after upload too.

Copyright & Audio Rights

Before you generate anything, make sure you have the right to create an audio version of the text.

  • You wrote the book: You own all rights, including audio. You’re good to go.
  • You published with a traditional publisher: Check your contract. Audio rights may belong to the publisher. If they haven’t exercised those rights, you may be able to request them back.
  • You’re narrating someone else’s work: You need explicit written permission from the copyright holder. A license agreement covering audio adaptation is standard.
  • Public domain books: Works where the copyright has expired (generally pre-1928 in the US) are free to narrate and publish. No permission needed.
  • Ghostwritten books: If you hired a ghostwriter, confirm that your contract includes a full rights transfer covering audio adaptation.

This isn't legal advice. If you're unsure about your rights, consult an intellectual property attorney before publishing.

1

Upload Your Manuscript

Start by navigating to the Audiobook Narration page and clicking “Create Your Audiobook.” You'll see a drag-and-drop upload area that accepts five book formats:

Notevibes: Upload Your Book

Drop your book file here, or click to browse.

EPUB (.epub) · Kindle (.mobi / .azw3) · PDF (.pdf) · Word (.docx) · Plain Text (.txt)

EPUB: The gold standard. Notevibes reads the EPUB table of contents, detects chapters automatically, and preserves the full document structure. If your book has chapter headings, they'll be extracted perfectly.

Kindle (.mobi / .azw3): Direct upload with chapters preserved. Book metadata (title, author, publisher) is extracted automatically.

PDF: AI-powered text extraction with layout detection. Handles multi-column layouts, detects and skips page numbers, headers, and footers. Converts tables to spoken descriptions.

Word (.docx): Heading styles (H1, H2, H3) are automatically converted to chapter boundaries.

Plain Text (.txt): Add chapter markers (such as --- or Chapter N) so Notevibes can split the file into chapters. Custom delimiters are supported.
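To check that your markers are placed consistently before uploading a plain-text file, a small script can help. The sketch below splits text on lines that are either three or more dashes or a "Chapter N" heading. The function name `split_chapters` and the exact marker rules are assumptions made for illustration; this is not the Notevibes parser.

```python
import re

# Hypothetical marker rules: a line of three or more dashes, or a "Chapter N" heading.
DASH_MARKER = re.compile(r"^-{3,}\s*$")
CHAPTER_HEADING = re.compile(r"^Chapter\s+\d+\b.*$", re.IGNORECASE)


def split_chapters(text: str) -> list[dict]:
    """Split plain text into chapters at marker lines (illustrative only)."""
    chapters: list[dict] = []
    title = "Untitled"
    body: list[str] = []

    def flush() -> None:
        # Store the chapter collected so far, skipping empty ones.
        content = "\n".join(body).strip()
        if content:
            chapters.append({"title": title, "text": content})

    for line in text.splitlines():
        stripped = line.strip()
        if DASH_MARKER.match(stripped):
            flush()
            title, body = "Untitled", []
        elif CHAPTER_HEADING.match(stripped):
            flush()
            title, body = stripped, []
        else:
            body.append(line)
    flush()
    return chapters


sample = "Chapter 1: The Beginning\nIt began.\n---\nA new scene.\nChapter 2\nMore."
for chapter in split_chapters(sample):
    print(chapter["title"], "|", len(chapter["text"].split()), "words")
```

A body line that happens to start with "Chapter 3 was the hardest..." would be treated as a heading under these rules, which is one reason a distinctive marker such as --- is the safer choice.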

Tip: EPUB gives the best results for chapter detection. If you have your book in multiple formats, EPUB is the way to go. Most ebook creation tools (Calibre, Vellum, Scrivener) can export to EPUB.
2

The Audiobook Workspace

Once your book is uploaded and parsed, you'll land in the audiobook workspace. This is your command center for the entire production process. On the left, you'll see a chapter sidebar listing every chapter detected from your book:

Chapters (5 chapters)

Chapter 1: The Beginning (3,420 words)
Chapter 2: Old Friends (4,180 words)
Chapter 3: The Letter (2,870 words)
Chapter 4: Into the Garden (5,210 words)
Chapter 5: The Crossing (3,950 words)

Each chapter shows:

  • Chapter title (editable, click to rename)
  • Word count for estimating narration length
  • Status indicator: green (audio generated), orange (content changed since last generation), gray (not yet generated)
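Word counts translate into a rough running time. The sketch below assumes a narration pace of 150 words per minute, which is a common rule of thumb and not a Notevibes figure; the actual length depends on the voice and pace settings you choose. The function names are illustrative.

```python
WORDS_PER_MINUTE = 150  # assumed narration pace, not a Notevibes value


def estimate_minutes(word_count: int, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate narration length in minutes from a word count."""
    return word_count / wpm


def format_duration(minutes: float) -> str:
    """Format minutes as 'Xh YYm', rounded to the nearest minute."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours}h {mins:02d}m"


# Word counts from the five-chapter example in the sidebar.
chapter_words = [3420, 4180, 2870, 5210, 3950]
total_minutes = estimate_minutes(sum(chapter_words))
print(format_duration(total_minutes))  # 19,630 words at 150 wpm -> "2h 11m"
```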

You can reorder, duplicate, delete, and create new chapters directly from the sidebar. The main content area shows a rich text editor for the selected chapter, powered by Tiptap, where you can edit your text, format with bold/italic/underline, and add bullet lists.

At the top of the workspace, you'll find the toolbar with the main voice selector, generate button, download options, and the characters panel toggle:

Audiobook Workspace Toolbar

Aoede (en-US · Female)

The toolbar gives you quick access to:

  • Main voice selector: sets the default narrator voice for the entire book. Click to change it. This voice is used for all narration paragraphs unless overridden by character detection.
  • Characters button: opens the character detection panel (covered in Step 3). The badge shows how many characters have been detected.
  • Credits balance: your remaining generation credits.
  • Generate button: click to generate audio. A dropdown lets you choose between generating the current chapter or the full book.
  • Download button: export options including per-chapter MP3, merged full book, and ZIP archive.
Tip: If the AI misidentified chapter boundaries (especially common with PDFs), you can manually split or merge chapters in the sidebar. The content editor supports full text editing, so you can fix any extraction issues before generating audio.
3

Detect Characters with AI

This is where the magic happens. Click the “Detect Characters” button, and Notevibes AI scans your entire manuscript to identify every speaking character. The AI doesn't just find names. It understands narrative roles, counts dialogue lines, maps every paragraph to the right speaker, and automatically detects your book's genre to shape the narration style. No manual preset selection needed. The AI figures out whether your book is a thriller, romance, fantasy, or memoir and adapts the delivery accordingly.

Characters: The Midnight Garden
Detected Characters (5)
Narrator · narrator · Voice: Aoede · 142 lines
Elena · protagonist · Voice: Kore · 87 lines
Marcus · antagonist · Voice: Orus · 54 lines
Old Keeper · mentor · Voice: Puck · 31 lines
Rinn · sidekick · Voice: Leda · 23 lines

For each detected character, the AI determines:

  • Name and any aliases used in the text
  • Role classification: narrator, protagonist, antagonist, mentor, sidekick, supporting, or minor
  • Gender (for voice matching)
  • Dialogue line count and chapter appearances
  • Narrative arc and character evolution across the book
  • A suggested AI voice based on the character’s attributes
  • Per-paragraph delivery directions (“whispered, trembling”, “cold, controlled”) generated from the scene context
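If it helps to picture what a detection result holds, the attributes above can be modeled as a simple record. This is a hypothetical data model for illustration only: the class name, field names, and validation are assumptions, and the sample values come from the example cast in the mockup.

```python
from dataclasses import dataclass, field
from typing import Optional

# Role labels as listed in the guide.
ROLES = {"narrator", "protagonist", "antagonist", "mentor",
         "sidekick", "supporting", "minor"}


@dataclass
class Character:
    """Hypothetical record for one detected character."""
    name: str
    role: str
    voice: str
    line_count: int = 0
    gender: Optional[str] = None
    aliases: list[str] = field(default_factory=list)
    chapters: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject role labels outside the known set.
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")


cast = [
    Character("Narrator", "narrator", "Aoede", 142),
    Character("Elena", "protagonist", "Kore", 87),
    Character("Marcus", "antagonist", "Orus", 54),
    Character("Old Keeper", "mentor", "Puck", 31),
    Character("Rinn", "sidekick", "Leda", 23),
]
total_lines = sum(c.line_count for c in cast)
print(f"{len(cast)} characters, {total_lines} lines")
```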

The characters panel is fully editable. Rename characters, change their roles, swap voices, or remove false detections. When you're happy with the lineup, click “Apply to All Chapters” to propagate the voice assignments across your entire book.

The detected genre (fiction, thriller, romance, fantasy, sci-fi, memoir, etc.) also shapes character portraits, scene illustrations, and the overall visual theme if you later publish with Storybook Mode.

Tip: Character detection runs three AI passes in parallel: one builds each character's persona (who the voice is) and a vocabulary of emotion tags for their typical delivery; another casts an AI voice for each character; the third walks every chapter to assign paragraphs to speakers, write a scene direction for each paragraph (e.g. “whispered, trembling”), and insert inline [emotion tags] at delivery shift points inside the text itself. For non-fiction books, skip character detection and use a single narrator voice.
4

Assign Unique Voices

Each character needs to sound different. The AI picks voices automatically based on who the character is, but you're in control. Click any character's voice to swap it:

Select Voice

Aoede (Female · en-US)
Kore (Female · en-US)
Leda (Female · en-US)
Orus (Male · en-US)
Puck (Male · en-US)
Charon (Male · en-US)

550+ voices across 72 languages

The selector lists 550+ voices, filterable by gender, language (72 languages with regional accents), and type. The 30 natural voices sound the most human. Each one speaks all 72 languages natively: same narrator, different language, authentic accent.

Customize how a voice sounds

Every voice comes with a set of controls that shape how it delivers your text. You can adjust these per character:

  • Pitch: Low, medium, or high. A low-pitch narrator feels grounded and authoritative. A high-pitch sidekick feels youthful and energetic.
  • Timbre: Smooth, raspy, breathy, resonant, silky, gravelly, or nasal. This is the texture of the voice. A gravelly old wizard sounds nothing like a silky romantic lead.
  • Age: Youthful, young adult, mature, or aged. Matches the character to the voice.
  • Energy: Soft-spoken, moderate, or projected. A whispered confession needs soft-spoken. A battlefield speech needs projected.
  • Pace: Slow, moderate, or fast. Slow pacing gives weight to literary prose. Fast pacing drives action scenes.
  • Accent: Neutral, British, Australian, Southern US, New York, Indian, Irish, or Scottish. Match the character to their world.

These controls auto-generate a voice persona prompt behind the scenes. For example, selecting Aoede with a British accent, slow pace, and mature age produces:

“A mature woman with a warm & lively voice, British accent, a slow, measured pace.”
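One way to picture how the controls might map to a persona sentence is a small template function. The sketch below reproduces the example above from its inputs. The function name, parameters, and every phrase other than "a slow, measured pace" are assumptions; Notevibes's actual prompt generation is not documented here.

```python
from typing import Optional

# Only the "slow" phrasing comes from the example above; the others are assumed.
PACE_PHRASES = {
    "slow": "a slow, measured pace",
    "moderate": "a moderate pace",
    "fast": "a fast pace",
}


def build_persona(age: str, speaker: str, voice_character: str,
                  accent: Optional[str] = None,
                  pace: Optional[str] = None) -> str:
    """Compose a one-sentence persona prompt from voice controls."""
    article = "An" if age[0].lower() in "aeiou" else "A"
    parts = [f"{article} {age} {speaker} with a {voice_character} voice"]
    if accent and accent.lower() != "neutral":
        parts.append(f"{accent} accent")
    if pace:
        parts.append(PACE_PHRASES[pace])
    return ", ".join(parts) + "."


print(build_persona("mature", "woman", "warm & lively", "British", "slow"))
```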

You can also write your own persona prompt from scratch. Click the edit icon next to the auto-generated text to switch to manual mode. This gives you full control over how the voice behaves:

Thriller narrator

A cold, measured voice. Never rushes. Pauses before key reveals. Lets tension build in the silence between sentences.

Children’s storyteller

Warm and playful, like reading to a child before bed. Gets excited at surprises, whispers at secrets, gasps at scary parts.

Memoir narrator

Personal and reflective. Reads like someone telling their own story to a close friend. Quiet confidence, no performance.

Fantasy world-builder

Rich and cinematic. Gives weight to place names and ancient words. Builds atmosphere. Slows down for scenery, picks up in action.

Archetype: the one-line audio profile

Below the persona field, every character has an Archetype line — a short tagline that sits at the top of the TTS prompt as the character's audio profile. The AI fills this in automatically during detection, but you can rewrite it to steer delivery at a high level:

Villain

ancient king addressing his war council

Sidekick

panicked best friend who just saw a ghost

Mentor

retired spy mentoring a protégé at 2 a.m.

Narrator

old sailor telling a long story by the fire

Keep the archetype to one line (200 characters max). The persona describes how the voice sounds; the archetype frames who is speaking and in what situation. The two work together.

Emotions: the per-character tag vocabulary

Click the Emotions button on any character to pick a short list of inline delivery tags that suit them — things like [whispers], [sighs], [sarcastic], or [trembling]. The picker is organized into four tabs:

  • Emotion: Positive (joy, hope, awe), negative (fear, anger, grief), and complex (sarcasm, anticipation, determination).
  • Delivery: Vocal sounds like laughs, sighs, gasps, crying — and volume/tone modifiers like whispers, shouting, breathy, trembling.
  • Pacing: Slower, faster, dramatic pause, rushed — useful for controlling rhythm on specific lines.
  • Custom: Free-text input for creative tags. Examples from the Gemini prompting guide: [like an orc], [like a robot], [like an old wizard]. Works best for non-human voices.

Tags picked here form that character's vocabulary. The AI prefers tags from this list when it drops inline [tags] into that character's lines during detection, and you can always add more to the text yourself; they render as colored pills in the editor. Tags are shift markers, not constant labels: they apply at the moment you place them, not to the whole paragraph.
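Because tags mark a point in the text, it can be useful to think of each one as a position plus a label. The sketch below pulls bracketed tags out of a line and records where each one sits in the remaining text. It is an illustration of the shift-marker idea, not Notevibes's parser, and the function name `extract_tags` is hypothetical.

```python
import re

# Matches a bracketed tag such as [whispers]; nested brackets are not handled.
TAG = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(text: str) -> tuple[str, list[tuple[int, str]]]:
    """Return the text without tags, plus (offset, tag) pairs.

    Each offset is the position in the cleaned text where the tag applies.
    """
    pieces: list[str] = []
    tags: list[tuple[int, str]] = []
    clean_length = 0
    last_end = 0
    for match in TAG.finditer(text):
        chunk = text[last_end:match.start()]
        pieces.append(chunk)
        clean_length += len(chunk)
        tags.append((clean_length, match.group(1)))
        last_end = match.end()
    pieces.append(text[last_end:])
    return "".join(pieces), tags


clean, tags = extract_tags("[whispers] Wait. [trembling] This can't be right.")
print(tags)
```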

Recommended voices by genre

Literary Fiction

Aoede (warm, lively), Alnilam (emotional storyteller), Laomedeia (soothing storyteller)

Warm tone and measured pace let prose breathe.

Thriller / Mystery

Charon (commanding), Gacrux (commanding contralto), Algieba (authoritative)

Strong energy and deep tone build tension.

Romance

Zephyr (whispery, intimate), Kore (friendly, natural), Fenrir (expressive tenor)

Soft energy carries emotional scenes.

Children’s Book

Leda (youthful, airy), Puck (youthful baritone), Autonoe (bright, animated)

High energy and warm tone keep kids engaged.

Non-Fiction / Business

Schedar (crisp broadcaster), Algieba (authoritative), Achernar (poised, articulate)

Clear tone and slow pace sound professional.

Fantasy / Sci-Fi

Enceladus (resonant bass), Callirrhoe (dramatic), Pulcherrima (deep velvet)

Deep tone and dramatic energy suit world-building.

Tip: Star the voices you like. They stick to the top of the selector across projects. For books with a big cast, pair opposites: a low-pitch gravelly voice for the villain, a high-pitch breathy voice for the sidekick. Listeners can tell characters apart instantly.
5

Per-Paragraph Voice Control

After character detection, every paragraph in your book gets assigned to a specific voice. The paragraph gutter, the left-side overlay in the editor, gives you a visual map of who's speaking where:

Chapter 3: The Letter

Narrator

The rain hammered against the window. She knew, even before opening the letter, that everything was about to change.

Narrator

Reflective, slow, building tension

The envelope was thin, just a single sheet folded in thirds. No return address. Her name in handwriting she hadn’t seen in seventeen years.

Elena

Whispered, trembling

“Wait,” she whispered, pressing the paper flat on the table. “This can’t be right. Unless... unless he knew all along.”

Marcus

Cold, controlled

“You weren’t supposed to find that.” His voice came from the doorway, flat and measured. “Not yet.”

Each paragraph row shows its controls in a single line:

  • Speaker name + voice avatar: shows which character is speaking. Click the avatar to swap the voice. The colored dot on the avatar shows sync status: green = audio matches current text, orange = text changed since last generation.
  • Play button: click to hear this paragraph. The active paragraph highlights in the editor.
  • Voice direction: the clapperboard icon. The AI writes a scene-level delivery cue for each paragraph during detection (e.g., “Whispered, trembling” or “Cold, controlled”). Click the icon to edit the direction manually. A highlighted icon means a direction is set; a faded one means none.

Alongside the per-paragraph voice direction, the AI also drops inline emotion tags into the text itself at delivery shift points. These render as colored pills in the editor — for example [whispers], [sighs], or [sarcastic]. Click any pill to remove it, or place your cursor anywhere in the text and use the Emotions button in the toolbar to insert one. Three layers work together on every line: persona (who the voice is), voice direction (how the scene feels), and emotion tags (how this specific line is delivered).

Tip: If the AI assigned the wrong speaker to a paragraph (common in scenes with multiple characters talking), just click the voice avatar in the gutter and select the correct character. The change applies immediately and will be used in the next generation.
6

Generate Audio

Click the “Generate” button in the toolbar to start producing your audiobook. A dropdown gives you two options:

Generate Options

Generate Chapter

Generate audio for the current chapter only. Great for testing voice assignments before committing to the whole book.

Generate Full Book

Generate all chapters in parallel. This is the fastest way to produce your complete audiobook.

Character Voices

Detect characters and assign unique voices

When Character Voices is enabled, clicking generate first runs AI character detection if it hasn't run yet. In a single run, detection builds personas, archetypes, emotion-tag vocabularies, voice casting, paragraph-to-speaker assignments, per-paragraph scene directions, and inline emotion tags. Notevibes then synthesizes audio with all three delivery layers applied per line.

After clicking generate, the progress dialog opens and shows real-time status for each chapter:

Generating Audiobook...

The Midnight Garden

2 of 5 chapters complete

Credits used: 12,480

Chapter 1: The Beginning · Done
Chapter 2: Old Friends · Done
Chapter 3: The Letter · Generating...
Chapter 4: Into the Garden
Chapter 5: The Crossing

Behind the scenes, Notevibes processes each chapter through a multi-stage pipeline: paragraphs are grouped into voice segments, chunked for the TTS engine, and synthesized with per-character voices and delivery directions. For dialogue-heavy scenes, the engine batches consecutive short dialogue segments from two speakers into a single multi-speaker request, producing more natural conversational flow with fewer audio stitching artifacts.
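The batching idea can be sketched as a simple grouping pass. The code below keeps adding consecutive short segments to a batch until a third speaker appears or a long segment interrupts. The 40-word threshold, the class, and the function names are assumptions for illustration; the real engine's rules are not published in this guide.

```python
from dataclasses import dataclass

MAX_WORDS = 40     # hypothetical cutoff for a "short" segment
MAX_SPEAKERS = 2   # a batch holds at most two distinct speakers


@dataclass
class Segment:
    speaker: str
    text: str


def is_short(segment: Segment) -> bool:
    return len(segment.text.split()) <= MAX_WORDS


def batch_segments(segments: list[Segment]) -> list[list[Segment]]:
    """Group consecutive short segments from up to two speakers."""
    batches: list[list[Segment]] = []
    current: list[Segment] = []
    for segment in segments:
        if not is_short(segment):
            # A long segment closes the open batch and is sent on its own.
            if current:
                batches.append(current)
                current = []
            batches.append([segment])
            continue
        speakers = {s.speaker for s in current} | {segment.speaker}
        if len(speakers) > MAX_SPEAKERS:
            # A third speaker starts a new batch.
            batches.append(current)
            current = []
        current.append(segment)
    if current:
        batches.append(current)
    return batches


scene = [
    Segment("Elena", "Wait. This can't be right."),
    Segment("Marcus", "You weren't supposed to find that."),
    Segment("Elena", "Unless he knew all along."),
    Segment("Rinn", "Who knew what?"),
    Segment("Narrator", " ".join(["word"] * 50)),
]
print([len(batch) for batch in batch_segments(scene)])
```

In this example the first three lines form one two-speaker batch, the third speaker starts a new batch, and the long narration passage is handled separately.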

Tip: Generate a single chapter first to verify you're happy with the voice assignments and narration style. Once everything sounds right, generate the full book. This saves credits and avoids re-generating chapters you're not happy with.
7

Preview & Edit

Once audio is generated, you have two ways to listen. The play button in the gutter previews a single paragraph on its own. The audio player bar at the bottom plays the full chapter as one continuous track. These are separate: previewing a paragraph doesn't affect chapter playback.

Chapter 3: The Letter

Narrator

The long grass rustled at her feet as the White Rabbit hurried by.

Alice

Excited, breathless wonder

“Curiouser and curiouser!” cried Alice. “Now I’m opening out like the largest telescope that ever was!”

Narrator

Measured, matter-of-fact

And so it was indeed: she was now more than nine feet high, and she at once took up the little golden key.

1:24 / 3:08

Editing is non-destructive. Change anything, and only the affected paragraphs need re-generation:

  • Edit text: the paragraph turns orange (stale). Unchanged paragraphs keep their audio.
  • Swap a character’s voice: only paragraphs assigned to that voice are marked stale.
  • Edit a scene direction: click the clapperboard icon to rewrite it. The paragraph is marked stale until you regenerate.
  • Regenerate selectively: regenerate one chapter, or just the stale paragraphs. The rest of your audiobook is untouched.
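A common way to implement this kind of stale tracking is to fingerprint the inputs that affect the audio and compare them with the fingerprint stored at generation time. The sketch below shows that general technique; it is not a description of Notevibes internals, and the dictionary keys and function names are hypothetical.

```python
import hashlib


def fingerprint(text: str, voice: str, direction: str = "") -> str:
    """Hash the inputs that determine how a paragraph sounds."""
    payload = "\x1f".join((text, voice, direction))  # unit separator between fields
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def paragraph_status(paragraph: dict) -> str:
    """Return 'gray' (never generated), 'green' (in sync), or 'orange' (stale)."""
    generated = paragraph.get("generated_fingerprint")
    if generated is None:
        return "gray"
    current = fingerprint(paragraph["text"], paragraph["voice"],
                          paragraph.get("direction", ""))
    return "green" if current == generated else "orange"


paragraph = {"text": "Wait.", "voice": "Kore", "direction": "Whispered, trembling"}
print(paragraph_status(paragraph))   # not generated yet

# Simulate generation by storing the fingerprint of the current inputs.
paragraph["generated_fingerprint"] = fingerprint(
    paragraph["text"], paragraph["voice"], paragraph["direction"])
print(paragraph_status(paragraph))   # in sync

paragraph["text"] = "Wait. This can't be right."
print(paragraph_status(paragraph))   # text changed since generation
```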
Tip: Listen to where chapters meet. Play the last paragraph of one chapter and the first of the next. If the transition feels off, adjust pause durations in TTS settings: 600ms between paragraphs, 400ms at periods, 200ms at commas.
8

Generate Illustrations

Your audiobook doesn't have to be audio-only. Notevibes generates two types of AI illustrations that power Storybook Mode when you publish: character portraits and scene illustrations.

Character Portraits

The AI reads your character descriptions, understands their role in the story (protagonist, antagonist, mentor, sidekick), and generates a portrait for each one. Portraits appear beside dialogue in Storybook Mode so listeners can see who's speaking.

Character Portraits: Alice’s Adventures in Wonderland
Character Portraits (5 of 5 ready)

Alice · protagonist
Queen of Hearts · antagonist
Cheshire Cat · mentor
Hatter · supporting
White Rabbit · supporting

The visual style adapts to your book's genre automatically. The AI detected the genre during character detection, so portraits already match:

Fantasy

Painterly illustration, golden and amber lighting, detailed textures (armor, robes, jewelry)

Sci-Fi

Sharp digital art, cool blue and neon accents, futuristic rim lighting, polished and clean

Romance

Soft focus, warm golden hour skin tones, intimate framing, dreamy atmosphere

Thriller

Noir-influenced, high contrast, chiaroscuro lighting, moody shadows, tense

Horror

Dark, desaturated palette, eerie undertones, dramatic shadows, subtle wrongness

Children's

Bright, colorful, soft rounded shapes, warm and inviting, storybook feel

Literary Fiction

Refined, atmospheric, muted tones, restrained elegance, quiet emotional depth

Historical

Classical oil-painting style, muted earth tones, visible brushwork, dignified and timeless

Mystery

Noir-influenced, desaturated palette with one warm accent, guarded expressions

If you've built character profiles with custom designs (outfits, poses, expressions), those are used as visual references during portrait generation. The AI keeps the character consistent across all illustrations.

Scene Illustrations

Scene illustrations are generated at key narrative moments in each chapter. The AI analyzes your text and builds a detailed visual description for each scene: the physical setting, lighting, atmosphere, which characters are present, camera angle, and color palette. All of this feeds into a composed image prompt.

Scene Illustrations: Alice’s Adventures in Wonderland

Down the rabbit hole (Chapter 1)
The pool of tears (Chapter 2)
A caucus race (Chapter 3)

Each scene illustration is built from six analyzed components:

  • Setting: the physical environment, materials, architecture, objects in the space
  • Lighting: source, direction, quality, color temperature
  • Atmosphere: weather, sensory texture, emotional weight of the moment
  • Characters: who’s present, posture, physical action
  • Camera: cinematic framing and angle (wide shot, low-angle, over-the-shoulder)
  • Color palette: 3–5 dominant colors specific to the scene mood

These appear on the left page in Storybook Mode's Illustrated layout, creating a book-spread experience: the scene on one side, the narrated text on the other.

Tip: Both portrait and scene generation run in the background. Click “Generate All” for portraits or “Generate Scenes” for illustrations, and keep working on other chapters while they process. You'll see a progress indicator on each character or scene card.

Character Consistency: Voice and Visual

A 20-chapter novel means 20 chapters where Elena needs to sound and look the same. Notevibes handles consistency at both layers automatically.

Voice consistency

When you run character detection, each character gets a voice assignment that applies to the entire book, not per chapter. Every paragraph tagged as “Elena” uses the same voice, same persona prompt, same delivery style, whether it's chapter 1 or chapter 20.

  • Voice assignments are global: change Elena’s voice once, and every paragraph she speaks across all chapters updates.
  • Persona prompts carry through: if you set Elena as "a young woman with a warm voice, British accent, moderate pace," that prompt applies everywhere she appears.
  • Adaptive delivery adjusts per paragraph (whispered in chapter 3, shouting in chapter 12), but the underlying voice stays the same.
  • If you swap a voice mid-production, only the affected paragraphs are marked stale. Regenerate just those chapters.

Visual consistency

Portraits and scene illustrations use a reference chain to keep characters looking the same across every image:

  • Reference sheet: before generating the final portrait, the AI creates a 5-view turnaround of each character (front, 3/4, side, back, action pose). This locks the character’s facial features, hair, skin tone, clothing, and proportions.
  • Portrait generation: the turnaround sheet is fed to the AI as a visual reference. The portrait matches those features exactly.
  • Scene illustrations: when a character appears in a scene, their portrait is included as a visual reference (up to 4 characters per scene). The AI matches their appearance in the scene to their established portrait.
  • Character profiles: if you’ve designed custom outfits, poses, or expressions in the character design system, those saved portraits seed the entire reference chain from the start.
  • Genre style: all portraits for the same book share the same genre-specific art style (fantasy painterly, sci-fi digital, romance soft-focus). Mixing styles within a book doesn’t happen.
Tip: If a character's portrait doesn't look right, regenerate just that character. The new portrait becomes the reference for all future scene illustrations. You don't need to redo scenes that have already been generated unless you want them updated too.
9

Book Cover

Every audiobook platform requires a cover image. ACX (Audible) requires a square image at 2400×2400 pixels minimum, in JPEG or PNG format. Google Play Books, Apple Books, and Kobo have similar requirements.

If you already have a book cover, you can use it. Most ebook covers are portrait-oriented (2:3 or similar), so you may need to crop or reformat to square for audiobook platforms.

If you need a cover, Notevibes has a built-in AI book cover generator. Enter your book title, describe the cover, pick a genre and art style, and the AI generates a cover with your title integrated as genre-appropriate typography. The cover generator uses the same character portraits and genre styles from your audiobook, so the visual identity stays consistent.

  • Square format: 2400×2400px for ACX/Audible, 1400×1400px minimum for most other platforms
  • No audio-specific text: don’t put "audiobook" on the cover unless you use a different cover from your print edition
  • High contrast: the cover will be displayed as a tiny thumbnail on phones. Bold text, clear imagery.
  • File format: JPEG or PNG, under 10 MB
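The checklist above is easy to verify before uploading. The sketch below checks a cover's dimensions, format, and file size against those rules. It takes plain numbers so it runs without an image library; the function name is hypothetical, and "under 10 MB" is interpreted here as under 10 × 1024 × 1024 bytes, which is an assumption.

```python
ALLOWED_FORMATS = {"JPEG", "PNG"}
MAX_BYTES = 10 * 1024 * 1024  # "under 10 MB", read as 10 MiB (assumption)


def check_cover(width: int, height: int, fmt: str, size_bytes: int,
                min_side: int = 2400) -> list[str]:
    """Return a list of problems; an empty list means the cover passes."""
    problems: list[str] = []
    if width != height:
        problems.append(f"not square: {width}x{height}")
    if min(width, height) < min_side:
        problems.append(f"shorter side is below {min_side}px")
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if size_bytes >= MAX_BYTES:
        problems.append("file is 10 MB or larger")
    return problems


# A typical portrait ebook cover fails the square and size checks for ACX.
print(check_cover(1600, 2560, "JPEG", 1_200_000))
# A square 2400px cover passes.
print(check_cover(2400, 2400, "PNG", 3_500_000))
```

Pass `min_side=1400` to check against the lower minimum that the guide lists for most other platforms.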
Tip: If you're publishing the same book as both an ebook and an audiobook, you can use the same cover. Just make sure you have a square crop version for audiobook platforms.
10

Download Your Audiobook

Once you're happy with the narration, click the download button. Three export options are available:

Download Your Audiobook

Download Chapter MP3: current chapter as a single MP3 file

Full Book (Merged MP3) · Popular: all chapters combined into one file

All Chapters (ZIP) · Recommended: individual MP3 per chapter (Chapter_01.mp3, Chapter_02.mp3...)

Audio Quality (auto-applied)

Format: 192 kbps CBR MP3
Sample Rate: 44.1 kHz
Loudness: -20 dB RMS (broadcast standard)
Peak Level: ≤ -3 dB

Every export is broadcast-quality. Notevibes handles volume normalization, peak limiting, and format encoding automatically. No audio engineering tools or post-processing needed. The files are ready to upload directly to Audible, Google Play Books, Apple Books, or any other audiobook distributor.
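For readers curious about what the loudness and peak figures mean, the sketch below computes both for a list of audio samples scaled to the range -1.0 to 1.0, using the standard formula of 20 × log10(amplitude) relative to full scale. It is a simplified illustration with hypothetical function names, not the normalization Notevibes applies.

```python
import math

TARGET_RMS_DB = -20.0  # loudness target listed above
MAX_PEAK_DB = -3.0     # peak ceiling listed above


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms_db(samples: list[float]) -> float:
    """Root-mean-square level of the samples, in dB."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def peak_db(samples: list[float]) -> float:
    """Level of the loudest single sample, in dB."""
    return to_db(max(abs(s) for s in samples))


def gain_to_target(samples: list[float], target_db: float = TARGET_RMS_DB) -> float:
    """Gain in dB needed to bring the RMS level to the target."""
    return target_db - rms_db(samples)


quiet = [0.05, -0.05] * 1000          # a quiet square-wave test signal
gain = gain_to_target(quiet)
print(f"RMS {rms_db(quiet):.2f} dB, gain needed {gain:.2f} dB")
print("peak OK after gain:", peak_db(quiet) + gain <= MAX_PEAK_DB)
```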

Tip: If you're distributing through Audible, use the All Chapters (ZIP) download. Audible requires separate files per chapter. The ZIP export names files as Chapter_01.mp3, Chapter_02.mp3, and so on, ready to upload.
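If you ever need to rebuild a per-chapter archive yourself, for example after renaming or reordering files, the naming scheme is simple to reproduce. The sketch below builds zero-padded names in the Chapter_01.mp3 pattern and packs byte strings into a ZIP in memory. The function names are hypothetical and the audio bytes are placeholders.

```python
import io
import zipfile


def chapter_filename(number: int) -> str:
    """Zero-padded name so files sort in chapter order (up to 99 chapters)."""
    return f"Chapter_{number:02d}.mp3"


def build_zip(chapter_audio: list[bytes]) -> bytes:
    """Pack one MP3 per chapter into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    # MP3 data is already compressed, so store it without recompressing.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for number, audio in enumerate(chapter_audio, start=1):
            archive.writestr(chapter_filename(number), audio)
    return buffer.getvalue()


archive_bytes = build_zip([b"placeholder-1", b"placeholder-2", b"placeholder-3"])
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(archive.namelist())
```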
11

Publish Your Audiobook

You have two publishing paths: publish directly on the Notevibes Library for instant availability with rich features, or export and distribute to external platforms.

Publish on Notevibes Library

When you generate a book, it's automatically added to your library as a private draft. To publish publicly:

  1. Open the book in your Library and fill in metadata: author, description, genre, language, tags
  2. Optionally generate AI character portraits and scene illustrations
  3. Enable Read Along (word-level sync) for interactive reading
  4. Click “Publish” to make the book available on the Notevibes Library
  5. Share the public URL with your audience

Published books on Notevibes come with Storybook Mode, Read Along, character portraits, scene illustrations, and listener analytics included, features you won't find on any other audiobook platform.

Publish Your Audiobook

Title

Alice’s Adventures in Wonderland

Author

Lewis Carroll

Genre

Classic Literature, Fantasy

Language

English (US)

Description

Alice falls down a rabbit hole into a fantastical underground world populated by peculiar creatures. A timeless classic brought to life with multi-voice AI narration and illustrated scenes.

Visibility

Public

Export to External Platforms

Download your audio files and distribute to any major platform:

Audible

The world’s largest audiobook marketplace

Google Play Books

Reach Android and Google Home users

Apple Books

iPhone, iPad, and Mac listeners

Spotify

Audiobooks on the streaming platform

Kobo

190+ countries worldwide

Findaway Voices

Wide distribution to 40+ retailers

AI Narration Acceptance by Platform

All major audiobook platforms accept AI-narrated content. Here's the current status:

Audible / ACX

Accepted with disclosure. Tag as “AI-narrated” during upload. Standard royalty terms apply.

Google Play Books

Fully accepted. No special disclosure required beyond standard metadata.

Apple Books

Accepted. Recommend noting AI narration in the book description.

Spotify

Accepted via distributors (Findaway, DistroKid). Follow distributor guidelines for AI disclosure.

Kobo

Accepted. AI narration disclosure recommended in metadata.

Findaway Voices

Accepted. Wide distribution to 40+ retailers with AI narration support.

Tip: Always disclose AI narration in your audiobook metadata. It's required by some platforms and builds trust with listeners. Most distributors have added specific fields for AI narration disclosure, so fill them in during upload.

Full Commercial Rights

All paid Notevibes plans include full commercial rights to generated audio. No royalty splits to Notevibes, no per-sale fees, no platform lock-in. Your audiobook, your revenue. Publish and sell on any platform worldwide.

Published? Now get listeners. Read the audiobook marketing guide for distribution strategy, advertising platforms, social media clips, email launches, and a 30-day playbook.

Sell Your Audiobook

Coming Soon

Direct sales powered by Stripe

We're building direct audiobook sales into Notevibes. Set your own price, accept payments through Stripe, and keep your revenue. No middleman, no revenue share to Notevibes. We handle hosting, delivery, and the listening experience. You keep the earnings.

Set your price

You decide what your audiobook is worth

Stripe checkout

Trusted, global payment processing

Instant payouts

Revenue goes directly to your Stripe account

No revenue share

Notevibes takes $0 from your sales

Your audiobook + Storybook Mode + Read Along + Analytics

Buyers get the full Notevibes listening experience. You get a storefront without building one.

What Your Listeners Experience

Publishing on the Notevibes Library gives your audience three ways to experience your audiobook, features you won't find on any other platform. These are powered by the character portraits, scene illustrations, and audio sync you generated in the previous steps.

Storybook Mode

Transforms your audiobook into a visual reading experience. Listeners see beautifully typeset pages with the narration synced to the text, a hybrid of audiobook and illustrated storybook.

Alice's Adventures in Wonderland
Chapter 7: A Mad Tea-Party
Alice (protagonist)

There was a table set out under a tree in front of the house, and the March Hare and the Hatter were having tea at it.

“No room! No room!” they cried out when they saw Alice coming. “There's plenty of room!” said Alice indignantly.

The Hatter opened his eyes very wide on hearing this; but all he said was, “Why is a raven like a writing-desk?”

Genre-specific themes

Fantasy gets parchment and gold. Romance gets soft pinks. Thrillers get noir contrast. 9 themes auto-applied from detected genre.

Three layout modes

Illustrated (scene + text spread), Classic (portrait badges inline), Cinematic (visual novel style).

Page-turn animations

Slide, flip, or none. Drop caps, ornamental dividers, and chapter title pages.

Character portraits & scenes

Your generated portraits float beside character dialogue. Scene illustrations fill the left page.

Read Along

Word-level highlight sync that follows the narration in real time. Each word lights up as it's spoken, making it easy to follow along.

Chapter 3: The Letter
Synced · Word highlight

The rain hammered against the window pane. She knew, even before opening the letter, that everything was about to change.

The envelope was thin, just a single sheet folded in thirds. No return address. Her name in handwriting she hadn't seen in seventeen years.

“Wait,” she whispered, pressing the paper flat on the table. “This can't be right.”

3-level highlighting

Active word, surrounding sentence, and current paragraph, each with its own highlight intensity.

Click to jump

Click any word to jump the narration there. Auto-advance between chapters.

Accessibility

Valuable for language learners, readers with dyslexia, and multi-modal reading.

Listener Analytics

Track how listeners engage with every chapter of your audiobook with the built-in analytics dashboard.

12,847

Total Plays

3,291

Unique Listeners

68%

Completion Rate

Plus geographic insights (listener locations by country), device & behavior data (mobile vs. desktop, session length, chapter drop-off points), and daily aggregation for trend analysis.

Audiobook on YouTube

YouTube is the second-largest search engine in the world, and audiobook content is growing fast on the platform. Uploading your audiobook (or sample chapters) to YouTube is free marketing that reaches listeners who would never find you on Audible.

Why YouTube works for audiobooks

  • Discovery: people search YouTube for "fantasy audiobook" or "free audiobook" every day. Your book shows up where they’re already looking.
  • Sample chapters: upload the first 2–3 chapters as a free preview. Link to the full audiobook on Audible, Apple Books, or Notevibes Library in the description.
  • Full audiobooks: some authors upload the entire book for ad revenue. This works especially well for public domain titles and series where the first book drives sales of sequels.
  • Playlists: organize chapters into a playlist. Listeners can play through the whole book in order without clicking between videos.

How to create an audiobook video

  • Use your book cover as the video thumbnail and static background image.
  • Add chapter timestamps in the description so viewers can jump to specific sections.
  • Include SEO keywords in the title: "Book Title | Full Audiobook | Fantasy | AI Narration"
  • Add a description with your book summary, links to buy the full version, and a note that narration is AI-generated.
  • If you published with Storybook Mode on Notevibes, you can screen-record the visual playback for a richer video.
Tip: YouTube audiobooks work best for series. Upload Book 1 for free, and let the algorithm drive listeners to buy Books 2+. Several indie authors have grown to 10K+ subscribers this way.
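The chapter timestamps for a video description can be generated from chapter lengths rather than typed by hand. This is a minimal sketch: the function names are hypothetical, and the durations in the example are made up. It assumes you know each chapter's length in seconds and that the chapters play back to back in one video. As far as we know, YouTube expects the first timestamp to be 0:00, so the list always starts there.

```python
def format_timestamp(total_seconds):
    """Format seconds as M:SS, or H:MM:SS once the video passes one hour."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


def chapter_timestamps(chapters):
    """chapters: list of (title, duration_in_seconds) in playback order."""
    lines = []
    start = 0
    for title, duration in chapters:
        lines.append(f"{format_timestamp(start)} {title}")
        start += duration  # the next chapter starts where this one ends
    return "\n".join(lines)


print(chapter_timestamps([
    ("Chapter 1", 665),
    ("Chapter 2", 3000),
    ("Chapter 3", 500),
]))
# 0:00 Chapter 1
# 11:05 Chapter 2
# 1:01:05 Chapter 3
```

Paste the output into the video description, one timestamp per line.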

Social Media Clips & Video

A finished audiobook is a content machine. Every chapter has moments worth clipping: a dramatic reveal, a funny line, a tense confrontation. These clips drive listeners to your full audiobook across every platform.

Turn scenes into video clips

Notevibes can generate video from your scene illustrations using VEO. Each scene panel gets cinematic camera movement: slow push-ins for close-ups, tracking shots for medium scenes, dramatic dollies for action. Pair these animated scenes with the narration audio, and you have a video clip ready for social media without any external editing tools.

  • Establishing shots: slow cinematic push-in, 8 seconds. Great for chapter openings and setting the mood.
  • Close-ups: slow zoom into character face, 4 seconds. Perfect for emotional moments and reveals.
  • Action panels: dynamic tracking camera, 8 seconds. Best for fight scenes, chases, and plot twists.
  • Reaction shots: steady hold with subtle drift, 4 seconds. Works for dialogue-heavy clips.

Platform-specific tips

TikTok

Vertical 9:16, 15–60 seconds

Hook in the first 2 seconds. Use the most dramatic or mysterious moment from your book. Add text overlay with the book title. End with "Full audiobook in bio." Trending sounds optional but help reach.

Instagram Reels

Vertical 9:16, 15–90 seconds

Same content as TikTok, different audience. Use the cover art as the first frame. Add captions (80% of Reels are watched on mute). Tag #BookTok and #Audiobook.

YouTube Shorts

Vertical 9:16, up to 60 seconds

Pick a cliffhanger moment and cut right before the resolution. Link the full audiobook in your channel. YouTube Shorts feed directly into your long-form audiobook videos.

X / Twitter

Square 1:1 or landscape, 30–60 seconds

Post the audio clip with the cover art as a static image. Quote a compelling line from the passage. Thread multiple clips for a "listen to this voice" series.

Facebook

Square 1:1, 30–60 seconds

Post in audiobook and book club groups. Longer clips work here (up to 3 minutes). Add a "listen with headphones" note. Link directly to the audiobook.

Pinterest

Vertical 2:3 pin with audio link

Pin your cover art with a text overlay of a quote from the book. Link to the Notevibes Library page or Audible listing. Pinterest drives long-tail traffic for months.

What to clip

  • The opening paragraph of Chapter 1. This is your hook. If it doesn’t grab in 15 seconds, listeners won’t click through.
  • A dramatic reveal or plot twist. Cut right before the resolution to create a cliffhanger.
  • A dialogue exchange between two characters. Multi-voice clips showcase the character detection feature.
  • An emotional scene with adaptive delivery. Whispered confessions, heated arguments, or quiet reflections sound incredible in short clips.
  • A behind-the-scenes clip: screen record the character detection panel or Storybook Mode playback. Process content performs well on all platforms.
Tip: Post 3–5 clips in the first week after launch. Space them 1–2 days apart. Each clip should feature a different scene or character voice so followers get a feel for the full audiobook. Always include a link to the full audiobook in your bio or description.

Traditional Studio vs. AI Narration

Traditional Studio

$5,000 – $15,000+ per finished hour

2–6 weeks turnaround

Schedule actors, engineers, studio time

Re-records cost extra per session

One language per production

Freelance Narrator

$200 – $400 per finished hour

2–4 weeks turnaround

Finding and vetting takes time

Revisions negotiated per contract

One language per narrator

Notevibes AI

Recommended

Starting at $19/month

One afternoon, upload to finished audiobook

Upload your book and click generate

Unlimited revisions included

72 languages from the same manuscript

Real-World Example: A 50,000-Word Novel (~7 Finished Hours)

Traditional Studio

$35,000 – $105,000

2–6 weeks

Freelance Narrator

$1,400 – $2,800

2–4 weeks

Notevibes AI

From $19/month

One afternoon

Cost comparison (7 finished hours)

Traditional Studio: up to $105,000
Freelance Narrator: up to $2,800
Notevibes AI: from $19/mo
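The studio and freelance totals are simply the per-finished-hour rates multiplied by the book's length. A short sketch of that arithmetic, using the rates and the 7-hour estimate quoted in this guide (the function name is illustrative):

```python
def cost_range(finished_hours, low_rate, high_rate):
    """Return (low, high) total cost in dollars for a per-finished-hour rate range."""
    return finished_hours * low_rate, finished_hours * high_rate


FINISHED_HOURS = 7  # this guide's estimate for a 50,000-word novel

studio_low, studio_high = cost_range(FINISHED_HOURS, 5_000, 15_000)
freelance_low, freelance_high = cost_range(FINISHED_HOURS, 200, 400)

print(f"Studio: ${studio_low:,} to ${studio_high:,}")
print(f"Freelance: ${freelance_low:,} to ${freelance_high:,}")
# Studio: $35,000 to $105,000
# Freelance: $1,400 to $2,800
```

Swap in your own finished-hour estimate to see how the gap scales with book length. The Notevibes figure is a flat subscription, so it does not scale the same way.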

Common Mistakes to Avoid

Not previewing a chapter before generating the full book

Generate one chapter first to verify voice assignments and narration style. Regenerating the entire book wastes credits and time.

Using voices that sound too similar

Assign distinctly different voices to characters. A deep male voice and a high female voice are easy to distinguish; two mid-range female voices are not.

Skipping chapter-by-chapter review

Listen to at least the first and last paragraph of each chapter. This catches pacing issues and awkward chapter transitions that won’t be obvious from the text.

Ignoring PDF formatting artifacts

PDFs often include headers, footers, page numbers, and table formatting that can end up in the narration. Always review extracted text before generating audio.

Not testing playback on mobile

Many listeners use phones. Test your audiobook on a mobile device to check volume levels, chapter transitions, and the overall listening experience.

Forgetting to set book metadata before publishing

Title, author, description, genre, and language affect discoverability on every platform. Fill these in completely before publishing or distributing.

Not preparing the manuscript for audio

Charts, footnotes, URLs, and "see figure 3" references don’t work in audio. Clean these up before uploading. Five minutes of manuscript prep saves an hour of editing later.
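A quick scan can point you to the lines that need attention before you upload. The sketch below is one possible approach and is not part of Notevibes: the pattern list and the `find_audio_unfriendly` name are illustrative, and the patterns are deliberately simple, so expect some misses and false positives. It only reports findings; rewriting the sentence is still a job for you.

```python
import re

# Illustrative patterns for text that reads badly aloud.
PATTERNS = {
    "url": re.compile(r"https?://\S+"),
    "figure reference": re.compile(r"\b(?:see\s+)?(?:figure|table)\s+\d+", re.IGNORECASE),
    "footnote marker": re.compile(r"\[\d+\]"),
    "bare page number": re.compile(r"^\s*\d+\s*$"),
}


def find_audio_unfriendly(text):
    """Return (line_number, kind, snippet) for each match, in reading order."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, kind, match.group().strip()))
    return findings


sample = (
    "Results were clear (see Figure 3).[2]\n"
    "42\n"
    "More at https://example.com/data today."
)
for finding in find_audio_unfriendly(sample):
    print(finding)
```

Reporting rather than deleting is a deliberate choice here: removing a URL or figure reference automatically tends to leave a broken sentence behind.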

Skipping character detection for fiction

Using a single narrator voice for a novel with 8 characters sounds flat. Let the AI detect characters and assign unique voices. It takes under a minute and the difference is dramatic.

Publishing without a cover image

Every audiobook platform requires a square cover. ACX rejects submissions without one. Generate a cover with the Notevibes cover generator or crop your existing ebook cover to square.

Not disclosing AI narration

Some platforms require AI narration disclosure in your metadata, and the rest recommend it. Skipping it where it is required can get your audiobook removed. Always add "Narrated by AI" or "AI-generated narration" to your listing.

Choosing a voice without previewing it

Every voice sounds different at different pacing and energy levels. Always preview a few paragraphs of your actual text, not just the sample clip. A voice that sounds great on a demo sentence might not suit your prose.

Not creating a pronunciation guide for unusual names

Fantasy names, foreign places, and technical terms can be mispronounced by AI voices. Write out the phonetic pronunciation in the text (e.g., "Kael-thir" instead of "Kaelthir") for tricky words.
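If a tricky name appears hundreds of times, a small script can apply your pronunciation guide to a narration copy of the manuscript while the original stays untouched. This is a minimal sketch with a hypothetical function name; it assumes whole-word, case-sensitive matching is what you want.

```python
import re


def apply_pronunciations(text, respellings):
    """Replace each whole-word name with its phonetic respelling."""
    if not respellings:
        return text
    # Try longer names first so a short name never pre-empts a longer one.
    names = sorted(respellings, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: respellings[m.group(1)], text)


guide = {"Kaelthir": "Kael-thir"}
print(apply_pronunciations("Kaelthir rode to Kaelthir's gate.", guide))
# Kael-thir rode to Kael-thir's gate.
```

Because matching is whole-word, a derived form such as "Kaelthirian" is left alone and needs its own entry in the guide.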

Frequently Asked Questions

How long does it take to create an audiobook?

Most books go from upload to finished audiobook in a single afternoon. Upload takes seconds, character detection runs in under a minute, and audio generation processes all chapters in parallel. A 10-chapter novel typically generates in 5–15 minutes.

Can AI really detect characters in my novel?

Yes. The AI scans your entire manuscript, identifies every speaking character by name, assigns narrative roles (protagonist, antagonist, narrator, mentor, sidekick, supporting, minor), counts dialogue lines, tracks chapter appearances, and even detects narrative arc and character evolution. You can review and override every assignment.

What if I want to change a voice after generating?

Fully supported. Change any character’s voice in the characters panel, and the affected paragraphs are marked stale (orange indicator). Regenerate only the changed chapters. The rest of your audiobook is preserved.
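To make the idea concrete, here is a rough sketch of how "regenerate only what changed" can be worked out. It is an illustration under assumptions, not Notevibes's actual implementation: the `stale_chapters` function and the chapter-to-speakers mapping are hypothetical, and it works at chapter level rather than paragraph level.

```python
def stale_chapters(chapter_speakers, changed_characters):
    """Return chapters in which any character with a changed voice speaks."""
    changed = set(changed_characters)
    return [
        chapter
        for chapter, speakers in chapter_speakers.items()
        if changed & set(speakers)
    ]


# Hypothetical mapping of chapter number to the characters who speak in it.
book = {
    1: ["Alice", "White Rabbit"],
    2: ["Alice"],
    7: ["Alice", "Hatter", "March Hare"],
    8: ["Queen"],
}
print(stale_chapters(book, ["Hatter"]))  # [7]
```

Changing a minor character's voice touches only the chapters they appear in, while changing the protagonist's voice usually means regenerating most of the book.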

Can I use this for non-fiction books?

Absolutely. Use the Non-Fiction or Narrator preset for clear, authoritative delivery. Character detection is optional. For non-fiction, you can use a single narrator voice for the entire book. The AI handles tables, citations, and multi-column layouts.

Do I need audio editing skills?

No. Notevibes handles all audio engineering automatically: volume normalization, peak limiting, and noise floor management. The exported files are broadcast-quality and ready to upload to any platform with zero post-processing.

Can I publish in multiple languages?

Yes. All 550+ voices support 72 languages natively. Upload your translated manuscript and generate with the same (or different) voices. The voices produce authentic accents in every supported language.

What’s the difference between Storybook Mode and Read Along?

Read Along adds word-level highlight sync to the text display. Storybook Mode is a full visual experience with illustrated pages, genre themes, character portraits, scene illustrations, drop caps, and page-turn animations. Storybook Mode includes Read Along’s highlighting capabilities.

Do I own the rights to my AI-generated audiobook?

Yes. All paid plans include full commercial rights. No royalty splits to Notevibes, no per-sale fees, no platform lock-in. Publish and sell anywhere.

Which platforms accept AI-narrated audiobooks?

All major platforms accept AI narration. Audible and ACX require AI disclosure during upload. Google Play Books, Apple Books, Kobo, and Findaway Voices all accept AI-narrated content. Spotify accepts audiobooks through distribution partners. Always disclose AI narration in your metadata.

How much does it cost to create an audiobook?

Traditional studio recording costs $5,000–$15,000 per finished hour. Freelance narrators charge $200–$400 per finished hour. With Notevibes AI, plans start at $19/month with unlimited revisions and 72 languages included. A 50,000-word novel (~7 hours) costs $35,000–$105,000 in a studio, $1,400–$2,800 with a freelancer, or from $19/month with Notevibes.

Can I create an audiobook from a PDF?

Yes. Notevibes uses AI-powered text extraction with layout detection to handle PDFs, including multi-column layouts, tables, headers, and footers. The AI detects and skips page numbers and converts tables to spoken descriptions. Review the extracted text before generating for best results.

How do I create an audiobook in another language?

Upload your translated manuscript and generate with the same or different voices. All 550+ Notevibes voices support 72 languages natively, producing authentic accents in every language. The same voice can narrate in English, French, Japanese, Arabic, or any other supported language.

Should I narrate my own audiobook or use AI?

It depends on the book. If you’re a speaker, podcaster, or coach and your audience knows your voice, self-narrating can work for non-fiction. For fiction with multiple characters, AI narration is usually better: 550+ distinct voices, automatic character switching, and consistent quality across every chapter. No recording equipment, no re-takes, no sound booth. You can always test both: generate a sample chapter with AI and compare.

What audio specs do audiobook platforms require?

ACX (Audible) requires MP3 at 192 kbps CBR, 44.1 kHz sample rate, RMS between -23 dB and -18 dB, peak level at or below -3 dB, and noise floor below -60 dB. Notevibes exports meet all of these automatically. No mastering or post-processing needed. Google Play Books, Apple Books, Kobo, and Spotify accept the same format.

How do I choose the right voice for my book?

Start with genre. Thrillers need a low-pitch, strong-energy voice. Romance needs soft, warm delivery. Children’s books need bright, animated voices. Then customize: adjust pitch, timbre, pace, energy, and accent per character. Preview any voice before committing. The Assign Unique Voices section above has recommended voices for every genre.

Can listeners tell it’s AI narration?

With the natural voices, most listeners cannot. The voices handle emotional delivery, pacing changes, whispers, and dialogue shifts. Where AI still falls short: very long dramatic pauses, singing, and heavy regional dialects. For 95% of audiobooks, the quality is indistinguishable from a mid-tier professional narrator.

Do I need to disclose AI narration?

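Treat it as required. Per the platform breakdown earlier in this guide, Audible/ACX asks you to tag the title as AI-narrated during upload, and Spotify's distributors set their own disclosure rules. Apple Books and Kobo recommend noting AI narration in the description or metadata, and Google Play Books asks for nothing beyond standard metadata. Adding "Narrated by AI" or "AI-generated narration" to your audiobook metadata everywhere builds trust with listeners and keeps you compliant wherever the rule applies.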

What is a royalty share deal, and why do authors avoid it?

On ACX, a narrator can record your book for free in exchange for 50% of your royalties for 7 years. Sounds good until your book sells well and you’re locked into giving away half your earnings with no way out. AI narration eliminates this entirely: you pay a flat monthly fee, keep 100% of royalties, and can change voices or re-generate anytime.

How do I get my audiobook on Audible?

Export your finished audiobook from Notevibes as per-chapter MP3 files (ACX-compliant format). Create an ACX account, claim your book title, upload the audio files, and submit for review. ACX distributes to Audible, Amazon, and iTunes. Turnaround is typically 10–14 business days. For wider distribution, also upload to Google Play Books, Apple Books, Kobo, and Findaway Voices.

Can I sell my audiobook directly on Notevibes?

Not as a paid download yet. You can publish to the Notevibes Library today, and your audiobook is instantly available with Storybook Mode, Read Along, character portraits, scene illustrations, and listener analytics included. Direct sales via Stripe (set your own price, keep the revenue) are on the roadmap.

What if my book already has illustrations?

You can use your existing illustrations or let the AI generate new ones. For children’s books with existing art, skip scene generation and use your original images. If you want AI to generate portraits and scenes in a matching style, the genre detection adapts the visual style automatically. You can also upload reference images to guide the AI’s visual consistency.

Ready to Create Your Audiobook?

Upload your manuscript, let AI do the heavy lifting, and publish a professional audiobook, all in one sitting. No studio, no actors, no audio engineering.

Create Your Audiobook

550+ voices · 72 languages · Full commercial rights · Broadcast-quality audio