The advanced audio editor

AI Audio Editor — Edit Audio Just by Chatting

Studio work that used to mean ffmpeg commands, a wall of plugins, and a timeline is now one sentence. Clean up, cut, mix, split stems, generate voiceovers, even dub into another language. The AI runs the whole chain; you approve every step.

Drop your audio to start

MP3, WAV, M4A, FLAC, OGG, MP4…

ffmpeg-powered engine | neural models | non-destructive

Edit by talking

No timeline, no menus, no plugins to chain. Say what you want — “remove the hum”, “cut the part about pricing” — and the AI figures out which tools to use and runs them.

A whole studio, packed in

A server-side ffmpeg engine and neural models cover cleanup, cutting, EQ, dynamics, pitch, stem separation, voiceovers, and voice-preserving translation — reached just by asking.

Non-destructive, always

Every edit is a new version you can play, A/B, download, or roll back to. Your original is never overwritten, so you can experiment without fear.
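As a rough illustration of what "non-destructive" means in practice, here is a minimal sketch of an append-only version history. The `Version` and `History` names and the file paths are hypothetical, chosen for this example; the page does not describe how the editor stores versions internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Version:
    """One immutable snapshot: a label plus the file it points to."""
    label: str
    path: str


@dataclass
class History:
    """Append-only list of versions; the original is always index 0."""
    versions: List[Version] = field(default_factory=list)
    current: int = -1

    def add(self, label: str, path: str) -> Version:
        # Every edit writes a new file and appends it; nothing is overwritten.
        version = Version(label, path)
        self.versions.append(version)
        self.current = len(self.versions) - 1
        return version

    def rollback(self, index: int) -> Version:
        # Rolling back only moves the pointer; later versions stay available.
        if not 0 <= index < len(self.versions):
            raise IndexError(f"no version {index}")
        self.current = index
        return self.versions[index]


history = History()
history.add("original", "take.wav")        # hypothetical file names
history.add("denoised", "take.v1.wav")
history.add("normalized", "take.v2.wav")
restored = history.rollback(1)             # back to the denoised version
```

Because a rollback never deletes anything, an A/B comparison is just playing two entries from the same list.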

The hard, manual way — made one sentence

Every one of these is a real operation people do by hand. The editor packs them all in and runs them for you — you just describe the result.

Normalize loudness to podcast level
The manual way:
ffmpeg -i in.wav -af loudnorm=I=-16:TP=-1.5:LRA=11 out.wav
Just say: “Normalize it to podcast loudness”

Strip silence and dead air
The manual way:
ffmpeg -i in.wav -af \
  silenceremove=start_periods=1:stop_periods=-1:stop_duration=0.4:stop_threshold=-40dB \
  out.wav
Just say: “Tighten the pauses”

Pitch down two semitones, keep the tempo
The manual way:
ffmpeg -i in.wav -af \
  "asetrate=44100*0.891,aresample=44100,atempo=1.122" out.wav
Just say: “Drop the pitch two semitones”

Separate the vocals from a song
The manual way:
pip install demucs
demucs --two-stems vocals song.mp3
# …then a GPU, the model weights, the paths
Just say: “Pull the vocals out of this track”
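The 0.891 and 1.122 in the pitch example are not magic numbers. In equal temperament one semitone is a frequency ratio of 2^(1/12), so two semitones down is 2^(-2/12) ≈ 0.891, and the tempo correction is its inverse, ≈ 1.122. The sketch below derives the filter string for any shift; `pitch_shift_filter` is a hypothetical helper name, and it assumes a 44.1 kHz source as in the command above.

```python
def pitch_shift_filter(semitones: float, sample_rate: int = 44100) -> str:
    """Build an ffmpeg -af string that shifts pitch but keeps the duration."""
    ratio = 2 ** (semitones / 12)   # equal temperament: one semitone = 2^(1/12)
    tempo = 1 / ratio               # undo the speed change that asetrate causes
    return (
        f"asetrate={sample_rate}*{ratio:.3f},"
        f"aresample={sample_rate},"
        f"atempo={tempo:.3f}"
    )


down_two = pitch_shift_filter(-2)
print(down_two)  # asetrate=44100*0.891,aresample=44100,atempo=1.122
```

Pass the string to ffmpeg as the `-af` argument. For a source at a different sample rate, pass that rate as the second argument.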

Just say it — here’s what happens

Type it like you’d ask a person. The AI maps your words onto the right tools and shows you the result as a version.

Remove the background hum and hiss

Neural cleanup + de-hum

Make it podcast-ready

Cleanup → EQ → loudness to −16 LUFS

Tighten the pauses and cut the filler words

Silence + filler trim

Cut the part about pricing

Finds it in the transcript, ripple-cuts it

Split this song into stems

Vocals, drums, bass, guitar, piano, other

Take the vocals out for karaoke

Vocal removal

Dub this into Spanish but keep my voice

Voice-preserving translation

Add an intro that says “Welcome to episode 12”

AI voiceover, added at the start

Speed it up 1.2× without the chipmunk voice

Tempo stretch

Boost the bass and add a little warmth

Bass boost + EQ
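To make one of these mappings concrete, here is a rough sketch of what a "podcast-ready" chain could look like using only stock ffmpeg filters. This is an assumption for illustration, not the editor's actual chain: `afftdn` (ffmpeg's FFT denoiser) stands in for the neural cleanup, and the EQ values are placeholders rather than tuned settings.

```python
def podcast_chain(target_lufs: float = -16.0) -> str:
    """Join cleanup, EQ, and loudness filters into one -af string."""
    steps = [
        "highpass=f=80",                 # de-rumble: drop content below 80 Hz
        "afftdn",                        # FFT denoiser, a stand-in for neural cleanup
        "equalizer=f=3000:t=q:w=1:g=2",  # gentle presence lift (illustrative values)
        f"loudnorm=I={target_lufs:g}:TP=-1.5:LRA=11",
    ]
    return ",".join(steps)


chain = podcast_chain()
# Argument list suitable for subprocess.run; file names are placeholders.
command = ["ffmpeg", "-i", "in.wav", "-af", chain, "out.wav"]
```

Order matters here: loudness normalization goes last so that it measures the signal after cleanup and EQ have changed it.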

Everything packed in

The full toolset of a pro studio and a stack of AI models — all reachable in one conversation.

Clean up the noise

  • Neural noise removal
  • De-hum & de-rumble
  • De-ess & de-click
  • De-plosive & noise gate
  • Declip & restore

Cut & arrange

  • Trim and split clips
  • Ripple-cut a section
  • Fade in / out
  • Move & combine clips
  • Cut by transcript

Tone & dynamics

  • Parametric & voice EQ
  • Compressor & limiter
  • Loudness normalize (LUFS)
  • Bass & treble shaping

Time & pitch

  • Speed up / slow down
  • Tempo stretch — no chipmunk
  • Pitch shift ±12 semitones
  • Reverse, echo & reverb

Separate stems · AI

  • Split into 6 stems
  • Isolate vocals, drums, bass…
  • Extract or remove one instrument
  • Vocal removal for karaoke

Generate & translate · AI

  • Text-to-speech voiceovers
  • Spoken intros & outros
  • Dub into another language
  • Keep the original voice

Transcribe and translate — the two big ones

It can transcribe your audio and speak it in another language. That’s what turns an editor into a publishing tool.

Transcribe → edit by meaning

It writes down every word, so you can edit by content instead of hunting timestamps. Say “cut the part about pricing” and it finds it and removes it — and you keep the transcript.
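A minimal sketch of the idea behind cutting by transcript, assuming the transcript carries word-level timestamps. The `keep_ranges` and `aselect_filter` helpers are hypothetical, and the exact phrase match is a simplification: the editor presumably locates the passage by meaning rather than by literal words.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]  # (text, start_sec, end_sec)


def keep_ranges(words: List[Word], phrase: str, duration: float) -> List[Tuple[float, float]]:
    """Return the time ranges left after removing the first match of `phrase`."""
    target = phrase.lower().split()
    if not target:
        return [(0.0, duration)]
    texts = [w[0].lower().strip(".,!?") for w in words]
    for i in range(len(texts) - len(target) + 1):
        if texts[i:i + len(target)] == target:
            cut_start = words[i][1]
            cut_end = words[i + len(target) - 1][2]
            ranges = [(0.0, cut_start), (cut_end, duration)]
            # Drop empty ranges when the cut touches either end of the file.
            return [(a, b) for a, b in ranges if b > a]
    return [(0.0, duration)]  # phrase not found: keep everything


def aselect_filter(ranges: List[Tuple[float, float]]) -> str:
    """Build an ffmpeg filter that keeps only `ranges` and closes the gaps."""
    expr = "+".join(f"between(t,{a:g},{b:g})" for a, b in ranges)
    return f"aselect='{expr}',asetpts=N/SR/TB"


words = [
    ("Welcome", 0.0, 0.5), ("back.", 0.5, 1.0),
    ("Our", 1.2, 1.4), ("pricing", 1.4, 2.0), ("starts", 2.0, 2.4),
    ("Thanks", 3.0, 3.5),
]
kept = keep_ranges(words, "our pricing starts", duration=3.5)
ripple_cut = aselect_filter(kept)
```

The `asetpts=N/SR/TB` part rewrites timestamps so the kept ranges play back to back, which is what makes this a ripple cut rather than a gap of silence.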

Translate → reach everyone

Dub your recording into another language while keeping your own voice. One podcast, ad, or lesson — every audience, no re-recording, no new talent.

Built for the work you actually do

Whatever you’re making, you describe the result — the AI handles the audio.

Podcasters & creators

  • Clean up noise & hum
  • Tighten pauses, cut tangents
  • Add AI intros & voiceovers
  • Localize the whole episode

Localization & marketing

  • Dub a VO into another language
  • Keep the original voice
  • One recording, many markets

Course creators

  • Clean up lecture audio
  • Trim down to tight lessons
  • Dub courses into new languages

Journalists & interviewers

  • Transcribe interviews
  • Cut to the quote that matters
  • Clean up field recordings

Musicians & remixers

  • Split songs into stems
  • Pull or remove one instrument
  • Karaoke & acapella versions
  • Pitch & tempo, no artifacts

Teams & business

  • Clean up meeting & webinar audio
  • Localize announcements & training
  • Normalize loudness for delivery

Real workflows, start to finish

Each step is something you say; the editor does the rest and saves a version you can roll back to.

Podcaster

Raw recording → ready to publish

  1. Make it podcast-ready

    Denoise → EQ → loudness

  2. Tighten the pauses and cut my coughs

    Silence trim + content cuts

  3. Add an intro that says “Episode 12: …”

    AI voiceover at the start

  4. Export as MP3

    Download

Marketer

One voiceover → another language

  1. Transcribe it so I can proof the script

    Full transcript

  2. Dub it into Spanish and keep my voice

    Voice-preserving dub

  3. Now do French

    A second dubbed version

Interviewer

Interview → the clip that matters

  1. Transcribe the interview

    Searchable transcript

  2. Pull the part where she talks about funding

    Finds it, cuts it to a clip

  3. Clean up the room noise

    Neural enhancement

Musician

Song → karaoke + stems

  1. Split this into stems

    Vocals, drums, bass, guitar, piano, other

  2. Make a karaoke version

    Vocal removal

  3. Download the vocals on their own

    Per-stem export

How it works

  1. Drop your audio

    The editor listens, tells you what it is (podcast, voiceover, or song), and flags length, levels, and any noise.

  2. Say what you want

    Describe the edit in plain words, or tap a suggested action. The AI plans the whole ffmpeg + neural chain and previews each step before it touches your audio.

  3. Play, compare, export

    Every change is its own version with a waveform. A/B it against the previous version, roll it back, then download MP3 or WAV.

Frequently asked

What can the AI audio editor do?

A lot, all by describing it. Clean up audio (neural noise removal, de-hum, de-ess, de-click, de-plosive, noise gate); cut sections and trim silence or filler words; transcribe speech and cut by transcript; shape tone and loudness (EQ, compression, limiter, bass and treble); add fades; change speed, tempo, or pitch; reverse; add echo or reverb; split a song into stems; remove or isolate a single instrument; strip vocals for karaoke; generate spoken intros and voiceovers; and dub a recording into another language.

How is this different from a normal audio editor?

A normal editor gives you the timeline and the plugins and leaves the work to you. This one does the work. You describe the result; the AI plans the chain of operations, runs it, and shows you a version to approve. It covers the jobs you would otherwise do in a manual DAW or with a wall of ffmpeg commands, without you driving the tools.

What powers the editing under the hood?

A server-side ffmpeg engine handles the classic operations — cuts, fades, EQ, loudness, pitch, format conversion — and neural models handle the AI work: speech enhancement, stem separation, and voice-preserving translation. The AI agent decides which to run and in what order; you just say what you want.

Do I need editing skills?

No. There are no timelines or controls to learn; you just chat. The AI decides which tools to use, runs the whole workflow, and shows you the result as a new version you can play.

Can it remove background noise?

Yes. It uses neural speech enhancement to lift voice out of hiss, hum, and room noise, plus targeted fixes like de-hum, de-ess, de-click, de-plosive, and a noise gate — just ask it to “clean it up” or “make it podcast-ready”.

Can it split a song into stems?

Yes. Ask it to split a track and it separates vocals, drums, bass, guitar, piano, and other into individual stems — each with its own player and download. You can also extract or remove a single instrument, or strip the vocals for a karaoke version.
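For comparison, the same six stems can be produced by hand with the open-source Demucs CLI mentioned earlier. The sketch below only builds the command lines; `demucs_command` is a hypothetical helper, and whether the editor itself runs Demucs is not stated on this page.

```python
from typing import List, Optional


def demucs_command(track: str, only: Optional[str] = None) -> List[str]:
    """Build a Demucs CLI call: six stems by default, or one stem vs. the rest."""
    if only is None:
        # htdemucs_6s is Demucs's six-source model (adds guitar and piano).
        return ["demucs", "-n", "htdemucs_6s", track]
    # --two-stems writes the chosen stem plus a "no_<stem>" mix of everything else.
    return ["demucs", "--two-stems", only, track]


six_stems = demucs_command("song.mp3")
karaoke = demucs_command("song.mp3", only="vocals")
```

With `only="vocals"`, the "no_vocals" output is the karaoke backing track and the vocals file is the acapella.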

Can it translate or dub my audio?

Yes. Point it at a voice recording and ask for another language, and it produces a dubbed version that keeps the original speaker’s voice — translation and voice preserved together.

Can I undo a change?

Always. Every edit is a separate version you can play, A/B against the previous one, or roll back to. Editing is fully non-destructive, so your original is always intact.

What files can I use?

Drop MP3, WAV, M4A, FLAC, OGG, AAC, or audio from an MP4. You can also add several clips to one project. When you are done, download the result as MP3 or WAV.
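Done by hand, "audio from an MP4" is a single ffmpeg call that drops the video stream. A minimal sketch, with the hypothetical helper `extract_audio_command` and 16-bit WAV output chosen for the example rather than taken from the editor:

```python
from pathlib import Path
from typing import List


def extract_audio_command(video: str) -> List[str]:
    """Build an ffmpeg call that drops the video stream and writes 16-bit WAV."""
    out = str(Path(video).with_suffix(".wav"))
    # -vn disables video; pcm_s16le is uncompressed 16-bit PCM audio.
    return ["ffmpeg", "-i", video, "-vn", "-c:a", "pcm_s16le", out]


cmd = extract_audio_command("talk.mp4")
```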

Is it free?

You can start editing right away. AI processing is metered with credits, like the rest of Notevibes’ AI features.

Stop driving the tools. Just say it.

Drop a file and describe the edit. The AI does the hard part — and every change is a version you can undo.

Non-destructive — your original is never touched.