@ericc/edge-tts

Generate speech audio from text using Microsoft Edge's text-to-speech API.

Heavily inspired by rany2/edge-tts and SchneeHertz/node-edge-tts

Features

  • Uses standard web APIs, so it should work in any modern JS environment
  • Provides subtitle/caption data

Installation

Using npm:

npm install @echristian/edge-tts

Using pnpm:

pnpm install @echristian/edge-tts

Basic Usage

// Web
import { generate } from "@echristian/edge-tts";

const { audio, subtitle } = await generate({
  text: "Hello, world!",
  voice: "en-US-JennyNeural",
  language: "en-US",
});

// Create an audio element and play the generated audio
const audioElement = new Audio(URL.createObjectURL(audio));
audioElement.play();

// Access subtitle data
console.log(subtitle);
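
The subtitle data can be turned into a caption file for the generated audio. The following is a minimal sketch, not part of the library: it assumes each cue is an object of the hypothetical shape { text, start, end } with times in milliseconds, and the helper names toTimestamp and toWebVTT are made up for illustration. Check the value logged above for the real shape before adapting it.

```javascript
// ASSUMPTION: cues look like { text: string, start: number, end: number },
// with start/end in milliseconds. The real subtitle data may differ.

// Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm).
function toTimestamp(ms) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3_600_000);
  const minutes = Math.floor((ms % 3_600_000) / 60_000);
  const seconds = Math.floor((ms % 60_000) / 1000);
  const millis = Math.floor(ms % 1000);
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}.${pad(millis, 3)}`;
}

// Build a WebVTT document: a "WEBVTT" header followed by one block per cue.
function toWebVTT(cues) {
  const blocks = cues.map(
    (cue) => `${toTimestamp(cue.start)} --> ${toTimestamp(cue.end)}\n${cue.text}`,
  );
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}
```

In a browser, the resulting string could be wrapped in a Blob and attached to a track element alongside the audio.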

Options

GenerateOptions

Options that will be sent alongside the websocket request:

ParseSubtitleOptions

Options for parsing the subtitle:

  • splitBy (required): How the cues are split
    • "sentence": splits the text into sentences using Intl.Segmenter
    • "word": splits the text into cues of count words each
    • "duration": splits the text into cues of count milliseconds each
  • count (optional): Used when splitting by "word" or "duration"
    • When splitting by "word", count is the number of words per cue
    • When splitting by "duration", count is the duration of each cue in milliseconds
  • metadata (required): Array of metadata received throughout the websocket connection
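
The three splitBy modes can be illustrated with a small standalone sketch. This is not the library's implementation: the entry shape { text, start, end } (milliseconds) and the function names splitByWord, splitByDuration, and splitSentences are assumptions made for the example, and the real websocket metadata may be shaped differently. Only Intl.Segmenter is a standard API.

```javascript
// ASSUMPTION: hypothetical word-timing entries, times in milliseconds.
const words = [
  { text: "Hello,", start: 0, end: 400 },
  { text: "world!", start: 400, end: 900 },
  { text: "How", start: 1200, end: 1400 },
  { text: "are", start: 1400, end: 1600 },
  { text: "you?", start: 1600, end: 2000 },
];

// Merge a run of word entries into a single cue.
const toCue = (group) => ({
  text: group.map((w) => w.text).join(" "),
  start: group[0].start,
  end: group[group.length - 1].end,
});

// "word": each cue holds at most `count` words.
function splitByWord(entries, count) {
  if (!(count > 0)) throw new RangeError("count must be positive");
  const cues = [];
  for (let i = 0; i < entries.length; i += count) {
    cues.push(toCue(entries.slice(i, i + count)));
  }
  return cues;
}

// "duration": start a new cue when adding the next word would make
// the current cue span more than `count` milliseconds.
function splitByDuration(entries, count) {
  const cues = [];
  let group = [];
  for (const entry of entries) {
    if (group.length > 0 && entry.end - group[0].start > count) {
      cues.push(toCue(group));
      group = [];
    }
    group.push(entry);
  }
  if (group.length > 0) cues.push(toCue(group));
  return cues;
}

// "sentence": Intl.Segmenter locates sentence boundaries in plain text.
function splitSentences(text, locale = "en-US") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "sentence" });
  return Array.from(segmenter.segment(text), (s) => s.segment.trim());
}
```

With the sample data, splitByWord(words, 2) yields three cues and splitByDuration(words, 1000) yields two. A single word longer than count still gets its own cue in this sketch, which is one of several reasonable ways to handle that case.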

Credits
