adirik/styletts2

Generates speech from text

Input

Configure the inputs for the AI model.

Beta

Used only for long text inputs or when a reference speaker is provided. Determines the prosody of the speaker. Lower values sample the style from the previous or reference speech rather than from the text.

Seed

Seed for reproducibility

Text *

Text to convert to speech

Alpha

Used only for long text inputs or when a reference speaker is provided. Determines the timbre of the speaker. Lower values sample the style from the previous or reference speech rather than from the text.
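The Alpha and Beta descriptions read like a weighted mix between a style predicted from the text and a style taken from the previous or reference speech. The sketch below illustrates that idea only. It assumes a simple linear blend and a weight in the range 0 to 1, neither of which is stated on this page, and `blend_style` is a hypothetical helper, not part of the model's API.

```python
from typing import List


def blend_style(text_style: List[float],
                reference_style: List[float],
                weight: float) -> List[float]:
    """Blend two style vectors (hypothetical helper, illustration only).

    weight = 1.0 keeps only the text-derived style;
    weight = 0.0 keeps only the previous/reference speech style.
    The [0, 1] range and the linear form are assumptions.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if len(text_style) != len(reference_style):
        raise ValueError("style vectors must have the same length")
    return [weight * t + (1.0 - weight) * r
            for t, r in zip(text_style, reference_style)]


# Toy vectors: a lower weight pulls the result toward the reference.
text_style = [1.0, 2.0]
reference_style = [3.0, 6.0]
low = blend_style(text_style, reference_style, 0.0)   # reference only
mid = blend_style(text_style, reference_style, 0.5)   # even mix
high = blend_style(text_style, reference_style, 1.0)  # text only
```

Under this reading, Alpha would play the role of `weight` for the timbre part of the style and Beta for the prosody part.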

Weights

Replicate weights URL for inference with a model fine-tuned on new speakers. If provided, a reference speech must also be provided. If omitted, the default model is used.

Reference

Reference speech to copy style from

Diffusion Steps

Number of diffusion steps

Embedding Scale

Embedding scale. Use higher values for more pronounced emotion.
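The inputs above could be collected into a single payload before calling the model. This is a minimal sketch: the snake_case key names (`text`, `alpha`, `beta`, `seed`, `weights`, `reference`, `diffusion_steps`, `embedding_scale`) are assumptions derived from the field labels, and `build_input` is a hypothetical helper. It enforces the two rules this page states: text is required, and weights need a reference speech.

```python
from typing import Any, Dict, Optional


def build_input(text: str,
                *,
                alpha: Optional[float] = None,
                beta: Optional[float] = None,
                seed: Optional[int] = None,
                weights: Optional[str] = None,
                reference: Optional[str] = None,
                diffusion_steps: Optional[int] = None,
                embedding_scale: Optional[float] = None) -> Dict[str, Any]:
    """Assemble an input payload (hypothetical helper; key names are assumed)."""
    if not text:
        raise ValueError("text is required")
    # Fine-tuned weights must be accompanied by a reference speech.
    if weights is not None and reference is None:
        raise ValueError("a reference speech is required when weights are provided")
    candidates = {
        "text": text,
        "alpha": alpha,
        "beta": beta,
        "seed": seed,
        "weights": weights,
        "reference": reference,
        "diffusion_steps": diffusion_steps,
        "embedding_scale": embedding_scale,
    }
    # Leave out unset fields so the model's own defaults apply.
    return {key: value for key, value in candidates.items() if value is not None}


payload = build_input("Hello from StyleTTS 2.", seed=42, embedding_scale=1.5)
```

A payload built this way could then be passed as the `input` argument of `replicate.run` from the Replicate Python client, together with the model identifier. Check the key names against the model's published schema first.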
