Speech to speech: much more than voice cloning in audiovisual content and broadcast applications

Tech Papers 2025: Speech-to-speech (STS) conversion is best known for transforming one voice into another. However, its applications extend well beyond straightforward voice replacement. This paper first demystifies STS, explaining how contemporary models analyse prosody, embed timbre, track pitch and resynthesise the spectrum to blend a driving performance with a target voice.

On that foundation we present a compact, repeatable workflow that transforms a few minutes of clean speech into a deployable model within a single workday. Five illustrative case studies cover reviving archival voices, rescuing noisy location dialogue, executing micro-ADR (additional dialogue recording) text fixes, crafting hybrid character timbres and cross-synthesising musical or animal sounds. Together they demonstrate how the method compresses turnaround times and trims ADR, sound effects and scheduling costs, making it ideal for the fast-paced demands of film and television. Although these examples are not STS's primary commercial drivers, they show how the technology can extend a production's sonic palette and catalyse innovative, cost-efficient storytelling.

Latest Technical paper

Automatic quality control of broadcast audio

Tech Papers 2025: This paper describes work undertaken as part of the AQUA project, funded by Innovate UK, to address shortfalls in automated audio QC processes with an automated software solution for both production and distribution of audio content, on premises or in the cloud.
