Neural networks can tease apart the different sounds in musical recordings.
What’s new: Companies and hobbyists are using deep learning to separate voices and instruments in commercial recordings, Wired reported. The process can improve the sound of old recordings and open new possibilities for sampling, mash-ups, and other fresh uses.
How it works: Finished recordings often combine voices, instruments, and other sounds recorded in a multitrack format into a smaller number of audio channels, say one for mono or two for stereo. The mingling of signals limits how much the sonic balance can be changed afterward, but neural networks have learned to disentangle individual sounds — including noise and distortion — so they can be rebalanced or removed without access to the multitrack recordings.
- Audio Research Group was founded by an audio technician at Abbey Road Studios, who developed a deep learning system to remix the Beatles’ 1964 hit, “She Loves You,” which was produced without multitracking.
- Audionamix separates mono recordings into tracks for vocals, drums, bass guitar, and other sounds. The service has been used to manipulate old recordings for use in commercials and to purge television and film soundtracks of music that was inadvertently captured in the background and would be expensive to license.
- French music streaming service Deezer offers Spleeter, an open-source system that unmixes recordings (pictured above). Users have scrubbed vocals to produce custom karaoke tracks, created oddball mash-ups, and cleansed their own recordings of unwanted noises.
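The core idea behind these systems can be illustrated with a toy sketch: transform a mixture into the time-frequency domain, apply a mask that keeps only the bins belonging to one source, and transform back. Tools like Spleeter learn soft masks with a neural network; this minimal example, which assumes only NumPy and SciPy, uses a hand-made binary mask to pull a low tone out of a two-tone mixture.

```python
import numpy as np
from scipy.signal import stft, istft

sr = 8000
t = np.arange(sr) / sr                # 1 second of samples
low = np.sin(2 * np.pi * 220 * t)     # stand-in for a "bass" source
high = np.sin(2 * np.pi * 2200 * t)   # stand-in for a "vocal" source
mix = low + high                      # the mixed-down "recording"

# Move the mixture into the time-frequency domain
freqs, _, Z = stft(mix, fs=sr, nperseg=512)

# Hand-made binary mask: keep only bins below 1 kHz to isolate the low tone
# (a learned separator would predict a soft mask per source instead)
mask = (freqs < 1000)[:, None]
_, rec = istft(Z * mask, fs=sr, nperseg=512)

# The recovered signal should track the original low tone closely
rec = rec[: len(low)]
corr = np.corrcoef(rec, low)[0, 1]
print(f"correlation with original low tone: {corr:.3f}")
```

With well-separated frequencies the correlation comes out close to 1.0; real music overlaps heavily in frequency, which is why hand-made masks fail there and learned, time-varying masks are needed.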
Why it matters: Many worthwhile recordings are distorted or obscured by noises like an audience’s cheers or analog tape hiss, making the quality of the musicianship difficult to hear. Others could simply use a bit of buffing after decades of improvement in playback equipment. AI-powered unmixing can upgrade such recordings as well as inspire new uses for old sounds.
We’re thinking: Endless remixes of our favorite Taylor Swift tracks? We like the sound of that!