TTS plugin


There may be some VSTs that do TTS, but a built-in plugin would be nice for adding speech to your songs.
@musikbear plz!!!
OpticalBit_ALT wrote:
Fri Nov 29, 2024 1:28 am
There may be some VSTs that do TTS, but a built-in plugin would be nice for adding speech to your songs.
@musikbear plz!!!
That would be extremely difficult. The text-to-speech would have to be rhythmically synchronized and also have a natural-sounding tone. I have stopped saying "impossible", because one week later there is an AI that does what I thought was impossible a few days earlier :p, but it is not something the LMMS team can code. I also do not think any of the other DAWs has a feature like this (?)

No plugin required. This is entirely possible by parsing Software Automatic Mouth (SAM, a 1980s speech synthesiser that didn't use filters) into Xpressive time-domain expressions. There are a lot of reverse-engineered SAMs on GitHub. I am still learning and have had limited success so far. Realistically, the "rendering / parsing" could be done externally in Python, Matlab or C++, with the resulting expression manually pasted into Xpressive.

Paste this proof of concept into Xpressive on the Alpha release and play at C2 (the syntax has changed on the nightly build).

Code:

clamp(-1, floor((((t >= 0 & t < 0.06) * (0.9 * randv(t*480)) * (min(1, (t-0)*120) * min(1, (0.06-t)*120))) + ((t >= 0.06 & t < 0.2616) * ((0.75*sinew(integrate(370.5)) + 0.5*sinew(integrate(1306.5)) + 0.25*sinew(integrate(1774.5))) * (0.8 * (1 - mod(t*f*1.1, 1)))) * (min(1, (t-0.06)*50) * min(1, (0.2616-t)*50))) + ((t >= 0.2616 & t < 0.3336) * (0.9 * randv(t*480)) * (min(1, (t-0.2616)*120) * min(1, (0.3336-t)*120))) + ((t >= 0.3336 & t < 0.5136) * ((0.6*sinew(integrate(117)) + 0.4*sinew(integrate(1053)) + 0.2*sinew(integrate(2359.5))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-0.3336)*50) * min(1, (0.5136-t)*50))) + ((t >= 0.5136 & t < 0.7536) * ((0.75*sinew(integrate(351)) + 0.5*sinew(integrate(585)) + 0.25*sinew(integrate(1716))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-0.5136)*50) * min(1, (0.7536-t)*50))) + ((t >= 0.7536 & t < 0.9336) * ((0.6*sinew(integrate(117)) + 0.4*sinew(integrate(897)) + 0.2*sinew(integrate(1579.5))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-0.7536)*50) * min(1, (0.9336-t)*50))) + ((t >= 0.9336 & t < 1.0776) * ((0.6*sinew(integrate(175.5)) + 0.4*sinew(integrate(1618.5)) + 0.2*sinew(integrate(2145))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-0.9336)*50) * min(1, (1.0776-t)*50))) + ((t >= 1.0776 & t < 1.3044) * ((0.75*sinew(integrate(253.5)) + 0.5*sinew(integrate(663)) + 0.25*sinew(integrate(1599))) * (0.8 * (1 - mod(t*f*1.1, 1)))) * (min(1, (t-1.0776)*50) * min(1, (1.3044-t)*50))) + ((t >= 1.3044 & t < 1.4244) * ((0.5*sinew(integrate(175.5)) + 0.3*sinew(integrate(994.5)) + 0.15*sinew(integrate(1813.5))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-1.3044)*50) * min(1, (1.4244-t)*50))) + ((t >= 1.4244 & t < 1.6044) * ((0.75*sinew(integrate(273)) + 0.5*sinew(integrate(1423.5)) + 0.25*sinew(integrate(1813.5))) * (0.8 * (1 - mod(t*f*1.05, 1)))) * (min(1, (t-1.4244)*50) * min(1, (1.6044-t)*50))) + ((t >= 1.6044 & t < 1.6764) * (0.9 * randv(t*480)) * (min(1, (t-1.6044)*120) * min(1, (1.6764-t)*120)))) * 16)/16, 1)
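
The external "rendering" step could look roughly like the Python sketch below. It is not a SAM parser: it assumes each phoneme has already been reduced to a time window plus either a noise burst or three sine partials, and it only assembles the Xpressive text in the same shape as the expression above. The helper names (gate, fade, noise_segment, voiced_segment, build_expression) are made up for illustration, and the two segments in the demo are copied from the start of the posted expression.

```python
# Sketch: assemble an Xpressive expression string from per-phoneme segments.
# All helper names are hypothetical; the output mirrors the posted expression.

def gate(start, end):
    """1 inside [start, end), 0 outside, written as Xpressive text."""
    return f"(t >= {start:g} & t < {end:g})"

def fade(start, end, rate):
    """Linear ramp up after `start` and down before `end`, capped at 1."""
    return f"(min(1, (t-{start:g})*{rate:g}) * min(1, ({end:g}-t)*{rate:g}))"

def noise_segment(start, end, level=0.9, rate=120):
    """Unvoiced segment: scaled randv noise, gated and faded."""
    return (f"({gate(start, end)} * ({level:g} * randv(t*480))"
            f" * {fade(start, end, rate)})")

def voiced_segment(start, end, formants, pitch_mult=1.05, rate=50):
    """Voiced segment: sum of sine partials times a falling ramp at t*f*pitch_mult.

    `formants` is a list of (amplitude, frequency_hz) pairs.
    """
    tones = " + ".join(
        f"{amp:g}*sinew(integrate({freq:g}))" for amp, freq in formants
    )
    ramp = f"(0.8 * (1 - mod(t*f*{pitch_mult:g}, 1)))"
    return (f"({gate(start, end)} * (({tones}) * {ramp})"
            f" * {fade(start, end, rate)})")

def build_expression(segments, levels=16):
    """Sum the segments, quantise to 1/levels steps, clamp to [-1, 1]."""
    body = " + ".join(segments)
    return f"clamp(-1, floor(({body}) * {levels})/{levels}, 1)"

if __name__ == "__main__":
    # First two segments of the posted proof of concept.
    segments = [
        noise_segment(0, 0.06),
        voiced_segment(
            0.06, 0.2616,
            [(0.75, 370.5), (0.5, 1306.5), (0.25, 1774.5)],
            pitch_mult=1.1,
        ),
    ]
    print(build_expression(segments))
```

Keeping each phoneme as plain data (start, end, partials) means the timing can be changed later without editing the long expression by hand.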

Edit: I haven't implemented any of the rhythmic or timing ideas that musikbear has pointed out. Yes, I understand that getting the speech to sing and have rhythm is an even bigger task.

ewanpettigrew wrote:
Wed Jan 14, 2026 11:14 pm

No plugin required. This is entirely possible through parsing Software Automatic Mouth (a 1980s speech synthesiser that didn't use filters) to Xpressive time domain expressions.

Interesting! Thanks for the heads up!