The project is on GitHub:
https://github.com/bckpkol/stft_voice_t ... r_denoiser
It tries to detect the key of your voice and correct its pitch. It can also subtract ambient noise (you need to record some silence for that) and apply saturation.
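For readers unfamiliar with the noise subtraction step, here is a minimal sketch of how an STFT-based subtraction from a silence recording can work. This is not the project's actual code: the function names `stft` and `subtract_noise`, the `strength` parameter, and the frame sizes are all assumptions made for illustration, and plain magnitude spectral subtraction is only one possible approach.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Hann-windowed STFT; returns a complex array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def subtract_noise(signal_spec, silence_spec, strength=1.0):
    """Magnitude spectral subtraction (hypothetical helper, not from the project).

    The recorded silence gives an average noise magnitude per frequency bin,
    which is subtracted from every frame of the voice recording. Magnitudes
    are floored at zero and the original phase is kept.
    """
    noise_mag = np.abs(silence_spec).mean(axis=0)   # one value per frequency bin
    mag = np.abs(signal_spec)
    phase = np.angle(signal_spec)
    cleaned = np.maximum(mag - strength * noise_mag, 0.0)
    return cleaned * np.exp(1j * phase)

# Synthetic stand-ins for a voice take and a silence take.
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
silence = 0.05 * rng.standard_normal(sr)
voice = np.sin(2 * np.pi * 220 * t) + 0.05 * rng.standard_normal(sr)

cleaned_spec = subtract_noise(stft(voice), stft(silence))
```

The longer the silence recording, the more stable the per-bin noise estimate becomes, which is why recording silence first matters.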
Saturation is applied in the WAV write pass, so the PNG it generates isn't saturated.
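To illustrate why the image is unaffected, here is a rough sketch of saturation applied only at write time. It assumes a tanh soft clipper and 16-bit mono output; the names `saturate`, `write_wav` and the `drive` parameter are hypothetical, and the project may use a different curve.

```python
import wave
import numpy as np

def saturate(x, drive=2.0):
    """Soft-clip with tanh, normalised so a full-scale input stays at full scale."""
    return np.tanh(drive * x) / np.tanh(drive)

def write_wav(path, samples, sample_rate=44100, drive=2.0):
    """Saturate float samples in [-1, 1] and write them as 16-bit mono PCM.

    The input array is not modified, so anything rendered from it
    (such as a spectrogram image) stays unsaturated.
    """
    shaped = saturate(np.asarray(samples, dtype=np.float64), drive)
    pcm = (np.clip(shaped, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)          # 2 bytes = 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return shaped
```

Because the curve is applied inside the writer, the processed signal used for the image and the audio on disk differ only by this last step.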