An Interpretable, Controllable Time-Varying IIR Denoiser for On-Device Assistive Hearing
Abstract
We present TVF (Time-Varying Filtering), an interpretable, low-latency speech enhancement model for real-time, on-device assistive hearing.
A lightweight neural controller predicts, in real time, the coefficients of a differentiable cascade of 35 second-order IIR filters (biquads), so the model tracks non-stationary noise while keeping a fully interpretable processing chain: every spectral modification is an explicit, adjustable equalizer curve rather than an opaque "black-box" transform.
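To make the idea of a time-varying biquad cascade concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes that coefficients are updated once per frame of `hop` samples, that each section is stored as `(b0, b1, b2, a1, a2)` with `a0` normalized to 1, and that sections use the transposed direct form II structure; the function name `tv_biquad_cascade` and these conventions are hypothetical, and a trainable version would use a differentiable framework rather than Python loops.

```python
from typing import List, Sequence, Tuple

# One biquad section: (b0, b1, b2, a1, a2), with a0 normalized to 1 (assumed layout).
Coeffs = Tuple[float, float, float, float, float]


def tv_biquad_cascade(
    x: Sequence[float],
    coeffs_per_frame: Sequence[Sequence[Coeffs]],
    hop: int,
) -> List[float]:
    """Filter x through a biquad cascade whose coefficients change every `hop` samples.

    coeffs_per_frame[f][k] holds the coefficients of section k during frame f.
    If x is longer than the frames provided, the last frame's coefficients are reused.
    """
    n_sections = len(coeffs_per_frame[0])
    # Transposed direct form II state: two values per section, carried across frames
    # so that coefficient updates do not reset the filter memory.
    state = [[0.0, 0.0] for _ in range(n_sections)]
    y: List[float] = []
    for n, sample in enumerate(x):
        frame = min(n // hop, len(coeffs_per_frame) - 1)
        v = sample
        for k, (b0, b1, b2, a1, a2) in enumerate(coeffs_per_frame[frame]):
            s1, s2 = state[k]
            out = b0 * v + s1
            state[k][0] = b1 * v - a1 * out + s2
            state[k][1] = b2 * v - a2 * out
            v = out  # the output of this section feeds the next one
        y.append(v)
    return y


# Usage: a single section acting as the one-pole filter y[n] = x[n] + 0.5 * y[n-1].
one_pole = [[(1.0, 0.0, 0.0, -0.5, 0.0)]]
impulse_response = tv_biquad_cascade([1.0, 0.0, 0.0], one_pole, hop=4)
```

In this sketch the controller's role would be to supply `coeffs_per_frame`; because each section is an ordinary equalizer-style filter, its frequency response at any frame can be inspected directly from those five numbers.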
Because the biquad cascade carries the signal processing, the controller can be made very small, driving the cascade with only 24k parameters at a 10.7 ms algorithmic latency, within hearing-aid budgets, and running entirely on-device so that audio never leaves the device.
We also expose the suppression-versus-preservation trade-off as an explicit control: it can be set during training through the loss weighting, and adjusted at inference, with no retraining, by mixing the noisy input with the denoised output.
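The inference-time control can be illustrated with a short sketch. The abstract states only that the noisy input is mixed with the denoised output; the linear blend and the single weight `alpha` below are assumptions, and `mix_output` is a hypothetical helper rather than part of TVF.

```python
from typing import List, Sequence


def mix_output(noisy: Sequence[float], denoised: Sequence[float], alpha: float) -> List[float]:
    """Blend the noisy input with the denoised output, sample by sample.

    alpha = 0.0 returns the unprocessed input (maximum preservation);
    alpha = 1.0 returns the denoiser output (maximum suppression).
    The linear form is an assumption made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(noisy) != len(denoised):
        raise ValueError("noisy and denoised must have the same length")
    return [(1.0 - alpha) * n + alpha * d for n, d in zip(noisy, denoised)]


# Usage: an even blend of the two signals.
half = mix_output([1.0, 2.0], [0.0, 0.0], alpha=0.5)
```

A control of this kind needs no retraining because it acts only on the two signals already available at inference time.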
On hearing-aid metrics (HASPI/HASQI), the 24k-parameter model stays within about 0.02 of DFNet3 (2.3M parameters, almost two orders of magnitude larger) while using about 29× fewer multiply-accumulates. Larger black-box models still lead on reference metrics such as PESQ.
We present TVF as a proof of concept for a compact, interpretable, and controllable denoiser for on-device assistive hearing.