These days CPUs and GPUs are powerful enough to process voice and find patterns in it.
We have SSDs and lots of RAM.
Why can't we just purchase the software and run it locally? We could download command packages as we see fit, e.g. a TV package, a Hi-Fi voice package, a car package, etc.
It would be safe, with everything stored on a local cloud: a NAS could handle the external communication and process this stuff. Or a dedicated desktop could.
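
To make the package idea a bit more concrete, here is a rough sketch of how locally installed command packages might be looked up. It assumes the speech has already been turned into text on the local machine, and every name in it (`CommandPackage`, `LocalAssistant`, the phrases and action strings) is made up for illustration, not taken from any real product.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CommandPackage:
    """Hypothetical downloadable bundle of voice commands for one device type."""
    name: str
    commands: Dict[str, str]  # spoken phrase -> action identifier


@dataclass
class LocalAssistant:
    """Holds installed packages and matches transcribed text against them locally."""
    packages: Dict[str, CommandPackage] = field(default_factory=dict)

    def install(self, package: CommandPackage) -> None:
        # "Downloading" a package is modelled as simply registering it.
        self.packages[package.name] = package

    def handle(self, transcript: str) -> Optional[Tuple[str, str]]:
        # Normalise case and whitespace, then look for an exact phrase match.
        phrase = " ".join(transcript.lower().split())
        for package in self.packages.values():
            action = package.commands.get(phrase)
            if action is not None:
                return (package.name, action)
        return None  # no installed package knows this phrase


# Install only the packages we want, all kept on the local machine.
assistant = LocalAssistant()
assistant.install(CommandPackage("tv", {"volume up": "tv.volume_up", "turn off": "tv.power_off"}))
assistant.install(CommandPackage("car", {"start engine": "car.start"}))

print(assistant.handle("Volume up"))
```

Nothing here leaves the machine: a phrase is only understood if the matching package has been installed, which is the point of buying packages separately.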