I haven’t looked into it but couldn’t someone just use an LLM for natural language processing and feed that to a home assistant? Like prompt it with “break up individual commands and pass to the assistant” so when I say “living room lights on and bedroom lights off” the fucking thing does it instead of “huh? I’m a moron.”
People run whisper on HA, and intention mapping packages already exist. They’ve been around for probably a decade already. Pretty hit or miss… mostly because there isn’t a ton of flexibility in the structure of the commands you can issue them.
If someone wanted to use an online LLM to translate a complex whisper transcription into something an existing intention mapper would handle well, that’s closer to a day’s worth of goofing around than a year. I actually refuse to believe it hasn’t already been done.
And if you’re using an online LLM to do that translation, I don’t see why that couldn’t be behind a paywall either.
Honestly for this task, I imagine offline models would be sufficient.
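The "day's worth of goofing around" version is roughly: prompt a model to split a compound utterance into atomic commands, then feed each one to Home Assistant's conversation API. A minimal sketch below — the prompt wording, the parsing fallback, and the HA URL/token are assumptions, not a tested config; only the `/api/conversation/process` endpoint is real HA.

```python
import json
import urllib.request

# Hypothetical prompt for whichever online or offline LLM you point this at.
SPLIT_PROMPT = (
    "Break the following utterance into individual smart-home commands. "
    "Reply with a JSON array of strings and nothing else.\n\nUtterance: {utterance}"
)

def parse_commands(llm_reply: str) -> list[str]:
    """Pull a JSON array out of the model's reply, tolerating stray text
    around it; fall back to treating the whole reply as one command."""
    start, end = llm_reply.find("["), llm_reply.rfind("]")
    if start == -1 or end == -1:
        return [llm_reply.strip()]
    return [c.strip() for c in json.loads(llm_reply[start : end + 1])]

def send_to_assistant(command: str, ha_url: str, token: str) -> None:
    """Hand one simple command to Home Assistant's conversation API,
    which the existing intention mapping then handles as usual."""
    req = urllib.request.Request(
        f"{ha_url}/api/conversation/process",
        data=json.dumps({"text": command}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    urllib.request.urlopen(req)  # fire-and-forget is fine for a sketch
```

So "living room lights on and bedroom lights off" comes back from the model as `["living room lights on", "bedroom lights off"]`, and each string goes through `send_to_assistant` as a command the mapper already copes with. An offline model only has to manage the splitting step, which is why it’s plausibly sufficient.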
It seems stupid to even try to get an LLM to do deterministic tasks. You’ll harm the product by trying to make it something that it isn’t.
Seems much more sensible to give it modules that perform the tasks and let it access those, which I believe is how Google Assistant used to work.
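The modules idea amounts to function calling: the LLM only picks a tool and fills in its arguments, and a plain deterministic function does the actual work. A tiny sketch, assuming an OpenAI-style tool-call shape (`name` plus `arguments`); the `set_light` module is made up for illustration.

```python
def set_light(room: str, state: str) -> str:
    # Deterministic module: nothing creative happens in here.
    return f"{room} lights turned {state}"

# Registry of modules the model is allowed to invoke.
TOOLS = {"set_light": set_light}

def dispatch(tool_call: dict) -> str:
    """Run the tool the model selected, with the arguments it filled in.
    The LLM never executes anything itself; it only routes."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])

# e.g. dispatch({"name": "set_light",
#                "arguments": {"room": "bedroom", "state": "off"}})
```

The point is exactly the one above: the nondeterministic part is confined to choosing the module and its parameters, so the LLM never gets a chance to botch the deterministic task itself.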
I haven’t looked into it but couldn’t someone just use an LLM for natural language processing and feed that to a home assistant? Like prompt it with “break up individual commands and pass to the assistant” so when I say “living room lights on and bedroom lights off” the fucking thing does it instead of “huh? I’m a moron.”
They could, but they wouldn’t be able to trap that functionality behind a paywall, so they’re not interested.
It can get done by anyone with an open model in way less than a year, using the approach you’ve described.