openWakeWord for Rhasspy

openWakeWord is an open-source library for detecting common wake-words like "alexa", "hey mycroft" and "hey jarvis", with models for other wake-words also available. Rhasspy is an open-source voice assistant.

This project runs openWakeWord as a stand-alone service, receives audio from Rhasspy via UDP, detects when a wake-word is spoken, and then notifies Rhasspy using the Hermes MQTT protocol.

Why

I run Rhasspy in Base/Satellite mode. Currently each Satellite captures audio, does the wake-word detection locally, and streams audio to the Base, which does everything else. The Pi4 Satellites run the Rhasspy Docker container, launched with Docker Compose. The Base Rhasspy container runs on a more powerful i7, which also runs other home automation software.

Running openWakeWord in Docker eases distribution and setup (Python dependencies) and allows openWakeWord to develop at a separate pace from Rhasspy, instead of being bundled and released with it. A single instance of openWakeWord centralises configuration and gives lower-power satellites (e.g. ESP32s) richer wake-word options.

In the future I plan to add a web UI for configuration: which words to detect, thresholds, custom verifier models and maybe speaker identification. It could also include live visualisation for testing and diagnostics.

Installation

Docker

Using Docker CLI

docker run -d --name openwakeword -p 12202:12202/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy

In docker-compose.yml (or a Docker Swarm stack file)

  openwakeword:
    image: dalehumby/openwakeword-rhasspy
    restart: always
    ports:
      - "12202:12202/udp"
    volumes:
      - /path/to/config:/config

Python local install

For testing and experimentation you can run this project locally:

1. Clone the repo: git clone git@github.com:dalehumby/openWakeWord-rhasspy.git
2. Create a Python virtual environment (optional):
   - python3 -m venv env
   - source env/bin/activate
3. Install the requirements: pip3 install -r requirements.txt
4. Complete the Configuration section below
5. Run: python3 detect.py

Configuration

Create a file called config.yaml, for example with nano /path/to/config/config.yaml
Paste the contents of config.yaml.example into config.yaml to get started, then adjust the sections described below

UDP Ports

Rhasspy streams audio from its microphone to openWakeWord over the network using the UDP protocol. On each Rhasspy device that has a microphone attached (typically a Satellite), go to Rhasspy - Settings - Audio Recording and, in UDP Audio (Output), insert the IP address of the host that's running openWakeWord and choose a port number, usually starting at 12202. If you have multiple Rhasspy devices, each device needs its own port number: 12202, 12203, 12204, etc.

The openWakeWord config.yaml would then have:

udp_ports:
  base: 12202
  kitchen: 12203
  bedroom: 12204
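
As a rough illustration of how a service could use this mapping, the sketch below opens one UDP socket per entry in udp_ports and tags each received datagram with the name of the device it came from. It is not the project's detect.py: the function names open_listeners and receive_once are hypothetical, and the sketch makes no assumption about the audio format inside each datagram.

```python
import selectors
import socket


def open_listeners(udp_ports, host="0.0.0.0"):
    """Bind one UDP socket per device and register it with a selector.

    udp_ports maps a device name to a port number, mirroring the
    udp_ports section of config.yaml (hypothetical helper, not detect.py).
    """
    selector = selectors.DefaultSelector()
    for name, port in udp_ports.items():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind((host, port))
        sock.setblocking(False)
        # Store the device name so each datagram can be attributed later.
        selector.register(sock, selectors.EVENT_READ, data=name)
    return selector


def receive_once(selector, timeout=1.0, bufsize=65535):
    """Return a list of (device name, raw datagram bytes) that are ready."""
    received = []
    for key, _events in selector.select(timeout):
        payload, _sender = key.fileobj.recvfrom(bufsize)
        received.append((key.data, payload))
    return received
```

With the configuration above, open_listeners({"base": 12202, "kitchen": 12203, "bedroom": 12204}) would listen on all three ports, and each call to receive_once would report which device any waiting audio came from.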

If you are using Docker you need to open the ports to allow UDP network traffic into the container.

Using Docker CLI

docker run -d --name openwakeword -p 12202:12202/udp -p 12203:12203/udp -p 12204:12204/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy

Or in docker-compose.yml

  openwakeword:
    image: dalehumby/openwakeword-rhasspy
    restart: always
    ports:
      - "12202:12202/udp"  # base
      - "12203:12203/udp"  # kitchen
      - "12204:12204/udp"  # bedroom
      # ... etc
    volumes:
      - /path/to/config:/config

MQTT

openWakeWord notifies Rhasspy that a wake-word has been spoken using the Hermes MQTT protocol. The MQTT broker needs to be accessible by both Rhasspy and openWakeWord. Rhasspy's internal MQTT broker is not reachable from outside of Rhasspy, so you will need to run your own, like Eclipse Mosquitto.

Once the broker is running, go to Rhasspy - Settings - MQTT. Choose External broker, then set the IP address of the host that the broker is running on, the port number, and the username/password if required.

The openWakeWord config.yaml would then have:

mqtt:
  broker: 10.0.0.10
  port: 1883
  username: yourusername  # Delete row if not required
  password: yourpassword  # Delete row if not required

On each Rhasspy device, go to Rhasspy - Settings - Wake Word and select Hermes MQTT.
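
For reference, a wake-word notification in the Hermes protocol is a JSON message published to the topic hermes/hotword/<wakewordId>/detected. The sketch below builds such a message using only the standard library. The helper name hotword_detected_message is hypothetical, the field values are illustrative, and this is not necessarily the exact payload that detect.py sends.

```python
import json


def hotword_detected_message(wakeword_id, site_id, sensitivity=0.5):
    """Build the topic and JSON payload for a Hermes hotword detection.

    Hypothetical helper for illustration; values such as modelVersion
    and the default sensitivity are placeholders, not project defaults.
    """
    topic = f"hermes/hotword/{wakeword_id}/detected"
    payload = {
        "modelId": wakeword_id,
        "modelVersion": "",
        "modelType": "universal",
        "currentSensitivity": sensitivity,
        "siteId": site_id,  # the Rhasspy device that heard the wake-word
    }
    return topic, json.dumps(payload)


topic, payload = hotword_detected_message("alexa", "kitchen")
```

The resulting topic and payload could then be published with any MQTT client connected to the broker configured above. The siteId should match the site ID of the Rhasspy device that is meant to wake up.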

openWakeWord

openWakeWord listens for all wake-words like "alexa", "hey mycroft", "hey jarvis", and others. These settings ensure Rhasspy is only activated once per wake-word, and reduce false activations.

oww:
  activation_samples: 3  # Number of samples in moving average
  activation_threshold: 0.7  # Trigger wakeword when average above this threshold
  deactivation_threshold: 0.2  # Do not trigger again until average falls below this threshold
  # OWW config, see https://github.com/dscripka/openWakeWord#recommendations-for-usage
  vad_threshold: 0.5
  enable_speex_noise_suppression: false

In the example above, the wake-word confidence scores for the latest 3 audio samples received over UDP are averaged together, and if the average confidence that a wake-word has been spoken is above 0.7 (70%), then Rhasspy is notified. Rhasspy will not be notified again until the average confidence drops below 0.2 (20%), i.e. the wake-word has ended.
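
The behaviour described above is a moving average with hysteresis. The following is a minimal sketch of that idea using the three settings from the example. The class name WakeWordTrigger is hypothetical and the code is not taken from detect.py. One simplifying assumption: until the window is full, the average is taken over however many scores have arrived so far.

```python
from collections import deque


class WakeWordTrigger:
    """Moving average with hysteresis over per-sample confidence scores."""

    def __init__(self, activation_samples=3, activation_threshold=0.7,
                 deactivation_threshold=0.2):
        # Only the most recent activation_samples scores are kept.
        self.scores = deque(maxlen=activation_samples)
        self.activation_threshold = activation_threshold
        self.deactivation_threshold = deactivation_threshold
        self.active = False

    def update(self, score):
        """Add a score; return True only at the moment the wake-word fires."""
        self.scores.append(score)
        average = sum(self.scores) / len(self.scores)
        if not self.active and average > self.activation_threshold:
            self.active = True
            return True
        if self.active and average < self.deactivation_threshold:
            # Average has fallen far enough: allow the next activation.
            self.active = False
        return False


trigger = WakeWordTrigger()
scores = [0.1, 0.9, 0.95, 0.9, 0.8, 0.3, 0.05, 0.05, 0.05, 0.9, 0.9, 0.9]
fired = [i for i, s in enumerate(scores) if trigger.update(s)]
# fired == [3, 11]: one notification per spoken wake-word
```

In this made-up sequence the scores stay high for several samples, yet only one notification is produced for each wake-word, because the trigger stays latched until the average falls below the deactivation threshold.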

Settings for voice activity detection (VAD) and noise suppression are also provided. See openWakeWord's Recommendations for Usage.

Contributing

Feel free to open an Issue if you have a problem, need help or have an idea. PRs are always welcome.