[![Docker Image CI](https://github.com/dalehumby/openWakeWord-rhasspy/actions/workflows/docker-image.yml/badge.svg)](https://github.com/dalehumby/openWakeWord-rhasspy/actions/workflows/docker-image.yml)

# openWakeWord for Rhasspy 

[openWakeWord](https://github.com/dscripka/openWakeWord) is an open-source library for detecting common wake-words like "alexa", "hey mycroft", "hey jarvis", and [other models](https://github.com/dscripka/openWakeWord#pre-trained-models). [Rhasspy](https://rhasspy.readthedocs.io/en/latest/) is an open-source voice assistant.

This project runs openWakeWord as a stand-alone service, receives audio from Rhasspy via UDP, detects when a wake-word is spoken, and notifies Rhasspy using the Hermes MQTT protocol.

## Why
I run Rhasspy in [Base/Satellite mode](https://rhasspy.readthedocs.io/en/latest/tutorials/#server-with-satellites). Currently each Satellite captures audio, does the wake-word detection locally, and streams audio to the Base, which does everything else. The Pi4 satellites run the Rhasspy Docker container, launched with [compose](https://github.com/dalehumby/rhasspy-config/blob/main/satellite-compose.yaml). The Base Rhasspy container runs on a more powerful i7, which also runs other [home automation software](https://github.com/dalehumby/homelab).

Running openWakeWord in Docker eases distribution and setup (Python dependencies) and lets openWakeWord develop at its own pace, instead of being bundled and released with Rhasspy. A single instance of openWakeWord centralises configuration and gives lower-power satellites (e.g. ESP32s) richer wake-word options.

In the future I plan to add a web UI for configuration: which words to detect, thresholds, [custom verifier models](https://github.com/dscripka/openWakeWord/blob/main/docs/custom_verifier_models.md) and maybe [speaker identification](https://github.com/dscripka/openWakeWord/discussions/22). It could also include [live visualisation](https://huggingface.co/spaces/davidscripka/openWakeWord) for testing and diagnostics.

## Installation

### Docker

Using Docker CLI

```bash
docker run -d --name openwakeword -p 12202:12202/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy
```

In `docker-compose.yml` (or a Docker Swarm stack file)

```yaml
  openwakeword:
    image: dalehumby/openwakeword-rhasspy
    restart: always
    ports:
      - "12202:12202/udp"
    volumes:
      - /path/to/config:/config
```

### Python local install

For testing and experimentation you can run this project locally:

1. Clone the repo `git clone git@github.com:dalehumby/openWakeWord-rhasspy.git` 
2. Create a Python virtual environment _(optional)_
   - `python3 -m venv env`
   - `source env/bin/activate`
3. Install requirements `pip3 install -r requirements.txt`
4. Complete the [Configuration](README.md#configuration) steps below
5. Run `python3 detect.py`

## Configuration

1. Create a file called `config.yaml`, for example `nano /path/to/config/config.yaml`
2. Paste the contents of [`config.yaml.example`](config.yaml.example) into `config.yaml` to get started

### UDP Ports

Rhasspy streams audio from its microphone to openWakeWord over the network using UDP. On each Rhasspy device that has a microphone attached (typically a [Satellite](https://rhasspy.readthedocs.io/en/latest/tutorials/#shared-mqtt-broker)) go to Rhasspy - Settings - Audio Recording and in `UDP Audio (Output)` insert the IP address of the host that's running openWakeWord, and choose a port number, usually starting at `12202`. If you have multiple Rhasspy devices then each device needs its own port number: `12202`, `12203`, `12204`, etc.

![Screenshot 2023-05-01 at 11 34 39](https://user-images.githubusercontent.com/5817143/235435660-23b847b9-2cd4-4800-bb54-3f8d415185e4.png)

In openWakeWord's `config.yaml`, `udp_ports` holds key:value pairs. The key is the `siteId` shown at the top of Rhasspy - Settings. It might be `base`, `satellite`, `kitchen`, `bedroom`, etc. The value is the port listed under Rhasspy - Settings - Audio Recording.

```yaml
udp_ports:
  base: 12202
  kitchen: 12203
  bedroom: 12204
```
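To illustrate how a `siteId`-to-port mapping like this can be consumed, here is a minimal Python sketch that binds one UDP socket per site and tags each incoming datagram with the site it came from. The function names `open_listeners` and `receive_once` are hypothetical and are not taken from `detect.py`; the project's actual implementation may differ.

```python
import selectors
import socket


def open_listeners(udp_ports, host="0.0.0.0"):
    """Bind one UDP socket per site and register it with a selector.

    `udp_ports` maps siteId -> port, mirroring the `udp_ports` section
    of config.yaml. (Hypothetical helper, for illustration only.)
    """
    selector = selectors.DefaultSelector()
    for site_id, port in udp_ports.items():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind((host, port))
        sock.setblocking(False)
        # Keep the siteId with the socket so every datagram can be attributed.
        selector.register(sock, selectors.EVENT_READ, data=site_id)
    return selector


def receive_once(selector, timeout=1.0, bufsize=65535):
    """Return (site_id, datagram) pairs for every socket with data ready."""
    received = []
    for key, _events in selector.select(timeout=timeout):
        payload, _addr = key.fileobj.recvfrom(bufsize)
        received.append((key.data, payload))
    return received
```

Calling `open_listeners({"base": 12202, "kitchen": 12203, "bedroom": 12204})` would listen on the three ports from the example above. Because each site has its own port, the receiver knows which Rhasspy device sent the audio without inspecting the datagram.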

If you are using Docker you need to open the ports to allow UDP network traffic into the container. 

Using Docker CLI 

```bash
docker run -d --name openwakeword -p 12202:12202/udp -p 12203:12203/udp -p 12204:12204/udp -v /path/to/config/:/config dalehumby/openwakeword-rhasspy
```

Or in `docker-compose.yml`

```yaml
  openwakeword:
    image: dalehumby/openwakeword-rhasspy
    restart: always
    ports:
      - "12202:12202/udp"  # base
      - "12203:12203/udp"  # kitchen
      - "12204:12204/udp"  # bedroom
      # ... etc
    volumes:
      - /path/to/config:/config
```

### MQTT

openWakeWord notifies Rhasspy that a wake-word has been spoken using the [Hermes MQTT](https://rhasspy.readthedocs.io/en/latest/wake-word/#mqtthermes) protocol. The MQTT broker needs to be accessible by both Rhasspy and openWakeWord. Rhasspy's internal MQTT broker is not reachable from outside of Rhasspy, so you will need to run a [shared broker](https://rhasspy.readthedocs.io/en/latest/tutorials/#shared-mqtt-broker), like [Mosquitto](https://mosquitto.org/).

Once the broker is running, go to Rhasspy - Settings - MQTT. Choose `External` broker, set the IP address of the `Host` that the broker is running on, the `Port` number, and the `Username`/`Password` if required, similar to:

![Screenshot 2023-04-30 at 18 25 56](https://user-images.githubusercontent.com/5817143/235364431-75d50e0a-2e11-413f-96ff-66c76c83ac6d.png)

openWakeWord `config.yaml` would then have:

```yaml
mqtt:
  broker: 10.0.0.10
  port: 1883
  username: yourusername  # Delete row if not required
  password: yourpassword  # Delete row if not required

```

On each Rhasspy device, go to Rhasspy - Settings - Wake Word and select `Hermes MQTT`, similar to:

![Screenshot 2023-04-30 at 19 06 45](https://user-images.githubusercontent.com/5817143/235366440-2fd5fcc7-d049-447c-aabc-fd710939ac18.png)
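For orientation, the Hermes protocol announces a detection on the topic `hermes/hotword/<wakewordId>/detected` with a JSON payload that includes the `siteId`. The sketch below builds such a message in Python. The helper name `build_detection_message` is hypothetical, and the exact set of payload fields this project sends is an assumption, since it is not described in this README.

```python
import json


def build_detection_message(wakeword_id, site_id, sensitivity=0.7):
    """Build a Hermes hotword-detected topic and JSON payload.

    Hypothetical helper: the fields below follow the Hermes hotword
    message format, but the values chosen are illustrative assumptions.
    """
    topic = f"hermes/hotword/{wakeword_id}/detected"
    payload = {
        "modelId": wakeword_id,       # which wake-word model fired
        "modelVersion": "",
        "modelType": "universal",
        "currentSensitivity": sensitivity,
        "siteId": site_id,            # must match the siteId in Rhasspy - Settings
    }
    return topic, json.dumps(payload)


topic, payload = build_detection_message("hey_jarvis", "kitchen")
print(topic)
```

The resulting topic and payload could then be sent with any MQTT client connected to the shared broker, for example with the `publish(topic, payload)` method of a `paho-mqtt` client.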

### openWakeWord

openWakeWord listens for wake-words like "alexa", "hey mycroft", "hey jarvis", and [others](https://github.com/dscripka/openWakeWord#pre-trained-models). Use `model_names` to specify which wake-words to listen for. (See the [Pre-Trained Models](https://github.com/dscripka/openWakeWord#pre-trained-models) documentation, and openWakeWord's [`__init__.py`](https://github.com/dscripka/openWakeWord/blob/main/openwakeword/__init__.py) for the `model_names` to use.)

Delete any wake-words that you don't want to trigger on, or remove the entire `model_names` section to use all pre-trained models.

```yaml
oww:
  model_names:  # From https://github.com/dscripka/openWakeWord/blob/main/openwakeword/__init__.py
    - alexa  # Delete to ignore this wake-word
    - hey_mycroft
    - hey_jarvis
    - timer
    - weather
  activation_samples: 3  # Number of samples in moving average
  activation_threshold: 0.7  # Trigger wakeword when average above this threshold
  deactivation_threshold: 0.2  # Do not trigger again until average falls below this threshold
  # OWW config, see https://github.com/dscripka/openWakeWord#recommendations-for-usage
  vad_threshold: 0.5
  enable_speex_noise_suppression: false
```

The other `oww` settings ensure Rhasspy is only activated once per wake-word, and help reduce false activations.

In the example above, the latest 3 audio samples received over UDP are averaged together, and if the average confidence that a wake-word has been spoken is above 0.7 (70%), then Rhasspy is notified. Rhasspy will not be notified again until the average confidence drops below 0.2 (20%), i.e. the wake-word has ended.
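The averaging and two-threshold behaviour described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code in `detect.py`: the class name `ActivationFilter` is hypothetical, and averaging over a partially filled window at start-up is an assumption.

```python
from collections import deque


class ActivationFilter:
    """Moving average with separate activation and deactivation thresholds."""

    def __init__(self, samples=3, activation_threshold=0.7,
                 deactivation_threshold=0.2):
        self.scores = deque(maxlen=samples)  # keeps only the latest N scores
        self.activation_threshold = activation_threshold
        self.deactivation_threshold = deactivation_threshold
        self.active = False

    def update(self, score):
        """Add a confidence score; return True only when a new activation starts."""
        self.scores.append(score)
        average = sum(self.scores) / len(self.scores)
        if not self.active and average > self.activation_threshold:
            self.active = True
            return True
        if self.active and average < self.deactivation_threshold:
            # The wake-word has ended, so the next one may trigger again.
            self.active = False
        return False


detector = ActivationFilter()
scores = [0.1, 0.2, 0.9, 0.95, 0.9, 0.8, 0.1, 0.05, 0.05, 0.9, 0.9, 0.9]
triggers = [i for i, s in enumerate(scores) if detector.update(s)]
print(triggers)  # [4, 11]
```

The gap between the two thresholds is what prevents repeated notifications: once triggered, the filter stays quiet while the average hovers between 0.2 and 0.7, and only re-arms after the average has fallen below 0.2.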

Settings for voice activity detection (VAD) and noise suppression are also provided. (See openWakeWord's [Recommendations for Usage](https://github.com/dscripka/openWakeWord#recommendations-for-usage).)

## Contributing
Feel free to open an Issue if you have a problem, need help, or have an idea. PRs are always welcome.