The MEANEST Voice Assistant (with GPT-3)

Is Alexa too boring? Do you sometimes wish Siri was a little meaner when you ask it to turn off the lights? Does Google Assistant not call you fat enough? Never fear, GPT-3 is here. In this project, we'll build a voice assistant for home automation that leverages the power of GPT-3 to parse your queries and deliver snarky, contextual quips in response to your commands.

If you give the AI assistant a command like "could you please turn off the kitchen light", GPT-3 will parse that phrase to extract the device name (kitchen light) and the desired state (off), and it will serve up a funny quip like "what, you're done gorging yourself on food in the refrigerator?" in response. You activate the assistant with a wake word and speak a command; the program then handles your query and speaks the quip back to you. This project is written entirely in Python, so it's easy to modify as you see fit. I have added integration for LIFX smart bulbs, but you can get this set up with many others.

Step 1: A Primer on GPT-3

It's important to understand what's actually happening here with GPT-3 before we dive in. If you're not already familiar with it, GPT-3 is an extremely advanced language model with a simple text-based interface. You feed GPT-3 a prompt, and it generates text that plausibly continues that prompt. GPT-3 is extremely proficient at pattern recognition, so you can show it a few examples of how it should respond to a given query and use it to build simple APIs.

Below is an example snippet of what this project would feed into GPT-3 for the user input "turn off the kitchen light".

Prompt: can you turn on the dining room light

Object: dining room light

Desired State: 1

Response: What, you're too busy getting fat watching TV on the couch to do it yourself?

Prompt: turn off the kitchen light

The first prompt is given as an example, which shows GPT-3 how to structure its response. The second prompt is the user input for GPT-3 to complete. GPT-3 might respond with the following:

Object: kitchen light

Desired State: 0

Response: What, you like to eat in the dark? Creep.

In the real system you'd feed GPT-3 more examples than just one, but hopefully you get the point. Additionally, you'll often get a different response for the same query, because the model's output is sampled rather than fixed. I've seen everything from benign to murderous to flirty for the input "turn on the bedroom light".
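To make the prompt format above concrete, here is a rough Python sketch of how a few-shot prompt could be assembled and how the completion could be parsed back into fields. This is not the repository's actual code; the names build_prompt, parse_completion, and ParsedCommand are made up for illustration, and the call to the GPT-3 API itself is left out.

```python
from dataclasses import dataclass

# Hypothetical example list; the real project keeps its examples in a prompt file.
EXAMPLES = [
    {
        "prompt": "can you turn on the dining room light",
        "object": "dining room light",
        "state": 1,
        "response": "What, you're too busy getting fat watching TV "
                    "on the couch to do it yourself?",
    },
]


def build_prompt(user_input, examples=EXAMPLES):
    """Join the worked examples, then leave the last prompt open for GPT-3."""
    blocks = []
    for ex in examples:
        blocks.append(
            f"Prompt: {ex['prompt']}\n"
            f"Object: {ex['object']}\n"
            f"Desired State: {ex['state']}\n"
            f"Response: {ex['response']}\n"
        )
    blocks.append(f"Prompt: {user_input}\n")
    return "\n".join(blocks)


@dataclass
class ParsedCommand:
    device: str
    state: int
    quip: str


def parse_completion(text):
    """Split 'Key: value' lines from the completion into a ParsedCommand.

    Raises KeyError or ValueError if the model strays from the format.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and lines without a colon
            fields[key.strip().lower()] = value.strip()
    return ParsedCommand(
        device=fields["object"],
        state=int(fields["desired state"]),
        quip=fields["response"],
    )
```

Since GPT-3 won't always stick to the format, it's worth wrapping parse_completion in a try/except and simply ignoring the command when parsing fails.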

Step 2: Hardware and Software Requirements

All you need to try this out is a computer with a microphone and a speaker. Your own home automation setup will require some configuration, but you can test the voice capabilities and see some funny responses without getting that set up.

I didn't want to surrender my laptop to be a glorified home automation hub, so I created a Raspberry Pi system to listen for commands and handle the responses. Most of the heavy lifting here is done in the cloud, so even an RPi Zero would work. I also ordered some smart bulbs with a simple LAN-based API. You can likely use your own, but it will require some development work. Below is a full list of the parts I used:

Raspberry Pi 4 with power supply (practically any single-board computer will work, but you will need to modify the parts below accordingly)
RASPIAUDIO speaker and mic HAT
DFRobot LCD Display
LIFX Smart Bulb

You also need your own GPT-3 API key for the time being. There's currently a waitlist to get these, but you can sign up for one here. It would take some time for me to wrap this service in an API of its own such that you don't need a key. If you would like this, comment below.

Step 3: Code Setup

You can find the Git repository for this project at github.com/AlexFWulff/SnarkyHomeAutomation

Begin by cloning this repository to your remote machine
Next, create a virtual environment to isolate all the dependencies for this project (where you replace the example path with the place where you would like your virtual environment to go): python3 -m venv /path/to/virtual_env
Activate your virtual environment with source /path/to/virtual_env/bin/activate
Navigate to the top level of the git repository, and install all the project's requirements with pip install -r requirements.txt
Depending upon your platform, you may be missing some libraries. On Raspberry Pi, I had to run sudo apt install libportaudio2 python3-tk flac ffmpeg
Put your OpenAI key in a text file somewhere on your system, and change the key_path field in config.ini to point to this file
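
If you want to check that your key file and config.ini are wired up correctly before launching the whole assistant, something like the following would do it. This is only a sketch: I'm assuming config.ini is a standard INI file readable with Python's configparser, and since the section names aren't listed here, the helper just looks for key_path in every section. The function names are hypothetical.

```python
import configparser
from pathlib import Path


def find_option(config, option):
    """Return the first value of `option` found in any section, or None."""
    for section in config.sections():
        if config.has_option(section, option):
            return config.get(section, option)
    return None


def load_api_key(config_path="config.ini"):
    """Read key_path from the config file and return the key stored there."""
    config = configparser.ConfigParser()
    if not config.read(config_path):  # read() returns the files it could parse
        raise FileNotFoundError(f"Could not read {config_path}")
    key_path = find_option(config, "key_path")
    if key_path is None:
        raise KeyError("key_path is not set in the config file")
    # Strip the trailing newline most editors add to the key file.
    return Path(key_path).expanduser().read_text().strip()
```

Keeping the key in a separate file, rather than in config.ini itself, means you can share or commit your config without leaking the key.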

You *should* be able to start everything with python Run.py. If you do not have your microphone and speaker configured correctly, you may see errors such as the following:

OSError: [Errno -9996] Invalid input device (no default output device)

If this plagues you, ensure your microphone and speaker function correctly by using them in a separate program.

The default wake word is "computer". To give the system a command, say "computer" and then speak a query. The system should then give you a response over the speaker and output the action it will take.

Step 4: Configuring Your Smart Devices

The file SmartDeviceInfo.xml contains all the information I needed to use my LIFX devices. If you are also using LIFX bulbs, you can just add a new entry to the XML file with a name for your device and its MAC address. You can find the MAC addresses of all the LIFX devices on your local network with this Python tool. Make sure you set the enable_lifx field in the config file to "True" to enable this functionality.

I wrote this software such that it should be easy to add other types of devices. Just give your device's entry in the XML document a different tag, and add a separate handler for that tag in AutomationManager.py.
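
To illustrate the idea of one handler per XML tag, here is a small self-contained sketch. The XML layout, the attribute names, the "plug" device type, and the addresses are all invented for the example; check SmartDeviceInfo.xml and AutomationManager.py in the repository for the real structure. The handlers only return a description of what they would do instead of talking to real hardware.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: one element per device, with the tag naming the device type.
SAMPLE_XML = """<devices>
  <lifx name="kitchen light" mac="d0:73:d5:00:00:01"/>
  <plug name="desk fan" address="192.168.1.50"/>
</devices>"""


def handle_lifx(attrs, state):
    # Placeholder: a real handler would message the bulb over the LAN.
    return f"lifx {attrs['mac']} -> {'on' if state else 'off'}"


def handle_plug(attrs, state):
    # Placeholder for some other device type you might add.
    return f"plug {attrs['address']} -> {'on' if state else 'off'}"


# Map each XML tag to the function that knows how to drive that device type.
HANDLERS = {"lifx": handle_lifx, "plug": handle_plug}


def load_devices(xml_text):
    """Return {device name: (tag, attributes)} for every entry in the file."""
    root = ET.fromstring(xml_text)
    return {el.get("name"): (el.tag, dict(el.attrib)) for el in root}


def set_state(devices, name, state):
    """Look up the device by name and dispatch to the handler for its tag."""
    tag, attrs = devices[name]
    handler = HANDLERS.get(tag)
    if handler is None:
        raise ValueError(f"No handler registered for device type '{tag}'")
    return handler(attrs, state)
```

With a dispatch table like this, supporting a new device type comes down to writing one function and adding one dictionary entry.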

You can have the program only change the state of devices when the name output by GPT-3 perfectly matches a device on your network, or you can live dangerously and have the program select the closest device name to what GPT-3 outputs. You can change this behavior in the config file.
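
Here is one way the two matching modes could work, sketched with difflib from Python's standard library. I'm not claiming this is how the project does it; resolve_device, exact_only, and the 0.6 cutoff are my own illustrative choices, and ignoring case in the exact match is an assumption too.

```python
import difflib


def resolve_device(name, known_devices, exact_only=True, cutoff=0.6):
    """Map the name GPT-3 produced onto a configured device, or return None.

    exact_only=True  -> accept only a (case-insensitive) exact match.
    exact_only=False -> otherwise fall back to the most similar known name,
                        provided its similarity score is at least `cutoff`.
    """
    lookup = {device.lower(): device for device in known_devices}
    wanted = name.strip().lower()
    if wanted in lookup:
        return lookup[wanted]
    if exact_only:
        return None
    matches = difflib.get_close_matches(wanted, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None
```

For example, with the devices "kitchen light" and "bedroom light" configured, the output "kitchen lights" is rejected in exact mode but resolves to "kitchen light" in closest-match mode. Raising the cutoff makes the fuzzy mode less likely to flip the wrong switch.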

Step 5: Prompt Customization

You can add more custom prompt examples to change the way GPT-3 responds. You can find the default prompt at prompts/prompt1.txt. If you want to create new behavior, add a new file to this directory and change the prompt_file_path value in config.ini to point to this new file.

GPT-3 can pick up on a lot of different speech patterns. GPT-3 is even proficient in a variety of different languages, so if you'd rather speak to your home automation AI in a language other than English you can create a prompt to do so.

I've also created some prompts that are exceedingly fun to use but not necessarily appropriate for public distribution. Shoot me an email (which you can find on my website) if you would like me to send you some.

Step 6: More Customization!

This project can also serve as a useful template for other types of voice-enabled assistants powered by GPT-3. I wrote the software in a modular manner, so it should be easy to swap things in and out as you see fit. For example, you could extend this same functionality to instead add items to a to-do list with funny voice responses for each item you add. In fact, you can keep adding modules to build a fully-featured voice assistant that can accomplish many tasks!

The display is also rather rudimentary. If you'd like, it should be pretty easy to modify my designs or replace them with some of your own. Everything is written in Tkinter, and the layout adapts somewhat to different screen resolutions.

That's all! I hope you enjoyed this project, and let me know in the comments below what you do with it.