Controlling a Mindstorms EV3 with Amazon Echo
This past Christmas my oldest son received a LEGO Mindstorms set as one of his gifts. Mindstorms is a fantastic platform for teaching kids about programming and robotics. With his set he can build color-sensing rovers, "seeing" robots that can navigate through a room, and all kinds of other devices that can sense, move, lift, and interact.
Admittedly, I was also a little excited that he received the Mindstorms. I'm a gadget lover and a tinkerer. I've got my Arduino, my Raspberry Pi, and a head full of project ideas. So, the Mindstorms will let me share some of my hobbies with my son.
Knowing that I'm such a gadget fan, my kids gave me an Amazon Echo. These fun little devices are like Apple's Siri, but for your living room. It's a nice-looking, nice-sounding speaker that can act as a traditional Bluetooth speaker. But the real power of the Echo (also called "Alexa") is in the voice recognition and interaction. I can now walk into my kitchen and ask, "Alexa, what's the weather going to be like today?" or "Alexa, is Gene Wilder still alive?" and I'll get the answers I'm looking for.
Shortly after setting up my Echo and uttering my first few silly questions, the geek voice in the back of my head started chattering...
"I bet this thing's got an API..."
But what would I have it do? My impish geek voice didn't let me down.
"Hey, I bet that Mindstorms has an API too..."
Bingo!
After a couple of evenings of hacking around with the Alexa API and the C#/.NET EV3 API, I was able to show off my mighty programming muscles to my 7-year-old son.
The architecture for the Alexa-to-EV3 communication looks a little extensive at first, but it's really not so bad. Most of the capture side of the operation chains through a few AWS services, which are pretty easy to set up. Below is the high-level flow of data through the applications.
The general flow is:
User utters the EV3 skill's phrase, including a command and an optional value. For example, "Alexa, Tell EV3 Move Forward 10".
The Echo interprets your language according to how you set up your grammar, and sends a message to some endpoint. Right now, Amazon lets you send a message to either a web service or a Lambda function. In this case, I set it up to activate the Lambda function.
The Lambda function, a simple Node.js application, inspects the message sent by the Echo and decides what to do. Unless the user is canceling out of their command session, the Lambda function simply packages up a message and publishes it to SNS.
SNS (Simple Notification Service) receives the message from the Lambda function, then passes it along to its configured endpoints. I configured my SNS topic to both add the message to an SQS message queue and send me an email (for debugging purposes).
The message is added to an SQS queue. This will allow the messages to queue up and live for 1 minute before they're automatically cleaned out. Having the queueing and temporary persistence allows the console application to pull down new commands whenever it's ready for one.
The .NET console application polls the SQS queue for new messages. When one is found, it's processed and removed from the queue.
Finally, the console app interprets the message and sends the appropriate Bluetooth command to the rover via the EV3 C# API. The rover then acts on the command and moves.
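To make the Lambda step and the queue-reading step a little more concrete, here is a minimal Node.js sketch of the message handling only. It is an illustration, not the project's actual code: the intent name `MoveIntent`, the slot names `Command` and `Value`, and both function names are hypothetical, and the real SNS publish and SQS polling calls are left out. It also assumes the SNS subscription uses the default (non-raw) delivery, where SNS wraps the published text in a JSON envelope. The console application in this project is .NET, so the second function only shows, in JavaScript, the kind of unwrapping it would need to do.

```javascript
// Sketch only: "MoveIntent", "Command", and "Value" are hypothetical names.

// Lambda side: turn an Alexa request into a message string,
// or return null when there is no command to forward.
function buildCommandMessage(event) {
  const request = event.request || {};
  if (request.type !== "IntentRequest") {
    return null; // e.g. a launch or session-ended request
  }

  const intent = request.intent || {};
  // Built-in Alexa intents a user triggers to back out of a session.
  if (intent.name === "AMAZON.CancelIntent" || intent.name === "AMAZON.StopIntent") {
    return null;
  }

  const slots = intent.slots || {};
  const command = slots.Command && slots.Command.value;
  if (!command) {
    return null; // nothing recognizable was said
  }

  // The value is optional; slot values arrive as strings.
  const rawValue = slots.Value && slots.Value.value;
  const value = rawValue === undefined ? null : Number(rawValue);

  return JSON.stringify({
    command: command.toLowerCase(),
    value: Number.isFinite(value) ? value : null,
  });
}

// Consumer side: with default (non-raw) delivery, the SQS body is an SNS
// envelope whose "Message" field holds the text that was published.
function parseQueuedCommand(sqsBody) {
  const envelope = JSON.parse(sqsBody);
  return JSON.parse(envelope.Message);
}

// Example: "Alexa, Tell EV3 Move Forward 10"
const sampleEvent = {
  request: {
    type: "IntentRequest",
    intent: {
      name: "MoveIntent",
      slots: { Command: { value: "Move Forward" }, Value: { value: "10" } },
    },
  },
};
const published = buildCommandMessage(sampleEvent);
const simulatedSqsBody = JSON.stringify({ Type: "Notification", Message: published });
console.log(parseQueuedCommand(simulatedSqsBody));
```

Keeping the message a small JSON object with a command and an optional value means the consumer never has to know anything about Alexa's request format; it only has to agree with the Lambda function on those two fields.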
In my next two blog postings I'll cover in more detail:
The capture/collection side of the solution. This will include all of the AWS configuration, as well as the Node.js code written for the Lambda function.
The processing/communication side of the solution. This will cover the .NET console application and the Bluetooth communication to the rover.