Driving analytics from IoT signal exhaust


This is going to be a short exploration into the world of data collection and “signal pollution”. The goal of this article is to provide real-world examples of using passive signal scanners to collect and derive information from the gadgets and devices people are already wearing.

What is [I]IoT?

IoT stands for Internet of Things (IIoT being the Industrial Internet of Things). It involves deploying internet-connected devices and sensors to people, products, equipment, etc., with the intent of having these devices report data back to a central location. This data is then used to drive decisions. For example, a temperature sensor can send real-time temperature information to a server, which can then trigger alerts or even turn down heating elements without requiring user interaction.
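To make the temperature example concrete, here is a minimal sketch of the kind of rule a central server might apply to each incoming reading. The `Reading` type, the `decide` function, the sensor names, and both thresholds are assumptions made up for illustration; a real deployment would have its own data format and limits.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str   # hypothetical identifier for the reporting sensor
    celsius: float   # temperature reported by the sensor


def decide(reading, target=21.0, alert_above=30.0):
    """Return the actions a server might take for one reading.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    actions = []
    if reading.celsius > alert_above:
        actions.append("alert")        # notify a human
    if reading.celsius > target:
        actions.append("reduce_heat")  # act without user interaction
    return actions


if __name__ == "__main__":
    samples = [Reading("room-1", 19.5), Reading("room-1", 24.0), Reading("room-2", 33.0)]
    for r in samples:
        print(r.sensor_id, r.celsius, decide(r))
```

The point is only that the decision lives on the server: the sensor reports, and the rule decides whether to alert, adjust, or do nothing.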

What is signal exhaust?

Pollution, typically speaking, is easy to see. A truck drives by, and you see the smog coming out of the exhaust pipe.

Radio operators and network engineers will be familiar with the term signal pollution: the invisible noise that comes from connected devices, microwaves, cordless phones, and other broadcasting devices. This signal pollution, or signal exhaust, comes from many types of devices, and the number is growing as more and more devices become ‘internet aware.’

Just like smog fills the atmosphere, signal exhaust fills the surrounding area with noise, which, with the right equipment, can be collected and harnessed.

How to leverage

Today, consumers are loading themselves up with an ever-increasing number of wireless devices. These devices are all broadcasting, and although the actual content may be encrypted, you can still pull some important data from all of the noise.

Using specially tailored listening devices (which can easily be an open source router or a purpose-built Raspberry Pi), you can pick up these stray signals and uncover device-specific information such as the hardware ID, software information, and maybe even what kind of device it is. From this information, we can start analyzing trends such as device density and location, how many times a device reappears after leaving, the proportion of people using a specific technology, and much more.
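As a rough sketch of how two of those trends could be computed, assume the scanner hands us a list of `(device_id, seen_at)` sightings. That format, the function names, and the 30-minute gap used to separate visits are all assumptions for illustration. The sketch also assumes each device reports a stable hardware ID, which may not hold for devices that randomize their addresses.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def devices_per_hour(sightings):
    """Device density: number of distinct devices seen in each clock hour."""
    seen = defaultdict(set)
    for device_id, seen_at in sightings:
        hour = seen_at.replace(minute=0, second=0, microsecond=0)
        seen[hour].add(device_id)
    return {hour: len(ids) for hour, ids in seen.items()}


def count_visits(sightings, gap=timedelta(minutes=30)):
    """Reappearance: a silence longer than `gap` starts a new visit."""
    by_device = defaultdict(list)
    for device_id, seen_at in sightings:
        by_device[device_id].append(seen_at)

    visits = {}
    for device_id, times in by_device.items():
        times.sort()
        count = 1  # the first sighting opens the first visit
        for previous, current in zip(times, times[1:]):
            if current - previous > gap:
                count += 1
        visits[device_id] = count
    return visits


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    sample = [
        ("aa:01", start),
        ("aa:01", start + timedelta(minutes=10)),
        ("bb:02", start + timedelta(minutes=5)),
        ("aa:01", start + timedelta(hours=3)),  # came back later
    ]
    print(devices_per_hour(sample))
    print(count_visits(sample))
```

The gap threshold is the main judgment call: too short and one long visit is counted several times, too long and separate visits merge into one.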

Use case

I can think of several use cases for this type of information, but the one that really stands out is consumer identification in the retail industry.

Retail chains are making a big push to link online consumers to brick-and-mortar consumers. Lots of different methods are being deployed, including collecting email addresses, linking rewards cards, etc.

But this requires the shopper to provide additional information, and it doesn’t help figure out who is browsing in the store and then buying online.

Perhaps this information could be better inferred by passively tagging consumers by their device.

For example:

Let’s say a customer comes in and makes a purchase. You can pick up their unique cell phone identity, and link this information to the purchase. Chances are, you’ll also pick up a few people around your target customer, so the data is not completely accurate on the first go around.

Some time later, the customer makes another purchase at another store. Now you have two data points. Again, there will be some unwanted information from other customers, but by cross-referencing the devices seen at both events and keeping only what they have in common, you’ve suddenly targeted a unique customer.
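The cross-referencing step is essentially a set intersection. The sketch below assumes each purchase event has already been reduced to the set of device IDs seen near the counter at that moment, and that the events are known to belong to the same purchaser by some other means. The function name and the sample IDs are made up for illustration.

```python
def narrow_candidates(events):
    """Keep only the device IDs that were present at every purchase event."""
    if not events:
        return set()
    candidates = set(events[0])
    for seen in events[1:]:
        candidates &= set(seen)  # drop anything missing from this event
    return candidates


if __name__ == "__main__":
    first_purchase = {"aa:01", "bb:02", "cc:03"}   # target plus bystanders
    second_purchase = {"aa:01", "dd:04"}           # different bystanders
    print(narrow_candidates([first_purchase, second_purchase]))
```

If more than one ID survives, the match is still ambiguous and needs further events. If none survive, the customer may have carried a different device or none at all.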

What can you do with this information?

Lots! Special offers at the counter, tracking repeat customers, even tracking customer behavior through the store!

I would call this ‘soft data’, because the information is being inferred through deduction, rather than coming from a rock solid event.

Black hat

Privacy, Privacy, Privacy, Privacy!

I hear the screams coming from every direction, and it’s definitely a major concern.

Back when I ran a chain of retail stores and deployed this type of passive data collection, we made sure never to collect personally identifiable information unless it was voluntarily offered to us through captive portals (another topic altogether). Incoming data was securely stored by our vendor, who only provided us with a non-identifying, unique customer key, along with information about repeat visits, durations, and sales conversions.

But the line is thin, and you need to be careful about what data you collect, and how you are using it.
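One common way to produce a non-identifying customer key is a keyed hash of the device ID. I don't know how our vendor actually generated theirs, so treat this as a sketch under that assumption. The `customer_key` function and the sample secret are hypothetical.

```python
import hashlib
import hmac


def customer_key(device_id: str, secret: bytes) -> str:
    """Derive a stable key from a device ID without exposing the ID itself.

    Assumption: `secret` stays with whoever stores the raw data, so the
    party that receives the key cannot recompute or reverse it.
    """
    normalized = device_id.strip().lower().encode("utf-8")
    return hmac.new(secret, normalized, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    secret = b"example-secret-do-not-reuse"  # placeholder value only
    print(customer_key("AA:BB:CC:DD:EE:01", secret))
```

The same device always maps to the same key, so repeat visits can still be counted. Keep in mind that a key like this is a pseudonym rather than true anonymity, which is part of why the line is thin.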

Like this post? Follow me!
Twitter: @aj_bubb
Blog: AJBubb.com
LinkedIn: linkedin.com/in/AJBubb
