Creating a Machine Learning Model: Bus Ride Detection in a Mobile App

Training an App to Know When You Are Riding a Bus (post written on behalf of our client, PIPs Rewards)

PIPs Rewards is a mobile platform that uses data-driven solutions to prompt and reward a range of beneficial behaviors, from walking and bike-sharing to recycling and shopping. The app’s north star is to prompt these behaviors using games and rewards and to verify them using smart technology rather than self-reporting. A verified action triggers a reward in the form of a currency of good called Positive Impact Points (or PIPs). PIPs can be redeemed for gift cards or converted into scholarships by college students.

Over the last year, we’ve been testing data science-based tools that are native to cell phones in order to verify when users are taking public transit, whether it be a bus, train, or subway. This is a highly valued practice among our university partners; the more people we can get out of cars and onto public transit, the healthier, safer, and more livable the community becomes.

Since fall 2017, we’ve been using beacons to detect bus riding. However, beacons are expensive, implementation was inconsistent across platforms, and their batteries needed to be checked regularly. For these reasons, beacons were an unrealistic choice for large public transit systems and as a consequence were always considered a temporary solution.

Our Artificial Intelligence (AI) solution, launching this summer, detects when a user is on a bus by using live data that is fed into the system. To build it, we first needed to determine which data could reliably identify a bus ride, and then test that data. Our data scientists proposed that we use two data sets:

Visual data (pictures taken of the inside of the bus)
Sensor data (data from the sensors on the user’s phone, which can monitor changes in altitude, location, and speed)
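
To make the sensor side more concrete, here is a minimal sketch of how a short window of phone readings might be summarized into features a model could learn from. This is an illustration, not our production pipeline: the SensorSample fields, the window_features function, the 0.5 m/s stop threshold, and the idea that counting stops helps identify a bus are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class SensorSample:
    """One phone reading (hypothetical schema, for illustration only)."""
    t: float           # seconds since the start of the window
    speed_mps: float   # ground speed in metres per second
    altitude_m: float  # altitude in metres


def window_features(samples, stop_speed_mps=0.5):
    """Summarize a window of samples as a small feature dictionary.

    stop_speed_mps is an assumed threshold below which the phone
    is treated as (nearly) stationary.
    """
    if not samples:
        raise ValueError("need at least one sample")
    speeds = [s.speed_mps for s in samples]

    # Count transitions from moving to (nearly) stopped. Assumption:
    # frequent stops may help separate a bus from a car.
    stops = 0
    for prev, cur in zip(speeds, speeds[1:]):
        if prev > stop_speed_mps and cur <= stop_speed_mps:
            stops += 1

    return {
        "mean_speed_mps": mean(speeds),
        "speed_std_mps": pstdev(speeds),
        "max_speed_mps": max(speeds),
        "stop_count": stops,
        "altitude_change_m": samples[-1].altitude_m - samples[0].altitude_m,
    }


# Made-up readings, one every ten seconds.
window = [
    SensorSample(0, 0.0, 10.0),
    SensorSample(10, 8.0, 10.5),
    SensorSample(20, 12.0, 11.0),
    SensorSample(30, 0.0, 11.0),
    SensorSample(40, 9.0, 12.0),
    SensorSample(50, 0.2, 12.0),
]
features = window_features(window)
print(features)
```

A feature dictionary like this would be computed for every labeled window (bus, car, walking, and so on) and then passed to whatever classifier is being trained.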

To start training our machine learning model, we needed a base set of data about public transportation. Public data sets allowed us to set the base rules for recognizing the inside of a bus and for detecting bus driving patterns in the sensor data.

Next, the PIPs Team collected live data from six locations around the world. We also collected other mobility data so that the model could learn to differentiate bus riding from other types of movement, including walking, biking, running, driving in a car, and standing still.

The model continued to improve, but it had not yet seen every possible case. We therefore gathered more data to reduce or eliminate false negatives (such as the model mistaking a bus ride for a car ride) and false positives (such as the model deciding that you are on a bus when you are actually sitting at your desk, looking at photos of bus interiors online).
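
For readers who want to see how these two kinds of error are counted, here is a small sketch. The labels and predictions are invented for the example and are not our evaluation data; the function names are hypothetical as well.

```python
def bus_detection_counts(actual, predicted, positive="bus"):
    """Count true/false positives and negatives for the 'bus' class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1   # on a bus, detected as a bus
        elif p == positive:
            counts["fp"] += 1   # not on a bus, but detected as one
        elif a == positive:
            counts["fn"] += 1   # on a bus, but detected as something else
        else:
            counts["tn"] += 1   # not on a bus, and not detected as one
    return counts


def precision_recall(counts):
    """Precision drops with false positives; recall drops with false negatives."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented labels: what the user was really doing vs. what the model said.
actual = ["bus", "bus", "bus", "car", "walk", "still", "bus", "bike"]
predicted = ["bus", "car", "bus", "car", "walk", "bus", "bus", "bike"]

counts = bus_detection_counts(actual, predicted)
precision, recall = precision_recall(counts)
print(counts, precision, recall)
```

In this toy data, the bus ride labeled as a car ride is the false negative, and the user standing still who is labeled as riding a bus is the false positive.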

We now have a machine learning model that correctly determines whether or not our users are on a bus in approximately 90% of cases. From here, we see extending the model to other types of public transportation, such as subway rides by our users in NYC, as an easy next step.

Having gone through the creation, fine-tuning, and release of our first machine learning model built from scratch, we are fascinated by the capabilities of the AI algorithms underlying the model while also very aware of their weaknesses. We are also excited to explore other ways we can leverage this type of technology to track and reward responsible behavior while ensuring that the technology continues to improve itself and help the communities we serve.

This post was originally published on the PIPs Rewards Medium blog.