Auditing Uber: Northeastern researchers test Uber’s ‘surge pricing’

A rough map of Uber "surge zones" in Boston.

Christo Wilson admits there was something a little sketchy about going from drug store to drug store in downtown Boston last December, buying hundreds of dollars in prepaid Visa cards.

But Wilson, an assistant computer science professor at Northeastern University, wasn’t laundering money: He was buying the cards to create fake Uber accounts that he and his colleagues would use to probe the ride-hailing app for supply and demand data. They’ll be presenting their findings this Friday at the 2015 Internet Measurement Conference in Tokyo.

“Uber is a black box: They do not provide data about supply or demand, and prices are set dynamically by an opaque ‘surge pricing’ algorithm,” Wilson and his co-authors Le Chen and Alan Mislove wrote. “The lack of transparency has led to concerns about whether Uber artificially manipulate prices, and whether dynamic prices are fair to customers and drivers.”

So the authors set out to see just how Uber works. By creating 43 virtual versions of the actual Uber app and arraying them around midtown Manhattan and downtown San Francisco, the researchers were able to monitor the number and locations of vehicles, track estimated wait times, and get information about surge pricing around the clock – amounting to more than 18 million records that captured four weeks of data in granular detail.
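As a rough illustration of what "arraying" virtual clients around a neighborhood could look like, the sketch below spaces measurement points on an evenly spaced grid. This is not the researchers' code: the function name `grid_points`, the 3-by-3 layout, the 200-meter spacing, and the approximate midtown Manhattan center coordinates are all assumptions made for the example.

```python
import math

def grid_points(center_lat, center_lon, rows, cols, spacing_m):
    """Return (lat, lon) pairs on a rows x cols grid centered on one point.

    Hypothetical helper: the layout and spacing are illustrative assumptions,
    not the placement used in the study.
    """
    meters_per_deg_lat = 111_320.0  # approximate length of one degree of latitude
    # A degree of longitude shrinks with the cosine of the latitude.
    meters_per_deg_lon = meters_per_deg_lat * math.cos(math.radians(center_lat))
    points = []
    for r in range(rows):
        for c in range(cols):
            # Offsets in meters from the center of the grid.
            dy = (r - (rows - 1) / 2) * spacing_m
            dx = (c - (cols - 1) / 2) * spacing_m
            points.append((center_lat + dy / meters_per_deg_lat,
                           center_lon + dx / meters_per_deg_lon))
    return points

# Assumed center, roughly midtown Manhattan; each point would host one virtual client.
clients = grid_points(40.7549, -73.9840, rows=3, cols=3, spacing_m=200)
```

Each returned coordinate would be the location one emulated app reports, so that together the clients cover the area with overlapping views.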

Not only did they discover that every city has “surge zones” with set borders that apply the same fare multiplier across a part of the city, but their research showed surprisingly little movement of drivers when prices surged in a particular area. Uber has contested those claims, and said its research shows that drivers respond to the higher prices.

Wilson, Mislove, and Chen aren’t the first people to try to reveal Uber’s secrets with data. In April, the journalism professor Nicholas Diakopoulos collected and analyzed similar information for Washington, D.C., and concluded that surge pricing – whose fare multipliers are highly volatile – doesn’t make more drivers hit the road so much as move drivers from one area of the city to another. And academics and data journalists have also used Uber and taxi data collected by New York City regulators to analyze Uber’s impacts on travel trends.

But Wilson and his colleagues’ work is new, and highly detailed. Day in and day out, the researchers’ virtual sentinels tracked cars and compiled data. They detected an average of 5,000 drivers for UberX – the company’s most popular and least regulated service – in one section of Manhattan every day and 9,000 in San Francisco. But demand for drivers in San Francisco is so high that surge prices were in effect 57 percent of the time. As far as they could tell, their virtual passenger apps didn’t increase surge prices by adding to the demand Uber measured; Wilson said he’s not sure anyone at Uber even noticed them.

“I’m sure if they looked through the logs, they could detect us. We’re pretty obvious. But they just weren’t looking,” he said. “Forty people in Manhattan or San Francisco is nothing.”

They made some fascinating discoveries. By tinkering with the coordinates emitted by their emulators in New York, San Francisco, and many other cities, the authors found that each city is divided into several surge pricing “zones” where the same multiplier is in effect. In Boston, those zones are broken up such that a rush of demand near Boston University could boost prices as far away as the North End, while Charlestown riders wouldn’t be affected by whatever multiplier is facing folks a short distance away in Somerville. Wilson and his colleagues also gleaned some information about the effects of surge pricing, and suggested the reduction of demand caused by surge prices could actually keep drivers away.
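One way to picture the zone finding: if two probe locations always report the same multiplier at the same moments, they plausibly sit in the same surge zone. The sketch below groups locations by their multiplier history. It is a simplified illustration, not the method from the paper; the function name `group_by_multiplier_history` is hypothetical and the multiplier values are invented for the example.

```python
from collections import defaultdict

def group_by_multiplier_history(observations):
    """Group probe locations whose multiplier readings match at every sample.

    observations maps a location name to a list of surge multipliers,
    all sampled at the same sequence of times (an assumption of this sketch).
    """
    zones = defaultdict(list)
    for location, history in observations.items():
        # Locations with identical histories land under the same key.
        zones[tuple(history)].append(location)
    return sorted(sorted(group) for group in zones.values())

# Invented readings, for illustration only.
readings = {
    "Boston University": [1.0, 1.5, 2.0, 1.2],
    "North End":         [1.0, 1.5, 2.0, 1.2],
    "Charlestown":       [1.0, 1.0, 1.0, 1.0],
    "Somerville":        [1.0, 1.3, 1.3, 1.0],
}
inferred_zones = group_by_multiplier_history(readings)
```

With these made-up numbers, Boston University and the North End fall into one group while Charlestown and Somerville each stand alone, mirroring the pattern described above. Real measurements would need many more samples before matching histories could be trusted as evidence of a shared zone.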

“On one hand, surge does seem to have a small effect on attracting new cars,” the paper said. “On the other hand, it also appears to have a larger, negative effect on demand.”

The study had many limitations. Even though the researchers could spoof passenger apps and collect data about things like wait times and surge multipliers that give a rough idea of rider demand, they couldn’t monitor the actual number of passengers looking for rides. And when a car disappeared from the screen of one of the virtual Uber apps, it could suggest that either the car picked up a passenger or that the driver shut off the app.

The authors also acknowledged that they couldn’t reliably predict when surges would go into effect because they couldn’t access some critical data Uber uses to apply its surges.

Still, their survey got attention at Uber. Wilson said they discovered a bug in how surge prices were displayed to users that led to random, short-lived price multipliers. He added that his co-author Chen, a graduate student, would soon be applying for an internship at Lyft, Uber’s underdog competitor.

“We have not gotten any job offers,” Wilson joked.

An Uber spokeswoman cautioned against reading too much into the paper, saying it was “based on extremely limited, public data.” She added the researchers’ findings were sometimes unclear – not differentiating between Uber’s various offerings – and did not match up with the company’s analysis of its own supply, demand, and surge pricing data.

She also referred to a recent case study by Uber data scientists that showed the supply of Uber cars rose more than 50 percent in an area around Madison Square Garden in the hours after a concert, when surge prices ranged from 1 to 1.8 times the normal fare. That’s much larger than the effect Wilson and his colleagues observed. The researchers’ map of surge zones in Boston was roughly accurate, Uber said.

Wilson called his survey style “algorithmic auditing.” The Uber study is similar to one last year that used scripts and software to scrape and analyze price information on 16 popular shopping and travel websites and found that customers often received different prices for the same products. He’s also working on a study that looks at how Google Maps shows different national boundaries based on which countries users are in.