We’re back with another round of #UberData, and today we’re asking a few questions:

“How is San Francisco’s Financial District like New York?” and “What neighborhood tells us the most about DC’s lifestyle?”

This post provides a few data-derived answers. Obviously the Uber nerd-collective are into maps. Mapping is important to what we’re building at Uber, but as a brain guy I usually don’t think in terms of space as much as I do in terms of time: temporal correlations, autoregressive models, causal relationships, time-frequency analyses, and so on. What’s happening in the brain and when.

Today, we’ll look at what Uber’s temporal demand patterns tell us about the city neighborhoods we serve. This is what San Francisco’s demand looks like, broken down by hour of week:

Uberdata: San Francisco demand curve
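For anyone who wants to try an “hour of week” breakdown on their own data, here is a minimal sketch of one way to do it. It is not our production code: the function names, the choice of Monday midnight as hour 0, and the normalization to shares of total demand are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168 bins; assumed to run Monday 00:00 through Sunday 23:00


def hour_of_week(ts: datetime) -> int:
    """Map a timestamp to a bin in 0..167 (0 = Monday midnight, an assumed convention)."""
    return ts.weekday() * 24 + ts.hour


def demand_curve(request_times):
    """Count requests per hour-of-week bin and return each bin's share of the total."""
    counts = Counter(hour_of_week(ts) for ts in request_times)
    total = sum(counts.values())
    if total == 0:
        # No requests at all: return a flat curve of zeros rather than dividing by zero.
        return [0.0] * HOURS_PER_WEEK
    return [counts.get(h, 0) / total for h in range(HOURS_PER_WEEK)]
```

Normalizing to shares is what makes a big city and a small one comparable on the same axis; with raw counts, the larger market would simply dwarf the other.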

Now compare that to New York:

Uberdata: San Francisco v. New York demand curve

Right away you can see there’s something different:

Our ridership in New York is more heavily skewed toward weekdays, whereas San Francisco demand jumps up on weekends.

Since we’ve been in business for 2 years and are still growing like crazy, we can get much more granular. Instead of looking at differences between cities, we can start to look at differences between neighborhoods. 271 of them across 9 major US cities, to be exact:

  • Boston
  • Chicago
  • DC
  • LA
  • New York
  • Philadelphia
  • San Diego
  • San Francisco
  • Seattle

Now that we can get more fine-grained we can begin to observe some pretty clear neighborhood-by-neighborhood differences. Have a look at 2 neighborhoods in San Francisco — the Mission and the Financial District:

Uberdata: San Francisco Mission v Financial District demand curves

Check out how daily demand in the Mission peaks later in the day — after work hours — whereas demand in the Financial District peaks toward the end of the work day. The big difference, of course, is that the Mission has a lot more demand on Saturdays.

Now look at how San Francisco’s Financial District compares to New York’s Financial District:

Uberdata: San Francisco v New York Financial Districts demand curves

San Francisco’s Financial District is more Manhattan-like than it is San Francisco-like!*

(* By “more like” we mean more similar in Uber demand, which is an index of activity within that neighborhood.)

In fact, we can quantify how <city>-like or not <city>-like any given neighborhood is. For example, “how San Francisco-like is the Mission, really?” and “how much more like New York is the Financial District than it is San Francisco?” What do we find?

Cities have “stereotypical” neighborhoods that closely match the flow of their home city, and some neighborhoods that just don’t seem to belong to it. They’re outliers.

“But wait a minute!” you might say, “you’re correlating a variable with another variable that includes the first! If one neighborhood contributes more overall power to the signal of the city average, of course it will correlate with it better!” Keep calm and carry on: we corrected for that. The concern is that some neighborhoods have more demand and thus contribute more to the overall city demand. One way to address this is to correlate each neighborhood’s demand curve with the city’s demand curve recomputed without that neighborhood, which is what we’ve done. Thus…

The most stereotypically “like” neighborhood for each city is:
• San Francisco: North Beach
• New York: Chelsea
• Seattle: Capitol Hill
• Chicago: Near North Side
• Boston: Back Bay – Beacon Hill
• DC: Dupont Circle
• LA: Mid-City West

And on the other end of the spectrum…

The most stereotypically “unlike” neighborhood for each city is:
• San Francisco: Crocker Amazon
• New York: Washington Heights
• Seattle: South Park
• Chicago: Montclare
• Boston: West Roxbury
• DC: Deanwood
• LA: Southeast LA
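The leave-one-out correction behind both lists can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the exact analysis: it assumes the demand data sits in an array with one row per neighborhood and one column per hour of week, that the city curve is the plain sum of its neighborhoods, and that “likeness” is a Pearson correlation. The name `city_likeness` is made up for this example.

```python
import numpy as np


def city_likeness(demand: np.ndarray) -> np.ndarray:
    """Correlate each neighborhood with the rest of its city.

    demand: array of shape (n_neighborhoods, n_hours) for a single city.
    Returns one Pearson r per neighborhood.
    """
    city_total = demand.sum(axis=0)
    scores = np.empty(demand.shape[0])
    for i, curve in enumerate(demand):
        # Remove this neighborhood's own contribution so it is not
        # being correlated with a signal that already contains it.
        rest_of_city = city_total - curve
        scores[i] = np.corrcoef(curve, rest_of_city)[0, 1]
    return scores
```

Under these assumptions, the most “like” neighborhood is the one with the highest score (`scores.argmax()`) and the most “unlike” is the one with the lowest (`scores.argmin()`).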

We can also detect different types of demand curves. Are there neighborhoods that are more active on weekends and others that are clearly work-week hotspots? One simple mathematical technique for identifying stereotyped patterns in data is principal component analysis. Let’s jump to the results: 2 types of demand curves account for 93% of the overall demand variance. Here’s what they look like:

Uberdata: PCA demand curves weekend/weekday

Essentially you’ve got one rising demand curve that peaks on evenings and Friday and Saturday nights (red), and one workday/workweek curve that diminishes on weekends (blue). We can then ask, for each city, which neighborhood is the most “weekend-like” and which is the most “weekday-like” (that is, how strongly does each neighborhood correlate with each of these two curves)?
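Here is a rough sketch of how a PCA like this, and the follow-up correlation with each component, might be set up. Several things are assumptions made for illustration: the data layout (one row per neighborhood, one column per hour of week), mean-centering before the decomposition, computing the PCA through an SVD, and the function names. Note also that the sign of a principal component is arbitrary, so a component may come out flipped and need its sign reversed before it reads as a “weekend” or “weekday” curve.

```python
import numpy as np


def pca_demand(demand: np.ndarray, n_components: int = 2):
    """PCA over neighborhoods: returns (component curves, explained variance ratios).

    demand: array of shape (n_neighborhoods, n_hours).
    """
    # Subtract the mean curve so components describe variation around it.
    centered = demand - demand.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    # Rows of vt are the principal directions in hour-of-week space.
    return vt[:n_components], explained[:n_components]


def component_correlations(demand: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Pearson r between every neighborhood curve and every component curve."""
    n = demand.shape[0]
    # np.corrcoef stacks the two sets of rows; the top-right block holds
    # the neighborhood-versus-component correlations.
    return np.corrcoef(demand, components)[:n, n:]
```

With these two pieces, the neighborhood most like a given component is simply the row with the largest value in that component’s column.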

If we could build the perfect party city consisting only of the neighborhoods from each city that correlate most with the weekend curve, it would be:

• San Francisco: North Beach
• New York: SoHo
• Seattle: First Hill
• Chicago: Near North Side
• Boston: South Boston
• DC: Dupont Circle
• LA: Santa Monica

In contrast…

If we could build the perfect “let’s get serious about business” city, its neighborhoods would be:

• San Francisco: Financial District
• New York: Garment District
• Seattle: Overlake
• Chicago: O’Hare
• Boston: East Boston
• DC: Deanwood
• LA: Westchester

But this is looking at how neighborhoods relate to cities. What about how they relate to one another?

Well, given that we’re working with 271 neighborhoods, we’re talking about running 36,585 correlations (one for every pair of neighborhoods), which is messy to display. So we’ve pared the data down to just the strongest relationships. Play around with it by clicking the image below:

Uberdata: Neighborhood Correlations (interactive plot, built with d3.js)
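The pair count is just “271 choose 2”, and paring down to the strongest relationships could look something like the sketch below. The function name `strongest_pairs`, the `top_k` cutoff, and ranking by the most positive Pearson correlation are assumptions for illustration; the actual cutoff used for the plot isn’t specified here.

```python
from math import comb

import numpy as np

# Every unordered pair of the 271 neighborhoods: 271 * 270 / 2
n_pairs = comb(271, 2)  # 36585


def strongest_pairs(demand: np.ndarray, names, top_k: int = 10):
    """Return the top_k most positively correlated neighborhood pairs.

    demand: array of shape (n_neighborhoods, n_hours); names: one label per row.
    """
    corr = np.corrcoef(demand)
    # Upper triangle without the diagonal: each pair appears exactly once.
    i_idx, j_idx = np.triu_indices(len(names), k=1)
    r = corr[i_idx, j_idx]
    order = np.argsort(r)[::-1][:top_k]
    return [(names[i_idx[k]], names[j_idx[k]], float(r[k])) for k in order]
```

Keeping only the upper triangle avoids counting each pair twice, along with the trivial correlation of each neighborhood with itself.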

Of course this is all academic, as the thing that makes cities like San Francisco great is their diversity. I’m sure living in a perpetual party city would eventually have to get old…right?