I want to find somewhere super livable. Aside from “livable” being subjective, it is often hard to find data even when you have specific criteria. The data at Walkscore is really interesting, but I see two big problems with it:
- The data for an entire city factors in parts of the city I likely do not care about.
- The data for a single neighborhood likely covers less than the “area” I’d visit most days.
What I really want is a dataset which looks at “networks” of neighborhoods. Specifically, I want the aggregate data for a neighborhood and all other neighborhoods within a specific distance of it. To me, a network is an area which I’d “call home” and likely visit frequently. In my case, I consider anything within 3 miles of my neighborhood to be within my “network” since I can get there on foot, bike, or transit pretty easily.
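To make the idea concrete, here is a minimal sketch of building a network and aggregating a score over it. It is not the actual script: the `Neighborhood` class, the flat-map coordinates in miles, the function names, and the choice of a plain average as the aggregate are all assumptions for illustration.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Neighborhood:
    name: str
    x_miles: float     # toy flat-map position, not real Walkscore data
    y_miles: float
    walk_score: float  # made-up score for illustration


def flat_distance(a, b):
    """Straight-line distance in miles on the toy flat map."""
    return hypot(a.x_miles - b.x_miles, a.y_miles - b.y_miles)


def build_network(center, neighborhoods, radius_miles=3.0, distance_fn=flat_distance):
    """Return every neighborhood within radius_miles of center (center included)."""
    return [n for n in neighborhoods if distance_fn(center, n) <= radius_miles]


def network_walk_score(center, neighborhoods, radius_miles=3.0):
    """Aggregate by averaging walk scores across the center's network (an assumption)."""
    members = build_network(center, neighborhoods, radius_miles)
    return mean(n.walk_score for n in members)


hoods = [
    Neighborhood("A", 0.0, 0.0, 90),
    Neighborhood("B", 2.0, 0.0, 70),
    Neighborhood("C", 5.0, 0.0, 40),
]

# A's network is A and B; C is 5 miles away and falls outside the 3-mile radius.
print(network_walk_score(hoods[0], hoods))
```

Passing `distance_fn` in as a parameter means the same network logic works whether distances come from a formula or from a mapping service.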
Instead of just complaining into the wind, I decided to explore the idea of creating neighborhood networks with Walkscore. To do this, I wrote a Python script which uses the Walkscore data to create aggregate datasets for neighborhood networks. You can find the script on my GitHub repository. The input for this script is a simple CSV file with the cities and states you want to look up; the comments in the script explain the details. To calculate distance I use a simple mathematical formula; you could also use Google if you have API keys.
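For anyone wondering what a “simple mathematical formula” for distance can look like, one common choice is the haversine formula, which gives the great-circle distance between two latitude/longitude points. This is a sketch of that technique, not necessarily the formula the script uses; the function name and the Earth radius constant are my own choices here.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the Earth is not a perfect sphere


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# One degree of latitude is roughly 69 miles.
print(round(haversine_miles(0.0, 0.0, 1.0, 0.0), 1))
```

This measures distance as the crow flies, so it will understate how far you actually travel on streets or transit.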
For anyone curious, I looked up a few thousand US cities and calculated all networks within a 3-mile distance. The dataset is available here in CSV format.
There are some gotchas with the script, without question. Here are some of the biggest ones:
- If no data is found on Walkscore, a zero is used (bad-ish).
- You’re beholden to how Walkscore defines neighborhoods.
- My method of calculating distance is very approximate.
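The first gotcha matters because a zero placeholder drags a network’s average down. One possible way around it, sketched here as an assumption rather than something the script does, is to record missing data as `None` and leave it out of the average. The function name is hypothetical and the scores are made up.

```python
from statistics import mean


def average_ignoring_missing(scores):
    """Average only the scores that exist; return None if there are none."""
    present = [s for s in scores if s is not None]
    return mean(present) if present else None


scores = [88, None, 72]  # None marks a neighborhood with no data (made-up values)

print(average_ignoring_missing(scores))             # averages 88 and 72 only
print(mean(0 if s is None else s for s in scores))  # zero-filled version, pulled lower
```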
Next up – visualizing and playing with this data!