(This article was first published on AriLamstein.com » R, and kindly contributed to R-bloggers)
In my course Learn to Map Census Data in R I provide people with a handful of interesting demographics to analyze. This is convenient for teaching, but people often want to search for other demographic statistics. To address that, today I will work through an example of starting with a simple demographic question and using R to answer it.
Here is my question: I used to live in Japan, and to this day I still enjoy practicing Japanese with native speakers. If I wanted to move from San Francisco to a part of the country that has more Japanese people, where should I move?
Step 1: Find the Table for the Data
The Census Bureau stores its data in tables. One way to find the table for a particular metric is to use the function ?acs.lookup from the acs package. (Note that to run this code you will need to get and install a Census API key; I explain how to do that here.)
The Census Bureau has two “Japanese” tables: the first relates to race and the second to language. For simplicity, let’s focus on race (B02006). The “_009” at the end indicates the column of the table; each column tabulates a different Asian nationality.
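A minimal sketch of such a lookup, assuming the acs package is installed and an API key has already been set up. The helper names find_japanese_variables and split_variable_code are made up for illustration, and the 2012 end year is an assumption taken from the map title later in this post:

```r
# Hypothetical helper: search the 2012 5-year ACS for variables that mention
# "Japanese". Needs the acs package, a network connection, and an API key
# installed beforehand with acs::api.key.install().
find_japanese_variables <- function(endyear = 2012) {
  acs::acs.lookup(endyear = endyear, span = 5, keyword = "Japanese")
}

# Hypothetical helper: split a variable code such as "B02006_009" into the
# table ID and the column number within that table.
split_variable_code <- function(code) {
  parts <- strsplit(code, "_", fixed = TRUE)[[1]]
  list(table = parts[1], column = as.integer(parts[2]))
}

split_variable_code("B02006_009")  # table "B02006", column 9
```

Printing the object returned by find_japanese_variables() lists the matching variables. The table ID and column number are the two pieces you need when fetching the data.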
Step 2: Get the Data
There are a few ways to get the data from that table into R. One way is to use the function ?acs.fetch in the acs package. If your end result is to map the data with the choroplethr package, however, you might find it easier to use the function ?get_acs_data in the choroplethr package.
What’s returned is a list with 2 elements. The first element is a data frame of (region, value) pairs. The second element is the title of the column.
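The call described above might look roughly like the sketch below. It assumes the choroplethr package is installed along with a Census API key, and that get_acs_data accepts the tableId, map, endyear, span and column_idx arguments (this may differ between choroplethr versions). The wrapper name fetch_county_counts is hypothetical:

```r
# Hypothetical wrapper around choroplethr::get_acs_data(). It is only defined
# here; calling it needs a network connection and a Census API key.
fetch_county_counts <- function(table_id = "B02006", column_idx = 9) {
  acs_result <- choroplethr::get_acs_data(
    tableId    = table_id,
    map        = "county",
    endyear    = 2012,   # assumed from the map title used later in the post
    span       = 5,
    column_idx = column_idx
  )
  # First element: data frame of (region, value) pairs.
  # Second element: title of the selected column.
  list(df = acs_result[[1]], title = acs_result[[2]])
}
```

With this in place, something like df = fetch_county_counts()$df would give the data frame used in the rest of the analysis.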
Step 3: Analyze the Data
The first way to analyze the data is to simply look at the data frame.
People who have taken my course will recognize the regions as FIPS County Codes. We can use a boxplot to look at the distribution of values:
boxplot(df$value)
I draw two conclusions from this chart: 1) the median is very low and 2) there are two very large outliers.
To find out the names of the outliers we need to convert the FIPS Codes to English. We can do that by merging df with the data frame ?county.regions.
> data(county.regions)
> head(county.regions)
   region county.fips.character county.name state.name state.fips.character state.abb
1    1001                 01001     autauga    alabama                   01        AL
36   1003                 01003     baldwin    alabama                   01        AL
55   1005                 01005     barbour    alabama                   01        AL
15   1007                 01007        bibb    alabama                   01        AL
2    1009                 01009      blount    alabama                   01        AL
16   1011                 01011     bullock    alabama                   01        AL
> df2 = merge(df, county.regions)
> df2 = df2[order(-df2$value), ]
> head(df2)
     region  value county.fips.character county.name state.name state.fips.character state.abb
548   15003 150984                 15003    honolulu     hawaii                   15        HI
205    6037 103180                 06037 los angeles california                   06        CA
216    6059  33211                 06059      orange california                   06        CA
229    6085  28144                 06085 santa clara california                   06        CA
2971  53033  21493                 53033        king washington                   53        WA
223    6073  18592                 06073   san diego california                   06        CA
So the outliers are Honolulu county and Los Angeles county. San Francisco isn’t even in the top 6. So if I ever decide to give up my career in technology for a career focused on Japanese, I should move to Honolulu!
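To see exactly where a particular county falls in the ranking, a small helper can look it up by name. This is only a sketch: county_rank is a hypothetical function, and the six-row data frame below is copied from the head(df2) output above so the example runs without an API key. On the full df2 the same call would return the county's national rank.

```r
# The six rows shown in the head(df2) output above.
top6 <- data.frame(
  region      = c(15003, 6037, 6059, 6085, 53033, 6073),
  value       = c(150984, 103180, 33211, 28144, 21493, 18592),
  county.name = c("honolulu", "los angeles", "orange",
                  "santa clara", "king", "san diego"),
  state.name  = c("hawaii", "california", "california",
                  "california", "washington", "california"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: position of a county when sorted by descending value.
# Returns NA if the county is not present in the data frame.
county_rank <- function(ranked, county, state) {
  ranked <- ranked[order(-ranked$value), ]
  hit <- which(ranked$county.name == county & ranked$state.name == state)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

county_rank(top6, "king", "washington")           # 5
county_rank(top6, "san francisco", "california")  # NA: not in this excerpt
```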
It’s also easy to create a choropleth map of the values. This allows us to see the geographic distribution of the values.
library(choroplethrMaps)
county_choropleth(df, title = "2012 County Estimates:\nNumber of Japanese per County")
According to this map, by living on the west coast I am already in a part of the country with a high concentration of Japanese people.
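To look at the west coast more closely, county_choropleth can be restricted to a few states with its state_zoom argument. A minimal sketch, assuming df is the (region, value) data frame from Step 2 and that the choroplethr and choroplethrMaps packages are installed; the wrapper name map_west_coast is hypothetical:

```r
# Hypothetical wrapper: draw the same county map, zoomed in on three
# west coast states. State names must be lowercase, as in county.regions.
map_west_coast <- function(df) {
  choroplethr::county_choropleth(
    df,
    title      = "2012 County Estimates:\nNumber of Japanese per County (West Coast)",
    state_zoom = c("california", "oregon", "washington")
  )
}
```

Calling map_west_coast(df) returns a ggplot2 object, so it can be printed or saved like any other plot.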
Conclusion
If you wind up using this blog post to do an analysis of your own, or have difficulty adapting this code to your own purposes, please leave a comment below. I’m always interested in hearing what my readers are working on.
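A final note to my Japanese friends: どう思いますか?アメリカで一番興味がある場所はホノルルとロサンゼルスですか?口コミしてください! (In English: "What do you think? Are Honolulu and Los Angeles the places in America that interest you most? Please spread the word!")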
The post How to Search for Census Data from R appeared first on AriLamstein.com.