Help finding and analyzing data
Looking for data in the social sciences? For help at any point in the process, Ask Us. For support analyzing data, doing GIS, learning software, or advanced data visualization, contact the Data Lab.
You can search for data by topic using the Social Science Data Research Guide.
Finding data
- The first question to ask is: do you need microdata or aggregate data? In other words, do you need data about individuals (microdata) or combined data, usually aggregated by geographic region (for example, income ranges in a particular town)?
- To narrow down the data that would answer your research question, consider the following steps:
- Where do you want data from? A particular geography? Don't consider only location, though; also consider the unit size you need. In other words, do you want data for an entire country as a whole, or broken down by state, county, etc.? This is often the biggest constraint on what data is available.
- When do you want data from? Do you want a snapshot (e.g., population in 1999) or longitudinal data that you can compare over time? For comparisons over time, make sure the same question is being asked of the same population.
- Who do you want data about? The entire population? A particular age, gender, socioeconomic class, race, or other demographic group? Are any of the groups you want to research protected or vulnerable? If so, it will be harder to find data, and you will need to be particularly cognizant of your responsibility as a researcher to do no harm.
- What variables could answer the research question? This intentionally comes after the three previous questions, because often there are multiple variables that could address your question. If you want to know income, for example, you could look for self-reported income, tax filings, or a proxy like car model owned (which correlates closely with wealth). To find out what a variable actually means, check the codebook, which will describe how that variable was collected and what it represents. Also ask what type of data you need: microdata or aggregate data?
- Why would someone gather this data? What biases does that introduce? Is Coca-Cola funding this research on sugar? If you're having trouble finding data, this question can also help you identify stakeholders who may gather data, which you can then intentionally search for.
- How is this data gathered, and how does that affect your analysis? If people self-report height and wealth, for example, research shows they consistently overestimate both.
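To make the microdata/aggregate distinction above concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the records, the town labels, the income cut-offs, and the helper names income_range and aggregate_by_town are all invented, and real datasets will define their brackets in the codebook.

```python
from collections import Counter

# Hypothetical microdata: one record per individual.
# All values are invented for illustration only.
microdata = [
    {"town": "Town A", "income": 32000},
    {"town": "Town A", "income": 58000},
    {"town": "Town A", "income": 91000},
    {"town": "Town B", "income": 47000},
    {"town": "Town B", "income": 120000},
]


def income_range(income):
    """Assign an income to an illustrative bracket (cut-offs are arbitrary)."""
    if income < 50000:
        return "under 50k"
    if income < 100000:
        return "50k-100k"
    return "100k and over"


def aggregate_by_town(records):
    """Collapse individual records into counts per (town, income range)."""
    return Counter((r["town"], income_range(r["income"])) for r in records)


# Aggregate data: counts of people per income range in each town.
aggregate = aggregate_by_town(microdata)
for (town, bracket), count in sorted(aggregate.items()):
    print(town, bracket, count)
```

Note that the aggregate table can be built from the microdata, but the individual records cannot be recovered from the aggregate table. If you may later want different groupings, microdata is likely the better starting point.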
Top data sources
These sources often hold the data that social science students and researchers at Tufts are looking for.
- Social Explorer Interactive data portal that includes census and American Community Survey data, as well as other federal data, 1750-present, with a variety of geographies. Downloads available.
- Data.Gov: Census Data Official US Government portal for Census and American Community Survey Data.
- Inter-University Consortium for Political and Social Research (ICPSR) Gold standard of curation, cleaning, and description of datasets. Includes government data as well as researcher-created data, including election data, opinion polls, and more.
- IPUMS (Integrated Public Use Microdata Series) U.S. Census and American Community Survey microdata from 1850 to the present. Also has microdata from the Current Population Survey (labor force), nearly 100 international censuses, Time Use surveys, and more.