12 Quantitative research
This is not really supposed to be a statistics course, but to be a good researcher you need to be able to use and write about data analysis and to understand the analyses you read. This is true even if you are 100% committed to qualitative research.
This project will be a secondary data analysis. That is, you will use a data set that has already been collected and cleaned. We have already looked at the General Social Survey (GSS), which has covered almost any topic about adults in the United States at some point in its 50-year history (see the GSS Access section below). However, there are many more data sets available. For example, ICPSR claims to host “19,043 studies” that collectively include over 6 million variables. There are other archives as well. The Harvard Dataverse has a wide collection of data sets, and Data.gov is the central location for many federal data sets that can otherwise be hard to find.
Some data sets are part of a kind of shared language among researchers on a specific topic. Add Health plays this role for people who study adolescent health, while the National Education Longitudinal Study of 1988 (aka NELS:88) is used in so many sociology articles that it’s hard to keep track of them.
For this project you will need to identify a research topic and a data set to address it with. We will use R to do any data management that is necessary, and I will help you with this. You will also need to do a mini literature review, which would normally include reading at least one publication that uses the data set. These publications are usually listed where you found the data, but, if not, you can search using the authors’ names and the name of the data set.
Your work on this project should follow the basic outline of a research poster or paper. I’m going to share poster examples, but you don’t need to make an actual poster; just follow the general structure. Of course, the examples do not all follow the exact same outline. I’m happy to show you how to make a poster if you are interested, but the real point of putting it this way is that this is not a full-blown paper.
General Outline
- Introduction, including background literature
- Hypotheses or questions
- Description of data source
- Analyses including visualizations
- Conclusions
- Takeaways/implications/further research
- References
Suggestions
There are a number of directions you can go with this, depending on your interests. Here are a few:
- You could do a crosstabulation with a single dependent (outcome) variable and two or more independent variables.
- You could do a regression analysis.
- You could examine measurement reliability or validity for a measure in one study.
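To make the three suggestions above more concrete, here is a rough sketch of each in base R. Everything in it is an assumption for illustration: the data are simulated, and the variable names (happy, degree, sex, age, tvhours, item1 to item3) are made up rather than taken from any particular data set. The reliability example uses Cronbach's alpha, which is only one of several ways to look at reliability.

```r
# Simulated stand-in data: every variable here is made up for illustration.
set.seed(2024)
n <- 300
survey <- data.frame(
  happy  = factor(sample(c("Not too happy", "Pretty happy", "Very happy"),
                         n, replace = TRUE)),
  degree = factor(sample(c("No degree", "Degree"), n, replace = TRUE)),
  sex    = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  age    = sample(18:89, n, replace = TRUE)
)
# A numeric outcome built to depend on age, plus random noise.
survey$tvhours <- 1 + 0.03 * survey$age + rnorm(n)

# 1. Crosstabulation: one outcome (happy) by two independent variables.
tab <- xtabs(~ happy + degree + sex, data = survey)
print(ftable(tab))  # counts, flattened so the 3-way table is readable

# Percentages of the outcome within each degree-by-sex group.
pct <- prop.table(tab, margin = c(2, 3)) * 100
print(ftable(round(pct, 1)))

# 2. Regression: a numeric outcome predicted by age and degree.
fit <- lm(tvhours ~ age + degree, data = survey)
print(summary(fit))

# 3. Reliability: Cronbach's alpha for three simulated scale items
#    that all reflect the same underlying trait.
trait <- rnorm(n)
items <- data.frame(
  item1 = trait + rnorm(n),
  item2 = trait + rnorm(n),
  item3 = trait + rnorm(n)
)
k <- ncol(items)
alpha <- (k / (k - 1)) * (1 - sum(sapply(items, var)) / var(rowSums(items)))
print(round(alpha, 2))
```

With real data you would replace the simulated data frame with your own extract and swap in your own variable names. Survey data usually also need attention to missing-value codes before any of these steps.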
Visualizations could include histograms, bar charts, scatterplots or any other option that is appropriate for your data.
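As a minimal sketch of the chart types just mentioned, the following uses base R graphics on simulated data. The variable names and the output file name are hypothetical. The plots are written to a PDF in a temporary folder so the script runs anywhere; when working interactively you could drop the pdf() and dev.off() lines and view the plots directly.

```r
# Simulated stand-in data: variable names are made up for illustration.
set.seed(1)
n <- 200
survey <- data.frame(
  age    = sample(18:89, n, replace = TRUE),
  degree = factor(sample(c("No degree", "Degree"), n, replace = TRUE))
)
# Numeric outcome that rises with age; pmax() keeps it from going below zero.
survey$tvhours <- pmax(0, 1 + 0.03 * survey$age + rnorm(n))

# Send all plots to one PDF file (one plot per page).
out_file <- file.path(tempdir(), "example_plots.pdf")
pdf(out_file)

# Histogram: distribution of one numeric variable.
hist(survey$age, main = "Age of respondents", xlab = "Age in years")

# Bar chart: counts for each category of a categorical variable.
barplot(table(survey$degree), main = "Respondents by degree",
        ylab = "Number of respondents")

# Scatterplot: relationship between two numeric variables,
# with a fitted straight line added on top.
plot(survey$age, survey$tvhours, main = "TV hours by age",
     xlab = "Age in years", ylab = "Hours of TV per day")
abline(lm(tvhours ~ age, data = survey))

dev.off()  # close the PDF so the file is written
```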
GSS Access
To get to the GSS you will need to create an account on the GSS Data Explorer. Then search for the variables you want and save them to MyGSS.
See https://gssdataexplorer.norc.org/search_variables
Next you will create an extract from the GSS Data Explorer containing just your variables, plus the variables that are always included (id and year), for the years of data you need. This can and should include subsetting to get just the records you want.
To work with the extract, you need to save your data somewhere you can retrieve it from. The easiest way is to upload it to a Google spreadsheet. If you do not have a Google account and don’t want to create one, you can email me your spreadsheet. You could also potentially use Microsoft 365, but please let me know if you want to do that, because you will need to load a different package to work with it.
Follow the instructions for creating an extract, selecting Excel Workbook with Meta data as the format.
This explains how to create an extract.
In my experience it works best to use the download option rather than exporting to Google Docs directly. So once your extract is completed, check the box next to its name, download it and then upload it to a Google spreadsheet.
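If you want to check that a downloaded workbook opens properly in R before you upload it, one possible approach is sketched below. It assumes the readxl package is installed, which is a choice made for this example and not necessarily the package we will use in class. So that the code runs as written, it reads a sample workbook that ships with readxl; the commented-out file name for your own extract is hypothetical.

```r
library(readxl)  # assumed to be installed: install.packages("readxl")

# Stand-in workbook that ships with readxl. For your project, point `path`
# at your downloaded extract instead (this file name is hypothetical):
# path <- "my_gss_extract.xlsx"
path <- readxl_example("datasets.xlsx")

# List the sheets so you can see which one holds the data.
print(excel_sheets(path))

# Read the first sheet into a data frame and look at its structure.
dat <- read_excel(path, sheet = 1)
str(dat)
```

Checking the sheet names first is useful because a workbook that includes metadata will have more than one sheet, and you only want to analyze the one with the actual records.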
Some examples
https://www.bgsu.edu/arts-and-sciences/sociology/undergraduate-program/research/undergradresearch/capstone-soc-4800/capstone-2018.html These all use secondary data analysis so they are particularly appropriate to look at.
https://guides.library.illinois.edu/poster/ex
https://urca.msu.edu/poster-samples
https://soc.washington.edu/sites/soc/files/documents/chen_2020_honors_poster_session.pdf
https://libguides.humboldt.edu/c.php?g=880315&p=6323841 (links to many more examples on right)
Advice
https://colinpurrington.com/tips/poster-design/ (scroll down for outline)
How this will be graded
- Does it include an introduction, hypothesis or question, data analysis, conclusion, implications and references (these could be labelled in different ways)?
- Does it accurately refer to at least two pieces of background literature?
- Is the data analysis appropriate, correctly implemented and accurately explained?
- Do the conclusions and implications follow from the rest of the analysis?
- Do the parts of the document complement each other and tell a single story?
- Does the document show sophisticated thinking about the topic?