Section 7: Final Meeting

Sam Frederick

Columbia University

4/25/23

Biases and Errors in Research

[Diagram: research workflow Question → Theory → Hypotheses → Data Collection → Hypothesis Testing. Data Collection is linked to Measurement Error, Sampling Error, Nonresponse Bias, and Confounding Variables; Hypothesis Testing is also linked to Confounding Variables.]

Biases and Errors in Research

[Diagram: research workflow Question → Theory → Hypotheses → Data Collection → Hypothesis Testing. Data Collection is linked to Measurement Error, Sampling Error, Nonresponse Bias, Response Bias, and Confounding Variables; Hypothesis Testing is also linked to Confounding Variables.]

Response Bias: Social Desirability Bias

  • Did you vote in the last election?
    • Depending on who you’re with, you might be more or less likely to answer this question honestly

Voter Turnout

  • Let’s look at some survey results:
library(tidyverse)  # loads ggplot2 and dplyr, which the later slides use
ces <- read.csv("ces.csv")
  • What do these results tell us about turnout in elections?

Voter Turnout

  • How would we make a scatterplot of turnout from the survey?
ces 

Voter Turnout

ces |> 
  ggplot(aes(year, voted_turnout_self_nona))

Voter Turnout

ces |> 
  ggplot(aes(year, voted_turnout_self_nona))+ 
  geom_point()

Voter Turnout

ces |> 
  ggplot(aes(year, voted_turnout_self_nona))+ 
  geom_point()+
  ylim(c(0, 100))

Voter Turnout

ces |> 
  ggplot(aes(year, voted_turnout_self_nona))+ 
  geom_point()+
  ylim(c(0, 100))+
  scale_x_continuous(breaks = seq(2006, 2022, 2))

Voter Turnout

ces |> 
  ggplot(aes(year, voted_turnout_self_nona))+ 
  geom_point()+
  ylim(c(0, 100))+
  scale_x_continuous(breaks = seq(2006, 2022, 2))+
  labs(x = "Year", y = "Turnout")

Voter Turnout

  • What about the true turnout?
turnout <- read.csv("turnout.csv")
ces <- ces |> 
  left_join(turnout, by = "year")

Voter Turnout

ces |>
  ggplot(aes(year, voted_turnout_self_nona)) +
  geom_point(aes(color = "Self-Reported Turnout")) +  
  scale_x_continuous(breaks = seq(2006, 2020, 2)) +
  ylim(c(0, 100))+
  labs(x = "Year", y = "Turnout")

Voter Turnout

ces |>
  ggplot(aes(year, voted_turnout_self_nona)) +
  geom_point(aes(color = "Self-Reported Turnout")) +  
  scale_x_continuous(breaks = seq(2006, 2020, 2)) +
  ylim(c(0, 100))+
  labs(x = "Year", y = "Turnout")+
  geom_point(aes(year, turnout, color = "True Turnout"))
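
The gap visible in the plot can also be computed directly. The following is a minimal sketch in base R, assuming the joined data has the columns `voted_turnout_self_nona` and `turnout` used above. The data frame `toy` and every number in it are invented placeholders, not actual CES or official turnout figures.

```r
# Toy stand-in for the joined data; all values are invented placeholders
toy <- data.frame(
  year = c(2016, 2018, 2020),
  voted_turnout_self_nona = c(80, 75, 85),  # hypothetical self-reported turnout (%)
  turnout = c(60, 50, 65)                   # hypothetical true turnout (%)
)

# Overreport: self-reported turnout minus true turnout, in percentage points
toy$overreport <- toy$voted_turnout_self_nona - toy$turnout
toy

# Average overreport across the years in the toy data
mean(toy$overreport)
```

A consistently positive gap across years would be hard to attribute to sampling error alone, since random error should not push the estimate in the same direction every time.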

Voter Turnout

  • What’s going on here?
    • Sampling Error
    • Social Desirability Bias
    • False/Incorrect Memories

Reducing Social Desirability Bias

  • Survey Medium (Face-to-Face, Phone, Online)
  • Question Wording
  • Anonymized Question Designs

Anonymized/Secret Question Designs

  • List Experiments:
    • Generate a list of items, including the sensitive item you are really interested in
    • Randomly assign participants to “treatment” and “control”
      • Treatment receives the sensitive item + other items
      • Control receives only other items
    • Ask participants how many of the items apply to them

Anonymized/Secret Question Designs

  • List Experiments:
    • Difference in means between treatment and control
      • Estimates the proportion of people to whom the sensitive item applies
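
To see why the difference in means recovers this proportion, here is a minimal simulation sketch in base R. Everything in it is an assumption made for illustration: the sample size, the three control items, and the 30% prevalence of the sensitive item (`true_share`).

```r
set.seed(1)
n <- 20000
true_share <- 0.30  # hypothetical share to whom the sensitive item applies

# Random assignment: 1 = treatment (list includes the sensitive item), 0 = control
treat <- rbinom(n, 1, 0.5)

# Number of the 3 control items that apply to each respondent
control_count <- rbinom(n, 3, 0.5)

# Whether the sensitive item applies (the researcher never observes this directly)
sensitive <- rbinom(n, 1, true_share)

# Respondents report only a total count of items that apply
reported <- control_count + treat * sensitive

# Difference in mean counts estimates the prevalence of the sensitive item
estimate <- mean(reported[treat == 1]) - mean(reported[treat == 0])
estimate
```

Because assignment is random, the control items contribute the same amount on average to both groups, so the difference reflects only the sensitive item.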

Anonymized/Secret Question Designs

  • Things to look out for with List Experiments
    • Ceiling Effects and Floor Effects:
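      • Control items should mix items expected to apply to most people with items expected to apply to few
      • Violating this principle removes anonymity: if all (or none) of the control items apply, a treated respondent’s count reveals their answer on the sensitive item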
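
One way to check for ceiling effects is to look at how many control-group respondents say that every control item applies. The sketch below uses simulated data in base R. The item prevalences are hypothetical, chosen only to contrast a poorly designed list with a better mixed one.

```r
set.seed(3)
n <- 1000

# Hypothetical poorly designed list: all 3 control items apply to almost everyone
easy_items <- rbinom(n, 3, 0.95)

# Hypothetical better list: one common item, one mixed item, one rare item
mixed_items <- rbinom(n, 1, 0.9) + rbinom(n, 1, 0.5) + rbinom(n, 1, 0.1)

# Share of respondents at the ceiling (all 3 control items apply)
ceiling_easy <- mean(easy_items == 3)
ceiling_mixed <- mean(mixed_items == 3)
c(easy = ceiling_easy, mixed = ceiling_mixed)
```

A treated respondent at the ceiling would have to answer 4 if the sensitive item applies, which reveals it. The same check with `== 0` flags floor effects.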

Anonymized/Secret Question Designs

  • List Experiments
  • Coin flipping, rolling a die
    • Tell respondents to flip a coin
    • If the outcome is heads, give a fixed, pre-specified answer (e.g., “yes”)
    • If the outcome is tails, give your true answer
    • We know the proportion who flipped heads (0.5), so the true proportion on the sensitive answer is (observed share giving that answer - 0.5) / 0.5
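
The coin-flip design can be sketched with a short simulation in base R. It assumes that the forced answer on heads is “yes”, that “yes” is the sensitive answer, and that 30% of respondents would truthfully say “yes”. All of these values are hypothetical.

```r
set.seed(2)
n <- 20000
true_share <- 0.30  # hypothetical true share of "yes"

# Each respondent privately flips a fair coin
heads <- rbinom(n, 1, 0.5) == 1

# Each respondent's truthful answer (never observed directly)
truth <- rbinom(n, 1, true_share) == 1

# Heads: say "yes" regardless; tails: answer truthfully
answer_yes <- ifelse(heads, TRUE, truth)

# Observed share of "yes" is about 0.5 + 0.5 * true_share
observed_yes <- mean(answer_yes)

# Solve for the true share
estimate <- (observed_yes - 0.5) / 0.5
estimate
```

No individual “yes” is revealing, because the researcher cannot tell whether it came from the coin or from the truth. Only the aggregate share is informative.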

Qualitative or Case Study Designs

  • Choosing Cases:
    • Think carefully about which cases will help you best test your theory
    • Select across the range of your independent variable(s)
    • Be aware of the potential limitations of generalization and specific context of cases
    • Consider cases where your theory doesn’t hold
      • Does theory need to be refined?
      • Do these cases suggest limitations/conditions of theory?

Qualitative or Case Study Designs

  • Combining Qualitative and Quantitative
    • Can use mixed methods
      • supplement case study with quantitative analysis

Qualitative or Case Study Designs

  • Some Potential Benefits:
    • In-Depth knowledge about certain cases
    • Better understanding of motivations of political actors
    • Better understanding of context
    • Can bring area-relevant background knowledge of language, culture, etc.

Qualitative or Case Study Designs

  • Some Potential Drawbacks:
    • Representativeness/Generalizability
    • Causal inference can be challenging
      • Need more observations than causes
      • Hard to establish counterfactuals

Biases and Errors in Research

[Diagram: research workflow Question → Theory → Hypotheses → Data Collection → Hypothesis Testing. Data Collection is linked to Measurement Error, Sampling Error, Nonresponse Bias, Response Bias, and Confounding Variables; Hypothesis Testing is also linked to Confounding Variables.]