Workshop: Hypothesis Tests

Author

Adam Roberts (adapted from Curt Signorino)

Published

February 11, 2026

Introduction

This is NOT a HW. You will not be graded on your answers. You are very much encouraged to discuss the problems with your classmates. Answers will be provided on Blackboard once the last workshop has had the opportunity to work on these questions.

Setup

For this practice HW, we return to the world of Dogtopia. You’re a curious young reporter working for the Daily Growler. You’ve just fielded your first survey using random sampling. It’s a public opinion poll concerning how dogs feel about cats, among other things. For the survey, you randomly sampled N=500 eligible voters in Dogtopia. Your survey data is in dog-Oct.rdata. The codebook has the same file name but is a PDF.

# Load required packages
library(tidyverse)

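# Load the data (adjust the path to wherever you saved dog-Oct.rdata)
load("C:/Users/adamd/Downloads/dog-Oct.rdata")
# Copy the columns of `dog` into an environment and attach it,
# so that variables such as `cats` can be referenced by name
dog_env <- list2env(as.list(dog))
attach(dog_env, name = "dog_env")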

Problem 1

Consider the variable cats. This is a “feeling thermometer” rating on a scale of 0 (hate cats) to 100 (love cats), with 50 being neutral. Your friend Spot claims that the true Dogtopian average feeling towards cats is 50. Conduct a classical hypothesis test of Spot’s claim using an \(\alpha\) = .05 level of statistical significance.

(a) Formally state the null and alternative hypotheses.

\(H_0: \mu = 50\)
\(H_1: \mu \neq 50\)

(b) Calculate the sample mean of cats, the standard deviation, and the sample size.

# Calculate sample statistics
sample_mean <- mean(cats, na.rm = TRUE)
sample_sd <- sd(cats, na.rm = TRUE)
sample_size <- sum(!is.na(cats))

# Display results
cat("Sample mean:", sample_mean, "\n")
cat("Sample standard deviation:", sample_sd, "\n")
cat("Sample size:", sample_size, "\n")

(c) Given the sample size, what distribution and test statistic is appropriate for this hypothesis test?

Given the large sample size (n = 500), we use the normal distribution and the Z test statistic.

(d) Calculate the test statistic under the null.

# Calculate test statistic
null_mean <- 50
se <- sample_sd / sqrt(sample_size)
z_stat <- (sample_mean - null_mean) / se

cat("Standard error:", se, "\n")
cat("Z test statistic:", z_stat, "\n")

(e) What is the critical value associated with an \(\alpha\) = .05 significance level?

# Two-tailed test
alpha <- 0.05
critical_value <- qnorm(1 - alpha/2)

cat("Critical value (two-tailed):", critical_value, "\n")
cat("Rejection region: |Z| >", critical_value, "\n")

(f) Interpret the test and state your conclusion. Can we reject the null hypothesis at the \(\alpha\) = .05 level?

# Decision
reject_null <- abs(z_stat) > critical_value

cat("Test statistic:", abs(z_stat), "\n")
cat("Critical value:", critical_value, "\n")
cat("Reject null hypothesis:", reject_null, "\n")

The absolute value of the test statistic is much larger than the critical value, so we reject the null hypothesis of neutrality towards cats at \(\alpha\) = 0.05.
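The steps in parts (b) through (f) can be collected into one small function. The sketch below is illustrative only: `z_test_from_summary()` is a hypothetical helper, not something from the course materials, and the inputs in the usage line are round, made-up numbers rather than the Dogtopia results.

```r
# Hypothetical helper: two-tailed Z test for a mean, from summary statistics
z_test_from_summary <- function(xbar, s, n, mu0, alpha = 0.05) {
  se <- s / sqrt(n)               # standard error of the sample mean
  z <- (xbar - mu0) / se          # test statistic under the null
  crit <- qnorm(1 - alpha / 2)    # two-tailed critical value
  p_value <- 2 * pnorm(-abs(z))   # two-tailed p-value
  list(se = se, z = z, critical_value = crit,
       p_value = p_value, reject = abs(z) > crit)
}

# Made-up inputs for illustration (NOT the survey results)
example <- z_test_from_summary(xbar = 52, s = 10, n = 100, mu0 = 50)
str(example)
```

Comparing `abs(z)` with the critical value and comparing the p-value with `alpha` always lead to the same decision, so the helper reports both.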

Problem 2

Your other friend Digger claims that the true Dogtopian average feeling towards cats is 23.45. Conduct a classical hypothesis test of Digger’s claim using an \(\alpha\) = .01 level of statistical significance.

(a) Formally state the null and alternative hypotheses.

\(H_0: \mu = 23.45\)
\(H_1: \mu \neq 23.45\)

(b) Calculate the test statistic under the null.

# Calculate test statistic
null_mean_2 <- 23.45
z_stat_2 <- (sample_mean - null_mean_2) / se

cat("Z test statistic:", z_stat_2, "\n")

(c) What is the critical value associated with an \(\alpha = 0.01\) significance level?

# Two-tailed test
alpha_2 <- 0.01
critical_value_2 <- qnorm(1 - alpha_2/2)

cat("Critical value (two-tailed):", critical_value_2, "\n")
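For comparison with Problem 1, note that the two-tailed critical value grows as \(\alpha\) shrinks, so a smaller \(\alpha\) demands stronger evidence before rejecting. A quick sketch using only base R, with the .10 level added purely for illustration:

```r
# Two-tailed critical values from the standard normal for common alpha levels
alphas <- c(0.10, 0.05, 0.01)
crit_vals <- qnorm(1 - alphas / 2)

data.frame(alpha = alphas, critical_value = round(crit_vals, 3))
```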

(d) Interpret the test and state your conclusion. Can we reject the null hypothesis at the \(\alpha\) = .01 level?

# Decision
reject_null_2 <- abs(z_stat_2) > critical_value_2

cat("Test statistic:", abs(z_stat_2), "\n")
cat("Critical value:", critical_value_2, "\n")
cat("Reject null hypothesis:", reject_null_2, "\n")

Since the absolute value of the test statistic is smaller than the critical value, we fail to reject the null hypothesis at the \(\alpha = .01\) level.

(e) Calculate the p-value for this test. Based on the p-value, would you reject the null hypothesis at the .05 level? Explain.

# Calculate p-value (two-tailed)
p_value_2 <- 2 * (1 - pnorm(abs(z_stat_2))) # equivalently, 2 * pnorm(-abs(z_stat_2))

cat("P-value:", p_value_2, "\n")
cat("Reject at alpha = 0.05:", p_value_2 < 0.05, "\n")
cat("Reject at alpha = 0.01:", p_value_2 < 0.01, "\n")

The p-value of .217 is the smallest level of significance \(\alpha\) at which we could reject the null hypothesis. We could reject the null for values of \(\alpha\) > .217, but we would fail to reject it for values of \(\alpha\) < .217. Since \(\alpha\) = .05 is smaller than our p-value, we fail to reject the null hypothesis at the .05 level.
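The decision rule can be checked for several significance levels at once. This sketch takes the p-value of .217 reported above as given; the level .25 is not a conventional choice and is included only to show a case where the null would be rejected.

```r
# p-value reported above for Problem 2
p_value <- 0.217

# Candidate significance levels (0.25 is included only to show a rejection)
alphas <- c(0.01, 0.05, 0.10, 0.25)

# Reject the null whenever the p-value is below alpha
decisions <- data.frame(alpha = alphas, reject = p_value < alphas)
decisions
```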

Problem 3

Consider the variable coat.short. Your other other friend Shaggy claims that only half of Dogtopians have short-haired coats. Test the hypothesis that half of Dogtopians have short-haired coats. Conduct the test by evaluating a p-value.

(a) Formally state the null and alternative hypotheses.

\(H_0: \pi = 0.5\)
\(H_1: \pi \neq 0.5\)

(b) Calculate the sample proportion of Dogtopians with short hair. Is this more than 50%? Less than 50%?

# Calculate sample proportion
sample_prop <- mean(coat.short, na.rm = TRUE)
n_prop <- sum(!is.na(coat.short))

cat("Sample proportion:", sample_prop, "\n")
cat("Sample size:", n_prop, "\n")

This proportion (0.512) is larger than 50%.

(c) Calculate the standard error of the sample proportion. Remember to calculate this assuming the null is true.

# Standard error under the null
null_prop <- 0.5
se_prop <- sqrt(null_prop * (1 - null_prop) / n_prop)

cat("Standard error (under null):", se_prop, "\n")
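To see what "assuming the null is true" changes, the sketch below compares the standard error computed from \(\pi_0 = 0.5\) with the one computed from the sample proportion. It assumes the values reported in this problem (sample proportion .512, n = 500); the variable names are introduced here for illustration.

```r
n <- 500          # sample size from the survey
p_hat <- 0.512    # sample proportion reported in part (b)
pi0 <- 0.5        # null hypothesized proportion

se_null <- sqrt(pi0 * (1 - pi0) / n)         # used for the hypothesis test
se_sample <- sqrt(p_hat * (1 - p_hat) / n)   # typically used for a confidence interval

c(se_null = se_null, se_sample = se_sample)
```

The two are nearly identical here because the sample proportion is close to .5, but they can differ noticeably when the null value is far from the sample proportion.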

(d) Calculate the test statistic under the null.

# Z test statistic for proportions
z_stat_3 <- (sample_prop - null_prop) / se_prop

cat("Z test statistic:", z_stat_3, "\n")

(e) Calculate the p-value for this test. Based on the p-value, would you reject the null hypothesis at the .05 level? Explain.

# Calculate p-value (two-tailed)
p_value_3 <- 2 * (1 - pnorm(abs(z_stat_3)))

cat("P-value:", p_value_3, "\n")
cat("Reject at alpha = 0.05:", p_value_3 < 0.05, "\n")

No. The p-value (about 0.59) is larger than \(\alpha = 0.05\), so we fail to reject the null hypothesis.
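As a cross-check, base R's `prop.test()` should reproduce the hand calculation when the continuity correction is turned off. This sketch assumes 256 short-haired respondents out of 500, which is inferred from the sample proportion of .512 and presumes no missing values.

```r
# 0.512 * 500 = 256 short-haired respondents (assumes no missing values)
x_short <- 256
n_resp <- 500

# correct = FALSE turns off the continuity correction to match the hand calculation
prop_out <- prop.test(x = x_short, n = n_resp, p = 0.5,
                      alternative = "two.sided", correct = FALSE)

# Hand calculation, as in parts (c) through (e)
z_hand <- (x_short / n_resp - 0.5) / sqrt(0.5 * 0.5 / n_resp)
p_hand <- 2 * pnorm(-abs(z_hand))

c(chi_squared = unname(prop_out$statistic), z_squared = z_hand^2)
c(p_prop_test = prop_out$p.value, p_hand = p_hand)
```

`prop.test()` reports a chi-squared statistic with one degree of freedom, which equals the square of the Z statistic, so the two p-values agree.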

Problem 4

Create a new variable cats11 that consists of only the first eleven observations of cats. Suppose cats11 is all the data you have on Dogtopian voters’ feeling towards cats. Your friend ChiChi claims that the population average for feelings towards cats is 25.

(a) Formally state the null and alternative hypotheses.

\(H_0\): \(\mu = 25\)
\(H_1\): \(\mu \neq 25\)

(b) Use cats11 to calculate the sample mean for feeling towards cats. Calculate the sample standard deviation.

# Create cats11
cats11 <- cats[1:11]

# Calculate statistics
mean_11 <- mean(cats11, na.rm = TRUE)
sd_11 <- sd(cats11, na.rm = TRUE)
n_11 <- sum(!is.na(cats11))

cat("Sample mean:", mean_11, "\n")
cat("Sample standard deviation:", sd_11, "\n")
cat("Sample size:", n_11, "\n")

(c) Given the sample size, what distribution and test statistic is appropriate for this hypothesis test?

Given the small sample size (n = 11), we use the t-distribution and the t test statistic with n-1 degrees of freedom.
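The practical difference between the two distributions shows up in the critical values. The sketch below compares two-tailed 5% critical values from the t-distribution at a few degrees of freedom with the normal value; the intermediate values 30 and 100 are included only for illustration, while 10 and 499 correspond to n = 11 and n = 500.

```r
# Two-tailed 5% critical values: t-distribution at several degrees of freedom
dfs <- c(10, 30, 100, 499)
t_crit <- qt(0.975, df = dfs)
z_crit <- qnorm(0.975)

data.frame(df = dfs, t_critical = round(t_crit, 3), z_critical = round(z_crit, 3))
```

With 10 degrees of freedom the t critical value is noticeably larger than the normal one, while at 499 degrees of freedom the two are almost the same.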

(d) Calculate the test statistic under the null.

# Calculate t test statistic
null_mean_4 <- 25
se_11 <- sd_11 / sqrt(n_11)
t_stat_4 <- (mean_11 - null_mean_4) / se_11

cat("Standard error:", se_11, "\n")
cat("t test statistic:", t_stat_4, "\n")
cat("Degrees of freedom:", n_11 - 1, "\n")

(e) Calculate the p-value for this test. Based on the p-value, would you reject the null hypothesis at the .05 level? Explain.

# Calculate p-value (two-tailed)
df_4 <- n_11 - 1
p_value_4 <- 2 * (1 - pt(abs(t_stat_4), df = df_4))

cat("P-value:", p_value_4, "\n")
cat("Reject at alpha = 0.05:", p_value_4 < 0.05, "\n")

The p-value = .8281 is larger than .05, so we fail to reject the null hypothesis at the .05 level of significance.
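The hand calculation in parts (b) through (e) can be checked against base R's `t.test()`. The vector below is hypothetical data made up for illustration, not the first eleven survey responses; with the real data, `t.test(cats11, mu = 25)` would be the analogous call.

```r
# Hypothetical ratings for illustration only (NOT the survey responses)
x_demo <- c(12, 30, 25, 8, 41, 19, 27, 33, 15, 22, 38)
mu0 <- 25

# Hand calculation, mirroring parts (b) through (e)
n_demo <- length(x_demo)
t_hand <- (mean(x_demo) - mu0) / (sd(x_demo) / sqrt(n_demo))
p_hand <- 2 * pt(-abs(t_hand), df = n_demo - 1)

# Built-in one-sample t test
tt <- t.test(x_demo, mu = mu0, alternative = "two.sided")

c(t_hand = t_hand, t_builtin = unname(tt$statistic))
c(p_hand = p_hand, p_builtin = tt$p.value)
```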

Problem 5

If you have time, try to analyze the following.

We saw in lecture that, although we typically conduct a hypothesis test for a single null hypothesized value, we could choose any (valid) value for the mean or proportion. Moreover, there are many values of the null for which we would reject the null, and many for which we would fail to reject it. This exercise is an attempt to demonstrate that.

Let’s return to the hypothesis test in Q3 on the proportion of short-haired Dogtopians. Given the sample proportion that you calculated in Q3, for what values of the null hypothesis would you reject the null hypothesis? We’ll assume an \(\alpha\) = .05 level of significance.

(a) Write down the Z test statistic, as in Q3(d). Plug in for the sample proportion. However, don’t plug in for the null hypothesized \(\pi_0\) or the standard error under the null. Note that the standard error is a function of \(\pi_0\). If the proportion p and sample size n are taken as given, we can think of the Z test statistic as an equation in terms of null hypothesized values \(\pi_0\).

Answer: Sample proportion of Dogtopians with short-haired coats and sample size:

p <- mean(coat.short, na.rm = TRUE)
p
n <- sum(!is.na(coat.short))
n

Given the sample proportion and sample size, the hypothesis test

\(H_0\): \(\pi = \pi_0\)
\(H_1\): \(\pi \neq \pi_0\)

implies a standard error under the null of

\(se_0 = \sqrt{\frac{\pi_0(1 - \pi_0)}{n}} = \sqrt{\frac{\pi_0(1 - \pi_0)}{500}}\)

and a Z test statistic of

\(z = \frac{p - \pi_0}{se_0} = \frac{.512 - \pi_0}{\sqrt{\pi_0(1 - \pi_0)/500}}\)

(b) In R, use seq() to create a sequence of values from .4 to .6 in steps of .01. Assign that to pi0.seq. Each element of pi0.seq will represent a null hypothesized value \(\pi_0\).

pi0.seq <- seq(0.4, 0.6, 0.01)
pi0.seq

(c) In R, plug pi0.seq into your equation for the standard error under the null. Call that vector se0. Each element of se0 will now be the standard error under the null, corresponding to each null hypothesized value in pi0.seq.

se0 <- sqrt(pi0.seq * (1 - pi0.seq) / n)
se0

(d) In R, plug in the sample proportion, pi0.seq, and se0 into your equation for the Z test statistic. Assign that to z. Each element of z is the test statistic you would calculate if you were to use the null value \(\pi_0\) corresponding to the same element in pi0.seq.

z <- (p - pi0.seq) / se0
z

(e) In R, execute rbind(pi0.seq, z). This should print a 2 x 21 matrix (or table) of values, where the top row shows the values of pi0.seq and the bottom row shows the corresponding z statistics.

rbind(pi0.seq, z)

(f) Suppose we were to use an \(\alpha\) = .05 level of significance for a hypothesis test. For which values of pi0.seq would we reject the null hypothesis? For which values of pi0.seq would we fail to reject the null hypothesis?

# Critical value for two-tailed test
crit_val <- qnorm(0.975)

# Determine which nulls we would reject
reject <- abs(z) > crit_val

# Create a summary table
results <- data.frame(
  pi0 = pi0.seq,
  z_stat = round(z, 3),
  reject = reject
)

print(results)

# Summary
cat("\nCritical value:", crit_val, "\n")
cat("Reject null for pi_0 values:", pi0.seq[reject], "\n")
cat("Fail to reject null for pi_0 values:", pi0.seq[!reject], "\n")
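The grid above only checks null values in steps of .01. Setting the absolute value of z equal to the critical value and solving the resulting quadratic in \(\pi_0\) gives the exact boundary between rejecting and failing to reject; this range of non-rejected nulls is commonly known as the Wilson score interval. The sketch below assumes the values used in this problem (sample proportion .512, n = 500) and is offered as a cross-check, not as part of the assigned solution.

```r
p_hat <- 0.512            # sample proportion from Q3
n_obs <- 500              # sample size
z_crit <- qnorm(0.975)    # two-tailed critical value for alpha = .05

# Solve (p_hat - pi0)^2 = z_crit^2 * pi0 * (1 - pi0) / n_obs for pi0
center <- (p_hat + z_crit^2 / (2 * n_obs)) / (1 + z_crit^2 / n_obs)
half_width <- z_crit / (1 + z_crit^2 / n_obs) *
  sqrt(p_hat * (1 - p_hat) / n_obs + z_crit^2 / (4 * n_obs^2))
bounds <- c(lower = center - half_width, upper = center + half_width)
round(bounds, 4)

# Cross-check against the grid: nulls inside the bounds are not rejected
pi0_grid <- seq(0.4, 0.6, 0.01)
z_grid <- (p_hat - pi0_grid) / sqrt(pi0_grid * (1 - pi0_grid) / n_obs)
fail_to_reject <- pi0_grid[abs(z_grid) <= z_crit]
range(fail_to_reject)
```

With these inputs the bounds come out to roughly .468 and .556, which agrees with the grid: the null values from .47 to .55 are not rejected, and those at .46 or below and .56 or above are rejected.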

Visualization:

ggplot(results, aes(x = pi0, y = z_stat, color = reject)) +
  geom_point(size = 3) +
  geom_hline(yintercept = c(-crit_val, crit_val), linetype = "dashed", color = "red") +
  geom_hline(yintercept = 0, linetype = "dotted") +
  #geom_line() +
  labs(
    title = "Z Statistics for Different Null Hypothesized Values",
    x = "Null Hypothesized Proportion (pi0)",
    y = "Z Test Statistic",
    color = "Reject H0?"
  ) +
  theme_minimal() +
  scale_color_manual(values = c("FALSE" = "blue", "TRUE" = "red"))