The goal of this lab session is to learn how to calculate correlations and perform Chi-square tests using R. In addition, you will learn how to analyze questions where participants can select multiple answers. We will continue using the OPTI/QWERTY data set.
Before beginning, ensure that you have already completed Lab 1 (Descriptive Statistics) and Lab 2 (Inferential Statistics Part 1) and loaded your dataset into R.
Remember: Try to solve the questions on your own, or with the help of your fellow students, before checking the answers at the end of this document. Now let’s start doing some inferential statistics!
Correlation measures the strength of the relationship between two
variables. One interesting possible relationship is between the number
of messages a person sends per day and their average typing speed (WPM)
on the OPTI or QWERTY keyboard. Initially, we will use our questionnaire
data only. Prepare by loading
questionnaire_data_R_2024.xlsx.
A good starting point for examining the relationship between two variables is to create a scatterplot. For example, to plot Age (X-axis) vs. Hours_of_computer_use_per_day (Y-axis):
library(ggplot2)
# Replace 'questionnaire_data' with the name of your own data frame if it differs
ggplot(questionnaire_data, aes(x = Age, y = Hours_of_computer_use_per_day)) +
  geom_point() +
  labs(title = "Scatterplot of Age vs. Hours_of_computer_use_per_day",
       x = "Age",
       y = "Hours of Computer Use")
Next, compute the correlation. For ratio/interval variables, Pearson’s r is commonly used. For example:
cor_test_result <- cor.test(questionnaire_data$Age, questionnaire_data$Hours_of_computer_use_per_day, method = "pearson")
cor_test_result
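The object returned by cor.test() is a list, so the numbers needed for an APA-style report can be pulled out by name. The following is a minimal sketch on simulated data; the variables age and hours are made up for illustration and are not the lab data set.

```r
# Simulated data for illustration only; not the lab data set
set.seed(1)
age   <- rnorm(30, mean = 25, sd = 4)
hours <- rnorm(30, mean = 6, sd = 2)

res <- cor.test(age, hours, method = "pearson")

r_value <- unname(res$estimate)   # Pearson's r
df      <- unname(res$parameter)  # degrees of freedom (N - 2)
p_value <- res$p.value            # two-sided p-value

# Assemble the statistic in the form r(df) = ..., p = ...
cat(sprintf("r(%d) = %.2f, p = %.3f\n", df, r_value, p_value))
```

With your own data you would replace age and hours with the two columns of interest, as in the cor.test() call above.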
Questions:
Example APA-style report: “A Pearson correlation analysis revealed that there was no significant relationship between age and hours of computer use, r(64) = .08, p = .508.”
The Chi-square test allows us to test whether there is a significant relationship between two ordinal/nominal variables. More specifically, it tests whether the observed frequencies differ significantly from the frequencies we would expect to find if the two variables were unrelated.
Let’s dive into our data set and investigate whether female students send more messages per day than male students. To investigate this, we need to check whether the observed frequencies of messages sent differ from the expected frequencies for the different genders.
If gender and messaging are unrelated, we would expect the same proportion of female and male students to send a lot of messages, and the same proportion of female and male students to send few messages. The Chi-square test tells us whether this is the case.
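To make the idea of expected frequencies concrete: under the null hypothesis, the expected count in a cell is its row total times its column total, divided by N. The sketch below works this out by hand for a small table of hypothetical counts (not the lab data) and then computes the Chi-square statistic as the sum of (observed - expected)^2 / expected over all cells.

```r
# Hypothetical counts for illustration only; not the lab data
observed <- matrix(c(10, 20, 10,
                     15, 15, 10),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Sex = c("Female", "Male"),
                                   Messages = c("10 or less", "11 to 50", "More than 50")))

# Expected count per cell = row total * column total / N
n        <- sum(observed)
expected <- outer(rowSums(observed), colSums(observed)) / n

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi_sq <- sum((observed - expected)^2 / expected)
df     <- (nrow(observed) - 1) * (ncol(observed) - 1)
p      <- pchisq(chi_sq, df = df, lower.tail = FALSE)

expected
c(chi_sq = chi_sq, df = df, p = p)
```

In practice chisq.test() does all of this for you; the manual version is only meant to show where the numbers come from.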
Before running the Chi-square test, ensure your messaging variable is recoded into ordinal groups. Re-run your recoding code from Stats1 if needed:
library(dplyr)
questionnaire_data <- questionnaire_data %>%
  mutate(MessagesCategory = case_when(
    Messages_per_day <= 10 ~ "10 or less",
    Messages_per_day >= 11 & Messages_per_day <= 50 ~ "11 to 50",
    Messages_per_day > 50 ~ "More than 50"
  ))
Next, set the order of the categories:
questionnaire_data$MessagesCategory <- factor(questionnaire_data$MessagesCategory, levels = c("10 or less", "11 to 50", "More than 50"), ordered = TRUE)
This turns MessagesCategory from a nominal into an ordinal scale.
Then, let us just remind ourselves of the data:
Now let’s see whether there is a relationship between gender and the number of messages our sample sends per day, or whether messaging is equally likely regardless of gender (our null hypothesis).
A contingency table displays the observed frequencies of respondents in each combination of categories. In our case, it shows how many males and females fall into each MessagesCategory (for example, “10 or less”, “11 to 50”, etc.). This table is the foundation for performing the Chi-square test:
tab <- table(questionnaire_data$Sex, questionnaire_data$MessagesCategory)
addmargins(tab) # adds row/column sums
The Chi-square test evaluates whether the differences in frequencies across categories are statistically significant compared to what we would expect if there were no association between gender and messaging behavior. A significant result (p < 0.05) would indicate that the observed differences in messaging frequency between genders are unlikely to be due to chance:
# Perform the Chi-square test
chi_test <- chisq.test(tab)
chi_test
Question:
Example APA-style report: “A Chi-square test revealed no significant relationship between gender and messages sent per day, χ²(2, N = 64) = 1.10, p = .78.”
Note: For the test to be reliable, ensure that the expected frequency in each cell of your contingency table is at least 5.
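The rule of thumb in the note is usually checked against the expected counts, which chisq.test() stores in the expected element of its result. The sketch below uses a hypothetical table (tab_demo is made up for illustration); with the lab data you would inspect chi_test$expected in the same way.

```r
# Hypothetical contingency table for illustration only; not the lab data
tab_demo <- as.table(matrix(c(12, 18, 9,
                              14, 16, 7),
                            nrow = 2, byrow = TRUE,
                            dimnames = list(Sex = c("Female", "Male"),
                                            MessagesCategory = c("10 or less", "11 to 50", "More than 50"))))

chi_demo <- chisq.test(tab_demo)

# Expected counts under the null hypothesis of no association
chi_demo$expected

# TRUE if every cell meets the rule of thumb
all(chi_demo$expected >= 5)
```

If some expected counts fall below 5, R also prints a warning that the Chi-square approximation may be incorrect.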
Although the contingency table and Chi-square test offer numerical insights, a visual display helps illustrate the differences in distributions between men and women. A faceted chart is ideal for comparing messages-per-day distributions between groups.
You can find a tutorial here. Skip to the first mention of facet_wrap. Be aware of a few differences:
- The tutorial uses geom_bar(stat = "identity"), which requires a predefined y-value. Instead, we will use geom_bar(stat = "count"), which automatically counts observations.
- Because the counts are computed for you, there is no need to specify y in aes().
- Which should be the facet_wrap variable: Sex or MessagesCategory?
- The tutorial uses a fixed fill color (fill="forest green"). Instead, you can set the fill based on MessagesCategory by adding fill = MessagesCategory inside aes(). Looks nicer :-) (And allows for color-matching between the two distributions).
It is time to address the question of whether one of the keyboards, OPTI or QWERTY, is in fact faster than the other if we control for a number of variables (age, number of messages sent per day, group 1 or 2).
RQ: If we control for the possible effects of age and of how many messages the participants send in their daily life, can our statistical model predict that the participants type significantly faster on the QWERTY keyboard than on the OPTI keyboard?
To address the question, we can do a hierarchical regression. There are many other types of regression models, but we do not have time to go through them all. Which model is appropriate depends on the size and type of your dataset and on the purpose of your investigation; importantly, the models also differ in the amount of control you have over the statistical analysis. Doing a hierarchical regression means we enter the predictor (independent) variables in steps, in an order we decide. In the first step, we use Keyboard, Group and Messages_per_day in the analysis. In the second step, we add Age. Now we can see how much of the variance in WPM our variables can explain.
We will answer the research question using our
keyboard_data dataframe, so be sure to load that one first.
In R, you can perform hierarchical regression using the
lm() function and compare models using
anova():
# Model 1: Predict WPM using Group, Keyboard and Messages_per_day
model1 <- lm(WPM ~ Group + Keyboard + Messages_per_day, data = keyboard_data)
print(summary(model1))
# Model 2: Add Age to the predictors
model2 <- lm(WPM ~ Group + Keyboard + Messages_per_day + Age, data = keyboard_data)
print(summary(model2))
# Compare the models to see the added variance explained by Age
anova(model1, model2)
Check the R-squared values in the model summaries. After the variables in model1 (Keyboard, Group and Messages_per_day) have been entered, the overall model explains 20.7% of the variance in average WPM (Multiple R-squared). After Age has also been entered in model2, the model as a whole explains 21.7% of the variance in average WPM.
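The R-squared values can also be extracted programmatically, which makes the change between steps easy to compute. Below is a minimal sketch on simulated data; the data frame demo and the variables x1, x2, x3 and y are hypothetical stand-ins, not the keyboard data.

```r
# Simulated data for illustration only; variable names are hypothetical
set.seed(42)
n <- 100
demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
demo$y <- 2 + 0.5 * demo$x1 + 0.3 * demo$x2 + rnorm(n)

step1 <- lm(y ~ x1 + x2, data = demo)        # first step
step2 <- lm(y ~ x1 + x2 + x3, data = demo)   # second step adds x3

r2_step1  <- summary(step1)$r.squared
r2_step2  <- summary(step2)$r.squared
r2_change <- r2_step2 - r2_step1   # variance explained by x3 beyond step 1

# The F test in anova() tests whether this R-squared change differs from zero
comparison <- anova(step1, step2)

round(c(R2_step1 = r2_step1, R2_step2 = r2_step2, R2_change = r2_change), 3)
comparison
```

The F and Pr(>F) columns in the second row of the anova() output correspond to the "F change" and its p-value that are typically reported for a hierarchical regression.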
Question:
Let’s see how much each of the variables contributes to our regression. To compare the relative impact of different predictors, we can standardize the coefficients:
library(lm.beta)
# Run regression and get standardized coefficients
model2_standardized <- lm.beta(model2)
summary(model2_standardized)
This extended summary table provides key information about the relationship between each predictor and the dependent variable (WPM):
- Estimate Coefficient (B): Represents the expected change in WPM for a one-unit increase in the predictor, keeping other variables constant. The scale depends on the original units of measurement.
- Standardized Coefficient (β): Expresses the coefficient in standard-deviation units (z-scores), making predictors comparable in terms of effect size. A larger absolute β means a stronger influence on WPM.
- p-value (Pr(>|t|)): If p < 0.05, the predictor has a statistically significant impact on WPM.
- Create model3 (and 4 and 5 and …) by modifying the set of predictors. Consider adding new variables (e.g., Sex) and removing insignificant ones (e.g., Group). How does this affect the model’s explanatory power?
When predictors are analyzed together in a regression model, their significance can differ from individual t-tests due to several key reasons:
We can report our results like this: Hierarchical multiple regression was used to assess whether the average typing speed on the OPTI keyboard was affected by the participant’s age, the number of messages they send per day in their daily life, and whether they tested the OPTI keyboard before the QWERTY keyboard. Test group and messages per day were entered at step 1, explaining 9.4% of the total variance in average WPM. After entry of Age in step 2, the total variance explained by the model was 11.6%, F(3, 60) = 2.6, p > .05. Age explained an additional 2.2% of the variance in WPM after controlling for group and messages sent per day, R-squared change = .022, F change(1, 60) = 1.491, p > .05. In the final model, test group was a statistically significant predictor (Beta = 0.271, p < .05), whereas messages sent per day (Beta = -0.14, p > .05) and Age (Beta = -0.158, p > .05) were not.
Alternatively, one can also report the coefficients table. For an example, see here.
Note: The effect of Group might not be significant, but you would still want to report it with your model1. It tells us whether ‘transfer learning’ occurred from QWERTY to OPTI or vice versa, which our within-subjects design controlled for.
We suspect that for our dataset it is more likely that experience with the QWERTY keyboard has a greater effect on WPM for the QWERTY keyboard than the OPTI keyboard. Our participants were used to working on the QWERTY keyboard and had most likely never worked with the OPTI keyboard before. Our experiment is biased in favour of the QWERTY keyboard.
This bias suggests that participants who frequently send messages should benefit even more when using the QWERTY keyboard. This is called an interaction effect: we expect Messages_per_day to interact with Keyboard in predicting WPM. Specifically, we hypothesize that higher message frequency improves WPM more for QWERTY than for OPTI.
In R’s formula notation, interactions are specified using the * or : operators:
- A * B adds both main effects (A and B) and their interaction term (A × B).
- A : B includes only the interaction term, without main effects.
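The difference between the two operators can be seen in a small sketch on simulated data. The data frame demo and the variables device, practice and speed are hypothetical and only mimic the structure of a factor interacting with a numeric predictor; they are not the keyboard data.

```r
# Simulated data for illustration only; variable names are hypothetical
set.seed(7)
n <- 80
demo <- data.frame(
  device   = factor(rep(c("A", "B"), each = n / 2)),
  practice = runif(n, min = 0, max = 100)
)
# Build in a steeper practice slope for device B (the interaction)
demo$speed <- 20 + 0.05 * demo$practice +
  ifelse(demo$device == "B", 0.10 * demo$practice, 0) + rnorm(n, sd = 2)

m_star <- lm(speed ~ device * practice, data = demo)
m_long <- lm(speed ~ device + practice + device:practice, data = demo)

# Both formulas produce the same terms and the same estimates
names(coef(m_star))
all.equal(coef(m_star), coef(m_long))

# The row "deviceB:practice" holds the interaction estimate and its p-value
summary(m_star)$coefficients
```

The interaction row estimates how much the slope of practice differs between the two levels of device, which is the kind of effect hypothesized above for Messages_per_day and Keyboard.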
Create a model5 that includes Keyboard * Messages_per_day as a predictor in your formula to test for an interaction effect. Can you confirm an interaction effect?
Work through the exercises, compare your results with the examples provided, and discuss any discrepancies with your peers and instructors.
Happy analyzing!