8.2 Setting Up a Chi Square Goodness of Fit Test

4 min read • May 12, 2021

Josh Argo

https://firebasestorage.googleapis.com/v0/b/fiveable-92889.appspot.com/o/images%2F-oHqW0os90SdJ.png?alt=media&token=966efd92-c98e-4b06-9b81-c22c64d3d1e6


Image From Statkat

Goodness of Fit

The first type of chi-square test we will run is a chi-square goodness of fit test. A goodness of fit test (GOF on your calculator) is used to evaluate how well the distribution of one categorical variable with multiple categories matches a claimed distribution. Previously, when working with one categorical variable, we were limited to two categories, so only binary outcomes.
For instance, we could analyze whether people in a group answered yes or no, but we could not analyze answers given on a scale of 1-5. Since a scale of 1-5 has 5 categories that participants could fall into, a 1-Prop Z Test does not apply, so we need something a bit more complex: a chi-square goodness of fit test.

Parameters

It is important to specify what our parameters are when performing inference. In the case of chi-squared GOF tests, we will have multiple population proportions that we are trying to check against a claim.
For example, suppose we survey a group of people about their happiness on a scale of 1-5, with 5 being the happiest, and we have a claim that says:
  • 10% said they were unhappy (1),
  • 15% said they were somewhat unhappy (2),
  • 28% said they were sometimes happy and sometimes sad (3),
  • 30% said they were happy (4), and
  • 17% said they were always happy (5)
Then the parameters we are testing are the true proportions of 1s, 2s, 3s, 4s and 5s.

Hypotheses

Null Hypothesis
Just as with any inference test, we must have both a null hypothesis and an alternative hypothesis. Our null hypothesis states that nothing is different from the original claim: the true proportion in every category equals its claimed value.
In the example of our happiness scale of 1-5, our null hypothesis would be as follows:
Ho: p1 = 0.10, p2 = 0.15, p3 = 0.28, p4 = 0.30, p5 = 0.17
It is very important to include context when writing our hypotheses. In this example, the subscripts 1 through 5 supply context, since the problem deals with survey scores of 1-5. It is also a good idea to define each parameter explicitly, for example p1 = true proportion of people who rated their happiness as 1, and likewise for the other scores.
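Before moving on, it can help to confirm that the claimed proportions form a valid distribution, since a goodness of fit null hypothesis only makes sense when the proportions add up to 1. The following is a minimal sketch of that check for the happiness-scale claim; the names `null_proportions` and `is_valid_null` are illustrative and not part of any calculator or library.

```python
import math

# Claimed proportions from the happiness-scale null hypothesis (scores 1-5).
null_proportions = {1: 0.10, 2: 0.15, 3: 0.28, 4: 0.30, 5: 0.17}


def is_valid_null(proportions):
    """Return True if every proportion lies strictly between 0 and 1
    and the proportions sum to 1 (within floating-point tolerance)."""
    values = list(proportions.values())
    in_range = all(0 < p < 1 for p in values)
    sums_to_one = math.isclose(sum(values), 1.0)
    return in_range and sums_to_one


print(is_valid_null(null_proportions))
```

If this check fails, the claim itself is misstated and should be fixed before any test is run.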
Alternative Hypothesis
Our alternative hypothesis is normally very simple. It is best to just state that at least one of the proportions in our null hypothesis is incorrect. Since all of our proportions add up to 100%, if one null proportion is incorrect, at least one other must be incorrect as well.
For example, on the happiness scale problem as noted above, our alternative hypothesis would be:
Ha: At least one of the proportions measuring people's happiness is incorrect.
As always, context is key, and leaving it out can cost you points!

Conditions

Chi-square tests share two conditions with previous inference tests:
  • Random: Our sample must be random
  • 10% condition: Our population must be at least 10 times our sample size
Instead of checking for a normal distribution, we have to make sure that all of our expected counts are at least 5.
In our happiness scale example, we would take our sample size and multiply it by 0.1, 0.15, 0.28, 0.3 and 0.17 to ensure that we would expect at least 5 responses to fall into each category.
Note: If the data come from an experiment with random assignment of treatments, the random assignment satisfies the random condition, and the 10% condition does not need to be checked.
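The large counts check described above is just a multiplication per category, which can be sketched in a few lines. The sample size of 200 is a hypothetical value chosen for illustration, since the happiness example does not state one, and the function names are likewise made up for this sketch.

```python
# Claimed proportions for happiness scores 1 through 5 (from the null hypothesis).
null_proportions = [0.10, 0.15, 0.28, 0.30, 0.17]

# Hypothetical sample size; the original example does not give one.
sample_size = 200


def expected_counts(n, proportions):
    """Expected count in each category = sample size * claimed proportion."""
    return [n * p for p in proportions]


def large_counts_ok(counts, minimum=5):
    """True when every expected count is at least the minimum (5 by default)."""
    return all(count >= minimum for count in counts)


counts = expected_counts(sample_size, null_proportions)
print(counts)
print(large_counts_ok(counts))
```

With a much smaller sample, say 40 people, the expected count for score 1 would be 40(0.1) = 4, and the condition would fail.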

Example

A recent survey claimed that when people chose their favorite among Harry Potter, Lord of the Rings and Star Wars, the choices were evenly split, with ⅓ picking each of the series.
To test this claim, a random sample of 2500 US adults was surveyed about their favorite movie/book series. To set up this test, write your hypotheses and check the conditions for inference.
Hypotheses and parameter
Ho: pHP = 1/3, pSW = 1/3, pLOTR = 1/3
Ha: At least one of the proportions of favorite movie/book series is incorrect.
pHP = true proportion of US adults who prefer Harry Potter,
pSW = true proportion of US adults who prefer Star Wars,
pLOTR = true proportion of US adults who prefer Lord of the Rings
Conditions
  • Random: "A random sample of 2500 US adults" (quote the problem)
  • Independence: It is reasonable to believe that there are at least 25,000 adults in the US, which is 10 times our sample of 2500 (10% condition)
  • Large Counts: 2500(1/3) ≈ 833.3 ≥ 5 (same for all three series)
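The arithmetic behind these conditions can be sketched as follows. This is only an illustration under stated assumptions: the variable names are hypothetical, and `Fraction` is used so that 1/3 is stored exactly and is not rounded to 0.33.

```python
from fractions import Fraction

sample_size = 2500  # from the problem statement

# Claimed proportions under the null hypothesis: an even three-way split.
claimed = {
    "Harry Potter": Fraction(1, 3),
    "Star Wars": Fraction(1, 3),
    "Lord of the Rings": Fraction(1, 3),
}

# 10% condition: the population must be at least 10 times the sample size.
min_population = 10 * sample_size

# Large counts: expected count = sample size * claimed proportion.
expected = {series: float(sample_size * p) for series, p in claimed.items()}
large_counts_met = all(count >= 5 for count in expected.values())

print(min_population)
for series, count in expected.items():
    print(f"{series}: {count:.1f}")
print(large_counts_met)
```

The random condition cannot be verified by calculation; it comes from how the problem describes the sample.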
In the next section, we will finish the problem by going through and calculating our test statistic and p-value based on our actual counts from our sample.