Chi-square/Fisher’s Exact Test
Independence Testing
A test of independence is a statistical test that determines whether two categorical variables are associated with each other. The chi-square test and Fisher’s exact test, both of which apply to contingency tables, are common approaches to independence testing.
Fisher’s exact test is an exact test, while the chi-square test relies on a large-sample approximation. When more than 20% of the cells have expected frequencies below 5, Fisher’s exact test is preferable to the chi-square test because the approximation is inadequate.
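The rule of thumb above can be checked directly from a table of observed counts. The following is a minimal sketch in base R. The helper name prop_small_expected is hypothetical (it is not part of any package), and the counts are the drug/placebo data used in the example code later in this article.

```r
# Hypothetical helper: share of cells whose expected count is below a threshold.
prop_small_expected <- function(observed, threshold = 5) {
  # Expected count under independence: (row total * column total) / grand total
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  mean(expected < threshold)
}

# Observed counts: rows are groups, columns are outcomes
observed <- matrix(c(3, 7,
                     2, 8),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("drug", "placebo"),
                                   success = c("No", "Yes")))

share <- prop_small_expected(observed)

# Prefer Fisher's exact test when more than 20% of cells fall below 5
use_fisher <- share > 0.20
```

For these counts, two of the four expected frequencies are 2.5, so the share is 0.5 and the sketch points to Fisher’s exact test.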
If the p-value of the test statistic is less than the chosen significance level, the association between the two variables is statistically significant.
Chi-square Test
The chi-square test is a nonparametric statistical hypothesis test. The null hypothesis is that the observed frequencies are consistent with the expected frequencies of certain events in a sample. If the frequency distribution of one categorical variable does not differ across the groups defined by another categorical variable, the two variables can be considered independent.
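The test statistic is:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

which, under the null hypothesis, approximately follows a $\chi^2$ distribution with $(r - 1)(c - 1)$ degrees of freedom, where $O_{ij}$ is the observed frequency in cell $(i, j)$, $E_{ij}$ is the expected count, $r$ is the number of rows of the table, and $c$ is the number of columns.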
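As a worked illustration, the statistic can be computed by hand: sum (observed - expected)^2 / expected over all cells, then compare against a chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom. The sketch below uses base R only and reuses the drug/placebo counts from the example code later in this article. The variable names are illustrative.

```r
# Observed counts: rows are drug / placebo, columns are No / Yes
observed <- matrix(c(3, 7,
                     2, 8),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Test statistic and degrees of freedom
chi_sq <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail p-value from the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Cross-check against chisq.test(); the continuity correction is switched off
# so that it matches the plain formula. The small-expected-count warning is
# suppressed on purpose here.
builtin <- suppressWarnings(chisq.test(observed, correct = FALSE))
```

Note that chisq.test() applies Yates' continuity correction to 2 x 2 tables by default, so its default output differs from the uncorrected statistic computed here.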
Fisher’s Exact Test
Fisher’s exact test is based on the hypergeometric distribution of the counts in the cells of the contingency table. A 2 x 2 contingency table is shown below:
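
          A        Not A    Total
B         a        b        a + b
Not B     c        d        c + d
Total     a + c    b + d    n

The probability of obtaining such a frequency distribution, with the row and column totals held fixed, is:

$$p = \frac{(a + b)!\,(c + d)!\,(a + c)!\,(b + d)!}{a!\,b!\,c!\,d!\,n!}$$

where $a$, $b$, $c$, and $d$ are the cell counts and $n = a + b + c + d$ is the total number of observations.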
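The probability of a single 2 x 2 table can be sketched in base R. Writing the cell counts as a, b (first row) and c, d (second row), the probability with fixed margins equals choose(a + b, a) * choose(c + d, c) / choose(n, a + c), which is the hypergeometric density. The counts below are the drug/placebo data from the example code later in this article. The variable names are illustrative, and the two-sided p-value is computed in the usual way, as the total probability of all tables no more likely than the observed one.

```r
# Cell counts: first row (a, b), second row (cc, d)
a <- 3;  b <- 7
cc <- 2; d <- 8   # named cc to avoid masking R's c()
n <- a + b + cc + d

# Probability of this exact table with all margins held fixed
p_table <- choose(a + b, a) * choose(cc + d, cc) / choose(n, a + cc)

# Same value from the hypergeometric density:
# a "successes" among a + cc draws, from a + b successes and cc + d failures
p_dhyper <- dhyper(a, a + b, cc + d, a + cc)

# Two-sided p-value: sum the probabilities of every possible table that is
# no more probable than the observed one (small tolerance for ties)
x <- max(0, a + cc - (cc + d)):min(a + b, a + cc)
probs <- dhyper(x, a + b, cc + d, a + cc)
p_two_sided <- sum(probs[probs <= p_table * (1 + 1e-7)])

# Cross-check against fisher.test() on the same table
builtin <- fisher.test(matrix(c(a, cc, b, d), nrow = 2))
```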
Some statistical software packages, such as SAS, support Fisher’s exact test on general r x c tables.
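R's fisher.test() likewise accepts tables larger than 2 x 2. The following is a minimal sketch. The 2 x 3 counts and the category labels are hypothetical, made up purely for illustration, and do not come from the data used elsewhere in this article.

```r
# Hypothetical 2 x 3 table, for illustration only
tab <- matrix(c(3, 1, 4,
                1, 5, 2),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("drug", "placebo"),
                              response = c("none", "partial", "full")))

# Fisher's exact test on an r x c table
res <- fisher.test(tab)
res$p.value
```

For tables larger than 2 x 2, the result contains a p-value but no odds-ratio estimate, since a single odds ratio is only defined in the 2 x 2 case.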
Example Code in SAS
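/* Read (GROUP, SUCCESS) pairs; @@ holds the input line so that */
/* several pairs can be read from the same line                 */
DATA PERSONS;
  INPUT GROUP $ SUCCESS $ @@;
  DATALINES;
DRUG NO DRUG NO DRUG NO DRUG YES
DRUG YES DRUG YES DRUG YES DRUG YES
DRUG YES DRUG YES
PLACEBO NO PLACEBO NO PLACEBO YES PLACEBO YES
PLACEBO YES PLACEBO YES PLACEBO YES PLACEBO YES
PLACEBO YES PLACEBO YES
;
RUN;

/* Cross-tabulate GROUP by SUCCESS and request the chi-square test, */
/* Fisher's exact test, and the expected cell counts                */
PROC FREQ DATA = PERSONS;
  TABLES GROUP * SUCCESS / NOPERCENT NOCOL NOROW CHISQ FISHER EXPECTED;
RUN;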
Example Code in R
# input data
success <- c("No", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
             "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes")
group <- c("drug", "drug", "drug", "drug", "drug", "drug", "drug", "drug",
           "drug", "drug", "placebo", "placebo", "placebo", "placebo",
           "placebo", "placebo", "placebo", "placebo", "placebo", "placebo")
# create a data frame
df <- data.frame(success, group)
# contingency table
table(df)
# compute expected frequencies on the contingency table
# (chisq.test() warns here because some expected counts are small)
xsq <- chisq.test(df$success, df$group)
xsq$expected
# 50% of cells have expected counts less than 5, so use Fisher's exact test
fisher.test(df$success, df$group)
References
1. Kim, H. Y. (2017). Statistical notes for clinical researchers: Chi-squared test and Fisher's exact test. Restorative Dentistry & Endodontics, 42(2), 152–155. https://doi.org/10.5395/rde.2017.42.2.152
2. Hoffman, J. I. E. (2015). Biostatistics for Medical and Biomedical Practitioners. Academic Press. https://doi.org/10.1016/B978-0-12-802387-7.00013-5