```{r setup, include=FALSE}
  knitr::opts_chunk$set(echo = TRUE)
  knitr::opts_chunk$set(dev = 'pdf')
  def.chunk.hook  <- knitr::knit_hooks$get("chunk")
  knitr::knit_hooks$set(chunk = function(x, options) {
    x <- def.chunk.hook(x, options)
    ifelse(options$size != "normalsize", paste0("\n \\", options$size,"\n\n", x, "\n\n \\normalsize"), x)
  })
```




## Last time {.t}
Let $X$ be the number of heads when we flip a coin 6 times. We have:

\vspace{3mm}
$H_0$: the coin is fair; $p=0.5$

$H_A$: the coin is unfair; $p \neq 0.5$

where $p$ is the true probability of heads with this coin. 

\vspace{8mm}

With a rejection region of $\{X=0, X=1, X=5, X=6\}$, the probability of making a Type I Error was:
```{r, size="scriptsize"}
dbinom(x=0, size=6, prob=0.5) + dbinom(x=1, size=6, prob=0.5) + 
  dbinom(x=5, size=6, prob=0.5) + dbinom(x=6, size=6, prob=0.5)
```
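The same tail sum can also be accumulated with `pbinom` (a quick cross-check of the calculation above, not part of the original derivation):

```{r, size="scriptsize"}
# P(X <= 1) + P(X >= 5) under the fair-coin null, via cumulative probabilities
pbinom(1, size=6, prob=0.5) + pbinom(4, size=6, prob=0.5, lower.tail=FALSE)
# 0.21875, matching the dbinom sum
```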




## Last time {.t}
What if we used a smaller rejection region, say $\{X=0, X=6\}$?

\vspace{10mm}
Then the probability of making a Type I error becomes:

```{r, size="scriptsize"}
dbinom(x=0, size=6, prob=0.5) + dbinom(x=6, size=6, prob=0.5)
```

Side question: if shrinking the rejection region leads to a lower probability of making a Type I Error, then why don't we always just use the smallest possible rejection region?


## Type I Error with a t-Test {.t}
Suppose we want to know if there is evidence that the average amount of sleep that a UCSD student gets is \underline{different from} 6 hours per night. We take a random sample of five students and ask each how many hours of sleep they got on the previous night, obtaining the data below:

$$
3, 7, 1, 2, 2
$$

\pause

::: {.block}
### Our hypotheses are:
\vspace{-4mm}
$$
\begin{aligned}
H_0: \mu = 6 \\
H_A: \mu \neq 6
\end{aligned}
$$
:::

Now how do we make a decision?

## Type I Error with a t-Test {.t}

::: {.block}
### The t-Test:
\vspace{-2mm}
We calculate:

$$
t_s = \frac{\overline{x} - \mu_0}{s / \sqrt{n}} = \ldots
$$
:::


```{r}
sleep <- c(3, 7, 1, 2, 2)
t_s <- (mean(sleep) - 6) / (sd(sleep) / sqrt(5))
t_s
```

We then compare this to its null distribution\ldots

## Type I Error with a t-Test {.t}
\label{tdist}

```{r, echo=FALSE}
t <- seq(-4, 4, by=0.01)
f_t <- dt(t, df=4)
plot(f_t ~ t, type='l', main="Null distribution: t with df=4", ylab="f(t)", cex.lab=1.5)
polygon(c(t[t>=qt(0.025, df=4, lower.tail=FALSE)], max(t), qt(0.025, df=4, lower.tail=FALSE)), c(f_t[t>=qt(0.025, df=4, lower.tail=FALSE)], 0, 0), col="gray75")
polygon(c(t[t<=qt(0.025, df=4)], qt(0.025, df=4), min(t)), c(f_t[t<=qt(0.025, df=4)], 0, 0), col="gray75")
```


## Type I Error with a t-Test {.t}
::: {.block}
### At $\alpha=0.05$, the critical values that delineate the rejection region are:
```{r}
qt(0.025, df=4)
qt(0.025, df=4, lower.tail=FALSE)
```
:::

\pause 
So our test statistic of `r round(t_s, 3)` is in the rejection region, meaning that we would reject $H_0$. Specifically, the p-value is:
```{r}
2 * pt(-abs(t_s), df=4)  # two-sided: double the tail area beyond |t_s|
```


## Type I Error with a t-Test {.t}
Also note that there is a built-in R function, `t.test`, that performs the t-Test for us and gives exactly the same answer as the manual calculations from the previous slides:
```{r}
t.test(sleep, mu=6)
```


## Type I Error with a t-Test {.t}
And if we just want to extract the p-value:

```{r}
model1 <- t.test(sleep, mu=6)
names(model1)
model1$p.value
```

So either way works, and again, we reject $H_0$ at $\alpha=0.05$. But\ldots

## Type I Error with a t-Test {.t}
\label{typei}

::: {.block}
### Might we have made a Type I Error?
\vspace{-2mm}
We can't know for sure in any given case, but we CAN know the probability of making a Type I Error.

:::

If the conditions of the test are met, the probability of making a Type I Error should just be the $\alpha$ level of the test (here, 0.05). Why?

https://pollev.com/chi


## Type I Error with a t-Test {.t}
\label{conditions}
But wait, what conditions?

https://pollev.com/chi

## Type I Error with a t-Test {.t}
And what are the consequences of performing a test when its conditions are not met?


\vspace{4mm}

::: {.block}
### What happens if the conditions of your test were not met?
\vspace{-2mm}
Primarily, your probability of making a Type I error may be inflated.
:::

In other words, if you thought you were running a 0.05-level test, but the conditions required for the test were not met and you ran it anyway, then it might not actually be a 0.05-level test!

\vspace{4mm}

(P.S. this is really bad!!)


## Type I Error with a t-Test {.t}
\framesubtitle{Sidenote: why is it called a t-Test (or also the "Student's t-Test")?}
![](student.png){height=90%}



## Type I Error with a t-Test {.t}
\framesubtitle{Sidenote: why is it called a t-Test (or also the "Student's t-Test")?}

 - If $\sigma$ is known and the conditions just mentioned are met, 
 $\frac{\overline{x} - \mu_0}{\sigma / \sqrt{n}}$ follows a normal distribution. But usually, $\sigma$ is NOT known! Recall:
   - $\sigma$ is the population standard deviation
   - $s$ is the sample standard deviation

\vspace{2mm}

 - Prior to this work, people treated $\frac{\overline{x} - \mu_0}{s / \sqrt{n}}$ as if it also followed a normal distribution, knowing that the approximation broke down for small samples but not knowing how to fix it.
   - Since $s$ is calculated from the data, $s$ itself has variability, which is what messes things up.

\vspace{2mm}

 - What "Student" did was derive the probability distribution of $\frac{\overline{x} - \mu_0}{s / \sqrt{n}}$ for any sample size (e.g. the one on Slide \ref{tdist} when $n=5$).
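To see concretely why the normal approximation fails for small $n$, compare the two-sided critical values at $\alpha = 0.05$ (a quick illustration, not from the original derivation):

```{r, size="scriptsize"}
qnorm(0.975)        # normal: about 1.96
qt(0.975, df=4)     # t with df=4: about 2.78, much heavier tails
qt(0.975, df=30)    # t with df=30: about 2.04, approaching the normal
```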


\vspace{5mm}

So who was "Student" and why did they call themselves that?

## Type I Error with a t-Test {.t}
![](Gosset-plaque.png)

## Type I Error with a t-Test {.t}
![](Gosset-plaque2.png)


## Determining Type I error rates {.t}
Example: let's suppose that in the population, the distribution of hours of sleep follows a Gamma distribution with shape 1.2 and scale 5 (don't worry too much about the specifics; just know that this is a right-skewed distribution):

```{r, echo=FALSE, out.width="75%"}
x <- seq(0.00001, 14, by=0.01)
y <- dgamma(x, shape=1.2, scale=(6/1.2))
plot(y ~ x, type='l', xlab="hours", ylab="density", main="Gamma distributed hours of sleep among UCSD students", cex.lab=1.75, cex.main=1.75, cex.axis=1.75)
```

## Determining Type I Error rates {.t}
The Gamma distribution shown in the previous slide does have a mean of $\mu=6$. Based on this, can we figure out what the Type I Error rate of a t-Test would be?

\vspace{3mm}


::: {.block}
### Wait what's the Type I Error rate again?
\vspace{-2mm}
It's the probability of rejecting $H_0$ if $H_0$ is actually true...

\vspace{3mm}
So, here, it would be the probability that we get a $t_s$ value that is in the rejection region of the t-Distribution.
:::

\vspace{2mm}

\pause 
::: {.block}
### Recall the test statistic $t_s$:
\vspace{-4mm}
$$
t_s = \frac{\overline{x} - \mu_0}{s / \sqrt{n}}
$$
:::

When $H_0$ is true, this thing follows the t-Distribution on Slide \ref{tdist}, but ONLY IF the conditions are met (which here we know they are not). 


## The t-Distribution again {.t}

```{r, echo=FALSE, out.width="90%"}
t <- seq(-4, 4, by=0.01)
f_t <- dt(t, df=4)
plot(f_t ~ t, type='l', main="Null distribution: t with df=4", ylab="f(t)", cex.lab=1.5)
polygon(c(t[t>=qt(0.025, df=4, lower.tail=FALSE)], max(t), qt(0.025, df=4, lower.tail=FALSE)), c(f_t[t>=qt(0.025, df=4, lower.tail=FALSE)], 0, 0), col="gray75")
polygon(c(t[t<=qt(0.025, df=4)], qt(0.025, df=4), min(t)), c(f_t[t<=qt(0.025, df=4)], 0, 0), col="gray75")
```

\vspace{-2mm}

The problem: since $X$ is highly skewed, the distribution of $t_s$ under $H_0$ is going to look quite different from this!
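One way to see how different (a simulation sketch; the seed and plotting choices are illustrative): draw many samples of size 5 from the Gamma null, compute $t_s$ for each, and overlay the t density with df = 4:

```{r, out.width="75%"}
set.seed(1)  # illustrative seed for reproducibility
t_stats <- replicate(10000, {
  x <- rgamma(n=5, shape=1.2, scale=(6/1.2))  # sample under the Gamma null
  (mean(x) - 6) / (sd(x) / sqrt(5))           # the t statistic for that sample
})
hist(t_stats, breaks=60, freq=FALSE, xlab="t_s",
     main="Simulated null distribution of t_s under the Gamma")
curve(dt(x, df=4), add=TRUE, lwd=2)           # the df=4 t density, for comparison
```

The simulated distribution is left-skewed, so the tail areas beyond the critical values no longer add up to 0.05.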

## Determining Type I Error rates {.t}
Therefore, to calculate the Type I Error rate in this situation theoretically, we would have to know what the distribution of

$$
t_s = \frac{\overline{x} - \mu_0}{s / \sqrt{n}}
$$
is when $X$ follows a Gamma distribution, and then use that to determine the probability of $t_s$ falling in the rejection region of the original t-Distribution. 

\vspace{2mm}

::: {.block}
### Unfortunately...
\vspace{-2mm}
There is no closed-form expression for the distribution of $t_s$ when $X$ follows a Gamma distribution. :(
:::


\vspace{3mm}
But, we can go back to our DSC roots and simulate!

## Determining Type I Error rates {.t}
What do we simulate?

 - Simulate a sample of size 5 under the Gamma distribution

\vspace{2mm} 

 - Run the t-Test on that sample, get the p-value

\vspace{2mm} 

 - Check if the p-value is less than 0.05 (in which case, that sample gave a Type I Error)

\vspace{2mm} 

 - Do this many times, counting how often the p-value falls below 0.05 (i.e., how often a Type I Error occurs)

\vspace{2mm} 

 - The proportion of times that a Type I Error occurs in the simulations is an estimate of the Type I Error rate.


## Determining Type I Error rates {.t}

```{r}
count <- 0

for(i in 1:10000){
  gam_data <- rgamma(n=5, shape=1.2, scale=(6/1.2))
  p.val <- t.test(gam_data, mu=6)$p.value
  if(p.val < 0.05){
    count <- count + 1
  }
}

TypeI <- count / 10000
TypeI
```

That's more than twice as big as 0.05!
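As a sanity check on the precision of this estimate (the 0.11 below is illustrative; plug in whatever `TypeI` your run produced), the Monte Carlo standard error of a simulated proportion over $B$ runs is $\sqrt{\hat{p}(1-\hat{p})/B}$:

```{r, size="scriptsize"}
p_hat <- 0.11                        # illustrative simulated Type I rate
B <- 10000                           # number of simulation runs
se <- sqrt(p_hat * (1 - p_hat) / B)  # Monte Carlo standard error, about 0.003
p_hat + c(-1.96, 1.96) * se          # approx. 95% CI: well above 0.05
```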

## Your Turn {.t}
\label{your}
Let's do the same simulation, but under two other scenarios:

 - Let $X$ actually follow a normal distribution, with $\mu=6$ and $\sigma=1$.
   - The `rnorm` function will be useful.


\vspace{4mm}

 - Let $X$ follow a Uniform distribution from 2 to 10.
   - The `runif` function will be useful.

In an R Markdown file, find a simulated estimate of the Type I Error rate in each case, again with samples of size $n=5$, and briefly comment on what you observe. 

## Recap {.t}

 - Any \underline{parametric} statistical test has conditions that must be met for validity. 
   - By \underline{parametric}, we essentially mean a test that places distributional conditions on the data in order to have a closed-form null distribution (such as the t-Test).
   - By validity, we primarily mean that the test's Type I Error rate will in fact be the stated $\alpha$-level. 

\vspace{5mm}

 - We can evaluate Type I Error rates via simulation. We will extend this to more complicated statistical analyses as we proceed in this course. 



## Recap {.t}
### Things still to come:
\vspace{-2mm}
 - \underline{Non-parametric} tests (e.g., the simulation-based tests you learned in DSC 10 and 80, where you get a p-value by simulating the null distribution) depend less on conditions for their validity.
   - By \underline{non-parametric}, we essentially mean tests that do not rely on distributional conditions on the data in order to have a closed-form null distribution.
   - Tests performed via simulation are one example, but there are others as well.

\vspace{4mm}

 - What about statistical power?

## Daily Check for today {.t}
Upload your .pdf output file from R Markdown to Gradescope, consisting of:

1. The answer to the question on Slide \ref{typei}.

\vspace{2mm}

2. The answer to the question on Slide \ref{conditions}.

\vspace{2mm}

3. Your two simulations from the "Your Turn" on Slide \ref{your}, with brief commentary on what you observed in each case. 

\vspace{3mm}

If you have not yet been able to get R Markdown to output to a pdf file, we will be lenient on that for this assignment; please just find some way to upload a single pdf with your answers, code and output.