Introduction

This document performs some basic exploratory analyses of the ToothGrowth dataset.

1. Loading data

We load the UsingR package and use the ToothGrowth dataset, which also ships with R's built-in datasets package.
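The loading code itself is not shown in this report. A minimal sketch that would produce the variable names and summary shown in the output, assuming the copy of ToothGrowth from R's built-in datasets package (the package startup messages in the output come from loading UsingR, which this sketch does not need):

```r
# Assumption: ToothGrowth is taken from R's built-in datasets package
data(ToothGrowth)

# Variable names and a per-column summary
print(names(ToothGrowth))
print(summary(ToothGrowth))
```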

## Loading required package: MASS
## Loading required package: HistData
## Loading required package: Hmisc
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
## [1] "len"  "supp" "dose"
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

2. Basic Summary of Data

The data consist of 60 observations on guinea pigs and 3 variables: len (tooth length), supp (supplement type) and dose. The animals received vitamin C by one of two delivery methods: supp=OJ is orange juice and supp=VC is ascorbic acid. The goal is to test whether supplement type and dose affect tooth growth, measured by tooth length.

head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5
summary(ToothGrowth)
##       len        supp         dose      
##  Min.   : 4.20   OJ:30   Min.   :0.500  
##  1st Qu.:13.07   VC:30   1st Qu.:0.500  
##  Median :19.25           Median :1.000  
##  Mean   :18.81           Mean   :1.167  
##  3rd Qu.:25.27           3rd Qu.:2.000  
##  Max.   :33.90           Max.   :2.000

Let’s see whether mean tooth length differs by supplement type.

mean(ToothGrowth[ToothGrowth$supp=="OJ",]$len)
## [1] 20.66333
mean(ToothGrowth[ToothGrowth$supp=="VC",]$len)
## [1] 16.96333
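The same group means can be obtained in a single call. A minimal sketch using base R's tapply() and aggregate(), assuming the built-in copy of ToothGrowth; the names supp_means and cell_means are introduced here for illustration only:

```r
data(ToothGrowth)

# Mean tooth length for each supplement type in one call
supp_means <- tapply(ToothGrowth$len, ToothGrowth$supp, mean)
print(supp_means)

# Mean tooth length for every supplement/dose combination
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

The second table is a quick way to see whether the gap between supplements is similar at every dose level.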

Let’s plot tooth length against dose for each supplement type.

library(ggplot2)
p <- ggplot(ToothGrowth, aes(x = dose, y = len)) +
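  geom_point(alpha = 0.6) +                     # individual animals
  stat_summary(fun = mean, geom = "line") +     # line through the mean at each dose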
  facet_wrap(~ supp) +
  labs(title = "Tooth Length vs Dose by Supplement Type",
       x = "Dose",
       y = "Tooth Length") +
  theme_minimal()

# Print the plot
print(p)

3. Comparison of tooth growth by supp and dose

3.1 By supplement type

We calculate a 95% confidence interval for the difference in means of two independent groups, supp=OJ and supp=VC. We create a subset for each group, compute each group's mean and variance, and then combine the variances into a pooled estimate, which assumes the two groups share a common variance.
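For reference, the standard equal-variance interval for two independent groups can be written as follows. The symbols are notation introduced here, not taken from the code: $\bar{x}$, $s^2$ and $n$ are each group's sample mean, sample variance and size. Note that the sample variances enter the pooled estimate unsquared.

```latex
s_p^2 = \frac{(n_{OJ}-1)\,s_{OJ}^2 + (n_{VC}-1)\,s_{VC}^2}{n_{OJ}+n_{VC}-2},
\qquad
(\bar{x}_{OJ}-\bar{x}_{VC}) \;\pm\; t_{0.975,\;n_{OJ}+n_{VC}-2}\; s_p \sqrt{\frac{1}{n_{OJ}}+\frac{1}{n_{VC}}}
```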

oj <- subset(ToothGrowth, supp == "OJ")
oj_mean = mean(oj$len)
oj_var = var(oj$len)
vc <- subset(ToothGrowth, supp == "VC")
vc_mean = mean(vc$len)
vc_var = var(vc$len)

# difference of means
diff_mean <- oj_mean-vc_mean

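# pooled standard deviation (var() already returns s^2, so it is not squared again)
n_oj <- nrow(oj)
n_vc <- nrow(vc)
sp <- sqrt(((n_oj - 1) * oj_var + (n_vc - 1) * vc_var) / (n_oj + n_vc - 2))

# 95% confidence interval for the difference of means (OJ - VC)
diff_mean + c(-1, 1) * qt(.975, n_oj + n_vc - 2) * sp * sqrt(1 / n_oj + 1 / n_vc)
## [1] -0.1670064  7.5670064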
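A hand-computed interval of this kind can be cross-checked against base R's t.test(), which reports the pooled-variance interval when var.equal = TRUE. A sketch, assuming the built-in ToothGrowth data; the name check is introduced here for illustration:

```r
data(ToothGrowth)

# Two-sample t-test with pooled variance. With the formula interface the
# difference is taken as first factor level minus second, i.e. OJ - VC.
check <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)

print(check$conf.int)    # 95% CI for mean(OJ) - mean(VC)
print(check$parameter)   # degrees of freedom, n1 + n2 - 2
```

Dropping var.equal = TRUE gives Welch's interval instead, which does not assume equal variances in the two groups.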

3.2 By dose

Do dose levels make a difference in tooth growth? Let’s test. We use dose05, dose10 and dose20 for the 0.5, 1.0 and 2.0 dose levels respectively. We want to see whether going from dose 0.5 to 1.0 or to 2.0 makes a difference, so we calculate two confidence intervals: one for the mean difference between dose10 and dose05, and one for the mean difference between dose20 and dose05.
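Each dose comparison follows the same pooled-variance recipe, so the calculation can be wrapped in a small helper. The sketch below is illustrative only: the function pooled_ci() and its arguments are hypothetical names, not part of the original analysis, and it assumes the built-in ToothGrowth data.

```r
# Hypothetical helper: pooled-variance confidence interval for mean(x) - mean(y)
pooled_ci <- function(x, y, level = 0.95) {
  nx <- length(x)
  ny <- length(y)
  df <- nx + ny - 2
  # var() returns the sample variance s^2, so it enters the formula unsquared
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / df)
  alpha <- 1 - level
  (mean(x) - mean(y)) + c(-1, 1) * qt(1 - alpha / 2, df) * sp * sqrt(1 / nx + 1 / ny)
}

data(ToothGrowth)

# Tooth lengths grouped by dose; the list names are "0.5", "1" and "2"
len_by_dose <- split(ToothGrowth$len, ToothGrowth$dose)

ci_10_vs_05 <- pooled_ci(len_by_dose[["1"]], len_by_dose[["0.5"]])
ci_20_vs_05 <- pooled_ci(len_by_dose[["2"]], len_by_dose[["0.5"]])
print(ci_10_vs_05)
print(ci_20_vs_05)
```

Taking group sizes from length() rather than hard-coding them avoids mismatches between the sizes in the numerator and the degrees of freedom.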

dose05 <- subset(ToothGrowth, dose == 0.5)
dose05_mean = mean(dose05$len)
dose05_var = var(dose05$len)

dose10 <- subset(ToothGrowth,dose==1.0)
dose10_mean = mean(dose10$len)
dose10_var = var(dose10$len)

dose20 <- subset(ToothGrowth,dose==2.0)
dose20_mean = mean(dose20$len)
dose20_var = var(dose20$len)

# difference of means
diff_dose <- dose10_mean-dose05_mean
diff_dose0520 <-dose20_mean-dose05_mean

# pooled standard deviations (var() already returns s^2; each comparison pools its own two groups of 20)
sp_0510 <- sqrt(((20-1)*dose05_var + (20-1)*dose10_var)/(20+20-2))
sp_0520 <- sqrt(((20-1)*dose05_var + (20-1)*dose20_var)/(20+20-2))

# 95% confidence intervals for the differences of means
diff_dose + c(-1,1)*qt(.975, 38)*sp_0510*(1/20+1/20)^.5
## [1]  6.276252 11.983748
diff_dose0520 + c(-1,1)*qt(.975, 38)*sp_0520*(1/20+1/20)^.5
## [1] 12.83648 18.15352

4. Conclusion

  1. The confidence interval for the difference in means between supplement types includes 0, so we cannot rule out that supplement type has no effect on tooth growth.

  2. Both confidence intervals for the dose comparisons lie entirely above 0. Raising the dose from 0.5 to 1.0 is associated with longer teeth, and raising it from 0.5 to 2.0 with a larger increase still.

Assumptions

These conclusions assume that the animals were randomly assigned to supplement types and dose levels. The pooled-variance intervals additionally assume that the variance of tooth length is the same in the two groups being compared. We also assume that other factors contributing to tooth growth were controlled when the test animals were chosen. If supplement type and dose interact in their effect on tooth growth, other types of analysis are needed.
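Two of these assumptions can be probed directly. The sketch below is an illustration rather than part of the original analysis: it uses Welch's t-test, which drops the equal-variance assumption, for the supplement comparison, and a two-way ANOVA with a supp-by-dose interaction term. Treating dose as a categorical factor is an assumption made here, and the names welch, tg and fit are introduced for illustration.

```r
data(ToothGrowth)

# Welch's t-test: t.test() does not pool variances by default (var.equal = FALSE)
welch <- t.test(len ~ supp, data = ToothGrowth)
print(welch$conf.int)

# Two-way ANOVA with interaction; dose is treated as a factor (assumption)
tg <- transform(ToothGrowth, dose = factor(dose))
fit <- aov(len ~ supp * dose, data = tg)
print(summary(fit))
```

The supp:dose row of the ANOVA table indicates whether the effect of supplement type depends on the dose level.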