Assignment 4: Hypothesis Testing


Purpose:

The purpose of this assignment is to practice hypothesis testing using Z and T statistical tests. The goals of the assignment may be broken down as follows:

Goals

  • To distinguish between a Z and T test and understand which to use for a given situation
  • To calculate a Z or T score 
  • To understand the steps of Hypothesis Testing
  • To use the results to make a decision regarding the null hypothesis
  • To practice using real world data
_________________________________________________________________________________
Terms:

The following are terms related to the statistical methods used in this assignment

Steps of Hypothesis Testing-
  1. State the null hypothesis
  2. State the alternative hypothesis
  3. Choose a statistical test
  4. Choose the level of significance (α)
  5. Calculate the statistics
  6. Make a decision about the null and alternative hypotheses

Null Hypothesis- The null hypothesis is the hypothesis that suggests that there is no difference between the hypothesized and sample means. The null hypothesis can either be rejected (if the test shows a significant difference) or fail to be rejected (if it does not), but it is never accepted.

Alternative Hypothesis- The alternative hypothesis is the hypothesis which states that there is a difference between the sample and hypothesized means. The alternative hypothesis itself is never rejected or failed to be rejected; those decisions apply only to the null hypothesis.

Hypothesized Mean-  The hypothesized mean represents the mean for an entire population that is being studied. It is compared to the sample mean during hypothesis testing

Sample Mean- The sample mean is the mean for a smaller sample of the larger population.

Confidence Interval- The confidence interval sets the range that values are expected to fall within on a normal distribution curve. A confidence interval of 95% means that 5% of values fall outside that range: all 5% in one tail for a one tail test, or 2.5% in each tail for a two tail test.

Critical Value- The critical value is the cutoff value on the normal distribution curve used to decide whether to reject or fail to reject the null hypothesis.
_________________________________________________________________________________
Methods:

Part 1.

The first part of the assignment involved filling out a chart to become familiar with the different parts of hypothesis testing. The chart can be seen below in figure 1.
Figure 1.
Z/T test chart
The information needing to be calculated is the significance level, whether to use a Z or T test, and the critical Z or T value. The significance level (α) is calculated from the confidence interval: subtracting the confidence interval from 1 gives the significance level (for example, 1 - .95 = .05). For a one tail test, all of α sits in a single tail. For a two tail test, α needs to be divided by 2 so that half falls in each tail.
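As a small illustration of the arithmetic above, the Python sketch below converts a confidence level into a significance level and the area in each tail. The function name `significance_level` is made up for this example and is not part of the assignment.

```python
def significance_level(confidence, tails):
    """Return (alpha, area_per_tail) for a confidence level such as 0.95."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Rounding hides floating-point noise such as 0.050000000000000044.
    alpha = round(1.0 - confidence, 10)
    # A two tail test splits alpha evenly between the two tails.
    return alpha, alpha / tails


print(significance_level(0.95, 1))  # one tail test
print(significance_level(0.95, 2))  # two tail test
```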

Whether to use a Z test or T test depends on the sample size. A Z test is used for sample sizes greater than 30, and a T test is used for sample sizes under 30.
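The rule of thumb above can be written as a one-line check. This is only a sketch: `choose_test` is a hypothetical helper, and sending a sample of exactly 30 to the T test is an assumption, since the rule as stated does not cover that case.

```python
def choose_test(sample_size, cutoff=30):
    """Return "Z" for samples larger than the cutoff, otherwise "T".

    Assumption: a sample of exactly `cutoff` observations uses the T test.
    """
    return "Z" if sample_size > cutoff else "T"


print(choose_test(23))  # the 23 farmers surveyed in Part 2
print(choose_test(17))  # the 17 streams sampled in Part 3
```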

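The Z and T values can be found by using a Z or T score chart. For a Z score, you would look for the closest value to the confidence interval in the body of the chart and find the corresponding Z score (for a two tail test, look up 1 minus half of the significance level instead, e.g. .975 for a 95% confidence interval). For a T test, the significance level and the degrees of freedom (number of observations - 1) are used to find the corresponding T value.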
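The Z chart lookup can also be done in code. The sketch below uses `NormalDist` from Python's standard `statistics` module; the wrapper `z_critical` is a name invented for this example. The standard library has no T distribution, so T values still come from a chart (or from a package such as SciPy).

```python
from statistics import NormalDist


def z_critical(confidence, tails):
    """Positive critical Z value for a confidence level and 1 or 2 tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    alpha = 1.0 - confidence
    # Find the Z score that leaves alpha / tails of the area in the upper tail.
    return NormalDist().inv_cdf(1.0 - alpha / tails)


print(round(z_critical(0.95, 1), 3))  # one tail
print(round(z_critical(0.95, 2), 3))  # two tail
```

At a 95% confidence interval this gives roughly 1.645 for a one tail test and 1.960 for a two tail test.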
___________________________________________________________________________________

Part 2.

The second part of the assignment involved hypothesis testing to statistically test whether there is a difference between the sample and hypothesized means for 3 types of vegetables grown in Kenya: ground nuts, cassava, and beans. A sample of 23 farmers was surveyed. Because the sample size is under 30, a T-test is used. All tests were two tailed T-tests with a confidence interval of 95%. The formula for calculating a Z or T score is given below.

Formula to Calculate a Z or T score


Ground Nuts:

Null Hypothesis: There is no difference in the average amount of ground nuts (in metric tons) grown in the sample population and the overall population

Alternative Hypothesis: There is a difference in the average amount of ground nuts grown between the sample and overall population.

Sample mean = .51
Hypothesis mean = .55
Standard Deviation = .3

A two tailed test with a confidence interval of .95  gives a critical value of +/- 2.074
The calculated T value is -.639

-.639 falls between -2.074 and +2.074. Therefore we fail to reject the null hypothesis

Cassava

Null Hypothesis:  There is no difference in the average amount of cassava (in metric tons) grown in the sample population and the overall population

Alternative Hypothesis:  There is a difference in the average amount of  cassava (in metric tons) grown in the sample population and the overall population

Sample mean = 3.4
Hypothesis mean = 3.8
Standard Deviation = .74

A two tailed test with a confidence interval of .95  gives a critical value of +/- 2.074
The calculated T-value is -2.59

-2.59 falls below the critical value of -2.074. Therefore we reject the null hypothesis

Beans

Null Hypothesis- There is no difference in the average amount of beans (in metric tons) grown in the sample population and the overall population

Alternative Hypothesis- There is a difference in the average amount of beans (in metric tons) grown in the sample population and the overall population

Sample mean = .33
Hypothesis mean = .28
Standard Deviation = .13

A two tailed test with a confidence interval of .95  gives a critical value of +/- 2.074
The calculated T-Value is 1.84

1.84 falls between the critical values of -2.074 and 2.074. Therefore we fail to reject the null hypothesis

Similarities: Both ground nuts and cassava had lower sample means than the hypothesis mean. However, a lower sample mean alone does not necessarily indicate a statistical difference. The tests for ground nuts and beans both failed to reject the null hypothesis, meaning there is not enough evidence of a difference between the amount grown in the sample and the rest of the country.

Differences: The sample mean for beans was larger than the hypothesis mean, but again this does not necessarily indicate a difference between the two. Unlike ground nuts and beans, the test for cassava rejected the null hypothesis. This means that statistically there is a difference between the two means.
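The three crop tests can be rerun together as a quick check. The sketch below takes the means, standard deviations, sample size, and the critical value of 2.074 from the text above; the names `crops` and `decide` are made up for the example, and the one-sample formula (difference in means divided by SD / √n) is assumed. The sign of each T value follows sample mean minus hypothesized mean.

```python
import math

CRITICAL_T = 2.074  # from the text: two tailed, 95%, 22 degrees of freedom
N_FARMERS = 23

# crop: (sample mean, hypothesized mean, standard deviation)
crops = {
    "ground nuts": (0.51, 0.55, 0.30),
    "cassava": (3.4, 3.8, 0.74),
    "beans": (0.33, 0.28, 0.13),
}


def decide(sample_mean, hyp_mean, sd, n, critical):
    """Return (t, reject) for a two tailed one-sample T test."""
    t = (sample_mean - hyp_mean) / (sd / math.sqrt(n))
    return t, abs(t) > critical


results = {
    crop: decide(mean, hyp, sd, N_FARMERS, CRITICAL_T)
    for crop, (mean, hyp, sd) in crops.items()
}

for crop, (t, reject) in results.items():
    outcome = "reject" if reject else "fail to reject"
    print(f"{crop}: t = {t:.2f}, {outcome} the null hypothesis")
```

Only cassava lands outside the range of -2.074 to +2.074, which matches the decisions described above.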
___________________________________________________________________________________

Part 3

The third part of the assignment involved using hypothesis testing in environmental studies. In the scenario, a researcher is testing whether a particular stream has a higher amount of pollutants than the allowable amount of 4.4 mg/l. A sample size of 17 streams gave a mean pollutant value of 6.8 mg/l.

Null Hypothesis: The mean pollutant level of the sample is not higher than the allowable (hypothesized) mean of 4.4 mg/l

Alternative Hypothesis: The mean pollutant level of the sample is higher than the allowable (hypothesized) mean of 4.4 mg/l

A Confidence Interval of .95 using a 1 tail test with 16 (17 samples - 1) degrees of freedom gives a Critical Value of 1.746

Sample mean= 6.8
Hypothesized mean = 4.4
Standard Deviation = 4.4
Sample size = 17, so a one tail T-test is used
The calculated T-value was 2.35

2.35 is greater than the Critical Value of 1.746. Therefore, we reject the null hypothesis.

Conclusions: The T value exceeded the critical value and the null hypothesis was rejected. Therefore it can be concluded that the sample's mean pollutant level is statistically higher than the allowable level of 4.4 mg/l.
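The decision step differs slightly between one tail and two tail tests, which a short Python sketch can make explicit. The helper `reject_null` and its `tail` labels ("two", "upper", "lower") are invented for this illustration; the T value of 2.35 and critical value of 1.746 are taken directly from the text above rather than recomputed.

```python
def reject_null(statistic, critical, tail="two"):
    """Return True when the statistic falls in the rejection region."""
    critical = abs(critical)
    if tail == "two":
        # Reject in either tail.
        return abs(statistic) > critical
    if tail == "upper":
        # Reject only when the statistic is above the positive cutoff.
        return statistic > critical
    if tail == "lower":
        # Reject only when the statistic is below the negative cutoff.
        return statistic < -critical
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


# Part 3: one tail test for a pollutant level above the allowable amount.
print(reject_null(2.35, 1.746, tail="upper"))  # True
```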
___________________________________________________________________________________

Part 4

The final part of the assignment focused on using hypothesis testing to determine if there is a statistical difference between the housing values for the City of Eau Claire and Eau Claire County in western Wisconsin at the census tract level. Since more than 30 census tracts were surveyed, a Z-test was used. The results were then mapped using standard deviation. The data used is current as of 2016.

Null Hypothesis- There is no difference in average home value at the census tract level between the City of Eau Claire and Eau Claire County.

Alternative Hypothesis- There is a difference in average home value at the census tract level between the City of Eau Claire and Eau Claire County.

Sample Mean-151,876.5
Hypothesized Mean- 169,438
Standard Deviation- 49,706

A two tailed test with a confidence interval of .95 was used
The critical value of this interval is +/- 1.96
The calculated Z-value is -2.54

The calculated Z value of -2.54 falls below the critical value of -1.96, therefore we reject the null hypothesis. The results can be visualized on the map below
Conclusions: The results for both the test and the map indicate that property values for the City of Eau Claire and Eau Claire County are statistically different. On the map, the census tracts for the City of Eau Claire fall below the mean in terms of standard deviation. One possible explanation is that land parcels are smaller in the city, whereas homes outside of the city have more land associated with them, which would raise the home's value.
