Thursday, April 19, 2012

Hypothesis Testing - Investment Analysis Context

A.   Basic Statistical Concepts

1.     Definitions

Population – all the items in a group of data, such as all the stocks on the NYSE.

Parameters – descriptive characteristics of a population, such as average profit margin and variance or standard deviation around the population average.

Sample – a selection of items from a population that is (hopefully) representative of the population.

Statistics – descriptive characteristics of a sample.


2.     Key Terms and Symbols


                                    Population        Sample
                                    (Parameter)       (Statistic)

Number of items in the group        N                 n
Arithmetic Average                  µ                 X̄
Variance                            σ²                Sx²
Standard Deviation                  σ                 Sx
Standard Error of the Estimate                        Sx̄


3.     Confidence Intervals


Example: An analyst selects a random sample of 5 companies traded on the NYSE. The average profit margin of the 5 companies is 16.6% with a standard deviation of 8.63%. Estimate, with 95% confidence, a range that will include the true average profit margin of NYSE companies. The t-value (reliability factor) for 95% confidence is 2.776.



(1 - α) Confidence interval = Point Estimate ± (Reliability Factor * Standard Error)


(1 - α) is the probability that the true value of the population average falls within the confidence interval. The probability that the true value falls outside the confidence interval is equal to alpha (α). This alpha (α) is known as the level of significance.


If the true population standard deviation is not known, the standard error is calculated using the sample's standard deviation:

Standard Error = Sx / √n


If the true population standard deviation is known, then the following formula is used for the standard error:

Standard Error = σ / √n

In this example, the standard error is:

Standard Error = 8.63% / √5 = 3.86%

95% confidence interval   =   16.6% ± (2.776 * 3.86%)  = 5.9% to 27.3%
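The arithmetic of this example can be reproduced with a short Python sketch. The function name confidence_interval is an illustrative assumption, not part of the original notes. The inputs are the figures given in the example, and the reliability factor of 2.776 is taken as given rather than looked up from a t-table.

```python
import math

def confidence_interval(sample_mean, sample_sd, n, reliability_factor):
    """Return (lower, upper, standard_error) for a mean, using the sample standard deviation."""
    std_error = sample_sd / math.sqrt(n)        # standard error of the sample average
    margin = reliability_factor * std_error     # half-width of the interval
    return sample_mean - margin, sample_mean + margin, std_error

# Figures from the NYSE profit margin example (all in percent).
lower, upper, std_error = confidence_interval(16.6, 8.63, 5, 2.776)
print(f"Standard error: {std_error:.2f}%")                        # Standard error: 3.86%
print(f"95% confidence interval: {lower:.1f}% to {upper:.1f}%")   # 5.9% to 27.3%
```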



B.   Performing a Simple Hypothesis Test

A hypothesis is a statement about one or more populations that can be tested.

A hypothesis test is a procedure based on statistics to infer whether the hypothesis (statement) is likely to be true or false, given a specified degree of confidence.


Example: An analyst asserts that the true average profit margin of all NYSE-listed companies is 30%. Given the previous confidence interval calculations, is this plausible?


Answer: Based on the sample data, we are 95% confident that the true average profit margin is between 5.9% and 27.3%. The claimed value of 30% lies outside this range, so the claim is rejected at the 5% level of significance. There remains a 5% chance of rejecting a claim that is in fact true.


The seven steps of a hypothesis test:


1.     State the Hypothesis

The hypotheses are always stated in pairs, called the null and alternative hypotheses:


The null hypothesis (H0) is a positive affirmation of the hypothesis that is being tested.

The alternative hypothesis (H1) is the proposition that must be true if the null hypothesis is false.


In all cases, for any given level of significance chosen for the test, one hypothesis will be rejected and the other accepted.


In the NYSE profit margin example:


Null Hypothesis, H0: The true average profit margin is 30%


Alternative Hypothesis, H1: The true average profit margin is not 30%.


A hypothesis test can be formulated as either a one-tailed or two-tailed test.


Two-tailed test

          H0:   µx = µ0

          H1:   µx  is not equal to µ0



In a one-tailed test, the null hypothesis is formulated so that the hypothesized true parameter is greater than or equal to (or less than or equal to) some value.


One-tailed test (lower tail)

          H0:   µx ≥ µ0

          H1:   µx < µ0


One-tailed test (upper tail)

          H0:   µx ≤ µ0

          H1:   µx > µ0

2.     Identify the Test Statistic

The test statistic is a value calculated from the sample data. This value is used to determine whether to accept or reject the null hypothesis. The general formula:


Test Statistic =

(Sample Statistic – Hypothesized value)/(Standard Error of the Sample Statistic)


Sample statistic is the value calculated from the sample.


3.     Specify the Level of Significance for the Hypothesis Test

A hypothesis test can only show that a hypothesis is probable or improbable with some stated level of significance.


Significance (or Level of Significance) is the probability of rejecting a true null hypothesis. Rejecting a true null hypothesis is called a Type I error.

A Type II error is accepting (failing to reject) a false (invalid) null hypothesis.


In a test, lowering the risk of one type of error raises the risk of the other. The risks of both errors can be reduced together only by increasing the sample size.


If it is important to avoid a Type I error, the significance level will be set low, typically 1%.

If it is very important to avoid a Type II error, the significance level will be set high, typically 10%.
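The meaning of the significance level can be illustrated with a small simulation, sketched here under assumptions that are not in the original notes: a normally distributed population whose true average really is 30% with a standard deviation of 8.63%, samples of 5, and a fixed random seed. Because the null hypothesis is true in every trial, each rejection is a Type I error, and the share of rejections should come out close to the 5% significance level implied by the critical value of 2.776.

```python
import math
import random
import statistics

def type_one_error_rate(true_mean, true_sd, n, critical_t, trials, seed=0):
    """Share of samples that reject a null hypothesis that is actually true."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        std_error = statistics.stdev(sample) / math.sqrt(n)
        t_stat = (statistics.mean(sample) - true_mean) / std_error
        if abs(t_stat) > critical_t:   # two-tailed rejection
            rejections += 1
    return rejections / trials

# Hypothetical population: the true average really is 30%.
rate = type_one_error_rate(true_mean=30.0, true_sd=8.63, n=5,
                           critical_t=2.776, trials=20000)
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```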


4.     Define and Interpret the Decision Rule


There are three ways to state the decision rule for the test:


Using an acceptance range around the hypothesized value

Using an acceptance range for the test statistic

p-value approach


a.     Using an acceptance range around the hypothesized value

Acceptance range = Value Hypothesized in Null Hypothesis ± (Reliability Factor * Standard Error)


If the value of the statistic from the sample data falls within the acceptance range, the null hypothesis cannot be rejected; it is accepted.


When calculating a confidence interval, the range is calculated around the sample average. However, when performing a hypothesis test, the acceptance range is calculated around the null hypothesized value.
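For the NYSE profit margin example, the acceptance range is centered on the hypothesized 30% rather than on the sample average of 16.6%. The sketch below uses the figures from the example. The function name acceptance_range is an illustrative assumption.

```python
import math

def acceptance_range(hypothesized_value, std_error, reliability_factor):
    """Range around the null-hypothesized value within which the null is not rejected."""
    margin = reliability_factor * std_error
    return hypothesized_value - margin, hypothesized_value + margin

# Figures from the NYSE profit margin example (all in percent).
sample_mean = 16.6
std_error = 8.63 / math.sqrt(5)
low, high = acceptance_range(30.0, std_error, 2.776)
reject_null = not (low <= sample_mean <= high)

print(f"Acceptance range: {low:.1f}% to {high:.1f}%")   # Acceptance range: 19.3% to 40.7%
print("Reject H0" if reject_null else "Cannot reject H0")   # Reject H0
```

The sample average of 16.6% falls below the lower bound of the range, so the null hypothesis of a 30% true average is rejected. This is the same conclusion reached with the confidence interval around the sample average.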


b.     Using an acceptance range for the test statistic

Acceptance range is stated in terms of a minimum and maximum for the test statistic.


In a two-tailed test, the level of significance is halved, and the t or z value corresponding to that half value is the critical value.


In a one-tailed test, the t or z value that corresponds to the full level of significance is the critical value.


If the null hypothesis is rejected, the test result is said to be statistically significant. If the null hypothesis is not rejected, the test result is said to be not statistically significant.
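The comparison of a test statistic with its critical value can be sketched as follows. The function name is_significant and the tail labels "two", "lower", and "upper" are illustrative assumptions. The critical value of 2.776 is the two-tailed 5% value given in the confidence interval example.

```python
import math

def is_significant(test_stat, critical_value, tail="two"):
    """Return True when the test statistic falls outside the acceptance range."""
    if tail == "two":      # reject in either tail
        return abs(test_stat) > critical_value
    if tail == "lower":    # H1: true value is below the hypothesized value
        return test_stat < -critical_value
    if tail == "upper":    # H1: true value is above the hypothesized value
        return test_stat > critical_value
    raise ValueError("tail must be 'two', 'lower' or 'upper'")

# Figures from the NYSE profit margin example (all in percent).
t_stat = (16.6 - 30.0) / (8.63 / math.sqrt(5))
significant = is_significant(t_stat, 2.776, tail="two")
print(f"t = {t_stat:.2f}, statistically significant: {significant}")
# t = -3.47, statistically significant: True
```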


c.     p-value approach

The p-value is the lowest level of significance at which the null hypothesis can be rejected.


If the p-value is greater than the predetermined level of significance, the null hypothesis cannot be rejected; it is accepted.

Otherwise, the null hypothesis is rejected.
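A p-value for the NYSE profit margin example can be approximated with the sketch below. It rests on assumptions not stated in the notes: a two-tailed t-test with n - 1 = 4 degrees of freedom, and a simple numerical integration (Simpson's rule) of the t-distribution density in place of a statistical table or library. The function names t_pdf and two_tailed_p_value are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the interval [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2     # Simpson weights: 4 for odd points, 2 for even
        total += weight * t_pdf(i * h, df)
    central_area = total * h / 3       # area under the density between 0 and |t|
    return 2 * (0.5 - central_area)    # area in both tails beyond |t|

# Figures from the NYSE profit margin example (all in percent).
t_stat = (16.6 - 30.0) / (8.63 / math.sqrt(5))
p_value = two_tailed_p_value(t_stat, df=4)
alpha = 0.05

print(f"p-value: {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Cannot reject H0")
```

Under these assumptions the p-value is below the 5% level of significance, so the null hypothesis is rejected, in agreement with the other two decision rules.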


5.     Collect the Data and Make the Calculations   

6.     Make the Statistical and Economic Decisions

In general, a statistical decision should not be acted upon unless a more basic analysis of the underlying economics for the situation supports the statistical decision.

Original knol - 495
