MGCR 271 Crib Sheet — Kareem Halabi

Z-score: z = (x − x̄)/s_x  (if |z| > 3 the value is an outlier). The z-scores of any data set have mean x̄ = 0 and standard deviation s_x = 1.

Simple Regression: Residual for (x_i, y_i): e_i = y_i − a − b·x_i
Ordinary Least Squares (OLS) regression: the goal is to minimize Σ e_i² for the line y = a + bx:
b = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]
a = [(Σy) − b(Σx)] / n
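The OLS formulas above can be sketched directly from the sums. A minimal Python sketch (the function name `ols_fit` and the toy data are mine, not from the course):

```python
def ols_fit(xs, ys):
    """Return (a, b) for y = a + b*x using the summation formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
    a = (sy - b * sx) / n                          # intercept
    return a, b

xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]       # data lying exactly on y = 1 + 2x
a, b = ols_fit(xs, ys)
residuals = [y - a - b * x for x, y in zip(xs, ys)]  # e_i = y_i - a - b*x_i
```

Because the toy data fall exactly on a line, every residual is zero and the minimized Σ e_i² is 0.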
Confidence interval: x̄ ± z·s_x/√n, where z is the z-score of the area (1 − CL)/2.

t-distribution: Similar to the normal distribution but with fatter tails. Different sample sizes lead to slightly different shapes for the t-distribution. df (Degrees of Freedom) = n − 1.
t = (x̄ − μ)/(s_x/√n) — NEVER use σ with t.
Confidence interval using t: x̄ ± t·s_x/√n

Central Limit Theorem: A random sample of size n ≥ 30 is selected from an infinite (or large) population distributed with mean μ and standard deviation σ. Let x̄ be the mean of this random sample; even if the parent population is not normally distributed, x̄ is approximately normally distributed. Both (x̄ − μ)/(σ/√n) (rare) and (x̄ − μ)/(s_x/√n) (frequent) have the same distribution as z.

Binary Data: Suppose a very large data set comprises only 1s and 0s.
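The z-based confidence interval above can be sketched with only the standard library (`NormalDist` supplies the z critical value; the sample data are made up):

```python
from statistics import NormalDist, mean, stdev

data = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # made-up sample
CL = 0.95
z = NormalDist().inv_cdf(1 - (1 - CL) / 2)       # z for tail area (1-CL)/2
xbar, s = mean(data), stdev(data)                # stdev uses the n-1 formula
half_width = z * s / len(data) ** 0.5            # z * s_x / sqrt(n)
ci = (xbar - half_width, xbar + half_width)
```

For a t-based interval you would swap z for the t critical value with df = n − 1; the standard library has no t quantile function, so that step needs a table or an external package.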
Point-estimate confidence interval: x̄ ± z·s_x/√n
For binary data: p̂ ± z·√( p̂(1 − p̂)/n )

Binary data parameters: μ = p, where p is the proportion of 1s; σ = √( p(1 − p) )

Normal Distributions: The probability of one specific outcome of a continuous random variable z is 0%.

For a sample of size n, s_x = √[ Σ(x − x̄)² / (n − 1) ]

Sample Size Formulas (always use z):
1. For a mean: E = z·s_x/√n  →  n = (z·s_x/E)²  (always round up)
2. For binary data: E = z·√( p̂(1 − p̂)/n )  →  n = z²·p̂(1 − p̂)/E²  (round up)
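The two sample-size formulas above can be sketched as follows (function names and the 95% confidence choice are mine; rounding is always upward, as the sheet says):

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)      # z for 95% confidence

def n_for_mean(s, E):
    """Sample size for estimating a mean: n = (z*s/E)^2, rounded up."""
    return ceil((z * s / E) ** 2)

def n_for_proportion(E, p=0.5):
    """Sample size for a proportion: n = z^2 * p(1-p) / E^2, rounded up.
    p=0.5 is the conservative worst case when no prior estimate exists."""
    return ceil(z ** 2 * p * (1 - p) / E ** 2)
```

For example, a ±3-point margin on a proportion with the conservative p = 0.5 gives the familiar n = 1068.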
Normal probabilities: Use the Cumulative Distribution Function (CDF) for a range of values, e.g. CDF F(x) = P(z < 1). The density function is the derivative of the CDF. The shape of the CDF is a sigmoid curve; the shape of the density function is a bell curve: (1/√(2π))·e^(−x²/2). Probabilities are found as areas under the density curve: first convert data to z-scores, then look up areas in the table.
For percentiles, use the table backwards: treat the percentile as an area, look up the z-score, then convert it to a data value.

Sampling distribution of a proportion: z = (p̂ − p)/√( p(1 − p)/n ), where p̂ is the average of a sample of size n.

If a random sample of size n is drawn from a normal population, then (x̄ − μ)/(σ/√n) has the same distribution as z (x̄ is the mean of the sample). Standard error of the sample mean: s_x/√n.

Estimation: The use of a single number calculated from a sample is called point estimation (a statistic). Unfortunately, it is rarely equal to the parameter it is estimating. An interval estimate has a margin of error. Confidence Level: an estimate of the probability that the confidence interval includes the true parameter value.

Conservative sample size: If we don't know p̂, use the worst-case scenario of 0.5 and (1 − 0.5). If there is p̃, a prior estimate of p: n = z²·p̃(1 − p̃)/E²

Hypothesis-Testing Procedure (keywords "significant", "statistically significant"):
1. Form appropriate null and alternative hypotheses. H0 by convention has ≥, ≤ or =, whereas Ha has >, < or ≠.
2. Determine the appropriate distribution (z, t, χ² or F).
3. Calculate the appropriate test statistic.
4. Using the level of significance α (by default 0.05) and tables of the distribution, form the rejection region:
   a. H0: μ ≤ μ0, Ha: μ > μ0 — right-tailed test. Tail area = α. Reject H0 if the test statistic is > the critical value from the table.
   b. H0: μ ≥ μ0, Ha: μ < μ0 — left-tailed test. Tail area = α. Reject H0 if the test statistic is < the critical value from the table.
   c. H0: μ = μ0, Ha: μ ≠ μ0 — two-tailed test. Tail areas = α/2. Reject H0 if the test statistic is > the right-hand critical value or < the left-hand critical value.
5. State the conclusion in ordinary language.

p-Value Hypothesis Testing: Same procedure as regular hypothesis testing, except that instead of a rejection region you find the z/t/F/χ² score and look up the smaller tail area under the distribution (this is the p-value).
1. For a right/left-tailed test, reject H0 if p < α.
2. For a two-tailed test, reject H0 if 2p < α.

2-Sample Tests (used when comparing the means of two samples if n ≥ 30):
For proportions (unless using a pooled test): z = (p̂1 − p̂2) / √[ p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 ]
For means (except when using pooled variance, aka ANOVA): z = (x̄1 − x̄2) / √( s1²/n1 + s2²/n2 )
Standard error for x̄1 − x̄2 is √( s1²/n1 + s2²/n2 ).
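The two-sample z test for means, with the p-value decision rule above, can be sketched like this (the summary statistics are invented for illustration):

```python
from statistics import NormalDist

def two_sample_z(x1, s1, n1, x2, s2, n2):
    """z statistic for comparing two sample means (each n >= 30)."""
    se = (s1**2 / n1 + s2**2 / n2) ** 0.5   # standard error of x̄1 - x̄2
    return (x1 - x2) / se

z = two_sample_z(105.0, 10.0, 50, 100.0, 12.0, 60)  # made-up summary stats
p_one_tail = 1 - NormalDist().cdf(z)   # right-tail area = p-value
reject_h0 = p_one_tail < 0.05          # right-tailed test at alpha = 0.05
```

Here z ≈ 2.38, so the right-tail p-value is well under 0.05 and H0 is rejected.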
Student's t distribution — Theoretical Requirements:
1. The population should be approximately normally distributed.
2. The sample should be randomly selected.
Useful for sample sizes n < 30.
T-statistic: t = (x̄1 − x̄2) / √( s1²/n1 + s2²/n2 )
Satterthwaite DF (truncate the result):
df = ( s1²/n1 + s2²/n2 )² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
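The Welch t-statistic and truncated Satterthwaite df above can be sketched as one function (the function name and sample statistics are mine):

```python
from math import floor

def welch_t(x1, s1, n1, x2, s2, n2):
    """t statistic and truncated Satterthwaite df for two small samples."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (x1 - x2) / (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, floor(df)   # the crib sheet says to truncate the result

t, df = welch_t(20.0, 4.0, 10, 17.0, 5.0, 12)  # made-up summary statistics
```

For these numbers the exact df is about 19.98, which truncates to 19 (not rounds to 20), matching the "truncate" rule.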
Multiple Regression: ŷ = b0 + b1·x1 + ⋯ + bk·xk
T statistic for a coefficient = bi / s_bi, where s_bi is the standard error of bi.

χ² (Chi-square) distribution: χ² ~ z1² + z2² + ⋯ + zn² (never negative)

ANOVA: used for testing whether several means are statistically different.
Theoretical Requirements:
1. The samples are taken from populations that are all normally distributed.
2. The samples are randomly and independently selected.
3. The variances of all the populations are roughly the same (homoscedasticity).

Analysis of Variance for Regression:
Source       DF       SS       MS     F
Regression   k        SSR      MSR    MSR/MSE
Error        n−k−1    SSE      MSE
Total        n−1      SSTOT

R² (coefficient of determination) = SSR/SSTOT, aka the proportion of the variation in y that is explained by the regression relationship. The most common measure of the accuracy of a model.

SSR (explained variation / regression sum of squares) = R²·SSTOT
SSE (unexplained variation in y) = SSTOT − SSR
SSTOT (total variation in y) = (Σy²) − (Σy)²/n

ANOVA test statistic = MSTR/MSE. Compare to the F distribution:
Numerator DF = number of treatments (k) − 1: NDF = k − 1
Denominator DF = number of data points (n) − k: DDF = n − k
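The sums-of-squares identities above can be checked with toy numbers (the data and predictions are invented; SSR = SSTOT − SSE uses the sheet's identity, which holds exactly for least-squares fits):

```python
ys    = [3.0, 5.0, 6.0, 10.0]   # made-up observed y values
yhats = [3.5, 4.5, 6.5, 9.5]    # made-up fitted values from some model

n = len(ys)
sstot = sum(y * y for y in ys) - sum(ys) ** 2 / n        # total variation in y
sse   = sum((y - yh) ** 2 for y, yh in zip(ys, yhats))   # unexplained variation
ssr   = sstot - sse                                      # explained variation
r2    = ssr / sstot                                      # coefficient of determination
```

Here SSTOT = 26, SSE = 1, so R² = 25/26 ≈ 0.96: about 96% of the variation in y is explained.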
ANOVA Table Method (preferred method):

Source       DF     SS      MS      F
Treatments   k−1    SSTR    MSTR    MSTR/MSE
Error        n−k    SSE     MSE
Total        n−1    SSTOT

H0: μ1 = μ2 = μ3 … (the means are not significantly different)
Ha: at least one mean is significantly different
(Here k = number of treatments. In the regression ANOVA table, k = number of x variables and n = number of lines of data.)

SS (sums of squares):
SSTR (treatment sum of squares) = Σ [ (sum of treatment)² / (# of data in treatment) ] − (sum of all)² / (total # of data)
SSE (error sum of squares) = SSTOT − SSTR
SSTOT (total sum of squares) = (sum of all squares) − (sum of all)² / (total # of data)

MS (mean squares): MSTR = SSTR/(k − 1); MSE = SSE/(n − k)
Treatment Mean Square (MSTR): Σ nᵢ(x̄ᵢ − x̄)² / (k − 1)
Error Mean Square (MSE): the average of the squared standard deviations of the treatments.

χ² = Σ (O − E)²/E, where O are observed frequencies and E are expected frequencies.

Chi-square independence test (keywords "independent", "dependent", "depends on", "related to"):
ALWAYS: H0: A and B are independent; Ha: A and B are significantly dependent.
DF = (rows − 1)(columns − 1)

Example contingency table (observed):
        A1     A2
B1      O11    O12
B2      O21    O22

Example of expected frequencies (column total × row total ÷ grand total):
        A1           A2
B1      A1·B1/n      A2·B1/n
B2      A1·B2/n      A2·B2/n

If the test statistic > the chi-square value in the table, reject the null hypothesis.

Chi-squared goodness-of-fit test: If two sets of data are given, one a set of expected values and the other a set of observed values, use the χ² statistic. Never use proportions for the frequencies; if proportions are given, multiply by the total number of data points to get the frequencies.
H0: The expected values are not significantly different from the observed values.
Ha: The expected values are significantly different from the observed values.

Multiple regression significance: F and p-values are used for predicting whether a model is significant for predicting y.
H0: the model is not significant for predicting y; Ha: the model is significant for predicting y.
A change in R² between two models can be determined by examining the p-value of the removed variable: if p < α, the change is significant.
The Marginal Contribution of a variable is its coefficient.
CI for a marginal contribution: bᵢ ± t·s_bᵢ, with DF = Error DF.

Power Test (probability that the null hypothesis will be correctly rejected): Say we have σ and n:
H0: μ ≤ a; Ha: μ > a; but μ actually = b.
Power of this test = P( z > (a − b)/(σ/√n) + 1.645 )

Prediction-Interval Formula (for simple regression):
(a + b·x0) ± t·√MSE · √[ 1 + 1/n + (x0 − x̄)² / (Σx² − (Σx)²/n) ]

Standard error of estimate = √MSE = √(SSE / Error DF)

ANOVA Confidence Intervals (must be used if the context of the question is ANOVA).
NOTE: t_DDF = √(F_{1,DDF})  (when the F numerator DF is 1)
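The ANOVA-table computations above can be sketched end to end with the sheet's SS formulas (the three treatment groups are toy data of my own):

```python
groups = [[4.0, 5.0, 6.0],   # treatment 1 (made-up data)
          [7.0, 8.0, 9.0],   # treatment 2
          [6.0, 7.0, 8.0]]   # treatment 3

alldata = [x for g in groups for x in g]
n, k = len(alldata), len(groups)
grand = sum(alldata)

# SSTR = sum over treatments of (treatment sum)^2 / group size, minus (grand sum)^2 / n
sstr = sum(sum(g) ** 2 / len(g) for g in groups) - grand ** 2 / n
# SSTOT = sum of all squares minus (grand sum)^2 / n ; SSE = SSTOT - SSTR
sstot = sum(x * x for x in alldata) - grand ** 2 / n
sse = sstot - sstr

mstr, mse = sstr / (k - 1), sse / (n - k)
F = mstr / mse            # compare to F with NDF = k-1, DDF = n-k
```

For these numbers SSTR = 14, SSE = 6, and F = 7 with (2, 6) degrees of freedom; F would then be compared to the table value at α.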
ANOVA confidence intervals — turn α into a t value with DF = Error DF (n − k):
a. x̄ᵢ ± t·√( MSE/nᵢ )
b. x̄1 − x̄2 ± t·√( MSE/n1 + MSE/n2 )

Standard error of a simple regression coefficient: s_b = √[ MSE / (Σx² − (Σx)²/n) ]

Type I error: If H0 is true, but an unlucky choice of sample makes the statistician reject H0.
Type II error: If H0 is false, but an unlucky choice of sample makes the statistician not reject H0.

Significance for Variance: If one variance is 4 times another, the variances are significantly different.
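The standard error of the simple-regression slope above can be sketched by fitting the line, computing MSE with error DF = n − 2 (i.e. n − k − 1 with one x variable), and applying the formula (toy data, slightly noisy around y = 2x):

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 11.0]   # made-up data, roughly y = 2x

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

b = (n * sxy - sx * sy) / (n * sxx - sx * sx)       # OLS slope
a = (sy - b * sx) / n                               # OLS intercept
sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
mse = sse / (n - 2)                                 # error DF = n - 2
s_b = (mse / (sxx - sx ** 2 / n)) ** 0.5            # std error of the slope
```

A confidence interval for the slope would then be b ± t·s_b with DF = n − 2.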