MGCR 271 Crib Sheet

By Kareem Halabi

Stem plot: The stem is the first few digits; the second column lists the last digit of each data point (the same digit can appear more than once).
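A stem plot can be sketched in a few lines of Python. This is a minimal illustration that assumes non-negative integers, with the stem taken as every digit except the last; the function name `stem_plot` and the sample data are made up for the example.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-and-leaf rows for non-negative integers.

    Assumption: the stem is all digits except the last, the leaf is the last digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    lines = []
    # Include stems with no data so gaps in the distribution stay visible
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

data = [12, 15, 15, 21, 23, 38, 40]  # illustrative data only
for line in stem_plot(data):
    print(line)
```

Repeated values (15 appears twice here) show up as repeated leaves on the same stem.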

Mean (population mean is $\mu$): $\bar{x} = \frac{\sum x_i}{n}$

Sample standard deviation (quantifiable spread), also written $s_x$, $\sigma'$ or $x\sigma_{n-1}$: $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Population standard deviation (always use this when dealing with decimal percentages): $\sigma_x = \sqrt{\frac{\sum (x - \mu)^2}{n}}$

Z-score: $z = \frac{x - \bar{x}}{s_x}$ (if $|z| > 3$ then the number is an outlier). All the z-scores of a data set will have mean 0 and standard deviation 1.

Coefficient of Variation: $CV = \frac{s_x}{\bar{x}}$

Ordinary Least Squares regression: Goal is to minimize $\sum e_i^2$ for the line $y = a + bx$
Residual (vertical distance) for $(x_i, y_i)$: $e_i = y_i - a - b x_i$
$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
$a = \bar{y} - b\bar{x}$

Conditional Probability: Probability of E happening given that F happens: $P(E \mid F) = \frac{P(E \cap F)}{P(F)}$

Random variable: assigns a numerical value to every possible outcome of an experiment
Discrete random variable: there is a gap between successive possible values (Ex: can have 72 or 73 people but not 72.3)
Continuous random variable: can assume all values in some interval
Probability Distribution Function (PDF): the set of all possible values of a discrete random variable together with their probabilities

Intuitive definition of percentiles: If n data are arranged in numerical order, a number x is called a pth percentile if
1. At least p% of the data are $\le x$
2. At least (100 - p)% of the data are $\ge x$

Linear interpolation for percentiles:
1. Put the data into numerical order
2. Calculate $p\%(n + 1) = k + d$, where k is an integer and d is a decimal
3. The pth percentile is $x_k + d(x_{k+1} - x_k)$

First Quartile: Q1 = 25th percentile
Median: M = 50th percentile
Third Quartile: Q3 = 75th percentile
Interquartile Range: IQR = Q3 - Q1

Negative (left) skew: $\bar{x} < M$

Outliers by boxplot criterion:
High outlier: $x > Q3 + 1.5 \times IQR$
Low outlier: $x < Q1 - 1.5 \times IQR$

Example Tukey's boxplot. The plot marks, in order:
1. Lowest non-outlier (by boxplot criterion)
2. First Quartile
3. Median
4. Third Quartile
5. Highest non-outlier (by boxplot criterion)
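The percentile interpolation steps and the boxplot outlier criterion from this sheet can be sketched in Python as follows. The function names and the data are hypothetical, and clamping to the smallest or largest value when the position p%(n + 1) falls outside 1..n is an assumption the sheet does not cover.

```python
def percentile(data, p):
    """pth percentile using the p%(n + 1) = k + d interpolation rule."""
    xs = sorted(data)
    n = len(xs)
    pos = p / 100 * (n + 1)
    k = int(pos)   # integer part
    d = pos - k    # decimal part
    # Assumption (not in the sheet): clamp when the position falls outside 1..n
    if k < 1:
        return xs[0]
    if k >= n:
        return xs[-1]
    # x_k is 1-indexed in the formula, so x_k is xs[k - 1]
    return xs[k - 1] + d * (xs[k] - xs[k - 1])

def boxplot_outliers(data):
    """Values beyond Q1 - 1.5*IQR or Q3 + 1.5*IQR."""
    q1 = percentile(data, 25)
    q3 = percentile(data, 75)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

data = [2, 4, 5, 7, 8, 9, 11, 12, 13, 14, 40]  # illustrative data only
print("Q1:", percentile(data, 25))
print("M:", percentile(data, 50))
print("Q3:", percentile(data, 75))
print("outliers:", boxplot_outliers(data))
```

With 11 values, p%(n + 1) lands on whole positions for the quartiles (3, 6 and 9), so d = 0 and no interpolation is needed; a data set such as [1, 2, 3, 4] gives a median position of 2.5 and exercises the interpolation step.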

Empirical Rule: If a data set is unimodal and not very skewed, then
1. ~68% of data are within $1 s_x$ of $\bar{x}$
2. ~95% of data are within $2 s_x$ of $\bar{x}$
3. ~99.7% of data are within $3 s_x$ of $\bar{x}$
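To see the rule on data, the sketch below computes z-scores with the sample standard deviation and counts the share of values within 1, 2 and 3 standard deviations of the mean. The sample is synthetic: the seed, the mean of 170 and the spread of 10 are arbitrary assumptions, and the function names are made up for the example.

```python
import math
import random

def z_scores(data):
    """Z-scores using the sample mean and the sample standard deviation (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return [(x - mean) / s for x in data]

def share_within(data, k):
    """Fraction of the data within k sample standard deviations of the mean."""
    zs = z_scores(data)
    return sum(1 for z in zs if abs(z) <= k) / len(zs)

# Synthetic, roughly bell-shaped sample (all parameters are arbitrary assumptions)
rng = random.Random(0)
sample = [rng.gauss(170, 10) for _ in range(10_000)]

for k in (1, 2, 3):
    print(f"within {k} s of the mean: {share_within(sample, k):.3f}")
```

For a unimodal, roughly symmetric sample like this one the printed shares should land close to 68%, 95% and 99.7%; a strongly skewed data set would not be expected to match.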

Expected Value of x: $E(x) = \mu(x) = \mu_x = \bar{x}$

Statistical Independence: The probability of A happening is independent of whether or not B happens (the second flip of a coin is independent of the first flip). Equivalent conditions:
1. $P(A \mid B) = P(A \mid \bar{B})$
2. $P(A \mid B) = P(A)$
3. $P(A \cap B) = P(A)P(B)$ (can be used to determine whether A and B are independent)

Binomial Setup:
1. There are only two outcomes to an experiment: success and failure
2. Let p be the probability of success, and it is the same every time

If the experiment is performed n times, the probability of x successes and n - x failures is: $\binom{n}{x} p^x (1 - p)^{n - x}$
On calc: (nCx)(p^x)(1-p)^(n-x)

For Binomial problems: $E(x) = np$, $\sigma(x) = \sqrt{np(1 - p)}$

End of Midterm Material
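As a quick check of the binomial formulas, the sketch below evaluates the probability of x successes and compares E(x) = np with the mean computed directly from the distribution. The function names and the numbers n = 10, p = 0.5 are illustrative assumptions; `math.comb` needs Python 3.8 or later.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials: (nCx) p^x (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_mean(n, p):
    return n * p

def binomial_sd(n, p):
    return math.sqrt(n * p * (1 - p))

n, p = 10, 0.5  # illustrative values only
print("P(x = 3):", binomial_pmf(3, n, p))
print("E(x):", binomial_mean(n, p))
print("sigma(x):", binomial_sd(n, p))

# The probabilities over x = 0..n should sum to 1, and the weighted mean should match np
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
mean_from_pmf = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
print("sum of probabilities:", total)
print("mean from the distribution:", mean_from_pmf)
```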