C955 - Applied Statistics and Probability
Nice applicable content provided in this class. I used the cohorts and the key concepts and formulas page.
Last updated: May 4th, 2023
Fractions and Percentages
Multiplying and Dividing Fractions
1/3 × 2/8 = (1 × 2) / (3 × 8) = 2/24 = 1/12
2/3 = x/6 → 2 × 6 = 3 × x → 12 = 3x → x = 4
3/5 ÷ 2/3 = 3/5 × 3/2 = 9/10
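A quick way to check fraction arithmetic like the lines above is Python's built-in `fractions` module. This is only a sketch for checking work, not part of the course material; the variable names are made up for illustration.

```python
from fractions import Fraction

# Multiplying: multiply the numerators and the denominators, then reduce.
product = Fraction(1, 3) * Fraction(2, 8)

# Dividing: multiply by the reciprocal of the second fraction.
quotient = Fraction(3, 5) / Fraction(2, 3)

# Proportion 2/3 = x/6: cross-multiply, so x = (2 * 6) / 3.
x = Fraction(2 * 6, 3)

print(product, quotient, x)  # 1/12 9/10 4
```

`Fraction` reduces automatically, which is why 2/24 shows up as 1/12.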
Unit Conversions
Common Imperial
- 1 tablespoon (tbsp) = 3 teaspoons (tsp)
- 1 cup (c) = 16 tablespoons (tbsp)
- 1 pint (pt) = 2 cups (c)
- 1 quart (qt) = 2 pints (pt)
- 1 gallon (gal) = 4 quarts (qt)
- 1 pound (lb) = 16 ounces (oz)
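Conversions between these units are just chains of multiplication or division by the factors listed. A minimal sketch, assuming hypothetical helper names (`gallons_to_teaspoons`, `ounces_to_pounds`) chosen only for this example:

```python
from fractions import Fraction

# Conversion factors copied from the list above.
TSP_PER_TBSP = 3
TBSP_PER_CUP = 16
CUPS_PER_PINT = 2
PINTS_PER_QUART = 2
QUARTS_PER_GALLON = 4
OZ_PER_LB = 16

def gallons_to_teaspoons(gallons):
    """Going from a larger unit to a smaller one: multiply down the chain."""
    return (gallons * QUARTS_PER_GALLON * PINTS_PER_QUART * CUPS_PER_PINT
            * TBSP_PER_CUP * TSP_PER_TBSP)

def ounces_to_pounds(ounces):
    """Going from a smaller unit to a larger one: divide by the factor."""
    return Fraction(ounces, OZ_PER_LB)

print(gallons_to_teaspoons(1))  # 768
print(ounces_to_pounds(40))     # 5/2
```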
Metric Chart
- kilo- (k): 1,000
- hecto- (h): 100
- deka- (da): 10
- deci- (d): 0.1
- centi- (c): 0.01
- milli- (m): 0.001
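One way to use the chart: convert to the base unit first, then to the target prefix. The sketch below assumes a hypothetical `convert` helper and uses an empty string for the base unit (meter, liter, gram); exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Size of each prefix relative to the base unit, from the chart above.
PREFIX = {
    "k": Fraction(1000), "h": Fraction(100), "da": Fraction(10),
    "": Fraction(1),  # base unit, no prefix
    "d": Fraction(1, 10), "c": Fraction(1, 100), "m": Fraction(1, 1000),
}

def convert(value, from_prefix, to_prefix):
    """Scale to the base unit, then divide by the target prefix size."""
    return value * PREFIX[from_prefix] / PREFIX[to_prefix]

print(convert(3, "k", ""))           # 3000   (3 km in m)
print(float(convert(250, "c", "")))  # 2.5    (250 cm in m)
print(convert(1, "k", "m"))          # 1000000 (1 kg in mg)
```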
Basic Algebra
Linear Equations
- x − 2 = 7 → x = 9
Linear Inequalities
- 3y + 5 < 14 → 3y < 9 → y < 3
- −2z − 4 > 12 → −2z > 16 → z < −8
- Notice the flipped sign: dividing (or multiplying) both sides by a negative number reverses the inequality.
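A solved inequality can be sanity-checked by plugging in values and confirming that the original and the solved form agree. The sketch below only tests integers from −20 to 20, so it is a spot check rather than a proof; `agree` is a hypothetical helper name.

```python
def agree(original, solved, values):
    """True if both forms of the inequality give the same answer for every value."""
    return all(original(v) == solved(v) for v in values)

values = range(-20, 21)

# 3y + 5 < 14 should match y < 3.
y_ok = agree(lambda y: 3 * y + 5 < 14, lambda y: y < 3, values)

# -2z - 4 > 12 should match z < -8 (sign flipped after dividing by -2).
z_ok = agree(lambda z: -2 * z - 4 > 12, lambda z: z < -8, values)

# Forgetting to flip the sign gives z > -8, which does not match.
z_unflipped = agree(lambda z: -2 * z - 4 > 12, lambda z: z > -8, values)

print(y_ok, z_ok, z_unflipped)  # True True False
```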
Slope-Intercept
- y = mx + b, where m is the slope and b is the y-intercept
- Imagine m as a fraction, rise/run
- m = 3 is the same as m = 3/1: go up 3, right 1
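To see "up 3, right 1" in numbers, the sketch below builds a line with m = 3 and lists a few points. The intercept b = 1 and the helper names are made up for the example.

```python
def make_line(m, b):
    """Return the function y = m*x + b."""
    return lambda x: m * x + b

def slope(p, q):
    """Rise over run between two points (x1, y1) and (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)

f = make_line(3, 1)  # m = 3 from the notes; b = 1 is a made-up intercept
points = [(x, f(x)) for x in range(4)]

print(points)                       # [(0, 1), (1, 4), (2, 7), (3, 10)]
print(slope(points[0], points[1]))  # 3.0
```

Each step of 1 to the right raises y by 3, and the line crosses the y-axis at b.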
Terminology for Single Variable Stats
Data Types
- Quantitative (measurable)
- Categorical
Graphs
- Pie - Parts of a whole
- Bar Chart - Frequencies of categories
- Histogram - Shape and spread of data
  - Normal (symmetric) - Mean, Median, Mode are equal
  - Positive skew (tail to the right) - Mean > Median > Mode
  - Negative skew (tail to the left) - Mean < Median < Mode
- Box Plot - Center, spread and outliers. Each section covers 25% of data.
- Dot Plot - Shows all data points
- Stem Plot - Shows shape according to place values. Shows all data points.
- Categorical data: Pie Chart or Bar Chart
- Quantitative data: Histogram, Stem Plot, Box Plot, Dot Plot
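The mean/median/mode ordering for skewed data can be checked on a small example. The data set below is made up to be positively skewed (most values small, a few large); it is an illustration, not course data.

```python
from statistics import mean, median, mode

# Made-up right-skewed data: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

data_mean = mean(data)
data_median = median(data)
data_mode = mode(data)

print(data_mean, data_median, data_mode)  # 4.5 3.0 2
```

The few large values pull the mean up the most, the median less, and leave the mode alone, which gives Mean > Median > Mode.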
Standard Deviation Rule (for normal distribution)
- Within 1 SD of the mean = 68% of data
- Within 2 SD of the mean = 95% of data
- Within 3 SD of the mean = 99.7% of data
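The percentages in this rule come from the normal curve itself: the usual rounded values are 68%, 95%, and 99.7% for 1, 2, and 3 standard deviations. A short check using `statistics.NormalDist` from the standard library (Python 3.8+); `within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```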
Terminology for Two Variable Stats
Graphs
- C → C (categorical → categorical)
  - Two-Way Table
  - Conditional Percentages
- C → Q (categorical → quantitative)
  - Side-by-Side Boxplots
  - Five Number Summary
- Q → Q (quantitative → quantitative)
  - Scatterplot
  - r value
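The r value (correlation coefficient) for a Q → Q relationship ranges from −1 to 1. A minimal sketch of the standard Pearson formula, written by hand so it runs on any Python 3 version; `pearson_r` is a hypothetical name and the data points are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y divided by their spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])  # points on a rising line
perfect_negative = pearson_r([1, 2, 3], [6, 4, 2])  # points on a falling line

print(perfect_positive, perfect_negative)  # 1.0 -1.0
```

Values near 1 or −1 mean a strong linear relationship; values near 0 mean a weak one.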
Correlation and Regression
Study Design
- Experimental
- Researchers assign participants to groups
- Observational
- Researchers only observe; groups are already determined (for example, self-selected by participants)
Simpson's Paradox
- Simpson's Paradox is a statistical phenomenon where a trend appears in separate groups of data, but disappears or reverses when the groups are combined.
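A small numeric example makes the reversal easier to see. The counts below are entirely made up for illustration: option A has the higher success rate inside each group, yet option B has the higher rate once the groups are pooled, because the group sizes are very uneven.

```python
from fractions import Fraction

# Hypothetical (successes, trials) for options A and B in two groups.
data = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    return Fraction(successes, trials)

# Within each group, compare A's success rate with B's.
a_wins_each_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in data.values()
)

def combined(option):
    """Pool successes and trials across all groups for one option."""
    successes = sum(group[option][0] for group in data.values())
    trials = sum(group[option][1] for group in data.values())
    return Fraction(successes, trials)

a_wins_combined = combined("A") > combined("B")

print(a_wins_each_group, a_wins_combined)  # True False
print(combined("A"), combined("B"))        # 19/55 36/55
```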
Probability
Finding Probability
Sample Spaces
- Flipping an increasing number of coins:
- H, T = 1 coin, 2 possible outcomes
- HH, HT, TH, TT = 2 coins, 4 possible outcomes
- HHH, HHT, HTH, HTT, THH, THT, TTH, TTT = 3 coins, 8 possible outcomes
- Each extra coin doubles the count: n coins have 2^n possible outcomes
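Sample spaces like these can be listed mechanically with `itertools.product`, which is handy for checking that no outcome was missed. The `sample_space` helper is a hypothetical name, and the "exactly two heads" question is an added example.

```python
from fractions import Fraction
from itertools import product

def sample_space(n_coins):
    """All equally likely outcomes of flipping n_coins fair coins."""
    return ["".join(flips) for flips in product("HT", repeat=n_coins)]

print(sample_space(2))       # ['HH', 'HT', 'TH', 'TT']
print(len(sample_space(3)))  # 8

# Probability = favorable outcomes / total outcomes.
three_coins = sample_space(3)
two_heads = [outcome for outcome in three_coins if outcome.count("H") == 2]
p_two_heads = Fraction(len(two_heads), len(three_coins))
print(p_two_heads)           # 3/8
```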
Formulas
Notation:
- P(A or B) represents the probability that either event A or event B will occur, or both A and B will occur simultaneously.
- P(A and B) represents the probability that both events A and B will occur at the same time.
- P(A|B) represents the conditional probability of event A occurring, given that event B has already occurred.
Vocabulary:
- Complement Rule: P(not A) = 1 - P(A)
- Disjoint (mutually exclusive) Events cannot occur at the same time: if one happens, the other cannot. In this case, P(A and B) = 0.
- Independent Events do not affect the probability of each other occurring. If events A and B are independent, then P(A|B) = P(A) and P(B|A) = P(B).
Formulas:
- OR Rule (General Addition): The probability of either event A or event B occurring (or both) is given by P(A or B) = P(A) + P(B) - P(A and B). If events A and B are disjoint (mutually exclusive), then the formula simplifies to P(A or B) = P(A) + P(B).
- AND Rule (General Multiplication): The probability of both event A and event B occurring together is given by P(A and B) = P(A) × P(B|A), where P(B|A) is the conditional probability of event B occurring given that event A has occurred. If events A and B are independent, then the formula simplifies to P(A and B) = P(A) × P(B).
- Conditional Probability: The probability of event B occurring given that event A has occurred is given by P(B|A) = P(A and B) / P(A), where P(A and B) is the probability of both event A and event B occurring together, and P(A) is the probability of event A occurring.
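The three formulas can be verified by counting outcomes directly. The sketch below assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 3"; the events and the helper `p` are chosen only for illustration.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                 # one fair six-sided die
A = {o for o in outcomes if o % 2 == 0}     # even: {2, 4, 6}
B = {o for o in outcomes if o > 3}          # greater than 3: {4, 5, 6}

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# OR rule: P(A or B) = P(A) + P(B) - P(A and B)
p_or = p(A) + p(B) - p(A & B)

# Conditional probability: P(B|A) = P(A and B) / P(A)
p_b_given_a = p(A & B) / p(A)

# AND rule: P(A and B) = P(A) x P(B|A)
p_and = p(A) * p_b_given_a

print(p_or, p_b_given_a, p_and)  # 2/3 2/3 1/3
```

Each result matches a direct count: A or B is {2, 4, 5, 6}, and A and B is {4, 6}.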
Cheatsheet
- Step 1: Independent if any one of these is true
  - P(A|B) = P(A)
  - P(B|A) = P(B)
  - P(A and B) = P(A) * P(B)
  - P(B|A) = P(B| not A)
- Step 2: Apply Formulas
  - Independent
    - AND - Multiplication: P(A) * P(B)
    - OR/BOTH - Addition: P(A) + P(B) - P(A) * P(B)
  - Dependent
    - AND (Not-Disjoint) - Multiplication: P(A) * P(B|A)
    - OR (Disjoint) - Addition: P(A) + P(B)
    - OR (Not-Disjoint) - Addition: P(A) + P(B) - P(A and B)
  - NOT - Complementary: P(not A) = 1 - P(A)
  - IF/ALSO - Conditional: P(B|A) = P(A and B) / P(A)
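The two steps can be walked through on a concrete case. This sketch again assumes one roll of a fair six-sided die; the events and the `is_independent` helper are made up for the example.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # one fair six-sided die (illustrative)

def p(event):
    return Fraction(len(event), len(outcomes))

def is_independent(a, b):
    """Step 1: independent exactly when P(A and B) = P(A) * P(B)."""
    return p(a & b) == p(a) * p(b)

even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
above_3 = {4, 5, 6}

print(is_independent(even, at_most_4))  # True
print(is_independent(even, above_3))    # False

# Step 2, independent pair: AND is a plain product, OR subtracts that product.
p_and_indep = p(even) * p(at_most_4)
p_or_indep = p(even) + p(at_most_4) - p_and_indep
print(p_and_indep, p_or_indep)          # 1/3 5/6

# Step 2, dependent pair: AND needs the conditional probability P(B|A).
p_b_given_a = p(even & above_3) / p(even)
p_and_dep = p(even) * p_b_given_a
print(p_b_given_a, p_and_dep)           # 2/3 1/3
```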