Neko

C955 - Applied Statistics and Probability

This class provides practical, applicable content. I studied using the cohorts and the Key Concepts and Formulas page.


Last updated: May 4th, 2023

Fractions and Percentages

Multiplying and Dividing Fractions

  • 1/3 × 2/8 = (1 × 2) / (3 × 8) = 2/24 = 1/12
  • 2/3 = x/6 → 2 × 6 = 3 × x → 12 = 3x → x = 4
  • 3/5 ÷ 2/3 = 3/5 × 3/2 = 9/10
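The three examples above can be double-checked with Python's standard `fractions` module. This is only a quick sketch for verifying the arithmetic; the variable names are chosen for illustration.

```python
from fractions import Fraction

# Multiplying: numerators together, denominators together, then reduce.
product = Fraction(1, 3) * Fraction(2, 8)

# Dividing: multiply by the reciprocal of the second fraction.
quotient = Fraction(3, 5) / Fraction(2, 3)

# Proportion 2/3 = x/6: cross-multiply, so x = (2 * 6) / 3.
x = Fraction(2 * 6, 3)

print(product, quotient, x)  # 1/12 9/10 4
```

`Fraction` reduces results automatically, which is why 2/24 shows up as 1/12.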

Unit Conversions

Common Imperial

  • 1 tablespoon (tbsp) = 3 teaspoons (tsp)
  • 1 cup (cup) = 16 tablespoons (tbsp)
  • 1 pint (pt) = 2 cups (cup)
  • 1 quart (qt) = 2 pints (pt)
  • 1 gallon (gal) = 4 quarts (qt)
  • 1 pound (lb) = 16 ounces (oz)
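One way to chain the volume conversions above is to express every unit in teaspoons and divide. The sketch below assumes only the factors listed; the names `STEPS`, `TSP_PER`, and `convert` are made up for this example, and the pound/ounce pair is left out because it measures weight, not volume.

```python
# Each entry: (larger unit, smaller unit, smaller units per larger unit)
STEPS = [
    ("gal", "qt", 4),
    ("qt", "pt", 2),
    ("pt", "cup", 2),
    ("cup", "tbsp", 16),
    ("tbsp", "tsp", 3),
]

# Build a lookup of how many teaspoons are in one of each unit,
# starting from the smallest unit and working upward.
TSP_PER = {"tsp": 1}
for larger, smaller, factor in reversed(STEPS):
    TSP_PER[larger] = factor * TSP_PER[smaller]

def convert(amount, from_unit, to_unit):
    """Convert by going through teaspoons as the common unit."""
    return amount * TSP_PER[from_unit] / TSP_PER[to_unit]

print(convert(1, "gal", "cup"))   # 16.0
print(convert(2, "cup", "tbsp"))  # 32.0
```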

Metric Chart

  • kilo- (k): 1,000
  • hecto- (h): 100
  • deka- (da): 10
  • deci- (d): 0.1
  • centi- (c): 0.01
  • milli- (m): 0.001
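A metric conversion multiplies by the starting prefix and divides by the target prefix. The sketch below is one way to write that; the empty-string key for the unprefixed base unit and the function name `convert_metric` are additions for illustration and are not part of the chart.

```python
from fractions import Fraction

# Size of each prefix relative to the base unit (meter, liter, gram).
# The "" key for the unprefixed base unit is an addition for this sketch.
PREFIX = {
    "k": Fraction(1000),
    "h": Fraction(100),
    "da": Fraction(10),
    "": Fraction(1),
    "d": Fraction(1, 10),
    "c": Fraction(1, 100),
    "m": Fraction(1, 1000),
}

def convert_metric(value, from_prefix, to_prefix):
    """Scale to the base unit, then down (or up) to the target prefix."""
    return value * PREFIX[from_prefix] / PREFIX[to_prefix]

km_to_m = convert_metric(Fraction(5, 2), "k", "")  # 2.5 km in m
cm_to_mm = convert_metric(3, "c", "m")             # 3 cm in mm
print(km_to_m, cm_to_mm)  # 2500 30
```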

Basic Algebra

Linear Equalities

  • x − 2 = 7 → x = 9

Linear Inequalities

  • 3y + 5 < 14 → 3y < 9 → y < 3
  • -2z - 4 > 12 → -2z > 16 → z < -8
    • Dividing (or multiplying) by a negative number flips the inequality sign
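A brute-force check can confirm that a simplified inequality matches the original one. The sketch below tests both inequalities over a small range of integers (the range is an arbitrary choice) and also shows that forgetting to flip the sign gives a wrong answer.

```python
candidates = range(-20, 21)

# 3y + 5 < 14 should be true exactly when y < 3.
first_ok = all((3 * y + 5 < 14) == (y < 3) for y in candidates)

# -2z - 4 > 12 should be true exactly when z < -8 (sign flipped).
second_ok = all((-2 * z - 4 > 12) == (z < -8) for z in candidates)

# Without flipping the sign (z > -8) the two statements disagree.
unflipped_ok = all((-2 * z - 4 > 12) == (z > -8) for z in candidates)

print(first_ok, second_ok, unflipped_ok)  # True True False
```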

Slope-Intercept

  • y=mx+b
    • Imagine m as a fraction, rise/run
    • m = 3 is the same as m = 3/1
      • go up 3, right 1
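The rise/run idea can be seen by generating a few points on a line. This sketch uses m = 3 from the example; the intercept b = 1 and the helper names are assumptions made only for illustration.

```python
def line_y(m, b, x):
    """Slope-intercept form: y = mx + b."""
    return m * x + b

def slope(p1, p2):
    """Rise over run between two points."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

m, b = 3, 1  # b = 1 is an assumed intercept
points = [(x, line_y(m, b, x)) for x in range(4)]

print(points)                       # [(0, 1), (1, 4), (2, 7), (3, 10)]
print(slope(points[0], points[1]))  # 3.0
```

Each step of 1 to the right raises y by 3, and the line crosses the y-axis at b.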

Terminology for Single Variable Stats

Data Types

  • Quantitative (measurable)
  • Categorical

Graphs

  • Pie - Parts of a whole
  • Bar Chart - Frequencies of categories
  • Histogram - Shape and spread of data
    • Normal (symmetric) - Mean, Median, Mode are equal
    • Positive skew (tail to the right) - Mean > Median > Mode
    • Negative skew (tail to the left) - Mean < Median < Mode
  • Box Plot - Center, spread and outliers. Each section covers 25% of data.
  • Dot Plot - Shows all data points
  • Stem Plot - Shows shape according to place values. Shows all data points.
  • Categorical
    • Pie Chart or Bar Chart
  • Quantitative
    • Histogram, Stem Plot, Boxplot, Dot Plot
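The mean/median/mode ordering for skewed histograms can be seen on small data sets. The two lists below are made-up values chosen to show the pattern, not data from the course.

```python
from statistics import mean, median, mode

# Made-up data with a long tail to the right (positive skew).
right_skewed = [1, 1, 1, 2, 2, 3, 4, 5, 8]

# Made-up data with a long tail to the left (negative skew).
left_skewed = [1, 4, 5, 6, 7, 7, 8, 8, 8]

for name, data in [("positive", right_skewed), ("negative", left_skewed)]:
    print(name, mean(data), median(data), mode(data))
```

For the first list the mean (3) is above the median (2), which is above the mode (1); the second list reverses that order.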

Standard Deviation Rule (for normal distribution)

  • Within 1 SD of the mean = 68% of data
  • Within 2 SD of the mean = 95% of data
  • Within 3 SD of the mean = 99.7% of data
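The rule's percentages are rounded. The share of a normal distribution within k standard deviations of the mean can be computed with `statistics.NormalDist` from the standard library; the helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")
```

The results come out at roughly 68%, 95%, and 99.7%.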

Terminology for Two Variable Stats

Graphs

  • C → C
    • Two way Table
    • Conditional Percentages
  • C → Q
    • Side-by-Side Boxplots
    • Five Number Summary
  • Q → Q
    • Scatterplot
    • r value
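For the Q → Q case, the r value measures the direction and strength of a linear relationship and always falls between -1 and 1. Below is a minimal sketch of the usual Pearson formula; the function name and the sample lists are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))    # perfectly linear, increasing
print(pearson_r(xs, [10, 8, 6, 4, 2]))    # perfectly linear, decreasing
print(pearson_r(xs, [52, 55, 61, 70, 72]))  # strong but not perfect
```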

Correlation and Regression

Study-Design

  • Experimental
    • Researchers assign groups
  • Observational
    • Groups already exist or are chosen by the participants; researchers only observe

Simpson's Paradox

  • Simpson's Paradox is a statistical phenomenon where a trend appears in separate groups of data, but disappears or reverses when the groups are combined.
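A small numeric example makes the reversal concrete. The counts below are invented purely for illustration: option A has the higher success rate inside each group, yet option B has the higher rate once the groups are combined, because the group sizes are so uneven.

```python
from fractions import Fraction

# Invented counts: (successes, total) for each option within each group.
data = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, total):
    return Fraction(successes, total)

# Within each group, compare A against B.
a_wins_each_group = all(
    rate(*counts["A"]) > rate(*counts["B"]) for counts in data.values()
)

# Combine the groups by adding successes and totals separately.
def combined(option):
    successes = sum(counts[option][0] for counts in data.values())
    total = sum(counts[option][1] for counts in data.values())
    return rate(successes, total)

print(a_wins_each_group)                      # True
print(float(combined("A")), float(combined("B")))
print(combined("A") < combined("B"))          # True: the trend reverses
```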

Probability

Finding Probability

Sample Spaces

  • Flipping an increasing number of coins:
    • H, T = 1 coin, 2 possible outcomes
    • HH, HT, TH, TT = 2 coins, 4 possible outcomes
    • HHH, HHT, HTH, HTT, THH, THT, TTH, TTT = 3 coins, 8 possible outcomes
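Coin-flip sample spaces can be listed with `itertools.product`, which makes it easy to see that n coins give 2^n outcomes. The function name `coin_sample_space` is made up for this sketch.

```python
from itertools import product

def coin_sample_space(n):
    """All possible head/tail sequences for n coin flips."""
    return ["".join(flips) for flips in product("HT", repeat=n)]

for n in (1, 2, 3):
    space = coin_sample_space(n)
    print(n, "coin(s):", len(space), "outcomes:", ", ".join(space))
```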

Formulas

Notation:

  • P(A or B) represents the probability that either event A or event B will occur, or both A and B will occur simultaneously.
  • P(A and B) represents the probability that both events A and B will occur at the same time.
  • P(A|B) represents the conditional probability of event A occurring, given that event B has already occurred.

Vocabulary:

  • Complementary Rule is P(not A) = 1 - P(A)
  • Disjoint (mutually exclusive) Events are events that cannot occur at the same time: if one happens, the other cannot. In this case, P(A and B) = 0.
  • Independent Events are events that do not affect the probability of each other occurring. If events A and B are independent, then P(A|B) = P(A) and P(B|A) = P(B): knowing that one event occurred does not change the probability of the other.

Formulas:

  • OR Rule (General Addition): The probability of either event A or event B occurring (or both) is given by P(A or B) = P(A) + P(B) - P(A and B). If events A and B are disjoint (mutually exclusive), then the formula simplifies to P(A or B) = P(A) + P(B).
  • AND Rule (General Multiplication): The probability of both event A and event B occurring together is given by P(A and B) = P(A) × P(B|A), where P(B|A) is the conditional probability of event B occurring given that event A has occurred. If events A and B are independent, then the formula simplifies to P(A and B) = P(A) × P(B).
  • Conditional Probability: The probability of event B occurring given that event A has occurred is given by P(B|A) = P(A and B) / P(A), where P(A and B) is the probability of both event A and event B occurring together, and P(A) is the probability of event A occurring.
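The three formulas can be checked by counting outcomes directly. The sketch below uses a standard 52-card deck with A = "the card is a heart" and B = "the card is a face card"; the events and helper names are chosen for illustration only.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability of an event = matching cards / total cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_face(card):
    return card[0] in {"J", "Q", "K"}

p_a = prob(is_heart)
p_b = prob(is_face)
p_and = prob(lambda c: is_heart(c) and is_face(c))
p_or = prob(lambda c: is_heart(c) or is_face(c))
p_b_given_a = p_and / p_a  # conditional probability formula

print(p_or == p_a + p_b - p_and)   # OR rule holds: True
print(p_and == p_a * p_b_given_a)  # AND rule holds: True
print(p_b_given_a, p_b)            # 3/13 3/13
```

Because P(B|A) equals P(B) here, being a heart and being a face card are independent events.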

Cheatsheet

  1. Step 1: Check for independence. A and B are independent if any one of these is true (each implies the others)
    • P(A|B) = P(A)
    • P(B|A) = P(B)
    • P(A and B) = P(A) * P(B)
    • P(B|A) = P(B|not A)
  2. Step 2: Apply Formulas
    • Independent
      • AND - Multiplication
        • P(A)*P(B)
      • OR/BOTH - Addition
        • P(A) + P(B) - P(A) * P(B)
    • Dependent
      • AND (Not-Disjoint) - Multiplication
        • P(A) * P(B|A)
      • OR (Disjoint) - Addition
        • P(A) + P(B)
      • NOT - Complementary
        • 1 - P(A)
      • IF/ALSO - Conditional
        • P(B|A) = P(A and B) / P(A)
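Both cheatsheet steps can be tried on two dice. The sketch below first tests independence with P(A and B) = P(A) * P(B), then applies the matching OR formula; the events and function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 rolls of two dice

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(a, b):
    """Step 1: independent if P(A and B) = P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return sum(o) == 7

def sum_is_8(o):
    return sum(o) == 8

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False

# Step 2: independent events, so OR = P(A) + P(B) - P(A) * P(B).
p_or = prob(first_even) + prob(sum_is_7) - prob(first_even) * prob(sum_is_7)
print(p_or)  # 7/12
```

Counting directly gives the same answer: 21 of the 36 rolls have an even first die or a sum of 7.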