Week 4 - BALT 4396 - Probability and Statistics for Data Science
Descriptive Statistics
Probability and statistics are core mathematical concepts that allow us to better understand data and evaluate it on different levels. This matters not only for interpretation but also for forecasting. Descriptive statistics let us summarize and describe the main features of a dataset and provide quantitative summaries that support visualization. With Python, these summaries are easy to compute and reuse in later analysis.
Examples:
Mean: The average value of a dataset.
Median: The middle value when data is sorted.
Mode: The most frequently occurring value in a dataset.
Range: The difference between the highest and lowest values in a dataset.
Variance: The average squared difference between each value and the mean.
Standard deviation: The square root of the variance, measuring the dispersion of values in the same units as the data.
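As a minimal sketch, the measures listed above can be computed with Python's built-in statistics module. The sample values are hypothetical and chosen only for illustration. Note that pvariance and pstdev treat the data as a full population (dividing by n), which matches the "average squared difference" definition above; statistics.variance and statistics.stdev divide by n - 1 and are the usual choice for a sample.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)           # average value
median = statistics.median(data)       # middle value of the sorted data
mode = statistics.mode(data)           # most frequently occurring value
data_range = max(data) - min(data)     # highest value minus lowest value
variance = statistics.pvariance(data)  # population variance (divides by n)
std_dev = statistics.pstdev(data)      # square root of the population variance

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Range:", data_range)
print("Variance:", variance)
print("Standard deviation:", std_dev)
```

For larger datasets, libraries such as NumPy or pandas offer equivalent functions, but the standard library is enough to check the definitions by hand.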
Probability Distributions
These measures lead into the next step, because probability distributions are described in terms of the same quantities: a normal distribution, for example, is fully specified by its mean and standard deviation. Common types of distributions include the normal (Gaussian), binomial, and Poisson, to name a few. Python can take the statistics from earlier and help visualize the data in a new way. The normal distribution is the one most people are likely familiar with. A common example is the height of people, a natural phenomenon in which most values cluster around the average.
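The three distributions named above can be sketched with the standard library alone. The height parameters (mean 170 cm, standard deviation 10 cm) are assumptions made up for this example, not measured values, and binomial_pmf and poisson_pmf are hypothetical helper names written here from the textbook formulas.

```python
import math
from statistics import NormalDist

# Assumed parameters for illustration: heights with mean 170 cm, std dev 10 cm.
heights = NormalDist(mu=170, sigma=10)

# Probability that a height falls within one standard deviation of the mean.
within_one_sd = heights.cdf(180) - heights.cdf(160)


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Probability of exactly k events when the average count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


print(f"P(160 <= height <= 180) = {within_one_sd:.4f}")
print(f"P(5 heads in 10 fair flips) = {binomial_pmf(5, 10, 0.5):.4f}")
print(f"P(2 events | average of 3) = {poisson_pmf(2, 3):.4f}")
```

Under these assumptions, roughly 68% of heights fall within one standard deviation of the mean, which is the familiar rule of thumb for any normal distribution.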
