Descriptive Statistics

Descriptive statistics summarize the main features of a dataset. This section covers measures of central tendency, measures of dispersion, and basic plots, each with R code examples.

📊 Measures of Central Tendency

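Mean: The arithmetic average, i.e. the sum of the values divided by the number of values.
Median: The middle value when the data is ordered (the average of the two middle values when the count is even).
Mode: The most frequently occurring value. A dataset can have more than one mode.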

# Sample data
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)

# Mean
mean(data)

[1] 7.444444

# Median
median(data)

[1] 8

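# Mode (custom function; base R's mode() returns the storage type, not the statistical mode)
get_mode <- function(x) {
  uniqx <- unique(x)                            # distinct values, in order of first appearance
  uniqx[which.max(tabulate(match(x, uniqx)))]   # most frequent value; ties resolve to the first one seen
}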
get_mode(data)

[1] 8
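Because get_mode returns a single value, a tie between two or more values goes unnoticed. The following is a minimal sketch of one way to report every tied value; the name get_all_modes is hypothetical and not part of base R.

```r
# Hypothetical helper (not part of base R): return every value tied for the highest count.
get_all_modes <- function(x) {
  counts <- table(x)                                    # frequency of each distinct value
  modes <- names(counts)[as.vector(counts) == max(counts)]
  # table() keeps the values as character names, so convert back for numeric input.
  # Assumption: the values survive a round trip through character (true for the data here).
  if (is.numeric(x)) as.numeric(modes) else modes
}

data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)
single_mode <- get_all_modes(data)               # 8 is the only mode
tied_modes  <- get_all_modes(c(1, 1, 2, 2, 3))   # 1 and 2 both occur twice
single_mode
tied_modes
```

For the sample data this agrees with get_mode; the two differ only when several values share the top count.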

📏 Measures of Dispersion

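Range: The difference between the maximum and minimum values. Note that R's range() returns the minimum and maximum themselves.
Variance: A measure of the squared deviation from the mean. R's var() computes the sample variance, which divides the sum of squared deviations by n - 1 rather than n.
Standard Deviation: The square root of the variance, expressed in the same units as the data.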

# Range (range() returns the minimum and maximum, not their difference)
range(data)

[1]  5 10

diff(range(data))  # Range as a single number

[1] 5

# Variance
var(data)

[1] 2.527778

# Standard Deviation
sd(data)

[1] 1.589899
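To make the n - 1 denominator concrete, the variance can be recomputed by hand. This is a small sketch using the same sample data; the variable names (squared_dev, sample_var, population_var) are illustrative only.

```r
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)
n <- length(data)

# Squared deviation of each value from the mean
squared_dev <- (data - mean(data))^2

sample_var     <- sum(squared_dev) / (n - 1)  # what var(data) returns: 2.527778
population_var <- sum(squared_dev) / n        # divides by n instead: 2.246914

sample_var
population_var
sqrt(sample_var)                              # what sd(data) returns: 1.589899
```

The n - 1 version is the usual choice when the data is a sample from a larger population; dividing by n describes the spread of the observed values alone.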

📈 Data Visualization

Plots reveal the shape of a distribution, such as skew, gaps, and outliers, which summary numbers alone can hide.

# Histogram
hist(data, main = "Histogram of Data", col = "skyblue", border = "white")

# Boxplot
boxplot(data, main = "Boxplot of Data", col = "lightgreen")

# Bar plot for frequency
barplot(table(data), main = "Bar Plot of Frequencies", col = "lightcoral")
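The boxplot above is drawn from a five-number summary of the data. As a rough sketch of how to inspect those numbers directly, base R's fivenum() and boxplot.stats() can be called on the same vector; the variable names five and stats are illustrative.

```r
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(data)
five          # 5 6 8 8 10

# The values boxplot() uses: whisker ends, box edges, and median
stats <- boxplot.stats(data)
stats$stats   # 5 6 8 8 10
stats$out     # points drawn individually as outliers (none for this data)
```

Here the median and the upper hinge are both 8, which is why the median line in the boxplot sits on the top edge of the box.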

✅ Summary

Descriptive statistics give you a first understanding of your data. You can use these techniques and R code snippets to quickly explore any dataset before moving on to more complex analysis.

More examples coming soon!