Descriptive Statistics

Descriptive statistics help us summarize and understand the main features of a dataset. This section covers key concepts and includes R code examples.

📊 Measures of Central Tendency

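  • Mean: The arithmetic average, i.e. the sum of the values divided by their count.
  • Median: The middle value when the data is sorted (for an even number of values, the average of the two middle ones).
  • Mode: The most frequently occurring value.

# Sample data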
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)

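# Mean
mean(data)
#> [1] 7.444444

# Median
median(data)
#> [1] 8

# Mode: base R has no built-in for the statistical mode (mode() returns the
# storage type), so define a helper. On ties it returns the first value seen.
get_mode <- function(x) {
  uniqx <- unique(x)                           # distinct values, in order of appearance
  uniqx[which.max(tabulate(match(x, uniqx)))]  # value with the highest count
}
get_mode(data)
#> [1] 8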
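
The get_mode() function above reports a single value, so when several values tie for the highest count only the first one encountered is returned (which.max() picks the first maximum). As a minimal sketch of one alternative, the hypothetical helper get_modes() below returns every tied value. The name is made up for illustration, and the conversion back to numbers assumes numeric input.

```r
# Hypothetical helper (not part of base R): return every value that ties
# for the highest count. Assumes x is numeric.
get_modes <- function(x) {
  counts <- table(x)                           # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]  # labels of the most frequent values
  as.numeric(top)                              # table() stores values as character names
}

get_modes(c(5, 8, 9, 6, 8, 7, 10, 6, 8))  # a single mode: 8
get_modes(c(1, 1, 2, 2, 3))               # two modes: 1 and 2
```

Returning a vector keeps ties visible, at the cost that callers can no longer assume a length-one result.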

📏 Measures of Dispersion

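  • Range: The difference between the maximum and minimum values. R's range() returns the two endpoints rather than their difference.
  • Variance: The average squared deviation from the mean. R's var() computes the sample variance, which divides by n - 1 rather than n.
  • Standard Deviation: The square root of the variance, expressed in the same units as the data.

# Range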
range(data)
#> [1]  5 10
diff(range(data))  # Range as a single number
#> [1] 5

# Variance
var(data)
#> [1] 2.527778

# Standard Deviation
sd(data)
#> [1] 1.589899
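
To make the variance calculation concrete, the sketch below recomputes it by hand from the squared deviations. It relies on the fact that var() and sd() in base R use the sample formula with n - 1 in the denominator. The variable names (sq_dev, sample_var, pop_var) are chosen here for illustration only.

```r
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)
n <- length(data)
sq_dev <- (data - mean(data))^2       # squared deviations from the mean

sample_var <- sum(sq_dev) / (n - 1)   # sample variance: what var() computes
pop_var    <- sum(sq_dev) / n         # population variance: divide by n instead

all.equal(sample_var, var(data))       # TRUE
all.equal(sqrt(sample_var), sd(data))  # TRUE
```

The population version is always slightly smaller than the sample version, and the gap shrinks as n grows.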

📈 Data Visualization

Plots show the shape of a distribution (its center, spread, skew, and any outliers) more directly than summary numbers alone.

# Histogram
hist(data, main = "Histogram of Data", col = "skyblue", border = "white")

# Boxplot
boxplot(data, main = "Boxplot of Data", col = "lightgreen")

# Bar plot for frequency
barplot(table(data), main = "Bar Plot of Frequencies", col = "lightcoral")
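
A boxplot is drawn from a handful of summary numbers, and it can help to look at them directly. The sketch below uses boxplot.stats() from the grDevices package (loaded by default in R) and IQR() from stats. The variable name bs is only illustrative.

```r
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)

bs <- boxplot.stats(data)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points that would be drawn individually as outliers

IQR(data)  # interquartile range, computed from quantile()
```

With the default settings, points lying more than 1.5 times the box length beyond the box are treated as outliers. This sample has none, so the whiskers extend to the minimum and maximum.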


✅ Summary

Descriptive statistics give you a first understanding of your data. You can use these techniques and R code snippets to quickly explore any dataset before moving on to more complex analysis.
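
For a quick first look, base R's summary() gathers several of the measures above into one call. This is a minimal sketch on the same sample data, and the variable name s is only illustrative.

```r
data <- c(5, 8, 9, 6, 8, 7, 10, 6, 8)

s <- summary(data)  # minimum, quartiles, median, mean, and maximum
s
s[["Median"]]       # individual entries can be pulled out by name
```

summary() does not report the variance or standard deviation, so var() and sd() are still needed for dispersion.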

More examples coming soon!