GRADE 9 Mathematics DATA HANDLING AND PROBABILITY – DATA INTERPRETATION (GROUPED DATA) Notes
Mathematics — Data Handling & Probability
Subtopic: DATA INTERPRETATION (GROUPED DATA)
Objectives: By the end of these notes you should be able to:
- Understand class intervals, class boundaries and class marks.
- Calculate mean, median and mode from grouped data.
- Draw a simple histogram and ogive for grouped data.
What is grouped data?
Grouped data is data organized into classes (intervals). For example, test marks of many students are grouped into ranges like 0–9, 10–19, etc. We use frequencies (how many values fall into each class) to describe the data.
Key terms (simple)
- Class interval: the range, e.g. 30–39.
- Class boundary: exact edges between classes (e.g. 29.5–39.5 if classes are 30–39).
- Class mark (midpoint): middle value of a class, (lower + upper) / 2.
- Frequency (f): number of items in that class.
- Cumulative frequency (cf): running total of frequencies up to that class.
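- Class width (h): upper boundary − lower boundary (e.g. 39.5 − 29.5 = 10 for 30–39). This is the same as the gap between the lower limits of two neighbouring classes (40 − 30 = 10). Note that upper limit − lower limit gives 9, not 10.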
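The key terms can be checked with a short Python sketch. The function name `describe_class` and the `gap` parameter are made up for this illustration; `gap` is the distance between one class's upper limit and the next class's lower limit, assumed to be 1 for whole-number marks.

```python
def describe_class(lower, upper, gap=1):
    """Return (lower boundary, upper boundary, class mark, width) for a class.

    `gap` is an assumption of this sketch: the distance between one class's
    upper limit and the next class's lower limit (1 for whole-number marks).
    """
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    class_mark = (lower + upper) / 2            # midpoint of the class
    width = upper_boundary - lower_boundary     # measured between boundaries
    return lower_boundary, upper_boundary, class_mark, width


print(describe_class(30, 39))  # (29.5, 39.5, 34.5, 10.0)
```

Measuring the width between boundaries (not limits) is what gives 10 for the class 30–39.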
Example (grouped frequency table)
Marks of 40 students are grouped as follows:
| Class (marks) | Frequency f | Class mark m | f × m | Cumulative cf |
|---|---|---|---|---|
| 0–9 | 1 | 4.5 | 4.5 | 1 |
| 10–19 | 3 | 14.5 | 43.5 | 4 |
| 20–29 | 6 | 24.5 | 147.0 | 10 |
| 30–39 | 9 | 34.5 | 310.5 | 19 |
| 40–49 | 10 | 44.5 | 445.0 | 29 |
| 50–59 | 6 | 54.5 | 327.0 | 35 |
| 60–69 | 4 | 64.5 | 258.0 | 39 |
| 70–79 | 1 | 74.5 | 74.5 | 40 |
| Total | 40 | | 1610.0 | |
Notes: Class marks are midpoints, and f × m is used to find the mean. Total frequency N = 40 and Σ(f × m) = 1610.
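If you want to check the table by computer, the columns can be rebuilt from the classes and frequencies alone. This is only a sketch; the variable names are chosen for this example.

```python
# Classes and frequencies copied from the table above.
classes = [(0, 9), (10, 19), (20, 29), (30, 39),
           (40, 49), (50, 59), (60, 69), (70, 79)]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]

cf = 0          # running (cumulative) frequency
sum_fm = 0.0    # running total of f x m

print("class\tf\tm\tf*m\tcf")
for (lower, upper), f in zip(classes, freqs):
    m = (lower + upper) / 2     # class mark
    cf += f
    sum_fm += f * m
    print(f"{lower}-{upper}\t{f}\t{m}\t{f * m}\t{cf}")

N = cf
print("N =", N)
print("Sum of f*m =", sum_fm)
```

The last cumulative frequency must equal N, which is a quick way to catch a copying mistake.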
Mean (for grouped data)
Mean ≈ (Σ f × m) / N = 1610 ÷ 40 = 40.25
This is an estimate, because every value in a class is treated as if it equals the class mark.
Tip: For many classes use the assumed-mean method to make calculations easier (especially with large numbers).
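The sketch below compares the direct method with the assumed-mean method, using mean = A + Σ(f × d) / N where d = m − A. The function names are invented for this example, and A = 34.5 is an arbitrary choice; any class mark gives the same answer.

```python
marks = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5]   # class marks m
freqs = [1, 3, 6, 9, 10, 6, 4, 1]                         # frequencies f


def grouped_mean(marks, freqs):
    """Direct method: sum of f x m divided by N."""
    return sum(f * m for f, m in zip(freqs, marks)) / sum(freqs)


def assumed_mean_method(marks, freqs, a):
    """Assumed-mean method: a + sum(f x d) / N, with d = m - a."""
    total_deviation = sum(f * (m - a) for f, m in zip(freqs, marks))
    return a + total_deviation / sum(freqs)


print(grouped_mean(marks, freqs))                # 40.25
print(assumed_mean_method(marks, freqs, 34.5))   # 40.25
```

With A = 34.5 the deviations are small whole numbers (−30, −20, ..., 40), which is why the method is easier by hand.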
Median (using interpolation)
1. Find N/2: 40/2 = 20 → we want the 20th value.
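2. Find the class containing the 20th value: the cf column shows 19 up to the end of 30–39 and 29 up to the end of 40–49, so the 20th value lies in 40–49. This is the median class.
3. Use the formula (with class boundaries): median = L + ((N/2 − cf_prev) / f_m) × h, where L is the lower boundary of the median class, cf_prev is the cumulative frequency before it, f_m is its frequency and h is the class width.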
Here: L = 39.5 (lower boundary of 40–49), cf_prev = 19, f_m = 10, h = 10
median = 39.5 + ((20 − 19) / 10) × 10 = 39.5 + 1 = 40.5
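The same median steps can be written as a small Python function. This is a sketch that assumes all classes have the same width; the function name `grouped_median` is invented for this example.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median by interpolation: L + ((N/2 - cf_prev) / f_m) * h."""
    n = sum(freqs)
    target = n / 2
    cf_prev = 0                       # cumulative frequency before the class
    for L, f in zip(lower_boundaries, freqs):
        if f > 0 and cf_prev + f >= target:
            # This is the median class: interpolate inside it.
            return L + (target - cf_prev) / f * width
        cf_prev += f
    raise ValueError("no data")


lower_boundaries = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]

print(grouped_median(lower_boundaries, freqs, 10))  # 40.5
```

The loop does what you do by eye in the cf column: keep adding frequencies until the running total reaches N/2.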
Mode (most frequent class, interpolation)
Modal class is 40–49 (frequency 10). Use formula:
mode ≈ L + ((f_m − f_1) / (2f_m − f_1 − f_2)) × h
Here: L = 39.5 (lower boundary of 40–49), f_m = 10 (modal class), f_1 = 9 (previous class), f_2 = 6 (next class), h = 10
mode = 39.5 + ((10 − 9) / (20 − 9 − 6)) × 10 = 39.5 + (1/5) × 10 = 39.5 + 2 = 41.5
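Here is the mode formula as a Python sketch. Two choices in it are assumptions of this example and not part of the notes: if two classes tie for the largest frequency the first one is used, and a missing neighbour (when the modal class is first or last) is counted as frequency 0. The formula also breaks down if the denominator is 0.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mode by interpolation: L + ((fm - f1) / (2*fm - f1 - f2)) * h."""
    i = freqs.index(max(freqs))                        # modal class (first if tied)
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                  # previous class (assumed 0 if none)
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0     # next class (assumed 0 if none)
    return lower_boundaries[i] + (fm - f1) / (2 * fm - f1 - f2) * width


lower_boundaries = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]

print(grouped_mode(lower_boundaries, freqs, 10))  # 41.5
```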
Visuals: Histogram and Ogive (simple)
Figure: a simple histogram (bars show class frequencies) with an ogive (cumulative frequency curve) overlaid.
Interpretation: The tallest bar (40–49) contains the most students, so it is the modal class. The ogive helps you find percentiles visually (e.g. the 50th percentile ≈ median ≈ 40.5).
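Reading a value off the ogive can be imitated in Python by joining the plotted points with straight lines. This is a sketch: the function name `read_ogive` is invented, and starting the curve at cumulative frequency 0 at the lower boundary of the first class (−0.5) is a common convention assumed here.

```python
from itertools import accumulate

upper_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]

# Ogive points: (upper boundary, cumulative frequency), starting from 0.
points = [(-0.5, 0)] + list(zip(upper_boundaries, accumulate(freqs)))


def read_ogive(points, target_cf):
    """Find the mark where the straight-line ogive reaches target_cf."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target_cf <= c1:
            return x0 + (target_cf - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target_cf is outside the range of the ogive")


print(read_ogive(points, 20))  # 40.5, the 50th percentile (20 is 50% of 40)
```

For another percentile, change the target: for example, the 25th percentile uses 25% of 40, which is a cumulative frequency of 10.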
Quick steps
- Make a frequency table with class marks and cumulative frequencies.
- Mean: Σ(f × m) ÷ N.
- Median: locate median class using cf, then interpolate using boundaries.
- Mode: locate modal class (largest f), use mode formula to interpolate.
- Draw histogram: bars of equal width with no gaps between them, heights = frequencies.
- Draw ogive: plot cumulative frequency against the class upper boundaries (start from 0 at the lower boundary of the first class) and join the points.
Common mistakes & tips
- Don't forget class boundaries (e.g. 39.5) when interpolating for median and mode.
- For a histogram, use class widths correctly: if classes have different widths, draw each bar as wide as its class and use frequency density (frequency ÷ width) for the height.
- Always check total frequency N before calculating mean, median and percentiles.
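The frequency density tip can be tried out with a short sketch. The unequal classes below are made up for illustration by merging classes from the table (0–19 and 50–79); they are not part of the original data layout. The function name is also invented.

```python
def frequency_densities(groups):
    """groups: list of (lower boundary, upper boundary, frequency)."""
    densities = []
    for lower_b, upper_b, f in groups:
        width = upper_b - lower_b
        densities.append(f / width)     # bar height for unequal widths
    return densities


# Hypothetical regrouping of the same 40 marks into unequal classes.
groups = [(-0.5, 19.5, 4), (19.5, 29.5, 6), (29.5, 39.5, 9),
          (39.5, 49.5, 10), (49.5, 79.5, 11)]

total = sum(f for _, _, f in groups)
print("N =", total)  # N = 40

for (lower_b, upper_b, f), d in zip(groups, frequency_densities(groups)):
    print(f"{lower_b} to {upper_b}: f = {f}, density = {d:.2f}")
```

The class 50–79 holds 11 students, more than any other, but its density is low because the bar is three times as wide. This is why heights must use density when widths differ.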
Practice (try these)
- From the table above, find the percentage of students who scored 50 or more.
- Using an assumed mean of 44.5, recalculate the mean (show the steps).
- If one more student scored in 70–79, what happens to the mean and median? Explain.