DATA INTERPRETATION (GROUPED DATA) Notes, Quizzes & Revision
Mathematics: Data Handling & Probability
Subtopic: DATA INTERPRETATION (GROUPED DATA)
Objectives: By the end you should be able to:
- Understand class intervals, class boundaries and class marks.
- Calculate mean, median and mode from grouped data.
- Draw a simple histogram and ogive for grouped data.
What is grouped data?
Grouped data is data organized into classes (intervals). For example, the test marks of many students can be grouped into ranges like 0–9, 10–19, etc. We use frequencies (how many values fall into each class) to describe the data.
Key terms (simple)
- Class interval: the range, e.g. 30–39.
- Class boundary: exact edges between classes (e.g. 29.5–39.5 if classes are 30–39).
- Class mark (midpoint): middle value of a class, (lower + upper) / 2.
- Frequency (f): number of items in that class.
- Cumulative frequency (cf): running total of frequencies up to that class.
- Class width (h): upper boundary − lower boundary (e.g. 39.5 − 29.5 = 10 for 30–39).
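The key terms above can be checked with a short Python sketch. The helper name `describe_class` and the `gap` parameter are illustrative assumptions, not part of the notes; the sketch assumes whole-number marks, so neighbouring classes are 1 unit apart.

```python
# Hypothetical helper: describe one class from its stated limits.
# Assumes whole-number data, so the gap between classes is 1 (e.g. 29 to 30).
def describe_class(lower, upper, gap=1):
    lower_boundary = lower - gap / 2   # e.g. 30 - 0.5 = 29.5
    upper_boundary = upper + gap / 2   # e.g. 39 + 0.5 = 39.5
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "class_mark": (lower + upper) / 2,
        "width": upper_boundary - lower_boundary,
    }

info = describe_class(30, 39)
print(info["boundaries"])  # (29.5, 39.5)
print(info["class_mark"])  # 34.5
print(info["width"])       # 10.0
```

Measuring the width between boundaries (not between the stated limits) is what gives h = 10 for the class 30–39.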
Example (grouped frequency table)
Marks of 40 students are grouped as follows:
| Class (marks) | Frequency f | Class mark m | f × m | Cumulative frequency cf |
|---|---|---|---|---|
| 0–9 | 1 | 4.5 | 4.5 | 1 |
| 10–19 | 3 | 14.5 | 43.5 | 4 |
| 20–29 | 6 | 24.5 | 147.0 | 10 |
| 30–39 | 9 | 34.5 | 310.5 | 19 |
| 40–49 | 10 | 44.5 | 445.0 | 29 |
| 50–59 | 6 | 54.5 | 327.0 | 35 |
| 60–69 | 4 | 64.5 | 258.0 | 39 |
| 70–79 | 1 | 74.5 | 74.5 | 40 |
| Total | 40 | | 1610.0 | |
Notes: Class marks are midpoints; f × m is used to find the mean. Total frequency N = 40. Sum of f × m = 1610.
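As a cross-check, the columns of the table can be rebuilt from the class limits and frequencies alone. This is a minimal Python sketch; the variable names are illustrative choices.

```python
from itertools import accumulate

# Class limits and frequencies copied from the table above.
classes = [(0, 9), (10, 19), (20, 29), (30, 39),
           (40, 49), (50, 59), (60, 69), (70, 79)]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]

marks = [(lo + hi) / 2 for lo, hi in classes]       # class mark m
products = [f * m for f, m in zip(freqs, marks)]    # f x m
cum_freqs = list(accumulate(freqs))                 # running total cf

for (lo, hi), f, m, p, c in zip(classes, freqs, marks, products, cum_freqs):
    print(f"{lo}-{hi}\t{f}\t{m}\t{p}\t{c}")

N = sum(freqs)
total_fm = sum(products)
print("N =", N, "sum of f x m =", total_fm)
```

The last cumulative frequency should always equal N, which is a quick way to catch a copying error in the table.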
Mean (for grouped data)
Mean ≈ (Σ f × m) / N = 1610 ÷ 40 = 40.25
Tip: With many classes or large numbers, the assumed-mean method makes the arithmetic easier: pick a class mark A, work with deviations d = m − A, and use mean = A + Σ(f × d) / N.
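The sketch below compares the direct mean with the assumed-mean method on the table above. The assumed mean A = 34.5 is an arbitrary choice made for illustration (any class mark gives the same result), and the variable names are not from the notes.

```python
marks = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]
N = sum(freqs)

# Direct method: sum of f x m divided by N.
direct_mean = sum(f * m for f, m in zip(freqs, marks)) / N

# Assumed-mean method: deviations d = m - A keep the numbers small.
A = 34.5                                   # illustrative choice of assumed mean
deviations = [m - A for m in marks]        # -30, -20, -10, 0, 10, 20, 30, 40
sum_fd = sum(f * d for f, d in zip(freqs, deviations))
assumed_mean_result = A + sum_fd / N

print(direct_mean)           # 40.25
print(sum_fd)                # 230.0
print(assumed_mean_result)   # 40.25
```

Both routes give the same mean because Σ(f × d) = Σ(f × m) − A × N.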
Median (using interpolation)
1. Find N/2: 40/2 = 20, so we want the 20th value.
2. Find the class containing the 20th value: the cf column shows 19 for 30–39 and 29 for 40–49, so the median class is 40–49.
Use formula (with class boundaries): median = L + ((N/2 − cf_prev) / f_m) × h
Here: L = 39.5 (lower boundary of 40–49), cf_prev = 19, f_m = 10, h = 10
median = 39.5 + ((20 − 19) / 10) × 10 = 39.5 + 1 = 40.5
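The two steps and the interpolation formula can be sketched in Python as follows. The function name `grouped_median` is an illustrative assumption, and the sketch assumes all classes share the same width h.

```python
# Sketch of the median formula: L + ((N/2 - cf_prev) / f_m) * h.
# Assumes equal class widths and that lower_boundaries are in ascending order.
def grouped_median(lower_boundaries, freqs, width):
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cf_prev = 0  # cumulative frequency before the current class
    for L, f in zip(lower_boundaries, freqs):
        if cf_prev + f >= half:          # this is the median class
            return L + (half - cf_prev) * width / f
        cf_prev += f
    raise ValueError("median class not found")

lower_bounds = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]
median = grouped_median(lower_bounds, freqs, 10)
print(median)  # 40.5
```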
Mode (most frequent class, interpolation)
Modal class is 40–49 (frequency 10). Use formula:
mode ≈ L + ((fm − f1) / (2fm − f1 − f2)) × h
Here: L = 39.5, fm = 10, f1 (previous) = 9, f2 (next) = 6, h = 10
mode = 39.5 + ((10 − 9) / (20 − 9 − 6)) × 10 = 39.5 + (1/5) × 10 = 39.5 + 2 = 41.5
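A minimal Python sketch of the mode formula is given below. The function name `grouped_mode` is illustrative. Two assumptions go beyond the notes: a missing neighbour (when the modal class is first or last) is treated as frequency 0, and if several classes tie for the largest frequency the first one is used.

```python
# Sketch of the mode formula: L + ((fm - f1) / (2*fm - f1 - f2)) * h.
# Assumes equal class widths.
def grouped_mode(lower_boundaries, freqs, width):
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of modal class
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                   # previous class (assumed 0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0      # next class (assumed 0 if none)
    denom = 2 * fm - f1 - f2
    if denom == 0:
        raise ValueError("formula undefined: modal class equals both neighbours")
    return lower_boundaries[i] + (fm - f1) * width / denom

lower_bounds = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [1, 3, 6, 9, 10, 6, 4, 1]
mode = grouped_mode(lower_bounds, freqs, 10)
print(mode)  # 41.5
```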
Visuals: Histogram and Ogive (simple)
Figure: a simple histogram (bars show class frequencies) with an ogive (cumulative frequency curve) overlaid.
Interpretation: The tallest bar (40–49) contains the most students, so it is the modal class. The ogive helps find percentiles visually (e.g. the 50th percentile ≈ median ≈ 40.5).
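Reading a percentile off the ogive amounts to linear interpolation between the plotted points. The sketch below does this numerically; the function name `value_at_percentile` is an illustrative assumption, and the ogive is assumed to start at cumulative frequency 0 at the lowest class boundary (−0.5).

```python
# Ogive points: (class boundary, cumulative frequency), starting from cf = 0.
boundaries = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5]
cum_freqs = [0, 1, 4, 10, 19, 29, 35, 39, 40]

def value_at_percentile(boundaries, cum_freqs, p):
    """Interpolate along the ogive to find the mark at the p-th percentile."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    target = p / 100 * cum_freqs[-1]     # cumulative frequency to look up
    for k in range(1, len(boundaries)):
        if cum_freqs[k] >= target:
            x0, x1 = boundaries[k - 1], boundaries[k]
            c0, c1 = cum_freqs[k - 1], cum_freqs[k]
            if c1 == c0:                 # flat segment (empty class)
                return x0
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("percentile not found")

p50 = value_at_percentile(boundaries, cum_freqs, 50)
print(p50)  # 40.5, matching the median found by formula
```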
Quick steps
- Make a frequency table with class marks and cumulative frequencies.
- Mean: Σ(f × m) ÷ N.
- Median: locate median class using cf, then interpolate using boundaries.
- Mode: locate modal class (largest f), use mode formula to interpolate.
- Draw histogram: bars of equal width, heights = frequencies. Draw ogive: plot cumulative frequency at class upper boundaries and join points.
Common mistakes & tips
- Don't forget class boundaries (e.g. 39.5) when interpolating for median and mode.
- For histograms, handle class width correctly: if classes have different widths, draw each bar with width proportional to its class width and use frequency density (frequency ÷ width) for the height.
- Always check total frequency N before calculating mean, median and percentiles.
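To illustrate the frequency-density tip, the sketch below uses a hypothetical regrouping of the same 40 marks in which 0–9 and 10–19 are merged into 0–19, and 60–69 and 70–79 are merged into 60–79. This regrouping is an assumption made only for the example. It also checks the total frequency first, as the last tip recommends.

```python
# Hypothetical regrouping of the table into classes of unequal width.
boundaries = [(-0.5, 19.5), (19.5, 29.5), (29.5, 39.5),
              (39.5, 49.5), (49.5, 59.5), (59.5, 79.5)]
freqs = [4, 6, 9, 10, 6, 5]

# Check N before doing anything else.
N = sum(freqs)
if N != 40:
    raise ValueError("total frequency does not match the original table")

widths = [hi - lo for lo, hi in boundaries]
densities = [f / w for f, w in zip(freqs, widths)]   # bar heights

for (lo, hi), f, w, d in zip(boundaries, freqs, widths, densities):
    print(f"{lo} to {hi}: f = {f}, width = {w}, density = {d:.2f}")

# The area of each bar (density x width) recovers the class frequency.
total_area = sum(d * w for d, w in zip(densities, widths))
```

With frequency density as the height, the wide 0–19 bar is drawn at 0.2 and not 4, so it does not look more important than it is.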
Practice (try these)
- From the table above, find the percentage of students who scored 50 or more.
- Using an assumed mean of 44.5, recalculate the mean (show the steps).
- If one more student scored in 70–79, what happens to the mean and median? Explain.