Grouping of Data

In this lesson we will learn about the grouping of data.

Grouped data are data that have been organized into groups known as classes.

In other words, grouped data have been classified, and each class is a group of data values.

Consider the marks obtained by 15 students in a history test, as given below:

23, 45, 32, 34, 42, 78, 65, 55, 58, 75, 36, 56, 85, 73, 60.

The data in this form is called the raw data.

When the data are in this form, it takes some time to search for the highest and the lowest marks.

If we arrange the data in ascending or descending order, less time is needed.

So let us arrange the marks in ascending order:

23, 32, 34, 36, 42, 45, 55, 56, 58, 60, 65, 73, 75, 78, 85.

Now we can easily see that the lowest mark is 23 and the highest mark is 85.

Range

The difference between the highest and the lowest values of the given data is called the range of the data.

So, the range of the above data is 85 – 23 = 62.
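
The same steps can be sketched in a few lines of Python. This is only an illustration; the variable names (marks, ordered, data_range) are chosen here and are not part of the lesson.

```python
# Marks of the 15 students from the history test above.
marks = [23, 45, 32, 34, 42, 78, 65, 55, 58, 75, 36, 56, 85, 73, 60]

ordered = sorted(marks)        # ascending order
lowest = min(marks)            # lowest mark
highest = max(marks)           # highest mark
data_range = highest - lowest  # range = highest value - lowest value

print(ordered)
print(lowest, highest, data_range)  # 23 85 62
```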

When the number of observations is large, presenting the data in ascending or descending order can be quite time-consuming.

For example, the following marks (out of 50) were obtained by 30 students of class VI in an English test:

25, 8, 28, 30, 25, 13, 8, 30, 13, 19, 42, 39, 42, 45, 32, 28, 13, 39, 26, 32, 27, 13, 13, 26, 39, 28, 42, 42, 8, 39.

Arranged in ascending order, the marks are:

8, 8, 8, 13, 13, 13, 13, 13, 19, 25, 25, 26, 26, 27, 28, 28, 28, 30, 30, 32, 32, 39, 39, 39, 39, 42, 42, 42, 42, 45.

The number of times a value occurs is called its frequency. In the above data we see that 3 students got 8 marks, so the frequency of 8 marks is 3. Similarly, 5 students got 13 marks, so the frequency of 13 marks is 5.

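To understand the data easily, we write them in a frequency table, as given below.

              Marks     Number of students (frequency)
              8         3
              13        5
              19        1
              25        2
              26        2
              27        1
              28        3
              30        2
              32        2
              39        4
              42        4
              45        1
              Total     30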
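
As a rough sketch, the frequency of each mark can also be counted with Python's standard collections.Counter. The names marks and frequency are illustrative choices, not something the lesson prescribes.

```python
from collections import Counter

# Marks (out of 50) of the 30 students from the English test above.
marks = [25, 8, 28, 30, 25, 13, 8, 30, 13, 19, 42, 39, 42, 45, 32,
         28, 13, 39, 26, 32, 27, 13, 13, 26, 39, 28, 42, 42, 8, 39]

# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

# Print one row per distinct mark, in ascending order of marks.
for mark in sorted(frequency):
    print(f"{mark:>5} {frequency[mark]:>5}")
```

Each printed row pairs a mark with the number of students who obtained it.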

Example: The weights (in kg) of 50 students are given below. Construct a frequency distribution table.

              50        24      19      35       34
              32        65      31      26       46
              41        23      44      37       34
              55        69      66      42       58
              64        69      65      48       52
              43        50      67      33       30
              40        69      55      36       67
              62        28      28      54       61
              52        49      67      58       62
              55        46      68      68       61

In the above data the number of observations is large, so we first arrange the data in ascending order.

              19       23       24       26      28
              28       30       31       32      33
              34       34       35       36      37
              40       41       42       43      44
              46       46       48       49      50
              50       52       52       54      55
              55       55       58       58      61
              61       62       62       64      65
              65       66       67       67      67
              68       68       69       69      69           

To present such a large amount of data, we condense it into groups such as 10-20, 20-30, …, 60-70 (since our data range from 19 to 69).

The groups 10-20, 20-30, 30-40, 40-50, 50-60 and 60-70 are called classes or class intervals.

In the class 10-20, the number 10 (the least number) is called the lower class limit and 20 (the greatest number) is called the upper class limit.

In any class interval the difference between the upper limit and the lower limit is called the class size or class width.

Thus, the class size of class 10-20 is 10.

e.g., in 10-20, 10 is the lower class limit and 20 is the upper class limit.

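The above data can be represented in tabular form as follows. Here a weight equal to an upper class limit, such as 30 kg, is counted in the next class (30-40).

              Weight (in kg)     Number of students (frequency)
              10-20              1
              20-30              5
              30-40              9
              40-50              9
              50-60              10
              60-70              16
              Total              50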

The mid value of a class is called its class mark.

We get the class mark of a class by adding its upper and lower class limits and dividing the sum by 2.

Thus, the class mark of 10-20 is (10 + 20)/2 = 15.

The class mark of 20-30 is (20 + 30)/2 = 25.
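
The two definitions, class size and class mark, can be written as small helper functions. This is a minimal sketch; the function names class_size and class_mark are hypothetical and introduced only for illustration.

```python
def class_size(lower, upper):
    """Class size (class width): upper limit minus lower limit."""
    return upper - lower


def class_mark(lower, upper):
    """Class mark: the mid value of the class."""
    return (lower + upper) / 2


# The class intervals used for the weights of the 50 students.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]

for lower, upper in classes:
    print(f"{lower}-{upper}: size {class_size(lower, upper)}, "
          f"mark {class_mark(lower, upper)}")
```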

Presenting data in this form simplifies and condenses them, and enables us to observe some important features at a glance.

This is called a grouped frequency distribution table.
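
One possible way to tally such a table for the 50 weights is sketched below. It assumes that each class includes its lower limit but not its upper limit, so that a weight of 30 kg is counted in 30-40; this is an assumption made for the sketch, and the names weights, classes and grouped are illustrative.

```python
# Weights (in kg) of the 50 students from the example above.
weights = [50, 24, 19, 35, 34, 32, 65, 31, 26, 46,
           41, 23, 44, 37, 34, 55, 69, 66, 42, 58,
           64, 69, 65, 48, 52, 43, 50, 67, 33, 30,
           40, 69, 55, 36, 67, 62, 28, 28, 54, 61,
           52, 49, 67, 58, 62, 55, 46, 68, 68, 61]

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]

# Assumption: lower limit included, upper limit excluded (lower <= w < upper).
grouped = {c: 0 for c in classes}
for w in weights:
    for lower, upper in classes:
        if lower <= w < upper:
            grouped[(lower, upper)] += 1
            break

for (lower, upper), count in grouped.items():
    print(f"{lower}-{upper}: {count}")
```

Under this assumption the counts are 1, 5, 9, 9, 10 and 16, which add up to 50.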

Here, for example, we can easily observe that 16 students weigh between 60 kg and 70 kg.

We also observe that the classes in the above table are non-overlapping.

We could also have made more classes of smaller size, or fewer classes of larger size.

For example, the intervals could have been 21 – 23, 24 – 26, and so on.

There is no hard and fast rule about the choice of classes, except that the classes should not overlap.

Now suppose the classes are 21 – 25, 26 – 30, 31 – 35, and so on, and two new students with weights 25.5 kg and 30.5 kg are admitted to the class. It is not clear in which class intervals we should include them.

We cannot place 25.5 kg in either 21 – 25 or 26 – 30 (nor 30.5 kg in either 26 – 30 or 31 – 35), because there are gaps between the upper and lower limits of consecutive classes.

Therefore, we need to adjust the class intervals so that the upper limit of each class is the same as the lower limit of the next class.

For this, first we find the difference between the upper limit of a class and the lower limit of its succeeding class, then we add half of this difference to each of the upper limits and subtract the same from each of the lower limits.

Example: Consider the classes 21-25 and 26-30.

The lower limit of 26-30 is 26
The upper limit of 21-25 is 25.

First we find the difference between the upper limit of a class and the lower limit of its succeeding class,

The difference = 26 – 25 = 1

So, half of the difference = 1/2 = 0.5 

Now, add half of this difference to each of the upper limits and subtract the same from each of the lower limits. So the new class formed from 21-25 will be

(21 - 0.5) to (25 + 0.5), i.e., 20.5 – 25.5.

Similarly, the new class formed from the class 26-30 will be (26 - 0.5) to (30 + 0.5), i.e., 25.5 – 30.5.

Continuing in the same manner, the classes are:

20.5 – 25.5, 25.5 – 30.5, 30.5 – 35.5, 35.5 – 40.5, 40.5 – 45.5, 45.5 – 50.5, 50.5 – 55.5.
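
The adjustment described above can be sketched as a short function. The name to_continuous is hypothetical, and the sketch assumes that the gap between consecutive classes is the same throughout, as it is for 21-25, 26-30, and so on.

```python
def to_continuous(classes):
    """Close the gaps between consecutive classes.

    Assumes every pair of consecutive classes has the same gap.
    """
    # Difference between the lower limit of the second class
    # and the upper limit of the first class.
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in classes]


classes = [(21, 25), (26, 30), (31, 35), (36, 40),
           (41, 45), (46, 50), (51, 55)]

continuous = to_continuous(classes)
for lower, upper in continuous:
    print(f"{lower} - {upper}")
```

After the adjustment, the upper limit of each class equals the lower limit of the class that follows it.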

Now we can include the weights of the new students in these classes. But 25.5 appears in both the classes 20.5 – 25.5 and 25.5 – 30.5, so in which class should this weight be counted?

If it is counted in both classes, it will be counted twice.

By convention, we consider the weight 25.5 in class 25.5 – 30.5, not in 20.5 – 25.5.

Similarly, 30.5 is considered in class 30.5 – 35.5, not in 25.5 – 30.5.

In the same way, a weight of 35.5 kg would be included in 35.5 – 40.5, and a weight of 40.5 kg in 40.5 – 45.5.

Likewise, weights of 45.5 kg and 50.5 kg would be included in 45.5 – 50.5 and 50.5 – 55.5, respectively.
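
The convention can be expressed as a simple membership test: a value belongs to a class when it is at least the lower limit and strictly less than the upper limit. The helper find_class below is a hypothetical name used only for this sketch.

```python
def find_class(value, classes):
    """Return the class containing value, or None if no class contains it.

    Convention: the lower limit is included, the upper limit is excluded.
    """
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None


classes = [(20.5, 25.5), (25.5, 30.5), (30.5, 35.5), (35.5, 40.5),
           (40.5, 45.5), (45.5, 50.5), (50.5, 55.5)]

# Weights that fall exactly on a limit shared by two classes.
for weight in (25.5, 30.5, 35.5, 40.5, 45.5, 50.5):
    print(weight, "->", find_class(weight, classes))
```

With this test each shared limit is counted exactly once, in the class where it is the lower limit.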

Now, with these assumptions, the new frequency distribution table will have the following classes:

20.5 – 25.5 
25.5 – 30.5
30.5 – 35.5 
35.5 – 40.5 
40.5 – 45.5 
45.5 – 50.5 
50.5 – 55.5
