Objectives of Classification

  1. To condense the data for easy understanding

  2. To help comparison

  3. To eliminate unnecessary details

  4. To make decision making possible

  5. To enable further statistical treatments

  6. To identify main features of the data

i. Chronological Classification

It is the arrangement of data in ascending or descending order with reference to time.

Chronological Classification
Year Population
2009 567
2010 638
2011 736
2012 758

ii. Geographical Classification

It is the arrangement of data with reference to geographical (spatial) location, such as countries or states.

Geographical Classification
States Production of Rice
Andhra Pradesh 1200
Tamil Nadu 950
Kerala 830

iii. Qualitative Classification

It is the arrangement of data on the basis of qualities, values or attributes such as colour, sex, religion or literacy.

Qualitative Classification
States Literacy
Kerala 99.5%
Karnataka 95.6%
Bihar 68%

iv. Quantitative Classification

It is the classification of data on the basis of some quantitative measurement such as weight, price or cost.

Quantitative Classification
Companies Sales
Hyundai 800
Tata 638
Maruti 736

Quantitative Data:

Data that can be measured numerically, e.g. income, production, price, cost.

Qualitative Data:

Data that cannot be measured numerically, e.g. health, intelligence, ability. Such characteristics are also termed attributes.

Variable

A variable is a characteristic whose value can change from unit to unit. Depending on the way they vary, variables are broadly classified into two types:

  1. CONTINUOUS
  2. DISCRETE

Individual Series (Simple Array)

Each value of the variable is listed separately and usually occurs only once. The values can be arranged in either ascending or descending order.

Individual Series
Worker No.    Wage (Rs)
1             500
2             600
3             550
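
As a small illustration, the wages from the table above can be arranged as a simple array in Python. The variable names are illustrative assumptions, not part of the original notes.

```python
# Wages (Rs) of the three workers from the Individual Series table.
wages = [500, 600, 550]

# An individual series lists every observation once, in sorted order.
ascending = sorted(wages)
descending = sorted(wages, reverse=True)

print(ascending)   # [500, 550, 600]
print(descending)  # [600, 550, 500]
```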

Discrete Series (Frequency Array)

Certain items occur many times in the data, so each distinct value is listed together with its frequency. The values can be arranged in either ascending or descending order.

Discrete Series
Number of Children per Couple    Number of Couples (Frequency)
0                                21
1                                19
2                                10
Total                            50
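
A discrete series can be built by counting how often each distinct value occurs. The sketch below uses Python's `collections.Counter` on a small hypothetical list of raw observations; the list `children` is invented for illustration and is not the data behind the table above.

```python
from collections import Counter

# Hypothetical raw data: number of children reported by 10 couples.
children = [0, 1, 0, 2, 1, 0, 1, 0, 2, 0]

# Counter records how many times each distinct value occurs.
frequency = Counter(children)

# Print the frequency array in ascending order of the variable.
for value in sorted(frequency):
    print(value, frequency[value])
print("Total", sum(frequency.values()))
```

For this hypothetical list the value 0 occurs 5 times, 1 occurs 3 times and 2 occurs twice, giving a total of 10 couples.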

Continuous Series

The values of the variable are grouped into continuous classes, and the frequency of each class is stated.

Continuous Series
Marks (Class)    Number of Students (Frequency)
0 - 10           5
10 - 20          10
20 - 30          17
30 - 40          13
40 - 50          5
Total            50

Frequency Distribution

An orderly arrangement of data classified according to the magnitude of observations in different classes along with their corresponding class frequencies is known as frequency distribution. Frequency means the number of times a value or item occurs.

Construction of Frequency Distribution

Selection of Class

  • There is no hard and fast rule for determining the number of classes.
  • A class should be neither too wide nor too narrow.
  • There should be neither too many classes nor too few.

Example: 0 - 10, 10 - 20, 20 - 30, etc.

Class Limit

  • The class limits are the lowest and the highest values that can be included in the class.
  • They are the two ends of a class.
  • In the class 20 – 30, 20 is called the lower class limit and 30 is called the upper class limit.

Class Interval

  • It is the difference between the upper and lower class limits.
  • Class interval is also known as class width or class size.
  • The class interval of the class 50 – 100 is 50 (100 – 50 = 50).

Class Midpoint

  • It is the middle value of a class. It is also known as the mid value or class mark.
  • It lies halfway between the lower and upper class limits of a class. For example, the midpoint of the class 20 – 30 is (20 + 30) / 2 = 25.

Magnitude of Class Interval

  • The difference between the lower and upper class boundaries is called the magnitude of a class interval.

Class Frequency

  • The number of observations corresponding to a particular class is known as the class frequency.
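
The class interval and class midpoint defined above can be sketched as two small Python functions. The function names `class_interval` and `class_midpoint` are assumptions made for this illustration.

```python
def class_interval(lower, upper):
    """Width of a class: upper limit minus lower limit."""
    return upper - lower


def class_midpoint(lower, upper):
    """Mid value (class mark): halfway between the two limits."""
    return (lower + upper) / 2


print(class_interval(50, 100))  # 50
print(class_midpoint(20, 30))   # 25.0
```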

How to Find the Frequencies of a Distribution?

We have seen that frequency means the number of times a value or item occurs, so to get the frequency we count how many times each value of the variable is repeated in the data. If the data set is large, simple counting invites errors. To avoid them we use the method of tally marks. Tally marks are vertical bars (/) used for counting.

    Let us create a frequency distribution for the following data:

    70, 54, 35, 45, 45, 73, 56, 46, 3, 42, 43, 43, 43, 36, 47, 23, 57, 45, 25, 43, 55, 21, 65, 78, 39, 28, 42, 21, 27, 70, 23, 85, 41, 71, 24, 43, 17, 26, 56, 39, 87, 43, 8, 38, 12, 71, 68, 28, 47, 23, 67, 60, 34, 59, 2, 77, 91, 56, 28, 43, 40, 21, 80, 56, 55, 51, 34, 58, 28, 28, 54, 34, 68, 30, 45, 24, 32, 34, 21, 54, 7, 16, 49, 32, 26, 21, 5, 26, 29, 37, 34, 21, 29, 71, 35, 8, 34, 20, 21, 80.

    Using tally marks, we can create a frequency distribution. First draw a table with three columns: the first for the class, the second for the tally marks and the third for the frequency. Fill the first column with the classes (here 0 - 10, 10 - 20, ..., 90 - 100).

    Now look at the data. The first entry is 70, which falls in the class 70 - 80. Strike off the entry 70 in the data and put a tally mark in the second column against the class 70 - 80. The second entry is 54, which falls in the class 50 - 60. Strike off the entry 54 and put a tally mark against the class 50 - 60. Repeat this process until every entry in the data has been struck off.

    One more thing to notice: after placing four tally marks vertically, the fifth mark is drawn across the first four, so that they form a block of five. The sixth mark is again drawn vertically, leaving some space after the first block. The table below was completed by this process.

    Frequency Distribution with Tally Marks
    (///// stands for a crossed block of five tally marks)
    Class       Tally Marks                      Frequency
    0 - 10      ///// /                          6
    10 - 20     ///                              3
    20 - 30     ///// ///// ///// ///// /////    25
    30 - 40     ///// ///// ///// /              16
    40 - 50     ///// ///// ///// ////           19
    50 - 60     ///// ///// ///                  13
    60 - 70     /////                            5
    70 - 80     ///// ///                        8
    80 - 90     ////                             4
    90 - 100    /                                1
    Total                                        100
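
The tallying procedure can also be sketched in Python as a check on the manual count. This is a minimal sketch that assumes equal classes of width 10 starting at 0, as in the table; the variable names are illustrative.

```python
# The 100 observations listed in the text.
data = [
    70, 54, 35, 45, 45, 73, 56, 46, 3, 42,
    43, 43, 43, 36, 47, 23, 57, 45, 25, 43,
    55, 21, 65, 78, 39, 28, 42, 21, 27, 70,
    23, 85, 41, 71, 24, 43, 17, 26, 56, 39,
    87, 43, 8, 38, 12, 71, 68, 28, 47, 23,
    67, 60, 34, 59, 2, 77, 91, 56, 28, 43,
    40, 21, 80, 56, 55, 51, 34, 58, 28, 28,
    54, 34, 68, 30, 45, 24, 32, 34, 21, 54,
    7, 16, 49, 32, 26, 21, 5, 26, 29, 37,
    34, 21, 29, 71, 35, 8, 34, 20, 21, 80,
]

width = 10
# One counter per class: 0 - 10, 10 - 20, ..., 90 - 100.
frequencies = [0] * 10

for value in data:
    # Integer division picks the class. A value equal to a lower limit
    # (for example 70) is counted in the class that starts there (70 - 80).
    frequencies[value // width] += 1

for i, count in enumerate(frequencies):
    print(f"{i * width} - {(i + 1) * width}: {count}")
print("Total", sum(frequencies))
```

The counts produced this way agree with the Frequency column of the tally table: 6, 3, 25, 16, 19, 13, 5, 8, 4 and 1, with a total of 100.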

    Exclusive Method

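    Under this method, the classes are fixed so that the upper limit of one class is the lower limit of the next class. An observation equal to the upper limit is counted not in that class but in the next one; for example, 10 belongs to 10 - 20, not to 0 - 10.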

    Exclusive Classes
    Marks (Class)
    0 - 10
    10 - 20
    20 - 30

    Inclusive Method

    Under this method, the classes are fixed so that the upper limit of a class is included in the class itself; for example, 9 belongs to 0 - 9.

    Inclusive Classes
    Marks (Class)
    0 - 9
    10 - 19
    20 - 29

    How to Convert Inclusive Classes into Exclusive Classes?

    Find the difference between the upper limit of a class and the lower limit of the next class. Find half the difference. Subtract this number from all the lower limits and add this number to all the upper limits.

    Let us convert the inclusive classes given below into exclusive classes.

    Inclusive Classes
    Marks (Class)
    0 - 9
    10 - 19
    20 - 29

    Given classes: 0 – 9, 10 – 19, 20 – 29

    Difference between the upper limit of a class and the lower limit of the next class = 10 – 9 = 1

    Half the difference = \( \frac{1}{2} \) = 0.5

    Subtracting 0.5 from every lower limit and adding 0.5 to every upper limit gives the exclusive classes below.

    Exclusive Classes
    Marks (Class)
    -0.5 - 9.5
    9.5 - 19.5
    19.5 - 29.5
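
The conversion rule can be sketched as a short Python function. The name `inclusive_to_exclusive` is hypothetical, and the sketch assumes at least two classes with the same gap between every pair of neighbouring classes.

```python
def inclusive_to_exclusive(classes):
    """Convert inclusive (lower, upper) classes to exclusive ones."""
    # Gap between the upper limit of the first class and the
    # lower limit of the next class (assumed equal for all classes).
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Lower limits move down by half the gap, upper limits move up.
    return [(lower - half, upper + half) for lower, upper in classes]


print(inclusive_to_exclusive([(0, 9), (10, 19), (20, 29)]))
# [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
```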

    Cumulative Series

    In a cumulative series the frequencies are progressively totalled and aggregates are shown.

    Cumulative Series
    Marks (Class)     Number of Students (Cumulative Frequency)
    Marks below 10    12
    Marks below 20    18
    Marks below 30    24
    Marks below 40    30
    Marks below 50    36

    The cumulation may be upward or downward.
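
As an illustration, both directions of cumulation can be computed with `itertools.accumulate`. The sketch uses the class frequencies from the Continuous Series table earlier in these notes and assumes that upward and downward cumulation correspond to the usual "less than" and "more than" forms.

```python
from itertools import accumulate

# Class frequencies for marks 0 - 10, 10 - 20, ..., 40 - 50
# (taken from the Continuous Series table).
frequencies = [5, 10, 17, 13, 5]

# "Less than" cumulation: running total starting from the first class.
less_than = list(accumulate(frequencies))

# "More than" cumulation: running total starting from the last class,
# then reversed back into class order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)  # [5, 15, 32, 45, 50]
print(more_than)  # [50, 45, 35, 18, 5]
```

Read this way, 32 students scored below 30 marks, and 35 students scored 20 marks or more.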

    Open end Class

    If the lower limit of the first class or the upper limit of the last class is not given, the series is called an open-end class series.

    Open end Class
    Marks (Class)     Number of Students (Frequency)
    Marks below 10    4
    10 - 20           6
    20 - 30           6
    30 - 40           9
    40 and above      5

    Unequal Class

    If the class intervals in a distribution are not all equal, it is called an unequal class distribution.

    Unequal Class
    Marks (Class)    Number of Students (Frequency)
    0 - 10           4
    10 - 20          6
    20 - 25          6
    25 - 30          9
    30 - 40          5

    Univariate Distribution

    A frequency distribution of a single variable is called univariate.

    Univariate Distribution
    Marks       Number of Students
    40 - 50     5
    50 - 60     8
    60 - 70     15
    70 - 80     20
    80 - 90     7
    90 - 100    2

    Bivariate Distribution

    A frequency distribution of two variables is called bivariate. In the table below, the rows are cost classes and the columns are sales classes.

    Bivariate Distribution
    Cost \ Sales    100 - 200    200 - 300    300 - 400    400 - 500
    40 - 50         5            3            2            1
    50 - 60         8            4            3            1
    60 - 70         8            3            1            1
    70 - 80         6            1            2            1
    80 - 90         4            1            1            2
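
A bivariate frequency table is obtained by classifying each observation on both variables at once and counting the pairs. The sketch below does this for a few hypothetical (sales, cost) pairs; the data and the helper `class_of` are invented for illustration and are not the observations behind the table above.

```python
from collections import Counter

# Hypothetical paired observations (sales, cost) for six firms.
pairs = [(150, 45), (220, 55), (180, 48), (310, 62), (250, 58), (120, 41)]


def class_of(value, start, width):
    """Return the exclusive class (lower, upper) that contains value."""
    lower = start + ((value - start) // width) * width
    return (lower, lower + width)


# Count the firms falling in each (sales class, cost class) cell.
table = Counter(
    (class_of(sales, 100, 100), class_of(cost, 40, 10))
    for sales, cost in pairs
)

for (sales_class, cost_class), count in sorted(table.items()):
    print(sales_class, cost_class, count)
```

For these six hypothetical pairs, three firms fall in the cell with sales 100 - 200 and cost 40 - 50, two in the cell with sales 200 - 300 and cost 50 - 60, and one in the cell with sales 300 - 400 and cost 60 - 70.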