Examples of Primary Sources of Data
  • Letters
  • Manuscripts
  • Diaries
  • Journals
  • Maps*
  • Video footage*
  • Oral histories
  • Speeches
  • Interviews
  • Newspapers*
  • Research data
  • Audio recordings
  • Photographs*
  • Objects or artifacts

*Usually, but not always, a primary source.

Difference Between Primary and Secondary Data

PRIMARY DATA                                    | SECONDARY DATA
Original data                                   | Second-hand data
Expensive                                       | Less expensive
Time consuming                                  | Less time consuming
Raw item                                        | Finished material
More care needed at the time of data collection | More care needed while using

Personal Interview

In a personal interview, the investigator himself goes to the place of the informants and collects the information by asking questions or through observation. The questions asked may be unstructured, or they may come from structured questionnaires or schedules.

Structured Questionnaires and Schedules:

After studying the problem of investigation, the investigator prepares a list of questions that he feels are relevant and useful. This is called a structured questionnaire. Appropriate spaces are given in the questionnaire for recording the answers of the informants; this sheet is called a schedule. The printed form of the questionnaire is given to the respondent, who is asked to record his answers himself.

Unstructured Questions:

Suppose an investigator is collecting data from HIV+ people. Since an informant may be reluctant to reveal his condition or the facts that led to it, he may not give proper answers to the questions that the investigator has already framed. In such circumstances, it is better to establish a rapport with the informant by asking him questions appropriate to the situation. Answers to some of these questions may provide relevant data to the investigator. Questions of this type are called unstructured questions.

Observation Method:

Suppose an investigator is studying the living conditions of certain tribal communities. Data for this study cannot be collected by asking them questions, structured or unstructured. Sometimes a long period of stay is essential to build an adequate and suitable rapport with them. The data are collected through observation. The observation method is useful in anthropological studies.

Merits of Personal Interview

  • Original data are collected
  • Likely to be more reliable and accurate
  • Response will be more encouraging
  • Uniformity and homogeneity
  • Possible to collect supplementary information
  • Misinterpretation can be avoided

Disadvantages of Personal Interview

  • Most expensive method
  • More Time consuming
  • Possibility of influencing the respondents
  • Chances of personal bias

Indirect Oral Investigation

Indirect Oral Investigation is the method of collecting data through indirect sources. Persons who are likely to have information about the problem are questioned, and factual data are compiled on the basis of their answers.

Merits of Indirect Oral Investigation

  • Less time consuming
  • Less Expensive
  • Less effort
  • Confidential information can be collected.
  • Information is likely to be unbiased and reliable.
  • This method is relatively simple to understand.

Demerits of Indirect Oral Investigation

  • The degree of accuracy of information is less.
  • This method may lead to doubtful conclusions.
  • Carelessness of the witness is likely to happen.

Telephonic Interview

In this method, the investigator collects data from the informants through telephone conversation. Earlier, this method was not widely applicable in India, since most people had no access to a telephone. But now, thanks to the giant leap of information technology, the number of people with access to a telephone is high. Hence, primary data collection through telephone interviews has become more practicable.

Merits of Telephonic Interview

  • Less time consuming
  • Less Expensive
  • Less effort
  • Can cover a wide geographic area
  • Easy to conduct
  • More personal in nature
  • Fast data collection

Limitations of Telephonic Interview

  • Questions cannot be of a complex nature.
  • Respondents may refuse to participate.
  • The interviewer cannot see body language.
  • Respondents may provide incorrect data.

Mail Questionnaire

In this method, the prepared questionnaire is sent to the informants. A self-addressed envelope and a covering letter requesting them to furnish the necessary information are also sent along with the questionnaire. This method is useful in cases where the informants are spread over a wide area.

Merits of Mail Questionnaire

  • Useful where the field of investigation is vast.
  • Needs less effort.
  • Less Expensive.
  • It is free from the bias of the interviewer.
  • Respondents have adequate time to give answers.
  • Results can be made more dependable and reliable.
  • Respondents, who are not easily approachable, can also be reached conveniently.

Demerits of Mail Questionnaire

  • It cannot be used for illiterate or uneducated respondents.
  • The rate of non-response is high in comparison with other methods.
  • If there is any confusion in the questionnaire, it cannot be cleared up.
  • The control over questionnaire may be lost once it is sent.
  • It is difficult to verify the accuracy of the answers given.
  • There is no scope for asking supplementary questions.
  • Filled in questionnaire may be incomplete as well as inaccurate.

Construction of a Questionnaire

The success or failure of an investigation depends on its questionnaire, so the questionnaire should be designed scientifically and carefully so that the data it yields are reliable and accurate. The construction or drafting of a questionnaire is an art: only an expert with sound intelligence and rich experience can design a meaningful one.

The questionnaire should contain two parts. The first part states the aims and objectives of the investigation and includes a request seeking the help and co-operation of the informants. If the information to be furnished is sensitive, the first part should also give an assurance that it will be kept confidential. The first part should also provide the necessary instructions, such as the time within which, and the place to which, the completed questionnaire is to be returned. The second part contains the questions.

Major Problems:

  1. Selection of Type of Questions
  2. Order of Questions
  3. Question wording and form of response

A good questionnaire should have the following qualities:

  • The questions should be clear, simple and easy to understand.
  • The questionnaire should be brief.
  • The questions should not use double negatives.
  • The questions should not be leading.
  • The questions should be arranged in a logical order.
  • Personal questions should be avoided.
  • The questionnaire should look attractive.
  • The questionnaire may consist of closed-ended questions.
  • The questionnaire should be pre-tested.

Pilot Survey

When the questionnaire is ready, it is always advisable to try out the questionnaire or schedule on a limited number of informants before the actual survey is undertaken. This is known as a pilot survey or pre-testing of the questionnaire. It brings out problems which can be attended to before the large-scale survey begins.

Pilot survey also helps in assessing the suitability of questions, clarity of instructions, performance of enumerators and cost and time involved in actual survey.

Collection of Secondary Data

Secondary data are those data which can be obtained from published or unpublished sources or records. Once a decision is taken to collect secondary data, the question of sources of data arises. There are two sources for the collection of secondary data, namely, published sources and unpublished sources.

Published Sources

  • Reports and publications of central and state governments
  • Official publications of international bodies
  • Financial and economic bodies
  • Publications of research scholars
  • Annual reports of firms and companies
  • Reports of committees

Unpublished Sources

Data maintained by

  • Government departments
  • Private offices
  • Studies made by research institutions
  • Scholars
  • Individuals

Population, Census, Sample Survey

POPULATION

The aggregate of all items from which data are to be collected in a statistical enquiry is called the population. It comprises all the items under consideration in any field of enquiry.

CENSUS

A complete enumeration of all the items in the population is known as Census survey. It is also called complete enumeration.

SAMPLE SURVEY

A survey conducted by taking a sample to represent the characteristics of the population under study is called a sample survey.

Advantages of Census Method

  1. Free from sampling errors.
  2. Results will be highly accurate.
  3. Useful for further studies.
  4. Can study each unit in detail.
  5. All the characteristics of the population are maintained in original form.
  6. Suitable for heterogeneous units.

Disadvantages of Census Method

  1. Large number of enumerators required.
  2. It is time consuming.
  3. Highly expensive.
  4. Inconvenient.
  5. Possible in limited circumstances.
  6. Not applicable for infinite population.
  7. Labour consuming.
  8. Greater scope for non-sampling errors.

Sample

A sample is that part of the population which represents the entire population in terms of the characteristics under study.

Sample Survey

In the census method, we collect information from each unit in the population. But sometimes this is not possible, for many reasons. Suppose that a mobile phone company wants to know about the popularity of mobile phones and the usage habits among college students. Is it advisable to collect information from every college student, that is, from the whole population of this particular statistical survey? Definitely not. Then how will they conduct the study? They will collect information from only some of the students, and from these data they will come to a conclusion. The students selected in this investigation are known as the sample.

That is, a sample is that part of the population which represents the entire population in terms of the characteristics under study. The result obtained through this sample will reasonably match the result that could be obtained through the complete enumeration method. Thus, a survey conducted by taking a sample to represent the characteristics of the population under study is referred to as a sample survey. The process of selecting a sample out of a given population is called sampling. The number of units in the sample is called the sample size.

The sample method is an important method of statistical investigation. In most statistical investigations we rely on the sample survey method. In our daily life also we resort to sampling to obtain results. A physician makes inferences about a patient’s blood by testing a few drops of it. In certain cases, such as the testing of blood, it is the only method that can be used. Sample surveys are performed for the following reasons:

  1. Less expensive: The cost of conducting a sample survey is much lower than the cost of conducting the investigation through the census method.
  2. Less time consuming: Since we consider only a part of the population, a sample survey saves considerable time and labour in collecting data. Since the data collected are fewer than in the census method, they can be classified and processed quickly.
  3. More reliable: The results obtained through a sample survey can be more accurate and reliable because a smaller, more manageable investigation is less prone to errors arising from inaccurate information or incomplete returns.
  4. Detailed study of the selected units is possible: Since only a part of the population is put under study, we can collect more detailed information from all the selected units.

The sample in a sample survey should be selected very carefully. The selection may be made either deliberately or randomly. But it should have the following essential characteristics:

  • A good sample should be representative of the population.
  • It should be homogeneous.
  • It should be adequate.

Merits of Sample Survey

  1. Less expensive
  2. Less time consuming
  3. More reliable
  4. More detailed information
  5. Organisational Convenience
  6. More Scientific
  7. Indispensable Method in certain cases

Demerits of Sample Survey

  1. Absence of being a representative
  2. Likely to arrive at wrong conclusions
  3. Small universe
  4. Specialised knowledge
  5. Inherent defects
  6. Sampling Error
  7. Personal bias

Methods of Sampling

There are various methods of selecting samples from a population. These are called sampling techniques. The two types of sampling techniques are random sampling and non-random sampling.

Random Sampling Method

Random sampling is a technique of drawing a sample from the population in which each and every unit of the population has an equal chance of being included in the sample. It is further divided into simple random sampling and restricted random sampling.

1. SIMPLE RANDOM SAMPLING

In this method, the sample is taken from the population without making any division or classification of the population. Hence every unit of the population has an equal chance of being selected in the sample. Simple random sampling may be done either by the lottery method or by using a table of random numbers.

  1. Lottery method: This is the most popular and simple method of sampling. Under this method, all the items of the population are numbered on separate slips of paper of the same size, shape and colour. The paper slips are folded in a uniform manner and mixed up in a container. A blindfold selection is then made of the number of slips required to form the desired sample size. The selection of items depends purely on chance.

  2. Table of random numbers: Several standard random number tables are available, and the most popular one is Tippett’s random number table. The table constructed by Tippett consists of 10,400 four-digit numbers, giving a total of 41,600 digits (10,400 × 4). The technique of selecting a random sample with the help of this table is as follows: if we have to select a sample of 100 from a population of 5000, we first number the population from 1 to 5000. Then we can open any page of Tippett’s table and select the first 100 numbers that do not exceed 5000.
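
Both methods of simple random sampling can be sketched in Python. This is a minimal illustration, not part of the original method description: the function names `lottery_sample` and `table_sample` are hypothetical, and the four-digit numbers come from Python's pseudo-random generator as a stand-in for a printed table such as Tippett’s (no actual table values are reproduced). The sketch also assumes that the number 0000 and repeated numbers are skipped when reading the table.

```python
import random


def lottery_sample(population_size, sample_size, rng):
    """Lottery method: number every unit, mix the slips, draw blindly."""
    slips = list(range(1, population_size + 1))  # one numbered slip per unit
    rng.shuffle(slips)                           # mix the slips in the container
    return slips[:sample_size]                   # draw the required number of slips


def table_sample(population_size, sample_size, rng):
    """Random-number-table method: read four-digit numbers in order and keep
    each new number that corresponds to a unit in the population."""
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(0, 9999)            # stand-in for the next table entry
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen


rng = random.Random(2024)  # fixed seed so the sketch is repeatable
lottery = lottery_sample(5000, 100, rng)
table = table_sample(5000, 100, rng)
print(sorted(lottery)[:5], sorted(table)[:5])
```

In both functions every unit numbered 1 to 5000 has the same chance of entering the sample, which is the defining property of simple random sampling.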

2. RESTRICTED RANDOM SAMPLING

Restricted random sampling is mainly of three types: stratified sampling, systematic sampling and cluster sampling.

  1. Stratified Sampling: When the population is heterogeneous, the stratified sampling method is used. Under this method, the whole population is divided into various groups or strata, such that the units in each stratum possess similar characteristics. For example, suppose you are studying the consumption pattern of students in your school. The population comprises all the students studying in the various standards of your school. A student in standard-5 and a student in standard-9 may have different consumption patterns. That is, for this characteristic, the population is heterogeneous. Hence, the different standards can be taken as different groups, or strata. A sample is then drawn from each stratum at random.

  2. Systematic Sampling: A systematic sample is formed by selecting one unit at random and then selecting the rest at evenly spaced intervals until the required sample size is reached. Suppose that the nature club of your school has 100 members and you want to form a core group of 10. First, number the 100 members from 1 to 100. By the lottery method or the random number table method, select one student from the first ten; let it be the 7th student. Then select the remaining 9 students at a fixed interval. If the interval is 10 (that is, 100 ÷ 10), the second student in the sample is the 17th student, the third is the 27th student, and so on.

  3. Cluster Sampling: This type of sampling is carried out in several stages. Suppose we are studying the employment of households in Kerala. In the first stage, Kerala is divided into three or four zones. Then each zone is divided into districts, and each district is divided into villages. From each district, a sample of villages may be taken at random. From each selected village, the required number of households is also taken at random. Since several stages are involved in cluster sampling, it is also known as multi-stage sampling.
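
The three restricted methods can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the roll numbers, and the district and village labels are all hypothetical, each stratum contributes the same number of units, and the multi-stage sketch starts at the district level rather than at zones.

```python
import random


def systematic_sample(population_size, sample_size, rng, start=None):
    """Pick a random start in the first interval, then every k-th unit."""
    interval = population_size // sample_size      # k = N / n
    if start is None:
        start = rng.randint(1, interval)           # random unit from the first k
    return [start + i * interval for i in range(sample_size)]


def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of fixed size from every stratum."""
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


def multistage_sample(districts, villages_per_district, households_per_village, rng):
    """Sample villages within each district, then households within each village."""
    selected = {}
    for district, villages in districts.items():
        for village in rng.sample(sorted(villages), villages_per_district):
            households = villages[village]
            selected[(district, village)] = rng.sample(households, households_per_village)
    return selected


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Nature club example: start at the 7th student, interval of 10.
club = systematic_sample(100, 10, rng, start=7)
print(club)  # [7, 17, 27, 37, 47, 57, 67, 77, 87, 97]

# Hypothetical strata: 40 roll numbers in each of standards 5 to 9.
strata = {f"standard-{s}": [f"{s}-{roll:02d}" for roll in range(1, 41)] for s in range(5, 10)}
by_standard = stratified_sample(strata, 4, rng)

# Hypothetical frame: 2 districts, 6 villages each, 50 households per village.
districts = {
    d: {f"{d}-village-{v}": list(range(1, 51)) for v in range(1, 7)}
    for d in ("district-A", "district-B")
}
households = multistage_sample(districts, 2, 5, rng)
print(len(by_standard), len(households))  # 5 strata, 4 selected villages
```

Taking an equal number from each stratum is the simplest choice; a real survey might instead take a number proportional to the size of each stratum.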

Non-Random Sampling Method

In this method of sampling, the investigator chooses the sample from the population at his own discretion, selecting the units he thinks best. Here, all the units in the population do not have an equal chance of being selected into the sample. Since the investigator is free to include or leave out a particular unit, he can collect data smoothly. But it has shortcomings such as:

  • The bias and prejudices of the investigator influence the selection of the sample. Sometimes this sample cannot be considered as a true representative of the population.
  • The degree of accuracy may not be assured.

The chart given below shows the different sampling methods used for collecting data.

Sampling and Non-sampling errors

Statistical investigations are carried out for drawing some inferences about the population. Errors may occur at any stage of the investigation: at the time of framing the sample, in the collection of data, or in any other process of the investigation. We mainly address two types of errors, namely, sampling errors and non-sampling errors.

1. SAMPLING ERRORS

In the sampling method, we collect data from only a small fraction of the population. After drawing an inference by studying the data, we generalize the result to the population as well. That is, we draw inferences about the population on the basis of a few observations (the sample). Errors may occur in this process. The sample value may differ from the population value, that is, the value we would have obtained through complete enumeration. The error arising from drawing inferences about the population on the basis of a sample is known as sampling error. Let us see an example:

Suppose that there are 30 students in your class, and their marks in Economics are as given below. The problem is to assess the performance of the students in the Economics examination by calculating the average mark.

Table-12
43 48 65 87 32 17
43 50 80 32 48 82
73 67 62 70 39 12
43 45 61 76 32 75
71 63 32 18 60 42

If we consider the whole population (i.e. all the 30 students), the average mark is about 52. Now, let the problem be done by the sampling method. Let the sample size be 5. Suppose the marks of five students selected at random are 70, 32, 39, 82 and 17. The average of these five marks is 48. That is, the population value is 52 but the sample value is only 48. This difference is the sampling error: the sample we have chosen is not a perfect representative of the population. Here the sampling error is 52 - 48 = 4.
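
The arithmetic of this example can be checked with a few lines of Python. The sketch below uses only the marks from Table-12 and the five sampled marks quoted above; the variable names are illustrative. Note that the exact population mean is 1568 / 30, which the text rounds to 52.

```python
# Marks of all 30 students (Table-12), row by row.
marks = [
    43, 48, 65, 87, 32, 17,
    43, 50, 80, 32, 48, 82,
    73, 67, 62, 70, 39, 12,
    43, 45, 61, 76, 32, 75,
    71, 63, 32, 18, 60, 42,
]
# The five marks drawn as the sample in the example.
sample = [70, 32, 39, 82, 17]

population_mean = sum(marks) / len(marks)    # 1568 / 30
sample_mean = sum(sample) / len(sample)      # 240 / 5
sampling_error = population_mean - sample_mean

print(round(population_mean, 2), sample_mean, round(sampling_error, 2))
# prints: 52.27 48.0 4.27
```

Rounded to whole marks, this gives the figures used in the example: a population mean of 52, a sample mean of 48 and a sampling error of 4.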

As the size of the sample increases, the error in sampling decreases. Hence, sampling errors may be minimized by taking large samples. For minimizing sampling error, the following precautions may be taken:

  1. The sample size should be made large
  2. Sample should be taken with care
  3. Enumerators should be well trained
  4. Scientific methods of data collection should be employed
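
The claim that sampling error shrinks as the sample grows can be illustrated with a small simulation on the Table-12 marks. This is a sketch under assumptions not in the text: the helper `average_abs_error` is hypothetical, samples are drawn without replacement, and the error is averaged over 2000 repeated draws with a fixed seed.

```python
import random

# Marks of all 30 students (Table-12).
marks = [
    43, 48, 65, 87, 32, 17,
    43, 50, 80, 32, 48, 82,
    73, 67, 62, 70, 39, 12,
    43, 45, 61, 76, 32, 75,
    71, 63, 32, 18, 60, 42,
]
population_mean = sum(marks) / len(marks)


def average_abs_error(sample_size, trials, rng):
    """Average absolute gap between sample mean and population mean."""
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(marks, sample_size)   # draw without replacement
        total += abs(sum(sample) / sample_size - population_mean)
    return total / trials


rng = random.Random(12)  # fixed seed so the sketch is repeatable
errors = {n: average_abs_error(n, 2000, rng) for n in (5, 10, 20, 30)}
for n, err in errors.items():
    print(n, round(err, 2))
```

The average error falls as the sample size rises and becomes zero when the sample contains all 30 students, because the sample is then a complete enumeration.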

2. NON-SAMPLING ERRORS

Will the result of an investigation be free from errors if we use the census method? No. Errors may creep in because of many factors: the vast size of the population, the inefficiency of certain enumerators and the tabulation of huge volumes of data all introduce errors. Errors arising in this manner are called non-sampling errors. That is, non-sampling errors are those that creep in due to human factors. Some of the non-sampling errors are errors in data acquisition, non-response errors and sampling bias. More specifically, non-sampling errors may arise from one or more of the following factors:

  1. Data specification may be inadequate with respect to the objectives of the survey
  2. Statistical units prescribed may be inappropriate
  3. Methods of interview may be unsuitable
  4. Investigators may be inexperienced or untrained
  5. No response or incorrect responses from respondents
  6. Errors in data processing
  7. Errors occurring during presentation and tabulation of data

Non-sampling errors are more serious than sampling errors because sampling errors can be minimized by taking larger samples, whereas non-sampling errors cannot be minimized by changing the sample size, whether larger or smaller.