Description
This is a presentation explaining various methods of data collection and sampling.
Data Collection and Sampling
January 31, 2013
Dr.B.Sasidhar
Types of Data - Two Types
• Qualitative (categorical or nominal). Examples are:
– Color
– Gender
– Nationality
• Quantitative (measurable or countable). Examples are:
– Temperatures
– Salaries
– Number of points scored on a 100-point exam
Scales of Measurement
• Nominal scale: groups or classes
– Gender, nationality
• Ordinal scale: order matters
– Ranks (top ten videos)
• Interval scale: difference or distance matters; has an arbitrary zero value
– Temperatures (°F, °C), Likert scale
• Ratio scale: ratios matter; has a natural zero value
– Salaries, age, income
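The difference between the interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration: the temperatures and salaries are made-up values, and `c_to_f` is a helper introduced here. It shows that a ratio of temperatures changes with the unit, while a ratio of salaries does not.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
a_c, b_c = 10.0, 20.0                      # hypothetical temperatures in Celsius
ratio_c = b_c / a_c                        # 2.0 when measured in Celsius
ratio_f = c_to_f(b_c) / c_to_f(a_c)        # 68 / 50 = 1.36 when measured in Fahrenheit

# Ratio scale: the zero point is natural, so ratios survive a change of unit.
salary_a, salary_b = 30_000.0, 60_000.0    # hypothetical salaries
ratio_units = salary_b / salary_a
ratio_thousands = (salary_b / 1000) / (salary_a / 1000)

print(ratio_c, ratio_f)                    # the two temperature ratios differ
print(ratio_units, ratio_thousands)        # the two salary ratios agree
```

Differences, by contrast, are meaningful on both scales: a 10 degree gap in Celsius is always an 18 degree gap in Fahrenheit.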
Types of Data
PUBLISHED DATA
– This is often a preferred source of data due to low cost and convenience.
– Published data is found as printed material, tapes, disks, and on the Internet.
– Data published by the organization that has collected it is called PRIMARY DATA. For example:
• Data published by the US Bureau of the Census.
– Data published by an organization different from the organization that has collected it is called SECONDARY DATA. For example:
• The Statistical Abstract of the United States compiles data from primary sources.
• Compustat sells a variety of financial data tapes compiled from primary sources.
Secondary Data
• Secondary data are data that have already been collected for a purpose other than the problem at hand.
• Can be located quickly
• Inexpensive
• Examples: stock exchange directories, government data on population, etc.
• IMF, World Bank, ADB and other international organisations
• Web pages of different organisations
Primary versus Secondary data
Differences          Primary data              Secondary data
Collection purpose   For the problem at hand   For other problems
Collection process   Very involved             Rapid and easy
Collection cost      High                      Relatively low
Collection time      Long                      Short
Classification of Secondary data
Secondary data:
• Internal
– Ready to use
– Requires further processing
• External
– Published material
– Computerised databases
– Syndicated services
External Secondary Sources
Published secondary data:
• General business sources
– Guides
– Indexes
– Annual reports of companies
– Directories
• Government sources
– Census data
– Other government publications
• Internet sources
Observational and experimental studies
• When published data is unavailable, one needs to conduct a study to generate the data.
– An observational study is one in which measurements representing a variable of interest are observed and recorded, without controlling any factor that might influence their values.
– An experimental study is one in which measurements representing a variable of interest are observed and recorded, while controlling factors that might influence their values.
Primary Data
Primary data are collected by conducting a survey. If the survey covers the entire population, it is known as a census survey or complete enumeration. In contrast, if the survey covers only a part of the population, or a subset from a set of units, with the object of investigating the properties of the parent population or set, it is known as a sample survey.
Advantages of Sampling
1. A sample survey is cheaper than a census survey.
2. Since the magnitude of operations involved in a sample survey is small, both the execution of the field work and the analysis of the results can be carried out speedily.
3. Sampling results in a greater economy of effort, as a relatively small staff is required to carry out the survey and to tabulate and process the survey data.
4. Compared to a census survey, more detailed information can be collected in a sample survey.
5. Since the scale of operations involved in a sample survey is small, the quality of interviewing, supervision and other related activities can be better than in a census survey.
Limitations of Sampling
1. When information is needed on every unit in the population, such as individuals, dwelling units or business establishments, a sample survey cannot be of much help, for it fails to provide information on each individual unit.
2. Sampling gives rise to certain errors. If these errors are too large, the results of the sample survey will be of extremely limited use.
3. While in a census survey it may be easy to check for omissions of certain units in view of the complete coverage, this is not so in the case of a sample survey.
Surveys
• Surveys solicit information from people.
• Surveys can be made by means of
– personal interview
– telephone interview
– mailed questionnaire
Survey Methods
• Telephone
– Traditional telephone
– Computer-assisted telephone interviewing (CATI)
• Personal
– In-home
– Mall intercept
– Computer-assisted personal interviewing (CAPI)
• Mail
– Mail interview
– Mail panel
Questionnaire
• Questionnaire: A structured technique for data collection consisting of a series of questions, written or verbal, that a respondent answers.
• Objectives of a questionnaire:
– 1. Translate the information needed into a set of questions
– 2. Motivate and encourage the respondent to give answers
– 3. Minimise response error
Questionnaire design process
1. Specify the information needed
2. Specify the type of interviewing method
3. Determine the content of individual questions
4. Design the questions to overcome the respondent's unwillingness to answer
5. Decide the structure of the questions
6. Determine the question wording
7. Arrange the questions in proper order
8. Decide the form and layout
9. Pre-test the questionnaire
10. Eliminate bugs, if any
Question Content
• Every question should elicit information that is needed for the study.
• If there is no use for the data resulting from a question, that question should be eliminated.
• However, introductory questions should be kept; otherwise the respondents may not co-operate.
• Some questions may be duplicated to test reliability or validity.
• Questions unrelated to the immediate problem may be included and used later.
Questions - Structure
• Double-barreled questions: A single question attempts to cover two issues simultaneously.
• Example: Is Coca-Cola a tasty and refreshing soft drink?
• "Tasty" and "refreshing" are two issues and should be asked about in two questions. Otherwise the answers will be ambiguous.
• Multiple questions: Why do you shop in Big Bazaar?
• Such a question draws many types of answers, some of them unwanted, and coding will be difficult.
Structured Questions
• A structured question could be
– 1. Multiple choice
– 2. Dichotomous
– 3. A scale
• Multiple-choice questions can be coded easily, but it is difficult to list all the alternatives.
• Dichotomous questions have only two answers: yes or no, agree or disagree, etc.
• Sometimes "don't know", "both" or "none" is given as an additional alternative.
• In scaling, attributes like quality or beauty are measured with, say, a Likert scale.
Question Wording
• 1. Define the issue in simple words
• 2. Use unambiguous words
• 3. Avoid leading and biasing questions (a leading question steers the respondent towards the answer the researcher wants)
• 4. Avoid implicit alternatives
• 5. Avoid implicit assumptions
• 6. Avoid generalisations and estimates
• 7. Use positive and negative statements for measuring attitudes and lifestyles
Inability to Answer
• Respondents may not remember, may not be able to articulate or describe the answer, or may not be informed.
• A wife may not inform her husband about grocery purchases, and vice versa.
• There are many things we may not remember, such as what we ate four days ago.
• Less educated respondents may not be able to describe the answer.
• Filter questions may be asked to screen out such respondents. For example, before asking how many children a respondent has, ask whether he or she is married.
Unwillingness to Answer
• The possible reasons for unwillingness to answer:
• 1. Too much effort is necessary
• 2. The situation or context may not be appropriate
• 3. No legitimate purpose is apparent (for example, if income is asked)
• 4. Sensitive questions are asked
• 5. Privacy is touched
Questionnaire Form and Layout
• 1. Divide the questionnaire into several parts
• 2. Questions in each part should be numbered
• 3. The questionnaire should be precoded
• 4. The questionnaires themselves should be numbered serially
• 5. This facilitates the control of questionnaires in the field as well as coding
Questionnaire Pretesting
• 1. All aspects of the questionnaire should be tested (content, wording, sequence, form and layout, difficulty and instructions)
• 2. The respondents should be similar to those who will be included in the actual survey
• 3. The pretest sample size should be around 30
• 4. After seeing the results, revise the questions
• 5. Pretest again
• 6. The responses obtained in the pretest should be coded and analysed to see approximate results
Sampling
• Motivation for conducting a sampling procedure:
– Costs.
– Population size.
– The possible destructive nature of the sampling process.
• The sampled population and the target population should be similar to one another.
Sampling and Non-sampling errors
• Two major types of errors can arise when a sampling procedure is performed.
• Sampling error
– Sampling error refers to differences between the sample and the population that exist only because of the specific observations that happen to be selected. It arises by chance; a faulty selection process causes bias instead, which is a non-sampling error.
– Sampling error is expected to occur when making a statement about the population based on the sample taken. Expect sampling error to decrease as the sample size increases.
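The claim that sampling error shrinks as the sample grows can be checked with a small simulation. This is a minimal sketch under stated assumptions: the income population is synthetic (a log-normal distribution chosen only for illustration), and `mean_abs_sampling_error` is a helper introduced here, not part of the presentation.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, trials, rng):
    """Average |sample mean - population mean| over repeated simple random samples."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, n)      # drawn without replacement
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

rng = random.Random(2013)                        # fixed seed so the run is repeatable
# Hypothetical right-skewed income population
population = [rng.lognormvariate(10, 0.5) for _ in range(10_000)]

error_small = mean_abs_sampling_error(population, n=25, trials=200, rng=rng)
error_large = mean_abs_sampling_error(population, n=2500, trials=200, rng=rng)
print(f"average error with n=25:   {error_small:.1f}")
print(f"average error with n=2500: {error_large:.1f}")
```

The average gap between the sample mean and the population mean is much smaller for the larger sample, which is the behaviour described above.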
Sampling Errors
[Figure: population income distribution, marking μ (the population mean) and x̄ (the sample mean); the gap between them is the sampling error.]
The sample mean falls where it does only because certain randomly selected observations were included in the sample.
Non-sampling Errors
• Non-sampling errors occur due to mistakes made during the process of data acquisition.
• Increasing the sample size will not reduce this type of error.
• There are three types of non-sampling errors:
– Errors in data acquisition
– Non-response errors
– Selection bias
Data Acquisition Error
[Figure: population and sample.]
If an observation from the population is wrongly recorded in the sample, then the sample mean is affected (sampling error + data acquisition error).
Non-Response Error
[Figure: population and sample.]
No response from part of the population may lead to biased results in the sample.
Selection Bias
[Figure: population and sample.]
When parts of the population cannot be selected, the sample cannot represent the whole population.
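A short simulation illustrates why a larger sample does not cure selection bias. This is a sketch under assumed numbers: the two groups, their sizes and their means are invented for illustration, and only the "reachable" group can ever be selected.

```python
import random
import statistics

rng = random.Random(31)                          # fixed seed for a repeatable run
# Hypothetical population: 70% of units can be selected, 30% cannot
reachable = [rng.gauss(40, 5) for _ in range(7_000)]
unreachable = [rng.gauss(70, 5) for _ in range(3_000)]
population = reachable + unreachable
mu = statistics.fmean(population)                # true population mean

def biased_estimate(n):
    """Sample mean when only the reachable part of the population can be selected."""
    return statistics.fmean(rng.sample(reachable, n))

gap_small = abs(biased_estimate(50) - mu)        # error with a small sample
gap_large = abs(biased_estimate(5_000) - mu)     # error with a sample 100 times larger
print(f"gap with n=50:   {gap_small:.1f}")
print(f"gap with n=5000: {gap_large:.1f}")
```

Both gaps stay close to the difference between the reachable group's mean and the population mean, so the error comes from who can be selected, not from how many are selected.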
Causes of Non-sampling Errors
1. Using an imprecise definition or a wrong concept while launching the survey.
2. Entrusting the survey work to untrained and inexperienced investigators.
3. Despatching a defective mail questionnaire to respondents, who may not clearly understand certain questions.
4. Errors that arise on account of non-response from respondents.
5. Poor supervision of the field staff.
6. Faulty tabulation while transferring the questionnaire data to tabulation sheets.
7. Calculation mistakes in the processing and analysis of data.
8. Mistakes in the oral or written presentation of the survey results.
Target Population
• Population: The collection of units or objects that possess the information sought by the researcher and about which inferences are to be made
– Example: dividend-paying companies
• Sampling unit: The basic unit containing the elements of the population to be sampled
– Example: SAIL, Infosys, etc.
• Element: An object that possesses the information sought by the researcher and about which inferences are to be made
– Example: current ratio, ROI, debt-equity ratio
Sampling Frame
• Frame: A representation of the units of the target population that consists of a list or set of directions for identifying the target population
• Examples: stock exchange directories, industry publications, etc.
• Often it is the researcher who compiles the frame
• He or she should avoid including unnecessary units and should not omit relevant units
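The two checks on a compiled frame (no unnecessary units, no omitted units) can be expressed as set differences. This is only a sketch: in practice the full target population is rarely known, so the reference set below is an assumption, and the company lists are illustrative.

```python
# Hypothetical reference list for the target population
target_population = {"SAIL", "Infosys", "TCS", "Wipro"}
# Hypothetical frame as compiled by the researcher (one duplicate, one stray unit)
frame = ["SAIL", "Infosys", "Infosys", "Acme Traders"]

frame_units = set(frame)                         # drops duplicate listings
unnecessary = frame_units - target_population    # listed but not part of the target population
omitted = target_population - frame_units        # part of the target population but not listed

print("Unnecessary units:", sorted(unnecessary))
print("Omitted units:", sorted(omitted))
```

An empty result for both sets would mean that the frame matches the reference list exactly.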
Sampling Techniques Classification
• Non-probability sampling
– Convenience
– Judgmental
– Quota
– Snowball
• Probability sampling
– Random
– Systematic
– Stratified
– Cluster
Probability and Nonprobability Sampling
• Nonprobability sampling: Sampling techniques that do not use chance selection procedures but rely on the personal judgement of the researcher (e.g. the convenience procedure)
• Probability sampling: A sampling procedure in which each element of the population has a fixed probabilistic chance of being selected (e.g. the lottery method)
Convenience Sampling
• The sampling units are selected by the researcher according to his or her convenience.
• Example: A convenience sample may be used while studying consumer behaviour.
• This sampling is widely used in exploratory research.
• Useful in pretesting the questionnaire and in pilot studies
• It is the least time-consuming and least expensive method.
• The sample will not be truly representative of the population being studied.
Judgmental Sampling
• It is a form of convenience sampling in which the elements are selected by the researcher based on some criteria.
• Example: Questions on e-commerce cannot be asked of everyone, only of people who appear to have internet and computer knowledge.
• Useful in testing hypotheses of a specialised nature
• Limited use
• It cannot be used for studies of a general nature
Quota Sampling
• It is a two-stage restricted judgmental sampling.
• The first stage develops control categories based on criteria such as age, sex, etc.
• In the second stage, the sample elements are selected based on convenience or judgment until the quota is filled.
• Quota sampling is effective in determining magazine readership.

Control characteristic   Sample %   Sample items
Sex
  Male                   48         480
  Female                 52         520
  Total                  100        1000
Age
  18-30                  27         270
  31-45                  39         390
  46-60                  34         340
  Total                  100        1000
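The two stages can be sketched in a few lines. The percentages and the total of 1000 come from the table above; the function names and the stream of respondents are hypothetical, introduced only for illustration.

```python
def quota_counts(percentages, sample_size):
    """Stage 1: turn the percentage for each control category into a quota."""
    return {category: round(sample_size * pct / 100)
            for category, pct in percentages.items()}

def fill_quotas(respondents, quotas):
    """Stage 2: accept respondents in arrival order until each quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person_id, category in respondents:
        if remaining.get(category, 0) > 0:
            accepted.append(person_id)
            remaining[category] -= 1
    return accepted

sex_quota = quota_counts({"Male": 48, "Female": 52}, 1000)
age_quota = quota_counts({"18-30": 27, "31-45": 39, "46-60": 34}, 1000)
print(sex_quota)
print(age_quota)

# Hypothetical respondents met by convenience, with a tiny quota for brevity
arrivals = [(1, "Male"), (2, "Male"), (3, "Female"), (4, "Male")]
accepted_ids = fill_quotas(arrivals, {"Male": 2, "Female": 1})
print(accepted_ids)
```

Respondent 4 is turned away because the quota for that category is already full, which is how the control characteristics keep the sample composition fixed.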
Snowball Sampling
• The sample grows like a rolling snowball.
• An initial sample is selected first on a random basis, and subsequent respondents are selected based on referrals.
• Example: An advertisement was placed to collect data on workshop machines, and the respondents were asked to give the names and addresses of people who use the same or similar machines.
• This technique is useful for collecting data about rare items such as antique pieces.
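The referral process can be sketched as a breadth-first walk over a referral list. The respondent labels and the `referrals` mapping are hypothetical, and `snowball_sample` is a helper introduced here for illustration.

```python
from collections import deque

def snowball_sample(seeds, referrals, target_size):
    """Start from the initial respondents and follow referrals until the target size is reached."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)                     # avoid contacting the same person twice
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referrals: who names whom as a user of similar machines
referrals = {"A": ["C", "D"], "B": ["D", "E"], "C": ["F"]}
respondents = snowball_sample(["A", "B"], referrals, target_size=5)
print(respondents)
```

The initial respondents A and B are interviewed first, and the sample then grows through the people they name.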
Simple Random Sampling
• Each element in the population has a known and equal probability of selection (lottery method).
• To draw a random sample, the researcher first compiles a sampling frame in which each element has a unique identification number.
• Traditionally, numbers are drawn from a random number table and the corresponding elements are added to the sample.
• Nowadays, computer-generated random numbers are used instead.
Random Sampling (contd.)
• Merits:
– 1. It is easy to understand
– 2. The results can be projected to the target population
– 3. Most statistical inference procedures assume that the data have been collected by random sampling
• Demerits:
– 1. It is difficult to construct a sampling frame
– 2. It is time-consuming
– 3. The cost of data collection is high
– 4. It often results in larger standard errors
– 5. It may or may not result in a representative sample
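Drawing a simple random sample from a numbered frame with computer-generated random numbers can be sketched as follows. The frame of 100 companies is hypothetical, and the seed is fixed only so that the draw can be repeated.

```python
import random

# Hypothetical sampling frame: each element has a unique identification number
frame = {unit_id: f"Company {unit_id}" for unit_id in range(1, 101)}

rng = random.Random(41)                          # seeded only for repeatability
chosen_ids = rng.sample(sorted(frame), k=10)     # every ID equally likely, no repeats
sample = [frame[unit_id] for unit_id in chosen_ids]
print(sample)
```

`random.sample` draws without replacement, so no element can enter the sample twice, just as in the lottery method.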
Systematic Sampling / Quasi-random Sampling
• In systematic sampling, the sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame.
• Example (as in a Gallup poll): start data collection from, say, the 4th house and then pick every 5th house, i.e. 4, 9, 14, 19, 24, 29, 34, and so on.
• Sales of companies can be arranged in ascending order and systematic sampling can then be used. This gives a fair sample of big and small firms for analysis.
• It is less costly, and data are easy to collect.
• Random number generation need not be understood.
• Can be applied even if a sampling frame is not available.
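The house example can be reproduced with list slicing. This is a minimal sketch: the street of 35 houses is hypothetical, and `systematic_sample` is a helper introduced here. The start is passed explicitly to match the example; leaving it out draws a random start within the first interval.

```python
import random

def systematic_sample(frame, interval, start=None, rng=random):
    """Pick a start position within the first interval, then take every interval-th element."""
    if start is None:
        start = rng.randint(1, interval)  # 1-based position, both bounds inclusive
    return frame[start - 1::interval]

houses = list(range(1, 36))               # hypothetical street with houses numbered 1 to 35
picked = systematic_sample(houses, interval=5, start=4)
print(picked)
```

With a start of 4 and an interval of 5, the selected houses are 4, 9, 14, 19, 24, 29 and 34, as in the example.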
Stratified Sampling
• This sampling uses a two-step process. First, the population is partitioned into sub-populations, or strata. Second, elements are selected from each stratum by a random procedure.
• The elements within each stratum should be as homogeneous as possible, and the strata should be as heterogeneous from one another as possible.
• People can be classified into strata on the basis of income, age, etc., and samples can be collected from each stratum.
• This increases precision without increasing cost.
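The two steps can be sketched as follows. The population of 1000 people in three income bands is invented for illustration, and proportional allocation (the same sampling fraction in every stratum) is an assumption of this sketch; the slides do not say how many elements to take from each stratum.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Step 1: partition the units into strata. Step 2: draw randomly within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))   # proportional allocation (assumed)
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(44)                              # fixed seed for a repeatable run
# Hypothetical population of (person_id, income_band) pairs
people = ([(i, "low") for i in range(600)]
          + [(i, "middle") for i in range(600, 900)]
          + [(i, "high") for i in range(900, 1000)])

sample = stratified_sample(people, stratum_of=lambda p: p[1], fraction=0.1, rng=rng)
counts = Counter(band for _, band in sample)
print(counts)
```

Every stratum appears in the sample, in the same proportion as in the population, which is what distinguishes stratified sampling from a simple random draw.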
Cluster Sampling / Area Sampling
• This sampling uses a three-step process. First, the population is partitioned into sub-populations, or clusters. Second, a few clusters are selected by a random sampling procedure. Third, either all elements in each selected cluster or a few elements from each selected cluster are chosen as the sample.
• The elements within each cluster should be as heterogeneous as possible, and the clusters should be as homogeneous with one another as possible.

Stratified sampling:
1. All strata are selected for sampling
2. Homogeneity within subgroups and heterogeneity between subgroups
3. Elements are randomly selected from within each subgroup
Cluster sampling:
1. Only a few clusters are selected
2. Heterogeneity within subgroups and homogeneity between subgroups
3. Only a few subgroups are randomly selected and all elements in those subgroups are covered
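The cluster procedure can be sketched in the same style. The ten areas with twenty households each are hypothetical, and this sketch covers the variant in which all elements of the selected clusters are taken.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then cover every element in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

rng = random.Random(45)                   # fixed seed for a repeatable run
# Hypothetical areas (clusters), each listing its household IDs
areas = {f"area_{a}": [f"hh_{a}_{h}" for h in range(20)] for a in range(10)}

chosen_areas, households = cluster_sample(areas, n_clusters=3, rng=rng)
print(chosen_areas)
print(len(households))
```

Only three of the ten areas are visited, which is where the cost saving comes from; areas that are not chosen contribute nothing to the sample.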
Comparison of sampling techniques
1. Convenience
– Merits: Least expensive, least time-consuming, most convenient
– Demerits: Selection bias, not representative, not suitable for descriptive and causal research
2. Judgmental
– Merits: Low cost, convenient, not time-consuming
– Demerits: Does not allow generalisation, subjective
3. Quota
– Merits: Sample can be controlled for certain characteristics
– Demerits: Selection bias, no assurance of representativeness
4. Snowball
– Merits: Can estimate rare characteristics
– Demerits: Time-consuming
5. Random
– Merits: Easy to understand, results can be projected
– Demerits: Difficult to construct a sampling frame, expensive, lower precision, no assurance of representativeness
6. Systematic
– Merits: Easy, no sampling frame needed, can increase representativeness
– Demerits: No randomness
7. Stratified
– Merits: Includes all sub-populations, precision
– Demerits: Selection of criteria for stratification is difficult, expensive
8. Cluster
– Merits: Easy, cost effective
– Demerits: Difficult to cluster and interpret, imprecise
doc_478661836.ppt
This is a presentation explaining various methods of data collection and sampling.
Data Collection and Sampling
January 31, 2013
Dr.B.Sasidhar
1
Types of Data - Two Types
?
Qualitative Categorical or Nominal: Examples are?Color ?Gender ?Nationality
?
Quantitative Measurable or Countable: Examples are-
January 31, 2013
?Temperatures ?Salaries ?Number of points scored on a 100 Dr.B.Sasidhar point exam
2
Scales of Measurement
• Nominal Scale - groups or classes
?Gender, Nationality • Ordinal Scale - order matters
?Ranks (top ten videos)
• Interval Scale - difference or distance matters – has arbitrary zero value.
?Temperatures (0F, 0C), Likert Scale
• Ratio Scale - Ratio matters – has a natural zero value. ?Salaries, Age, Income
January 31, 2013 Dr.B.Sasidhar 3
Types of Data
For example:
•The Statistical abstracts of the United States, compiles data from primary sources • Compustat, sells variety of financial data tapes compiled from primary sources
PUBLISHED DATA – This is often a preferred source of data due to low cost and convenience.
– Published data is found as printed material, tapes, disks, and on the Internet. For example: – Data published by the Data published by the organization that has US Bureau of Census. collected it is called PRIMARY DATA.
– Data published by an organization different than the organization that has collected it is called January 31, 2013 Dr.B.Sasidhar SECONDARY DATA.
4
Secondary Data
• Secondary Data are data that have already been collected for purpose other than the problem at hand. • Can be located quickly • Inexpensive • Example: Stock exchange directory, Government data on population etc. • IMF, World Bank, ADB and other international Organisations • Web pages of different Organisations
January 31, 2013 Dr.B.Sasidhar 5
Primary versus Secondary data
Differences
Primary data Collection Purpose Collection process Collection cost Collection Time
January 31, 2013
Secondary data
For the problem at hand Very Involved
High
For other problems Rapid and easy
Relatively Low
Long
Dr.B.Sasidhar
Short
6
Classification of Secondary data
Secondary data
External Internal
Ready to use
Further process
Published Material
January 31, 2013
Computerised data bases
Dr.B.Sasidhar
Syndicated services 7
External Secondary Sources
Published Secondary data
Government sources
Internet Sources
Census data
General Business sources
Other Government Publications
Guides
January 31, 2013
Indexes
Annual Reports Dr.B.Sasidhar of companies
Directories
8
Observational and experimental studies
• When published data is unavailable, one needs to conduct a study to generate the data.
– Observational study is one in which measurements representing a variable of interest are observed and recorded, without controlling any factor that might influence their values. – Experimental study is one in which measurements representing a variable of interest are observed and recorded, while controlling factors that might influence their values.
January 31, 2013 Dr.B.Sasidhar 9
Primary Data
Primary data are collected by conducting a survey. If the survey covers the entire population, then it is known as the census Survey or complete enumeration. In contrast, if the
survey covers only a part of a population, or a
subset from a set of units, with the object of investigating the properties of the parent population
or set, it is known as the sample survey.
January 31, 2013 Dr.B.Sasidhar
10
Advantages of Sampling
1. A sample survey is cheaper than a census survey. 2. Since the magnitude of operations involved in a sample survey is small, both the execution of the field work and the analysis of the results can be carried out speedily. 3. Sampling results in a greater economy of effort as a relatively small staff is required to carry out the survey and to tabulate and process the survey data. 4. As compared to the census survey, more detailed information can be collected in a sample survey.
5. Since the scale of operations involved in a sample survey is
small, the quality of interviewing, supervision and other related activities can be better than that in a census survey.
January 31, 2013 Dr.B.Sasidhar
11
Limitations of Sampling
1. When the information is needed on every unit in the population such as individuals, dwelling units or business establishments, a sample survey cannot be of much help for it fails to provide information on individual count.
2.
Sampling gives rise to certain errors. If these errors are too large, the results of the sample survey will be of extremely limited use.
3.
While in a census survey it may be easy to check the omissions of certain units in view of complete coverage, this is not so in the case of a sample survey. January 31, 2013 Dr.B.Sasidhar 12
Surveys
• Surveys solicit information from people. • Surveys can be made by means of
– personal interview – telephone interview – through mailed questionnaire
January 31, 2013
Dr.B.Sasidhar
13
Survey Methods
Telephone Personal Mail
Traditional Telephone
Computer asst. TI
In-Home
Mall Intercept
CAPI
January 31, 2013
Dr.B.Sasidhar
Mail interview
Mail Panel 14
Questionnaire
• Questionnaire: A structured technique for data collection consisting of a series of questions, written or verbal, that a respondent answers. • Objectives of questionnaire:
– 1. Information needed, into a set of questions – 2. Motivate and encourage the respondent to give answers – 3. Should minimise the response error
January 31, 2013 Dr.B.Sasidhar 15
Questionnaire design process
1.Specify the information needed
2.Specify the type of interviewing method
3.Determine the content of individual questions
4.Design the questions to over come the Respondent’s unwillingness to answer
5.Decide the structure of the questions
6.Determine the question wording 7.Arrange the questions in proper order 8.Decide the form and layout 9.Pre-test the questionnaire
January 31, 2013 Dr.B.Sasidhar 10.Eliminate Bugs if any 16
Question Content
• Every question should elicit information which is needed to the study • If there is no use for the data resulting from a question, that question should be eliminated • However the introductory questions are to be there else the respondents may not co-operate • Some questions will be duplicated to test the reliability or validity • Questions unrelated to the immediate problem may be included and may be used later
January 31, 2013 Dr.B.Sasidhar 17
Questions - Structure
• Double Barreled Questions - A single question attempts to to cover two issues simultaneously • Example: Is Coca-Cola a tasty and refreshing soft drink? • Tasty and refreshing are two issues and should be asked in two questions. Else we will get ambiguous answers. • Multiple questions: Why do you shop in Big Bazaar? • Many types of answers, sometimes unwanted and coding will be difficult
January 31, 2013 Dr.B.Sasidhar 18
Structured Questions
• A structured question could be – 1. Multiple choice – 2. Dichotomous – 3. A Scale • Multiple choice questions could be coded easily but difficult to give all alternatives • Dichotomous questions have only two answers yes or no, agree or disagree etc • sometimes don’t know or both or none are given as alternatives • In scaling attributes like quality, beauty etc will be tested with say Lirket Scale etc January 31, 2013 Dr.B.Sasidhar 19
Question Wording
• 1. Define the issue in simple words • 2. Use unambiguous words • 3. Avoid leading and biasing questions (a question that leads the respondent to the desirable answer needed by the Researcher) • 4. Avoid implicit alternatives • 5 . Avoid implicit assumptions • 6. Avoid generalisations and Estimates • 7.Use positive and negative statements January 31, 2013 Dr.B.Sasidhar 20 for measuring attitudes and lifestyles
Inability to Answer
• The respondents may not remember, may not articulate or describe, may not be informed • Wife may not inform the husband about the groceries purchases and vice versa • Many things we may not remember like what you ate 4 days back • Not well educated people may not be able to describe the answer • Filter questions may be asked to drop these people. Before asking how many kids you have ask whether he or she is married.
January 31, 2013 Dr.B.Sasidhar 21
Unwillingness to Answer
• The possible reasons for unwilling to answer • 1.Too much effort is necessary • 2. The situation or context mat not be appropriate • 3. No legitimate purpose (if income is asked) • 4. If Sensitive questions are asked • 5. If privacy is touched
January 31, 2013 Dr.B.Sasidhar 22
Questionnaire Form and Layout
• 1. Divide the questionnaire into several parts • 2. Questions in each part should be numbered • 3. The questionnaire should be precoded • 4. The questionnaires themselves should be numbered serially • 5. This facilitates the control of questionnaires in the field as well as January 31, 2013 Dr.B.Sasidhar coding
23
Questionnaire Pretesting
• 1. All aspects of questionnaire should be tested (content, wording, sequence, from and layout, difficulty and instructions) • 2. The respondents should be similar to those who will be included in the actual survey • 3. The pretest sample size should be around 30 • 4. After seeing the results revise the questions • 5. Again pretest • 6. The responses obtained in the pretest should be coded and analysed to see approximate results
January 31, 2013 Dr.B.Sasidhar 24
Sampling
• Motivation for conducting a sampling procedure:
– Costs. – Population size. – The possible destructive nature of the sampling process.
• The sampled population and the target population should be similar to one another.
January 31, 2013 Dr.B.Sasidhar 25
Sampling and Non-sampling errors
• Two major types of errors can arise when a sampling procedure is performed. • Sampling Error
– Sampling error refers to differences between the sample and the population, because of the specific observations that happen to be selected, due to wrong process of selection (Bias) – Sampling error is expected to occur when making a statement about the population based on the sample taken. Expect sampling error to decrease with increase in sample size.
January 31, 2013 Dr.B.Sasidhar 26
Sampling Errors
Population income distribution
m ( population mean) The sample mean falls here only because Sampling error certain randomly selected observations were included in the sample.
January 31, 2013
x ( sample mean)
Dr.B.Sasidhar
27
Non-sampling Errors
• Non-sampling errors occur due to mistakes made along the process of data acquisition • Increasing sample size will not reduce this type of errors. • There are three types of Non-sampling errors;
– Errors in data acquisition, – Non-response errors, – 2013 January 31, Selection bias. Dr.B.Sasidhar
28
Data Acquisition Error
Population
If this observation…
Sample
Sampling error + Data acquisition error
…is wrongly recorded here…
January 31, 2013 Dr.B.Sasidhar 29
…then the sample mean is affected
Non-Response Error
Population
No response here... …may lead to biased results here.
Sample
January 31, 2013
Dr.B.Sasidhar
30
Selection Bias
Population
When parts of the population cannot be selected...
Sample
…the sample cannot represent the whole population.
January 31, 2013
Dr.B.Sasidhar
31
Causes of Non-sampling Errors
1. Using imprecise definition or wrong concept while launching the survey. 2. Entrusting the survey work to untrained and inexperienced investigators. 3. Despatching a defective mail questionnaire to respondents who may not clearly understand certain questions. 4. Errors that may arise on account of non-response from respondents. 5. Poor supervision of the field staff. 6. Faulty tabulation while transferring the questionnaire data to tabulation sheets.
7. Calculation mistakes in the processing and analysis of data.
8. Committing mistakes while oral or written presentation of the survey results.
32
Target Population
• Population: It is the collection of units or objects
that posses the information sought by the researcher and about which inferences are to be made
• Dividend paying companies • Sampling unit: The basic unit containing the
elements of the population to be sampled • Like SAIL, Infosys etc. • Element: An object that possesses the information sought by the researcher and about which inferences are to be made. • Current ratio, ROI, Debt Equity ratio
January 31, 2013 Dr.B.Sasidhar 33
Sampling Frame
• Frame: A representation of the units of the
target population that consists of a list or set of directions for identifying the target population
• Stock Exchange directories, Industry Publications etc • Often it is the researcher who compiles the frame • He or She should avoid inclusion of unnecessary units and should not omit the relevant units
January 31, 2013 Dr.B.Sasidhar 34
Sampling Techniques Classification
Non-probability Sampling
Probability Sampling
Convenience
Judgmental
Quota
Snowball
Random
January 31, 2013
Systematic
Dr.B.Sasidhar
Stratified
Cluster
35
Probability and Nonprobability Sampling
• Nonprobability Sampling: Sampling techniques that do not use chance selection procedures but rely on the personal judgement of the researcher (e.g. the convenience procedure)
• Probability Sampling: A sampling procedure in which each element of the population has a fixed probabilistic chance of being selected (e.g. the lottery method)
Convenience Sampling
• The researcher selects the sampling units according to his or her convenience
• Example: A convenience sample may be used when studying consumer behaviour
• This sampling is widely used in exploratory research
• Useful for pre-testing questionnaires and in pilot studies
• It is the least time-consuming and least expensive method
• The sample will not be truly representative of the population being studied
Judgmental Sampling
• It is a form of convenience sampling in which the researcher selects the elements on the basis of some criteria
• Example: Questions on e-commerce cannot be put to everyone, only to people who appear to have Internet and computer knowledge
• Useful in testing hypotheses of a specialised nature
• Of limited use
• It cannot be used for studies of a general nature
Quota Sampling
• It is a two-stage restricted judgmental sampling
• First stage: Control categories are developed on criteria such as age, sex, etc.
• Second stage: Sample elements are selected by convenience or judgment until each quota is filled
• Quota sampling can be effective in determining magazine readership

Control characteristic    Sample %    Sample items
Sex
  Male                    48          480
  Female                  52          520
  Total                   100         1000
Age
  18-30                   27          270
  31-45                   39          390
  46-60                   34          340
  Total                   100         1000
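The quota figures in the table follow directly from the percentages and a total sample of 1000. The sketch below reproduces that arithmetic; the function name `quota_counts` is a hypothetical helper, not something from the presentation.

```python
def quota_counts(percentages, sample_size):
    """Convert control-category percentages into numbers of sample items."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    # Integer arithmetic; with other inputs the floor division may leave
    # a few items unallocated, which would need to be assigned by hand.
    return {cat: sample_size * pct // 100 for cat, pct in percentages.items()}

sex_quota = quota_counts({"Male": 48, "Female": 52}, 1000)
age_quota = quota_counts({"18-30": 27, "31-45": 39, "46-60": 34}, 1000)

print(sex_quota)
print(age_quota)
```

Field workers would then keep selecting respondents, by convenience or judgment, until each category reaches its count.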
Snowball Sampling
• The sample grows like a rolling snowball
• An initial sample is first selected on a random basis, and subsequent respondents are selected on the basis of referrals
• Example: An advertisement was placed to collect data on workshop machines, and the respondents were asked to give the names and addresses of people who use the same or similar machines
• This technique is useful for collecting data about rare and antique pieces, etc.
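The referral process can be sketched as a wave-by-wave traversal. Everything below is illustrative: the respondent labels, the `referrals` mapping and the `snowball_sample` helper are assumptions, and the initial respondents are fixed here rather than drawn at random.

```python
from collections import deque

def snowball_sample(initial, referrals, max_size):
    """Grow a sample from initial respondents by following their referrals."""
    sample = []
    seen = set()
    queue = deque(initial)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already interviewed, skip repeated referrals
        seen.add(person)
        sample.append(person)
        # Each respondent names other people, who are contacted in later waves.
        queue.extend(referrals.get(person, []))
    return sample

# Hypothetical referral data: who names whom as a user of similar machines.
referrals = {
    "A": ["C", "D"],
    "B": ["D", "E"],
    "C": ["F"],
    "E": ["A", "G"],
}
sample = snowball_sample(["A", "B"], referrals, max_size=6)
print(sample)
```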
Simple Random Sampling
• Each element in the population has a known and equal probability of selection (lottery method)
• To draw a random sample, the researcher first compiles a sampling frame in which each element has a unique identification number
• Traditionally, numbers are drawn from a random number table and the corresponding elements are added to the sample
• Nowadays, computer-generated random numbers are used instead
Random Sampling (contd.)
• Merits:
  1. It is easy to understand
  2. The results can be projected to the target population
  3. Most statistical inference procedures assume that the data have been collected by random sampling
• Demerits:
  1. It is difficult to construct a sampling frame
  2. It is time-consuming
  3. The cost of data collection is high
  4. It often results in larger standard errors
  5. It may or may not result in a representative sample
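The simple random sampling procedure (a frame with unique identification numbers, then computer-generated random draws) can be sketched as follows. The frame of 500 numbered elements, the sample size and the seed are assumptions chosen only for illustration.

```python
import random

def simple_random_sample(frame_ids, n, seed=None):
    """Draw n distinct IDs; every ID in the frame has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame_ids, n)

# Hypothetical sampling frame: elements numbered 1..500.
frame_ids = list(range(1, 501))
sample = simple_random_sample(frame_ids, 20, seed=2013)

print(sorted(sample))
```

`random.sample` draws without replacement, so no element can enter the sample twice.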
Systematic Sampling / Quasi-random Sampling
• In the systematic sampling procedure, the sample is chosen by selecting a random starting point and then picking every i-th element in succession from the sampling frame
• Example: Gallup poll. Data collection starts from, say, the 4th house and then every 5th house is picked, i.e. 4, 9, 14, 19, 24, 29, 34, and so on
• The sales of companies can be arranged in ascending order and sampled systematically; this gives a fair sample of big and small firms for analysis
• It is less costly, and the data are easy to collect
• Random number generation need not be understood
• It can be applied even if a sampling frame is not available
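The house example can be reproduced in a few lines. This is a minimal sketch: the street of 40 houses and the `systematic_sample` helper are assumptions, and the start is fixed at 4 to match the example instead of being drawn at random.

```python
import random

def systematic_sample(frame, interval, start=None, seed=None):
    """Pick a start position in 1..interval, then take every interval-th element."""
    if start is None:
        start = random.Random(seed).randint(1, interval)
    # Positions are 1-based, as in the house example.
    return frame[start - 1::interval]

# Hypothetical street with houses numbered 1..40.
houses = list(range(1, 41))
picked = systematic_sample(houses, interval=5, start=4)

print(picked)  # [4, 9, 14, 19, 24, 29, 34, 39]
```

Leaving `start` unset draws the starting point at random, which is the only random step in the procedure.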
Stratified Sampling
• This sampling uses a two-step process. First, the population is partitioned into sub-populations, or strata. Second, elements are selected from each stratum by a random procedure
• The elements within each stratum should be as homogeneous as possible, and the strata should be as heterogeneous as possible
• People can be classified into strata on the basis of income, age, etc., and samples can be drawn from each stratum
• This increases precision without increasing cost
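The two steps can be sketched as below. The slides do not say how many elements to take from each stratum, so proportional allocation is an assumption here, as are the income strata, their sizes and the `stratified_sample` helper.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Proportional allocation, then a simple random draw within each stratum."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Sample share proportional to the stratum's share of the population.
        n_h = round(total_n * len(units) / population_size)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical income strata, each a list of unit IDs.
strata = {
    "low": list(range(0, 500)),
    "middle": list(range(500, 800)),
    "high": list(range(800, 1000)),
}
sample = stratified_sample(strata, total_n=100, seed=1)
sizes = {name: len(units) for name, units in sample.items()}

print(sizes)  # {'low': 50, 'middle': 30, 'high': 20}
```

Because every stratum contributes to the sample, no sub-population can be missed by chance.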
Cluster Sampling / Area Sampling
• This sampling uses a three-step process. First, the population is partitioned into sub-populations, or clusters. Second, a few clusters are selected by a random sampling procedure. Third, either all elements or a few elements from each selected cluster are included in the sample
• The elements within each cluster should be as heterogeneous as possible, and the clusters should be as homogeneous as possible
1. Stratified sampling: All strata are selected for sampling
   Cluster sampling: Few clusters are selected
2. Stratified sampling: Homogeneity within subgroups and heterogeneity between subgroups
   Cluster sampling: Heterogeneity within subgroups and homogeneity between subgroups
3. Stratified sampling: Elements are randomly selected from within each subgroup
   Cluster sampling: Only a few subgroups are randomly selected and all elements in those subgroups are covered
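A minimal sketch of the one-stage variant, in which every element of each chosen cluster is covered, is given below. The area names, the household labels and the `cluster_sample` helper are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """One-stage cluster sampling: pick clusters at random, keep all their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    elements = [unit for name in chosen for unit in clusters[name]]
    return chosen, elements

# Hypothetical areas (clusters), each listing its households.
clusters = {
    "Area 1": ["h1", "h2", "h3"],
    "Area 2": ["h4", "h5"],
    "Area 3": ["h6", "h7", "h8", "h9"],
    "Area 4": ["h10", "h11"],
}
chosen, elements = cluster_sample(clusters, n_clusters=2, seed=7)

print(chosen, elements)
```

Unlike stratified sampling, only the selected clusters contribute elements, so the frame needs to list clusters rather than every individual element.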
Comparison of sampling techniques
1. Convenience
   Merits: Least expensive, least time-consuming, most convenient
   Demerits: Selection bias, not representative, not suitable for descriptive and causal research
2. Judgmental
   Merits: Low cost, convenient, not time-consuming
   Demerits: Does not allow generalisation, subjective
3. Quota
   Merits: Sample can be controlled for certain characteristics
   Demerits: Selection bias, no assurance of representativeness
4. Snowball
   Merits: Can estimate rare characteristics
   Demerits: Time-consuming
5. Random
   Merits: Easy to understand, results could be projected
   Demerits: Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
6. Systematic
   Merits: Easy, no sampling frame, can increase the representativeness
   Demerits: No randomness
7. Stratified
   Merits: Includes all sub-populations, precision
   Demerits: Selection of criteria for stratification difficult, expensive
8. Cluster
   Merits: Easy, cost-effective
   Demerits: Difficult to cluster and interpret, imprecise