Marketing Research
ROLE OF MARKETING RESEARCH
Customer Groups: Consumers, Employees, Shareholders, Suppliers
Controllable Environment factors: Product, Price, Promotion, Distribution
Uncontrollable Environment factors: Economy, Technology, Competition, Regulations, Political factors, Social & Cultural factors
Marketing Research: Assessing information needs, Providing information, Marketing decision making
Marketing Managers: Market segmentation, Target market selection, Marketing programme, Performance and control
• Research is a process (or series of iterative steps) that is often followed when management is faced with a “problem” and/or “opportunity” and needs further information in order to make a decision. The need for market(ing) research is therefore an issue that is likely to need addressing.
The question is “when to conduct market(ing) research?”
When to Conduct Market(ing) Research
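1. Time Constraints: Is sufficient time available? If no, do not conduct market research. If yes, continue.
2. Availability of Data: Is the information on hand inadequate? If no, do not conduct market research. If yes, continue.
3. Nature of Decision: Is the decision of considerable importance? If no, do not conduct market research. If yes, continue.
4. Benefits vs. Costs: Does the value of the research exceed the cost? If no, do not conduct market research. If yes, conduct market research.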
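The flowchart can be read as a chain of yes/no checks in which any "no" ends the process. The following is a minimal Python sketch under that reading; the function and parameter names are illustrative assumptions, not part of the original slides.

```python
def should_conduct_research(sufficient_time: bool,
                            info_on_hand_inadequate: bool,
                            decision_important: bool,
                            value_exceeds_cost: bool) -> bool:
    """Return True only when every question in the flowchart is answered 'yes'."""
    checks = [
        sufficient_time,          # Time constraints
        info_on_hand_inadequate,  # Availability of data
        decision_important,       # Nature of decision
        value_exceeds_cost,       # Benefits vs. costs
    ]
    return all(checks)


# All four answers are "yes": conduct market research.
print(should_conduct_research(True, True, True, True))   # True
# The information already on hand is adequate: do not conduct market research.
print(should_conduct_research(True, False, True, True))  # False
```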
Example issues: (1) What is our market share? (2) Will people drink tomato soup from a plastic jar? (3) Whose machine tools do our potential customers buy? (4) Which medicine is preferred for a given disease?
When Research Should be Done
• If it clarifies problems or investigates changes in the marketplace that can directly impact your product responsibility
• If it resolves your selection of alternative courses of marketing action to achieve key marketing objectives
• If it helps you gain a meaningful competitive advantage
• If it allows you to stay abreast of your market(s)
Questions addressing the various stages of the Research Process
Stage in the Process: Typical Questions
1. Formulate problem: What is the purpose of the study: to solve a problem or to identify an opportunity? Is additional background information necessary? What information is needed to make the decision? How will the information be utilized? Should research be conducted?
2. Determine research design (exploratory / conclusive; descriptive and causal): How much is already known? Can a hypothesis be formulated? What types of questions need to be answered? What type of study will best address the research questions?
3. Determine data collection methods & forms: Can existing data be used to advantage? What is to be measured? How? What is the source of the data? Are there any cultural factors? Are there any restrictions on data collection methods? Can objective answers be obtained by asking people?
4. Design data collection forms: Should structured or unstructured items be used in collecting data? Should the purpose of the study be made known to respondents? Should a rating scale be used? What type of rating scale would be most appropriate?
5. Design sample & collect data: Who is the target population? Is a list of population elements available? Is a sample necessary? Is a probability sample desirable? How large should the sample be? What operational procedures will be followed? What methods will be used to ensure the quality of the data collected?
6. Analyze & interpret data
Report: Who will read the report? What is their technical level of sophistication? Are managerial recommendations called for? What will be the format of the written report? Is an oral report necessary? How should the oral report be structured?
The research process
The research process is a set of iterative steps and relationships:
Decisions requiring information -> Problem definition -> Research design -> Sampling -> Data analysis -> Presenting the results -> Management information
The Concept of Total Error
All research has error, and this affects the research outcome: its usability and accuracy.
Sources contributing to Total Error:
• Poor problem definition/formulation
• Poor logic
• Inadequate sample design
• Inadequate sample size
• Poor data collection methods
• Improper use of statistical procedures
• Poorly written research report
Problem definition steps
Management problem definition process -> Research problem definition process
Please note that this is sometimes called the research question or the research problem. Here it is called the “research problem”, and research questions are objectives that fit underneath the research problem.
Problem Definition
• Management problem:
– Focuses on the decision that management has to make and is action oriented (i.e. once the information is obtained, a course of action will be required). The management problem may include:
– Symptoms of failure to achieve an objective: management must select a course of action to regain it.
– Symptoms of likelihood of achieving an objective: management must decide how to seize the opportunity (opportunity identification).
Formulate Management Problem -> Formulate Research Problem
Problem Definition
• The research problem: how to provide relevant, accurate, and unbiased information that managers can use to solve their marketing management problems.
• The research problem is information oriented, and researchers need to do some investigation (e.g., ask questions, read information) before defining the research problem.
– Researchers, ask yourselves: is the issue that management is seeking answers to merely a symptom of X?
– Remember the iceberg principle:
• The symptoms are what we can see (e.g. falling sales).
• The issues (causes) are generally what we can't see; the issue below the surface is generally what needs investigating and therefore forms the research problem.
Examples of Management Problems and Research Problems
Management Problem: Develop package for new product.
Research Problem: Evaluate effectiveness of alternative package designs.

Management Problem: Increase store traffic.
Research Problem: Measure current image of the store.

Management Problem: Increase market penetration through the opening of new stores.
Research Problem: Evaluate prospective locations.
Ok, so we have a problem. How do we write the problem definition?
The management problem is decision / action oriented; the research problem is information oriented.

Management Problem: Should a new product be introduced?
Research Problem: To determine consumer preferences and purchase intentions for the proposed new product.

Management Problem: Should the advertising campaign be changed?
Research Problem: To determine the effectiveness of the current advertising campaign.

Management Problem: Should the price of the brand be increased?
Research Problem: To determine the price elasticity of demand and the impact on sales and profits of various levels of price changes.
To help you develop and write the research problem and research objectives, consult other sources of information: ask questions, rely on experience, and search industry information and academic journals (theory). This is an iterative and difficult process.
The problem definition process
How much is this information worth? Estimate the value of information.
Management problem
• What decision needs to be made?
• What sort of actions may occur once the research has been completed?
Research problem
• Can you delineate the symptoms from the causes/issues?
• What information is required to solve the management problem?
Research objectives
• Specific, measurable information requirements that will allow a researcher to answer the research problem, i.e. RO1, RO2, RO3, ...
• Research objectives (also sometimes called research questions) provide the “blueprint” for the research, as the objectives set the scene for “what needs to be done”.
Marketing Research
Problem identification research:
• Market Potential Research
• Market Share Research
• Image Research
• Market Characteristics Research
• Sales Analysis Research
• Forecasting Research
• Business Trends Research
Problem solving research:
• Segmenting Research: basis of segmentation, finding out the response of segments, selection of target segment
• Product Research: test, design, packaging, modification, positioning and repositioning
• Pricing Research: price policy, line policy, price elasticity, customer response
• Promotion Research: promotion budget, relationship with other tools, media decision, testing, effectiveness
• Distribution Research: type of distribution, channel members, intensity of coverage, margins, location of channel members
Session - 2
Marketing Research Defined (AMA)
“Marketing research is the function which links the consumer and the customer to the organization through information: information used to identify and define marketing problems; generate, refine, and evaluate marketing actions; monitor marketing performance; and improve our understanding of marketing as a process.”
The role of marketing research within the marketing system
THE ROLE OF MARKETING RESEARCH
MARKETING RESEARCH: A FORMAL COMMUNICATION LINK WITH ENVIRONMENT
PROVIDE ACCURATE AND USEFUL INFORMATION BY: a) specifying b) collecting c) analyzing d) interpreting
FOR BETTER DECISION MAKING IN: a) planning b) problem-solving c) control
NATURE OF MARKETING RESEARCH
• Applied/problem-solving research
• Often based on cost-benefit analysis
• Vital for implementation of the marketing concept
• Value of information declines with time
• Dynamic (ongoing)
DRIVERS OF MARKETING RESEARCH
• Shift from production to customer orientation
• Declining cost of unit information (digital age)
• Increased intensity of competition
• Globalization
• Technology and commercialization
Factors shaping the Marketing Research Industry
Competitor Intelligence Low cost survey providers Surveys to generate sales & PR
Customer Analytics
The nature and future of Marketing Research
Internet, e.g. online panels
‘Value for money’ marketing
‘Strategic’ consultants
‘Respondent’ rewards
Reasons for Doing Marketing Research: The Five Cs
1. Customers: To determine how well customer needs are being met, investigate new target markets, and assess and test new services and facilities.
2. Competition: To identify primary competitors and pinpoint their strengths and weaknesses.
3. Confidence: To reduce the perceived risk in making marketing decisions.
4. Credibility: To increase the believability of promotional messages among customers.
5. Change: To keep updated with changes in travelers’ needs and expectations.
Reasons for Not Doing Marketing Research
1. Timing: It will take too much time.
2. Cost: The cost of the research is too high.
3. Reliability: There is no reliable research method available for doing the research.
4. Competitive intelligence: There is a fear that competitors will learn about the organization’s intentions.
5. Management decision: Management prefers to use its own judgment.
Five Key Requirements of Marketing Research Information
1. Utility: Can we use it? Does it apply to us?
2. Timeliness: Will it be available in time?
3. Cost-effectiveness: Do the benefits outweigh the costs?
4. Accuracy: Is it accurate?
5. Reliability: Is it reliable?
Classification of marketing research
Examples of problem-solving research
Problem Definition Process
Environmental Context of the problem
Tasks involved in problem definition
Discussion with decision makers
Interviews with experts
Secondary data analysis
Qualitative research
Management decision problem
Marketing research problem
Factors to Consider - Environmental Context
• Past information and forecasts
• Resources and constraints
• Objectives (organizational & decision maker)
• Buyer behavior
• Legal environment
• Economic environment
• Marketing and technological skills
Defining the Research Problem
The definition of the research problem should:
• Allow the researcher to obtain all the information needed to address the management decision problem
• Guide the researcher in formulating the research design
A broad definition does not provide clear guidelines for the subsequent steps involved in the project, e.g.:
• Developing a marketing strategy for the brand
• Improving the competitive position of the firm
• Improving the company’s image
Define Research Design
A framework or blueprint for conducting the marketing research project.
Details the procedures necessary for obtaining the information needed to structure or solve marketing research problems
A Classification of Marketing Research Designs
Research Design
  Exploratory Research Design
  Conclusive Research Design
    Descriptive Research
      Cross-Sectional Design
      Longitudinal Design
    Causal Research
Differences Between Exploratory and Conclusive Research
Exploratory
Objective: To provide insights and understanding.
Characteristics: Information needed is defined loosely. Research process is flexible and unstructured. Sample is small and nonrepresentative. Analysis of primary data is qualitative.
Findings: Tentative.
Outcome: Followed by conclusive research.
Conclusive
Objective: To test hypotheses and examine relationships.
Characteristics: Information needed is clearly defined. Research process is formal and structured. Sample is large and representative. Data analysis is quantitative.
Findings: Conclusive.
Outcome: Findings input into decision making.
Exploratory Research: Overview
Characteristics: flexible, versatile, but not conclusive
Useful for: discovery of ideas and insights, formulating problems more precisely, identifying alternative courses of action, establishing priorities for further research
Methods used: case studies, secondary data, focus groups, qualitative research
When done: generally the initial research conducted to clarify and define the nature of a problem
Does not provide conclusive evidence: subsequent research is expected
Descriptive Research: Overview
Characteristics: describes characteristics of a population or phenomenon; assumes some understanding of the nature of the problem; preplanned, structured, conclusive
Useful for: describing market characteristics or functions
Methods used: surveys (primary data), panels, scanner data (secondary data)
When used: often a follow-up to exploratory research
Examples include:
• Market segmentation studies, i.e., describing the characteristics of various groups
• Determining perceptions of product characteristics
• Price and promotion elasticity studies
• Sales potential studies for a particular geographic region or population segment
Examples of Descriptive Studies
• Market studies that describe the size of the market, buying power of the consumers, availability of distributors, and consumer profiles
• Market share studies that determine the proportion of total sales received by a company and its competitors
• Sales analysis studies that describe sales by geographic region, product line, type of account, and size of account
• Image studies that determine consumer perceptions of the firm and its products
• Product usage studies that describe consumption patterns
• Distribution studies that determine traffic flow patterns and the number and location of distributors
• Pricing studies that describe the range and frequency of price changes and probable response to proposed price changes
• Advertising studies that describe media consumption habits and audience profiles for specific television programs and magazines
A Comparison of Basic Research Designs
Exploratory
Objective: Discovery of ideas
Characteristics: Flexible, versatile. Front end research.
Methods: Secondary data
Descriptive
Objective: Describes market characteristics
Characteristics: Prior formulation of hypothesis. Planned, structured design.
Methods: Surveys
Causal
Objective: Determine cause and effect
Characteristics: Manipulate independent variables. Control of other variables.
Methods: Experiments
Classification of Marketing Research Data
Marketing Research Data
  Secondary Data
  Primary Data
    Qualitative Data
    Quantitative Data
      Descriptive
        Survey Data
        Observational & Other Data
      Causal
        Experimental Data
Relationship among Exploratory, Descriptive and causal Research
Session - 3
Sampling Design
[Diagram: Problem definition; Research design (Exploratory, Descriptive, Causal); Sampling (Non-probability, Probability); Data collection & analysis; Recommendations; Management information systems]
Sample or Census
A population is the aggregate of all the elements that share some common set of characteristics, and that comprise the universe for the purpose of the marketing research problem. The population parameters are typically numbers, such as the proportion of consumers who are loyal to a particular brand of toothpaste. Information about population parameters may be obtained by taking a census or a sample.
A census involves a complete enumeration of the elements of a population. The population parameters can be calculated directly in a straightforward way after the census is enumerated (that is, after each element has been specified individually).
A sample is a subgroup of the population selected for participation in the study. Sample characteristics, called statistics, are then used to make inferences about the population parameters. The inferences that link sample characteristics and population parameters are estimation procedures and tests of hypotheses.
Sample Versus Census
Condition | Favors Sample | Favors Census
Budget | Small | Large
Time available | Short | Long
Population size | Large | Small
Variance in characteristics | Small | Large
Cost of sampling error | Low | High
Cost of non-sampling error | High | Low
Attention to individual cases | Yes | No
Sampling
Sampling is the process of selecting a sufficient number of elements from the population so that, by studying the sample and understanding the properties or characteristics of the sample subjects, it is possible to generalise those properties or characteristics to the population elements. The more representative the sample is of the population, the more generalisable the findings of the research are.
Sampling design: key terms
• Population (N): the entire group of people, events or things of interest that the researcher wishes to investigate
• Population element: a single member of the population
• Sampling frame: a list of all elements of the population from which the sample is drawn
• Sample (n): a subset of the population selected for the specific research study
• Sample unit (subject): a single element selected in the sample; could be a group (sampling could be a two-stage process)
• Census: an investigation of all individual elements that make up the population
Why sample?
• Time
• Cost
• Accuracy
• Population may be difficult to access
• Greater depth of information
Managerial objectives of sampling
• Representative
• Reliable
• Efficient as time permits
Errors associated with sampling
• Sampling frame error: an error that occurs when certain sample elements are not listed or are not accurately represented in a sampling frame (occurs between the population and the sampling frame)
• Random sampling error: occurs between the sampling frame and the planned sample for study
• Non-response error: the statistical difference between a survey that includes only those who responded and a perfect survey that would also include those who failed to respond (occurs between the planned sample and the respondents, i.e. the actual sample)
Sampling design process
Step 1: Define Population. The entire group under study as defined by the research objectives.
Step 2: Establish Sampling Frame. A list of sampling units from which a sample will be drawn; the list could consist of geographic areas, institutions, individuals or other units.
Step 3: Choose sampling technique/method. The method of selecting the sampling units: probability (random) vs. non-probability (non-random).
Step 4: Determine sample size. With a non-probability sampling method, this involves some judgement based on time, cost and the analysis required; with probability sampling, it is based on a statistical determination of sample size.
Step 5: Identify and select sample unit (subject). Follow procedures based on the sampling technique selected.
Classification of Sampling Techniques
Sampling Techniques
  Nonprobability Sampling Techniques
    Convenience Sampling
    Judgmental Sampling
    Quota Sampling
    Snowball Sampling
  Probability Sampling Techniques
    Simple Random Sampling
    Systematic Sampling
    Stratified Sampling
    Cluster Sampling
    Other Sampling Techniques
Non Probability Sampling
• Each sampling unit of the population being studied does not have an equal chance of being included in the study (due to the way the sample is selected)
• Non-random (the selection process is subjective)
• Researchers rely heavily on personal judgement
• Projecting the findings beyond the sample is statistically inappropriate
• Used when the researcher is less concerned about generalisability and other factors, such as time or the need for preliminary information, are more important
Common sampling approaches: convenience, judgement, quota, snowball
Convenience Sample
• Also known as haphazard or accidental sampling
• Based on convenient availability of sampling units
• Sample units happen to be in a certain place at a certain time, e.g. high-traffic locations such as shopping malls and pedestrian areas
• Acceptable only in the pre-test/exploration phase, when further research will use probability sampling
• Representativeness highly uncertain
• Quota sampling can reduce some of the sample selection error
Judgement Sampling
• An experienced individual (could be the researcher) selects the sample based on personal judgement about some appropriate characteristics suited to the study
• Focus group studies use this method
Quota Samples
• Various subgroups in a population are represented based on pertinent characteristics
• Haphazard selection of respondents may introduce bias
• Similar to stratified random sampling
Snowball Sampling
• A judgement sample that relies on the researcher's ability to locate an initial set of respondents with the desired characteristics; these individuals are then used as informants to identify others with the desired characteristics
• Acceptable when sample units are difficult to locate
• Advantages: reduced sample size and costs
Probability Sampling
• In a probability sample each element in the population has some known chance or probability of being included in the sample
• Used when the representativeness of the sample is important for generalisability of results
• Random selection of the sample guards against selection bias
Probability Sampling cont.
• Statistical efficiency: for the same sample size, a smaller standard error of the mean is obtained
• Economic efficiency: precision refers to the level of uncertainty about the characteristic being measured; precision is inversely related to sampling error and positively related to cost
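The link between precision, sampling error and cost can be illustrated with the standard error of the mean. The sketch below assumes simple random sampling with the textbook formula SE = s / sqrt(n) and ignores any finite population correction; the standard deviation of 10 and the sample sizes are hypothetical values chosen only for illustration.

```python
import math


def standard_error_of_mean(sample_sd: float, n: int) -> float:
    """Standard error of the mean under simple random sampling: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sample_sd / math.sqrt(n)


# Hypothetical standard deviation of 10: quadrupling the sample size
# (and, roughly, the data collection cost) only halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error_of_mean(10.0, n))
```

Under these assumptions the standard errors are 2.0, 1.0 and 0.5, which is one way to see why greater precision becomes progressively more expensive.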
Types of probability sampling
• Simple random sample
• Systematic sampling
• Stratified sampling (proportionate, disproportionate)
• Cluster sampling
• Area sampling
Simple Random Sampling
• Assures each element in the population of an equal chance of being included in the sample
• Blind draw: putting all names in a hat and drawing out a sample of 100 (size has been statistically calculated)
• Random numbers
• Need to begin with a complete list of the population, which is sometimes difficult to obtain
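The blind draw can be sketched in a few lines of Python using the standard library. The sampling frame of 500 customer identifiers, the function name and the seed are hypothetical; the sample size of 100 follows the slide's example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; every element has an equal chance."""
    rng = random.Random(seed)  # a seed is used only to make the draw repeatable
    return rng.sample(list(frame), n)


# Hypothetical sampling frame: a complete list of 500 customers.
frame = [f"customer_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), len(set(sample)))  # 100 distinct customers
```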
Systematic Sampling
• A starting point is selected by a random process and then every nth element on the list is selected
• Calculate skip interval = population list size / sample size (size has been statistically calculated)
• Danger of periodicity if the list has a systematic pattern
• Can be more representative than a simple random sample
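A minimal sketch of the skip-interval procedure follows. It assumes the list size is at least the sample size and uses integer division for the skip interval; the frame of 1,000 numbered elements and the function name are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every k-th element (k = skip interval)."""
    frame = list(frame)
    skip = len(frame) // n  # skip interval = population list size / sample size
    if skip < 1:
        raise ValueError("sample size cannot exceed the list size")
    rng = random.Random(seed)
    start = rng.randrange(skip)  # random starting point within the first interval
    return [frame[start + i * skip] for i in range(n)]


# Hypothetical frame of 1,000 elements and a sample of 100: skip interval is 10.
frame = list(range(1000))
sample = systematic_sample(frame, 100, seed=1)
print(sample[:5])
```

If the list is ordered in a cycle that matches the skip interval, every selected element falls at the same point of the cycle, which is the periodicity danger noted on the slide.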
Stratified Sampling
• Simple random sub-samples are drawn from within each stratum of the population; the elements within a stratum are more or less equal on some characteristic
• Greater degree of representativeness
• Two types:
– proportionate: the sample size of each stratum is relative to the size of that stratum in the population
– disproportionate: the sample size of each stratum does not reflect the strata's relative proportions in the population
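Proportionate allocation can be sketched as follows. The strata (urban, suburban, rural) and their sizes are hypothetical, and the simple rounding used here may not always sum exactly to the requested sample size, so treat it as an illustration rather than a complete allocation rule.

```python
import random


def proportionate_allocation(stratum_sizes, n):
    """Allocate n across strata in proportion to each stratum's population size."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}


def stratified_sample(strata, n, seed=None):
    """Draw a simple random sub-sample from within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportionate_allocation(sizes, n)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}


# Hypothetical population of 1,000 households in three strata.
strata = {
    "urban": [f"urban_{i}" for i in range(600)],
    "suburban": [f"suburban_{i}" for i in range(300)],
    "rural": [f"rural_{i}" for i in range(100)],
}
sample = stratified_sample(strata, 50, seed=7)
print({name: len(members) for name, members in sample.items()})
```

With these sizes the allocation is 30 urban, 15 suburban and 5 rural households. A disproportionate design would replace `proportionate_allocation` with stratum sample sizes chosen on other grounds.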
Cluster Sampling
• Divides the population into groups (clusters), any one of which can be considered a representative sample
• An economically efficient technique in which the primary sampling unit is not the individual element but a large cluster of elements
• Clusters are selected randomly
• A random sample is drawn from within each selected cluster
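The two stages described above (randomly select clusters, then randomly sample within each selected cluster) might look like the sketch below. The city blocks, household identifiers and function name are hypothetical.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: random sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical population: 10 city blocks (clusters) of 20 households each.
clusters = {
    f"block_{b}": [f"block_{b}_household_{h}" for h in range(20)]
    for b in range(10)
}
sample = two_stage_cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=3)
print({name: len(members) for name, members in sample.items()})
```

Only the three selected blocks need to be visited, which is the source of the economic efficiency mentioned on the slide.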
Strengths and Weaknesses of Basic Sampling Techniques
Nonprobability Sampling
Convenience sampling
Strengths: Least expensive, least time-consuming, most convenient
Weaknesses: Selection bias, sample not representative, not recommended for descriptive or causal research
Judgmental sampling
Strengths: Low cost, convenient, not time-consuming
Weaknesses: Does not allow generalization, subjective
Quota sampling
Strengths: Sample can be controlled for certain characteristics
Weaknesses: Selection bias, no assurance of representativeness
Snowball sampling
Strengths: Can estimate rare characteristics
Weaknesses: Time-consuming
Probability Sampling
Simple random sampling (SRS)
Strengths: Easily understood, results projectable
Weaknesses: Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
Systematic sampling
Strengths: Can increase representativeness, easier to implement than SRS, sampling frame not necessary
Weaknesses: Can decrease representativeness
Stratified sampling
Strengths: Includes all important subpopulations, precision
Weaknesses: Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive
Cluster sampling
Strengths: Easy to implement, cost effective
Weaknesses: Imprecise, difficult to compute and interpret results
Choosing probability vs. non-probability sampling
Evaluation Criteria | Non-probability sampling | Probability sampling
Nature of research | Exploratory | Conclusive
Relative magnitude of sampling and non-sampling error | Larger non-sampling error | Larger sampling errors
Population variability | Low [Homogeneous] | High [Heterogeneous]
Statistical considerations | Unfavorable | Favorable
Sophistication needed | Low | High
Time | Relatively shorter | Relatively longer
Budget needed | Low | High
Selecting an Appropriate Design
• Degree of accuracy
• Resources
• Time
• Advance knowledge of the population
• National versus local projects
• Need for statistical analysis
Session - 4
Measurement and Scaling
Measurement means assigning numbers or other symbols to characteristics of objects according to certain pre-specified rules. One-to-one correspondence between the numbers and the characteristics being measured. The rules for assigning numbers should be standardized and applied uniformly. Rules must not change over objects or time.
Measurement and Scaling
Scaling involves creating a continuum upon which measured objects are located. Consider an attitude scale from 1 to 100. Each respondent is assigned a number from 1 to 100, with 1 = Extremely Unfavorable, and 100 = Extremely Favorable. Measurement is the actual assignment of a number from 1 to 100 to each respondent. Scaling is the process of placing the respondents on a continuum with respect to their attitude toward department stores.
Primary Scales of Measurement
Example: three runners finishing a race
Nominal: Numbers assigned to runners: 7, 8, 3
Ordinal: Rank order of winners: Third place, Second place, First place
Interval: Performance rating on a 0 to 10 scale: 8.2, 9.1, 9.6
Ratio: Time to finish, in seconds: 15.2, 14.1, 13.4
Primary Scales of Measurement Nominal Scale
The numbers serve only as labels or tags for identifying and classifying objects.
When used for identification, there is a strict one-to-one correspondence between the numbers and the objects.
The numbers do not reflect the amount of the characteristic possessed by the objects.
The only permissible operation on the numbers in a nominal scale is counting. Only a limited number of statistics, all of which are based on frequency counts, are permissible, e.g., percentages, and mode.
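Because counting is the only permissible operation on nominal data, the usable statistics reduce to frequencies, percentages and the mode. The sketch below illustrates this with hypothetical store-type labels; the data are invented for the example.

```python
from collections import Counter

# Hypothetical nominal data: the store type named by each of six respondents.
store_types = ["department", "discount", "discount",
               "specialty", "discount", "department"]

counts = Counter(store_types)  # frequency count per category
total = sum(counts.values())
percentages = {label: 100 * count / total for label, count in counts.items()}
mode = counts.most_common(1)[0][0]  # the most frequent category

print(counts["discount"], percentages["discount"], mode)
```

Averaging the labels (or any numeric codes assigned to them) would not be meaningful, since the numbers on a nominal scale carry no amount of the characteristic.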
Illustration of Primary Scales of Measurement
No. (Nominal) | Store | Preference Rankings (Ordinal) | Preference Rankings, alternative numbers (Ordinal) | Preference Ratings 1-7 (Interval) | Preference Ratings 11-17 (Interval) | $ spent last 3 months (Ratio)
1 | Lord & Taylor | 7 | 79 | 5 | 15 | 0
2 | Macy’s | 2 | 25 | 7 | 17 | 200
3 | Kmart | 8 | 82 | 4 | 14 | 0
4 | Rich’s | 3 | 30 | 6 | 16 | 100
5 | J.C. Penney | 1 | 10 | 7 | 17 | 250
6 | Neiman Marcus | 5 | 53 | 5 | 15 | 35
7 | Target | 9 | 95 | 4 | 14 | 0
8 | Saks Fifth Avenue | 6 | 61 | 5 | 15 | 100
9 | Sears | 4 | 45 | 6 | 16 | 0
10 | Wal-Mart | 10 | 115 | 2 | 12 | 10
Primary Scales of Measurement Ordinal Scale
• A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic.
• Can determine whether an object has more or less of a characteristic than some other object, but not how much more or less.
• Any series of numbers can be assigned that preserves the ordered relationships between the objects.
• In addition to the counting operation allowable for nominal scale data, ordinal scales permit the use of statistics based on centiles, e.g., percentile, quartile, median.
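The point that any order-preserving series of numbers works equally well can be checked with the two preference-ranking series from the department store illustration (7, 2, 8, ... and 79, 25, 82, ...). The helper name `to_ranks` is an assumption for this sketch, and it relies on the values having no ties.

```python
import statistics


def to_ranks(values):
    """Convert a series of distinct numbers to ranks (1 = smallest value)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# The two ordinal series for the ten stores in the illustration.
rankings = [7, 2, 8, 3, 1, 5, 9, 6, 4, 10]
alternative = [79, 25, 82, 30, 10, 53, 95, 61, 45, 115]

# Both series carry the same ordinal information.
print(to_ranks(alternative) == rankings)
# A centile-based statistic such as the median is permissible for ordinal data.
print(statistics.median(rankings))
```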
Primary Scales of Measurement Interval Scale
• Numerically equal distances on the scale represent equal values in the characteristic being measured.
• It permits comparison of the differences between objects.
• The location of the zero point is not fixed. Both the zero point and the units of measurement are arbitrary.
• Any positive linear transformation of the form y = a + bx will preserve the properties of the scale.
• It is not meaningful to take ratios of scale values.
• Statistical techniques that may be used include all of those that can be applied to nominal and ordinal data, and in addition the arithmetic mean, standard deviation, and other statistics commonly used in marketing research.
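The effect of a positive linear transformation can be seen with the preference ratings from the department store illustration, where the 11-17 ratings equal the 1-7 ratings with a = 10 and b = 1. The sketch below (function name assumed for illustration) shows that ratios of scale values change under the transformation, while ratios of differences do not.

```python
def linear_transform(x, a, b):
    """Positive linear transformation y = a + b*x (b must be positive)."""
    if b <= 0:
        raise ValueError("b must be positive")
    return a + b * x


# Preference ratings on the 1-7 scale for the ten stores in the illustration.
ratings_1_to_7 = [5, 7, 4, 6, 7, 5, 4, 5, 6, 2]
ratings_11_to_17 = [linear_transform(x, a=10, b=1) for x in ratings_1_to_7]
print(ratings_11_to_17)

# Ratio of scale values (Macy's vs. Lord & Taylor) is not preserved: 7/5 vs. 17/15.
print(7 / 5, 17 / 15)

# Ratio of differences is preserved:
# (Macy's - Lord & Taylor) / (Rich's - Lord & Taylor) equals 2 on both scales.
print((7 - 5) / (6 - 5), (17 - 15) / (16 - 15))
```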
Primary Scales of Measurement Ratio Scale
• Possesses all the properties of the nominal, ordinal, and interval scales.
• It has an absolute zero point.
• It is meaningful to compute ratios of scale values.
• Only proportionate transformations of the form y = bx, where b is a positive constant, are allowed.
• All statistical techniques can be applied to ratio data.
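Primary Scales of Measurement
Nominal
Basic characteristics: Numbers identify & classify objects
Common examples: Social Security nos., numbering of football players
Marketing examples: Brand nos., store types
Permissible statistics, descriptive: Percentages, mode
Permissible statistics, inferential: Chi-square, binomial test
Ordinal
Basic characteristics: Nos. indicate the relative positions of objects but not the magnitude of differences between them
Common examples: Quality rankings, rankings of teams in a tournament
Marketing examples: Preference rankings, market position, social class
Permissible statistics, descriptive: Percentile, median
Permissible statistics, inferential: Rank-order correlation, Friedman ANOVA
Interval
Basic characteristics: Differences between objects can be compared
Common examples: Temperature (Fahrenheit)
Marketing examples: Attitudes, opinions, index
Permissible statistics, descriptive: Range, mean, standard deviation
Permissible statistics, inferential: Product-moment correlation
Ratio
Basic characteristics: Zero point is fixed, ratios of scale values can be compared
Common examples: Length, weight
Marketing examples: Age, sales, income, costs
Permissible statistics, descriptive: Geometric mean, harmonic mean
Permissible statistics, inferential: Coefficient of variation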
A Classification of Scaling Techniques
Scaling Techniques
  Comparative Scales
    Paired Comparison
    Rank Order
    Constant Sum
    Q-Sort and Other Procedures
  Noncomparative Scales
    Continuous Rating Scales
    Itemized Rating Scales
      Likert
      Semantic Differential
      Stapel
A Comparison of Scaling Techniques
• Comparative scales involve the direct comparison of stimulus objects. Comparative scale data must be interpreted in relative terms and have only ordinal or rank order properties.
• In non-comparative scales, each object is scaled independently of the others in the stimulus set. The resulting data are generally assumed to be interval or ratio scaled.
Relative Advantages of Comparative Scales
• Small differences between stimulus objects can be detected.
• Same known reference points for all respondents.
• Easily understood and can be applied.
• Involve fewer theoretical assumptions.
• Tend to reduce halo or carryover effects from one judgment to another.
Relative Disadvantages of Comparative Scales
• Ordinal nature of the data.
• Inability to generalize beyond the stimulus objects scaled.
Comparative Scaling Techniques Paired Comparison Scaling
• A respondent is presented with two objects and asked to select one according to some criterion.
• The data obtained are ordinal in nature.
• Paired comparison scaling is the most widely-used comparative scaling technique.
• Under the assumption of transitivity, it is possible to convert paired comparison data to a rank order.
Obtaining Shampoo Preferences Using Paired Comparisons
Instructions: We are going to present you with ten pairs of shampoo brands. For each pair, please indicate which one of the two brands of shampoo you would prefer for personal use.
Recording Form:
                               Jhirmack   Finesse   Vidal Sassoon   Head & Shoulders   Pert
Jhirmack                          -          0            0                1             0
Finesse                           1 (a)      -            0                1             0
Vidal Sassoon                     1          1            -                1             1
Head & Shoulders                  0          0            0                -             0
Pert                              1          1            0                1             -
Number of times preferred (b)     3          2            0                4             1
(a) A 1 in a particular box means that the brand in that column was preferred over the brand in the corresponding row. A 0 means that the row brand was preferred over the column brand.
(b) The number of times a brand was preferred is obtained by summing the 1s in each column.
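The conversion from paired comparisons to a rank order can be sketched in a few lines of Python. This is only an illustration: the matrix is transcribed from the recording form above, and the function and variable names are assumptions introduced here, not part of the original material.

```python
# Brands in the order used on the recording form.
brands = ["Jhirmack", "Finesse", "Vidal Sassoon", "Head & Shoulders", "Pert"]

# preferred[row][col] == 1 means the column brand was preferred over the row brand.
# Diagonal cells (a brand compared with itself) are unused and set to 0.
preferred = [
    [0, 0, 0, 1, 0],  # row: Jhirmack
    [1, 0, 0, 1, 0],  # row: Finesse
    [1, 1, 0, 1, 1],  # row: Vidal Sassoon
    [0, 0, 0, 0, 0],  # row: Head & Shoulders
    [1, 1, 0, 1, 0],  # row: Pert
]


def times_preferred(matrix):
    """Sum the 1s in each column to count how often each brand was preferred."""
    size = len(matrix)
    return [sum(matrix[row][col] for row in range(size)) for col in range(size)]


counts = times_preferred(preferred)

# Under the transitivity assumption, sorting by the counts gives a rank order.
ranking = sorted(zip(brands, counts), key=lambda pair: pair[1], reverse=True)

# Number of pairs a respondent must judge for n brands: n(n - 1) / 2.
n_pairs = len(brands) * (len(brands) - 1) // 2

for position, (brand, count) in enumerate(ranking, start=1):
    print(position, brand, count)
```

With five brands the respondent judges ten pairs, which matches the instructions on the form.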
Paired Comparison Scaling
The most common method of taste testing is paired comparison. The consumer is asked to sample two different products and select the one with the most appealing taste. The test is done in private and a minimum of 1,000 responses is considered an adequate sample. A blind taste test for a soft drink, where imagery, self-perception and brand reputation are very important factors in the consumer’s purchasing decision, may not be a good indicator of performance in the marketplace. The introduction of New Coke illustrates this point. New Coke was heavily favored in blind paired comparison taste tests, but its introduction was less than successful, because image plays a major role in the purchase of Coke.
Comparative Scaling Techniques Rank Order Scaling
Respondents are presented with several objects simultaneously and asked to order or rank them according to some criterion.
It is possible that the respondent may dislike the brand ranked 1 in an absolute sense. Furthermore, rank order scaling also results in ordinal data. Only (n - 1) scaling decisions need be made in rank order scaling.
Preference for Toothpaste Brands Using Rank Order Scaling
Instructions: Rank the various brands of toothpaste in order of preference. Begin by picking out the one brand that you like most and assign it a number 1. Then find the second most preferred brand and assign it a number 2. Continue this procedure until you have ranked all the brands of toothpaste in order of preference. The least preferred brand should be assigned a rank of 10. No two brands should receive the same rank number.
The criterion of preference is entirely up to you. There is no right or wrong answer. Just try to be consistent.
Preference for Toothpaste Brands Using Rank Order Scaling
Form
Brand              Rank Order
1. Crest           _________
2. Colgate         _________
3. Aim             _________
4. Gleem           _________
5. Sensodyne       _________
6. Ultra Brite     _________
7. Close Up        _________
8. Pepsodent       _________
9. Plus White      _________
10. Stripe         _________
Comparative Scaling Techniques Constant Sum Scaling
Respondents allocate a constant sum of units, such as 100 points to attributes of a product to reflect their importance. If an attribute is unimportant, the respondent assigns it zero points. If an attribute is twice as important as some other attribute, it receives twice as many points. The sum of all the points is 100. Hence, the name of the scale.
Importance of Bathing Soap Attributes Using a Constant Sum Scale
Instructions
On the next slide, there are eight attributes of bathing soaps. Please allocate 100 points among the attributes so that your allocation reflects the relative importance you attach to each attribute. The more points an attribute receives, the more important the attribute is. If an attribute is not at all important, assign it zero points. If an attribute is twice as important as some other attribute, it should receive twice as many points.
Importance of Bathing Soap Attributes Using a Constant Sum Scale
Form: Average Responses of Three Segments
Attribute            Segment I   Segment II   Segment III
1. Mildness              8           2             4
2. Lather                2           4            17
3. Shrinkage             3           9             7
4. Price                53          17             9
5. Fragrance             9           0            19
6. Packaging             7           5             9
7. Moisturizing          5           3            20
8. Cleaning Power       13          60            15
Sum                    100         100           100
Q – Sort Scaling
A comparative scaling technique that uses a rank order procedure to sort objects based on similarity with respect to some criterion.
Session - 5
Non - comparative Scaling Techniques
Respondents evaluate only one object at a time, and for this reason non-comparative scales are often referred to as monadic scales. Non-comparative techniques consist of continuous and itemized rating scales.
Continuous Rating Scale
Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other. The form of the continuous scale may vary considerably.
How would you rate Sears as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0   10   20   30   40   50   60   70   80   90   100
Version 3
Very bad                    Neither good nor bad                    Very good
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0   10   20   30   40   50   60   70   80   90   100
RATE: Rapid Analysis and Testing Environment
A relatively new research tool, the perception analyzer, provides continuous measurement of “gut reaction.” A group of up to 400 respondents is presented with TV or radio spots or advertising copy. The measuring device consists of a dial that contains a 100-point range. Each participant is given a dial and instructed to continuously record his or her reaction to the material being tested.
As the respondents turn the dials, the information is fed to a computer, which tabulates second-by-second response profiles. As the results are recorded by the computer, they are superimposed on a video screen, enabling the researcher to view the respondents' scores immediately. The responses are also stored in a permanent data file for use in further analysis. The response scores can be broken down by categories, such as age, income, sex, or product usage.
Itemized Rating Scales
The respondents are provided with a scale that has a number or brief description associated with each category.
The categories are ordered in terms of scale position, and the respondents are required to select the specified category that best describes the object being rated.
The commonly used itemized rating scales are the Likert, semantic differential, and Stapel scales.
Likert Scale
The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.
                                            SD    D    Neither A or D    A    SA
1. Sears sells high quality merchandise.    1     2X         3           4    5
2. Sears has poor in-store service.         1     2X         3           4    5
3. I like to shop at Sears.                 1     2          3X          4    5
The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated. When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
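A minimal sketch of the summated score with reverse scoring, using the three Sears statements and the marked responses (2, 2 and 3). The item keys and the function name are hypothetical labels chosen for this illustration.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # 1 = strongly disagree, 5 = strongly agree


def summated_score(responses, negative_items):
    """Add up item scores, reversing the scale for negatively worded statements."""
    total = 0
    for item, score in responses.items():
        if item in negative_items:
            # Reverse scoring on a 1-5 scale: 1 <-> 5, 2 <-> 4, 3 stays 3.
            score = SCALE_MIN + SCALE_MAX - score
        total += score
    return total


# Responses marked with an X on the form above.
responses = {
    "high_quality_merchandise": 2,
    "poor_in_store_service": 2,  # negative statement
    "like_to_shop": 3,
}

total = summated_score(responses, negative_items={"poor_in_store_service"})
print(total)
```

Reversing the negative item turns its 2 into a 4, so a higher total consistently means a more favorable attitude.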
Semantic Differential Scale
The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.
SEARS IS:
Powerful    --:--:--:--:-X-:--:--: Weak
Unreliable  --:--:--:--:--:-X-:--: Reliable
Modern      --:--:--:--:--:--:-X-: Old-fashioned
The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right. This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels. Individual items on a semantic differential scale may be scored on either a -3 to +3 or a 1 to 7 scale.
A Semantic Differential Scale for Measuring Self-Concepts, Person Concepts, and Product Concepts
1) Rugged          :---:---:---:---:---:---:---: Delicate
2) Excitable       :---:---:---:---:---:---:---: Calm
3) Uncomfortable   :---:---:---:---:---:---:---: Comfortable
4) Dominating      :---:---:---:---:---:---:---: Submissive
5) Thrifty         :---:---:---:---:---:---:---: Indulgent
6) Pleasant        :---:---:---:---:---:---:---: Unpleasant
7) Contemporary    :---:---:---:---:---:---:---: Obsolete
8) Organized       :---:---:---:---:---:---:---: Unorganized
9) Rational        :---:---:---:---:---:---:---: Emotional
10) Youthful       :---:---:---:---:---:---:---: Mature
11) Formal         :---:---:---:---:---:---:---: Informal
12) Orthodox       :---:---:---:---:---:---:---: Liberal
13) Complex        :---:---:---:---:---:---:---: Simple
14) Colorless      :---:---:---:---:---:---:---: Colorful
15) Modest         :---:---:---:---:---:---:---: Vain
Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
SEARS
     +5               +5
     +4               +4
     +3               +3
     +2               +2X
     +1               +1
HIGH QUALITY     POOR SERVICE
     -1               -1
     -2               -2
     -3               -3
     -4X              -4
     -5               -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.
Basic Non - comparative Scales
Scale: Continuous Rating Scale
  Basic characteristics: Place a mark on a continuous line
  Examples: Reaction to TV commercials
  Advantages: Easy to construct
  Disadvantages: Scoring can be cumbersome unless computerized
Itemized Rating Scales
Scale: Likert Scale
  Basic characteristics: Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
  Examples: Measurement of attitudes
  Advantages: Easy to construct, administer, and understand
  Disadvantages: More time-consuming
Scale: Semantic Differential
  Basic characteristics: Seven-point scale with bipolar labels
  Examples: Brand, product, and company images
  Advantages: Versatile
  Disadvantages: Controversy as to whether the data are interval
Scale: Stapel Scale
  Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
  Examples: Measurement of attitudes and images
  Advantages: Easy to construct, administer over telephone
  Disadvantages: Confusing and difficult to apply
Itemized Scale Decisions
1) Number of categories: Although there is no single, optimal number, traditional guidelines suggest that there should be between five and nine categories.
2) Balanced vs. unbalanced: In general, the scale should be balanced to obtain objective data.
3) Odd/even no. of categories: If a neutral or indifferent scale response is possible from at least some of the respondents, an odd number of categories should be used.
4) Forced vs. non-forced: In situations where the respondents are expected to have no opinion, the accuracy of the data may be improved by a non-forced scale.
5) Verbal description: An argument can be made for labeling all or many scale categories. The category descriptions should be located as close to the response categories as possible.
6) Physical form: A number of options should be tried and the best selected (horizontal or vertical).
Balanced and Unbalanced Scales
Balanced Scale
Jovan Musk for Men is:
  Extremely good
  Very good
  Good
  Bad
  Very bad
  Extremely bad
Unbalanced Scale
Jovan Musk for Men is:
  Extremely good
  Very good
  Good
  Somewhat good
  Bad
  Very bad
Rating Scale Configurations
A variety of scale configurations may be employed to measure the gentleness of Cheer detergent. Some examples include:
Cheer detergent is:
1) Very harsh  ---  ---  ---  ---  ---  ---  ---  Very gentle
2) Very harsh   1    2    3    4    5    6    7   Very gentle
3) .  Very harsh
   .
   .
   .  Neither harsh nor gentle
   .
   .
   .  Very gentle
4) ____ Very harsh  ____ Harsh  ____ Somewhat harsh  ____ Neither harsh nor gentle  ____ Somewhat gentle  ____ Gentle  ____ Very gentle
5) Very harsh          Neither harsh nor gentle          Very gentle
Measurement Error – Difference between observed score and true score
Measurement Accuracy
The true score model provides a framework for understanding the accuracy of measurement.
$$X_O = X_T + X_S + X_R$$
where
$X_O$ = the observed score or measurement
$X_T$ = the true score of the characteristic
$X_S$ = systematic error (affects the observed score in the same way each time)
$X_R$ = random error (situational factors)
Potential Sources of Error on Measurement
1) Other relatively stable characteristics of the individual that influence the test score, such as intelligence, social desirability, and education.
2) Short-term or transient personal factors, such as health, emotions, and fatigue.
3) Situational factors, such as the presence of other people, noise, and distractions.
4) Sampling of items included in the scale: addition, deletion, or changes in the scale items.
5) Lack of clarity of the scale, including the instructions or the items themselves.
6) Mechanical factors, such as poor printing, overcrowding items in the questionnaire, and poor design.
7) Administration of the scale, such as differences among interviewers.
8) Analysis factors, such as differences in scoring and statistical analysis.
Reliability
Reliability can be defined as the extent to which measures are free from random error, $X_R$. If $X_R = 0$, the measure is perfectly reliable. Random error produces inconsistency, leading to lower reliability.
Validity
The validity of a scale may be defined as the extent to which differences in observed scale scores reflect true differences among objects on the characteristic being measured, rather than systematic or random error. Perfect validity requires that there be no measurement error ($X_O = X_T$, $X_R = 0$, $X_S = 0$).
Relationship Between Reliability and Validity
If a measure is perfectly valid, it is also perfectly reliable. In this case $X_O = X_T$, $X_R = 0$, and $X_S = 0$.
If a measure is unreliable, it cannot be perfectly valid, since at a minimum $X_O = X_T + X_R$. Furthermore, systematic error may also be present, i.e., $X_S \neq 0$. Thus, unreliability implies invalidity.
If a measure is perfectly reliable, it may or may not be perfectly valid, because systematic error may still be present ($X_O = X_T + X_S$).
Reliability is a necessary, but not sufficient, condition for validity.
Session - 6
Data Collection and Questionnaire
Collection of Data
Data can be obtained from:
  Secondary sources
  Internal records
  Primary sources
Collection of Data
Primary Data:
  Questionnaire
  Observation
  Schedule, interview form (telephone and personal interview)
Questionnaire Definition
A questionnaire is a formalized set of questions for obtaining information from respondents.
Questionnaire Objectives
It must translate the information needed into a set of specific questions that the respondents can and will answer. A questionnaire must uplift, motivate, and encourage the respondent to become involved in the interview, to cooperate, and to complete the interview. A questionnaire should minimize response error.
Questionnaire Design Process
1) Specify the Information Needed
2) Specify the Type of Interviewing Method
3) Determine the Content of Individual Questions
4) Design the Question to Overcome the Respondent’s Inability and Unwillingness to Answer
5) Decide the Question Structure
6) Determine the Question Wording
7) Arrange the Questions in Proper Order
8) Identify the Form and Layout
9) Reproduce the Questionnaire
10) Eliminate Bugs by Pre-testing
Individual Question Content – 1. Is the Question Necessary?
If there is no satisfactory use for the data resulting from a question, that question should be eliminated.
Individual Question Content – 2. Are Several Questions Needed Instead of One?
Sometimes, several questions are needed to obtain the required information in an unambiguous manner. Consider the question: “Do you think Coca-Cola is a tasty and refreshing soft drink?” (Incorrect) Such a question is called a double-barreled question, because two or more questions are combined into one. To obtain the required information, two distinct questions should be asked: “Do you think Coca-Cola is a tasty soft drink?” and “Do you think Coca-Cola is a refreshing soft drink?” (Correct)
Overcoming Inability To Answer – 1. Is the Respondent Informed?
In situations where not all respondents are likely to be informed about the topic of interest, filter questions that measure familiarity and past experience should be asked before questions about the topics themselves. A “don't know” option appears to reduce uninformed responses without reducing the response rate.
Overcoming Inability To Answer – 2. Can the Respondent Remember?
How many gallons of soft drinks did you consume during the last four weeks? (Incorrect)
How often do you consume soft drinks in a typical week? (Correct) 1. ___ Less than once a week 2. ___ 1 to 3 times per week 3. ___ 4 to 6 times per week 4. ___ 7 or more times per week
Overcoming Inability To Answer – 3. Can the Respondent Articulate?
Respondents may be unable to articulate certain types of responses, e.g., describe the atmosphere of a department store. Respondents should be given aids, such as pictures, maps, and descriptions to help them articulate their responses.
Overcoming Unwillingness To Answer – Effort Required of the Respondents
Most respondents are unwilling to devote a lot of effort to provide information.
Overcoming Unwillingness To Answer
Context: Respondents are unwilling to respond to questions which they consider to be inappropriate for the given context. The researcher should manipulate the context so that the request for information seems appropriate.
Legitimate Purpose: Explaining why the data are needed can make the request for the information seem legitimate and increase the respondents' willingness to answer.
Sensitive Information: Respondents are unwilling to disclose, at least accurately, sensitive information because this may cause embarrassment or threaten the respondent's prestige or self-image.
Overcoming Unwillingness To Answer – Increasing the Willingness of Respondents
• Place sensitive topics at the end of the questionnaire.
• Preface the question with a statement that the behavior of interest is common.
• Ask the question using the third-person technique: phrase the question as if it referred to other people.
• Hide the question in a group of other questions which respondents are willing to answer. The entire list of questions can then be asked quickly.
• Provide response categories rather than asking for specific figures.
• Use randomized techniques.
Choosing Question Structure – Unstructured Questions
Unstructured questions are open-ended questions that respondents answer in their own words. What is your occupation? Who is your favorite actor? What do you think about people who shop at high-end department stores?
Choosing Question Structure – Structured Questions
Structured questions specify the set of response alternatives and the response format. A structured question may be multiple-choice, dichotomous, or a scale.
Choosing Question Structure – Multiple-Choice Questions
In multiple-choice questions, the researcher provides a choice of answers and respondents are asked to select one or more of the alternatives given.
Do you intend to buy a new car within the next six months? ____ Definitely will not buy ____ Probably will not buy ____ Undecided ____ Probably will buy ____ Definitely will buy ____ Other (please specify)
Choosing Question Structure – Dichotomous Questions
A dichotomous question has only two response alternatives: yes or no, agree or disagree, and so on. Often, the two alternatives of interest are supplemented by a neutral alternative, such as “no opinion,” “don't know,” “both,” or “none.” Do you intend to buy a new car within the next six months? _____ Yes _____ No _____ Don't know
Choosing Question Structure – Scales
Do you intend to buy a new car within the next six months? Definitely will not buy 1 Probably will not buy 2 Undecided 3 Probably will buy 4 Definitely will buy 5
Choosing Question Wording – Define the Issue
Define the issue in terms of who, what, when, where, why, and way (the six Ws). Who, what, when, and where are particularly important. Which brand of shampoo do you use? (Incorrect) Which brand or brands of shampoo have you personally used at home during the last month? In case of more than one brand, please list all the brands that apply. (Correct)
Choosing Question Wording – Use Unambiguous Words
In a typical month, how often do you shop in department stores? _____ Never _____ Occasionally _____ Sometimes _____ Often _____ Regularly (Incorrect) In a typical month, how often do you shop in department stores? _____ Less than once _____ 1 or 2 times _____ 3 or 4 times _____ More than 4 times (Correct)
Choosing Question Wording – Avoid Leading or Biasing Questions
A leading question is one that clues the respondent to what the answer should be, as in the following: Do you think that patriotic Americans should buy imported automobiles when that would put American labor out of work? _____ Yes _____ No _____ Don't know (Incorrect) Do you think that Americans should buy imported automobiles? _____ Yes _____ No _____ Don't know (Correct)
Choosing Question Wording – Avoid Implicit Alternatives
An alternative that is not explicitly expressed in the options is an implicit alternative.
1. Do you like to fly when traveling short distances? (Incorrect)
2. Do you like to fly when traveling short distances, or would you rather drive? (Correct)
Choosing Question Wording – Avoid Implicit Assumptions
Questions should not be worded so that the answer is dependent upon implicit assumptions about what will happen as a consequence.
1. Are you in favor of a balanced budget? (Incorrect)
2. Are you in favor of a balanced budget if it would result in an increase in the personal income tax? (Correct)
Determining the Order of Questions
Opening Questions: The opening questions should be interesting, simple, and non-threatening.
Type of Information: As a general guideline, basic information should be obtained first, followed by classification, and, finally, identification information.
Difficult Questions: Difficult questions or questions which are sensitive, embarrassing, complex, or dull should be placed late in the sequence.
Determining the Order of Questions
Effect on Subsequent Questions General questions should precede the specific questions (funnel approach).
Q1: “What considerations are important to you in selecting a department store?”
Q2: “In selecting a department store, how important is convenience of location?” (Correct)
Form and Layout
Divide a questionnaire into several parts.
The questions in each part should be numbered, particularly when branching questions are used.
The questionnaires should preferably be precoded.
The questionnaires themselves should be numbered serially.
Example of a Precoded Questionnaire
The American Lawyer
A Confidential Survey of Our Subscribers
(Please ignore the numbers alongside the answers. They are only to help us in data processing.)
1. Considering all the times you pick it up, about how much time, in total, do you spend reading or looking through a typical issue of THE AMERICAN LAWYER?
Less than 30 minutes.....................-1
30 to 59 minutes............................-2
1 hour to 1 hour 29 minutes..........-3
1 1/2 hours to 1 hour 59 minutes.........-4
2 hours to 2 hours 59 minutes...........-5
3 hours or more.................................-6
Reproduction of the Questionnaire
The questionnaire should be reproduced on good-quality paper and have a professional appearance. Questionnaires should take the form of a booklet rather than a number of sheets of paper clipped or stapled together. Each question should be reproduced on a single page (or double-page spread).
Vertical response columns should be used for individual questions.
Grids are useful when there are a number of related questions that use the same set of response categories. The tendency to crowd questions together to make the questionnaire look shorter should be avoided. Directions or instructions for individual questions should be placed as close to the questions as possible.
Pretesting
Pretesting refers to the testing of the questionnaire on a small sample of respondents to identify and eliminate potential problems. A questionnaire should not be used in the field survey without adequate pretesting.
All aspects of the questionnaire should be tested, including question content, wording, sequence, form and layout, question difficulty, and instructions.
The respondents for the pretest and for the actual survey should be drawn from the same population. Pretests are best done by personal interviews, even if the actual survey is to be conducted by mail, telephone, or electronic means, because interviewers can observe respondents' reactions and attitudes.
Pretesting
After the necessary changes have been made, another pretest could be conducted by mail, telephone, or electronic means if those methods are to be used in the actual survey. A variety of interviewers should be used for pretests. The pretest sample size varies from 15 to 30 respondents for each wave. Protocol analysis and debriefing are two commonly used procedures in pretesting.
Finally, the responses obtained from the pretest should be coded and analyzed.
Session - 7
Measurement of Central Tendency
Classification of Data
Geographic, i.e. area-wise classification: cities, districts
Chronological, i.e. on the basis of time: year-wise
Qualitative, i.e. according to some attribute: male and female
Quantitative, i.e. in terms of magnitude of some characteristic: income
Formation of Frequency Distribution
e.g. Refrigerators sold each day in Oct. 2008
Classification according to class intervals:
  Class limits
  Class intervals
  Class frequency
  Class mid-point
Tabulation
Simple tables or one-way tables
Two-way tables
Frequency Distribution
In a frequency distribution, one variable is considered at a time.
A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages for all the values associated with that variable.
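As a rough illustration, a one-variable frequency table with counts, percentages, and cumulative percentages could be built as follows. The ratings data and the function name are hypothetical and serve only to show the layout.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percentage, cumulative percentage) rows in value order."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percentage = 100.0 * counts[value] / n
        cumulative += percentage
        rows.append((value, counts[value], percentage, cumulative))
    return rows


# Hypothetical responses on a 1-5 rating scale from ten respondents.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

table = frequency_table(ratings)
print("value count percent cumulative")
for value, count, percentage, cumulative in table:
    print(f"{value:>5} {count:>5} {percentage:>7.1f} {cumulative:>10.1f}")
```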
Measures of central tendency: mean, median, mode, etc.
Quartiles
Measures of variation: range, interquartile range, variance and standard deviation, coefficient of variation
Shape: symmetric, skewed, using box-and-whisker plots
Coefficient of correlation
Summary Measures
Central Tendency: Mean, Median, Mode, Geometric Mean
Quartile
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Mean
Data:100, 78, 65, 43, 94, 58
Mean: The sum of a collection of data divided by the number of data
43 + 58 + 65 + 78 + 94 + 100 = 438
438 ÷ 6 = 73
Mean is 73
Mean
Sample mean (sample size $n$):
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Population mean (population size $N$):
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Mean
Direct Method : X
Mean
• The most common measure of central tendency
• Acts as ‘Balance Point’
• Affected by extreme values (outliers)
[Figure: two dot plots. The first, on an axis from 0 to 10, has Mean = 5; the second, on an axis extended to 14 by an extreme value, has Mean = 6.]
Median
• Robust measure of central tendency
• Not affected by extreme values
[Figure: the same two dot plots, on axes from 0 to 10 and from 0 to 14. Median = 5 in both.]
• In an ordered array, the median is the “middle” number
• If n or N is odd, the median is the middle number
• If n or N is even, the median is the average of the two middle numbers
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode or several modes
[Figure: two dot plots. One, on an axis from 0 to 14, has Mode = 9; the other shows a data set with No Mode.]
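The three measures can be compared on the data from the Mean slide (100, 78, 65, 43, 94, 58). The sketch below uses Python's standard `statistics` module; the extra outlier value of 1000 and the `modes` helper are assumptions added for illustration.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return all most-frequent values, or an empty list when no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(value for value, count in counts.items() if count == highest)


scores = [100, 78, 65, 43, 94, 58]
with_outlier = scores + [1000]  # hypothetical extreme value

print("mean:", mean(scores), "median:", median(scores), "modes:", modes(scores))
print("mean with outlier:", mean(with_outlier))
print("median with outlier:", median(with_outlier))

# A data set with two modes.
print(modes([11, 12, 13, 16, 16, 17, 17, 18, 21]))
```

The outlier pulls the mean far above every ordinary score while the median moves only to the next value in the ordered array, which is the sense in which the median is robust.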
Quartiles
Q1, the first quartile, is the value such that 25% of the observations are smaller, corresponding to the (n+1)/4 ordered observation
Q2, the second quartile, is the median; 50% of the observations are smaller, corresponding to the 2(n+1)/4 = (n+1)/2 ordered observation
Q3, the third quartile, is the value such that 75% of the observations are smaller, corresponding to the 3(n+1)/4 ordered observation
Quartiles
Split ordered data into 4 quarters of 25% each, separated by $Q_1$, $Q_2$, and $Q_3$.
Position of the $i$th quartile:
$$(Q_i) = \frac{i(n+1)}{4}$$
Data in Ordered Array: 11 12 13 16 16 17 17 18 21
$$\text{Position of } Q_1 = \frac{1(9+1)}{4} = 2.5, \qquad Q_1 = \frac{12 + 13}{2} = 12.5$$
$Q_2$ = Median = 16, $Q_3$ = 17.5
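The position rule can be checked with a short sketch. The function name is hypothetical, and the linear interpolation between neighbouring observations is an assumption that reduces to the simple average used on the slide when the position ends in .5. It also assumes the computed position falls inside the array, which holds for this example.

```python
def quartile(sorted_values, i):
    """Return the i-th quartile using the i(n + 1)/4 position rule (1-based positions)."""
    n = len(sorted_values)
    position = i * (n + 1) / 4
    lower = int(position)          # whole part of the position
    fraction = position - lower    # how far to move toward the next observation
    if fraction == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


data = [11, 12, 13, 16, 16, 17, 17, 18, 21]  # ordered array from the slide

q1, q2, q3 = (quartile(data, i) for i in (1, 2, 3))
iqr = q3 - q1
print(q1, q2, q3, iqr)
```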
Measures of Variation
Variation
  Range
  Interquartile Range
  Variance
    Population Variance
    Sample Variance
  Standard Deviation
    Population Standard Deviation
    Sample Standard Deviation
  Coefficient of Variation
Range
• Measure of variation
• Difference between the largest and the smallest observations:
$$\text{Range} = X_{\text{Largest}} - X_{\text{Smallest}}$$
• Ignores the way in which data are distributed
[Figure: two differently distributed dot plots on an axis from 7 to 12, each with Range = 12 - 7 = 5.]
Interquartile Range
• Measure of variation
• Also known as midspread
• Spread in the middle 50%
• Difference between the first and third quartiles
Data in Ordered Array: 11 12 13 16 16 17 17 18 21
$$\text{Interquartile Range} = Q_3 - Q_1 = 17.5 - 12.5 = 5$$
Not affected by extreme values
Variance
• Important measure of variation
• Shows variation about the mean
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Standard Deviation
• Most important measure of variation
• Shows variation about the mean
• Has the same units as the original data
Sample standard deviation:
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Comparing Standard Deviations
[Figure: three dot plots on an axis from 11 to 21, each with Mean = 15.5 but a different spread.]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
Coefficient of Variation
• Measure of relative dispersion
• Always in %
• Shows variation relative to mean
• Used to compare 2 or more groups
Formula (sample coefficient of variation):
$$CV = \left(\frac{S}{\bar{X}}\right) \times 100\%$$
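The variance, standard deviation, and coefficient of variation formulas can be tried out on the six values from the Mean slide (mean 73). This is a sketch with hypothetical function names, not a prescribed implementation.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    size = len(values)
    mean = sum(values) / size
    return sum((x - mean) ** 2 for x in values) / size


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sample_variance(values)) / mean * 100


data = [100, 78, 65, 43, 94, 58]

s2 = sample_variance(data)
s = math.sqrt(s2)
sigma2 = population_variance(data)
cv = coefficient_of_variation(data)

print(f"sample variance = {s2:.1f}, sample standard deviation = {s:.2f}")
print(f"population variance = {sigma2:.2f}")
print(f"coefficient of variation = {cv:.1f}%")
```

Dividing by n - 1 rather than N makes the sample variance slightly larger than the population variance computed on the same six values.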
Session - 8
Skewness and Kurtosis
Review of Previous Lecture
Range
The difference between the largest and smallest values
Interquartile range
The difference between the 25th and 75th percentiles
Variance
The sum of squared deviations from the mean divided by the population size (N) or by the sample size minus one (n - 1)
Standard deviation
The square root of the variance
•Another Measure of Dispersion
•Coefficient of Variation (CV) •Skewness •Kurtosis
Measures of Dispersion – Coefficient of Variation
Coefficient of variation (CV) measures the spread of a set of data as a proportion of its mean. It is the ratio of the sample standard deviation to the sample mean
$$CV = \frac{s}{\bar{x}} \times 100\%$$
It is sometimes expressed as a percentage
Measures of Skewness and Kurtosis
A fundamental task in many statistical analyses is to characterize the location and variability of a data set (Measures of central tendency vs. measures of dispersion) Both measures tell us nothing about the shape of the distribution A further characterization of the data includes skewness and kurtosis
Skewness
Skewness measures the degree of asymmetry exhibited by the data
Skewness
Positive skewness
There are more observations below the mean than above it
When the mean is greater than the median
Negative skewness
There are a small number of low observations and a large number of high ones
When the median is greater than the mean
Shape of a Distribution
Describes how data is distributed
Measures of shape:
  Mean > median: right-skewness
  Mean < median: left-skewness
  Mean = median: symmetric
Left-Skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean
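The mean-versus-median rule of thumb above can be expressed as a small check. The function name and the three data sets are hypothetical, and the rule is only the rough indicator described on the slide, not a formal skewness coefficient.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify the shape by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "symmetric"


# Hypothetical data sets chosen to show each case.
print(skew_direction([1, 2, 3, 4, 20]))   # a few high values pull the mean up
print(skew_direction([1, 17, 18, 19, 20]))  # a few low values pull the mean down
print(skew_direction([1, 2, 3, 4, 5]))
```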
Kurtosis
Kurtosis measures how peaked the histogram is
$$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} - 3$$
The kurtosis of a normal distribution is 0
Kurtosis characterizes the relative peakedness or flatness of a distribution compared to the normal distribution
Kurtosis
Platykurtic – When the kurtosis < 0, the frequencies throughout the curve are closer to being equal (i.e., the curve is flatter and wider). Thus, negative kurtosis indicates a relatively flat distribution.
Leptokurtic – When the kurtosis > 0, there are high frequencies in only a small part of the curve (i.e., the curve is more peaked). Thus, positive kurtosis indicates a relatively peaked distribution.
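A minimal sketch of the kurtosis formula, taking s to be the sample standard deviation (an assumption based on the notation used for the coefficient of variation). Both data sets and the function name are hypothetical; they are chosen only so that one is flat and the other is concentrated around its center with two distant values.

```python
def excess_kurtosis(values):
    """Sum of fourth-power deviations divided by n * s^4, minus 3."""
    n = len(values)
    mean = sum(values) / n
    s_squared = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance
    fourth_moment_sum = sum((x - mean) ** 4 for x in values)
    return fourth_moment_sum / (n * s_squared ** 2) - 3


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # frequencies roughly equal everywhere
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]    # most values at the center, long tails

print("flat:", round(excess_kurtosis(flat), 2))
print("peaked:", round(excess_kurtosis(peaked), 2))
```

The flat data set gives a negative value (platykurtic) and the peaked one a positive value (leptokurtic), in line with the definitions above.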
Kurtosis
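[Figure: frequency plotted against value for curves with k > 3, k = 3, and k < 3, where k is the kurtosis before 3 is subtracted, so k = 3 corresponds to the normal distribution.]
• Kurtosis is based on the size of a distribution's tails.
• Negative kurtosis (platykurtic) – distributions with short tails
• Positive kurtosis (leptokurtic) – distributions with relatively long tails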
TIME SERIES ANALYSIS
Statistical data which are collected, observed or recorded at successive intervals of time are referred to as a TIME SERIES. Time series analysis:
- helps in understanding past behavior
- helps in planning future operations
- helps in evaluating current accomplishments
- facilitates comparison
TIME SERIES ANALYSIS
Components of Time Series:
- Secular trends – general movement persisting over the long term
- Seasonal variations – pattern repeating year after year
- Cyclical variations – fluctuations moving up and down every few years
- Irregular variations – variations in business activity which do not repeat in a definite period
Methods of Measurement
- Moving average method
- Method of least squares
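The two measurement methods named above can be sketched as follows. This is a minimal illustration, assuming a simple (unweighted) moving average and a straight-line trend fitted against time periods 0, 1, 2, ...; the function names and the `sales` series are hypothetical.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def least_squares_trend(series):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(series)
    t = range(n)
    t_bar = sum(t) / n
    y_bar = sum(series) / n
    slope = (
        sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
        / sum((ti - t_bar) ** 2 for ti in t)
    )
    intercept = y_bar - slope * t_bar
    return intercept, slope


# Hypothetical yearly sales figures that rise by 2 each period
sales = [10, 12, 14, 16, 18, 20]

smoothed = moving_average(sales, window=3)
intercept, slope = least_squares_trend(sales)
print(smoothed)
print(intercept, slope)
```

The moving average smooths out short-term variation, while the fitted line summarises the secular trend in two numbers.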
Correlation Analysis
If two quantities vary in such a way that movements in one are accompanied by movements in the other, these quantities are said to be correlated. The statistical tool for measuring such a relationship is known as correlation, and the correlation coefficient is denoted by r.
Types of correlation:
- Positive and negative
- Simple, partial and multiple
- Linear and non-linear
Scatter Plots and Correlation
A scatter plot (or scatter diagram) is used to show the relationship between two variables.
Correlation analysis is used to measure the strength of the association (linear relationship) between two variables:
- Only concerned with the strength of the relationship
- No causal effect is implied
Scatter Plot Examples
[Figure: scatter plots of y against x showing linear relationships and curvilinear relationships]
[Figure: scatter plots of y against x showing strong relationships and weak relationships]
[Figure: scatter plots of y against x showing no relationship]
Correlation Coefficient
The population correlation coefficient ρ (rho) measures the strength of the association between the variables
The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations
Features of r
• Ranges between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
Calculating the Correlation Coefficient
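$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$
or the algebraic equivalent:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\left(\sum x^2\right) - \left(\sum x\right)^2\right]\left[n\left(\sum y^2\right) - \left(\sum y\right)^2\right]}}$$
where:
r = sample correlation coefficient
n = sample size
x = value of the independent variable
y = value of the dependent variable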
For Example
Tree Height y    Trunk Diameter x    xy          y²           x²
35               8                   280         1225         64
49               9                   441         2401         81
27               7                   189         729          49
33               6                   198         1089         36
60               13                  780         3600         169
21               7                   147         441          49
45               11                  495         2025         121
51               12                  612         2601         144
Σ = 321          Σ = 73              Σ = 3142    Σ = 14111    Σ = 713
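$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\left(\sum x^2\right) - \left(\sum x\right)^2\right]\left[n\left(\sum y^2\right) - \left(\sum y\right)^2\right]}} = \frac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} = 0.886$$
r = 0.886 → relatively strong positive linear association between x and y
[Figure: scatter plot of Tree Height, y (axis 0 to 70) against Trunk Diameter, x (axis 0 to 14)]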
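The worked example can be reproduced with a short script. The sketch below applies the algebraic form of the formula to the eight tree observations from the table; the function name `pearson_r` is an assumption made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample correlation coefficient using the algebraic (sums) form."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from the tree height / trunk diameter table
trunk_diameter = [8, 9, 7, 6, 13, 7, 11, 12]    # x
tree_height = [35, 49, 27, 33, 60, 21, 45, 51]  # y

r = pearson_r(trunk_diameter, tree_height)
print(round(r, 3))  # 0.886
```

The intermediate sums match the table totals (Σx = 73, Σy = 321, Σxy = 3142, Σx² = 713, Σy² = 14111).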
Calculations of Correlation when deviations are taken from Assumed Mean
Rank Correlation coefficient
Questions addressing the various stages of the Research Process
Stage in the Process and Typical Questions

1. Formulate problem
• What is the purpose of the study: to solve a problem? To identify an opportunity?
• Is additional background information necessary?
• What information is needed to make the decision?
• How will the information be utilized?
• Should research be conducted?

2. Determine research design (exploratory / conclusive; descriptive and causal)
• How much is already known?
• Can a hypothesis be formulated?
• What types of questions need to be answered?
• What type of study will best address the research questions?

3. Determine data collection methods & forms
• Can existing data be used to advantage?
• What is to be measured? How?
• What is the source of the data?
• Are there any cultural factors?
• Are there any restrictions on data collection methods?
• Can objective answers be obtained by asking people?

4. Design data collection forms
• Should structured or unstructured items be used in collecting data?
• Should the purpose of the study be made known to respondents?
• Should a rating scale be used?
• What type of rating scale would be most appropriate?

5. Design sample & collect data
• Who is the target population?
• Is a list of population elements available?
• Is a sample necessary?
• Is a probability sample desirable?
• How large should the sample be?
• What operational procedures will be followed?
• What methods will be used to ensure the quality of the data collected?

6. Analyze & interpret data

Questions about the research report
• Who will read the report?
• What is their level of technical sophistication?
• Are managerial recommendations called for?
• What will be the format of the written report?
• Is an oral report necessary?
• How should the oral report be structured?
The research process
[Figure: the research process as a diagram of linked boxes labelled Decisions requiring information, Problem definition, Research design, Sampling, Data analysis, Presenting the results, and Management information]
The research process is a set of iterative steps and relationships.
The Concept of Total Error
All research has error, and this affects the research outcome – its usability and accuracy.
[Figure: sources that contribute to Total Error]
• Poor problem definition/formulation
• Poor logic
• Inadequate sample design
• Inadequate sample size
• Poor data collection methods
• Improper use of statistical procedures
• Poorly written research report
Problem definition steps
• Management problem definition process
• Research problem definition process
Please note that the research problem is sometimes called the research question, and that research questions are objectives that fit underneath the research problem.
Problem Definition
• Management problem:
– Focuses on the decision that management has to make and is action oriented (i.e., once the information is obtained, a course of action will be required). The management problem may include:
– Symptoms of failure to achieve an objective: management must select a course of action to regain it.
– Symptoms of likelihood of achieving an objective: management must decide how to seize the opportunity (opportunity identification).
[Figure: Formulate Management Problem → Formulate Research Problem]
Problem Definition
• The research problem:
– How to provide relevant, accurate, and unbiased information that managers can use to solve their marketing management problems.
– The research problem is information oriented, and researchers need to do some investigation (e.g., ask questions, read information) before defining it.
– Researchers should ask themselves: is the issue that management is seeking answers to merely a symptom of X?
– Remember the iceberg principle:
  • The symptoms are what we can see (e.g., falling sales).
  • The issues (causes) are generally what we can't see. The issue below the surface is what needs investigating and therefore forms the research problem.
Examples of
Management Problem                                                Research Problem
Develop package for new product.                                  Evaluate effectiveness of alternative package designs.
Increase store traffic.                                           Measure current image of the store.
Increase market penetration through the opening of new stores.    Evaluate prospective locations.
OK, so we have a problem: how do we write the problem definition?
Management problem (decision / action oriented) → Research problem (information oriented)
• Should a new product be introduced? → To determine consumer preferences and purchase intentions for the proposed new product
• Should the advertising campaign be changed? → To determine the effectiveness of the current advertising campaign
• Should the price of the brand be increased? → To determine the price elasticity of demand and the impact on sales and profits of various levels of price changes
To help you develop and write the research problem and research objectives, consult other sources of information: ask questions, rely on experience, and search industry information and academic journals (theory). This is an iterative and difficult process.
The problem definition process
How much is this information worth? Estimate the value of information.
Management problem
• What decision needs to be made?
• What sort of actions may occur once the research has been completed?
Research problem
• Can you delineate the symptoms from the causes/issues?
• What information is required to solve the management problem?
Research objectives
• Specific, measurable information requirements that will allow a researcher to answer the research problem, i.e., RO1, RO2, RO3, ...
• Research objectives (also sometimes called research questions) provide the “blueprint” for the research, as the objectives set the scene for “what needs to be done”.
Marketing Research
Problem identification research
• Market Potential Research
• Market Share Research
• Image Research
• Market Characteristics Research
• Sales Analysis Research
• Forecasting Research
• Business Trends Research
Problem solving research
• Segmenting Research
• Product Research
• Pricing Research
• Promotion Research
• Distribution Research
Problem solving research
Segmenting Research:
Basis of segmentation, response of segments, selection of target segment
Product Research:
Test, design, packaging, modification, positioning and repositioning
Pricing Research:
Price policy, line policy, price elasticity, customer response
Promotion Research:
Promotion budget, relationship with other tools, media decision, testing, effectiveness
Distribution Research:
Type of distribution, channel members, intensity of coverage, margins, location of channel members
2nd Session
Marketing Research Defined (AMA)
“Marketing research is the function that links the consumer, customer, and public to the marketer through information – information used to identify and define marketing opportunities and problems; generate, refine, and evaluate marketing actions; monitor marketing performance; and improve understanding of marketing as a process.”
The role of marketing research within the marketing system
THE ROLE OF MARKETING RESEARCH
[Figure: MARKETING RESEARCH is a formal communication link with the environment. It provides accurate and useful information by a) specifying, b) collecting, c) analyzing and d) interpreting, for a) planning, b) problem-solving and c) control, leading to BETTER DECISION MAKING.]
NATURE OF MARKETING RESEARCH
• Applied/problem solving research
• Often based on cost-benefit analysis
• Vital for implementation of the marketing concept
• Value of information declines with time
• Dynamic (ongoing)
DRIVERS OF MARKETING RESEARCH
• Shift from production orientation to customer orientation
• Declining unit cost of information (digital age)
• Increased intensity of competition
• Globalization
• Technology and commercialization
Factors shaping the Marketing Research Industry
[Figure: factors shaping the nature and future of Marketing Research]
• Competitor intelligence
• Low-cost survey providers
• Surveys to generate sales & PR
• Customer analytics
• Internet, e.g. online panels
• ‘Value for money’ marketing
• ‘Strategic’ consultants
• ‘Respondent’ rewards
Reasons for Doing Marketing Research: The Five Cs
1. Customers: To determine how well customer needs are being met, investigate new target markets, and assess and test new services and facilities.
2. Competition: To identify primary competitors and pinpoint their strengths and weaknesses.
3. Confidence: To reduce the perceived risk in making marketing decisions.
4. Credibility: To increase the believability of promotional messages among customers.
5. Change: To keep updated with changes in travelers’ needs and expectations.
Reasons for Not Doing Marketing Research
1. Timing: It will take too much time.
2. Cost: The cost of the research is too high.
3. Reliability: There is no reliable research method available for doing the research.
4. Competitive intelligence: There is a fear that competitors will learn about the organization’s intentions.
5. Management decision: Management prefers to use its own judgment.
Five Key Requirements of Marketing Research Information
1. Utility: Can we use it? Does it apply to us?
2. Timeliness: Will it be available in time?
3. Cost-effectiveness: Do the benefits outweigh the costs?
4. Accuracy: Is it accurate?
5. Reliability: Is it reliable?
Classification of marketing research
Examples of problem-solving research
Problem Definition Process
[Figure: Tasks involved in problem definition → Environmental Context of the problem → Management decision problem → Marketing research problem]
Tasks involved in problem definition:
• Discussion with decision makers
• Interviews with experts
• Secondary data analysis
• Qualitative research
Factors to Consider - Environmental Context
• Past information and forecasts
• Resources and constraints
• Objectives (organizational & decision maker)
• Buyer behavior
• Legal environment
• Economic environment
• Marketing and technological skills
Defining the Research Problem
The definition of the research problem should:
• Allow the researcher to obtain all the information needed to address the management decision problem
• Guide the researcher in formulating the research design
A broad definition does not provide clear guidelines for the subsequent steps involved in the project, e.g.:
• Developing a marketing strategy for the brand
• Improving the competitive position of the firm
• Improving the company’s image
Define Research Design
A framework or blueprint for conducting the marketing research project.
Details the procedures necessary for obtaining the information needed to structure or solve marketing research problems
A Classification of Marketing Research Designs
Research Design
  Exploratory Research Design
  Conclusive Research Design
    Descriptive Research
      Cross-Sectional Design
      Longitudinal Design
    Causal Research
Differences Between Exploratory and Conclusive Research
Exploratory
Objective: To provide insights and understanding.
Characteristics: Information needed is defined loosely. Research process is flexible/unstructured. Sample is small and nonrepresentative. Analysis of primary data is qualitative.
Findings: Tentative.
Outcome: Followed by conclusive research.
Conclusive
Objective: Test hypotheses/examine relationships.
Characteristics: Information needed is clearly defined. Research process is formal and structured. Sample is large and representative. Data analysis is quantitative.
Findings: Conclusive.
Outcome: Findings input into decision making.
Exploratory Research: Overview
Characteristics: flexible, versatile, but not conclusive
Useful for: discovery of ideas and insights; formulating problems more precisely; identifying alternative courses of action; establishing priorities for further research
Methods used: case studies, secondary data, focus groups, qualitative research
When done: generally the initial research conducted to clarify and define the nature of a problem
Does not provide conclusive evidence: subsequent research is expected
Descriptive Research: Overview
Characteristics: describes characteristics of a population or phenomenon; assumes some understanding of the nature of the problem; preplanned, structured, conclusive
Useful for: describing market characteristics or functions
Methods used: surveys (primary data), panels, scanner data (secondary data)
When used: often a follow-up to exploratory research
Examples include:
• Market segmentation studies, i.e., describing characteristics of various groups
• Determining perceptions of product characteristics
• Price and promotion elasticity studies
• Sales potential studies for a particular geographic region or population segment
Examples of Descriptive Studies
• Market studies that describe the size of the market, buying power of the consumers, availability of distributors, and consumer profiles
• Market share studies that determine the proportion of total sales received by a company and its competitors
• Sales analysis studies that describe sales by geographic region, product line, type of account, and size of account
• Image studies that determine consumer perceptions of the firm and its products
• Product usage studies that describe consumption patterns
• Distribution studies that determine traffic flow patterns and the number and location of distributors
• Pricing studies that describe the range and frequency of price changes and probable response to proposed price changes
• Advertising studies that describe media consumption habits and audience profiles for specific television programs and magazines
A Comparison of Basic Research Designs
Exploratory
Objective: Discovery of ideas
Characteristics: Flexible, versatile. Front end research.
Methods: Secondary data
Descriptive
Objective: Describes market characteristics
Characteristics: Prior formulation of hypothesis. Planned, structured design.
Methods: Surveys
Causal
Objective: Determine cause and effect
Characteristics: Manipulate independent variables. Control of other variables.
Methods: Experiments
Classification of Marketing Research Data
Marketing Research Data
  Secondary Data
  Primary Data
    Qualitative Data
    Quantitative Data
      Descriptive
        Survey Data
        Observational & Other Data
      Causal
        Experimental Data
Relationship among Exploratory, Descriptive and causal Research
3rd Session
Sampling Design
[Figure: the research process with sampling highlighted. Boxes are labelled Management information systems, Problem definition, Research design (Exploratory, Descriptive, Causal), Sampling (Non-probability, Probability), Data collection & analysis, and Recommendations]
Sample or Census
A population is the aggregate of all the elements that share some common set of characteristics and that comprise the universe for the purpose of the marketing research problem. The population parameters are typically numbers, such as the proportion of consumers who are loyal to a particular brand of toothpaste. Information about population parameters may be obtained by taking a census or a sample.
Sample or Census
A census involves a complete enumeration of the elements of a population. The population parameters can be calculated directly in a straightforward way after the census is enumerated (specify individually).
A sample is a subgroup of the population selected for participation in the study. Sample characteristics, called statistics, are then used to make inferences about the population parameters. The inferences that link sample characteristics and population parameters are estimation procedures and tests of hypotheses.
Sample Versus Census
Conditions favoring the use of      Sample    Census
Budget                              Small     Large
Time available                      Short     Long
Population size                     Large     Small
Variance in the characteristic      Small     Large
Cost of sampling error              Low       High
Cost of non-sampling error          High      Low
Attention to individual cases       Yes       No
Sampling
Sampling is the process of selecting a sufficient number of elements from the population so that, by studying the sample and understanding the properties or characteristics of the sample subjects, it is possible to generalise those properties or characteristics to the population elements. The more representative the sample is of the population, the more generalisable are the findings of the research.
Sampling design – key terms
Population – entire group of people, events or things of interest that the researcher wishes to investigate (size N)
Population element – single member of the population
Sampling frame – list of all elements of the population from which the sample is drawn
Sample – subset of the population selected for the specific research study (size n)
Sample unit (subject) – single element selected in the sample; could be a group (sampling could then be a two-stage process)
Census – an investigation of all individual elements that make up the population
Why sample?
• Time
• Cost
• Accuracy
• Population may be difficult to access
• Greater depth of information
Managerial objectives of sampling
Representative Reliable efficient as time permits
Errors associated with sampling
Sampling frame error – an error that occurs when certain sample elements are not listed or are not accurately represented in a sampling frame (occurs between the population and the sampling frame)
Random sampling error – occurs between the sampling frame and the planned sample for the study
Non-response error – the statistical difference between a survey that includes only those who responded and a perfect survey that would also include those who failed to respond (occurs between the planned sample and the respondents, i.e. the actual sample)
Sampling design process
Step 1: Define population
Entire group under study as defined by the research objectives
Step 2: Establish sampling frame
List of sampling units from which a sample will be drawn; the list could consist of geographic areas, institutions, individuals or other units
Step 3: Choose sampling technique/method
Method of selecting the sampling units: probability (random) vs. non-probability (non-random)
Step 4: Determine sample size
If a non-probability sampling method is used, this involves some judgement based on time, cost and the analysis required; if probability sampling is used, it is based on statistical determination of sample size
Step 5: Identify and select sample units (subjects)
Follow procedures based on the sampling technique selected
Classification of Sampling Techniques
Sampling Techniques
  Nonprobability Sampling Techniques
    Convenience Sampling
    Judgmental Sampling
    Quota Sampling
    Snowball Sampling
  Probability Sampling Techniques
    Simple Random Sampling
    Systematic Sampling
    Stratified Sampling
    Cluster Sampling
    Other Sampling Techniques
Non Probability Sampling
• Each sampling unit of the population being studied does not have an equal chance of being included in the study (due to the way the sample is selected)
• Non-random (the selection process is subjective)
• Researchers rely heavily on personal judgement
• Projecting the findings beyond the sample is statistically inappropriate
• Used when generalisability is less of a concern and other factors are more important, e.g. time or the need for preliminary information
Non Probability Sampling
Common sampling approaches
• Convenience
• Judgement
• Quota
• Snowball
Convenience Sample
• Also known as haphazard or accidental sampling
• Based on convenient availability of sampling units
• Sample units happen to be in a certain place at a certain time, e.g. high-traffic locations such as shopping malls and pedestrian areas
• Acceptable only in the pre-test/exploration phase, when further research will use probability sampling
• Representativeness highly uncertain
• Quota sampling can reduce some of the sample selection error
Judgement Sampling
• An experienced individual (could be the researcher) selects the sample based on personal judgement about some appropriate characteristics suited to the study
• Focus group studies use this method
Quota Samples
• Various subgroups in a population are represented based on pertinent characteristics
• Haphazard selection of respondents may introduce bias
• Similar to stratified random sampling
Snowball Sampling
• Judgement sample that relies on the researcher's ability to locate an initial set of respondents with the desired characteristics; these individuals are then used as informants to identify others with the desired characteristics
• Acceptable when sample units are difficult to locate
• Advantages: reduced sample size and costs
Probability Sampling
In a probability sample each element in the population has some known chance or probability of being included in the sample
Used when the representativeness of the sample is important for generalisability of results
The sample is selected at random, which removes the bias introduced by subjective selection
Probability Sampling cont.
Statistical efficiency
• For the same sample size, a smaller standard error of the mean is obtained
Economic efficiency
• Precision refers to the level of uncertainty about the characteristic being measured
• Precision is inversely related to sampling error
• Precision is positively related to cost
Types of probability sampling
• Simple random sample
• Systematic sampling
• Stratified sampling
  – proportionate
  – disproportionate
• Cluster sampling
• Area sampling
Simple Random Sampling
• Assures each element in the population of an equal chance of being included in the sample
• Blind draw – putting all names in a hat and drawing out a sample of 100 (size has been statistically calculated)
• Random numbers
• Need to begin with a complete list of the population, which is sometimes difficult to obtain
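The "names in a hat" draw can be sketched with Python's standard `random` module. This is a minimal illustration: the sampling frame of 500 customer IDs, the sample size of 100 and the seed are all made up for the example.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; each element is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame: a complete list of 500 customers
frame = [f"customer_{i:03d}" for i in range(1, 501)]

sample = simple_random_sample(frame, n=100, seed=42)
print(len(sample), sample[:5])
```

The complete list `frame` is the part that is often hard to obtain in practice, as the slide notes.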
Systematic Sampling
• A starting point is selected by a random process and then every nth element on the list is selected
• Calculate skip interval = population list size / sample size (size has been statistically calculated)
• Danger of periodicity – if the list has a systematic pattern
• Can be more representative than a simple random sample
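A sketch of the skip-interval procedure follows. It assumes the list size is an exact multiple of the sample size, so the interval is a whole number; the frame of 500 numbered elements and the seed are hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every k-th element of the list."""
    k = len(frame) // n                       # skip interval
    start = random.Random(seed).randrange(k)  # random start within first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical list of 500 numbered elements; sample size 50 gives k = 10
frame = list(range(1, 501))
picked = systematic_sample(frame, n=50, seed=7)
print(picked[:5])
```

If the list itself repeats every k elements (periodicity), every selected element would share the same position in the cycle, which is the danger the slide warns about.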
Stratified Sampling
• Simple random sub-samples are drawn from within each stratum of the population; the elements within a stratum are more or less equal on some characteristic
• Greater degree of representativeness
• Two types:
  – proportionate – the sample size of each stratum is proportional to the size of that stratum in the population
  – disproportionate – the sample sizes of the strata do not reflect their relative proportions in the population
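Proportionate allocation can be sketched as below. The three strata, their sizes and the total sample of 100 are assumptions for illustration, and the simple rounding used here may not add up exactly to n for other stratum sizes.

```python
import random


def proportionate_stratified_sample(strata, n, seed=None):
    """Draw a simple random sub-sample from each stratum, sized by its share."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n_stratum = round(n * len(members) / total)  # proportionate allocation
        sample[name] = rng.sample(members, n_stratum)
    return sample


# Hypothetical population of 1000 split 60% / 30% / 10% across three strata
strata = {
    "urban": list(range(0, 600)),
    "suburban": list(range(600, 900)),
    "rural": list(range(900, 1000)),
}

result = proportionate_stratified_sample(strata, n=100, seed=1)
print({name: len(members) for name, members in result.items()})
```

A disproportionate design would replace the `n_stratum` line with sizes chosen on other grounds, for example over-sampling a small but important stratum.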
Cluster Sampling
• Divides the population into groups (clusters), any one of which can be considered a representative sample
• An economically efficient technique in which the primary sampling unit is not the individual element but a large cluster of elements
• Clusters are selected randomly
• A random sample is then drawn from within each selected cluster
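The two stages described above (random choice of clusters, then a random sample within each chosen cluster) can be sketched as follows. The ten city blocks of 20 households each are hypothetical, as are the function and variable names.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: sample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


# Hypothetical population: 10 city blocks (clusters) of 20 households each
clusters = {
    f"block_{b}": [f"household_{b}_{h}" for h in range(20)]
    for b in range(10)
}

result = cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=3)
for block, households in result.items():
    print(block, households)
```

Only the selected blocks need to be listed and visited, which is where the economic efficiency comes from.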
Technique: strengths and weaknesses

Nonprobability sampling
Convenience sampling
  Strengths: Least expensive, least time-consuming, most convenient
  Weaknesses: Selection bias, sample not representative, not recommended for descriptive or causal research
Judgmental sampling
  Strengths: Low cost, convenient, not time-consuming
  Weaknesses: Does not allow generalization, subjective
Quota sampling
  Strengths: Sample can be controlled for certain characteristics
  Weaknesses: Selection bias, no assurance of representativeness
Snowball sampling
  Strengths: Can estimate rare characteristics
  Weaknesses: Time-consuming

Probability sampling
Simple random sampling (SRS)
  Strengths: Easily understood, results projectable
  Weaknesses: Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness
Systematic sampling
  Strengths: Can increase representativeness, easier to implement than SRS, sampling frame not necessary
  Weaknesses: Can decrease representativeness
Stratified sampling
  Strengths: Includes all important subpopulations, precision
  Weaknesses: Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive
Cluster sampling
  Strengths: Easy to implement, cost effective
  Weaknesses: Imprecise, difficult to compute and interpret results
Choosing probability vs. non-probability sampling
Evaluation criteria                                        Non-probability sampling     Probability sampling
Nature of research                                         Exploratory                  Conclusive
Relative magnitude of sampling and non-sampling error      Larger non-sampling error    Larger sampling errors
Population variability                                     Low [homogeneous]            High [heterogeneous]
Statistical considerations                                 Unfavorable                  Favorable
Sophistication needed                                      Low                          High
Time                                                       Relatively shorter           Relatively longer
Budget needed                                              Low                          High
Selecting an Appropriate Design
• Degree of accuracy
• Resources
• Time
• Advance knowledge of the population
• National versus local projects
• Need for statistical analysis
Session - 4
Measurement and Scaling
Measurement means assigning numbers or other symbols to characteristics of objects according to certain pre-specified rules. One-to-one correspondence between the numbers and the characteristics being measured. The rules for assigning numbers should be standardized and applied uniformly. Rules must not change over objects or time.
Measurement and Scaling
Scaling involves creating a continuum upon which measured objects are located. Consider an attitude scale from 1 to 100. Each respondent is assigned a number from 1 to 100, with 1 = Extremely Unfavorable, and 100 = Extremely Favorable. Measurement is the actual assignment of a number from 1 to 100 to each respondent. Scaling is the process of placing the respondents on a continuum with respect to their attitude toward department stores
Primary Scales of Measurement
Scale       Measure                                    Runner 1       Runner 2        Runner 3
Nominal     Numbers assigned to runners                7              8               3
Ordinal     Rank order of winners                      Third place    Second place    First place
Interval    Performance rating on a 0 to 10 scale      8.2            9.1             9.6
Ratio       Time to finish, in seconds                 15.2           14.1            13.4
Primary Scales of Measurement Nominal Scale
The numbers serve only as labels or tags for identifying and classifying objects.
When used for identification, there is a strict one-to-one correspondence between the numbers and the objects.
The numbers do not reflect the amount of the characteristic possessed by the objects.
The only permissible operation on the numbers in a nominal scale is counting. Only a limited number of statistics, all of which are based on frequency counts, are permissible, e.g., percentages, and mode.
Illustration of Primary Scales of Measurement
Nominal Scale               Ordinal Scale           Interval Scale         Ratio Scale
No.  Store                  Preference Rankings     Preference Ratings     $ spent last 3 months
                                                    1-7      11-17
1.   Lord & Taylor          7        79             5        15            0
2.   Macy’s                 2        25             7        17            200
3.   Kmart                  8        82             4        14            0
4.   Rich’s                 3        30             6        16            100
5.   J.C. Penney            1        10             7        17            250
6.   Neiman Marcus          5        53             5        15            35
7.   Target                 9        95             4        14            0
8.   Saks Fifth Avenue      6        61             5        15            100
9.   Sears                  4        45             6        16            0
10.  Wal-Mart               10       115            2        12            10
Primary Scales of Measurement Ordinal Scale
• A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic.
• Can determine whether an object has more or less of a characteristic than some other object, but not how much more or less.
• Any series of numbers can be assigned that preserves the ordered relationships between the objects.
• In addition to the counting operation allowable for nominal scale data, ordinal scales permit the use of statistics based on centiles, e.g., percentile, quartile, median.
Primary Scales of Measurement Interval Scale
• Numerically equal distances on the scale represent equal values in the characteristic being measured.
• It permits comparison of the differences between objects.
• The location of the zero point is not fixed. Both the zero point and the units of measurement are arbitrary.
• Any positive linear transformation of the form y = a + bx will preserve the properties of the scale.
• It is not meaningful to take ratios of scale values.
• Statistical techniques that may be used include all of those that can be applied to nominal and ordinal data, and in addition the arithmetic mean, standard deviation, and other statistics commonly used in marketing research.
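The point about linear transformations can be checked with the two preference-rating columns from the store illustration, where the 11-17 ratings are the 1-7 ratings transformed by y = 10 + x. The sketch below is only an illustration; the helper name `transform` is an assumption.

```python
def transform(x, a, b):
    """Positive linear transformation y = a + b*x (with b > 0)."""
    return a + b * x


# Preference ratings on the 1-7 scale, from the store illustration
ratings = [5, 7, 4, 6, 7, 5, 4, 5, 6, 2]
shifted = [transform(x, a=10, b=1) for x in ratings]  # the 11-17 column

# Ratios of scale values change under the transformation ...
ratio_before = ratings[1] / ratings[9]   # Macy's vs Wal-Mart: 7 / 2
ratio_after = shifted[1] / shifted[9]    # 17 / 12

# ... but comparisons of differences are preserved
diff_before = (ratings[1] - ratings[0]) / (ratings[3] - ratings[2])
diff_after = (shifted[1] - shifted[0]) / (shifted[3] - shifted[2])

print(ratio_before, ratio_after)
print(diff_before, diff_after)
```

Because the ratio 7/2 turns into 17/12 under an equally valid labelling of the same scale, saying one store is "3.5 times as preferred" has no meaning on an interval scale.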
Primary Scales of Measurement Ratio Scale
• Possesses all the properties of the nominal, ordinal, and interval scales.
• It has an absolute zero point.
• It is meaningful to compute ratios of scale values.
• Only proportionate transformations of the form y = bx, where b is a positive constant, are allowed.
• All statistical techniques can be applied to ratio data.
Primary Scales of Measurement
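Nominal
• Basic characteristics: Numbers identify & classify objects
• Common examples: Social Security nos., numbering of football players
• Marketing examples: Brand nos., store types
• Permissible statistics, descriptive: Percentages, mode
• Permissible statistics, inferential: Chi-square, binomial test
Ordinal
• Basic characteristics: Nos. indicate the relative positions of objects but not the magnitude of differences between them
• Common examples: Quality rankings, rankings of teams in a tournament
• Marketing examples: Preference rankings, market position, social class
• Permissible statistics, descriptive: Percentile, median
• Permissible statistics, inferential: Rank-order correlation, Friedman ANOVA
Interval
• Basic characteristics: Differences between objects can be compared
• Common examples: Temperature (Fahrenheit)
• Marketing examples: Attitudes, opinions, index
• Permissible statistics, descriptive: Range, mean, standard deviation
• Permissible statistics, inferential: Product-moment correlation
Ratio
• Basic characteristics: Zero point is fixed, ratios of scale values can be compared
• Common examples: Length, weight
• Marketing examples: Age, sales, income, costs
• Permissible statistics, descriptive: Geometric mean, harmonic mean
• Permissible statistics, inferential: Coefficient of variation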
A Classification of Scaling Techniques
Scaling Techniques
• Comparative Scales: Paired Comparison, Rank Order, Constant Sum, Q-Sort and Other Procedures
• Noncomparative Scales: Continuous Rating Scales, Itemized Rating Scales (Likert, Semantic Differential, Stapel)
A Comparison of Scaling Techniques
• Comparative scales involve the direct comparison of stimulus objects. Comparative scale data must be interpreted in relative terms and have only ordinal or rank order properties.
• In non-comparative scales, each object is scaled independently of the others in the stimulus set. The resulting data are generally assumed to be interval or ratio scaled.
Relative Advantages of Comparative Scales
• Small differences between stimulus objects can be detected.
• Same known reference points for all respondents.
• Easily understood and can be applied.
• Involve fewer theoretical assumptions.
• Tend to reduce halo or carryover effects from one judgment to another.
Relative Disadvantages of Comparative Scales
• Ordinal nature of the data.
• Inability to generalize beyond the stimulus objects scaled.
Comparative Scaling Techniques Paired Comparison Scaling
• A respondent is presented with two objects and asked to select one according to some criterion.
• The data obtained are ordinal in nature.
• Paired comparison scaling is the most widely-used comparative scaling technique.
• Under the assumption of transitivity, it is possible to convert paired comparison data to a rank order.
Obtaining Shampoo Preferences Using Paired Comparisons
Instructions: We are going to present you with ten pairs of shampoo brands. For each pair, please indicate which one of the two brands of shampoo you would prefer for personal use.
Recording Form:

                            Jhirmack   Finesse   Vidal Sassoon   Head & Shoulders   Pert
Jhirmack                       -          0           0                 1             0
Finesse                        1(a)       -           0                 1             0
Vidal Sassoon                  1          1           -                 1             1
Head & Shoulders               0          0           0                 -             0
Pert                           1          1           0                 1             -
Number of Times Preferred(b)   3          2           0                 4             1

(a) A 1 in a particular box means that the brand in that column was preferred over the brand in the corresponding row. A 0 means that the row brand was preferred over the column brand.
(b) The number of times a brand was preferred is obtained by summing the 1s in each column.
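The column sums and the conversion to a rank order can be sketched in a few lines. This is a minimal illustration using the recording form's data; the variable names are assumptions, and sorting by the number of times preferred yields a valid rank order only if the respondent's choices are transitive.

```python
brands = ["Jhirmack", "Finesse", "Vidal Sassoon", "Head & Shoulders", "Pert"]

# matrix[row][col] == 1 means the column brand was preferred over the row brand.
# None marks the diagonal (a brand is never compared with itself).
matrix = [
    [None, 0, 0, 1, 0],
    [1, None, 0, 1, 0],
    [1, 1, None, 1, 1],
    [0, 0, 0, None, 0],
    [1, 1, 0, 1, None],
]

# Number of pairs a respondent must judge: n(n - 1)/2.
n_pairs = len(brands) * (len(brands) - 1) // 2

# Sum the 1s in each column to get the number of times each brand was preferred.
times_preferred = {
    brand: sum(row[col] for row in matrix if row[col] is not None)
    for col, brand in enumerate(brands)
}

# Under transitivity, sorting by times preferred gives the rank order.
rank_order = sorted(brands, key=lambda b: times_preferred[b], reverse=True)

print(n_pairs)
print(times_preferred)
print(rank_order)
```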
Paired Comparison Scaling
The most common method of taste testing is paired comparison. The consumer is asked to sample two different products and select the one with the most appealing taste. The test is done in private and a minimum of 1,000 responses is considered an adequate sample. A blind taste test for a soft drink, where imagery, self-perception and brand reputation are very important factors in the consumer’s purchasing decision, may not be a good indicator of performance in the marketplace. The introduction of New Coke illustrates this point. New Coke was heavily favored in blind paired comparison taste tests, but its introduction was less than successful, because image plays a major role in the purchase of Coke.
Comparative Scaling Techniques Rank Order Scaling
Respondents are presented with several objects simultaneously and asked to order or rank them according to some criterion.
It is possible that the respondent may dislike the brand ranked 1 in an absolute sense. Furthermore, rank order scaling also results in ordinal data. Only (n - 1) scaling decisions need be made in rank order scaling.
Preference for Toothpaste Brands Using Rank Order Scaling
Instructions: Rank the various brands of toothpaste in order of preference. Begin by picking out the one brand that you like most and assign it a number 1. Then find the second most preferred brand and assign it a number 2. Continue this procedure until you have ranked all the brands of toothpaste in order of preference. The least preferred brand should be assigned a rank of 10. No two brands should receive the same rank number.
The criterion of preference is entirely up to you. There is no right or wrong answer. Just try to be consistent.
Preference for Toothpaste Brands Using Rank Order Scaling
Form
Brand              Rank Order
1. Crest           _________
2. Colgate         _________
3. Aim             _________
4. Gleem           _________
5. Sensodyne       _________
6. Ultra Brite     _________
7. Close Up        _________
8. Pepsodent       _________
9. Plus White      _________
10. Stripe         _________
Comparative Scaling Techniques Constant Sum Scaling
Respondents allocate a constant sum of units, such as 100 points to attributes of a product to reflect their importance. If an attribute is unimportant, the respondent assigns it zero points. If an attribute is twice as important as some other attribute, it receives twice as many points. The sum of all the points is 100. Hence, the name of the scale.
Importance of Bathing Soap Attributes Using a Constant Sum Scale
Instructions
On the next slide, there are eight attributes of bathing soaps. Please allocate 100 points among the attributes so that your allocation reflects the relative importance you attach to each attribute. The more points an attribute receives, the more important the attribute is. If an attribute is not at all important, assign it zero points. If an attribute is twice as important as some other attribute, it should receive twice as many points.
Importance of Bathing Soap Attributes Using a Constant Sum Scale
Form: Average Responses of Three Segments

Attribute            Segment I   Segment II   Segment III
1. Mildness              8           2            4
2. Lather                2           4           17
3. Shrinkage             3           9            7
4. Price                53          17            9
5. Fragrance             9           0           19
6. Packaging             7           5            9
7. Moisturizing          5           3           20
8. Cleaning Power       13          60           15
Sum                    100         100          100
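A constant sum response is only usable if the points add up to the required total. The sketch below checks that for the three segment averages and picks out each segment's most important attribute; the helper name `is_valid_allocation` is an assumption made for this illustration.

```python
attributes = [
    "Mildness", "Lather", "Shrinkage", "Price",
    "Fragrance", "Packaging", "Moisturizing", "Cleaning Power",
]

# Average points per attribute, in the order listed above.
segments = {
    "Segment I": [8, 2, 3, 53, 9, 7, 5, 13],
    "Segment II": [2, 4, 9, 17, 0, 5, 3, 60],
    "Segment III": [4, 17, 7, 9, 19, 9, 20, 15],
}


def is_valid_allocation(points, total=100):
    """Points must be non-negative and add up to the constant sum."""
    return all(p >= 0 for p in points) and sum(points) == total


# The attribute with the most points is the most important one for that segment.
top_attribute = {
    name: attributes[points.index(max(points))]
    for name, points in segments.items()
}

for name, points in segments.items():
    print(name, is_valid_allocation(points), top_attribute[name])
```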
Q – Sort Scaling
A comparative scaling technique that uses a rank order procedure to sort objects based on similarity with respect to some criterion.
Session - 5
Non - comparative Scaling Techniques
Respondents evaluate only one object at a time, and for this reason noncomparative scales are often referred to as monadic scales. Non-comparative techniques consist of continuous and itemized rating scales.
Continuous Rating Scale
Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other. The form of the continuous scale may vary considerably.
How would you rate Sears as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0 10 20 30 40 50 60 70 80 90 100
Version 3
Very bad . . . Neither good nor bad . . . Very good
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0 10 20 30 40 50 60 70 80 90 100
RATE: Rapid Analysis and Testing Environment
A relatively new research tool, the perception analyzer, provides continuous measurement of “gut reaction.” A group of up to 400 respondents is presented with TV or radio spots or advertising copy. The measuring device consists of a dial that contains a 100-point range. Each participant is given a dial and instructed to continuously record his or her reaction to the material being tested.
As the respondents turn the dials, the information is fed to a computer, which tabulates second-by-second response profiles. As the results are recorded by the computer, they are superimposed on a video screen, enabling the researcher to view the respondents' scores immediately. The responses are also stored in a permanent data file for use in further analysis. The response scores can be broken down by categories, such as age, income, sex, or product usage.
Itemized Rating Scales
The respondents are provided with a scale that has a number or brief description associated with each category.
The categories are ordered in terms of scale position, and the respondents are required to select the specified category that best describes the object being rated.
The commonly used itemized rating scales are the Likert, semantic differential, and Stapel scales.
Likert Scale
The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.
SD = strongly disagree, D = disagree, A = agree, SA = strongly agree.

                                            SD    D     Neither A or D    A    SA
1. Sears sells high quality merchandise.    1     2X    3                 4    5
2. Sears has poor in-store service.         1     2X    3                 4    5
3. I like to shop at Sears.                 1     2     3X                4    5
The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated. When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
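As a sketch of the summated score, the three Sears responses can be scored as follows. Treating "Sears has poor in-store service" as the only negative statement is an assumption made for this illustration, as are the variable and function names.

```python
SCALE_MIN, SCALE_MAX = 1, 5

# Category marked by the respondent for each statement (from the example form).
responses = {
    "Sears sells high quality merchandise.": 2,
    "Sears has poor in-store service.": 2,
    "I like to shop at Sears.": 3,
}

# Negatively worded statements are scored by reversing the scale.
negative_items = {"Sears has poor in-store service."}


def item_score(statement, response):
    """Reverse the scale (1<->5, 2<->4) for negative statements."""
    if statement in negative_items:
        return SCALE_MIN + SCALE_MAX - response
    return response


scored = {s: item_score(s, r) for s, r in responses.items()}
total_score = sum(scored.values())

print(scored)
print(total_score)
```

After reversal, a higher total consistently indicates a more favorable attitude toward the store.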
Semantic Differential Scale
The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.
SEARS IS:
Powerful    --:--:--:--:-X-:--:--:  Weak
Unreliable  --:--:--:--:--:-X-:--:  Reliable
Modern      --:--:--:--:--:--:-X-:  Old-fashioned
The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right. This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels. Individual items on a semantic differential scale may be scored on either a -3 to +3 or a 1 to 7 scale.
A Semantic Differential Scale for Measuring Self-Concepts, Person Concepts, and Product Concepts
1) Rugged          :---:---:---:---:---:---:---: Delicate
2) Excitable       :---:---:---:---:---:---:---: Calm
3) Uncomfortable   :---:---:---:---:---:---:---: Comfortable
4) Dominating      :---:---:---:---:---:---:---: Submissive
5) Thrifty         :---:---:---:---:---:---:---: Indulgent
6) Pleasant        :---:---:---:---:---:---:---: Unpleasant
7) Contemporary    :---:---:---:---:---:---:---: Obsolete
8) Organized       :---:---:---:---:---:---:---: Unorganized
9) Rational        :---:---:---:---:---:---:---: Emotional
10) Youthful       :---:---:---:---:---:---:---: Mature
11) Formal         :---:---:---:---:---:---:---: Informal
12) Orthodox       :---:---:---:---:---:---:---: Liberal
13) Complex        :---:---:---:---:---:---:---: Simple
14) Colorless      :---:---:---:---:---:---:---: Colorful
15) Modest         :---:---:---:---:---:---:---: Vain
Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
SEARS
+5              +5
+4              +4
+3              +3
+2              +2X
+1              +1
HIGH QUALITY    POOR SERVICE
-1              -1
-2              -2
-3              -3
-4X             -4
-5              -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.
Basic Non - comparative Scales
Continuous Rating Scale
• Basic characteristics: Place a mark on a continuous line
• Examples: Reaction to TV commercials
• Advantages: Easy to construct
• Disadvantages: Scoring can be cumbersome unless computerized
Itemized Rating Scales
Likert Scale
• Basic characteristics: Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
• Examples: Measurement of attitudes
• Advantages: Easy to construct, administer, and understand
• Disadvantages: More time-consuming
Semantic Differential
• Basic characteristics: Seven-point scale with bipolar labels
• Examples: Brand, product, and company images
• Advantages: Versatile
• Disadvantages: … to whether the …
Stapel Scale
• Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
• Examples: Measurement of attitudes and images
• Advantages: Easy to construct, administer over telephone
• Disadvantages: Confusing and …
Itemized Scale Decisions
1) Number of categories: Although there is no single, optimal number, traditional guidelines suggest that there should be between five and nine categories.
2) Balanced vs. unbalanced: In general, the scale should be balanced to obtain objective data.
3) Odd/even no. of categories: If a neutral or indifferent scale response is possible from at least some of the respondents, an odd number of categories should be used.
4) Forced vs. non-forced: In situations where the respondents are expected to have no opinion, the accuracy of the data may be improved by a non-forced scale.
5) Verbal description: An argument can be made for labeling all or many scale categories. The category descriptions should be located as close to the response categories as possible.
6) Physical form: A number of options should be tried and the best selected (horizontally or vertically).
Balanced and Unbalanced Scales
Balanced Scale
Jovan Musk for Men is:
Extremely good / Very good / Good / Bad / Very bad / Extremely bad
Unbalanced Scale
Jovan Musk for Men is:
Extremely good / Very good / Good / Somewhat good / Bad / Very bad
Rating Scale Configurations
A variety of scale configurations may be employed to measure the gentleness of Cheer detergent. Some examples include:
Cheer detergent is:
1) Very harsh  ---  ---  ---  ---  ---  ---  ---  Very gentle
2) Very harsh   1    2    3    4    5    6    7   Very gentle
3) . Very harsh
   .
   .
   . Neither harsh nor gentle
   .
   .
   . Very gentle
4) ____ Very harsh  ____ Harsh  ____ Somewhat harsh  ____ Neither harsh nor gentle  ____ Somewhat gentle  ____ Gentle  ____ Very gentle
5) Very harsh . . . Neither harsh nor gentle . . . Very gentle
Measurement Error – Difference between observed score and true score
Measurement Accuracy
The true score model provides a framework for understanding the accuracy of measurement.
XO = XT + XS + XR
where
XO = the observed score or measurement
XT = the true score of the characteristic
XS = systematic error (it affects the observed score in the same way each time)
XR = random error (situational factors)
Potential Sources of Error on Measurement
1) Other relatively stable characteristics of the individual that influence the test score, such as intelligence, social desirability, and education.
2) Short-term or transient personal factors, such as health, emotions, and fatigue.
3) Situational factors, such as the presence of other people, noise, and distractions.
4) Sampling of items included in the scale: addition, deletion, or changes in the scale items.
5) Lack of clarity of the scale, including the instructions or the items themselves.
6) Mechanical factors, such as poor printing, overcrowding items in the questionnaire, and poor design.
7) Administration of the scale, such as differences among interviewers.
8) Analysis factors, such as differences in scoring and statistical analysis.
Reliability
Reliability can be defined as the extent to which measures are free from random error, XR. If XR = 0, the measure is perfectly reliable. Random error produces inconsistency leading to lower reliability
Validity
The validity of a scale may be defined as the extent to which differences in observed scale scores reflect true differences among objects on the characteristic being measured, rather than systematic or random error. Perfect validity requires that there be no measurement error (XO = XT, XR = 0, XS = 0).
Relationship Between Reliability and Validity
If a measure is perfectly valid, it is also perfectly reliable. In this case XO = XT, XR = 0, and XS = 0. If a measure is unreliable, it cannot be perfectly valid, since at a minimum XO = XT + XR. Furthermore, systematic error may also be present, i.e., XS ≠ 0. Thus, unreliability implies invalidity. If a measure is perfectly reliable, it may or may not be perfectly valid, because systematic error may still be present (XO = XT + XS). Reliability is a necessary, but not sufficient, condition for validity.
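A small simulation can make the distinction concrete. The sketch below assumes hypothetical values for the true score and the systematic error and draws the random error from a normal distribution; none of these numbers come from the slides.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

X_T = 50.0  # hypothetical true score
X_S = 3.0   # hypothetical systematic error, the same on every measurement


def observe(random_sd):
    """One observed score X_O = X_T + X_S + X_R, with X_R ~ Normal(0, random_sd)."""
    return X_T + X_S + rng.gauss(0.0, random_sd)


# Perfectly reliable measure: no random error, so every observation is identical,
# yet each one is off by X_S, so the measure is not perfectly valid.
reliable = [observe(0.0) for _ in range(1000)]

# Unreliable measure: random error makes repeated observations inconsistent.
noisy = [observe(5.0) for _ in range(10000)]

print(set(reliable))
print(statistics.mean(noisy), statistics.stdev(noisy))
```

Averaging many noisy observations cancels most of the random error but leaves the systematic error in place, which is why reliability alone does not guarantee validity.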
Session - 6
Data Collection and Questionnaire
Collection of Data
Data can be obtained from:
• Secondary sources
• Internal records
• Primary sources
Collection of Data
Primary Data :
Questionnaire : Observation : Schedule, Interview form (telephone and personal interview)
Questionnaire Definition
A questionnaire is a formalized set of questions for obtaining information from respondents.
Questionnaire Objectives
It must translate the information needed into a set of specific questions that the respondents can and will answer. A questionnaire must uplift, motivate, and encourage the respondent to become involved in the interview, to cooperate, and to complete the interview. A questionnaire should minimize response error.
Questionnaire Design Process
1. Specify the Information Needed
2. Specify the Type of Interviewing Method
3. Determine the Content of Individual Questions
4. Design the Question to Overcome the Respondent’s Inability and Unwillingness to Answer
5. Decide the Question Structure
6. Determine the Question Wording
7. Arrange the Questions in Proper Order
8. Identify the Form and Layout
9. Reproduce the Questionnaire
10. Eliminate Bugs by Pre-testing
Individual Question Content – 1. Is the Question Necessary?
If there is no satisfactory use for the data resulting from a question, that question should be eliminated.
Individual Question Content – 2. Are Several Questions Needed Instead of One?
Sometimes, several questions are needed to obtain the required information in an unambiguous manner. Consider the question: “Do you think Coca-Cola is a tasty and refreshing soft drink?” (Incorrect) Such a question is called a double-barreled question, because two or more questions are combined into one. To obtain the required information, two distinct questions should be asked: “Do you think Coca-Cola is a tasty soft drink?” and “Do you think Coca-Cola is a refreshing soft drink?” (Correct)
Overcoming Inability To Answer – 1. Is the Respondent Informed?
In situations where not all respondents are likely to be informed about the topic of interest, filter questions that measure familiarity and past experience should be asked before questions about the topics themselves. A “don't know” option appears to reduce uninformed responses without reducing the response rate.
Overcoming Inability To Answer – 2. Can the Respondent Remember?
How many gallons of soft drinks did you consume during the last four weeks? (Incorrect)
How often do you consume soft drinks in a typical week? (Correct) 1. ___ Less than once a week 2. ___ 1 to 3 times per week 3. ___ 4 to 6 times per week 4. ___ 7 or more times per week
Overcoming Inability To Answer – 3. Can the Respondent Articulate?
Respondents may be unable to articulate certain types of responses, e.g., describe the atmosphere of a department store. Respondents should be given aids, such as pictures, maps, and descriptions to help them articulate their responses.
Overcoming Unwillingness To Answer – Effort Required of the Respondents
Most respondents are unwilling to devote a lot of effort to provide information.
Overcoming Unwillingness To Answer
Context Respondents are unwilling to respond to questions which they consider to be inappropriate for the given context. The researcher should manipulate the context so that the request for information seems appropriate. Legitimate Purpose Explaining why the data are needed can make the request for the information seem legitimate and increase the respondents' willingness to answer. Sensitive Information Respondents are unwilling to disclose, at least accurately, sensitive information because this may cause embarrassment or threaten the respondent's prestige or self-image.
Overcoming Unwillingness To Answer – Increasing the Willingness of Respondents
• Place sensitive topics at the end of the questionnaire.
• Preface the question with a statement that the behavior of interest is common.
• Ask the question using the third-person technique: phrase the question as if it referred to other people.
• Hide the question in a group of other questions which respondents are willing to answer. The entire list of questions can then be asked quickly.
• Provide response categories rather than asking for specific figures.
• Use randomized techniques.
Choosing Question Structure – Unstructured Questions
Unstructured questions are open-ended questions that respondents answer in their own words. What is your occupation? Who is your favorite actor? What do you think about people who shop at high-end department stores?
Choosing Question Structure – Structured Questions
Structured questions specify the set of response alternatives and the response format. A structured question may be multiple-choice, dichotomous, or a scale.
Choosing Question Structure – Multiple-Choice Questions
In multiple-choice questions, the researcher provides a choice of answers and respondents are asked to select one or more of the alternatives given.
Do you intend to buy a new car within the next six months? ____ Definitely will not buy ____ Probably will not buy ____ Undecided ____ Probably will buy ____ Definitely will buy ____ Other (please specify)
Choosing Question Structure – Dichotomous Questions
A dichotomous question has only two response alternatives: yes or no, agree or disagree, and so on. Often, the two alternatives of interest are supplemented by a neutral alternative, such as “no opinion,” “don't know,” “both,” or “none.” Do you intend to buy a new car within the next six months? _____ Yes _____ No _____ Don't know
Choosing Question Structure – Scales
Do you intend to buy a new car within the next six months? Definitely will not buy 1 Probably will not buy 2 Undecided 3 Probably will buy 4 Definitely will buy 5
Choosing Question Wording – Define the Issue
Define the issue in terms of who, what, when, where, why, and way (the six Ws). Who, what, when, and where are particularly important. Which brand of shampoo do you use? (Incorrect) Which brand or brands of shampoo have you personally used at home during the last month? In case of more than one brand, please list all the brands that apply. (Correct)
Choosing Question Wording – Use Unambiguous Words
In a typical month, how often do you shop in department stores? _____ Never _____ Occasionally _____ Sometimes _____ Often _____ Regularly (Incorrect) In a typical month, how often do you shop in department stores? _____ Less than once _____ 1 or 2 times _____ 3 or 4 times _____ More than 4 times (Correct)
Choosing Question Wording – Avoid Leading or Biasing Questions
A leading question is one that clues the respondent to what the answer should be, as in the following: Do you think that patriotic Americans should buy imported automobiles when that would put American labor out of work? _____ Yes _____ No _____ Don't know (Incorrect) Do you think that Americans should buy imported automobiles? _____ Yes _____ No _____ Don't know (Correct)
Choosing Question Wording – Avoid Implicit Alternatives
An alternative that is not explicitly expressed in the options is an implicit alternative.
1. Do you like to fly when traveling short distances? (Incorrect)
2. Do you like to fly when traveling short distances, or would you rather drive? (Correct)
Choosing Question Wording – Avoid Implicit Assumptions
Questions should not be worded so that the answer is dependent upon implicit assumptions about what will happen as a consequence.
1. Are you in favor of a balanced budget? (Incorrect)
2. Are you in favor of a balanced budget if it would result in an increase in the personal income tax? (Correct)
Determining the Order of Questions
Opening Questions The opening questions should be interesting, simple, and non-threatening. Type of Information As a general guideline, basic information should be obtained first, followed by classification, and, finally, identification information. Difficult Questions Difficult questions or questions which are sensitive, embarrassing, complex, or dull, should be placed late in the sequence.
Determining the Order of Questions
Effect on Subsequent Questions General questions should precede the specific questions (funnel approach).
Q1: “What considerations are important to you in selecting a department store?”
Q2: “In selecting a department store, how important is convenience of location?” (Correct)
Form and Layout
Divide a questionnaire into several parts.
The questions in each part should be numbered, particularly when branching questions are used.
The questionnaires should preferably be precoded.
The questionnaires themselves should be numbered serially.
Example of a Precoded Questionnaire
The American Lawyer
A Confidential Survey of Our Subscribers
(Please ignore the numbers alongside the answers. They are only to help us in data processing.)
1. Considering all the times you pick it up, about how much time, in total, do you spend reading or looking through a typical issue of THE AMERICAN LAWYER?
Less than 30 minutes.....................-1
30 to 59 minutes.........................-2
1 hour to 1 hour 29 minutes..............-3
1 1/2 hours to 1 hour 59 minutes.........-4
2 hours to 2 hours 59 minutes............-5
3 hours or more..........................-6
Reproduction of the Questionnaire
The questionnaire should be reproduced on good-quality paper and have a professional appearance. Questionnaires should take the form of a booklet rather than a number of sheets of paper clipped or stapled together. Each question should be reproduced on a single page (or double-page spread).
Vertical response columns should be used for individual questions.
Grids are useful when there are a number of related questions that use the same set of response categories. The tendency to crowd questions together to make the questionnaire look shorter should be avoided. Directions or instructions for individual questions should be placed as close to the questions as possible.
Pretesting
Pretesting refers to the testing of the questionnaire on a small sample of respondents to identify and eliminate potential problems. A questionnaire should not be used in the field survey without adequate pretesting.
All aspects of the questionnaire should be tested, including question content, wording, sequence, form and layout, question difficulty, and instructions.
The respondents for the pretest and for the actual survey should be drawn from the same population. Pretests are best done by personal interviews, even if the actual survey is to be conducted by mail, telephone, or electronic means, because interviewers can observe respondents' reactions and attitudes.
Pretesting
After the necessary changes have been made, another pretest could be conducted by mail, telephone, or electronic means if those methods are to be used in the actual survey. A variety of interviewers should be used for pretests. The pretest sample size varies from 15 to 30 respondents for each wave. Protocol analysis and debriefing are two commonly used procedures in pretesting.
Finally, the responses obtained from the pretest should be coded and analyzed.
Session - 7
Measurement of Central Tendency
Classification of Data
• Geographic, i.e., area-wise classification: cities, districts
• Chronological, i.e., on the basis of time: year-wise
• Qualitative, i.e., according to some attribute: male and female
• Quantitative, i.e., in terms of the magnitude of some characteristic: income
Formation of Frequency Distribution
e.g. Refrigerator sold each day in Oct. 2008 Classification according to class intervals
Class Limits Class intervals Class frequency Class Mid point
Tabulation
Simple Tables or one way table Two way Tables
Frequency Distribution
In a frequency distribution, one variable is considered at a time.
A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages for all the values associated with that variable.
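A frequency table of this kind can be sketched as follows. The daily sales figures are hypothetical, chosen only to illustrate the counts, percentages, and cumulative percentages.

```python
from collections import Counter

# Hypothetical data: number of refrigerators sold on each of ten days.
daily_sales = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]

counts = Counter(daily_sales)
n = len(daily_sales)

# One row per distinct value: (value, frequency, percentage, cumulative percentage).
table = []
cumulative = 0.0
for value in sorted(counts):
    pct = 100 * counts[value] / n
    cumulative += pct
    table.append((value, counts[value], pct, cumulative))

for row in table:
    print(row)
```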
• Measures of central tendency: mean, median, mode, etc.
• Quartile
• Measures of variation: range, interquartile range, variance and standard deviation, coefficient of variation
• Shape: symmetric, skewed, using box-and-whisker plots
• Coefficient of correlation
Summary Measures
• Central Tendency: Mean, Median, Mode, Geometric Mean
• Quartile
• Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Mean
Data: 100, 78, 65, 43, 94, 58
Mean: the sum of a collection of data values divided by the number of values
43 + 58 + 65 + 78 + 94 + 100 = 438
438 ÷ 6 = 73
Mean is 73
Mean
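Sample mean (n = sample size):
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
Population mean (N = population size):
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N} \]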
Mean
Direct Method : X
Mean
• The most common measure of central tendency
• Acts as ‘Balance Point’
• Affected by extreme values (outliers)
[Figure: two dot plots, one on a 0 to 10 axis with Mean = 5 and one on a 0 to 14 axis with Mean = 6]
Median
• Robust measure of central tendency
• Not affected by extreme values
[Figure: two dot plots, one on a 0 to 10 axis and one on a 0 to 14 axis; Median = 5 in both]
• In an ordered array, the median is the “middle” number
• If n or N is odd, the median is the middle number
• If n or N is even, the median is the average of the two middle numbers
Mode
• A measure of central tendency
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical data
• There may be no mode or several modes
[Figure: two dot plots, one with Mode = 9 and one with No Mode]
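The three measures can be compared with Python's standard `statistics` module. The first data set is the one from the mean example; the `base` and `with_outlier` lists are hypothetical values chosen to show the effect of an extreme observation.

```python
import statistics

# Data from the mean example.
data = [100, 78, 65, 43, 94, 58]
mean_value = statistics.mean(data)
median_value = statistics.median(data)  # even n: average of the two middle values
modes = statistics.multimode(data)      # every value occurs once, so there is no single mode

# Hypothetical data sets: the second replaces the largest value with an extreme one.
base = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]

print(mean_value, median_value, modes)
print(statistics.mean(base), statistics.mean(with_outlier))      # the mean shifts
print(statistics.median(base), statistics.median(with_outlier))  # the median does not
```

`statistics.multimode` returns every value tied for the highest frequency, so a result containing all the distinct values corresponds to the "no mode" case.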
Quartiles
• Q1, the first quartile, is the value such that 25% of the observations are smaller, corresponding to the (n+1)/4 ordered observation
• Q2, the second quartile, is the median: 50% of the observations are smaller, corresponding to the 2(n+1)/4 = (n+1)/2 ordered observation
• Q3, the third quartile, is the value such that 75% of the observations are smaller, corresponding to the 3(n+1)/4 ordered observation
Quartiles
Split Ordered Data into 4 Quarters
[Figure: ordered data split by Q1, Q2, and Q3 into four quarters of 25% each]
Position of ith Quartile
\[ \text{Position of } Q_i = \frac{i(n+1)}{4} \]
Data in Ordered Array: 11 12 13 16 16 17 17 18 21
\[ \text{Position of } Q_1 = \frac{1(9+1)}{4} = 2.5, \qquad Q_1 = \frac{12 + 13}{2} = 12.5 \]
Q2 = Median = 16, Q3 = 17.5
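The position rule can be sketched as a small function. Averaging the two neighbouring values at a position ending in .5 follows the worked example; rounding other fractional positions to the nearest whole position is an assumption of this sketch, since the slides do not cover that case.

```python
def quartile(sorted_data, i):
    """i-th quartile (i = 1, 2, 3) using the position rule i(n + 1)/4."""
    n = len(sorted_data)
    if n < 2:
        raise ValueError("need at least two observations")
    pos = i * (n + 1) / 4  # 1-based position in the ordered array
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return sorted_data[lower - 1]
    if frac == 0.5:
        # Halfway between two observations: average them.
        return (sorted_data[lower - 1] + sorted_data[lower]) / 2
    # Assumption: otherwise round to the nearest whole position.
    return sorted_data[round(pos) - 1]


data = [11, 12, 13, 16, 16, 17, 17, 18, 21]
q1, q2, q3 = (quartile(data, i) for i in (1, 2, 3))
interquartile_range = q3 - q1

print(q1, q2, q3, interquartile_range)
```

Library routines such as `statistics.quantiles` use interpolation rules that can give slightly different values from this position rule.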
Measures of Variation
Variation
• Range
• Interquartile Range
• Variance: Population Variance, Sample Variance
• Standard Deviation: Population Standard Deviation, Sample Standard Deviation
• Coefficient of Variation
Range
• Measure of variation
• Difference between the largest and the smallest observations:
\[ \text{Range} = X_{\text{Largest}} - X_{\text{Smallest}} \]
• Ignores the way in which data are distributed
[Figure: two differently distributed data sets on a 7 to 12 axis; in both, Range = 12 - 7 = 5]
Interquartile Range
• Measure of variation
• Also known as midspread
• Spread in the middle 50%
• Difference between the first and third quartiles
• Not affected by extreme values
Data in Ordered Array: 11 12 13 16 16 17 17 18 21
\[ \text{Interquartile Range} = Q_3 - Q_1 = 17.5 - 12.5 = 5 \]
Variance
• Important measure of variation
• Shows variation about the mean
Sample variance:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
Population variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
Standard Deviation
• Most important measure of variation
• Shows variation about the mean
• Has the same units as the original data
Sample standard deviation:
\[ S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
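The two divisors can be compared directly in code. The sketch below applies the formulas to the ordered array used in the quartile example; the function names are illustrative, and the standard library's `statistics.variance` and `statistics.pvariance` implement the same two definitions.

```python
import math

data = [11, 12, 13, 16, 16, 17, 17, 18, 21]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# The standard deviation is the square root of the variance,
# so it has the same units as the original data.
sample_sd = math.sqrt(sample_variance(data))
population_sd = math.sqrt(population_variance(data))

print(sample_variance(data), population_variance(data))
print(sample_sd, population_sd)
```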
Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
[Figure: three dot plots on an 11 to 21 axis, one for each data set]
Coefficient of Variation
• Measure of relative dispersion
• Always in %
• Shows variation relative to mean
• Used to compare 2 or more groups
Formula (Sample Coefficient of Variation):
\[ CV = \left( \frac{S}{\bar{X}} \right) \times 100\% \]
Session - 8
Skewness and Kurtosis
Review of Previous Lecture
Range
The difference between the largest and smallest values
Interquartile range
The difference between the 25th and 75th percentiles
Variance
The sum of squared deviations from the mean, divided by the population size N (population variance) or by n - 1 (sample variance)
Standard deviation
The square root of the variance
•Another Measure of Dispersion
•Coefficient of Variation (CV) •Skewness •Kurtosis
Measures of Dispersion – Coefficient of Variation
Coefficient of variation (CV) measures the spread of a set of data as a proportion of its mean. It is the ratio of the sample standard deviation to the sample mean, and is sometimes expressed as a percentage:
$CV = \dfrac{s}{\bar{x}} \times 100\%$
Measures of Skewness and Kurtosis
A fundamental task in many statistical analyses is to characterize the location and variability of a data set (measures of central tendency vs. measures of dispersion).
Both kinds of measure tell us nothing about the shape of the distribution.
A further characterization of the data includes skewness and kurtosis.
Skewness
Skewness measures the degree of asymmetry exhibited by the data
Skewness
Positive skewness
There are more observations below the mean than above it
Occurs when the mean is greater than the median
Negative skewness
There are a small number of low observations and a large number of high ones
Occurs when the median is greater than the mean
Shape of a Distribution
Describes how data is distributed
Measures of shape:
Mean > median: right-skewness
Mean < median: left-skewness
Mean = median: symmetric
[Figure: three distribution curves]
Left-Skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean
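The mean-versus-median rule can be sketched in Python as follows. The function name `describe_shape` and the second and third data sets are hypothetical, chosen only to illustrate each case; the first is the ordered array from the quartile slides.

```python
import math
import statistics


def describe_shape(values):
    """Classify the shape of a data set by comparing its mean and median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if math.isclose(mean, median):
        return "symmetric"
    # Mean pulled above the median: long right tail; below: long left tail.
    return "right-skewed" if mean > median else "left-skewed"


print(describe_shape([11, 12, 13, 16, 16, 17, 17, 18, 21]))  # slide data
print(describe_shape([1, 2, 3, 4, 100]))                     # hypothetical data
print(describe_shape([1, 2, 3, 4, 5]))                       # hypothetical data
```

This is only the rule of thumb stated on the slide; it compares two summary values and does not compute a skewness coefficient.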
Kurtosis
Kurtosis measures how peaked the histogram is
$\text{kurtosis} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4} - 3$
The kurtosis of a normal distribution is 0
Kurtosis characterizes the relative peakedness or flatness of a distribution compared to the normal distribution
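A minimal Python sketch of the kurtosis formula. It assumes that s is the sample standard deviation (divisor n - 1) as defined earlier in the lecture; the slide does not say which divisor is intended, and using the population divisor would give a somewhat different value. The data set is hypothetical.

```python
import statistics


def excess_kurtosis(values):
    """Kurtosis as on the slide: sum((x - mean)^4) / (n * s^4) - 3."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # assumption: sample s, divisor n - 1
    fourth_moment_sum = sum((x - mean) ** 4 for x in values)
    return fourth_moment_sum / (n * s ** 4) - 3


flat_data = [1, 2, 3, 4, 5]               # hypothetical, evenly spread values
k = excess_kurtosis(flat_data)
print(round(k, 3))
```

For these evenly spread values the result is negative, which is consistent with a relatively flat distribution.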
Kurtosis
Platykurtic: when the kurtosis < 0, the frequencies throughout the curve are closer to being equal (i.e., the curve is flatter and wider). Thus, negative kurtosis indicates a relatively flat distribution.
Leptokurtic: when the kurtosis > 0, there are high frequencies in only a small part of the curve (i.e., the curve is more peaked). Thus, positive kurtosis indicates a relatively peaked distribution.
Kurtosis
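[Figure: three frequency curves labelled k > 3, k = 3 and k < 3, plotted as Frequency against Value. Here k is the kurtosis before subtracting 3, so k = 3 corresponds to the normal distribution.]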
•Kurtosis is based on the size of a distribution's tails
•Negative kurtosis (platykurtic): distributions with short tails
•Positive kurtosis (leptokurtic): distributions with relatively long tails
TIME SERIES ANALYSIS
Statistical data that are collected, observed or recorded at successive intervals of time are referred to as a TIME SERIES. Time series analysis:
-helps in understanding past behavior
-helps in planning future operations
-helps in evaluating current accomplishments
-facilitates comparison
TIME SERIES ANALYSIS
Components of Time Series:
-Secular trends: general movement persisting over the long term
-Seasonal variations: patterns that repeat year after year
-Cyclical variations: fluctuations moving up and down every few years
-Irregular variations: variations in business activity that do not repeat over a definite period
Methods of Measurement
-Moving average method
-Method of least squares
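Both methods of measuring trend can be sketched in a few lines of Python. The sales figures and function names below are hypothetical, and the least-squares fit assumes a straight-line trend y = a + b·t with time coded as t = 0, 1, 2, ...

```python
def moving_average(series, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def least_squares_trend(series):
    """Fit y = a + b*t by least squares, with t = 0, 1, ..., n - 1."""
    n = len(series)
    times = list(range(n))
    t_mean = sum(times) / n
    y_mean = sum(series) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in zip(times, series))
        / sum((t - t_mean) ** 2 for t in times)
    )
    intercept = y_mean - slope * t_mean
    return intercept, slope


sales = [12, 15, 14, 18, 20, 19, 23]      # hypothetical yearly sales
smoothed = moving_average(sales, window=3)
a, b = least_squares_trend(sales)
print(smoothed)
print(f"trend: y = {a:.2f} + {b:.2f} t")
```

The moving average smooths out short-term fluctuations, while the fitted line summarizes the secular trend in a form that can be extended to future periods.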
Correlation Analysis
If two quantities vary in such a way that movements in one are accompanied by movements in the other, the quantities are said to be correlated. The statistical tool for measuring such a relationship is known as correlation, and the coefficient is denoted by r.
Types of correlation:
- Positive and negative
- Simple, partial and multiple
- Linear and non-linear
Scatter Plots and Correlation
A scatter plot (or scatter diagram) is used to show the relationship between two variables.
Correlation analysis is used to measure the strength of the association (linear relationship) between two variables.
It is only concerned with the strength of the relationship.
No causal effect is implied.
Scatter Plot Examples
[Figures: scatter plots of y against x illustrating linear relationships and curvilinear relationships, strong relationships and weak relationships, and no relationship]
Correlation Coefficient
The population correlation coefficient ρ (rho) measures the strength of the association between the variables
The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations
Features of r
Ranges between -1 and 1
The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
Calculating the Correlation Coefficient
$r = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$
or the algebraic equivalent:
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n(\sum x^2) - (\sum x)^2\right]\left[n(\sum y^2) - (\sum y)^2\right]}}$
where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable
For Example
Tree Height y    Trunk Diameter x    xy        y²         x²
35               8                   280       1225       64
49               9                   441       2401       81
27               7                   189       729        49
33               6                   198       1089       36
60               13                  780       3600       169
21               7                   147       441        49
45               11                  495       2025       121
51               12                  612       2601       144
Σ=321            Σ=73                Σ=3142    Σ=14111    Σ=713
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n(\sum x^2) - (\sum x)^2\right]\left[n(\sum y^2) - (\sum y)^2\right]}}$
$= \dfrac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} = 0.886$
r = 0.886 → relatively strong positive linear association between x and y
[Figure: scatter plot of Tree Height, y (0 to 70) against Trunk Diameter, x (0 to 14)]
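The tree example can be checked numerically. The sketch below computes r from the eight (x, y) pairs in the table using both the deviation form and the algebraic form of the formula; the function names are illustrative.

```python
from math import sqrt

# Data from the tree example: trunk diameter (x) and tree height (y).
diameters = [8, 9, 7, 6, 13, 7, 11, 12]
heights = [35, 49, 27, 33, 60, 21, 45, 51]


def r_deviation_form(x, y):
    """r from deviations about the means of x and y."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    denominator = sqrt(
        sum((xi - x_mean) ** 2 for xi in x)
        * sum((yi - y_mean) ** 2 for yi in y)
    )
    return numerator / denominator


def r_algebraic_form(x, y):
    """r from the raw sums: n, sum(x), sum(y), sum(xy), sum(x^2), sum(y^2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_dev = r_deviation_form(diameters, heights)
r_alg = r_algebraic_form(diameters, heights)
print(round(r_dev, 3), round(r_alg, 3))
```

Both forms give r ≈ 0.886, matching the hand calculation. The algebraic form is convenient when only the column totals are available.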
Calculations of Correlation when deviations are taken from Assumed Mean
Rank Correlation coefficient