BASIC STATISTICS
Applications in Business and Economics
• Accounting: Public accounting firms use statistical sampling procedures when conducting audits for their clients.
• Finance: Financial analysts use a variety of statistical information, including price-earnings ratios and dividend yields, to guide their investment recommendations.
• Marketing: Electronic point-of-sale scanners at retail checkout counters are being used to collect data for a variety of marketing research applications.
• Production: A variety of statistical quality control charts are used to monitor the output of a production process.
• Economics: Economists use statistical information in making forecasts about the future of the economy or some aspect of it.
From Anderson, Sweeney and Williams
Dr. Constance Lightner, Fayetteville State University
Statistics is the science of collecting, organizing and interpreting numerical and nonnumerical facts, which we call data. The collection and study of data are important in the work of many professions, so training in the science of statistics is valuable preparation for a variety of careers, for example as economists, financial advisors, business people, engineers and farmers. Knowledge of probability and statistical methods is also useful for informatics specialists in fields such as data mining and knowledge discovery. Whatever else it may be, statistics is, first and foremost, a collection of tools for converting raw data into information that helps decision makers in their work.
Definition of Statistics:
Statistics is the study of the collection, organization, graphical representation, analysis and interpretation of data. It is a branch of applied mathematics concerned with the collection and interpretation of quantitative data and with the use of probability theory to estimate population parameters.
History
• Some scholars pinpoint the origin of statistics to 1663, with the publication of Natural and Political Observations upon the Bills of Mortality by John Graunt. Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data, hence its stat- etymology. The scope of the discipline broadened in the early 19th century to include the collection and analysis of data in general. Today, statistics is widely employed in government, business, and the natural and social sciences.
• Its mathematical foundations were laid in the 17th century with the development of probability theory by Blaise Pascal and Pierre de Fermat. Probability theory arose from the study of games of chance. The method of least squares was first described by Carl Friedrich Gauss around 1794. Modern computers have expedited large-scale statistical computation and have also made possible new methods that are impractical to perform manually.
• Population: The entire aggregation of items from which samples can be drawn.
• Sample: A set of elements drawn from a population and analyzed to estimate the characteristics of that population.
• Parameter: A quantity (such as the mean or variance) that characterizes a statistical population and that can be estimated by calculations from sample data.
• Statistic: Any function of sample values that may be used to estimate a population parameter.
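The four definitions above can be illustrated with a small sketch. It assumes, purely for illustration, that the 32 stock values of Case study 1 form the whole population; the sample size of 8 and the random seed are arbitrary choices, not part of the course material.

```python
import random
import statistics

# Assumption: the 32 stock values from Case study 1 are treated as the population.
population = [
    105, 97, 121.5, 110.9, 108.8, 96.3, 85.1, 88,
    110.4, 100.7, 103.6, 109.6, 131, 116.2, 118.2, 109.5,
    112.7, 99, 128.6, 87.6, 107.4, 86.7, 112.3, 91.4,
    148.4, 107.9, 114.3, 103.7, 113.9, 91.3, 81.4, 111.3,
]

# Parameters: computed from every element of the population.
mu = statistics.mean(population)
sigma2 = statistics.pvariance(population)   # population variance (divides by N)

# Sample: a subset drawn at random without replacement (seed is arbitrary).
rng = random.Random(1)
sample = rng.sample(population, 8)

# Statistics: functions of the sample values that estimate the parameters.
x_bar = statistics.mean(sample)
s2 = statistics.variance(sample)            # sample variance (divides by n - 1)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
print(f"population variance:         {sigma2:.2f}")
print(f"sample variance:             {s2:.2f}")
```

A different seed gives a different sample and therefore different values of the statistics, while the parameters stay fixed.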
Descriptive and inferential statistics
• Descriptive statistics is devoted to the summarization and description of data (population or sample).
• Inferential statistics uses sample data to make an inference about a population.
Computer software for statistical analysis:
• Octave: programming language (very similar to MATLAB) with statistical features
• Simfit: simulation, curve fitting, statistics, and plotting
• MATLAB: programming language with statistical features
• Minitab: general statistics package
• SPSS: comprehensive statistics package
• Stata: comprehensive statistics package
• Statgraphics: general statistics package
• STATISTICA: comprehensive statistics package
• StatXact: package for exact nonparametric and parametric statistics
• Systat: general statistics package
VARIABLES
• Characteristics that vary between subjects
• Examples: height, weight, blood pressure
Can be grouped as:
• Qualitative variable
• Quantitative variable
QUANTITATIVE VARIABLES
• Discrete variable: countable; takes only whole-number values
• Continuous variable: measured on a continuum; can take any value within an interval
QUALITATIVE VARIABLES
• Do not possess numerical values
• Examples: color of hair, gender, blood group
• Three types:
  • Ordinal
  • Dichotomous
  • Nominal
Nominal data: data values are non-numeric group labels. For example, a gender variable can be coded as male = 0 and female = 1; the codes are labels only and carry no numerical meaning.
Ordinal data (sometimes called 'discrete data'): data values are categorical and may be ranked in some numerically meaningful way. For example, responses from "strongly disagree" to "strongly agree" may be coded as 1 to 5.
Interval data: data values range over a real interval, which can be as large as from negative infinity to positive infinity. The difference between two values is meaningful, but the ratio of two values is not. Examples: temperature, IQ. A statement such as "today is 1.2 times hotter than yesterday" is neither useful nor meaningful.
Ratio data: both the difference and the ratio of two values are meaningful. Examples: salary, weight.
Dichotomous: divided into two sharply distinguished parts or classifications.
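The difference between interval and ratio data can be checked numerically. The sketch below uses made-up temperatures and weights (they are not from the course material) and shows that a temperature ratio changes with the unit of measurement, while a weight ratio does not.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Hypothetical temperatures (interval data).
yesterday_c, today_c = 20.0, 24.0
yesterday_f = celsius_to_fahrenheit(yesterday_c)
today_f = celsius_to_fahrenheit(today_c)

ratio_c = today_c / yesterday_c      # ratio on the Celsius scale
ratio_f = today_f / yesterday_f      # ratio on the Fahrenheit scale
diff_c = today_c - yesterday_c       # difference in Celsius degrees
diff_f = today_f - yesterday_f       # same difference, rescaled by 9/5

# Hypothetical weights (ratio data): kilograms and pounds share a true zero.
KG_TO_LB = 2.20462
w1_kg, w2_kg = 60.0, 72.0
ratio_kg = w2_kg / w1_kg
ratio_lb = (w2_kg * KG_TO_LB) / (w1_kg * KG_TO_LB)

print(f"temperature ratio: {ratio_c:.3f} (Celsius) vs {ratio_f:.3f} (Fahrenheit)")
print(f"temperature difference: {diff_c:.1f} C = {diff_f:.1f} F")
print(f"weight ratio: {ratio_kg:.3f} (kg) vs {ratio_lb:.3f} (lb)")
```

The temperature ratio depends on where the scale puts its zero, which is why "1.2 times hotter" has no fixed meaning; the difference only changes by a constant factor.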
Sources of data
• Primary
1. Direct personal investigation
2. Indirect oral investigation
3. Local correspondents
4. Postal
5. Telephonic
• Secondary
1. Journals
2. Newspapers
3. Television
4. Internet, etc.
Data presentation
• Univariate data:
1. Classification using tally marks.
2. Classification using a stem-and-leaf diagram.
• Bivariate data: Classification using tally marks.
Case study 1
• Present the following data as a frequency distribution. Calculate relative frequencies, percentage frequencies and cumulative frequencies of both types (less-than and more-than).
Stock      1      2      3      4      5      6      7      8
Value    105     97  121.5  110.9  108.8   96.3   85.1     88

Stock      9     10     11     12     13     14     15     16
Value  110.4  100.7  103.6  109.6    131  116.2  118.2  109.5

Stock     17     18     19     20     21     22     23     24
Value  112.7     99  128.6   87.6  107.4   86.7  112.3   91.4

Stock     25     26     27     28     29     30     31     32
Value  148.4  107.9  114.3  103.7  113.9   91.3   81.4  111.3
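One possible way to work Case study 1 is sketched below. The class intervals (width 10, starting at 80, lower limit inclusive) are an assumption made for illustration; the case itself does not prescribe them.

```python
values = [
    105, 97, 121.5, 110.9, 108.8, 96.3, 85.1, 88,
    110.4, 100.7, 103.6, 109.6, 131, 116.2, 118.2, 109.5,
    112.7, 99, 128.6, 87.6, 107.4, 86.7, 112.3, 91.4,
    148.4, 107.9, 114.3, 103.7, 113.9, 91.3, 81.4, 111.3,
]

# Assumed classes: 80-90, 90-100, ..., 140-150 (lower limit included).
LOWER, WIDTH, N_CLASSES = 80, 10, 7

# Frequency of each class.
freq = [0] * N_CLASSES
for v in values:
    freq[int((v - LOWER) // WIDTH)] += 1

n = len(values)
relative = [f / n for f in freq]
percentage = [100 * r for r in relative]

# Less-than cumulative frequency: observations below each class's upper limit.
less_than = []
running = 0
for f in freq:
    running += f
    less_than.append(running)

# More-than cumulative frequency: observations at or above each class's lower limit.
more_than = [n - c + f for c, f in zip(less_than, freq)]

print("class      f   rel.    %     <upper  >=lower")
for i in range(N_CLASSES):
    lo = LOWER + i * WIDTH
    print(f"{lo:3d}-{lo + WIDTH:3d} {freq[i]:4d} {relative[i]:6.3f} "
          f"{percentage[i]:6.2f} {less_than[i]:7d} {more_than[i]:8d}")
```

Other class widths are equally valid; a narrower width would spread the 32 values over more classes.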
Case Study 2
The daily sales of a company over the last 26 working days are given below (in thousands of dollars):
25, 27, 18, 34, 19, 20, 41, 37, 18, 22, 22, 32, 29, 37, 42, 32, 37, 43, 19, 28, 36, 31, 30, 29
What is the population? What is the sample? Prepare a frequency distribution, taking suitable class intervals if needed. Also prepare a stem-and-leaf diagram for this data, draw a frequency polygon, and interpret the data.
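A stem-and-leaf diagram for Case study 2 can be built as in the sketch below. It uses the values exactly as listed; note that only 24 values appear in the case although it mentions 26 working days. Using the tens digit as the stem is an assumption that suits two-digit data.

```python
from collections import defaultdict

# Daily sales in thousands of dollars, as listed in the case (24 values).
sales = [25, 27, 18, 34, 19, 20, 41, 37, 18, 22, 22, 32,
         29, 37, 42, 32, 37, 43, 19, 28, 36, 31, 30, 29]

# Stem = tens digit, leaf = units digit; sorting first keeps leaves in order.
stems = defaultdict(list)
for x in sorted(sales):
    stems[x // 10].append(x % 10)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")

# The number of leaves per stem is also a frequency distribution
# with classes 10-19, 20-29, 30-39 and 40-49.
freq = {f"{s * 10}-{s * 10 + 9}": len(stems[s]) for s in sorted(stems)}
print(freq)
```

The same class frequencies, plotted against the class midpoints, give the points of the frequency polygon.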
Case Study 3
Bivariate classification: The following data show income (x) and percentage expenditure on food (y) in 25 families. Construct a bivariate frequency table and write down the marginal distributions of x and y. Also find:
i) The conditional distribution of x when y lies between 15 and 20.
ii) The conditional distribution of y when x is greater than or equal to 300.

X    546  628  310  420  600  689  400
Y     13   15   18   16   16   21   19

X    228  310  610  513  690  524
Y     27   26   20   18   10   12

X    600  300  425  554  328  317
Y     13   11   16   15   23   17

X    204  255  492  587  643  386
Y     30   29   18   21   19   18
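Case study 3 can be approached as sketched below. The class intervals (width 100 for x, width 5 for y, lower limits inclusive) and the reading of "between 15 and 20" as 15 ≤ y < 20 are assumptions made for illustration.

```python
from collections import Counter

x = [546, 628, 310, 420, 600, 689, 400,
     228, 310, 610, 513, 690, 524,
     600, 300, 425, 554, 328, 317,
     204, 255, 492, 587, 643, 386]
y = [13, 15, 18, 16, 16, 21, 19,
     27, 26, 20, 18, 10, 12,
     13, 11, 16, 15, 23, 17,
     30, 29, 18, 21, 19, 18]

def x_class(v):
    """Lower limit of the assumed income class (200-299, 300-399, ...)."""
    return (v // 100) * 100

def y_class(v):
    """Lower limit of the assumed expenditure class (10-14, 15-19, ...)."""
    return (v // 5) * 5

pairs = list(zip(x, y))

# Joint (bivariate) frequency table: one cell per (x class, y class).
joint = Counter((x_class(a), y_class(b)) for a, b in pairs)

# Marginal distributions: totals over the other variable.
marginal_x = Counter(x_class(a) for a, _ in pairs)
marginal_y = Counter(y_class(b) for _, b in pairs)

# Conditional distributions: restrict to the families meeting the condition.
x_given_y = Counter(x_class(a) for a, b in pairs if 15 <= b < 20)
y_given_x = Counter(y_class(b) for a, b in pairs if a >= 300)

x_labels = sorted(marginal_x)
y_labels = sorted(marginal_y)
print("x \\ y " + "".join(f"{c:5d}" for c in y_labels) + "  total")
for xc in x_labels:
    cells = "".join(f"{joint[(xc, yc)]:5d}" for yc in y_labels)
    print(f"{xc:5d} {cells}{marginal_x[xc]:7d}")
print("total " + "".join(f"{marginal_y[yc]:5d}" for yc in y_labels) + f"{len(pairs):7d}")
```

Dividing each conditional count by its total (12 families for the first condition, 22 for the second) turns the counts into conditional relative frequencies.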
Tabulation
• One way
• Two way
• Three way
Case Study 4
• The table below provides data on consumer installment credit outstanding at the end of June 1981, by type of credit and by holder, in millions of dollars.
• Organize the data in a two-way classification table.
• Also present it using a suitable diagram.

Type of credit     Type of holder       Amount outstanding (millions)
Automobile         Commercial banks     59,192
                   Credit unions        21,847
                   Finance companies    38,646
Revolving credit   Commercial banks     29,722
                   Credit unions        23,384
                   Finance companies     5,364
Mobile home        Commercial banks     10,179
                   Credit unions         3,990
                   Finance companies       486
                   Savings and loans     3,069
Other              Commercial banks     44,217
                   Credit unions        40,087
                   Finance companies    23,353
                   Savings and loans    10,895
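A two-way classification for Case study 4 might be assembled as in the sketch below. It assumes the amounts pair with the holders in the order in which both are listed in the case; that pairing is an assumption, so the per-holder figures should be checked against the original table. The totals per type of credit do not depend on it.

```python
# (type of credit, type of holder, amount in millions of dollars).
# Assumption: amounts are matched to holders in listed order.
credit = [
    ("Automobile", "Commercial banks", 59192),
    ("Automobile", "Credit unions", 21847),
    ("Automobile", "Finance companies", 38646),
    ("Revolving credit", "Commercial banks", 29722),
    ("Revolving credit", "Credit unions", 23384),
    ("Revolving credit", "Finance companies", 5364),
    ("Mobile home", "Commercial banks", 10179),
    ("Mobile home", "Credit unions", 3990),
    ("Mobile home", "Finance companies", 486),
    ("Mobile home", "Savings and loans", 3069),
    ("Other", "Commercial banks", 44217),
    ("Other", "Credit unions", 40087),
    ("Other", "Finance companies", 23353),
    ("Other", "Savings and loans", 10895),
]

holders = ["Commercial banks", "Credit unions", "Finance companies", "Savings and loans"]
types = ["Automobile", "Revolving credit", "Mobile home", "Other"]

# Two-way table: rows are types of credit, columns are holders (0 where absent).
two_way = {t: {h: 0 for h in holders} for t in types}
for kind, holder, amount in credit:
    two_way[kind][holder] = amount

type_totals = {t: sum(two_way[t].values()) for t in types}
holder_totals = {h: sum(two_way[t][h] for t in types) for h in holders}
grand_total = sum(type_totals.values())

print(f"{'':18}" + "".join(f"{h:>20}" for h in holders) + f"{'Total':>10}")
for t in types:
    print(f"{t:18}" + "".join(f"{two_way[t][h]:>20,}" for h in holders)
          + f"{type_totals[t]:>10,}")
print(f"{'Total':18}" + "".join(f"{holder_totals[h]:>20,}" for h in holders)
      + f"{grand_total:>10,}")
```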
Case study 5
• In 2008, out of a total of 1,750 workers of a company, 1,200 were members of a trade union. The number of women employed was 200, of whom 175 did not belong to the trade union. In 2009 the number of union workers increased to 1,580, of whom 1,290 were men. On the other hand, the number of non-union workers fell to 208, of whom 180 were men. In 2010, there were 1,800 employees who belonged to the trade union and 50 who did not. Of all the employees in 2010, 300 were women, of whom only 8 did not belong to the trade union. Present the above information in tabular form and draw a suitable diagram.
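The missing cells of the Case study 5 table follow by subtraction from the figures given. The sketch below derives them and prints a three-way table (year by union membership by sex); the helper `make_row` and the layout are illustrative choices only.

```python
def make_row(union_men, union_women, non_men, non_women):
    """Bundle the four cell counts for one year."""
    return {
        "union": {"men": union_men, "women": union_women},
        "non-union": {"men": non_men, "women": non_women},
    }

# 2008: total 1750, union 1200, women 200, non-union women 175.
union_women_08 = 200 - 175                 # women in the union
union_men_08 = 1200 - union_women_08       # remaining union members are men
non_men_08 = (1750 - 1200) - 175           # non-union workers who are not women

# 2009: union 1580 (1290 men), non-union 208 (180 men).
union_women_09 = 1580 - 1290
non_women_09 = 208 - 180

# 2010: union 1800, non-union 50, women 300, non-union women 8.
union_women_10 = 300 - 8
union_men_10 = 1800 - union_women_10
non_men_10 = 50 - 8

table = {
    2008: make_row(union_men_08, union_women_08, non_men_08, 175),
    2009: make_row(1290, union_women_09, 180, non_women_09),
    2010: make_row(union_men_10, union_women_10, non_men_10, 8),
}

def year_total(row):
    """Total number of employees in one year."""
    return sum(sum(group.values()) for group in row.values())

print("Year  Union(M)  Union(W)  Non-union(M)  Non-union(W)  Total")
for year, row in table.items():
    print(f"{year}  {row['union']['men']:8d}  {row['union']['women']:8d}  "
          f"{row['non-union']['men']:12d}  {row['non-union']['women']:12d}  "
          f"{year_total(row):5d}")
```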
Graphical presentation/distribution of data:
• Bar diagram
• Histogram
• Line diagram
• Pie chart
• Scatter plot
Tabulating and Graphing Bivariate Categorical Data
• Contingency tables: investment in thousands of dollars

Investment category   Investor A   Investor B   Investor C   Total
Stocks                      46.5           55         27.5     129
Bonds                         32           44           19      95
CD                          15.5           20         13.5      49
Savings                       16           28            7      51
Total                        110          147           67     324
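The margins of the contingency table can be recomputed from its cells, as in the sketch below. The row percentages at the end are an extra step added for illustration; they make the three investors comparable despite their different totals.

```python
# Investment in thousands of dollars, by investor and category.
investments = {
    "Investor A": {"Stocks": 46.5, "Bonds": 32, "CD": 15.5, "Savings": 16},
    "Investor B": {"Stocks": 55, "Bonds": 44, "CD": 20, "Savings": 28},
    "Investor C": {"Stocks": 27.5, "Bonds": 19, "CD": 13.5, "Savings": 7},
}
categories = ["Stocks", "Bonds", "CD", "Savings"]

# Margins: total per investor, total per category, and the grand total.
investor_totals = {inv: sum(amounts.values()) for inv, amounts in investments.items()}
category_totals = {cat: sum(investments[inv][cat] for inv in investments)
                   for cat in categories}
grand_total = sum(investor_totals.values())

# Share of each category within an investor's own portfolio, in percent.
shares = {
    inv: {cat: 100 * amounts[cat] / investor_totals[inv] for cat in categories}
    for inv, amounts in investments.items()
}

for inv in investments:
    line = ", ".join(f"{cat} {shares[inv][cat]:.1f}%" for cat in categories)
    print(f"{inv} (total {investor_totals[inv]:g}): {line}")
print("Category totals:", category_totals, "Grand total:", grand_total)
```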
© 2002 Prentice-Hall, Inc.
[Figure: bar chart "Comparing Investors"; categories Savings, CD, Bonds, Stocks; axis from 0 to 60; series Investor A, Investor B, Investor C]