|
What is statistics? |
- Statistics is the science of observing, recording, testing, analysing & synthesizing virtual or real events to uncover hidden order, patterns or trends for an intended purpose
|
|
What inspired statistics? |
- Like all branches of science, statistics did not start out as a formal discipline
- In daily activities, people are faced with events, often recurring, uncertain, risky & with decisions to be made
- Farmer: How is the weather today or during this period? How best to arrange my crop portfolio within the expected weather conditions, ground conditions & farming practices?
- Commuter: what are the transportation conditions like? The arrival, travelling & destination times of various modes of transport? Which routes offer shortest travel times with least congestion? Should I get public or private transport & how do I make my decisions?
- Businessman: what are the market conditions in certain circumstances, periods, regulations & seasons? How are the demand & supply conditions – the fluctuations, expectations, confidence? How sure am I of the future of the business, its trends & development?
- Engineer: what are my goals, needs & available resources? How to plan & arrange them to meet my goals? What relationships are present between various parameters – theoretical, empirical or stochastic? How to assess / test the validity, confidence & limitations of proposed relations? Whether & when to use the parameters for prediction?
|
|
Then what are the uses of statistics? |
- Uncertainty: quantifies the range of confidence & the limits of uncertainties & risks
- Decisions: serves as evidence in relation to defined objectives; influences decision-making over strategy, structure, implementation & others
- Understanding: derives further information & knowledge from raw data & substantiates research efforts
|
|
How do we identify statistics? |
- Statistics is really part of everyday phenomena – its scope is wide & it is thus easily identifiable
- Recording: collection & storage of data on the target phenomenon, like wave actions, weather conditions, traffic, etc.
- Processing: through machines or thought, the observed data are reduced into findings, implications & conclusions
- Presentation: statistical results in the form of relations, quotes, percentages, charts, distributions, figures, tables, projections, etc.
- Probability: statistics is linked to probability theory as both constitute the building blocks of stochastic studies
|
|
Fundamental general concepts? |
- Population: all possible values of all parameters of the studies – infinite or finite, continuous or discrete
- Sample: a collection of data (sample size, n) from the population, thus sample < population; the purpose of statistical investigations is to gather & infer information from the sample about the population at large; a sample may be random (according to certain principles), independent (little or no relationships) or dependent, non-biased or biased
- Common parameters: mean μ, standard deviation (sample s, population σ), variance (sample s², population σ²), size (sample n, population N), errors (α: reject when true, & β: fail to reject when false)
- Point estimation: determination of numeric value for each parameter; using method of moments or method of maximum likelihood (stationary: 1st derivative = 0)
- Distribution: different samples produce distributions for the parameters – Normal Z, t T, chi-square χ², F, etc.
- Interval estimation: interval where the parameter might be located with a stated degree of confidence; for a given confidence level, the narrower the interval, the more precise the parameter estimate; the confidence interval narrows when n increases, σ is smaller or α increases
- Hypothesis testing: decide whether to reject or not reject (rather than accept) a statement on a parameter
- Goodness of fit test: compare observed distribution with theoretical distribution – chi-square (χ²) test & Kolmogorov-Smirnov (KS) test
- Regression: a probabilistic relationship in which the mean & variance of one parameter are expressed as a function of the values of other parameter(s); using the method of least squares, it gives the estimated value, the interval & the correlation between the parameters; regression may be linear or non-linear, single or multiple
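
The interval estimation idea above can be illustrated with a minimal Python sketch. It assumes the simplest textbook case: a Normal population with known standard deviation σ, for which the two-sided interval is x̄ ± z·σ/√n. The function name `mean_confidence_interval` and the sample values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean


def mean_confidence_interval(sample, sigma, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for the population mean.

    Assumes the population standard deviation `sigma` is known, so the
    standard Normal (Z) distribution supplies the critical value.
    """
    n = len(sample)
    x_bar = mean(sample)                      # point estimate of the mean
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha/2 in each tail
    half_width = z * sigma / sqrt(n)          # shrinks as n grows or sigma falls
    return x_bar - half_width, x_bar + half_width


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
low, high = mean_confidence_interval(sample, sigma=0.2, alpha=0.05)
print(f"95% confidence interval for the mean: ({low:.3f}, {high:.3f})")
```

Re-running the sketch with a larger n, a smaller σ or a larger α gives a narrower interval, as the half-width term z·σ/√n suggests.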
|
|
Tools required? |
- Known stochastic distributions, derived from theoretical sources
- Calculator statistics skills or spreadsheet skills
- Overall solution procedure: parameters, statement, testing, results, implications
- Mathematics: representation of seemingly random data as a statistical problem with defined symbols, parameters & relationships
|
|
What is hypothesis testing? |
- To reject or not reject a hypothesis at the level of significance α:
- Null H0 & alternative H1
- Level of significance α
- Using sample distribution or assumed known distribution, derive critical region(s) & critical value(s)
- Value of the statistic to be tested
- Decide whether or not to reject the null H0
- Testing 1 mean with known σ²: Normal
- Testing 2 means with known σ²: Normal
- Testing 1 variance: χ²
- Testing 2 variances: F
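
As a sketch of the procedure above for the first case (testing 1 mean with known σ², using the Normal distribution), the following Python example works through a two-sided test. The function name `z_test_one_mean`, the sample values and the hypothesised means are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_one_mean(sample, mu0, sigma, alpha=0.05):
    """Two-sided test of H0: mean = mu0 against H1: mean != mu0.

    Assumes the population standard deviation `sigma` is known.
    Returns (z statistic, critical value, p-value, reject H0?).
    """
    n = len(sample)
    # Value of the statistic to be tested.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Critical value: the critical region is |z| > z_crit.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-sided p-value under the standard Normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


# Hypothetical measurements, for illustration only.
sample = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
z, z_crit, p_value, reject = z_test_one_mean(sample, mu0=10.5, sigma=0.2)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

The decision is reported as "reject" or "do not reject" H0, never "accept", in line with the wording used in the notes.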
|