Unit 1.1
Knowledge, information and data
Definitions taken from FOLDOC
Data, Information and Knowledge
Data on its own has no meaning.
Only when interpreted by some kind of data processing system does data take on meaning and become information.
Information is data with meaning.
Knowledge is information that has been processed.
Knowledge differs from data or information in that new knowledge may be created from existing knowledge using logical inference. If information is data plus meaning then knowledge is information plus processing.
Example:
Data:            12,16,15,13
Information: The price of 100 widgets is as follows:
                    Company A: £12.00
                    Company B: £16.00
                    Company C: £15.00
                    Company D: £13.00
Knowledge: Deciding to order widgets from Company A because it is the cheapest
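The widget example can be sketched in a few lines of Python. This is only an illustration: the function names to_information and to_knowledge are made up for this sketch, and it assumes the four numbers are listed in the same order as the companies.

```python
# Raw data: four numbers with no meaning on their own.
data = [12, 16, 15, 13]

# Assumed context (taken from the example): each number is the price of
# 100 widgets from one company, listed in this order.
companies = ["Company A", "Company B", "Company C", "Company D"]


def to_information(values, labels):
    """Attach meaning to the data: pair each price with its company."""
    return dict(zip(labels, values))


def to_knowledge(prices):
    """Process the information: pick the company with the lowest price."""
    return min(prices, key=prices.get)


information = to_information(data, companies)
knowledge = to_knowledge(information)

for company, price in information.items():
    print(f"{company}: {price:.2f}")
print(f"Order widgets from {knowledge} because it is the cheapest.")
```

The list on its own is data, the dictionary that labels each price is information, and the decision that comes out of processing it is knowledge.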
Sources of Data
Direct data
is data which is collected for a specific purpose. 
e.g. A scanner reading a bar code in a supermarket
Indirect data
is data which is used for a purpose for which it was not originally intended.
e.g. A company using customer information collected for product registrations to predict future sales.
Quality of Data
Data is of poor quality if it is
out-of-date,
inaccurate,
irrelevant or
incomplete
The source of data has an effect on these four factors.
If a direct data source is used:
* the information is more likely to be recent (up-to-date),
* the information is more likely to be correct (accurate),
* there is less chance of important information being missing (complete),
* irrelevant data will not have been collected.
If an indirect data source is used:
* the chances are that the data will have been collected some time ago, and so will be out-of-date.
* some of the information may have changed (become inaccurate), for example, someone may have moved house.
* the data was originally collected for some other purpose. This means it is likely that not all the information required for the new task was collected. For example, if the information was collected for an invoice, then it will not include the person's age, which may be useful for marketing purposes.
* a large amount of information which is of no use for the new application will have been collected. This information is irrelevant.
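Some of these quality problems can be checked for automatically. The sketch below is a rough illustration only: the function quality_problems, the field names and the one-year age limit are all assumptions made for this example. Inaccurate data is not checked, because spotting it needs something to compare the record against.

```python
from datetime import date


def quality_problems(record, required_fields, collected_on, today, max_age_days=365):
    """Return a list of quality problems found in one record."""
    problems = []
    # Out-of-date: the data was collected too long ago (limit is assumed).
    if (today - collected_on).days > max_age_days:
        problems.append("out-of-date")
    # Incomplete: a field needed for the new task is missing or empty.
    missing = [f for f in required_fields if not record.get(f)]
    if missing:
        problems.append("incomplete: missing " + ", ".join(missing))
    # Irrelevant: fields that the new task has no use for.
    unused = [f for f in record if f not in required_fields]
    if unused:
        problems.append("irrelevant: " + ", ".join(unused))
    return problems


# Indirect data: a (hypothetical) invoice record reused for marketing.
invoice_record = {"name": "A. Customer", "address": "1 High Street", "invoice_total": 120.0}
marketing_fields = ["name", "address", "age"]

problems = quality_problems(
    invoice_record,
    marketing_fields,
    collected_on=date(2020, 1, 1),
    today=date(2023, 1, 1),
)
for problem in problems:
    print(problem)
```

The invoice record is flagged as out-of-date, as incomplete (no age) and as containing an irrelevant field (the invoice total), which matches the points made about indirect data sources.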
Read Chapters 8 and 9 in Heathcote
Answer the questions in the Case study: Collecting Information
Answer all the questions in the exercise at the end of chapter 8
Presentation of Data
The presentation of information is as important as the quality.
If the information is presented in an unclear way, then
the listeners may misunderstand. If this happens,
then all the effort that went into making the information up-to-date,
complete, correct and relevant will have been wasted.
If the method of presentation is too slow (or if it takes too long to prepare the presentation),
then the information may be useless by the time it reaches its intended audience.
Humans can only take in about 4 or 5 points in a 45-minute presentation,
so clearly it is important to choose those points carefully
and use an appropriate method to present them.
Methods of presentation
Methods which can be used to present data include:
* Hard copy (Printouts)
* VDU
* Orally - through the grapevine
* Videoconferencing
* Intranet
* Web Site
The method of presentation will depend on the audience and the information itself.
Read Chapter 42 in Heathcote
Prepare a presentation on the topic given to you by your teacher.
Data Capture
Direct data capture
is entering data into a computer using some form of scanning.
The user does not enter the data themselves.
This should be a more reliable method of entering data than manual entry, but
read errors are still possible. Validation or verification must still be used.
Indirect data capture
is when a data entry clerk must enter the data manually.
This tends to produce a greater percentage of errors than direct data capture,
which reduces the quality of the data.
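Validation and verification can both be illustrated with short checks. The sketch below assumes, for validation, that the bar code is a 13-digit EAN-13 number, whose last digit is a check digit worked out from the other twelve. For verification it assumes double entry, where the data is typed twice and the two copies are compared. The function names are made up for this example.

```python
def ean13_is_valid(code):
    """Validation: check the final (check) digit of a 13-digit bar code."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10 == digits[12]


def entries_match(first_entry, second_entry):
    """Verification: compare two separate entries of the same data."""
    return first_entry.strip() == second_entry.strip()


print(ean13_is_valid("4006381333931"))  # correctly read code
print(ean13_is_valid("4006381338931"))  # one digit misread by the scanner
print(entries_match("Company A", "Company A "))
print(entries_match("Company A", "Conpany A"))
```

A check digit catches many read errors from direct data capture, and double entry catches many typing errors from indirect data capture. Neither check proves that the data is actually correct.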
Read Chapters 15 and 16 in Heathcote
Answer Question 1 (Case study) at the end of chapter 16
Read Chapter 43 in Heathcote
Answer all the questions in the exercise at the end of chapter 43
The Cost of Information
Information costs money.
There are several areas where this money has to be spent so that complete, accurate,
up-to-date, relevant and well-presented data can be produced:
* Software
* Hardware
* Personnel
Software
Apart from the package which will be used to store the collected information,
there are several types of software which must be used.
> Software for producing and reading the data collection forms: especially if direct data capture is used.
OMR forms require special software, which may also need to print a bar code on each page so that the page can be identified.
> Software for security: the information may need to be encrypted or password protected to prevent unauthorised access.
> Presentation software: for producing the information in a way which can be easily understood.
Hardware
> Scanners for reading the data collection forms
> Projectors for presentation
Personnel
A wide range of personnel will be required to produce good quality information:
> Data Collectors: for collecting the data!!!
> Data Entry Clerks: for entering the collected data into the computer
> Analysts: for interpreting and presenting the information
> System managers and technicians: to maintain and monitor the hardware