Unit 2.4
Data
Data Representation
Data can be stored and
presented in many different ways.
You see a partial
list of data storage formats every time you set the field type
for a field in Access.
For this course you need to know
that data can be represented in numeric, text and boolean formats
as follows:
Numeric
Data Formats
currency: usually a number with 2 decimal places and
a currency symbol at the beginning
eg �2,999.99
real:
a number with one or more decimal
places.
The range and
accuracy of these numbers will be determined by the number of
bytes used to store them.
In MS Access the single
and double
number formats allow real numbers.
eg 3.142
integer:
a whole number. ie there are no decimal places
eg 29
date:
this format can be represented in
several different ways. It can also include the time.
eg MS Access formats are:
General date eg 19/06/1994 17:22:15
Short date eg 19/6/94
Medium date eg 19-Jun-94
Long date eg 19 June 1994
Short time eg 17:22
Medium date eg 05:22 PM
Long time eg 17:22:15
Text Data Formats
character:
allows storage of a single letter, number or symbol
eg 'q'
string:
allows storage of 'words'.
The words can include letters, numbers or symbols.
eg 'tree43?'
Some
applications limit the length of strings that can be stored.
Others allow any number of characters or
allow the user to set the length.
Boolean Data Format
Boolean format is sometimes called
logical format
Fields (or variables) which are defined as Boolean
can only store the values 'true' or 'false'
Data
Transcription Errors
Data entry clerks can make mistakes when entering data.
These mistakes are called transcription errors because they happen when data is being
transcribed (copied) from paper to the computer.
There are several types of
transcription error:
omission: characters are
completely left out
eg Tee instead of Tree
transposition: characters are
swapped around
eg Tere instead of Tree
substitution: characters are
replaced by other characters which look similar
eg Fa11 instead
of Fall.
(NB this is useful when creating safer passwords -
M1rr0r is a safer password than Mirror)
Data
Validation and Verification
Validation is where data is
checked to see if it is sensible in the context for which it is
intended.
Verification is where data is
checked to see if any erors such as words being mistyped has
occured.
Validation / Verification Techniques
Range check: Defines a high or low limit on the
values in a field / variable.
eg the date must be equal to or
greater than today's date in an airline booking system.
Presence check: Some fields can be defined so
that they must have data entered.
If they are left blank then an
error has occured.
eg an address field on a delivery form.
Format check: Data must be entered
in a particular format (like an input mask in MS Access).
eg a
national insurance number must be 2 letters, then 6 numbers, then 1 letter.
Check digit: This is an extra digit added to the
end of a number.
The check digit is calculated from the
other digits in the number using a formula.
The same formula can
then be used to recalculate the check digit.
If the original and
re-calculated numbers are different then an error has occured.
The best known method for calculating check digits is Modulus-11.
This method is used on ISBN numbers.
Modulus-11 is calculated as follows:
1. Each digit in the number is given a weight.
The least significant digit has a weight of 2, the next digit 3,
etc.
2. Each digit is multiplied by it's weight and
the results are added together.
3. The total is divided by 11 and the remainder
is obtained.
4. The remainder is subtracted from 11 to give
the check digit.
There are 2 exceptions:
If the remainder is 0, the check digit is 0, not
11.
If the remainder is 1, the check digit is x, not
10.
eg The ISBN for A Level Computing by P. Heathcote
is 1-903112-21-4
The check digit is 4, leaving the number as
190311221
Introducing the weighting gives:
1 x 10 = 10
9 x 9 = 81
0 x 8 = 0
3 x 7 = 21
1 x 6 = 6
1 x 5 = 5
2 x 4 = 8
2 x 3 = 6
1 x 2 = 2
Adding the results gives: 10 + 81 + 0 + 21 + 6 +
5 + 8 + 6 + 2 = 139
Dividing 139 by 11 gives 12 remainder 7
subtracting 7 from 11 gives 4 - the original
check digit!
Batch Totals: These are similar to check digits.
The difference is as follows:
Check digits
are applied to individual pieces of data;
Batch totals are
applied to groups of data that are being transmitted together in
a batch.
There are 2 ways of creating batch totals:
Control total: A number representing the total
number of files or records in the batch
or:
A number calculated from adding together the values of the controlled
field
eg Monday Sales �345
Tuesday Sales �402
Wednesday Sales �367
Thursday Sales �348
Friday Sales � 413
Control Total = 345 + 402 + 367 + 348 + 413 = 1775
Hash total: Similar to Control totals except that a hash total is meaningless.
eg The hash total is controlling a field called customer number
Using customer id 12367 and 13452 gives a hash total of 25819
This can be used to check for errors, but it doesn't mean anything.