Unit 2.4
Data



Data Representation

Data can be stored and presented in many different ways.
You see a partial list of data storage formats every time you set the field type for a field in Access.

For this course you need to know that data can be represented in numeric, text and boolean formats as follows:



Numeric Data Formats

currency:
   usually a number with 2 decimal places and a currency symbol at the beginning
                    eg �2,999.99

real:           a number with one or more decimal places.
                    The range and accuracy of these numbers will be determined by the number of bytes used to store them.
                    In MS Access the single and double number formats allow real numbers.
                    eg 3.142

integer:      a whole number. ie there are no decimal places
                    eg 29

date:          this format can be represented in several different ways. It can also include the time.
                    eg MS Access formats are:
                    General date eg 19/06/1994 17:22:15
                    Short date eg 19/6/94
                    Medium date eg 19-Jun-94
                    Long date eg 19 June 1994
                    Short time eg 17:22
                    Medium date eg 05:22 PM
                    Long time eg 17:22:15




Text Data Formats

character:
  allows storage of a single letter, number or symbol
                    eg 'q'

string:        allows storage of 'words'. The words can include letters, numbers or symbols.
                    eg 'tree43?'
                    Some applications limit the length of strings that can be stored.
                    Others allow any number of characters or allow the user to set the length.




Boolean Data Format
Boolean format is sometimes called logical format
Fields (or variables) which are defined as Boolean can only store the values 'true' or 'false'










Data Transcription Errors
Data entry clerks can make mistakes when entering data. These mistakes are called transcription errors because they happen when data is being transcribed (copied) from paper to the computer.

There are several types of transcription error:

omission:        characters are completely left out
                        eg Tee instead of Tree

transposition: characters are swapped around
                        eg Tere instead of Tree

substitution:   characters are replaced by other characters which look similar
                        eg Fa11 instead of Fall.
                        (NB this is useful when creating safer passwords - M1rr0r is a safer password than Mirror)

Data Validation and Verification

Validation is where data is checked to see if it is sensible in the context for which it is intended.
Verification is where data is checked to see if any erors such as words being mistyped has occured.


Validation / Verification Techniques

Range check:
     Defines a high or low limit on the values in a field / variable.
                            eg the date must be equal to or greater than today's date in an airline booking system.

Presence check: Some fields can be defined so that they must have data entered.
                            If they are left blank then an error has occured.
                            eg an address field on a delivery form.

Format check:    Data must be entered in a particular format (like an input mask in MS Access).
                            eg a national insurance number must be 2 letters, then 6 numbers, then 1 letter.

Check digit:       This is an extra digit added to the end of a number.
                            The check digit is calculated from the other digits in the number using a formula.
                            The same formula can then be used to recalculate the check digit.
                            If the original and re-calculated numbers are different then an error has occured.
                            The best known method for calculating check digits is Modulus-11.
                            This method is used on ISBN numbers.

                            Modulus-11 is calculated as follows:

                            1. Each digit in the number is given a weight.
                                The least significant digit has a weight of 2, the next digit 3, etc.
                            2. Each digit is multiplied by it's weight and the results are added together.
                            3. The total is divided by 11 and the remainder is obtained.
                            4. The remainder is subtracted from 11 to give the check digit.
                                There are 2 exceptions:
                                If the remainder is 0, the check digit is 0, not 11.
                                If the remainder is 1, the check digit is x, not 10.


                            eg The ISBN for A Level Computing by P. Heathcote is 1-903112-21-4

                            The check digit is 4, leaving the number as 190311221

                            Introducing the weighting gives:

                            1 x 10 = 10
                            9 x 9 = 81
                            0 x 8 = 0
                            3 x 7 = 21
                            1 x 6 = 6
                            1 x 5 = 5
                            2 x 4 = 8
                            2 x 3 = 6
                            1 x 2 = 2

                            Adding the results gives: 10 + 81 + 0 + 21 + 6 + 5 + 8 + 6 + 2 = 139

                            Dividing 139 by 11 gives 12 remainder 7

                            subtracting 7 from 11 gives 4 - the original check digit!

Batch Totals:    These are similar to check digits.
                          The difference is as follows:
                          Check digits are applied to individual pieces of data;
                          Batch totals are applied to groups of data that are being transmitted together in a batch.

                          There are 2 ways of creating batch totals:

                          Control total: A number representing the total number of files or records in the batch
                                           or: A number calculated from adding together the values of the controlled field

                                                eg Monday Sales �345
                                                Tuesday Sales �402
                                                Wednesday Sales �367
                                                Thursday Sales �348
                                                Friday Sales � 413
                                                Control Total = 345 + 402 + 367 + 348 + 413 = 1775


                          Hash total:   Similar to Control totals except that a hash total is meaningless.

                                                eg The hash total is controlling a field called customer number
                                                Using customer id 12367 and 13452 gives a hash total of 25819
                                                This can be used to check for errors, but it doesn't mean anything.
Hosted by www.Geocities.ws

1