Encoding Data in Numeric Compaction Mode
Now that we have an idea of the physical layout of a
PDF417 bar code, we need to know how to encode data.
All data is represented by codewords with values between 0 and 928. High-level encoding
converts the data to be printed into corresponding codeword
values. How we encode each codeword, described below, depends on
what the codeword represents.
Low-level encoding, by contrast, involves physically
converting the codeword values into their respective bar/space
patterns.
PDF417 encodes values according to clusters, or
mutually exclusive encodation sets. The bar-space pattern of each
codeword depends on both the value to be encoded and the cluster used by that
row.
The entire set of 929 codewords in PDF417 is
represented in three mutually exclusive symbol character sets, or
clusters. Each cluster encodes the 929 available ‘PDF417' codewords
into different bar-space patterns so that one cluster is distinct
from another. The cluster numbers are 0, 3, 6. The cluster
definition applies to all ‘PDF417' symbol characters, except
for start and stop characters.
Each row uses only one of the three clusters (0, 3,
or 6) to encode data, with the same cluster repeating
sequentially every third row. Row 0 codewords use cluster 0, row
1 uses cluster 3, and row 2 uses cluster 6, etc. In general,
cluster number = ((row number) mod 3) *3. To encode any codeword
value, it is necessary first to know what cluster it must belong
to.
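The row-to-cluster rule above can be sketched in a few lines of Python. The function name cluster_for_row is a hypothetical helper chosen for illustration, and rows are assumed to be numbered from 0 as in the text.

```python
def cluster_for_row(row_number: int) -> int:
    """Return the cluster (0, 3, or 6) used by a zero-based row number."""
    if row_number < 0:
        raise ValueError("row number must be non-negative")
    # cluster number = ((row number) mod 3) * 3
    return (row_number % 3) * 3


if __name__ == "__main__":
    # Print the cluster used by the first six rows.
    for row in range(6):
        print(row, cluster_for_row(row))
```

Rows 0, 1, and 2 map to clusters 0, 3, and 6, and the sequence then repeats from row 3 onward.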
Each cluster presents the 929 available values with
distinct bar-space patterns so that one cluster cannot be
confused with another. Because any two adjacent rows use
different clusters, scan stitching can be used without confusing
rows.
High level encoding converts the data characters
into their corresponding codewords. Data compaction schemes are
used to achieve high level encoding. Depending on data type,
there are several ways to encode data in PDF417, which involve
selecting a mode and encoding codeword values.
A mode is simply a method of compacting data. PDF417
provides three modes to encode data. In one PDF417 symbol it is
possible to switch back and forth between modes as often as
required.
The optimal mode for an application may be a
combination of the three modes, each used in different portions
of the symbol. By using mode latches and shifts you can move at will
from mode to mode within the same symbol.
The three modes, Text Compaction, Byte Compaction, and Numeric
Compaction, are described below. Each mode defines a particular
efficient mapping between user-defined data and codeword sequences.
900 codewords are available in each mode for data
encodation and other functions within the mode. The remaining
29 codewords are assigned to specific functions independent of
the current compaction mode.
Codewords 900 to 928 are assigned as function
codewords as follows:
• Switching between modes
• Enhanced applications using Extended Channel Interpretations (ECIs)
• Other enhanced applications
At present codewords 903 to 912 and 914 to 920 are
reserved. Below is a table that defines the list of assigned
and reserved function codewords.
Codeword | Function |
900 | mode latch to Text Compaction mode |
901 | mode latch to Byte Compaction mode |
902 | mode latch to Numeric Compaction mode |
903 to 912 | reserved for future functions |
913 | mode shift to Byte Compaction mode |
914 to 920 | reserved for future functions |
921 | reader initialization |
922 | terminator codeword for Macro PDF control block |
923 | sequence tag to identify the beginning of optional fields in the Macro PDF control block |
924 | mode latch to Byte Compaction mode (used differently from 901) |
925 | identifier for a user defined Extended Channel Interpretation (ECI) |
926 | identifier for a general purpose ECI format |
927 | identifier for an ECI of a character set or code page |
928 | Macro marker codeword to indicate the beginning of a Macro PDF Control Block |
A mode latch codeword switches from the current mode to the
indicated destination mode, which remains in effect until another
mode switch is explicitly invoked. Codewords 900 to 902 and 924
are assigned to this function. The table below lists the mode
latch and mode shift codewords for each destination mode.
The mode shift codeword 913 causes a temporary
switch from Text Compaction mode to Byte Compaction mode. This
switch shall be in effect for only the next codeword, after which
the mode shall revert to the prevailing sub-mode of the Text
Compaction mode. Codeword 913 is only available in Text
Compaction mode; its use is described in.
Destination Mode | Mode Latch | Mode Shift |
Text Compaction | 900 | |
Byte Compaction | 901/924 | 913 |
Numeric Compaction | 902 | |
PDF417 also supports the Extended Channel Interpretation (ECI)
system, which allows different interpretations of data to be
accurately encoded in the symbol.
To select the optimal mode, first select the mode
that supports the characters that must be encoded. Next, when
there is a choice among modes, consider the “efficiency” of
the mode. This involves answering two questions:
1. Per mode, how many characters per
codeword may be encoded?
2. If mode switching seems promising,
what additional overhead (e.g., shift, latch characters) is
needed, or is it more efficient to use one mode only?
Each available mode in PDF417 is described in the
following sections. The issue of “efficiency” will become
clear as we discuss each mode’s advantages and limitations.
Text Compaction mode permits all printable ASCII
characters to be encoded, i.e. values 32 - 126 inclusive in
accordance with ISO/IEC 646, as well as selected control
characters.
The Text Compaction Mode has four sub-modes:
• Alpha
(uppercase alphabetic)
• Lower
(lowercase alphabetic)
• Mixed
(numeric and some punctuation)
• Punctuation
Each sub-mode contains 30 characters, including sub-mode
latch and shift characters.
The default compaction mode for ‘PDF417' in effect
at the start of each symbol is always Text Compaction Mode Alpha
sub-mode (uppercase alphabetic). A latch codeword from another
mode to the Text Compaction mode will always switch to the Text
Compaction Alpha sub-mode.
Byte Compaction mode permits all 256 possible 8-bit
byte values to be encoded. This includes all ASCII characters
value 0 to 127 inclusive and provides for international character
set support.
The Byte Compaction Mode lets you encode 256
international characters, including the full ASCII set, as well
as any 8-bit value from 0 to 255. This mode encodes about 1.2
bytes per codeword. In terms of the breadth of encodable data, it
is a powerful mode; in terms of the resulting size of the printed
symbol, it is the least efficient mode.
Encoding Data in Byte Compaction Mode
Switching to Byte Compaction Mode
Compaction Rules for Encoding Longer Byte Compaction
Character Strings (Using Mode Latch 924 or 901)
The Byte Compaction mode enables a sequence of 8-bit
bytes to be encoded into a sequence of codewords. It is
accomplished by a Base 256 to Base 900 conversion, which achieves
a compaction ratio of 6 Byte Compaction characters to 5
codewords (1.2:1).
All the characters and their values (0 to 255) are
defined in ______?. This is the default graphical and
control character interpretation. When ECIs are invoked this
interpretation may be defined as either ECI 000000 or ECI 000002.
Because the default mode for ‘PDF417' is Text
Compaction mode, to switch to Byte Compaction mode, it is
necessary to use one of the following codewords:
• mode latch 924 shall be used
when the total number of Byte Compaction characters to be encoded
is an integer multiple of 6
• mode latch 901 shall be used
when the total number of Byte Compaction characters to be encoded
is not a multiple of 6
• mode shift 913 can be used
instead of codeword 901 when a single
Byte Compaction character has to be encoded
If only one Binary/ASCII Plus character (a single 8-bit value)
needs to be encoded, the Binary/ASCII Plus mode shift 913 may be
used. When mode shift 913 is used, the next codeword is treated
as a single 8-bit value.
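The choice between the three switching codewords can be sketched as a small Python helper. The function name and the in_text_mode flag are hypothetical; the sketch assumes shift 913 is preferred for a single byte whenever it is available, which the rules above permit but do not require.

```python
LATCH_BYTE = 901             # byte count is not a multiple of 6
SHIFT_BYTE = 913             # one byte only, available in Text Compaction mode
LATCH_BYTE_MULTIPLE_6 = 924  # byte count is a multiple of 6


def byte_mode_switch(byte_count: int, in_text_mode: bool = True) -> int:
    """Pick the codeword that switches into Byte Compaction mode."""
    if byte_count < 1:
        raise ValueError("at least one byte is required")
    if byte_count == 1 and in_text_mode:
        # Assumption: use the temporary shift when only one byte follows.
        return SHIFT_BYTE
    if byte_count % 6 == 0:
        return LATCH_BYTE_MULTIPLE_6
    return LATCH_BYTE


if __name__ == "__main__":
    for count in (1, 6, 9, 12):
        print(count, byte_mode_switch(count))
```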
The following procedure shall be used to encode Byte
Compaction character data:
1. Establish
the total number of Byte Compaction characters.
2. If the total is an exact multiple of 6, mode latch 924 shall be
used; otherwise mode latch 901 shall be used.
3. Sub-divide
the Byte Compaction characters into groups of 6
characters, from left to right (the most to least significant
characters). If fewer than 6 characters remain, go to Step 7.
4. Assign the
decimal values of the 6 data bytes to be encoded in Byte
Compaction mode as b5 to b0 (where b5
is the first data byte).
5. Carry out
a base 256 to base 900 conversion to produce a sequence of 5
codewords. Annex
Encoding binary data using latches 901/924 requires
converting the data from base 256 to base 900. The encoding
process, specified by latch codeword 924, takes 6 base 256
numbers or values at a time, converting them to 5 base 900
codewords. The process continues until no more numbers or values
need to be encoded.
If the number of bytes to encode is not a multiple
of 6, use mode latch 901. Each complete group of 6 bytes is
converted using the above algorithm, and any remaining bytes are
appended using 1 codeword per byte.
The following algorithm performs a base 256 to base
900 conversion:
To calculate the codewords, we must first define the following:
n = number of codewords (in this case, it is always 5)
d5 ... d0 = the 6 data bytes, where d5 is the first data byte
t = temporary variable
We then calculate t as follows:
t = d5*256^5 + d4*256^4 + d3*256^3 + d2*256^2 + d1*256^1 + d0*256^0
Next, we calculate each codeword as follows:
For each codeword ci = c0 ... cn-1
BEGIN
ci = t mod 900
t = t div 900
END
Where:
ci = codeword i (c0 is the least significant codeword)
Remember:
mod is the integer remainder after division. If the remainder is
negative, take the complement to get the correct result. For
example, the remainder of -29160 divided by 929 is -361. The
complement is 929 - 361, or 568.
div is the integer division
operator
For example:
Encode the numbers {1,2,3,4,5,6}
Calculate sum of data bytes
to encode:
t = 1*256^5 + 2*256^4 + 3*256^3 + 4*256^2 + 5*256^1 + 6*256^0
= 1108152157446
Calculate codeword 0
c0 = 1108152157446 mod 900 = 846
t = 1108152157446 div 900 = 1231280174
Calculate codeword 1
c1 = 1231280174 mod 900 = 74
t = 1231280174 div 900 = 1368089
Calculate codeword 2
c2 = 1368089 mod 900 = 89
t = 1368089 div 900 = 1520
Calculate codeword 3
c3 = 1520 mod 900 = 620
t = 1520 div 900 = 1
Calculate codeword 4 (last codeword)
c4 = 1 mod 900 = 1
t = 1 div 900 = 0
The codeword sequence is (including
Binary/ASCII Plus mode latch 924): 924,1,620,89,74,846
Example:
Encode the numbers 1,2,3,4,5,6,7,8,4
Mode latch 901 is used since the number of bytes to encode is
not a multiple of 6. The above results apply for the first 6
bytes. The remaining bytes are appended as codewords, in order
of their occurrence.
The codeword sequence is: 901,1,620,89,74,846,7,8,4
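The two worked examples can be reproduced with a short Python sketch of the procedure. The function name encode_byte_compaction is hypothetical, and the sketch covers only the latch selection and base conversion described in this section, not padding or error correction.

```python
from typing import List


def encode_byte_compaction(data: bytes) -> List[int]:
    """Encode bytes as a latch codeword followed by Byte Compaction codewords."""
    if not data:
        raise ValueError("no data to encode")
    # Latch 924 for an exact multiple of 6 bytes, otherwise latch 901.
    codewords = [924 if len(data) % 6 == 0 else 901]
    full = len(data) - len(data) % 6
    for start in range(0, full, 6):
        # The first byte of the group is the most significant (d5).
        t = int.from_bytes(data[start:start + 6], "big")
        group = []
        for _ in range(5):          # always 5 codewords per 6 bytes
            group.append(t % 900)   # c0 is produced first
            t //= 900
        codewords.extend(reversed(group))  # emit most significant first
    # Remaining bytes (fewer than 6) use one codeword each.
    codewords.extend(data[full:])
    return codewords


if __name__ == "__main__":
    print(encode_byte_compaction(bytes([1, 2, 3, 4, 5, 6])))
    print(encode_byte_compaction(bytes([1, 2, 3, 4, 5, 6, 7, 8, 4])))
```

Python integers have arbitrary precision, so the intermediate value t needs no special handling; in a language with fixed-width integers a 64-bit type is sufficient, since six bytes occupy only 48 bits.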
The Numeric Compaction mode is a method for base 10
to base 900 data compaction and should be used to encode long
strings of consecutive numeric digits. The Numeric Compaction
mode encodes up to 2.93 numeric digits per codeword.
Numeric Compaction mode may be invoked when in Text
Compaction or Byte Compaction modes using mode latch 902.
The following procedure is used to compact numeric
data:
1. Divide the string of digits into groups of 44
digits. The last group may contain fewer digits.
2. For each group add the digit 1 to the most
significant position to prevent the loss of leading zeros.
EXAMPLE:
original data 00246812345678
after step 2 100246812345678
NOTE: The leading digit 1 is removed in the decode algorithm.
3. Perform a base 10 to base 900 conversion as follows:
Starting from least significant codeword 0, calculate the following:
For each codeword ci = c0 ... cn-1
BEGIN
ci = x mod 900
x = x div 900
If x = 0, then stop encoding
END
where:
x = numeric string to encode, preceded by the digit 1
Remember: mod is the integer remainder after division. If the
remainder is negative, take the complement to get the correct
result. For example, the remainder of -29160 divided by 929 is
-361. The complement is 929 - 361, or 568.
div is the integer division operator.
4. Repeat from Step 2 as necessary.
For example:
Encode the numeric string
00021329000.
First precede the string by
the number one:
100021329000
Next use the conversion
algorithm to produce the desired codewords:
Start by setting x =
100021329000
Calculate codeword 0
c0 = 100021329000 mod 900 = 0
x = 100021329000 div 900 = 111134810
Calculate codeword 1
c1 = 111134810 mod 900 = 110
x = 111134810 div 900 = 123483
Calculate codeword 2
c2 = 123483 mod 900 = 183
x = 123483 div 900 = 137
Calculate codeword 3
c3 = 137 mod 900 = 137
x = 137 div 900 = 0
The codeword sequence then is (including numeric
mode latch 902): 902,137,183,110,0
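The numeric procedure and the example above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the codewords of successive 44-digit groups simply follow one another after a single 902 latch.

```python
from typing import List


def encode_numeric_group(digits: str) -> List[int]:
    """Convert one group of up to 44 digits to base 900 codewords."""
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 44:
        raise ValueError("expected 1 to 44 decimal digits")
    x = int("1" + digits)        # leading 1 preserves leading zeros
    codewords = []
    while x > 0:
        codewords.append(x % 900)  # least significant codeword first
        x //= 900
    codewords.reverse()            # emit most significant first
    return codewords


def encode_numeric_compaction(digits: str) -> List[int]:
    """Latch to Numeric Compaction mode and encode a digit string."""
    codewords = [902]
    for start in range(0, len(digits), 44):
        codewords.extend(encode_numeric_group(digits[start:start + 44]))
    return codewords


if __name__ == "__main__":
    print(encode_numeric_compaction("00021329000"))
```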
The following rules can be used to determine the
precise number of codewords in Numeric Compaction mode:
• Groups
of 44 numeric digits compact to 15 codewords.
• For
groups of shorter sequences of digits, the number of codewords
can be calculated as follows:
Codewords = INT (number of digits / 3) +1
EXAMPLE:
For a 28 digit sequence
INT (28 / 3) + 1
= 9 + 1
= 10 codewords
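As a quick check of these rules, the count can be computed with a small Python function. The name numeric_codeword_count is hypothetical, and the returned count excludes the 902 latch codeword.

```python
def numeric_codeword_count(digit_count: int) -> int:
    """Number of data codewords needed for a run of numeric digits."""
    if digit_count < 0:
        raise ValueError("digit count must be non-negative")
    full_groups, remainder = divmod(digit_count, 44)
    count = full_groups * 15            # each 44-digit group takes 15 codewords
    if remainder:
        count += remainder // 3 + 1     # INT(number of digits / 3) + 1
    return count


if __name__ == "__main__":
    for n in (11, 28, 44, 50):
        print(n, numeric_codeword_count(n))
```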
Numeric Compaction mode may be terminated by the end
of the symbol, or by any of the following codewords:
• 900 (Text
Compaction mode latch)
• 901 (Byte
Compaction mode latch)
• 902 (Numeric
Compaction mode latch)
• 924 (Byte
Compaction mode latch)
• 928 (Beginning
of Macro ‘PDF417' Control Block)
• 923 (Beginning
of Macro ‘PDF417' Optional Field)
• 922 (Macro
‘PDF417' Terminator)
The latter three codewords only occur within the
Macro ‘PDF417' Control Block of a Macro ‘PDF417' symbol. Numeric
Compaction mode is also affected by the presence of a reserved codeword.
Re-invoking Numeric Compaction mode (by using codeword
902 while in Numeric Compaction mode) serves to terminate the
current Numeric Compaction mode grouping as described in??, and
then to start a new grouping. This procedure may be necessary
when an ECI assignment number needs to be encoded.
During the decode process for Numeric Compaction
mode, the base 900 to base 10 conversion must yield a number
whose most significant digit is '1'. Otherwise, the symbol is
invalid. The leading '1' is removed to produce the original
number.
The row indicators in a PDF417 symbol contain
several key components: row number, number of rows, security
level, and number of data columns.
Not every row indicator contains every component.
The information is spread over several rows, and the pattern
repeats itself every three rows. Spreading and repeating the
information as shown below makes the symbol as robust as possible.
Row 0: Left R.I. (Row #, # of
Rows) Right R.I. (Row #, # of Columns)
Row 1: Left R.I. (Row #, Security
Level) Right R.I. (Row #, # of Rows)
Row 2: Left R.I. (Row #, # of
Columns) Right R.I. (Row #, Security Level)
For example, row 3's (or 6’s, 9’s, 12’s, etc.)
right row indicator is calculated the same way as row 0's right
row indicator; row 3's (or 6’s, 9’s, 12’s, etc.) left row
indicator is calculated the same way as row 0's left row
indicator, etc.
To make this clear, formulas and specific examples
follow.
Left row indicators are calculated as follows:
Row 0: 30 * (row number div 3) + ((number of rows - 1) div 3)
Row 1: 30 * (row number div 3) + (security level * 3) + ((number of rows - 1) mod 3)
Row 2: 30 * (row number div 3) + (number of columns - 1)
Notes: div is the integer division operator
mod is the integer remainder after division. If the
remainder is negative, take the complement to get the correct
result. For example, the remainder of -29160 divided by 929 is -361.
The complement is 929 - 361, or 568.
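The three left row indicator formulas can be combined into one Python function. The function and parameter names are hypothetical, and the sketch assumes zero-based row numbers, with "Row 0", "Row 1", and "Row 2" standing for rows whose number mod 3 is 0, 1, and 2.

```python
def left_row_indicator(row_number: int, number_of_rows: int,
                       number_of_columns: int, security_level: int) -> int:
    """Compute the left row indicator value for a zero-based row."""
    base = 30 * (row_number // 3)
    position = row_number % 3       # the pattern repeats every three rows
    if position == 0:
        return base + (number_of_rows - 1) // 3
    if position == 1:
        return base + security_level * 3 + (number_of_rows - 1) % 3
    return base + (number_of_columns - 1)


if __name__ == "__main__":
    # Illustrative values only: 10 rows, 5 data columns, security level 2.
    for row in range(6):
        print(row, left_row_indicator(row, 10, 5, 2))
```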