The ByteString is the basis of the Cardinal, Real, and String Types. A ByteString is supposed to have unlimited length, but in practice a "nearly infinite" length is modeled. (There is an upper limit on ByteString size.)
A ByteString is a convenient way to store data. It has a StringForm, stored as a compiled stream, and an ArrayForm, used when it is an active Object. A ByteString is read from disk as a byte stream conforming to the specifications below. (See ReadByteArray[stream]). When the ByteString is in electronic memory, it is accessed through a handle to the ArrayForm.
The QuasiArbitraryByteString is a compact means to represent Strings or Cardinals of arbitrary size. It is a compromise between byte-sized computer data efficiency, algorithmic technique, real-world numerics, and mathematical elegance.
The QuasiArbitraryByteString is not actually arbitrary. It is composed of an almost arbitrary number of bytes. For the purposes of String or Cardinal representation, it presumes that it is not worthwhile to count bits. In addition, because the size field is at most 255 bytes long, the maximum size of a ByteString is just under 256^255 (that is, 2^2040) bytes. This size limit will probably never be noticed (this is what is meant by real-world numerics). For all practical purposes, both Strings and Cardinals are arbitrarily large. Each ByteString specifies its size as a prefix to the rest of the data. This size is given in bytes according to the following scheme:
(1) The 1st byte is interpreted as a Cardinal number unless its value is {1,1,1,1,1,1,1,1} (a.k.a. 255). Thus, the 1st byte represents the Cardinal values 0 through 254.
(2) If the 1st byte is {1,1,1,1,1,1,1,1}, then the 2nd byte is interpreted as the number of bytes used to specify the QuasiArbitraryByteString, unless the 2nd byte is also {1,1,1,1,1,1,1,1} (a.k.a. 255). If the 2nd byte is less than 255, it specifies Cardinals whose ByteString has fewer than 255 bytes. In other words, all Cardinals less than 256^254.
While 256^254 is a huge number, it is still small compared to the size of the possible Cardinal values that could be represented by the binary string on a Compact Disk. On the other hand, the number of bytes on a CD (or many, many times the number of bytes on a CD) is easily represented by a Cardinal value less than 256^254.
(3) The 2nd byte is {1,1,1,1,1,1,1,1} (a.k.a. 255). This codes for overflow; in other words, the QuasiArbitraryByteString codes for a Cardinal value of 256^254 or greater.
[This possibility is not yet developed.]
The 3rd byte must be all zeros or Noop[Cardinal][1] is returned. See below. Then the following happens:
(a) The subsequent byte is a Cardinal number less than 256 that is the number of bytes subsequently read in (b).
(b) The subsequent byte sequence of 255 or fewer elements is interpreted as the Cardinal cSize. This is presumed to be the number of bytes in the output value, which follows immediately after an all-zeros byte. (If the all-zeros byte is not present, Noop[Cardinal][2] is returned.)
(c) The output value is the ByteString composed of the subsequent cSize bytes.
If the 3rd byte is {1,1,1,1,1,1,1,1} (a.k.a. 255), then this generates a Noop[Cardinal] NoopNamedError.
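The prefix scheme above can be sketched in Python as a small stream reader. This is only an illustration under stated assumptions, not the author's ReadByteArray: it assumes that a 1st byte other than all ones is itself the one-byte value, that the multi-byte size cSize is stored most-significant byte first (the text does not specify a byte order), and it uses a hypothetical NoopCardinal exception in place of Noop[Cardinal][n].

```python
import io

ALL_ONES = 0xFF  # the byte {1,1,1,1,1,1,1,1}


class NoopCardinal(Exception):
    """Hypothetical stand-in for the Noop[Cardinal][n] errors in the text."""


def _read_exact(stream, n):
    """Read exactly n bytes or fail; a short read means a truncated stream."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"expected {n} bytes, got {len(data)}")
    return data


def read_quasi_arbitrary(stream):
    """Decode one QuasiArbitraryByteString and return its payload bytes."""
    first = _read_exact(stream, 1)[0]
    if first != ALL_ONES:
        # Case (1): the byte itself is the value (assumed reading of the text).
        return bytes([first])

    second = _read_exact(stream, 1)[0]
    if second != ALL_ONES:
        # Case (2): the 2nd byte is the payload length, 0 through 254.
        return _read_exact(stream, second)

    # Case (3): overflow form. The 3rd byte must be all zeros.
    if _read_exact(stream, 1)[0] != 0:
        raise NoopCardinal(1)
    size_len = _read_exact(stream, 1)[0]            # step (a)
    c_size = int.from_bytes(_read_exact(stream, size_len), "big")  # step (b)
    if _read_exact(stream, 1)[0] != 0:              # all-zeros separator
        raise NoopCardinal(2)
    return _read_exact(stream, c_size)              # step (c)


if __name__ == "__main__":
    print(read_quasi_arbitrary(io.BytesIO(b"\xff\x03abc")))
```

Reading from a file-like object rather than a byte array mirrors the text's description of a ByteString being read off of disk as a stream.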
How big is the largest ByteString coded by this scheme?
(a)'s max value is 255. This means that the next 255 bytes might represent the size of the final BinaryString.
Thus, the maximum size is a BinaryString with just under 256^255 (that is, 2^2040) bytes. This value is considered to be sufficiently large to be the upper limit on real-world BinaryStrings for some time.
To put this in perspective, consider that Avogadro's Number is just 6.022e+23!
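The size bound can be checked with a few lines of arbitrary-precision arithmetic. The sketch below assumes that the size field uses all 255 bytes and that each byte contributes 8 bits, so the largest expressible size is 256^255 - 1; the variable names are illustrative only.

```python
# Largest size expressible by a 255-byte size field (8 bits per byte).
SIZE_FIELD_MAX_BYTES = 255
max_payload_bytes = 256 ** SIZE_FIELD_MAX_BYTES - 1

# Avogadro's number (exact SI definition), for comparison.
AVOGADRO = 6.02214076e23

print(max_payload_bytes.bit_length())  # 2040 bits
print(len(str(max_payload_bytes)))     # 615 decimal digits
print(max_payload_bytes > AVOGADRO)    # True
```

A 615-digit byte count exceeds Avogadro's 24-digit number by roughly 590 orders of magnitude.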
The Noop[Cardinal] error means that the BinaryString is too big to represent. In practice, any program that threatens to produce, or which does produce this error should be using a Real Number representation. It is not physically practical to represent such a large BinaryString.
The form taken by a reasonably sized QuasiArbitraryByteString might look like any of the following:
{{11111110}} -or-
{{11111111}, {"134"}, byte1, ..., byte134} -or-
{{11111111}, {11111111}, {00000000}, {"134"}, byte1, ..., byte134}, {00000000},
(* The byte sequence whose length is the cardinal value is represented by "byte1, ..., byte134". *)}
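For illustration, a hypothetical encoder that emits these three forms might look like the following Python sketch. It is not part of the original design: the function name is invented, the size field is assumed to be written most-significant byte first, and a single byte other than all ones is assumed to be stored bare, as in the first form.

```python
def encode_quasi_arbitrary(payload: bytes) -> bytes:
    """Prefix payload with its size, choosing the shortest of the three forms."""
    n = len(payload)
    if n == 1 and payload[0] != 0xFF:
        # First form: a lone byte that is not all ones stands for itself.
        return payload
    if n < 0xFF:
        # Second form: all-ones marker, one length byte (0..254), payload.
        return bytes([0xFF, n]) + payload
    # Third form: two all-ones markers, a zero byte, the length of the
    # size field, the size field itself, a zero byte, then the payload.
    size_bytes = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(size_bytes) > 255:
        raise OverflowError("payload too large for a 255-byte size field")
    header = bytes([0xFF, 0xFF, 0x00, len(size_bytes)])
    return header + size_bytes + b"\x00" + payload


if __name__ == "__main__":
    print(encode_quasi_arbitrary(b"\xfe").hex())          # fe
    print(encode_quasi_arbitrary(b"a" * 134)[:2].hex())   # ff86
    print(encode_quasi_arbitrary(b"z" * 300)[:7].hex())   # ffff0002012c00
```

The 134-byte payload gets the two-byte prefix `ff 86` (0x86 is 134), matching the second form shown above.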
(*****************************************************************)
Name[StreamObjectQ[Stream[Type[strmCard, Pattern[Type[Cardinal]]], Pattern[Sequence]]],
Sequence[
lastStreamCardinal=strmCard,
True
]
]
Name[StreamObjectQ[Type[strm, Pattern[]]],
False
]
Name[ByteStreamQ[Type[strm, StreamObjectQ[strm]]],
Pattern[Stream][Stream[Type][strm]]
]
Name[QuasiArbitraryByteString[ Type[strm, ByteStreamQ[strm]]],
While[AllTrueByte[byte],
byteCount
]
]
(*************************************************************************)
The form taken by the QuasiArbitraryByteString will be the following:
{allOne1, ..., allOneN, card1 (*byte*), card2 (* less than 256 bytes *),
..., cardN (* byteSize specified by cardN-1. *)}
(*
All of the following should be written in C or machine code:
maxByte[byte_]
- It takes a byte and returns True if all 8 bits are 1.
zeroCardQ[card_]
- It takes a Cardinal, card, and returns True if it is zero.
card8[byte]
- It returns a Cardinal whose value is less than 255.
*)
getRecursionNumber[strm_] = Cardinal[Loop][
If[Not[maxByte[byte=Reckon[strm]]],
Return[Loop]
]
]
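As a rough Python analogue of getRecursionNumber, the sketch below counts the leading all-ones bytes on a stream and returns that count together with the first byte that is not all ones. The function name and the tuple return value are assumptions made for illustration; the original keeps the last byte in the variable byte rather than returning it.

```python
import io


def get_recursion_number(stream):
    """Count leading 0xFF bytes; return (count, first non-0xFF byte)."""
    count = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended before a non-all-ones byte")
        byte = chunk[0]
        if byte != 0xFF:  # plays the role of Not[maxByte[byte]]
            return count, byte
        count += 1


if __name__ == "__main__":
    print(get_recursion_number(io.BytesIO(b"\xff\xff\x00")))  # (2, 0)
```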
(*
constructBinaryString[byteNum_, bytes_, strm_]
A BinaryString is constructed by placing bytes in an array.
At the end of this process, there is an array of bytes, and byteNum, which is the number of bytes.
If this procedure was written in C, a pointer to this object would be handed back as the result...
*)
constructBinaryString[byteNum_, bytes_, strm_] =
(* Old name: arbitraryCardinal[strm_]
ReadByteArray[strm_]
- reads elements from the StreamObject, strm, and returns a handle to an arbitrary length ByteArray.
This is an important procedure because it embodies the algorithm to transform a coded File stream for
a quasi-arbitrary length file.
This is also a strange procedure because it is written in Grok32`, yet it returns a handle to a ByteArray.
*)
ReadByteArray[strm_] =
Cast[{byte, card},
Switch[getRecursionNumber[strm],
0, card=card8[byte],
Sequence[arrayHndl=arrayBytes[card, strm],
1],
Tally[recursionN][constructBinaryString[1, bytes, strm]],
],
(* recursionN > 2. The BinaryString is known to have more than 2^256 bytes. *)
]
]
© 2004, 2005
by John Van Wie Bergamini
This document may be distributed under the terms of the Lesser General Public License.