Specification of the EBS File Format for Biosignals


1 Purpose

In the analysis of multi-channel bio-signal recordings (e.g., electrocardiogram, electroencephalogram, magnetocardiogram, magnetoencephalogram, audio data), scientists often spend a significant amount of time coding simple functions and programs that write data to and read data from files. Programs for trivial tasks, such as extracting a single channel or a short time sequence from a huge file, applying filters and standard signal processing algorithms to the recordings, or visualizing the data, are rewritten and reinvented again and again in many institutions all over the world. Many existing formats offered by recording equipment vendors are designed only for very special applications and are inflexible and not extensible. Some of these vendor formats are also optimized for one particular kind of hardware and are undocumented or only poorly documented. Scientific applications require file formats that are not too complicated, easy to understand and implement, highly flexible and fully documented, and that allow researchers to cooperate by making the exchange of data and tools among work groups easy.

The design goals of the EBS file format have been:

- Implementation of software which supports the EBS format must not be very difficult and it must be possible for advanced programs to exchange data with very simple implementations that use only a few features of the format.
- It must be possible to handle the data efficiently, because often very large data sets have to be processed. I.e. different machine architectures have to be considered. Modifications to header data must be possible without having to copy the entire file. Access to growing files while the recording is still in progress should be possible on multitasking systems.
- The format must be as universal as possible. Only very few parameters (length of the file, number of channels, data format) should be mandatory. It must be possible to attach arbitrary further information, i.e. the format must be highly extensible in a way that won't prevent the use of existing tools for extended versions of this format. The data part of the format must be capable of containing different encodings of data (e.g. various precisions, fixed point or floating point types, compressed variable length encodings, etc.).
- A number of common attributes that are required by many different applications stored together with data in the file (e.g. patient ID, description texts, common recording parameters) have to be predefined. In this way, extensions to the format are not necessary very often.

The EBS file format has been designed for storing single-channel or multi-channel signals that have been recorded simultaneously at constant intervals of time with the same sample rate in each channel. Not all channels must store signals from the same source, e.g., EEG, ECG and trigger signals may very well be mixed in one file, but the same encoding (e.g. 16-bit signed integers, floating point reals, compressed or uncompressed) and the same sample frequency must be used for all channels in a single file.

It is our hope that the EBS file format will motivate scientists working on the analysis of bio-signals to exchange their tools and data sets as public domain software, because similar positive influences of standard file formats have been observed in other scientific communities (e.g. computer graphics, astronomy and operating systems) where well-known scientists have developed a lot of freely available high quality software.

2 File Format

An EBS file is a linear sequence of 8-bit bytes of defined length. If a file system allows a file name extension, '.ebs' is recommended and if a file type has to be specified, a transparent unstructured binary type should be used. Each EBS file consists of 3 or 4 different parts: (1) the fixed header containing information that is needed by every program reading EBS files, (2+4) the variable headers which might contain additional data that is only needed by some programs and may be simply ignored by others and (3) the encoded bio-signal data. The normal position of the variable header information is between the fixed header and the encoded data (2), but it is also possible to put some or all parts of the variable header information behind the encoded data (4).

[Note: Having two possible positions for the variable header information makes it possible to change, insert or delete information in the variable header without having to move the encoded signal data. It also allows files to be read while other programs are still adding data to the end of part (3) (on-line processing).]


                   -----------------------------------
                   |     Fixed Header (32 bytes)     |       (1)
                   +---------------------------------+
                   |         Variable Header         |       (2)
                   +---------------------------------+
                   | Encoded Signal Data (4*d bytes) |       (3)
                   +---------------------------------+
                   | Optional Second Variable Header |       (4)
                   -----------------------------------

Most integer values in the fixed and variable headers are coded as 32-bit words stored in 4 bytes beginning with the most significant byte (Bigendian format). If the value is a signed integer type, then the usual 2-complement representation of negative values will be used. E.g., the value -3 is stored as 0xff,0xff,0xff,0xfd and 1024 is stored as 0x00,0x00,0x04,0x00 (in this text, the prefix '0x' indicates a hexadecimal number as in the C programming language and two hex digits form an 8-bit byte value). All 32-bit integer values in the fixed and variable headers are aligned to 32-bit boundaries, i.e. their start byte position relative to the first byte of the file is always a multiple of 4.
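
The integer layout described above can be checked with a few lines of Python. This is only an illustrative sketch: the function names encode_i32 and decode_i32 are hypothetical and not part of the EBS specification, and the alignment check assumes that offsets are counted from the first byte of the file.

```python
import struct

def encode_i32(value: int) -> bytes:
    """Encode a signed 32-bit integer with the most significant byte first."""
    return struct.pack(">i", value)

def decode_i32(data: bytes, offset: int = 0) -> int:
    """Decode a signed 32-bit Bigendian integer at a 32-bit aligned offset."""
    if offset % 4 != 0:
        raise ValueError("32-bit header values start at a multiple of 4")
    return struct.unpack_from(">i", data, offset)[0]

# The two examples from the text: -3 and 1024.
minus_three = encode_i32(-3)      # bytes 0xff,0xff,0xff,0xfd
one_kibi = encode_i32(1024)       # bytes 0x00,0x00,0x04,0x00
```

Unsigned values would use the struct format ">I" instead of ">i".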

2.1 The Fixed Header

Each EBS file starts with a 32-byte data structure with the following format:

                   -----------------------------------
                   |  Identification Code (8 bytes)  |
                   +---------------------------------+
                   |   Data Encoding ID (4 bytes)    |
                   +---------------------------------+
                   |  Number n of channels (4 bytes) |
                   +---------------------------------+
                   |  Number m of samples (8 bytes)  |
                   +---------------------------------+
                   | Length d of Data Part (8 bytes) |
                   -----------------------------------


        Byte |  Value        |  Meaning
      -------+---------------+---------------------------------
          0  |  0x45         | ASCII character 'E'
          1  |  0x42         | ASCII character 'B'
          2  |  0x53         | ASCII character 'S'
          3  |  0x94         | another ID character
          4  |  0x0a         |   "
          5  |  0x13         |   "
          6  |  0x1a         |   "
          7  |  0x0d         |   "
        8-11 |  see 2.3      | Encoding ID
       12-15 |  any          | number n of channels (unsigned)
       16-23 |  any          | number m of samples per channel (unsigned)
             |               | stored as a 64-bit value or all bytes are
             |               | 0xff if unspecified.
       24-31 |  any          | length d of the data part (3) in 32-bit words
             |               | (i.e. part (3) is 4*d bytes long) or all bytes
             |               | are 0xff if part (4) is not present.
             |
       32-   |  here begins the first variable header part (2) of an EBS file

Identification code:

The 'magic code' in the first 8 bytes identifies the file as an EBS file. Programs that read EBS files should complain about files that don't start with these 8 bytes.

Encoding ID:

The number in the next 32-bit word indicates the format in which the bio-signals are encoded in part (3) of the file. The possible encodings and their ID values in this field are described later in section 2.3.

Number of channels:

The 32-bit unsigned integer value n starting at byte 12 specifies how many channels have been recorded.

Number of samples:

The 64-bit Bigendian unsigned integer value m in the next 8 bytes indicates how many samples have been recorded in each channel.

[Note: Don't worry about the 64-bit values! Today, most implementations just check whether bytes 16-19 have the value 0x00 and read bytes 20-23 as the 32-bit number of samples, because their operating system can't deal with 64-bit values or with files longer than a few gigabytes. It is all right if your implementation just gives a nice error message for EBS files with more than e.g. 4294967295 samples, but some applications might need files in which the number of samples can't be described with 32 bits (e.g. long-time recordings), and new operating systems support files of this length.]

If all bytes from position 16 to 23 have the value 0xff, then this indicates that the length of the whole file is NOT determined by the fixed header. Instead, the end of the data part (3) is determined by the operating system. This is called an EBS file with 'unspecified length' and may be used when recorded data has to be accessed while the recording is still in progress and part (3) is still growing. In this case, the program can read sequences of n sample values until the first end-of-file condition is signaled by the operating system. The undefined length value is only allowed in combination with TIME-BASED ORDER data encodings (see section 2.3) and no second variable header can be present in files with unspecified length.

Length of Data Part:

If a second variable header is present, then the 64-bit value starting at byte 24 will be the length d of part (3) counted in 32-bit words. I.e. part (3) is d*4 bytes long and 4*d bytes have to be skipped after the final tag of part 2 in order to get to the first byte of part (4). If no part (4) is present, all bytes from byte 24 to byte 31 have the value 0xff. The purpose of this value d is only to define the position of the second variable header. It can not be used to determine the number of samples stored in the data part (this information is stored in m in the fixed header). The number of bytes needed to store the n*m sample values in part (3) may be less than or equal to 4*d, but not greater.

[Note: In some (often called 'compressed') variable length encoding formats for the data part (3), the values n and m (number of channels and number of samples) from the fixed header can not be used to predict the exact size of the data part, because in compressed formats, the number of bits per sample is not always fixed. This makes it impossible to find the start of the second variable header part (4) quickly (i.e., without going through the whole data part). In order to avoid this problem, the length of the data part d is stored separately if a second variable header is present.]

If the number of samples is not specified in the fixed header (m = 0xffffffffffffffff), then no second part of the variable header is allowed and d also has the value 0xffffffffffffffff.
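
As an illustration of the rules in this section, the following Python sketch parses the 32-byte fixed header. The names FixedHeader and parse_fixed_header are hypothetical, and reporting the all-0xff values as None is a design choice of this sketch, not something the specification prescribes.

```python
import struct
from typing import NamedTuple, Optional

# The 8-byte identification code from the table in section 2.1.
EBS_MAGIC = bytes([0x45, 0x42, 0x53, 0x94, 0x0A, 0x13, 0x1A, 0x0D])
ALL_FF = 0xFFFFFFFFFFFFFFFF

class FixedHeader(NamedTuple):
    encoding_id: int
    channels: int                 # n
    samples: Optional[int]        # m, or None for 'unspecified length'
    data_words: Optional[int]     # d, or None if part (4) is not present

def parse_fixed_header(raw: bytes) -> FixedHeader:
    """Parse the first 32 bytes of an EBS file."""
    if len(raw) < 32:
        raise ValueError("the fixed header is 32 bytes long")
    if raw[:8] != EBS_MAGIC:
        raise ValueError("file does not start with the EBS identification code")
    # Bytes 8-11, 12-15, 16-23 and 24-31, all unsigned Bigendian.
    encoding_id, n, m, d = struct.unpack(">IIQQ", raw[8:32])
    if m == ALL_FF and d != ALL_FF:
        raise ValueError("unspecified length does not allow a second variable header")
    return FixedHeader(
        encoding_id,
        n,
        None if m == ALL_FF else m,
        None if d == ALL_FF else d,
    )
```

A reader would call parse_fixed_header on the first 32 bytes of the file and continue with the first variable header at byte 32.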

2.2 The Variable Header

Part (2) and (4) of EBS files contain a sequence of attributes (e.g. patient name and age, sample rate, description texts, date and time of recording, etc.) which a useful file format must be able to carry, but which are only of interest to some application programs. Other programs may simply ignore most or all attributes in this header.

Each attribute in the variable header is stored as a TLV (tag, length, value) sequence. A tag is a 32-bit unsigned Bigendian integer number that identifies the type of information stored in the attribute (e.g. patient name). Some tag numbers and the meaning and syntax of the following attribute value are already defined in appendix A, but other new ones may be easily defined for special applications according to the rules in appendix B. The tag number is followed by an unsigned 32-bit length indicator l that specifies the number of 32-bit words (i.e. l*4 bytes) of the directly following value of the attribute. The number of bytes in an attribute value is always a multiple of four.

Both variable header parts end with the special tag 0x00000000. If part (4) is present, these are normally the last bytes of the file. The final special tag 0x00000000 in part (2) is directly followed by the first byte of the data part (3). The tag 0xffffffff is reserved and must not be used in any EBS file. The format of both variable header parts is:

                   --------------------------
                   |      tag (4 bytes)     |
                   +------------------------+
                   |   length l (4 bytes)   |
                   +------------------------+
                   |    value (l*4 bytes)   |
                   +------------------------+
                   ... tag, length, value ...
                   +------------------------+
                   |       0x00000000       |
                   --------------------------
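
The TLV layout shown above might be traversed as in the following Python sketch. The function name read_variable_header is hypothetical. The sketch assumes the whole header is already in memory and returns the raw value bytes without interpreting them.

```python
import struct
from typing import List, Tuple

END_TAG = 0x00000000
RESERVED_TAG = 0xFFFFFFFF

def read_variable_header(buf: bytes, offset: int) -> Tuple[List[Tuple[int, bytes]], int]:
    """Read (tag, value) pairs starting at offset.

    Returns the attribute list and the offset of the first byte after
    the final 0x00000000 tag (for part (2), the start of the data part).
    """
    attributes = []
    while True:
        (tag,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        if tag == END_TAG:
            return attributes, offset
        if tag == RESERVED_TAG:
            raise ValueError("tag 0xffffffff must not be used in an EBS file")
        (length_words,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        end = offset + 4 * length_words   # the length is counted in 32-bit words
        if end > len(buf):
            raise ValueError("attribute value runs past the end of the buffer")
        attributes.append((tag, buf[offset:end]))
        offset = end
```

Programs that do not know a tag can simply skip its value, because the length field alone is enough to find the next attribute.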

The interpretation of the value bytes depends completely on the value of the tag number. Most values are simple data types like integer numbers or text-strings or are sequences of these simple types. If not otherwise specified, the values of attributes defined in appendix A use the following encodings for the various simple types, and it is recommended that values in newly defined attributes use the same encodings where this is appropriate. All simple types are encoded so that their length in bytes is always a multiple of four. Simple data types without fixed length (e.g. strings and floating point numbers) are self-delimiting (e.g. with final zero bytes).

a) 32-bit integer number

Integer numbers are stored starting with the most significant byte. Signed integer numbers are stored with the usual 2-complement encoding.

b) 64-bit integer numbers

They are also stored with the most significant byte first and use 2-complement encoding if the value is signed. In the variable header, only a 32-bit, not a 64-bit alignment is guaranteed, i.e. it is NOT guaranteed that 64-bit integer values start at an address relative to the first byte of the file which is a multiple of 8.

c) floating point numbers

Floating point numbers are stored as ASCII strings in the usual representation (e.g. as in the C programming language). These strings may only contain the characters '+', '-', 'e', 'E', '.' and the digits '0' to '9'. At the end of the string, between one and four 0x00 bytes are appended, so that the length of the encoded floating point number is always a multiple of 4. Examples of valid floating point numbers are

  '3.14'        0x33,0x2e,0x31,0x34,0x00,0x00,0x00,0x00
  '-.1'         0x2d,0x2e,0x31,0x00
  '+0.910e+45'  0x2b,0x30,0x2e,0x39,0x31,0x30,0x65,0x2b,0x34,0x35,0x00,0x00

The Extended Backus-Naur Form (EBNF) grammar of all possible real numbers (without the final 0x00 bytes) is

  ['-'|'+'] {digit} ['.' {digit}] [('e'|'E') ['-'|'+'] digit {digit}]

where digit is a character from '0' to '9', [] means optional, | describes a choice and {} means zero, one or several times. At least one digit must be present before the optional exponential part. The special value "not-a-number" (NaN) is represented by the empty string

0x00,0x00,0x00,0x00.
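
A possible Python sketch of this floating point encoding follows. The names encode_float and decode_float are hypothetical, and the regular expression is one reading of the grammar above (at least one digit before the optional exponent). Using repr() for the text form is an assumption of this sketch; any string matching the grammar would be equally valid.

```python
import math
import re
from typing import Tuple

# Sign, then digits with an optional fraction (at least one digit overall),
# then an optional exponent.
_FLOAT_RE = re.compile(rb"[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?")

def encode_float(value: float) -> bytes:
    """Encode a float as an ASCII string padded with one to four 0x00 bytes."""
    if math.isnan(value):
        return b"\x00" * 4                  # NaN is the empty string
    if math.isinf(value):
        raise ValueError("infinity cannot be written with the allowed characters")
    raw = repr(float(value)).encode("ascii")
    return raw + b"\x00" * (4 - len(raw) % 4)

def decode_float(buf: bytes, offset: int = 0) -> Tuple[float, int]:
    """Decode one float; return (value, offset of the next field)."""
    end = buf.index(0, offset)              # first terminating 0x00 byte
    text = buf[offset:end]
    next_offset = offset + (len(text) // 4 + 1) * 4
    if not text:
        return math.nan, next_offset
    if not _FLOAT_RE.fullmatch(text):
        raise ValueError("not a valid EBS floating point string")
    return float(text.decode("ascii")), next_offset
```

Returning the next offset lets a caller step through attribute values that are sequences of simple types.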

d) single-line and multi-line text-strings

Text-strings are stored using the 16-bit character set UCS-2 (the 16-bit subset of ISO 10646, also known as 'Unicode') which covers all other character sets on this planet. UCS-2 characters are stored as sequences of 16-bit Bigendian values.

[Note: If you are unfamiliar with ISO 10646, it is sufficient to know that ASCII and ISO 8859-1 (ISO Latin 1) characters have the same code in this 16-bit character set, i.e. you get the correct 16-bit value by prefixing each ASCII or Latin-1 byte with 0x00. Check a copy of the ISO 10646 standard or of the compatible Unicode Standard (Version 1.1 or higher) if you want to support other characters (e.g., Cyrillic, Greek, Chinese, Japanese, IBM PC, etc.) and need to know their 16-bit codes.]

If text-strings are allowed to span several lines, the code 0x000a (LF, line feed) should be used as the only line separator between these lines. The last line is not followed by another 0x000a code. Strings always end with one or two 0x0000 codes so that the number of bytes in the string including the two or four 0x00-bytes at the end is always a multiple of four. If not otherwise specified, single-line text-strings should not have more than 64 characters (not including the 1 or 2 0x0000 codes at the end), but application programs must be able to cope with longer lines, e.g. by truncating them. Multi-line strings may have any number of lines but should also have not more than 64 characters per line (not including the 0x000a line separation code and the 0x0000 end markers) if not otherwise specified. An example text-string is:

  'hello'       0x00,0x68,0x00,0x65,0x00,0x6c,0x00,0x6c,0x00,0x6f,0x00,0x00
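
The text-string rules can be sketched in Python as follows. The names encode_text and decode_text are hypothetical. The sketch relies on the fact that the "utf-16-be" codec yields the same bytes as Bigendian UCS-2 for characters below 0x10000, and it rejects everything else.

```python
from typing import Tuple

def encode_text(text: str) -> bytes:
    """Encode a text-string as Bigendian UCS-2 with one or two 0x0000 end codes."""
    for ch in text:
        code = ord(ch)
        if code == 0 or code > 0xFFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError("character cannot be stored in a UCS-2 text-string")
    raw = text.encode("utf-16-be")
    # Use one end code if that already gives a multiple of four bytes, else two.
    end_codes = 1 if (len(raw) + 2) % 4 == 0 else 2
    return raw + b"\x00\x00" * end_codes

def decode_text(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode one text-string; return (text, offset of the next field)."""
    pos = offset
    while buf[pos:pos + 2] != b"\x00\x00":
        if pos + 2 > len(buf):
            raise ValueError("text-string is not terminated")
        pos += 2
    text = buf[offset:pos].decode("utf-16-be")
    length = pos - offset + 2       # characters plus the first end code
    length += -length % 4           # skip the second end code if there is one
    return text, offset + length
```

Multi-line strings need no special handling here, since the line separator 0x000a is an ordinary character ("\n") inside the string.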

Appendix A defines many commonly used attribute tags and the semantics of their values, and appendix B defines which tag values you may use to define your own attribute types.

The least significant bit of each attribute tag specifies whether the attribute value contains information about specific channels (bit is 1) or not (bit is 0). In this way, programs that add, remove or rearrange channel data in EBS files can leave unknown attributes with even tag numbers in the file. They should remove unknown attributes with odd tag numbers and modify odd numbered attributes that are known to the programmer, because their content might assume a special channel layout in the file that does not exist any more after the file modification.
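
The odd/even rule might look like this in a program that rearranges channels. This is a sketch under assumptions: the function name and the idea of passing the already rewritten values of known odd-numbered attributes in a dictionary are illustrative choices, not part of the specification.

```python
from typing import Dict, List, Tuple

def is_channel_specific(tag: int) -> bool:
    """True if the least significant bit of the tag is 1."""
    return tag & 1 == 1

def attributes_after_channel_edit(
    attributes: List[Tuple[int, bytes]],
    rewritten: Dict[int, bytes],
) -> List[Tuple[int, bytes]]:
    """Return the attribute list to write after the channel layout changed.

    rewritten maps odd tag numbers that the program understands to their
    values adapted to the new channel layout.
    """
    kept = []
    for tag, value in attributes:
        if not is_channel_specific(tag):
            kept.append((tag, value))            # even tag: keep, known or not
        elif tag in rewritten:
            kept.append((tag, rewritten[tag]))   # known odd tag: use adapted value
        # any other odd tag is dropped, its content may no longer fit
    return kept
```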

Each attribute tag shall appear not more than once in the variable headers.

2.3 The Data Part

The recorded data may consist of different types (e.g., signed 16-bit integers, unsigned 32-bit integers, signed 12-bit integers, floating point numbers) and these different types may be encoded in different ways (e.g., Bigendian, Littleendian, various compression methods). The values may also be ordered differently. The TIME-BASED sample ordering starts with the values of all channels at the first sample time followed by the values of all channels at the second sample time and so on. The CHANNEL-BASED ORDER of samples begins with all values of the first channel over the full recording time followed by all values of the second channel, etc. If the CHANNEL-BASED ORDER is used, the number m of samples MUST be indicated in byte 16 to 23 of the fixed header. Eight 0xff bytes in this field are only possible in combination with TIME-BASED ORDER formats.

The Encoding ID number stored in byte 8 to byte 11 of the fixed header may indicate one of the following data types and data encodings (others might be added in future versions of this specification):

TIB_16 (Encoding ID: 0x00000000):

This format stores 16-bit signed integer values with the high byte first in TIME-BASED ORDER. This means that e.g. the recorded values

     time    channel 1      channel 2      channel 3      n = 3

       0        20             13            1493
       1         5              7             307
       2       -11              9             421
       3       ...
       ...
       m-1

will be stored as

0x00,0x14,0x00,0x0d,0x05,0xd5,0x00,0x05,0x00,0x07,
0x01,0x33,0xff,0xf5,0x00,0x09,0x01,0xa5,...

(length: 2*n*m bytes, i.e. d >= (n*m*2)/4).

CIB_16 (Encoding ID: 0x00000001):

This format is very much like TIB_16 with the only difference that the values are stored in CHANNEL-BASED ORDER, i.e. the above example recording would be stored as

0x00,0x14,0x00,0x05,0xff,0xf5,...,0x00,0x0d,0x00,0x07,0x00,0x09,...,
0x05,0xd5,0x01,0x33,0x01,0xa5,...

TIL_16 (Encoding ID: 0x00000002):

This format is, like TIB_16, a TIME-BASED ORDER, 16-bit signed integer encoding, with the difference that the integer values are stored in the Littleendian format (i.e. beginning with the low byte), which makes efficient programming possible on systems that use Littleendian as their native integer format (e.g., INTEL processors, Transputers, ...). The example recording is then stored as:

0x14,0x00,0x0d,0x00,0xd5,0x05,0x05,0x00,0x07,0x00,0x33,0x01,0xf5,
0xff,0x09,0x00,0xa5,0x01,...

CIL_16 (Encoding ID: 0x00000003):

This format is, like CIB_16, a CHANNEL-BASED ORDER, 16-bit signed integer encoding, but in Littleendian format (i.e. beginning with the low byte). The example recording is stored as

0x14,0x00,0x05,0x00,0xf5,0xff,...,0x0d,0x00,0x07,0x00,0x09,0x00,...,
0xd5,0x05,0x33,0x01,0xa5,0x01,...
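
The four uncompressed encodings differ only in sample order and byte order, which the following Python sketch makes explicit. The table ENCODINGS and the function encode_int16 are hypothetical names, and samples[t][c] (time first, then channel) is an input layout assumed by the sketch.

```python
import struct
from typing import Sequence

# Encoding ID -> (name, sample order, struct byte-order prefix)
ENCODINGS = {
    0x00000000: ("TIB_16", "time", ">"),
    0x00000001: ("CIB_16", "channel", ">"),
    0x00000002: ("TIL_16", "time", "<"),
    0x00000003: ("CIL_16", "channel", "<"),
}

def encode_int16(samples: Sequence[Sequence[int]], encoding_id: int) -> bytes:
    """Encode samples[t][c] as 16-bit signed integers in the given encoding."""
    _name, order, byte_order = ENCODINGS[encoding_id]
    if order == "time":
        # all channels of sample time 0, then all channels of time 1, ...
        flat = [value for row in samples for value in row]
    else:
        # all values of channel 1, then all values of channel 2, ...
        flat = [value for channel in zip(*samples) for value in channel]
    return struct.pack(f"{byte_order}{len(flat)}h", *flat)

# First three sample times of the example recording (n = 3).
example = [[20, 13, 1493], [5, 7, 307], [-11, 9, 421]]
tib = encode_int16(example, 0x00000000)
cil = encode_int16(example, 0x00000003)
```

In every case the result is 2*n*m bytes long.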

TI_16D (Encoding ID: 0x00000010):

In this compressed TIME-BASED ORDER encoding, 16-bit signed integer values are stored, but they are encoded in a way that will in many applications need only a little bit more than 50% of the storage space of TIB_16 or TIL_16. The trick is that only the difference between two consecutive samples in the same channel is stored as a signed 2-complement 8-bit value ranging from -127 (0x81) to +127 (0x7f). A positive difference means that the next sample value in the same channel has a higher value. If the value is the first sample of a channel or if the difference is less than -127 or greater than +127, then the absolute value will be stored in a 3 byte sequence starting with -128 (0x80) followed by the full 16-bit signed integer value of the sample with the high byte first. I.e., our example recording from above would look like this:

0x80,0x00,0x14,0x80,0x00,0x0d,0x80,0x05,0xd5,0xf1,0xfa,0x80,0x01,
0x33,0xf0,0x02,0x72,...

The length of the data part in bytes can't be predicted with the parameters in the fixed header if this compressed encoding is used (d >= n*(m+2)/4).
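
A minimal Python sketch of a TI_16D encoder, assuming the samples are given as samples[t][c] and fit into 16-bit signed integers; the function name encode_ti_16d is hypothetical.

```python
import struct
from typing import Sequence

def encode_ti_16d(samples: Sequence[Sequence[int]]) -> bytes:
    """Encode samples[t][c] as TI_16D (difference coding, TIME-BASED ORDER)."""
    out = bytearray()
    previous = None                       # sample values of the previous time
    for row in samples:
        for channel, value in enumerate(row):
            if previous is not None and -127 <= value - previous[channel] <= 127:
                # small step: one signed byte holding the difference
                out += struct.pack(">b", value - previous[channel])
            else:
                # first sample of a channel or large step: 0x80 escape
                # followed by the absolute value, high byte first
                out += b"\x80" + struct.pack(">h", value)
        previous = row
    return bytes(out)

example = [[20, 13, 1493], [5, 7, 307], [-11, 9, 421]]
encoded = encode_ti_16d(example)
```

For this three-sample excerpt the result is 17 bytes long, compared with 18 bytes for TIB_16; the saving grows with longer recordings in which most differences fit into one byte.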

CI_16D (Encoding ID: 0x00000011):

The encoding is the same as TI_16D with the only difference that the sample values (i.e. the differences between them) are stored in CHANNEL-BASED ORDER. The example recording would look like this:

0x80,0x00,0x14,0xf1,0xf0,...,0x80,0x00,0x0d,0xfa,0x02,...,0x80,
0x05,0xd5,0x80,0x01,0x33,0x72,...
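
Decoding works in the opposite direction. The following Python sketch decodes CI_16D data, assuming n and m are taken from the fixed header; the function name decode_ci_16d is hypothetical and the result is returned as channels[c][t].

```python
import struct
from typing import List

def decode_ci_16d(data: bytes, n: int, m: int) -> List[List[int]]:
    """Decode CI_16D data with n channels of m samples into channels[c][t]."""
    pos = 0
    channels = []
    for _ in range(n):
        values: List[int] = []
        for t in range(m):
            (code,) = struct.unpack_from(">b", data, pos)
            pos += 1
            if code == -128:
                # 0x80 escape: the absolute 16-bit value follows, high byte first
                (value,) = struct.unpack_from(">h", data, pos)
                pos += 2
            elif t == 0:
                raise ValueError("the first sample of a channel must be absolute")
            else:
                value = values[-1] + code   # difference to the previous sample
            values.append(value)
        channels.append(values)
    return channels

data = bytes([0x80, 0x00, 0x14, 0xf1, 0xf0,
              0x80, 0x00, 0x0d, 0xfa, 0x02,
              0x80, 0x05, 0xd5, 0x80, 0x01, 0x33, 0x72])
decoded = decode_ci_16d(data, n=3, m=3)
```

Because the byte length of each channel is not known in advance, a channel in the middle of the file can only be reached by decoding the channels before it.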

[Note: It is expected that CIB_16 will be the most popular format. If you are confused by the many different encodings, just support CIB_16 and reject EBS files with other encoding IDs with a nice error message. There are tools available that allow easy conversion between the different encodings. On some popular processors, you might perhaps prefer CIL_16 if you operate on very large data sets with efficient methods (e.g. memory mapped files). Time will show whether the uncompressed TIME-BASED ORDER formats will be of use, and among the compressed formats, TI_16D will perhaps be the most popular version for archive and transfer purposes until more efficient compression techniques are available. If you have only one single channel, then there will be no difference between the TIME-BASED ORDER format and the corresponding CHANNEL-BASED ORDER format. Before you use a coin to decide whether you should indicate a TIME-BASED ORDER or a CHANNEL-BASED ORDER format, it is recommended to use the ID of the CHANNEL-BASED ORDER encoding.]

If a second variable header is present, between 0 and 3 zero padding bytes have to be appended after the above described encodings of the recording in order to give the whole data part a length in bytes that is a multiple of four. This will guarantee a 32-bit alignment for the second variable header part.
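
The padding rule amounts to rounding the encoded length up to the next multiple of four. A short Python sketch, with the hypothetical function name padded_data_part, computes the number of padding bytes and the smallest value d that can be stored in the fixed header:

```python
from typing import Tuple

def padded_data_part(encoded_bytes: int) -> Tuple[int, int]:
    """Return (number of zero padding bytes, smallest valid d in 32-bit words)."""
    padding = -encoded_bytes % 4          # 0, 1, 2 or 3
    return padding, (encoded_bytes + padding) // 4

# A TIB_16 recording with n = 3 channels and m = 3 samples has 2*3*3 = 18 bytes.
padding, d = padded_data_part(2 * 3 * 3)
```

Here two padding bytes are needed and d is 5, so part (4) starts 20 bytes after the final tag of part (2).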

As a convention, program user interfaces should give the channels numbers beginning with 1 and samples should be numbered beginning with 0.

[Note: It seems to be most natural for most people to start with 0 for points of time (e.g. digital clocks count from 0 to 59), but only computer scientists find it as obvious that the first channel might also have the number 0. This convention makes user interfaces of programs operating on EBS files more consistent. The numbering convention is only defined for numbers visible to the user of a program and is not intended for variables used internally within a program or for attributes in the variable header.]

The Encoding IDs in the range from 0x80000000 to 0xfffffffe are reserved for private additional encodings and the encoding ID 0xffffffff is reserved and must not be used in EBS files.

[Note: Please use random numbers for your private encoding IDs in the range 0x80000000 to 0xfffffffe and don't simply start at 0x80000000, in order to keep the odds of collisions with other people's private IDs small.]

If the need for a new standardized encoding arises, please contact the EBS coordinator (see appendix C) and it is likely that other standard encodings will be added.

Appendix A -- Standardized Attribute Tags

This appendix defines a number of useful attribute tags and the meaning of the corresponding attribute values. The attribute values defined here are simple types with the encoding recommended in section 2.2, sequences of these simple types or other special types (e.g. graphical diagrams or dates).

Attributes that do not refer to individual channels and thus have an even tag number:

0x00000002 IGNORE (length: any)
This attribute should just be ignored by any application.
It allows an attribute to be removed without copying the
whole file by just overwriting the tag field of this
attribute with the tag number of IGNORE. This attribute may
have any arbitrary value, but applications which delete
attributes should fill the value with 0x00 bytes so that
critical information (e.g. patient names in published
files) will surely be destroyed and not only be made
invisible.

This is the only attribute that may appear several times
in a variable header.

0x00000004 PATIENT_NAME (length: > 0 words, <= 33 words)
This single-line text-string may contain the full name of
the person of whom the signals have been recorded.

0x00000006 PATIENT_ID (length: > 0 words, <= 33 words)
This single-line text-string may contain additional
information that is used to identify the patient, e.g. a
patient number in a hospital, etc.

0x00000008 PATIENT_BIRTHDAY (length: 2 words)
This numeric string contains the birthday of the patient in
the 'yyyymmdd' format stored as ASCII digits (not as 16-bit
UCS-2 characters!). E.g., '19930210' (0x31,0x39,0x39,0x33,
0x30,0x32,0x31,0x30) means February 10, 1993. (This format
is one of the date/time formats defined in ISO 8601.)

0x0000000a PATIENT_SEX (length: 1 word)
This 32-bit integer value is 1 for male and 2 for female
patients. (The numbers are those specified by ISO 5218.)

0x0000000c SHORT_DESCRIPTION (length: > 0 words, <= 33 words)
A single-line text-string that summarizes with a few words
the contents of the file. This attribute is intended for
listings of many EBS files where each EBS file is listed in
a single line.

0x0000000e DESCRIPTION (length: > 0 words)
A multi-line text-string that may tell the user of a file
everything he/she might need to know in addition to the
standardized attributes, e.g. the conditions under which
the recording has been made, etc.

0x00000010 SAMPLE_RATE (length: > 0 words)
The value is the sample rate in Hz stored as a
floating point number. E.g., a sample rate of 1024 per
second (1024 Hz) might be stored as 0x31,0x30,0x32,
0x34,0x00,0x00,0x00,0x00 ('1024').

0x00000012 INSTITUTION (length: > 0 words, <= 33 words)
This single-line string may contain the name of the
institution, where the file has been recorded, processed,
etc.

0x00000014 PROCESSING_HISTORY (length: > 0 words)
This attribute is a sequence of multi-line strings. Each
string may describe a processing step that has been
performed in order to produce this file. This might e.g. be
the command line that has been used to start a program or a
list of parameters that have been applied. A program may
add its own processing description as another string to the
end of the already existing sequence. Also text information
about the equipment used to record the data and who did the
recording or processing can be stored here. The number of
multi-line text-strings in this attribute is determined by
the length of the attribute.

0x00000016 LOCATION_DIAGRAM (length: > 0 words)
This attribute contains a graphical diagram of the object
(e.g. brain, head, whole body, ...) from which the recorded
data has originated or any other diagram that may be used
to describe the positions of sensors/electrodes. The
attribute CHANNEL_LOCATIONS may assign to channels
coordinates in this diagram. In this way, software can
generate pictures that indicate the position of
electrodes/sensors on or in the body. This attribute
contains the background graphic for these pictures and
attribute CHANNEL_LOCATIONS contains the coordinates for
channel markers.

The value of LOCATION_DIAGRAM is a complete Computer
Graphics Metafile (CGM) as defined in ISO 8632. Only the
binary encoding of a CGM file as defined in ISO 8632-3 is
used. The end of the CGM file is filled with 0x00 to a
length in bytes divisible by 4. All coordinates are
specified as 16-bit integer values (i.e. VDC TYPE is
integer and INTEGER PRECISION is 16, which is the default
for the binary CGM encoding). The VDC EXTENT should be
specified for each picture. The attribute may contain
several pictures in the metafile. As most applications
won't need the full power of the CGM format, the following
subset of CGM elements is suggested as a minimum
requirement for software that uses this attribute:

BEGIN METAFILE, END METAFILE, BEGIN PICTURE, BEGIN
PICTURE BODY, END PICTURE, METAFILE VERSION, METAFILE
ELEMENT LIST, VDC EXTENT, POLYLINE

Programmers may of course support more CGM functionality
(e.g. colors, text, arcs, fill patterns, etc.) as defined
in ISO 8632 and it is possible that later versions of this
standard will add additional elements to this minimal
subset if necessary. Programs may ignore additional
elements and warn the user that the displayed diagram might
be incomplete or may ignore the whole attribute if
additional elements are present. Appendix F gives a short
introduction into the minimal CGM subset specified here.

Attributes that refer to a special channel layout and that have to be changed by programs which change, add, move or delete channels:

0x00000001 PREFERRED_INTEGER_RANGE (length: (1+1)*n words)
For integer data, this attribute gives display software a
hint, which value range might be most interesting in the
data. The value consists of a recommended display minimum
(32-bit signed integer) followed by a recommended display
maximum (32-bit signed integer) for each channel beginning
with channel 1. E.g., if in 16-bit signed integer data most
good values are in the range -2048 to +2047 in all
channels, then, if the value of this attribute is 0xff,
0xff,0xf8,0x00,0x00,0x00,0x07,0xff (repeated for each
channel), it will be easy for a visualization program to
find a nice default scaling factor. If both the minimum and
the maximum value for a channel are equal (e.g. both are
zero), then no preferred integer range is specified for
this channel as it would be the case for all channels if
this attribute were not present.
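
As an illustration, the value of PREFERRED_INTEGER_RANGE could be built and read as in the following Python sketch. Both function names are hypothetical, and returning None for a channel without a preferred range is a choice made for this sketch.

```python
import struct
from typing import List, Optional, Sequence, Tuple

def encode_preferred_ranges(ranges: Sequence[Tuple[int, int]]) -> bytes:
    """Encode one (minimum, maximum) pair per channel, beginning with channel 1."""
    return b"".join(struct.pack(">ii", low, high) for low, high in ranges)

def decode_preferred_ranges(value: bytes, n: int) -> List[Optional[Tuple[int, int]]]:
    """Decode the attribute value for n channels."""
    if len(value) != 8 * n:
        raise ValueError("expected (1+1)*n words")
    result: List[Optional[Tuple[int, int]]] = []
    for channel in range(n):
        low, high = struct.unpack_from(">ii", value, 8 * channel)
        # equal minimum and maximum means: no preferred range for this channel
        result.append(None if low == high else (low, high))
    return result

value = encode_preferred_ranges([(-2048, 2047), (0, 0)])
ranges = decode_preferred_ranges(value, n=2)
```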

0x00000003 UNITS (length: >= (1+1)*n words)
This attribute contains a sequence of physical unit
specifications, one for each channel. It assigns each
channel an SI unit (e.g. mA, mV, nT) and a quotient of a
physical quantity and the encoded sample value that
represents it. Each unit specification is a sequence of a
floating point value and a single-line text-string. The
floating point number is the number with which the sample
value must be multiplied in order to get the physical value
(e.g. '0.0025' if a sample value of 400 represents 1.0 mV
and the specified unit in the text-string is 'mV'). The
quotient is followed by a single-line text-string with the
usual abbreviation for the SI unit (not more than 8
characters, i.e. at most 20 bytes including the terminating
0x0000 words). E.g., the text-string for microvolts is
0x00,0xb5,0x00,0x56,0x00,0x00,0x00,0x00. Only linear
relations between the physical quantity and the sample
value in the encoded data can be described with this
attribute. If the float number is 'not a number'
(0x00,0x00,0x00,0x00), the physical unit and quantity are
unspecified for this single channel, just as they would be
for all channels if the whole attribute were absent. In
this case, the unit text-string should also be empty.
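
As an illustration, the following Python sketch decodes a UNITS value and applies the factor to a sample. The function names are hypothetical. It assumes big-endian IEEE 754 single-precision floats and text-strings made of 16-bit big-endian characters that end with one or two 0x0000 words so that each string fills whole 32-bit words, which is what the microvolt example suggests.

```python
import struct

NOT_A_NUMBER = bytes(4)  # the 0x00,0x00,0x00,0x00 pattern quoted in the text

def read_text_string(buf: bytes, pos: int):
    """Read one text-string starting at pos; return (text, next_pos)."""
    end = pos
    while buf[end:end + 2] != b"\x00\x00":
        if end + 2 > len(buf):
            raise ValueError("unterminated text-string")
        end += 2
    text = buf[pos:end].decode("utf-16-be")
    next_pos = end + 2
    next_pos += -next_pos % 4  # skip padding up to the next 32-bit boundary
    return text, next_pos

def parse_units(value: bytes, n_channels: int):
    """Return one (factor, unit) tuple per channel, or None where unspecified."""
    units, pos = [], 0
    for _ in range(n_channels):
        raw = value[pos:pos + 4]
        unit, pos = read_text_string(value, pos + 4)
        if raw == NOT_A_NUMBER:
            units.append(None)
        else:
            # Assumption: 32-bit big-endian IEEE 754 float.
            units.append((struct.unpack(">f", raw)[0], unit))
    return units

# Channel 1: factor 0.0025 with unit "mV"; channel 2: unspecified.
example = (struct.pack(">f", 0.0025) + "mV".encode("utf-16-be") + bytes(4)
           + NOT_A_NUMBER + bytes(4))
units = parse_units(example, 2)
factor, unit = units[0]
print(round(400 * factor, 6), unit)  # 1.0 mV
```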

0x00000005 CHANNEL_DESCRIPTION (length: >= (1+1)*n words, <= (5+33)*n words)
The attribute consists of a sequence of 2*n single-line
text-strings, one pair for each channel. The first string
in a pair must not contain more than 8 characters (not
including the 1 or 2 0x0000-words at the end of each
string). This string contains a very short name for the
channel that might e.g. be used to label it in diagrams,
etc. E.g., in EEG recordings, this will often be the name
of the electrode position in the usual 10-20-system, like
"F4-A1", "C4-Cz", etc. The second single-line text-string
in the pair that follows directly behind each short label
string may contain additional descriptive text for each
channel that does not fit in the short 8 character label
(e.g., in EEG recordings information about electrodes with
bad contact, etc.).

0x00000007 CHANNEL_GROUPS (length: >= 3 words)
Each channel may belong to zero, one or several groups. A
channel group might e.g. be used to group channels from the
same biological source (e.g., one group for EEG and one
group for ECG channels) so that they can be more
conveniently selected together or shown in different colors
in interactive programs. The CHANNEL_GROUPS attribute
contains a sequence of group descriptions. A single group
description consists of

- a single-line text-string with a short name for the
group (e.g. "EEG") with not more than 8 characters,
followed by
- a single-line text-string with a description of the
group (this may of course be the empty string
0x00000000 if no description is available), followed by
- an unsigned 32-bit integer number g with the number of
channels in this group which is followed by
- g unsigned 32-bit integer numbers with the numbers of
the channels (with 0 being the first channel) that
belong to this group.

If groups are associated with numbers in a user interface,
then the first group in this attribute should be assigned
number 1.
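
A group description could be written out as sketched below in Python. The function names are hypothetical, and the code assumes big-endian 32-bit integers and text-strings of 16-bit big-endian characters ending with one or two 0x0000 words. Under these assumptions an empty group with empty strings occupies exactly 3 words, which matches the minimum length given for this attribute.

```python
import struct

def encode_text_string(text: str) -> bytes:
    """Encode as 16-bit characters plus one or two 0x0000 terminator words."""
    data = text.encode("utf-16-be") + b"\x00\x00"
    return data + bytes(-len(data) % 4)  # pad to a 32-bit boundary

def encode_channel_group(name: str, description: str, channels) -> bytes:
    """Build one group description: name, description, count g, g channel numbers."""
    if len(name) > 8:
        raise ValueError("group name is limited to 8 characters")
    return (encode_text_string(name)
            + encode_text_string(description)
            + struct.pack(">I", len(channels))
            + struct.pack(f">{len(channels)}I", *channels))

# A group "EEG" without description that contains the first three channels.
group = encode_channel_group("EEG", "", [0, 1, 2])
print(len(group) // 4, "words")  # 7 words
```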

0x00000009 EVENTS (length: any)
This attribute allows events or time intervals in the
recording to be marked, either for all channels together or
for individual channels. Each event or interval belongs to
one event list
and each event list has a short name and a description
text. In addition, each single event or interval may have a
description string. The attribute contains a sequence of
event lists. The number of event lists is determined by
the length of the attribute. Each event list consists of

- a single-line text-string with the short name (not more
than 8 characters), followed by
- a multi-line description string, followed by
- the number e (unsigned 32-bit integer) of
events/intervals in this event list, followed by
- a sequence of e events or intervals.

Each single event or interval in an event list is described
by the following sequence

- An unsigned 32-bit integer channel number. The first
channel is represented by number 0 and 0xffffffff
indicates that this event or interval is not associated
with a single channel.
- An unsigned 64-bit integer number that represents the
position (the first sample has position 0) of the event
or the start position of an interval.
- An unsigned 64-bit integer number that has the value
0x0000000000000000 for events or represents the length
of an interval if it has any other value.
- A single-line text-string (as usual not more than 64
characters long) that contains either a textual
description of the type of event or interval that has
been marked or just an empty string.

The whole event/interval sequence in each event list
consists of these event/interval descriptions sorted
ascending by their start sample number (second integer
value).
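
The fixed-size part of an event/interval description (five 32-bit words before the text-string) might be decoded as in this Python sketch. The names are hypothetical and big-endian byte order is an assumption. The trailing text-string is not decoded here; the example simply skips an empty string (0x00000000).

```python
import struct

ALL_CHANNELS = 0xFFFFFFFF  # event is not associated with a single channel
HEADER = ">IQQ"            # channel, start position, length (assumed big-endian)

def unpack_event_header(buf: bytes, pos: int = 0):
    """Decode channel, start and length; return (event, position of the text-string)."""
    channel, start, length = struct.unpack_from(HEADER, buf, pos)
    event = {
        "channel": None if channel == ALL_CHANNELS else channel,
        "start": start,
        "length": length,
        "is_interval": length != 0,  # length 0 marks a point event
    }
    return event, pos + struct.calcsize(HEADER)

def is_sorted_by_start(events) -> bool:
    """Check the ordering rule: ascending by start sample number."""
    return all(a["start"] <= b["start"] for a, b in zip(events, events[1:]))

EMPTY_STRING = bytes(4)
raw = (struct.pack(HEADER, ALL_CHANNELS, 1000, 0) + EMPTY_STRING
       + struct.pack(HEADER, 2, 1500, 250) + EMPTY_STRING)
first, pos = unpack_event_header(raw)
second, _ = unpack_event_header(raw, pos + len(EMPTY_STRING))
print(first["is_interval"], second["is_interval"])  # False True
```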

0x0000000b RECORDING_TIME (length: 2 or 4 words)
This is the time when the recording of the physical signals
started. Two different formats are allowed, either only the
date (as in PATIENT_BIRTHDAY) or date and time.

The date and time format is 'yyyymmddThhmmss' stored as
ASCII digits (not 16-bit UCS-2 characters!), the ASCII
character 'T' and one final 0-byte. E.g. '19930211T153159'
stored as 0x31,0x39,0x39,0x33,0x30,0x32,0x31,0x31,0x54,
0x31,0x35,0x33,0x31,0x35,0x39,0x00 means that the
recording started on February 11, 1993, 3:31:59 pm local
time.

If no time is available, the date alone may be stored as
'19930211' or in bytes 0x31,0x39,0x39,0x33,0x30,0x32,
0x31,0x31.

[Note: These attribute formats are two of the date/time
formats specified in ISO 8601. The ASCII 'T' has been
inserted for compatibility with the ISO standard. This
attribute has an odd tag number, because it has to be
modified or removed if an initial part of a recording is
removed from an EBS file: the recording time of the first
sample then changes.]

If this attribute is neither exactly 4 words long with a
'T', a 0x00 and ASCII digits at the specified positions,
nor exactly 2 words long containing only ASCII digits, then
it should be ignored, because it could be another ISO 8601
time format that might be specified as an alternative in a
future version of this standard if necessary (e.g. with
time zone, milliseconds, several concatenated intervals of
time).
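
The two accepted layouts and the "ignore anything else" rule can be sketched in Python as follows. The function name is hypothetical. Returning None for a value that has digits in the right places but is not a valid calendar date is a choice made for this sketch, not something the text prescribes.

```python
from datetime import date, datetime

def parse_recording_time(value: bytes):
    """Return a datetime (date and time), a date (date only), or None to ignore."""
    try:
        if len(value) == 16 and value[8:9] == b"T" and value[15] == 0:
            # 4 words: 'yyyymmddThhmmss' followed by one 0-byte.
            if (value[:8] + value[9:15]).isdigit():
                return datetime.strptime(value[:15].decode("ascii"), "%Y%m%dT%H%M%S")
        elif len(value) == 8 and value.isdigit():
            # 2 words: 'yyyymmdd'.
            return datetime.strptime(value.decode("ascii"), "%Y%m%d").date()
    except ValueError:
        pass  # correct layout, but not a valid date or time
    return None

print(parse_recording_time(b"19930211T153159\x00"))  # 1993-02-11 15:31:59
print(parse_recording_time(b"19930211"))             # 1993-02-11
print(parse_recording_time(b"1993-02-11"))           # None
```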

0x0000000d CHANNEL_LOCATIONS (length: any)
This attribute may only be present together with a
LOCATION_DIAGRAM attribute. It defines the locations of
sensors/electrodes in the coordinate space (VDC) of the
graphical diagrams in LOCATION_DIAGRAM. Each channel may
have zero, one or several positions, i.e. a channel may
appear at several places in a diagram and in different
diagrams. A channel may be associated with several single
points or with pairs of points, which might be represented
graphically as arrows from the first point to the second
one. The value of this attribute is a sequence of
positions (each is a point or an arrow representing a
channel) and each position is a sequence of the following
six 32-bit integer values:

- channel number (the first channel has number 0,
unsigned value).
- picture number (the first picture in the CGM file
of LOCATION_DIAGRAM has number 0, unsigned value).
- X1 coordinate (signed value)
- Y1 coordinate (signed value)
- X2 coordinate (signed value)
- Y2 coordinate (signed value)

Several positions can have the same channel number. For
point positions, X1 and Y1 are the coordinates of the
points and X2 and Y2 have the special value 0x80000000. For
arrow positions, X1 and Y1 are the coordinates of the tail
and X2 and Y2 are those of the head. Arrows may e.g. be
used to indicate that a channel represents the difference
potential between two electrode positions. The coordinates
are all inside the CGM VDC extent.
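
A reader for this attribute might look like the Python sketch below. The function name and the dictionary keys are hypothetical, and big-endian byte order is an assumption. The special value 0x80000000 equals -2147483648 when it is read as a signed 32-bit integer.

```python
import struct

NO_COORDINATE = -0x80000000  # 0x80000000 read as a signed 32-bit value

def parse_channel_locations(value: bytes):
    """Split the value into point and arrow positions (six 32-bit words each)."""
    positions = []
    for ch, pic, x1, y1, x2, y2 in struct.iter_unpack(">IIiiii", value):
        if x2 == NO_COORDINATE and y2 == NO_COORDINATE:
            positions.append({"channel": ch, "picture": pic, "point": (x1, y1)})
        else:
            positions.append({"channel": ch, "picture": pic,
                              "tail": (x1, y1), "head": (x2, y2)})
    return positions

# Channel 0 as a point and channel 1 as an arrow, both in the first picture.
example = (struct.pack(">IIiiii", 0, 0, 100, 200, NO_COORDINATE, NO_COORDINATE)
           + struct.pack(">IIiiii", 1, 0, 100, 200, 300, -400))
for position in parse_channel_locations(example):
    print(position)
```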

0x0000000f FILTERS (length: >= n words)
Information about the filters that have been applied to
each channel may be stored here. The attribute contains a
sequence of filter lists, one for each channel. It may only
be present if a SAMPLE_RATE attribute is also present. For
each channel, the filter list consists of a sequence of
filter specifications followed by 0xffffffff (i.e. the
attribute value contains at least one final 0xffffffff for
each channel). The following filter specifications may
appear in a filter list:

- lowpass filter: it is specified by a sequence of
the following three values.

o The first 32-bit integer number 0x00000001
identifies the filter as a lowpass filter.

o The second parameter is the cutoff frequency of the
filter [the usual -3 dB limit, i.e. the frequency
where the output voltage has been decreased to
1/sqrt(2) (71%) of the input voltage] which is
stored as a positive floating point value in Hz.

o The third value describes the falloff after the
cutoff frequency. It stores the attenuation in dB
per decade as a negative floating point value. If
this value is not known, a not-a-number value
(0x00000000) may be used here.

[Note: A -20 falloff value represents a filter
where the output voltage has decreased to -20 dB
(that is 10% of its input voltage) at a frequency
which is 10 times the cutoff frequency (decade).
This is identical to the alternative description
that the filter has a -6 dB/octave falloff,
i.e. the output voltage has dropped to 50% (-6 dB)
at double cutoff frequency. In general, a p-pole
filter (also known as a filter of order p) is
stored as the value -20*p.]

- highpass filter: it is specified by a sequence of
the following three values.

o The first 32-bit integer number 0x00000002
identifies the filter as a highpass filter.

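o The second parameter is the cutoff frequency of the
filter (again the usual -3 dB limit) which is stored as
a positive floating point value in Hz.

[Note: If you are interested in the time constant t
in seconds of a highpass or lowpass filter and you
know only the cutoff frequency f in Hz: t = 1 /
(2*pi*f).]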

o The third value describes the falloff before the
cutoff frequency. It stores the attenuation in dB
per decade as a negative floating point value. If
this value is not known, a not-a-number value
(0x00000000) may be used here.

- notch filter: it is specified by a sequence of
the following three values.

o The first 32-bit integer number 0x00000003
identifies the filter as a notch filter which
attenuates only the frequencies around a single
peak frequency.

o The second parameter is the peak frequency of the
filter (the most attenuated frequency) which is
stored as a positive floating point value in Hz.

o The third value describes the falloff around the
peak frequency. It stores the attenuation in dB per
decade as a negative floating point value. If this
value is not known, a not-a-number value
(0x00000000) may be used here.
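
The filter lists can be walked as in the following Python sketch, which also includes the two conversions mentioned in the notes (dB to voltage ratio, cutoff frequency to time constant). All names are hypothetical. The code assumes big-endian 32-bit integers and IEEE 754 single-precision floats, and it does not treat the not-a-number pattern for an unknown falloff specially.

```python
import math
import struct

END_OF_LIST = 0xFFFFFFFF
FILTER_NAMES = {1: "lowpass", 2: "highpass", 3: "notch"}

def parse_filters(value: bytes, n_channels: int):
    """Return, per channel, a list of (kind, frequency_hz, falloff_db_per_decade)."""
    channels, pos = [], 0
    for _ in range(n_channels):
        filters = []
        while True:
            (kind,) = struct.unpack_from(">I", value, pos)
            pos += 4
            if kind == END_OF_LIST:
                break
            if kind not in FILTER_NAMES:
                raise ValueError(f"unknown filter type 0x{kind:08x}")
            freq, falloff = struct.unpack_from(">ff", value, pos)
            pos += 8
            filters.append((FILTER_NAMES[kind], freq, falloff))
        channels.append(filters)
    return channels

def db_to_voltage_ratio(db: float) -> float:
    """Convert an attenuation in dB to an output/input voltage ratio."""
    return 10 ** (db / 20)

def time_constant(cutoff_hz: float) -> float:
    """t = 1 / (2*pi*f), in seconds."""
    return 1 / (2 * math.pi * cutoff_hz)

# Channel 1: 2-pole lowpass at 70 Hz and 1-pole highpass at 0.5 Hz; channel 2: none.
example = (struct.pack(">Iff", 1, 70.0, -40.0) + struct.pack(">Iff", 2, 0.5, -20.0)
           + struct.pack(">I", END_OF_LIST) + struct.pack(">I", END_OF_LIST))
print(parse_filters(example, 2))
print(round(db_to_voltage_ratio(-20), 3))  # 0.1
```

A -20 dB/decade falloff corresponds to a voltage ratio of 0.1 one decade beyond the cutoff frequency, which is the 10% figure given in the note on lowpass filters.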

Feel free to use whichever of these attributes you need, to use none at all, or to define your own attribute tags as described in the next appendix.
