Ultima 5 File Formats (PC version)
==================================

Last updated on 2006-May-19.
Please send additions, corrections and feedback to this e-mail address:
Remove space + vowels from "marc winterrowd" and append "at yahoo dot com"


Tools for reverse-engineering real-mode DOS programs
----------------------------------------------------

386SWAT is a free (but not open-source) debugger for DOS:
http://www.sudleyplace.com/swat/swat.htm

NASM (open-source assembler) comes with a disassembler:
http://nasm.sourceforge.net/

If you don't have MS-DOS, try FreeDOS:
http://www.freedos.org/

pmaCompare is a Windows utility that shows the differences between two
binary files. Very useful for comparing savegames with small differences.
http://www.pmasoft.net/englisch/binarycomp.htm


LZW compression
---------------

Some files from the PC version of Ultima 5 have been compressed with the
LZW algorithm:
*.4
*.16
(and possibly others)

There's a utility for decompressing Ultima 6 files, which also works with
files from Ultima 5. You can download the u6decode source here:
http://www.geocities.com/nodling/


*.4
---

These files are LZW-compressed. The uncompressed files contain 4-color
images.
They have the same format as the *.16 files, except that each pixel has
2 bits.


*.16
----

These files are LZW-compressed. The uncompressed files contain 16-color
images.

   TILES.16
   --------
   Contains 0x200 tiles.
   Tile format: 16x16 pixels, 4 bits per pixel.

   struct Tiles_16 {
      Tile tiles[0x200];
   }

   struct Tile {
      Tile_Row rows[16];
   }

   struct Tile_Row {
      uint8 pixel_data[8]; // 4 bits per pixel
   }
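
A minimal sketch of unpacking one tile from the decompressed TILES.16 data,
following the structs above. The function name is hypothetical, and the
nibble order (high nibble = left pixel) is an assumption, since the notes
above do not say which half of a byte comes first.

```python
TILE_SIZE = 16                               # tiles are 16x16 pixels
BYTES_PER_ROW = TILE_SIZE // 2               # 4 bits per pixel -> 8 bytes
BYTES_PER_TILE = BYTES_PER_ROW * TILE_SIZE   # 128 bytes per tile


def unpack_tile(data: bytes, index: int) -> list[list[int]]:
    """Return tile `index` as 16 rows of 16 palette indices (0-15).

    `data` must be the already-decompressed contents of TILES.16.
    ASSUMPTION: the high nibble of each byte is the left pixel.
    """
    start = index * BYTES_PER_TILE
    tile = data[start:start + BYTES_PER_TILE]
    if len(tile) != BYTES_PER_TILE:
        raise ValueError("tile index out of range")
    rows = []
    for y in range(TILE_SIZE):
        row = []
        for b in tile[y * BYTES_PER_ROW:(y + 1) * BYTES_PER_ROW]:
            row.append(b >> 4)      # assumed left pixel
            row.append(b & 0x0F)    # assumed right pixel
        rows.append(row)
    return rows
```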


   ITEMS.16 and MON*.16
   --------------------
   file = number_of_entries, set of offset16, set of masked_image

   number_of_entries = uint16
   offset16 = uint16
   masked_image = image, mask

   image = width, height, set of pixel4
   mask = width, height, set of pixel1

   width = uint16
   height = uint16
   pixel4 = uint8 (4 bits per pixel)
   pixel1 = uint8 (1 bit per pixel)


   All other *.16 files
   --------------------
   file = number_of_entries, set of offset32, set of image

   number_of_entries = uint16
   offset32 = uint32
   image = width, height, set of pixel4

   width = uint16
   height = uint16
   pixel4 = uint8 (4 bits per pixel)

Notes:
1) The end of each row is padded with 0-7 black pixels (value = 0), so that
the stored row is a multiple of 8 pixels long, i.e. bytes_per_row % 4 == 0.
The width does not include the padding.
2) In DNG*.16, offsets 0x8 and 0x18 are zero.
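
The padding rule in note 1 can be sketched as follows. This is only an
illustration for already-decompressed data: the function names are
hypothetical, and both the little-endian byte order and the nibble order
(high nibble = left pixel) are assumptions.

```python
import struct


def bytes_per_row(width: int) -> int:
    """Bytes occupied by one stored row of a 4-bit image.

    Rows are padded to a multiple of 8 pixels, which is the same as a
    multiple of 4 bytes at 4 bits per pixel.
    """
    padded_width = (width + 7) & ~7
    return padded_width // 2


def read_image(data: bytes, offset: int) -> tuple[int, int, list[list[int]]]:
    """Read one image (width, height, pixel rows) starting at `offset`.

    ASSUMPTIONS: little-endian uint16 fields, high nibble = left pixel.
    Padding pixels are dropped from the returned rows.
    """
    width, height = struct.unpack_from("<HH", data, offset)
    stride = bytes_per_row(width)
    pos = offset + 4
    rows = []
    for _ in range(height):
        pixels = []
        for b in data[pos:pos + stride]:
            pixels.extend((b >> 4, b & 0x0F))
        rows.append(pixels[:width])
        pos += stride
    return width, height, rows
```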


*.CBT
-----

Combat maps (11x11 tiles).
File format:

struct CBT_File {
   Combat_Map c_maps[n];
}

struct Combat_Map {
   Map_Row              row0[1];   
   // row 1 = east
   // row 2 = west
   // row 3 = south
   // row 4 = north
   Map_PlayerPos_Row    row1_4[4];
   Map_MonsterTile_Row  row5[1];
   Map_MonsterX_Row     row6[1];
   Map_MonsterY_Row     row7[1];
   Map_Row              row8_10[3];   
}

struct Map_Row {
   uint8 tiles[11];
   const uint8 zeroes[21] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
}

struct Map_PlayerPos_Row {
   uint8 tiles[11];
   uint8 initial_X[6]; // initial x position of each party member
   uint8 initial_Y[6]; // initial y position of each party member
   uint8 zeroes[9] = {0,0,0,0,0,0,0,0,0};
}

struct Map_MonsterTile_Row {
   uint8 tiles[11];
   uint8 monster_tiles[16]; // tile for each monster
   uint8 zeroes[5] = {0,0,0,0,0};
}

struct Map_MonsterX_Row {
   uint8 tiles[11];
   uint8 initial_X[16]; // initial x position of each monster
   uint8 zeroes[5] = {0,0,0,0,0};
}

struct Map_MonsterY_Row {
   uint8 tiles[11];
   uint8 initial_Y[16]; // initial y position of each monster
   uint8 zeroes[5] = {0,0,0,0,0};
}

Notes:
- The initial party member positions depend on the direction from which
the party entered the map.
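
As a rough sketch, one combat map can be split into its parts like this.
Every row is 32 bytes, so a map is 11 * 32 = 352 bytes. The function name
and the dictionary keys are hypothetical and only serve the example.

```python
ROW_BYTES = 32                 # 11 tiles + 21 bytes of extra data/padding
MAP_BYTES = 11 * ROW_BYTES     # 352 bytes per combat map
ENTRY_ROWS = {"east": 1, "west": 2, "south": 3, "north": 4}


def parse_combat_map(data: bytes, index: int) -> dict:
    """Split combat map `index` of a .CBT file into tiles and positions."""
    base = index * MAP_BYTES
    rows = [data[base + r * ROW_BYTES:base + (r + 1) * ROW_BYTES]
            for r in range(11)]
    if len(rows[-1]) != ROW_BYTES:
        raise ValueError("map index out of range")

    tiles = [list(row[:11]) for row in rows]

    # Rows 1-4: party start positions, one row per entry direction.
    party = {}
    for direction, r in ENTRY_ROWS.items():
        xs, ys = rows[r][11:17], rows[r][17:23]
        party[direction] = list(zip(xs, ys))

    # Rows 5-7: tile, x and y for each of the 16 monster slots.
    monsters = list(zip(rows[5][11:27], rows[6][11:27], rows[7][11:27]))
    return {"tiles": tiles, "party": party, "monsters": monsters}
```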


*.CH
----

Font files with 128 characters each.
ibm.ch contains the Latin characters and a number of symbols.
runes.ch contains the Britannian runes and the remaining symbols.

File format:
Each character is 8x8 pixels, 1 bit per pixel (0 = black, 1 = white).
I don't know which palette indices the game uses for 0 and 1.
Each byte represents 8 pixels. The most significant bit represents the
leftmost pixel, and the least significant bit represents the rightmost
pixel.
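
The bit order described above can be illustrated with a small sketch that
renders one character as text. It assumes that each character occupies 8
consecutive bytes, one byte per row, stored top to bottom; the notes above
only state the bit order within a byte. The function name is hypothetical.

```python
def glyph_rows(data: bytes, char_index: int) -> list[str]:
    """Return character `char_index` of a .CH file as 8 strings of '#'/'.'.

    ASSUMPTION: each character occupies 8 consecutive bytes, one per row,
    stored top to bottom.
    """
    glyph = data[char_index * 8:char_index * 8 + 8]
    if len(glyph) != 8:
        raise ValueError("character index out of range")
    rows = []
    for byte in glyph:
        # Bit 7 is the leftmost pixel, bit 0 the rightmost.
        rows.append("".join("#" if byte & (0x80 >> x) else "."
                            for x in range(8)))
    return rows
```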


*.DAT
-----

   BRIT.DAT
   --------
   The Britannian map.
   Its size is 256x256 tiles.
   It is divided into 256 chunks. Each chunk has a size of 16x16 tiles,
   and each tile is stored as a uint8 (in Ultima 4, the chunk size was
   32x32 tiles).
   To save space, chunks that are all water (tile 0x1) were left out.
   The location of the all-water chunks is stored in DATA.OVL.
   The chunks are stored from west to east, north to south, i.e. the first
   chunk in the uncompressed map is the one in the northwest corner.
   The tiles in a chunk are also stored from west to east, north to south.
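
A sketch of rebuilding the full BRIT.DAT map from the stored chunks and the
chunk table at offset 0x3886 of DATA.OVL (see the *.OVL section). It assumes
that a table entry other than 0xFF is a chunk number, so that the chunk
starts at entry * 256 in BRIT.DAT. The function name is hypothetical.

```python
CHUNK = 16                    # chunk edge length in tiles
CHUNK_BYTES = CHUNK * CHUNK   # 256 bytes per stored chunk
WATER_TILE = 0x01
CHUNK_TABLE_OFFSET = 0x3886   # chunk table inside DATA.OVL


def expand_brit_map(brit: bytes, data_ovl: bytes) -> list[bytearray]:
    """Rebuild the full 256x256 Britannia map as 256 rows of 256 tiles."""
    table = data_ovl[CHUNK_TABLE_OFFSET:CHUNK_TABLE_OFFSET + 0x100]
    if len(table) != 0x100:
        raise ValueError("DATA.OVL is too short")
    full = [bytearray([WATER_TILE] * 256) for _ in range(256)]
    for n, entry in enumerate(table):
        if entry == 0xFF:
            continue                      # all-water chunk, already filled
        cy, cx = divmod(n, 16)            # chunks run west-east, north-south
        src = entry * CHUNK_BYTES         # ASSUMPTION: entry counts chunks
        for row in range(CHUNK):
            line = brit[src + row * CHUNK:src + (row + 1) * CHUNK]
            full[cy * CHUNK + row][cx * CHUNK:(cx + 1) * CHUNK] = line
    return full
```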


   LOOK2.DAT
   ---------
   This file contains the "look" descriptions for the 0x200 tiles.
   Each description is a zero-terminated ASCII string.
   Some descriptions consist only of an asterisk, which the game displays
   as a diamond.
   The offsets are relative to the beginning of the file.

   look2.dat = set(0x200) of offset16, set(0x200) of ascii_string

   offset16 = uint16
   ascii_string = set of ascii_char, terminator

   ascii_char = uint8
   terminator = (uint8) 0
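
The layout above can be read with a few lines of code. This is a sketch
only: the function name is hypothetical and the little-endian byte order of
the offsets is an assumption.

```python
import struct

TILE_COUNT = 0x200


def look_description(data: bytes, tile: int) -> str:
    """Return the "look" text for `tile` from the contents of LOOK2.DAT.

    ASSUMPTION: offsets are little-endian.
    """
    if not 0 <= tile < TILE_COUNT:
        raise ValueError("tile out of range")
    (offset,) = struct.unpack_from("<H", data, tile * 2)
    end = data.index(0, offset)           # strings are zero-terminated
    return data[offset:end].decode("ascii")
```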


   MISCMAPS.DAT
   ------------
   This file contains:
   1) campfire/battle screens (11x11 tiles)
   2) intro screens (19x4 tiles)
   3) script data for the intro

   struct Miscmaps_DAT {
      Map_11x11 battle_maps[4];
      Map_19x4 intro_maps[4];
      uint8 script_data[0x28F];
   }

   struct Map_11x11 {
      Row_11x11 rows[11];
   }

   // each row is padded with 5 zeroes to make it 16 bytes long
   struct Row_11x11 {
      uint8 tiles[11];
      const uint8 zeroes[5] = {0,0,0,0,0};
   }

   struct Map_19x4 {
      Row_19x4 rows[4];
   }

   // each row is padded with 13 zeroes to make it 32 bytes long
   struct Row_19x4 {
      uint8 tiles[19];
      const uint8 zeroes[13] = {0,0,0,0,0,0,0,0,0,0,0,0,0};
   }

   Notes:
   - the intro maps are stored in the order in which they appear during the
   introduction:
   1) "The Summoning"
   2) "The Journey"
   3) "The Arrival"
   4) "The Welcoming"
   - the script data controls the movement of NPCs in the introduction
   - todo: battle screen descriptions, script format
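
A sketch of splitting MISCMAPS.DAT according to the structs above. The
padding bytes at the end of each row are dropped; the function name is
hypothetical.

```python
BATTLE_ROW, INTRO_ROW = 16, 32     # stored row lengths including padding
SCRIPT_BYTES = 0x28F


def split_miscmaps(data: bytes):
    """Split MISCMAPS.DAT into battle maps, intro maps and script data."""
    pos = 0
    battle_maps = []
    for _ in range(4):
        # 11 rows of 11 tiles; the 5 padding bytes per row are skipped
        battle_maps.append(
            [list(data[pos + r * BATTLE_ROW:pos + r * BATTLE_ROW + 11])
             for r in range(11)])
        pos += 11 * BATTLE_ROW
    intro_maps = []
    for _ in range(4):
        # 4 rows of 19 tiles; the 13 padding bytes per row are skipped
        intro_maps.append(
            [list(data[pos + r * INTRO_ROW:pos + r * INTRO_ROW + 19])
             for r in range(4)])
        pos += 4 * INTRO_ROW
    script_data = data[pos:pos + SCRIPT_BYTES]
    return battle_maps, intro_maps, script_data
```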


   SIGNS.DAT
   ---------
   signs.dat = set(0x21) of offset16, set(0x21) of sign_group

   offset16 = uint16
   sign_group = set of sign_data

   sign_data = header4, set of sign_char, terminator
   
   header4 = location, coord_Z, coord_X, coord_Y
   location, coord_Z, coord_X, coord_Y = uint8   
   sign_char = uint8
   terminator = (uint8) 0

   Notes:
   - offset16[i] points to the signs in location i. For locations without
   signs, the corresponding offset16 is 0.
   If you remove the zero values, you'll find that the remaining offset16's
   are sorted in ascending order.
   - each sign starts with a 4-byte header, specifying location, z, x and
   y coordinates. The locations can be the following:
   
     0x0      Britannia/Underworld
     -- Towns --
     0x1      Moonglow
     0x2      Britain
     0x3      Jhelom
     0x4      Yew
     0x5      Minoc
     0x6      Trinsic
     0x7      Skara Brae
     0x8      New Magincia
     -- Lighthouses --
     0x9      Fogsbane
     0xA      Stormcrow
     0xB      Greyhaven
     0xC      Waveguide
     -- Huts --
     0xD      Iolo's Hut
     0xE      Sutek's Hut
     0xF      Sin'Vraal's Hut
     0x10     Grendel's Hut
     -- Castles --
     0x11     Castle British
     0x12     Castle Blackthorn
     -- Villages --
     0x13     West Brittany
     0x14     North Brittany
     0x15     East Brittany
     0x16     Paws
     0x17     Cove
     0x18     Buccaneer's Den
     -- Keeps --
     0x19     Ararat
     0x1A     Bordermarch
     0x1B     Farthing
     0x1C     Windemere
     0x1D     Stonegate
     -- Castles of the Principles --
     0x1E     The Lycaeum
     0x1F     Empath Abbey
     0x20     Serpent's Hold
     
   - Location numbers are identical to those in saved.gam.
   z coordinates have the same meaning as in saved.gam:
   0xFF = basement/Underworld
   0 = ground floor/Britannia
   1 = first floor, etc.      
   - I don't know if the game supports signs in dungeons.
   - signs can contain characters from both ibm.ch and runes.ch:
   0 <= sign_char <= 0x7F --> runes.ch[sign_char]
   0x80 <= sign_char <= 0xFF --> ibm.ch[sign_char - 0x80]
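
A sketch of reading the signs for one location. Two things here are
assumptions rather than known facts: the offsets are taken to be
little-endian, and a sign group is taken to run up to the next larger offset
in the table (or to the end of the file). The function names and dictionary
keys are hypothetical.

```python
import struct

LOCATION_COUNT = 0x21


def decode_sign_char(c: int) -> tuple[str, int]:
    """Map a sign_char to (font file, glyph index)."""
    return ("runes.ch", c) if c < 0x80 else ("ibm.ch", c - 0x80)


def read_signs(data: bytes, location: int) -> list[dict]:
    """Return the signs for `location` from the contents of SIGNS.DAT.

    ASSUMPTIONS: offsets are little-endian, and a sign group runs until the
    next larger offset in the table (or the end of the file).
    """
    offsets = struct.unpack_from("<%dH" % LOCATION_COUNT, data, 0)
    start = offsets[location]
    if start == 0:
        return []                          # location has no signs
    later = [o for o in offsets if o > start]
    end = min(later) if later else len(data)
    signs, pos = [], start
    while pos + 4 < end:
        loc, z, x, y = data[pos:pos + 4]   # 4-byte header
        term = data.index(0, pos + 4)      # text ends at the terminator
        signs.append({"location": loc, "z": z, "x": x, "y": y,
                      "text": list(data[pos + 4:term])})
        pos = term + 1
    return signs
```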


   UNDER.DAT
   ---------
   The Underworld map.
   Its size is 256x256 tiles.
   It is divided into 256 chunks. Each chunk has a size of 16x16 tiles,
   and each tile is stored as a uint8 (in Ultima 4, the chunk size was
   32x32 tiles).
   Unlike BRIT.DAT, it is not compressed (no chunks were left out).
   The chunks are stored from west to east, north to south, i.e. the first
   chunk in the map is the one in the northwest corner.
   The tiles in a chunk are also stored from west to east, north to south.
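
Because no chunks are left out of UNDER.DAT, a tile can be looked up directly
from its map coordinates. A small sketch of the index arithmetic implied by
the chunk layout above (the function name is hypothetical):

```python
def under_tile(data: bytes, x: int, y: int) -> int:
    """Return the tile at map position (x, y) from UNDER.DAT."""
    if not (0 <= x < 256 and 0 <= y < 256):
        raise ValueError("coordinates out of range")
    chunk = (y // 16) * 16 + (x // 16)     # 16 chunks per row of chunks
    within = (y % 16) * 16 + (x % 16)      # 16 tiles per row inside a chunk
    return data[chunk * 256 + within]
```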


EGA.DRV
-------

See u5_ega_drv.txt


INIT.GAM
--------

Initial "SAVED.GAM".


SAVED.GAM
---------

See u5_saved_gam.txt


*.HCS
-----

Font files with 128 characters each.
ibm.hcs contains the Latin characters and a number of symbols.
runes.hcs contains the Britannian runes and the remaining symbols.

File format:
Each character is 16x12 pixels, 1 bit per pixel (0 = black, 1 = white).
I don't know which palette indices the game uses for 0 and 1.
Each byte represents 8 pixels. The most significant bit represents the
leftmost pixel, and the least significant bit represents the rightmost
pixel.


*.NPC
-----

These files contain information about NPCs.

struct NPC_File {
   NPC_Info info[8]; // each NPC file has information for 8 maps
}

struct NPC_Info {
   NPC_Schedule schedule[32];
   uint8 type[32]; // merchant, guard, etc.
   uint8 dialog_number[32];
}

struct NPC_Schedule {
   uint8 AI_types[3];
   uint8 x_coordinates[3];
   uint8 y_coordinates[3];
   sint8 z_coordinates[3];
   uint8 times[4];
}

Notes:
1) All maps can hold a maximum of 31 (not 32) NPCs. In every map,
schedule[0], type[0] and dialog_number[0] are not used. However, type[0]
is sometimes 0 and sometimes 0x1C, so perhaps it has some unknown purpose.
2) Each NPC_Schedule contains information about 3 locations that the NPC
will go to at different times of day.
The x and y coordinates are between 0 and 31, because each map has a size
of 32x32 tiles.
The z coordinates represent the level, relative to level 0. 0xFF (-1 as a
signed byte) would make the NPC go to the level below level 0, while 0x1
would make the NPC go to the level above level 0.
The times are given in hours, so they range from 0 to 23.
times[0] --> NPC goes to location 0
times[1] --> NPC goes to location 1
times[2] --> NPC goes to location 2
times[3] --> NPC goes to location 1
- todo: AI types
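
A sketch of decoding one NPC entry from the structs above. An NPC_Schedule is
16 bytes, so an NPC_Info is 32 * 16 + 32 + 32 = 576 bytes. The function name
and the dictionary keys are hypothetical.

```python
import struct

NPCS_PER_MAP = 32
SCHEDULE_BYTES = 16                       # 3 + 3 + 3 + 3 + 4
INFO_BYTES = NPCS_PER_MAP * SCHEDULE_BYTES + 2 * NPCS_PER_MAP   # 576
# times[i] sends the NPC to this location slot (see note 2)
LOCATION_FOR_TIME = (0, 1, 2, 1)


def read_npc(data: bytes, map_index: int, npc: int) -> dict:
    """Decode one NPC entry from the contents of a .NPC file."""
    base = map_index * INFO_BYTES
    start = base + npc * SCHEDULE_BYTES
    sched = data[start:start + SCHEDULE_BYTES]
    ai, xs, ys = sched[0:3], sched[3:6], sched[6:9]
    zs = struct.unpack("3b", sched[9:12])  # signed: 0xFF reads as -1
    times = sched[12:16]
    types_base = base + NPCS_PER_MAP * SCHEDULE_BYTES
    return {
        "locations": [{"ai": ai[i], "x": xs[i], "y": ys[i], "z": zs[i]}
                      for i in range(3)],
        # (hour, location slot) pairs in the order they are stored
        "moves": [(times[i], LOCATION_FOR_TIME[i]) for i in range(4)],
        "type": data[types_base + npc],
        "dialog_number": data[types_base + NPCS_PER_MAP + npc],
    }
```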


*.OVL
-----

Code or data overlays.

   DATA.OVL
   --------
   offset      length      purpose
   // compressed strings
   0x104C      0x24E       compressed strings used in TALK.DAT
                           each string is a zero-terminated ASCII string
   // monster flags
   0x154C      0x30*2      flags that define the special abilities of
                           monsters during combat; 16 bits per monster
                           0x0020 = undead (affected by An Xen Corp)
                           todo:
                           - passes through walls (ghost, shadowlord)
                           - can become invisible (wisp, ghost, shadowlord)
                           - can teleport (wisp, shadowlord)
                           - can't move (reaper, mimic)
                           - able to camouflage itself
                           - may divide when hit (slime, gargoyle)
   // settlements and dungeon entrances
   0x1E4A      0x28*2      offsets of location name strings; add 0x10 to
                           get the real offset
   0x1E9A      0x28        x coordinates of locations
   0x1EC2      0x28        y coordinates of locations
   // moon phases
   0x1EEA      28*2        moon phases (28 byte pairs, one for each day
                           of the month)
   // shrines and mantras
   0x1F5E      8*2         offsets of shrine name strings; add 0x10 to get
                           the real offset
   0x1F6E      8*2         offsets of mantra strings; add 0x10 to get the
                           real offset
   0x1F7E      8           x coordinates of shrines
   0x1F86      8           y coordinates of shrines
   // compressed string offsets
   0x24F8      0x81*2      offsets of the compressed strings used in
                           TALK.DAT; add 0x10 to get the real offset
   // Britannia map chunks (info about the location of the all-water chunks
   // that were left out of BRIT.DAT)
   0x3886      0x100       0xFF = the chunk consists only of tile 0x1
                           else = index into BRIT.DAT
   // this section contains information about hidden, non-regenerating
   // objects (e.g. the magic axe in the dead tree in Jhelom); there are
   // only 0x71 such objects; the last entry in each table is 0
   0x3E88      0x72        object type (tile - 0x100)
   0x3EFA      0x72        object quality (e.g. potion type, number of gems)
   0x3F6C      0x72        location number (see "Party Location")
   0x3FDE      0x72        level
   0x4050      0x72        x coordinate
   0x40C2      0x72        y coordinate
   // dock coordinates (where purchased ships/skiffs are placed)
   // 0 = Jhelom
   // 1 = Minoc
   // 2 = East Brittany
   // 3 = Buccaneer's Den
   0x4D86      0x4         x coordinate
   0x4D8A      0x4         y coordinate
   // scan code translation table:
   // when the player presses a key that produces one of the scan codes in
   // the first table, the game translates it to the corresponding code in
   // the second table
   0x541E      8           scancodes
   0x5426      8           internal codes
   // wells
   0x7252      0x32        wishing for one of these keywords at a wishing
                           well gets you a horse
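
As an example of using the table above, this sketch looks up the name and
coordinates of one settlement or dungeon entrance. It assumes little-endian
offsets and zero-terminated ASCII name strings; the function and constant
names are hypothetical.

```python
import struct

NAME_OFFSETS = 0x1E4A     # 0x28 uint16 offsets of location names
LOCATION_X = 0x1E9A       # 0x28 x coordinates
LOCATION_Y = 0x1EC2       # 0x28 y coordinates
LOCATION_COUNT = 0x28


def location_entry(ovl: bytes, index: int) -> tuple[str, int, int]:
    """Return (name, x, y) for settlement/dungeon `index` from DATA.OVL.

    ASSUMPTIONS: offsets are little-endian and names are zero-terminated
    ASCII strings.
    """
    if not 0 <= index < LOCATION_COUNT:
        raise ValueError("location index out of range")
    (raw,) = struct.unpack_from("<H", ovl, NAME_OFFSETS + 2 * index)
    start = raw + 0x10                    # add 0x10 to get the real offset
    end = ovl.index(0, start)
    name = ovl[start:end].decode("ascii")
    return name, ovl[LOCATION_X + index], ovl[LOCATION_Y + index]
```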


*.TLK
-----

These files contain conversation scripts.
File format:

tlk_file = number_of_entries, set of script_index, script_data

number_of_entries = uint16
script_index = npc_number, offset16
    npc_number = uint16
    offset16 = uint16
script_data = set of uint8

Notes:
1) The first NPC number is 1.
2) The script_indexes are sorted by NPC number, in ascending order.
3) The script_data blocks are sorted by NPC number, in ascending order.
4) The conversations appear to be scripted, like the ones in Ultima 6.
It also appears that the conversation text is not ASCII-encoded (unlike U6).
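
A sketch of reading the index of a .TLK file and cutting out the raw script
bytes for each NPC. Several points are assumptions: the fields are taken to
be little-endian, the offsets relative to the start of the file, and each
script is taken to end where the next one begins. The function name is
hypothetical.

```python
import struct


def read_tlk_index(data: bytes) -> dict[int, bytes]:
    """Map each NPC number in a .TLK file to its raw script bytes.

    ASSUMPTIONS: fields are little-endian, offsets are relative to the start
    of the file, and each script runs up to the next script's offset.
    """
    (count,) = struct.unpack_from("<H", data, 0)
    index = [struct.unpack_from("<HH", data, 2 + 4 * i) for i in range(count)]
    scripts = {}
    for i, (npc_number, offset) in enumerate(index):
        end = index[i + 1][1] if i + 1 < count else len(data)
        scripts[npc_number] = data[offset:end]
    return scripts
```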


General Notes
-------------
1) Wishing Wells
There are two wishing wells in the game:
- Paws (location 0x16)
- Empath Abbey (location 0x1F)
These locations are hard-coded into the game.
There is no difference between horses from wishing wells and horses from
vendors. It also doesn't make any difference if you wish for "horse" or
a car brand.


Sources
-------
Nytegard <email: append "at yahoo dot com" to his nick>
http://martin.brenner.de/ultima/u5save.html
http://www.cosy.sbg.ac.at/~lendl/ultima/ultima5/
http://www.wi.leidenuniv.nl/~psimoons/ultima5t.htm
  Pieter Simoons has taken down his Ultima 5 website, but it's still cached on http://www.archive.org/
Sheng Long Gradilla <email: replace the two spaces in his name with dots and append "at gmail dot com">
