Author: | Joel Molin <joel@molins.nu> |
---|---|
Date: | 2004-05-17 |
Copyright: | This document has been placed in the public domain. |
The CEL files used in the game "Diablo" from Blizzard Entertainment (released in 1996) have always confused the modding community around the game. Several attempts have been made at decoding the format, but none has been completely successful. The format is a graphic format containing a number of frames, usable as an animation or as an archive of different images.
This document strives to document what we know about this format.
2004-05-17: | Created. (Joel Molin <joel@molins.nu>) |
---|
The CEL image format is one of the formats of choice in Blizzard Entertainment's hit title "Diablo" from 1996. As an image format, it is pretty trimmed down: very little redundant information is stored. On the other hand, it is not compressed, so loading is still pretty fast. These attributes were important to Diablo; I still remember how slowly it loaded on a 60 MHz Pentium.
Sadly, the designers of the format trimmed it a bit too much, and a single missing data field is what is bugging everyone working with CELs within the community. The problem is computing the width of images, since the width is not stored explicitly within the file.
Other than that, CEL images use color indices to designate colors. The palette is external, and the CEL file gives no indication of which palette file to use. That is up to the game at run-time. Actually, storing the name of the palette in the CEL wouldn't help at all, since all the graphics on the screen must use the same palette. However, Diablo uses palettes where some colors are always the same. Because of that, most of the graphic files distributed with Diablo work with any of the palettes also distributed with the game. Palettes might still become confusing for the creator or editor of game graphics.
Palette files are recognized by their size (always exactly 768 bytes) and their extension (.pal). They store 256 colors, each with 24 bits of depth, so each color takes 3 bytes (this explains the fixed size: 256 * 3 = 768). The bytes are stored as common RGB triplets, with the red value first, green second, and blue last.
Since it is necessary to have black and white in the file at position 0x0 and 0xff respectively (at least in DirectX, but I think this should be true for all 8-bit modes), all .pal's have black and white at those positions.
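As a minimal sketch of the layout described above, the following Python splits a palette blob into 256 RGB tuples. The function name `parse_palette` is made up for this illustration and is not taken from any existing tool.

```python
def parse_palette(data):
    """Split a 768-byte .pal blob into 256 (red, green, blue) tuples.

    Hypothetical helper: it only assumes the layout described in the
    text (256 entries, 3 bytes each, red first).
    """
    if len(data) != 768:
        raise ValueError("expected 768 bytes, got %d" % len(data))
    # Each slice of 3 bytes is one color; tuple() yields three ints.
    return [tuple(data[i:i + 3]) for i in range(0, 768, 3)]
```

A color index read from a CEL frame can then be looked up directly, for instance `parse_palette(blob)[0x13]`.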
CEL files do support transparency, though not through color keys or anything similar. Instead, the encoding itself tells which pixels are transparent and which are filled with colors.
As everything else in the CEL, the header is a simple construction. Basically, first is a field of 4 bytes telling the number of frames in the file, then comes an array of offsets; one for the beginning of each frame, and one for the end of the file.
I believe this calls for an example:
    00000000 0100 0000 0C00 0000 2000 0000 1300 FF00
    00000010 FF00 FF00 FF00 FF00 FF00 FF00 FF00 FF00
This is one of the simplest CELs I could produce. The first four bytes are the number of frames in the CEL, in little-endian byte order (the common Diablo modder will recognize this as the "always reverse in hex" rule). Since there is one frame in the CEL, the next four bytes are the offset to the start of that frame; we see that the first frame starts at offset 0x0C. The next four bytes (offsets 0x08 to 0x0B) contain the offset to the end of the file (the size of the file, if you want), here 0x20. The rest of the data, bytes 0x0C to 0x1F, is the image data (in this case it happens to be a one-line black & white chess-pattern, but more on that later).
So, now the data format. In this section I am not sure I can answer all the questions right away. I am hoping that someone will be able to fill in the empty spots later.
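The header walk above can be sketched in a few lines of Python. This is only an illustration under the layout described here (a little-endian 32-bit frame count followed by count + 1 little-endian 32-bit offsets); `read_frame_offsets` and `EXAMPLE` are names invented for the sketch.

```python
import struct

# The 32 bytes of the example dump above.
EXAMPLE = bytes.fromhex(
    "01000000 0C000000 20000000 1300FF00"
    "FF00FF00 FF00FF00 FF00FF00 FF00FF00"
)

def read_frame_offsets(data):
    """Return a list of (start, end) byte offsets, one pair per frame."""
    # First field: number of frames, little-endian unsigned 32-bit.
    (count,) = struct.unpack_from("<I", data, 0)
    # Then count + 1 offsets: one per frame start, plus the end of file.
    offsets = struct.unpack_from("<%dI" % (count + 1), data, 4)
    return [(offsets[i], offsets[i + 1]) for i in range(count)]

print(read_frame_offsets(EXAMPLE))  # [(12, 32)]
```

The end of each frame is simply the start of the next one, which is why the extra end-of-file offset is enough to delimit the last frame.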
The basic, fundamental idea with the CELs is this:
You might not understand all of this right away; I suggest you come back to this section later as needed. For now I will just provide definitions for some words and phrases I will be using.
I would like to think of the data in a CEL file as a list of commands. You don't have to see them that way, but I think it is easier to explain the format this way.
A command in a CEL file is quite similar to other commands in the world of computers. It has a command, and some arguments (well, the command character holds some data, despite its newly given name). There are just a few commands and they can be distinguished by looking at the numeric value of a command char.
If c is a command character read from some CEL frame-data, then this is true for c:
Condition | Command Type |
---|---|
0x10 < c <= 0x7f | Regular |
0x80 <= c | Transparency |
c <= 0x10 | Block |
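The table translates directly into a small classifier. The sketch below is in Python; the function name and the returned strings are invented for the illustration.

```python
def command_type(c):
    """Classify a command character (an int in 0..255) per the table above."""
    if c >= 0x80:
        return "transparency"
    if c > 0x10:
        return "regular"      # 0x10 < c <= 0x7f
    return "block"            # c <= 0x10
```

For example, the byte 0x13 from example 1 classifies as a regular command.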
Regular command sequences are perhaps the most vital sequences, being the only sequences that contain actual color indices. They are also the simplest. When c is a regular command, the following c bytes each contain a color index. So if you are parsing a CEL file and you find a regular command, read c bytes; the character after those is the next command character.
Let's go back to example 1 for a while. The header gives that there is one frame, it starts at offset 0x0C and its data is 0x20 - 0x0C = 0x14 = 20 bytes large. When parsing this frame, you would read one character at offset 0x0C. This happens to be 0x13, so it is a regular command. The rules for regular commands say that you should read 0x13 bytes at this point. When you've done that, you have actually parsed all of the frame, but if there were more content, you would read a command character at offset 0x20 and continue in this fashion.
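A rough Python sketch of that walk over the frame from example 1 follows. It handles regular commands only and deliberately stops with an error on anything else; `read_regular_runs` and `FRAME` are hypothetical names.

```python
# Frame data of example 1: the bytes at offsets 0x0C..0x1F of the file.
FRAME = bytes.fromhex("13" + "00" + "FF00" * 9)

def read_regular_runs(frame):
    """Return the color-index run of every regular command in frame."""
    runs = []
    pos = 0
    while pos < len(frame):
        c = frame[pos]                      # command character
        if not 0x10 < c <= 0x7F:
            raise ValueError("not a regular command: 0x%02x" % c)
        runs.append(frame[pos + 1:pos + 1 + c])  # the c color indices
        pos += 1 + c                        # next command character
    return runs

runs = read_regular_runs(FRAME)
print(len(runs), len(runs[0]))  # 1 19
```

The single run holds 0x13 = 19 color indices alternating between 0x00 and 0xFF, which is the chess pattern mentioned earlier.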
Transparency commands are almost as easy to understand as regular commands, and in a way they are easier to parse, since they always consist of just the command character. The command character encodes how many of the following pixels should be transparent. You can get the number of transparent pixels by doing something like:
transparent pixels = 0x100 - *c*
or by viewing c as a negative number (I usually prefer to view chars in C as unsigned, which is why I like the previous approach better, but the two are equivalent as long as you've got your signedness right):
transparent pixels = - *c*
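The equivalence of the two formulas can be checked with a short Python sketch. Both function names are invented here; the signed variant reinterprets the byte as a signed 8-bit value using the standard `struct` module.

```python
import struct

def transparent_run(c):
    """Transparent pixel count, treating c as an unsigned byte."""
    return 0x100 - c

def transparent_run_signed(c):
    """Same count, obtained by reading the byte as a signed char."""
    (signed,) = struct.unpack("b", bytes([c]))
    return -signed

# The two views agree for every transparency command (0x80..0xff).
assert all(transparent_run(c) == transparent_run_signed(c)
           for c in range(0x80, 0x100))
```

So 0xFF stands for a single transparent pixel, and 0x80 for a run of 128.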
Block commands seem to be the least documented ones. What I know here is how cvcel uses them.
If the first command in a frame is a block command, then the CEL has some information stored in the next c bytes. cvcel uses only bytes 2 and 3, which together form a 16-bit integer (perhaps it is actually a 32-bit integer, but the upper two bytes are never used). This integer is actually an offset into the frame. cvcel stores this value and waits until it gets to the byte in the data at that offset. When it gets to that byte, it checks how many pixels it has read (transparent and opaque pixels alike). It then computes the width of the CEL (where pixels is the number of pixels read at the block offset):
cel width = *pixels* / 32
This is the root of all evil. cvcel cannot do it well either (most CV5 users have probably opened a CEL just to see that it looks like crap; most often this is the width-ghost). There are also rumors about Diablo storing some of the widths in the source code (shrug, why would they want that?).
In some cases, the width is quite easy to compute. In particular, a lot of CELs that only have regular commands are easy. These frames can have their width computed by looking at the first commands.
Starting from the beginning of the frame, do this:
1. Prepare a variable, width, and set it to zero.
2. Read a command character into c.
3. If c is 127:
   - add 127 to width.
   - process the data if you want (not necessary for the width computation), or at least make sure you skip the next 127 bytes.
   - go to step 2.
4. Add c to width.
5. Process the data if you want.

width should now be the width of the CEL, and you have read a complete scan-line. Congratulations.
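The steps above can be sketched in Python as follows. The function name is hypothetical. Two behaviours are assumptions of the sketch rather than part of the procedure: it raises an error on any non-regular command, and it returns the running total if the data ends right after a 127 command.

```python
def first_line_width(frame):
    """Width of the first scan-line of a frame made of regular commands."""
    width = 0
    pos = 0
    while pos < len(frame):
        c = frame[pos]                  # step 2: read a command character
        if not 0x10 < c <= 0x7F:
            raise ValueError("not a regular command: 0x%02x" % c)
        width += c                      # steps 3 and 4: add c to width
        pos += 1 + c                    # skip the c color indices
        if c != 127:                    # only 127 extends the line
            return width
    return width                        # assumption: data ended after a 127

# Example 1: a single 0x13 command gives a 19-pixel line.
print(first_line_width(bytes([0x13]) + bytes(19)))  # 19
```

A line built from a 127 command followed by a 20 command would come out as 147 pixels wide.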
Now, the idea is that the command with value 127 extends the line, so that lines can be longer than 127 pixels. Any other value just ends the scan-line. I must say, though, that I have no idea whatsoever how you would create an image exactly 127 pixels wide.
Frames that start with a block command contain a way to compute the width; see block commands.
As far as I know, there is no way to compute the width of an "irregular" frame that does not start with a block command. This is unfortunate, of course. cvcel handles this by taking the square root of the number of pixels read, which is a rather bad approximation.
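To see why the square root is a poor guess, here is a tiny Python sketch. How cvcel actually rounds is not known to me, so the integer square root used here is an assumption, and `guess_width` is an invented name.

```python
import math

def guess_width(pixel_count):
    """Guess a frame width as the integer square root of its pixel count."""
    return math.isqrt(pixel_count)

print(guess_width(32 * 32))    # 32: exact for a square frame
print(guess_width(64 * 128))   # 90: wrong for a 64 pixels wide frame
```

The guess only matches when the frame happens to be square, which is consistent with the garbled images described earlier.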
The author (Joel Molin <joel@molins.nu>) would like to refer to his sources while he has the chance to:
- Ted "Nykodaemus" Powell -- For giving me the initial information, which would still be a lot of what I know, over e-mail all those years ago and for just being a cool Canadian dude who could stand a young Swedish one.
- M. "TeLAMoN" König -- For the CV utilities, for sharing his source (unlike most of the Diablo community :-/), and for answering my e-mail politely a year ago despite the age of the project.