Unofficial Diablo CEL Image Specification

Author: Joel Molin <joel@molins.nu>
Date: 2004-05-17
Copyright: This document has been placed in the public domain.

1   Abstract

The CEL files used in the game "Diablo" from Blizzard Entertainment (released in 1996) have always confused the modding community around the game. Several attempts have been made at decoding the format, but none has been completely successful. The format is a graphics format containing a number of frames, usable as an animation or as an archive of different images.

This document strives to document what we know about this format.

2   ChangeLog

2004-05-17: Created. (Joel Molin <joel@molins.nu>)

3   Introduction to CEL-files

The CEL image format is one of the formats of choice in Blizzard Entertainment's hit title "Diablo" from 1996. As an image format it is pretty trimmed down: little redundant information is stored. On the other hand, it is not compressed, so loading is still pretty fast. These attributes were important to Diablo; I still remember how slowly it loaded on a 60 MHz Pentium.

Sadly, the designers of the format trimmed it a bit too much, and the lack of one simple data field is what is bugging everyone working with CELs within the community. The problem is computing the width of images, since the width is not stored explicitly within the file.

Other than that, CEL images use color indices to designate colors. The palette is external, and the CEL file gives no indication of which palette file to use; that is up to the game at run-time. Actually, storing the name of the palette in the CEL wouldn't help at all, since all the graphics on the screen must use the same palette. However, Diablo uses palettes in which some colors are always the same. Because of that, most of the graphic files distributed with Diablo work with any of the palettes distributed with the game. Still, palettes might become confusing for the creator or editor of game graphics.

3.1   The Palette Format

Palette files are recognized by their size (always exactly 768 bytes) and their extension (.pal). They store 256 colors with 24 bits of depth each, so each color takes 3 bytes (this explains the fixed size: 256 * 3 = 768). The colors are stored as common RGB triplets: red value first, green second, and blue last.
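
As a minimal sketch of the layout described above, the following Python splits the contents of a palette file into 256 RGB tuples. The function name parse_palette and the demo palette are made up for this illustration; they do not come from Diablo or from any existing tool.

```python
def parse_palette(data):
    """Split the contents of a .pal file into 256 (red, green, blue) tuples."""
    if len(data) != 768:
        raise ValueError("a palette must be exactly 768 bytes, got %d" % len(data))
    # Each color is three consecutive bytes: red, green, blue.
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]


# Hypothetical palette for the demo: black at index 0, white at index 255,
# and zeros everywhere in between. A real palette would be read from disk.
demo = bytes(3) + bytes(762) + b"\xff\xff\xff"
colors = parse_palette(demo)
print(colors[0], colors[255])
```

A color index found in a CEL frame can then be looked up as colors[index].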

Since it is necessary to have black and white at positions 0x00 and 0xff respectively (at least in DirectX, but I think this should be true for all 8-bit modes), all .pal files have black and white at those positions.

3.2   Transparency

CEL files do support transparency, not through color keys but through an encoding that tells which pixels are transparent and which are filled with color.

4   Getting To The Format: The Header

Like everything else in the CEL, the header is a simple construction. First comes a 4-byte field giving the number of frames in the file, then an array of 4-byte offsets: one for the beginning of each frame, and one for the end of the file.

I believe this calls for an example:

00000000   0100 0000 0C00 0000 2000 0000 1300 FF00
00000010   FF00 FF00 FF00 FF00 FF00 FF00 FF00 FF00

This is one of the simplest CELs I could produce. The first four bytes are the number of frames in the CEL, in little-endian byte order (the common Diablo modder will recognize this as the "always reverse in hex" rule). Since there is one frame in the CEL, the next four bytes are the offset of the start of that frame; we see that the first frame starts at offset 0x0C. The next four bytes (offsets 0x08 to 0x0B) contain the offset of the end of the file (the size of the file, if you want). The rest of the data, bytes 0x0C to 0x1F, is the image data (in this case it happens to be a one-line black and white chess pattern, but more on that later).
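
The header layout can be sketched in Python as follows. This is only an illustration under the assumptions stated in this section (a 4-byte little-endian frame count followed by 4-byte little-endian offsets); the names read_header and frame_slices are invented here and are not taken from cvcel.

```python
import struct

# The 32-byte example CEL from the hex dump above.
EXAMPLE = bytes.fromhex(
    "01000000" "0C000000" "20000000" "1300FF00"
    "FF00FF00FF00FF00FF00FF00FF00FF00"
)


def read_header(data):
    """Return the frame start offsets followed by the end-of-file offset."""
    (frame_count,) = struct.unpack_from("<I", data, 0)
    # One offset per frame, plus one final offset for the end of the file.
    offsets = struct.unpack_from("<%dI" % (frame_count + 1), data, 4)
    return list(offsets)


def frame_slices(data):
    """Cut the file into one bytes object per frame."""
    offsets = read_header(data)
    return [data[start:end] for start, end in zip(offsets, offsets[1:])]


offsets = read_header(EXAMPLE)
frames = frame_slices(EXAMPLE)
print([hex(o) for o in offsets], len(frames[0]))
```

Slicing by consecutive offsets works because each frame ends where the next one begins, and the last frame ends at the end-of-file offset.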

5   The Data In Frames

So, now the data format. In this section I am not sure I can answer all the questions right away. I am hoping that someone will be able to fill in the empty spots later.

5.1   Basics

The basic, fundamental idea with the CELs is this:

  • Read a byte from the data.
  • Take an action depending on the contents (this can, for example, be to read some image data).
  • Repeat this over and over until there is no more data to read.

5.2   Terminology

You might not understand all of this right away; I suggest you come back to this section later as needed. I will just provide definitions for some words and phrases I will be using.

Command Character
This is the kind of character in the image data that tells how many bytes of image data to read, or how many transparent pixels to insert.
Command Sequence
A piece of the data in a frame consisting of a command character and any other bytes that it uses.
Regular Command Sequence
A regular command sequence is a command sequence that consists of "raw" non-transparent image data. The term "regular" comes from the CV sources by TeLAMoN (see acknowledgement). Actually, there is nothing more "regular" about these command sequences than, say, a transparency command sequence, so this naming convention might be a bit bad. But it will do, as long as everyone knows what they're talking about.
Transparency Command Sequence
This command sequence type does not keep any data after the command character, it merely says how many pixels should be added as transparent pixels at the current position.
Block Command Sequence
These are the least understood sequences. What is known about them is described in the section on block commands.
c
I use this as a placeholder for an arbitrary command character.

5.3   Commands

I would like to think of the data in a CEL file as a list of commands. You don't have to see them that way, but I think it is easier to explain the format this way.

A command in a CEL file is quite similar to other commands in the world of computers: it has a name and some arguments (well, here the command character itself holds some data, despite its newly given name). There are just a few commands, and they can be distinguished by looking at the numeric value of the command character.

If c is a command character read from some CEL frame-data, then this is true for c:

Condition           Command Type
0x10 < c <= 0x7f    Regular
0x80 <= c           Transparency
c <= 0x10           Block
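
The three conditions can be written as a small Python function. This is just a sketch of the table as given in this document; the function name classify and the returned strings are invented for the illustration.

```python
def classify(c):
    """Name the command type for a command character c (0..255)."""
    if not 0 <= c <= 0xFF:
        raise ValueError("a command character is a single byte")
    if c >= 0x80:
        return "transparency"
    if c > 0x10:
        return "regular"
    return "block"


for c in (0x0A, 0x13, 0x7F, 0x80, 0xFF):
    print(hex(c), classify(c))
```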

5.3.1   Regular Command Sequences Revealed

Regular command sequences are perhaps the most vital sequences, being the only sequences that contain actual color indices. They are also the simplest sequences. When c is a regular command, the following c bytes each contain a color index. So if you are parsing a CEL file and you find a regular command, read c bytes; the character after those will be the next command character.

Let's go back to the example from the header section for a while. The header says that there is one frame, that it starts at offset 0x0C, and that its data is 0x20 - 0x0C = 0x14 = 20 bytes large. When parsing this frame, you would read one character at offset 0x0C. This happens to be 0x13, so it is a regular command. The rules for regular commands say that you should read 0x13 bytes at this point. When you've done that, you have actually parsed all of the frame, but if there were more content, you would read a command character at offset 0x20 and continue in this fashion.
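
The walk-through above can be reproduced with a few lines of Python. This is a sketch only; read_regular is a name made up for this example, and the range check follows the command table in this document.

```python
# The 32-byte example CEL from the header section.
EXAMPLE = bytes.fromhex(
    "01000000" "0C000000" "20000000" "1300FF00"
    "FF00FF00FF00FF00FF00FF00FF00FF00"
)


def read_regular(data, pos):
    """Read one regular command sequence starting at pos.

    Returns the color indices and the position of the next command character.
    """
    c = data[pos]
    if not 0x10 < c <= 0x7F:
        raise ValueError("byte 0x%02x at 0x%x is not a regular command" % (c, pos))
    start = pos + 1
    end = start + c
    if end > len(data):
        raise ValueError("regular command runs past the end of the data")
    return data[start:end], end


pixels, next_pos = read_regular(EXAMPLE, 0x0C)
print(len(pixels), hex(next_pos))
```

The command character 0x13 is followed by 19 color indices, and the next position is 0x20, which is the end of the file in this example.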

5.3.2   Transparency Commands

These commands are almost as easy to understand as regular commands, and they are easier to parse in a way, since they are always just the command character. The command character keeps information about how many pixels following that should be transparent. You can get the number of transparent pixels by doing something like:

transparent pixels = 0x100 - c

or by viewing c as negative (I usually prefer to view chars in C as unsigned, which is why I like the previous approach better, but the two are equivalent as long as you've got your signedness right):

transparent pixels = -c
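
Both ways of computing the run length can be checked against each other in Python. The function names below are invented for this sketch.

```python
def transparent_run(c):
    """Run length of a transparency command, viewing c as unsigned."""
    if not 0x80 <= c <= 0xFF:
        raise ValueError("0x%02x is not a transparency command" % c)
    return 0x100 - c


def transparent_run_signed(c):
    """Same run length, obtained by reinterpreting c as a signed 8-bit value."""
    signed = int.from_bytes(bytes([c]), "little", signed=True)
    return -signed


# The two views agree for every possible transparency command.
agree = all(transparent_run(c) == transparent_run_signed(c)
            for c in range(0x80, 0x100))
print(transparent_run(0xFF), transparent_run(0x80), agree)
```

So 0xFF means one transparent pixel and 0x80 means 128 of them, which is the longest run a single command can express.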

5.3.3   Block Commands

These commands seem to be the least documented ones. What I know here is how cvcel uses them.

If the first command in a frame is a block command, then the CEL has some information stored in the next c bytes. cvcel uses only bytes 2 and 3, which together form a 16-bit integer (perhaps it is actually a 32-bit integer, but the upper two bytes are never used). This integer is an offset into the frame. cvcel stores this value and waits until it gets to the byte in the data at that offset. When it gets to that byte, it checks how many pixels it has read (transparent and opaque pixels alike). It then computes the width of the CEL (where pixels is the number of pixels read at the block offset):

cel width = pixels / 32
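
Here is a rough Python sketch of that procedure. It is not cvcel's code, and it rests on several assumptions that this document does not confirm: that the block occupies the first c bytes of the frame (command character included), that the stored offset is little-endian and counted from the start of the frame, and that every byte below 0x80 after the block is a regular command. The name blocked_frame_width and the test frame are made up.

```python
def blocked_frame_width(frame):
    """Estimate the width of a frame that starts with a block command."""
    c = frame[0]
    if c > 0x10:
        raise ValueError("frame does not start with a block command")
    # ASSUMPTION: bytes 2 and 3 are a little-endian offset from the frame start.
    target = int.from_bytes(frame[2:4], "little")
    if target > len(frame):
        raise ValueError("block offset points past the end of the frame")
    # ASSUMPTION: the command stream starts right after the first c bytes.
    pos = c
    pixels = 0
    while pos < target:
        cmd = frame[pos]
        if cmd >= 0x80:            # transparency: no data bytes follow
            pixels += 0x100 - cmd
            pos += 1
        else:                      # ASSUMPTION: everything else is regular
            pixels += cmd
            pos += 1 + cmd
    if pos != target:
        raise ValueError("block offset does not fall on a command boundary")
    return pixels // 32


# Hypothetical frame: a 10-byte block, then 33 lines of 20 opaque pixels each.
line = bytes([0x14]) + bytes(20)
offset = 10 + 32 * len(line)
frame = bytes([0x0A, 0x00]) + offset.to_bytes(2, "little") + bytes(6) + line * 33
print(blocked_frame_width(frame))
```

In the made-up frame the stored offset points just past the 32nd line, so 640 pixels have been read by then and the computed width is 20.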

5.4   Computing the width of CELs

This is the root of all evil. cvcel cannot do it well either (most CV5 users have probably opened a CEL only to see that it looks like crap; most often this is the width-ghost). There are also rumors about Diablo storing some of the widths in the source code (shrug, why would they want that?).

5.4.1   Pure Regular Frames

In some cases, the width is quite easy to compute. In particular, many CELs that only have regular commands are easy. These frames can have their width computed by looking at the first commands.

Starting from the beginning of the frame, do this:

  1. prepare a variable, width, set it to zero.

  2. read a command character to c.

  3. if c is 127:

    • add 127 to width.
    • process the data if you want (not necessary for the width computation), or at least make sure you skip the next 127 bytes.
    • go to 2
  4. add c to width

  5. process the data if you want.

  6. width should now be the width of the CEL, and you have read a complete scan-line. Congratulations.

Now, the idea is that the command with value 127 extends the line, so that lines can be longer than 127 pixels. Everything else would just end the scan-line. I must say, though, that I have no idea whatsoever how you would create an image exactly 127 pixels wide.
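
The numbered steps above can be sketched in Python like this. The function name pure_regular_width is invented, and the two guards (rejecting transparency commands and running off the end of the frame) are additions of this sketch that the steps themselves do not mention.

```python
def pure_regular_width(frame):
    """Width of a frame made only of regular commands, from its first scan-line."""
    width = 0
    pos = 0
    while True:
        if pos >= len(frame):
            raise ValueError("frame ended in the middle of a scan-line")
        c = frame[pos]
        if c >= 0x80:
            raise ValueError("transparency command found; frame is not pure regular")
        width += c
        pos += 1 + c               # skip the c color indices
        if c != 127:               # anything but 127 ends the scan-line
            return width


# The frame from the header example: one command 0x13 and 19 color indices.
example_frame = bytes([0x13]) + bytes([0x00, 0xFF] * 9 + [0x00])
# Hypothetical 200-pixel line: a 127 command extended by a 73 command.
long_line = bytes([127]) + bytes(127) + bytes([73]) + bytes(73)
print(pure_regular_width(example_frame), pure_regular_width(long_line))
```

A line of exactly 127 pixels would be read as the start of a longer line, which is the problem mentioned in the paragraph above.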

5.4.2   "Blocked" Frames

Frames that start with a block command contain a way to compute the width; see the section on block commands.

5.4.3   Fall-back Options

As far as I know, there is no way to compute the width of an "irregular" frame that does not start with a block command. This is unfortunate, of course. cvcel handles this case by taking the square root of the number of pixels read, which is a somewhat bad approximation.
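
To show why the square root is a poor guess, here is a small Python sketch. It is not cvcel's code; the names count_pixels and fallback_width are made up, rounding down is an assumption, and the frames are assumed to contain no block command.

```python
import math


def count_pixels(frame):
    """Count opaque and transparent pixels in a frame without a block command."""
    pixels = 0
    pos = 0
    while pos < len(frame):
        c = frame[pos]
        if c >= 0x80:              # transparency: no data bytes follow
            pixels += 0x100 - c
            pos += 1
        else:                      # regular: c color indices follow
            pixels += c
            pos += 1 + c
    return pixels


def fallback_width(frame):
    """Guess the width as the integer square root of the pixel count."""
    # ASSUMPTION: the result is rounded down; cvcel's rounding is not documented here.
    return math.isqrt(count_pixels(frame))


# Hypothetical frames: a 24x24 one (guess is right) and a 20x5 one (guess is wrong).
square = (bytes([0xFC, 0x14]) + bytes(20)) * 24
wide = (bytes([0x14]) + bytes(20)) * 5
print(fallback_width(square), fallback_width(wide))
```

The guess is only right when the image happens to be square; for the 20 pixel wide, 5 line tall frame it gives 10.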

6   Some Explanations

cvcel
This is a dynamic link library written by TeLAMoN for his CV5 program. It is a plug-in to CV5 that decodes CEL image files.
CV5
This is a program by TeLAMoN that is the de-facto standard program for viewing CEL image files. It can also do other formats, most notably the CL2 files also found in Diablo.
Diablo
The game. Released by Blizzard Entertainment in 1996 and created by Blizzard North. The game was quite popular, but it is less popular today, partly because of its age but mostly because of the release of its sequel, "Diablo II".
Hellfire
An extension to Diablo created by Sierra. Its story-line is not part of the real Diablo series, and it's not really used by everyone (mostly because of the lack of a multi-player mode). However, some very cool mods have been created around it. It is suitable for modifying due to its larger content set, and you can enable multi-player mode.

7   Acknowledgement

The author (Joel Molin <joel@molins.nu>) would like to credit his sources while he has the chance:

  • Ted "Nykodaemus" Powell -- For giving me the initial information, which would still be a lot of what I know, over e-mail all those years ago and for just being a cool Canadian dude who could stand a young Swedish one.
  • M. "TeLAMoN" König -- For the CV utilities, for sharing his source (unlike most of the Diablo community :-/), and for answering my e-mail politely a year ago despite the age of the project.