Background

Hello, my name is Shane Arnott, and like many of you, very much a DooMaholic. Heres a bit of history and a guide on how to get the most out of this document.

Last year, i.e. 1995, I wrote a thesis to satisfy my Honours requirement at Latrobe University, Australia. My thesis titled Optimisation Techniques of Virtual Reality Simulation Engines served me well and gave me first class Honours. :) Besides this it centered largely around DooM and part of my research consisted of creating an engine that read in DooM Wadfiles (like a hundred or so other people). This was done to demonstrate the use of Binary Space Partioning trees for Hidden Surface Removal. The engine I wrote was christened BVGE, so for the better half of this document you can interchange the word BVGE with DooM.

Anyway, to cut to the chase some of my research into the BSP trees was directed at types of Node Builders, since they built the BSP trees that where contained in the Wadfile, of which my program read in (and funnily enough so does DooM!). Through this process I benchmarked some of the publicily available Node-builders (originally written for DooM) with my engine. This has been of interest to the DooM community and maybe interesting to you. If you have any questions, please email me on either [email protected] or [email protected].

Abstract

Virtual Reality is a relatively new and exciting field in Computer Science, with many possible applications. Currently the level of immersion in most Virtual Reality systems, especially on home computers, is fairly poor as generation of realistic images cause high latency. This thesis looks into what can be done to optimize the results, and raise the level of immersion. The use of Binary Space Partitioning Trees for Hidden Surface Removal is concentrated on as it has been demonstrated as the best method currently available for Hidden Surface Removal, in three dimension simulation engines on home computers, evidenced by the game Doom from id software.

A sample engine was written for this discussion, of Doom-like quality called BVGE. This engine uses the same information as Doom uses for world information, thus BVGE has the ability to interactively render any Doom world. So using BVGE as an example of a Virtual Reality Graphical Simulation Engine research was conducted into various possible optimizations, for BVGE and engines with similar rendering techniques. The following possible optimizations where investigated; The best method of compiling the BSP tree; which compiler should be used, 16 - bit or 32 - bit; and what is the best platform to run this piece of Virtual Reality Software.

Chapter 7 - The BSP Node Builders

7.1 Introduction

The biggest area for optimization is in the creation of the BSP tree for the worlds that BVGE generates. If a better tree can be constructed the renderer can traverse the nodes quicker, and more efficiently. Therefore producing an engine that is capable of running at high enough speeds to give a greater level of immersion in VR. So how do we come across a better BSP tree? What is the definition for better? In this chapter the results from my tests will hopefully shed some light on what a better BSP tree is and the best way it can be made. All of the results of the following tests are related to the BVGE engine and engines that use similar rendering techniques only.

7.2 A Discussion of BSP tree Optimization

Ommited due to the over-academic nature and that it is unrelated to this discussion.

7.3 The Tools - DOOMs Node-Builders

The tools I will use for testing will be a selection of Doom Node-builders which can be found at ftp.cdrom.com/pub/idgames/doom/level_edit/node_builders . To document the efficiency of these node builders, an investigation into how their respective algorithms is required. Only the optimized parts of the algorithms will be discussed as BSP theory has already been covered in earlier chapters. The most crucial part of choosing splitters will be focused on. Later the speed of compilation of the tree from a level and speed of traversal of that tree by the renderer will be covered.

7.3.1 DOOMBSP aka IDBSP

This is the actual Wad builder that was used for Doom written by John Carmack from id software. The method it uses for building Wads is unusual compared to all the others that it first writes all the data to an ASCII file and then manipulates it from there. This may be a factor in compilation speed, where all other node builders manipulate the Wadfile directly and have no two file overhead as DOOMBSP does.

The best splitter is chosen by grading the quality of a split along each linedef for the current Sector or Sub-Sector. Evaluation is halted as soon as it is determined that a better split already exists. A split is good if it divides the lines evenly with cutting as few linedefs as possible, but not zero lines. This is considered a worst case and preferred to be placed in a leaf node. A horizontal or vertical split is graded higher than a sloping split.

This is the most simple of the selection, and I do not expect it to fair as well as the others in the results. But it must be understood that this is the original. All the following where based on this algorithm and it is therefore an important benchmark.

7.3.2 DEU 5.2x

The DEU builder is one of the most complex of all builders and operates slightly different than standard node builders. A full explanation is warranted to understand DEU algorithm.

While most builders center around processing linedefs, DEU 5.2x only takes Segs into account, where creating the Segs can be done directly from the SideDefs. The Seg list is divided in two after each call to a Node Creation Routine which inturn makes the algorithm faster. Other optimizations are implemented, such as looking at the two ends of a Seg to see on which side of the nodeline it lies or if it should be split in two. This builder was written by Raphael Quinet who also co-authored DEU.

Heres a short description of the algorithm:

The Nodes Creating Routine does the following:

This algorithm uses the same split technique as the DOOMBSP by finding a nodeline instead of a Seg. When a nodeline is found, then the Seg list is split, and the Node Routine is called for both lists. Otherwise, the Sub-Sector is returned which contains the list of Segs.

7.3.4 BSP12x

This builder is very simple, even simpler than the original id method. But it must not be dismissed, a mix in the level of complexity of the builders will give a good indication on the exact level required for the best build.

A splitter is chosen to divide the nodes down, by deciding which is the best Seg to use as a nodeline. It does this by selecting the line with least splits and has least difference of Segs on either side of it. Its node choosing procedure is very short, but has been documented in Doom related material as effective.

7.3.4 WARM 1.4

WARM is by far the most complex of all Wad Builders, and uses a fair degree of mathematics in choosing the node splitter. It also uses the Doom environment to its advantage in many ways. For example, some lines in a level do not affect the display in any way; these lines are those that have no floor nor ceiling changes on either side of them and do not have a texture on them. So visually, you can not tell there is a line present while playing the game. Thus there is no reason for these lines to be considered during node generation. WARM is the first to do this, to the best of my knowledge, resulting in creating fewer NODES, SSECTORS (Sub-sectors), and SEGS (thus a claiming to build a smaller WAD).

The splitter routine searches for a suitable SEG to use as a partition line. WARM defines suitable to mean the SEG that most equalizes the number of SEGS on each side of the partition line and minimizes the number of SEG splits that need to be done. Ideally, the partition line should also do this for all the node's children as well, which has been done with all the others. But WARM authors, who have branded this "The fastest Node Builder in the West, and East" decided the calculations would be far too time consuming; thus only the input node is considered. The most suitable partition is found by selecting the SEG that maximizes the geometric mean, squared, of the right and left side counts and the number of splits.

The geometric mean is computed via :

A*Rc*Lc/(B*Sc+1)

where Rc, Lc, Sc are the right side, left side, and split counts respectively and A, B are empirical constants (the +1 is for the split count zero case).

The geometric mean variable is initialized to -1 so that at least one SEG will qualify (even if the maximum mean is zero). The routine sets the node's partition line to be the SEG that creates the best (maximum) geometric mean. If the geometric mean is still -1 after all SEGS are processed (meaning no suitable partition line was found), then a partition line choice is forced by choosing the first SEG as the partition line. The fact that SEGS derived from the same LINEDEF are consecutive in the SEGS list is used to reduce the number of partition lines tried and the number of times the algorithm has to be called.

As you can see WARM seems to be very complex and I feel may take a long time to compile a world, despite the authors claims.

7.4 Method of Tests

To gauge the performance of each node-builder tests will be performed in the following fashion. For each builder a Wadfile featuring rooms of varying complexity will be applied to each builder and tested in the following aspects:

7.4.1 Weighting of the Tests


Test                                 Unit Of         Weighting of Test    
                                   Measurement       aspect out of 10     

Speed of the Wadfile with the       Frames per              5.9           
BVGE engine                           Second                              

Size of the BSP tree for the        # of Nodes               3            
Wadfile                                                                   

Size of the Final Wadfile             Bytes                  1            

Compilation time                     Seconds                0.1           



NB: All tests will be conducted on a 486 DX4-100, 16 MB RAM and DOS 7.0, (if not otherwise indicated) to provide a constant platform for all the tests. This means results should be treated as relative, not as is.

The speed of the Wadfile with the BVGE engine, is the best benchmark available, and has been given the highest weighting. The size of the BSP tree measured in Nodes is also important as smaller the tree, the less traversals the engine has to make, and this should be reflected in the speed of the wadfile with BVGE tests. Also the number of splits will be measured to further illustrate the effectiveness of the node builder. Size of the wadfile is fairly unimportant and will also be related back to the earlier tests. Compilation time is related to the node-builder under test, and could be seen as of minor or no importance. But Time Taken in Compilation of a wadfile is a factor in level creation, and therefore VR development and then must be considered.

The weighting will be decided by the percentage performance difference relative to the worse nodebuilders performance. For example if BSP12x compiles the biggest wadfile, and the goal is to have the smallest sized file, BSP12x will receive a 0% rating and the others the percentage better improvement. These percentages will be cumulative over all the tests to find the best method overall.

The speed of the Wadfile with BVGE is determined by using the -playdemo feature in BVGE version 1.5, so that each test is controlled and a standard path is taken through each world.

7.4.1 Wadfile Profile

All wadfiles where constructed using WinDEU-32, the user manual for DEU can be found in Appendix - Doom Editors. The values listed here with the Wad descriptions will assist in deducing the amount of splits each method does to the original number of Linedefs. This will be reflected in the original Linedef to Seg (line segment) ratio.

FileName: SIMPLE.WAD

Description: One square room, with bi-level floor. Very basic design.

# of Linedefs: 12

# of Sidedefs: 20

# of Sectors: 3

FileName: BVGE.WAD

Description: Adapted from Episode One Mission One of the original Doom IWAD. All doors have been removed, stairs and tunnels have been added. This is the size of a typical VR world.

# of Linedefs: 521

# of Sidedefs: 720

# of Sectors: 100

FileName: VCOMPLEX.WAD

Description: Adapted from Episode Two Mission Seven of the Doom IWAD. Again all doors have been removed and stairs added. This is the largest Doom level and thus the most complex.

# of Linedefs: 1068

# of Sidedefs: 1705

# of Sectors: 170

The floor-plan maps are displayed in Appendix - Wadfile Floor Plans.

7.5 Results

7.5.1 DOOMBSP


Wadfile Name          Complexity     Compilation        Size of the     
                                     Time (secs)      WadFile (Bytes)   

SIMPLE.WAD           Low                error              error        

BVGE.WAD             Medium               6               60,904        

VCOMPLEX.WAD         High                 10              132,291       




Wadfile Name          Original # of     Final # of    Final size of the  
                         Linedefs        Line Segs         BSP tree      

SIMPLE.WAD                  12             error            error        

BVGE.WAD                   521              802              247         

VCOMPLEX.WAD               1068            1903              674         




CPU Type    Memory    Speed of Low     Speed of Medium    Speed of High   
             Size      complexity        complexity        complexity     
                      Wadfile with      Wadfile with      Wadfile with    
                       BVGE (fps)        BVGE (fps)        BVGE (fps)     

486           16          error            24.175            17.636       
DX4-100                                                                   



Comments....

In building SIMPLE.WAD an error occurred within the compiler causing it incapable to built this, the simplest of levels. It must be noted that no level in Doom was ever this simple, but in VR worlds can range for the very simple to the very complex. The other wads compiled without error and results where taken. The Compilation time was added from the two different parts of the process outlined in the Node Builder explanation earlier.

7.5.2 DEU 5.2x


Wadfile Name          Complexity     Compilation        Size of the     
                                     Time (secs)      WadFile (Bytes)   

SIMPLE.WAD           Low                  1                2,162        

BVGE.WAD             Medium               5               60,754        

VCOMPLEX.WAD         High                 10              132,247       




Wadfile Name          Original # of     Final # of    Final size of the  
                         Linedefs        Line Segs         BSP tree      

SIMPLE.WAD                  12              28                8          

BVGE.WAD                   521              793              245         

VCOMPLEX.WAD               1068            1902              673         




CPU Type    Memory    Speed of Low     Speed of Medium    Speed of High   
             Size      complexity        complexity        complexity     
                      Wadfile with      Wadfile with      Wadfile with    
                       BVGE (fps)        BVGE (fps)        BVGE (fps)     

486           16         32.290            24.679            17.834       
DX4-100                                                                   



Comments...

None.

7.5.3 BSP12x


Wadfile Name          Complexity     Compilation        Size of the     
                                     Time (secs)      WadFile (Bytes)   

SIMPLE.WAD           Low                  1                2,042        

BVGE.WAD             Medium               6               62,110        

VCOMPLEX.WAD         High                 28              134,127       




Wadfile Name          Original # of     Final # of    Final size of the  
                         Linedefs        Line Segs         BSP tree      

SIMPLE.WAD                  12              28                8          

BVGE.WAD                   521              803              247         

VCOMPLEX.WAD               1068            1888              679         




CPU Type    Memory    Speed of Low     Speed of Medium    Speed of High   
             Size      complexity        complexity        complexity     
                      Wadfile with      Wadfile with      Wadfile with    
                       BVGE (fps)        BVGE (fps)        BVGE (fps)     

486           16         32.682            24.025            18.082       
DX4-100                                                                   



Comments....

None.

7.5.4 WARM 1.4


Wadfile Name          Complexity     Compilation        Size of the     
                                     Time (secs)      WadFile (Bytes)   

SIMPLE.WAD           Low                  1                2,042        

BVGE.WAD             Medium               7               60,754        

VCOMPLEX.WAD         High                 97              132,247       




Wadfile Name          Original # of     Final # of    Final size of the  
                         Linedefs        Line Segs         BSP tree      

SIMPLE.WAD                  12              28                8          

BVGE.WAD                   521              793              245         

VCOMPLEX.WAD               1068            1820              630         




CPU Type    Memory    Speed of Low     Speed of Medium    Speed of High   
             Size      complexity        complexity        complexity     
                      Wadfile with      Wadfile with      Wadfile with    
                       BVGE (fps)        BVGE (fps)        BVGE (fps)     

486           16         32.571            24.200            17.955       
DX4-100                                                                   



Comments...

Compile time on VCOMPLEX.WAD was very high due to the fact that WARM builds a very intense REJECT map. A REJECT map is used in DOOM on large levels like this one to speed up the rendering process with precomputed views in large rooms and complex areas. Despite this all other node builders built their REJECT maps in the recorded time periods and I cannot make exceptions. But if any is to be made the REJECT map was built in 62 seconds of that 97 second period, leaving 35 seconds as the node building time. This maybe will be considered in graphing the results.

7.6 Analysis of Results

The best method of displaying mass data is in graph form. Here is the results from the node builder testing.

7.6.1 Compilation time vs. Node Builder

Unit of time is seconds.

7.6.2 Compiled Wad size vs. Node Builder

7.6.3 Original Linedef Line Seg Ratio - Number of splits

Linedef to Seg ratio or number of splits is found by:

(Final Number of LineSegs - Original Number of Linedefs) / 2

This is so, since a Lineseg is a piece or a whole Linedef, the difference gives the number of extra Segs that where made by splitting. And since every split produces two segs, we must divide by 2.

7.6.4 Size of BSP tree vs. Node Builder

7.6.5 Frame rate vs. Node Builder

7.6.6 Final Evaluation via Weighting

After some very interesting results, like BSP12x's comparably good frame rate results for such a simple compiler, I must rate all the results to find the most optimal node-builder. But before I do this a small discussion on the results will be conducted.

In the compile time tests, a large difference in results where recorded for each node builder. But for all other tests the results where very close, hence the need for separate graphs for each set of results. The most surprising was the closeness in the frame rates between the differently built wads. The biggest difference recorded was only a 3% performance enhancement, proving optimization of a BSP is not as crucial as first thought concerning run time matters.

7.6.6.1 Preliminary un-weighted Final results

The following graphs are the relative percentage improvement each Node-builder has to the next. These values are relative, hence no unit of measurement is given and these are the raw values before the weightings are applied. The larger the bar the better the in all cases.

i.e. The aim is to get the:

7.6.6.2 The Final Weighted Results

The final weighted results are as follows:


These values are again only relative, but are very useful for the overall comparison.

7.7 Conclusion

As is obvious, the optimal choice in node-builders is WARM 1.4 written by Robert Fenske Jr., by a very convincing margin of performing more than twice as superior as all the other tested builders for the given weighting scheme. WARMs building method traded off the longer (73% longer than the fastest) compile time to produce a much better wad overall. Thus, WARM built wadfiles will now be used for the next bout of tests as I continue the search for the optimal combination for the BVGE engine.



Shane Arnott © 1995
Rights granted to SAMS Publishing for 3D Game Alchemy for DOOM, DOOM II, Heretic & Hexen.