Ugur Turan - Octree

Main
[home]
[me]
[my resume]
[my tutorials]
[my journal]
[my links]
[contact me]

Project pages
[Luces]
[Portfolio]
[SnakeCharm]
[Others]

Last update:
March 19, 2005
Contact:
webmaster

Octree's for rendering and collision

I started writing this very recently, therefore it is not complete.

However, I find some of the ideas interesting.

I did think a lot on this, and here is the solution I came up with.

I am assuming that we have T&L hardware, and the world is in fast memory (AGP or video ram).

Culling for rendering:

Yes, we shall have a very fast culling tool. I am assuming we have many cellars, dungeons, rooms, small objects, etc., probably connected to each other using tunnels, doors, whatever. My first idea was to use an octree or kd-tree on the entire scene. To avoid problems with blending/multipass algorithms, etc., we have to assure that a triangle appears in only a single node. To do this, we may either use a loose-octree ( as described by Thatcher Ulrich ), or clip triangles and split them to oct-tree bounding boxes, then stop further dividing when we are below, say, 1000 triangles. Both work fine (but I believe the second to be better ).

The drawback of this method appeared when I first implemented it. Gettin the nodes in the frustum and rendering them has several problems. To see, let's assume that the whole scene has two different shaders, one for walls, one for floor. Then, possibly many nodes will have triangles with two different shaders. If we have 1000 nodes in frustum, it means spoiling the texture cache 1000 times a frame, when only once should be necessary.

My nodes stored only indices to vertex data, and I stored the vertex data in one large array ( better than storing all vertex data in the node). Then, we may either render each node alone, or combine the indices. And if, when combining indices into a indice buffer, we sort the indices by shader, then we can minimize the state changes. However, that introduces a CPU burden, especially if we have a very large triangle count.

Then I came up with this idea: Every room, corridor, whatever should be modelled as seperate meshes. Or, if artists want to use many small meshes to build up a room, ask them to group them, and give the group a name like 'room_1'. Then, we may cull these rooms at object level, which will relieve both the GPU and the CPU.

But how to cull? My first idea was frustum culling again, which would do. Then I noticed that these rooms are 'snapped' to each other, and their intersection will only form a polygon. We may use these polygons to use portal-style culling. Given two meshes, pre-calculate their nearest point, if their distance is below an epsilon, the two meshes are neighbors. Calculate all vertice/edge/surface distances below that epsilon, connect these edges/vertices, and you have the portal. To avoid cracks on joints, we may need to check the vertices that overlap and equalize their indices.

Then, render the world using the portal way. Draw the room you are in - check if portals in that room are visible - if portal is visible, render the associated rooms - check portals in the new rooms - continue until all rooms are rendered and no more portals are visible.

This method is easy to implement, and reduces the fill-rate burden of the video card to a reasonable minimum, shader switching to a great low value, and totally eliminates the CPU overhead. However, because it is not a true portal engine, we will still need the z-buffer test.

Culling for collision detection:

Assuming each room may have ten thousands of polygons, we cannot simply check all polygons in the room the player is in. The oct-tree/kd-tree seems good enough for this. Depending on your collision and response method, you may or may not want to split triangles at box boundries. I used the elipsoid method for first-person for player-world collision. It is fine, but I had a hard time getting it work right. It is VERY cpu hungry, so I divided nodes until I reached a very low polygon count. An alternative to octrees would be an OBB tree, however, creating them is not as easy as using them.

Culling for Rendering - additions:

This came into my mind while thinking of a way to reduce cost of per-vertex setup for cubemap based spotlight effect. I was basically doing a diffuse+lightmap pass, then a second pass with glBlendFunc(GL_ONE,GL_ONE). My problem was that calculating the per-vertex texture coords for the second pass costed just too much, because of the calculation of very dense and high-poly objects that were not in the spotlight's cone.

Keeping a octree-style thing made up of split triangles would have some benefits. In the cases of projective textures, spotlights, or any attenuated dynamic light, you will probably have to do multiple passes. For a blended second-pass for a spotlight that will only illuminate a quarter of one room is not very nice. Most cards have very hard times doing blending. Basicly, reading from the framebuffer slows. Things like reading, multiplying, then writing to the framebuffer will slow you down to a turtle.

A problem that arise, if we split the triangles to nodes, we will fail on glDepthTest(GL_EQUAL) tests (we can make sure that our divisions are not perfect). If we dont' split, some triangles will be on two different nodes, they may be rendered twice, thus blended twice, introducing artifacts (like making them 'visible').

Ulrich's loose oct-tree seems to me a very good solution. No duplicate triangles, no splits.

So, if you are rendering a spotlight effect, get the loose-nodes in the spot cone, and render them with blending. Assuming that there will not be many nodes inside the cone, this will not slow down too much.

Still, I think the use octrees for blending depends on the complexity of scene. If you are only projecting light to the floor (quake 3 style) using octrees willl speed up, especially if the rest of the scene is complex.

Combination / Result:

I render the scene using some portal-like culling. If we need per-vertex calculations or a second pass that will affect only a little area, I use loose octrees. I use an oct-tree for filtering out triangles for collision detection.

Maybe not be perfect way, but works fine. The thing I like with it is that it allows the artists to design very complex worlds, and no model must be 'closed'. If you remove/change the portal-based method, it may even work on polygon soups.

Hosted by www.Geocities.ws