Strike Force: The new ATI Radeon 9800, 9600 and 9200 Series
Only a month ago, NVIDIA rolled out its new flagship chip, the GeForce FX 5800 Ultra, to counter the threat posed by rival ATi's line-up. In pure performance terms, the jury is out on who won, or if there was an outright winner. While the FX was inches ahead of the Radeon 9700 PRO in most standard benchmarks, its efforts were frustrated when tested in very high resolutions with FSAA. Additionally, an inordinately loud and therefore impractical cooling solution overshadowed a product launch that was, on the whole, okay. Lastly, although widespread market availability was promised for mid-February, we have yet to see any FX products in more than homoeopathic doses.
Another factor that caused many of our readers' questions to remain unanswered was the hectic and perhaps even hasty launch. Our review sample reached us on a Friday (with NDA expiring the following Monday morning, read: midnight), forcing us to forgo certain interesting, relevant, and important tests due to time constraints. The 3DMark 2003 issue, brought up by NVIDIA, as well as the absence of a certified WHQL driver also did their part to add to the overall confusion and the testing difficulties. You would think both ATi and NVIDIA would learn from this experience. Unfortunately, it seems they didn't, and it was with a sense of déjà vu that we received our Radeon 9800 only days before the NDA was to be lifted.
Both ATi and NVIDIA are unveiling new products today. NVIDIA is adding two members to its FX-family, introducing the mainstream and entry-level cards formerly known as NV31 and NV34. We'll bring you more information on these two products in a second article in a few hours - albeit without any benchmarks. Only a day before the official launch, NVIDIA decided it was still too early for benchmarks, and that these were to follow later. So in this article we'll take a closer look at ATi's new products.
ATi Updates Entire 9x00 Product Line
ATi is updating its entire product line. All cards of the 9000, 9500, and 9700 series will be replaced by successors named 9200, 9600, and 9800, respectively.
The new Radeon 9800 DirectX 9 VPU (R350) supersedes the highly successful Radeon 9700 (R300). Technologically, the chip is an updated and optimized R300 core, with changes that go beyond a simple clock speed bump. The shader unit (SmartShader) has been supplemented with a so-called "F-Buffer," which theoretically allows shader code of infinite length. As a dig at NVIDIA for calling the FX a "DirectX 9+" part because of its extended programmability, ATi has named its R350 a "DirectX 9++" part. You've got to love that creativity.
Changes have also been made to "SmoothVision" (now 2.1) and HyperZ III (now III+ - you can't go wrong with pluses). Also, the clock speed was increased from 325/310 MHz to 380/340 MHz (core/memory). Neither the fabrication process nor the memory type has changed, however. Like the R300, the R350 will be manufactured on a 0.15 micron process and will use a 256Bit interface to connect to the DDR(-I) memory. Although all R350 chips are DDR-II ready, according to ATi, only the 9800 PRO model, due out later in H1/03, will use the newer memory. Like its predecessor, the R350 has eight pixel pipelines and four vertex shader units.
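To put the clock and bus figures above in perspective, here is a rough back-of-the-envelope sketch in Python of peak theoretical memory bandwidth. It assumes that the second number in each clock pair is the memory clock in MHz and that DDR memory transfers data twice per clock. The function name is ours, and real-world throughput will be lower than these peak numbers.

```python
def peak_bandwidth_gb_s(mem_clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Peak theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9


# Memory clocks as quoted in the text (ASSUMPTION: second value of each pair).
r300 = peak_bandwidth_gb_s(310, 256)
r350 = peak_bandwidth_gb_s(340, 256)

print(f"R300: {r300:.2f} GB/s, R350: {r350:.2f} GB/s")
print(f"Memory clock gain: {(340 / 310 - 1) * 100:.1f}%")
print(f"Core clock gain:   {(380 / 325 - 1) * 100:.1f}%")
```

Under these assumptions, the memory clock rises by a little under 10 percent while the core clock rises by roughly 17 percent, so the new chip gains more raw computing speed than memory bandwidth.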
The most obvious visual changes compared to the 9700 have been to the board layout. The Radeon 9800 board is slightly longer, making it look less cluttered overall. The auxiliary power connector now uses a four-pin Molex plug, which feels much more stable than the previous three-pin floppy power plug found on 9700 boards. The new review board also sports a new heat sink/fan.
SmartShader 2.1
The most interesting change from SmartShader 2.0 to 2.1 is the addition of the so-called "F-Buffer", which stands for "Fragment-Stream FIFO buffer". With this new technique, it's theoretically possible to run shader code of infinite length without having to resort to performance-reducing multi-pass operations - note the word "theoretically." In practice, the VPU's performance will quickly limit the length of code that can realistically be run.
The advantage is obvious, though. If the length of a certain piece of shader code exceeds the maximum length specified in DirectX 9, the effect has to be broken down into several steps or passes - if possible. The trouble is that each pass will again require bandwidth-intensive memory accesses (vertex processing, backface culling, triangle setup, texture sampling, pixel shading, stencil testing, Z testing, anti-aliasing). The F-Buffer solves this problem. The concept of the F-Buffer is built on the ideas of William R. Mark and Kekoa Proudfoot of Stanford University.
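The cost of multi-pass rendering described above can be illustrated with a small Python sketch. It simply counts how many passes a shader would need if it were cut into chunks that each fit an instruction limit. The limit of 96 is an illustrative assumption (64 arithmetic plus 32 texture instructions is the figure commonly cited for pixel shader 2.0), the function name is ours, and real shaders cannot always be split this cleanly.

```python
import math


def passes_needed(instruction_count, max_instructions_per_pass):
    """Minimum number of passes if a shader is split into chunks that fit the limit."""
    if instruction_count <= 0:
        return 0
    return math.ceil(instruction_count / max_instructions_per_pass)


# ASSUMPTION: illustrative per-pass limit, not taken from the article.
LIMIT = 96

for length in (64, 96, 200, 1000):
    print(length, "instructions ->", passes_needed(length, LIMIT), "pass(es)")
```

Every pass beyond the first repeats the memory-heavy steps listed above, which is the overhead the F-Buffer is meant to avoid.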
In practice, the F-Buffer probably won't play much of a role in
the foreseeable future, since it will likely be a while yet before pixel shader
2.0 code makes an appearance in games, let alone exceeds the maximum code
length. Current chips simply don't offer sufficient performance. By the same
token, the extended programmability of the GeForce FX is only a theoretical
feature at this point.
Compared to SmoothVision 2.0, version 2.1 sports an optimized
memory controller. The benefit of this improvement should be better performance
in 4x and 6x FSAA in resolutions of 1024x768 and above. ATi is also advertising
its color compression feature for the first time. Its compression factor of 6:1
is also higher than that of the GeForce FX (4:1).
The optimizations in HyperZ III+ mostly affect the improved Z-Cache, which is now more flexible and was optimized for stencil-buffer data. One application of stencil calculations will be to create realistic shadows in future games. The Doom 3 Engine will make heavy use of this feature, for example. But even current games, like Ubisoft's Splinter Cell, make use of stencil data. ATi is reacting to this with its improved Z-cache.
The planned launch and introduction into the market is March 2003. Here's a list of the different versions of the card:
And, in summary, a short overview of the R350's features:
That brings us to ATi's new mainstream product, the Radeon 9600, which will replace the 9500 series.
The Radeon 9600 VPU (alias RV350) is a fully DirectX 9
compliant chip and is based largely on the R300 core (more a mix between
R300/R350), but with a few features omitted. Also, it is ATi's first chip to be
produced on the same 0.13 micron process as the GeForce FX. As a result, it
requires less current, can run at higher clock speeds, and produces less heat.
The number of pixel pipelines has been reduced from eight to four, and the
vertex shader units have also seen a cut, from four to only two. On paper, these
cuts seem like a definite step backwards from both the Radeon 9500 and 9500 PRO.
The RV350 does add the new "SmoothVision 2.1" and "SmartShader 2.0"
enhancements, though, without "F-Buffer". The "Hyper-Z" optimizations, on the
other hand, are still on the same level as that of the "old" R300, meaning 8:1
"Lossless Z" compression instead of the R350's 24:1 compression factor.
While we already have a review sample of the Radeon 9800 PRO, we won't be able to take a look at the 9600, since it won't be launched until later this month.
9600 Card Versions
And the features of the Radeon 9600 VPU (RV350) in summary:
And now, let's look at ATi's new entry-level chip. The Radeon 9200 VPU (RV280) is the replacement part for the Radeon 9000 series. This chip only differs from its predecessor in its AGP 8x support and its higher clock speeds. The core is still based on the Radeon 8500 design with its four pixel pipelines, but, like the 9000, it only has one texturing unit per pipe, instead of the 8500's two.
Despite the "9" in the product name, the chip is not a DirectX 9 part. Instead, being based on the 8500 and 9000 VPUs, it only supports the DirectX 8.1 specification. The antialiasing implementation is also not quite up to date, as the chip still employs the slow SuperSampling technique.
We also don't have access to a Radeon 9200 review sample yet. ATi plans to introduce this part in April 2003. Again, there will be several versions of cards based on this chip:
There's no final word yet on the clock speeds for the 9200 cards. The Radeon 9200 VPU (RV280) supports the following features:
Due to the state of the (non-final) drivers on both sides, image quality comparisons are a bit problematic at this stage. NVIDIA's GeForce FX drivers in particular still have some kinks that need to be ironed out (Xs FSAA modes), which in some cases even cause image corruption. On top of that, we have no way of checking the floating-point precision with which pixel shader effects are calculated.
Therefore we will postpone our more extensive image quality comparison until we have WHQL certified drivers which must conform to certain standards and settings. Nonetheless, we can make some preliminary comparisons between the Radeon 9700 PRO and the GeForce FX. The ATi screenshots were taken on a Radeon 9700 PRO board, which should be representative of the 9800.
We decided to test using Grand Prix 4. Racing games tend to benefit more from anisotropic filtering and FSAA than other games. Especially rough transitions between mipmap levels and the high viewing distance create problems that a good anisotropic filtering implementation can easily remedy.
All screenshots were taken at a resolution of 1024x768. We recommend setting your screen to 1024 as well when viewing them to ensure a realistic comparison.
In this comparison, we'll take a closer look at the race track (the asphalt). Both cards were tested using maximum quality settings (FX: application; R9700 PRO: quality).
At default settings, NVIDIA's GeForce FX produces a visibly crisper image than the Radeon 9700 PRO. With 8x anisotropic filtering enabled, the differences between the cards are minimal, though. Increasing the filtering level to 16x does not visibly improve image quality, however.
Without anisotropic filtering, we can see the different mipmapping settings of the GeForce FX and the Radeon 9700 PRO. Looking at the crash barrier, there are still visible differences between the two cards, even at 8x aniso. While the FX makes the barrier look crisper, it also produces a moiré effect - clearly visible on the left-hand barrier just beyond the little brown shed. The filter also seems to cut out abruptly. On the Radeon, the barrier seems softer, for lack of a better word, and the transition is also less abrupt. The benefit of the Radeon's 16x anisotropic filtering is also clearly visible (for example, at the bottom edge of the barrier on the right, but also on the left) when compared to 8x. On the other hand, the grass on the left seems a tad crisper at 8x on the GeForce FX.
We couldn't tell any differences whatsoever between the different aniso performance settings on the GeForce FX in this game. We're not sure what caused it.
On the Radeon, on the other hand, we could easily spot the differences between the quality settings. In 8x Performance mode, the mipmap transitions are easily discernible. In 16x, the barrier looks much crisper than in 8x.
Since NVIDIA's driver still seems to be having some trouble with 2x FSAA mode (screenshots issue), we'll limit our evaluation to 4x and 8xs modes, or 6x mode, in ATi's case. We'll go into more detail on the different FSAA modes of these cards in a later article.
In 4x mode we can see slight differences when looking at the tire. Both cards do well when drawing the tire's curvature. With almost horizontal and vertical edges, the FX's image shows more jagged edges, however. NVIDIA's 8xs setting does not improve image quality any further. With ATi's 6x setting, on the other hand, there are no longer any aliasing artifacts around the tire.
This image shows the greatest drawback of NVIDIA's "Ordered Grid" antialiasing. Where the pylons still show jagged edges in the FX's 4x mode, ATi's 4x "Jittered Grid" AA all but eliminates them. In 6x, the edges are practically picture perfect. NVIDIA's 8xs mode is an interesting mix, since it combines 4xMultiSampling with 4xSuperSampling. While the grid-like structure of the pylons is drawn much more sharply, it also looks less detailed and shows visible aliasing artifacts.
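Why an ordered grid struggles with almost horizontal and vertical edges can be shown with a short Python sketch. It counts how many distinct coverage levels a pixel can take as a perfectly vertical or horizontal edge sweeps across it. The sample positions below are illustrative assumptions, not the actual patterns used by either chip, and the function name is ours.

```python
# ASSUMPTION: hypothetical 4-sample patterns inside a unit pixel.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
JITTERED_GRID = [(0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375)]


def coverage_levels(samples, axis):
    """Distinct coverage fractions as an axis-aligned edge sweeps across the pixel.

    axis=0 models a vertical edge moving left to right,
    axis=1 models a horizontal edge moving top to bottom.
    """
    levels = {0.0}
    for position in sorted({s[axis] for s in samples}):
        covered = sum(1 for s in samples if s[axis] <= position)
        levels.add(covered / len(samples))
    return sorted(levels)


for name, pattern in (("ordered", ORDERED_GRID), ("jittered", JITTERED_GRID)):
    print(name, "vertical edge:", coverage_levels(pattern, 0))
    print(name, "horizontal edge:", coverage_levels(pattern, 1))
```

In this model the ordered grid offers only one intermediate shade between "outside" and "inside" for such edges, because its samples share columns and rows, while the jittered pattern offers three.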
For those of you who would like to get an impression of the maximum attainable image quality of each card in this game, we have compiled the screenshots in an easily downloadable zip archive:
Image Quality Conclusions
Basing our conclusions on the currently available drivers, we can surmise that the Radeon 9700/9800 is superior to the GeForce FX in both antialiasing and anisotropic filtering. To be fair, NVIDIA's driver still seems to have some problems that are currently holding it back, as the lack of a difference between the different anisotropic filter settings shows. Rest assured that as soon as NVIDIA releases a new "final" driver for the FX, we'll be back in the lab to bring you an extensive image quality comparison between the two cards. NVIDIA's 8xs mode, which is not really true 8x AA but a combination of 4x SuperSampling and 4x MultiSampling, offers no advantages worth mentioning. Since it also proved very slow, we strongly question the usefulness of this mode. ATi's 6x mode, on the other hand, offers visible image quality improvements - at least judging from the screenshots.
Due to ATi's slightly hectic product launch of the Radeon 9800, we have to limit our benchmarking to the bare essentials because of time constraints. To truly test these cards' capabilities, we selected games that employ pixel and vertex shaders.
Instead of using Aquanox, we planned on benching with the sequel (Aquanox 2: Revelation), as the newest beta version already uses DX 9 shader code. Unfortunately, the game would consistently crash on ATi cards due to a sound error if there was no sound card installed.
Each test was first run without FSAA or anisotropic filtering enabled. If you think that would be a cakewalk for these flagship cards, think again. Take Splinter Cell, for example, which uses extravagant effects like realtime shadow calculations to bring even these cards to their knees - at least with the detail slider set to MAX.
All of our testing candidates were benched in Unreal Tournament 2003 with 4x FSAA, 8x anisotropic filtering and a combination of the two. Additionally, the procedure was repeated in Splinter Cell with the top models. All tests were run at the highest possible detail level.
The synthetic benchmarks give a good overall impression of a
card's theoretical capabilities. Although it is heavily disputed as a test (more
on that in the benchmark section), we also used 3DMark2003 as a DirectX 9 test.
Our main focus during testing was the comparison between the Radeon 9800 and the
GeForce FX 5800/5800 Ultra.
Unreal Tournament 2003 - Antalus Flyby
Despite its clock speed advantage, the Radeon 9800 PRO is unable to keep pace with the FX 5800 Ultra. On the whole, the performance gain compared to the Radeon 9700 PRO - without any quality optimizations such as FSAA or anisotropic filtering enabled - is rather small.
Unreal Tournament 2003 - Antalus Botmatch
In the botmatch, the Radeon 9800 can claim a slight advantage, although its lead is marginal. In higher resolutions, the FX takes first place, although its lead is just as slim.
Ubisoft sent us a new benchmarking version of the game Splinter Cell, which we have exclusive access to. Our version is still a beta, and the benchmark is still in development. It will be officially released in the next few weeks.
Splinter Cell is based on the new Unreal Engine and uses very complex shadow and light effects as well as pixel shader v1.1 effects. For the most part, the game is not CPU limited. About half of the rendering load comes from normal 3D calculations, while the other half is attributable to shadow calculations. Splinter Cell uses very complex projected shadows. On NVIDIA cards, buffered shadows can be selected as an option.
For our tests, we selected projected shadows for all cards, though, since the framerate took a hit when buffered shadows were enabled.
NVIDIA's driver version 42.72, which the company is currently suggesting as the "proper" driver release for FX cards, has some rendering problems in this game (no glowing effect around lamps and such). Instead, we benchmarked using version 43.00, which fixes the problem. Interestingly, GeForce 4 cards didn't have any issues when used with driver version 42.72, but would score around 2 fps lower with version 43.00.
Splinter Cell was tested with all details set to maximum (High, High, Very High).
Splinter Cell impressively proves that standard tests, meaning without FSAA and anisotropic filtering, still have their place in benchmarking - at least if the cards are pushed enough by a game that uses complex effects. The Radeon 9800 PRO and the GeForce FX 5800 Ultra are head to head, and the Radeon 9700 PRO is tied with the GeForce FX 5800.
We see a similar picture with the minimum fps scores. Both the Radeon 9800 PRO and the FX 5800 Ultra just scrape past the 25 fps barrier.
With more moderate detail settings, Splinter Cell runs much faster. But at the highest settings, even these high-end cards are pushed to their limits.
In Aquanox, the Radeon 9800 PRO can regain the lead the 9700 Pro lost to the FX 5800 Ultra. It leads the field across the board in all resolutions.
Serious Sam: Second Encounter
We forced all NVIDIA cards to use a 24Bit Z-buffer. Otherwise, the cards would have defaulted to a 16Bit Z-Buffer, while the ATi cards use 24Bits.
NVIDIA cards feel right at home in Serious Sam, as the benchmark scores bear out. The FX 5800s take a clear first place. The Non-Ultra doesn't start to fall behind until the resolution hits 1600x1200. The performance delta between the 9800 and the 9700 is very small.
The ATi cards make a disappointing showing in the minimum frames category, only reaching the levels of a Ti4800. The picture only changes above 1600x1200. Whether the NVIDIA cards benefit from a better OpenGL driver or the drivers are simply highly optimized for this game is hard to tell.
3DMark 2001 SE
3DMark 2001 SE (build 330) tests cards on features of the DirectX 8 generation. For the sake of completeness we are also including the overall 3DMark score.
3DMark 2001 Detail Tests
Game 4 - Nature
Game 4 was the first to introduce extensive pixel shader effects of the DirectX 8 specification. The Radeon 9800 PRO beats the FX 5800 Ultra, despite the clock speed difference. The performance gain over the Radeon 9700 PRO is quite large.
Fillrate Single Texturing
In this test, the Radeon 9800 PRO just about eats the FX 5800
for lunch. The reason is easily found in the design of GeForce FX's pixel
pipelines. Contrary to what the official technical specs say, the FX is really
more of a 4x2 design and not an 8x1, like the Radeon 9500PRO/9700/9800 family.
As a result, the FX can only render four single textured pixels per clock cycle,
while the ATi cards can draw eight.
Fillrate Multi Texturing
When multitexturing is employed, the ranking changes. The FX cards can render four dual textured pixels, just like the ATi cards. Due to the FX's higher clock speeds, the ATi cards are unable to keep up.
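The 4x2 versus 8x1 argument from the two fillrate tests can be sketched as simple arithmetic in Python. The model below assumes that a chip finishes at most one pixel per pipeline per clock and that a pipeline loops when a pixel needs more texture layers than it has texture units. The 380 MHz core clock is the figure quoted earlier for the Radeon 9800. The 500 MHz clock for the FX 5800 Ultra is our assumption and is not stated in this article, and the function name is ours as well.

```python
def peak_pixel_rate_mpix(core_mhz, pipelines, tmus_per_pipe, texture_layers):
    """Peak theoretical fill rate in Mpixels/s for pixels with `texture_layers` textures."""
    texels_per_clock = pipelines * tmus_per_pipe
    # At most one pixel per pipeline per clock, fewer if texture units run short.
    pixels_per_clock = min(pipelines, texels_per_clock / texture_layers)
    return core_mhz * pixels_per_clock


# (core clock in MHz, pipelines, texture units per pipeline)
RADEON_9800_PRO = (380, 8, 1)   # 8x1 design, clock as quoted in the text
GEFORCE_FX_ULTRA = (500, 4, 2)  # 4x2 design; ASSUMPTION: 500 MHz core clock

for layers in (1, 2):
    ati = peak_pixel_rate_mpix(*RADEON_9800_PRO, layers)
    nv = peak_pixel_rate_mpix(*GEFORCE_FX_ULTRA, layers)
    print(f"{layers} texture layer(s): Radeon {ati:.0f} vs FX {nv:.0f} Mpixels/s")
```

In this simplified model the Radeon leads with single texturing because it has twice as many pipelines, while with two texture layers both chips finish four pixels per clock and the higher clock decides the outcome.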
High Polygon Count - Eight Lights
This test sees the Radeon 9800 fall quite a ways behind. One possible explanation would be that the FX carries a fixed-function T&L engine in addition to its vertex shader engine, which would explain its clear lead. (Thanks to Dave @ Beyond3d.com.)
Vertex Shader Speed
And suddenly we're back to the Radeon 9800 completely dominating the FX 5800 Ultra.
Pixel Shader Speed
The performance increase of the Radeon 9800 PRO from the 9700 PRO is impressive, enabling it to clearly beat the FX 5800 Ultra.
Advanced Pixel Shader Speed
Since each DirectX version is a superset of its predecessor, DirectX 9 cards support Pixel Shader 1.4 as well as 2.0. If a card does not support PS1.4, the test reverts to PS1.1, which requires more passes per shader. For some reason, the FX doesn't seem to be using its PS1.4 capabilities. It's unclear whether the cause is to be found in the driver or in 3DMark itself.
The newest version of 3DMark is currently being hotly disputed (see also 3D Mark 2003: The Gamers' Benchmark (?) and 3DMark 2003 - Talking Back to NVIDIA). As of yet, we are still undecided on how useful we find this test.
There is also a bit of confusion where NVIDIA's drivers are concerned. NVIDIA's official line is that the press should use driver version 42.72 (dated 24.02.2003) when testing the FX. Meanwhile, "newer" drivers have cropped up: i.e., version 43.00 (dated 13.02.2003).
The fact of the matter is that with version 42.72, the GeForce FX achieves an overall score that is 2000 points higher than with version 43.00. A first, and as yet unproven, hypothesis is that 42.72 uses only 16Bit floating-point precision, while other versions use full 32Bit precision and are consequently slower. Microsoft's WHQL requirements specify a minimum precision of 24Bits, which is what the ATi R300/R350 chips use.
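To give a feel for what the three precision levels mean numerically, here is a small Python sketch that rounds a value to a given number of mantissa bits. The mantissa widths used (10, 16, and 23 bits for 16Bit, 24Bit, and 32Bit formats) are assumptions about the usual layouts of these formats, the exponent range is ignored, and the function name is ours. It says nothing about what either driver actually does.

```python
import math


def quantize(value, mantissa_bits):
    """Round `value` to a float with `mantissa_bits` stored mantissa bits.

    Simplified model: one implicit leading bit, unlimited exponent range.
    """
    if value == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(value)  # value = mantissa * 2**exponent
    scale = 2 ** (mantissa_bits + 1)        # +1 for the implicit leading bit
    return math.ldexp(round(mantissa * scale) / scale, exponent)


# ASSUMPTION: mantissa widths for the three formats discussed in the text.
FORMATS = {"16Bit": 10, "24Bit": 16, "32Bit": 23}

value = 1 / 3
for name, bits in FORMATS.items():
    error = abs(quantize(value, bits) - value)
    print(f"{name}: {quantize(value, bits)!r} (error {error:.2e})")
```

Each step up in precision shrinks the rounding error by several orders of magnitude, which is why the precision a driver uses matters for image quality as well as speed.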
The interesting question is therefore which driver the FX cards will end up shipping with. Graphics card maker PNY may already have the answer, as the company's home page already sports drivers for its FX 5800 - version 43.00 of 13.02.2003! This gives the 42.72 drivers the taste of a "benchmark driver." Only a WHQL certified driver will be able to give us definite answers, but at this point, such a version does not yet exist.
We decided to test the FX with both available driver versions in 3DMark 2003.
Again, for the sake of completeness, here are the overall scores:
3DMark 2003 Detail Tests
Game 2 - Battle of Proxycon
This test uses pixel shaders of the PS1.4 spec. The Radeon 9800 PRO barely leads the FX 5800 Ultra (42.72) and takes a sound leap ahead of the 9700 PRO.
Game 4 - Mother Nature
In the DirectX 9 test "Mother Nature," the FX 5800 Ultra is positioned at the front of the field. Again, the 9800 PRO shows marked performance improvements over the Radeon 9700 PRO.
The results here are similar to those of the fillrate test in 3DM 2001. The FX cards are held back by their 4x2 design in the single texturing tests and benefit from their higher clock speeds in the multi-texturing discipline. Thanks to its higher clock speed, the Radeon 9800 PRO once again clearly leads the 9700 PRO.
The Radeon 9800 PRO can claim first place in both the vertex shader and the pixel shader tests.
The Codecreatures Direct 3D benchmark was originally published to showcase the 3D Engine, which was under development at the time. It uses pixel shaders of the DirectX 8.1 generation.
In these tests, the FX 5800 Ultra and the Radeon 9800 PRO are virtually tied.
This result tells us how many polygons were calculated per second (avg. MPolys/S). Again, we have parity between the two competitors.
Now we're getting to the interesting tests. While we saw the Radeon and the FX take turns at winning the benchmark categories, the picture changes in these quality tests.
Testing with 4X Full Scene Anti Aliasing.
The Radeon 9800 PRO clearly pulls ahead of the Radeon 9700 PRO and takes the lead in this test.
8x Anisotropic Filtering
Here we test a card's speed when using anisotropic filtering. Since the two companies use different optimizations, a direct "apples-to-apples" comparison is difficult. The quality tests at the beginning of the article showed that visually, ATi's chips seem to have an edge on the competition.
UT 2003 - 8x Aniso Quality
The Radeon 9800 PRO dominates the FX 5800 Ultra. The performance improvement over the Radeon 9700 PRO is obvious.
UT 2003 - Aniso Performance
The GeForce FX 5800 has the upper hand here, although the differences are rather small.
4xFSAA + 8x Anisotropic Filtering
This test combines antialiasing with anisotropic filtering.
UT 2003 - 4x FSAA + 8x Aniso Quality
With both 4x FSAA and 8x aniso enabled, the Radeon 9800 PRO gets to play both of its trump cards, beating the FX 5800 Ultra hands down. The performance increase over the 9700 PRO is also nothing short of impressive.
UT 2003 - 4x FSAA + 8x Aniso Performance
Thanks to its high performance in "Performance Mode," the FX regains some ground, but is unable to catch up to the Radeon 9800 PRO.
The Radeon 9800 PRO makes an impressive showing, nullifying the slim lead NVIDIA's FX 5800 Ultra held over the Radeon 9700 PRO. While the newcomer achieves parity with the NVIDIA card in standard tests, it totally dominates the FX 5800 Ultra when it comes to FSAA and anisotropic filtering. Additionally, the ATi cards offer the better FSAA/aniso implementation in our comparison. It remains to be seen whether this will change with future driver updates from NVIDIA. We'll take a closer look at image quality on both cards as soon as we have WHQL (or final) drivers for these cards.
In addition to its more compact design (single-slot solution) and its simpler (and much quieter) cooler, the Radeon 9800 PRO is also much faster than the FX 5800 Ultra in all important disciplines (FSAA, anisotropic filtering) and offers the best image quality with those features enabled. If you're looking for the fastest 3D accelerator currently available, the Radeon 9800 PRO is your chip. This doesn't make the GeForce FX 5800 Ultra a bad product by any means, but the leadership is once again firmly in ATi's hands.
Owners of a Radeon 9700 PRO need not worry, though. Their card has not suddenly become obsolete because of the Radeon 9800 PRO. While there is a difference between the two, it isn't a dramatic one, and certainly nowhere near enough to justify an upgrade, in our opinion.
We'd be harder pressed to make any recommendations on the Radeon 9500's successor, the 9600 PRO. Judging from the specs, it looks like the 4x1 design will probably be slower than the older 9500 with its 8x1 design, despite the clock speed advantage (400MHz vs. 275MHz). We will only be able to answer that conclusively once we have a review sample, though.
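The reasoning behind that expectation can be checked with a quick Python sketch based on the pipeline counts and clock speeds quoted above. It compares peak theoretical pixel rates and works out the clock a four-pipeline chip would need to match the older eight-pipeline design. The function names are ours, and peak fill rate is only one of several factors in real game performance.

```python
def pixel_rate_mpix(core_mhz, pipelines):
    """Peak theoretical fill rate in Mpixels/s (one pixel per pipeline per clock)."""
    return core_mhz * pipelines


def break_even_clock_mhz(target_mpix, pipelines):
    """Core clock needed to reach `target_mpix` with the given number of pipelines."""
    return target_mpix / pipelines


old_rate = pixel_rate_mpix(275, 8)  # older 8x1 design at 275MHz, per the text
new_rate = pixel_rate_mpix(400, 4)  # 9600 PRO, 4x1 design at 400MHz, per the text

print(f"8x1 at 275MHz: {old_rate} Mpixels/s")
print(f"4x1 at 400MHz: {new_rate} Mpixels/s")
print(f"Clock needed for parity with 4 pipelines: {break_even_clock_mhz(old_rate, 4):.0f}MHz")
```

On paper, the higher clock makes up only part of the deficit left by halving the pipeline count.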
Don't hold your breath for any surprises where the Radeon 9200 is concerned, however. This chip offers nothing new over its predecessor, aside from an AGP8x interface.
The way it looks, the mainstream segment promises to stay interesting for a while yet, especially considering that NVIDIA is set to launch its own mainstream products, based on the GeForce FX technology. Stay tuned for more!