id Software's Usenet Group Posts Archive

id_notes/John C/2001-02-22

[idsoftware.com]

Welcome to id Software's Finger Service V1.5!

Name: John Carmack

Email: johnc@idsoftware.com

Description: Programmer

Project:

Last Updated: 02/22/2001 21:02:26 (Pacific Standard Time)

-------------------------------------------------------------------------------

Feb 22, 2001

------------

I just got back from Tokyo, where I demonstrated our new engine

running under MacOS-X with a GeForce 3 card. We had quite a bit of

discussion about whether we should be showing anything at all,

considering how far away we are from having a title on the shelves, so

we probably aren't going to be showing it anywhere else for quite

a while.

We do run a bit better on a high end wintel system, but the Apple

performance is still quite good, especially considering the short amount

of time that the drivers had before the event.

It is still our intention to have a simultaneous release of the next

product on Windows, MacOS-X, and Linux.

Here is a dump on the GeForce 3 that I have been seriously working

with for a few weeks now:

The short answer is that the GeForce 3 is fantastic. I haven't had such an

impression of raising the performance bar since the Voodoo 2 came out, and

there are a ton of new features for programmers to play with.

Graphics programmers should run out and get one at the earliest possible

time. For consumers, it will be a tougher call. There aren't any

applications out right now that take proper advantage of it, but you should

still be quite a bit faster at everything than GF2, especially with

anti-aliasing. Balance that against whatever the price turns out to be.

While the Radeon is a good effort in many ways, it has enough shortfalls

that I still generally call the GeForce 2 Ultra the best card you can buy

right now, so Nvidia is basically dethroning their own product.

It is somewhat unfortunate that it is labeled GeForce 3, because GeForce

2 was just a speed bump of GeForce, while GF3 is a major architectural

change. I wish they had called the GF2 something else.

The things that are good about it:

Lots of values have additional internal precision, like texture coordinates

and rasterization coordinates. There are only a few places where this

matters, but it is nice to see it cleaned up. Rasterization precision is about

the last thing that the multi-thousand dollar workstation boards still do

any better than the consumer cards.

Adding more texture units and more register combiners is an obvious

evolutionary step.

An interesting technical aside: when I first changed something I was

doing with five single or dual texture passes on a GF to something that

only took two quad texture passes on a GF3, I got a surprisingly modest

speedup. It turned out that the texture filtering and bandwidth were the

dominant factor, not the frame buffer traffic that was saved with more

texture units. When I turned off anisotropic filtering and used

compressed textures, the GF3 version became twice as fast.
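
Editorial illustration (not from the original post): the reasoning above can be sketched with a toy per-pixel cost model. The struct, the function names, and every weight below are hypothetical; the only thing taken from the text is the shape of the comparison, five passes of up to two textures against two passes of four. The point is just that collapsing passes saves frame buffer traffic, so the gain shrinks as texture cost comes to dominate.

```c
/* Toy cost model for a multipass surface, in arbitrary units.
   All weights are hypothetical and only illustrate the reasoning. */
typedef struct {
    int passes;            /* number of rendering passes over the surface */
    int fetches_per_pass;  /* textures sampled in each pass */
} PassSetup;

/* Cost per pixel = frame buffer traffic + texture traffic. */
double pixel_cost(PassSetup s, double fb_cost, double tex_cost)
{
    return s.passes * fb_cost
         + (double)s.passes * s.fetches_per_pass * tex_cost;
}

/* Ratio greater than 1 means the "after" setup is faster. */
double speedup(PassSetup before, PassSetup after,
               double fb_cost, double tex_cost)
{
    return pixel_cost(before, fb_cost, tex_cost)
         / pixel_cost(after, fb_cost, tex_cost);
}
```

With before = {5, 2} and after = {2, 4}, the model gives a 2.5x speedup when texture fetches are free, and a much smaller one as tex_cost grows relative to fb_cost. It does not try to reproduce the measured numbers in the post.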

The 8x anisotropic filtering looks really nice, but it has a 30%+ speed

cost. For existing games where you have speed to burn, it is probably a

nice thing to force on, but it is a bit much for me to enable on the current

project. Radeon supports 16x aniso at a smaller speed cost, but not in

conjunction with trilinear, and something is broken in the chip that

makes the filtering jump around with triangular rasterization

dependencies.

The depth buffer optimizations are similar to what the Radeon provides,

giving almost everything some measure of speedup, and larger ones

available in some cases with some redesign.

3D textures are implemented with the full, complete generality. Radeon

offers 3D textures, but without mip mapping and in a non-orthogonal

manner (taking up two texture units).

Vertex programs are probably the most radical new feature, and, unlike

most "radical new features", actually turn out to be pretty damn good.

The instruction language is clear and obvious, with wonderful features

like free arbitrary swizzle and negate on each operand, and the obvious

things you want for graphics like dot product instructions.

The vertex program instructions are what SSE should have been.
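
Editorial illustration (not from the original post): a minimal software sketch of what "free swizzle and negate on each operand" buys you. The types and function names are made up for this example and this is plain C, not the hardware instruction set. It assumes the negate applies to the whole operand. The cross product is written as one multiply and one multiply-add whose operands are only swizzled or negated, which is the usual way the trick is shown.

```c
/* Software emulation of a vertex-program style source operand:
   a 4-component register read through a swizzle and an optional negate.
   Names here are hypothetical, chosen for the illustration. */
typedef struct { float v[4]; } Vec4;   /* x, y, z, w */

typedef struct {
    unsigned char swz[4];  /* which source component feeds each slot, 0..3 */
    int negate;            /* nonzero: negate all four components */
} Operand;

Vec4 read_operand(Vec4 reg, Operand op)
{
    Vec4 out;
    for (int i = 0; i < 4; ++i) {
        float c = reg.v[op.swz[i]];
        out.v[i] = op.negate ? -c : c;
    }
    return out;
}

/* Four-component dot product, analogous to a DP4 instruction. */
float dp4(Vec4 a, Vec4 b)
{
    return a.v[0]*b.v[0] + a.v[1]*b.v[1] + a.v[2]*b.v[2] + a.v[3]*b.v[3];
}

/* Cross product of the xyz parts as a multiply followed by a
   multiply-add, using only swizzled and negated operand reads. */
Vec4 cross3(Vec4 a, Vec4 b)
{
    const Operand zxyw = { {2, 0, 1, 3}, 0 };
    const Operand yzxw = { {1, 2, 0, 3}, 0 };
    const Operand neg  = { {0, 1, 2, 3}, 1 };
    Vec4 az = read_operand(a, zxyw), by = read_operand(b, yzxw);
    Vec4 ay = read_operand(a, yzxw), bz = read_operand(b, zxyw);
    Vec4 t, r;
    for (int i = 0; i < 4; ++i)            /* t = a.zxyw * b.yzxw */
        t.v[i] = az.v[i] * by.v[i];
    Vec4 nt = read_operand(t, neg);        /* -t, the negate is "free" */
    for (int i = 0; i < 4; ++i)            /* r = a.yzxw * b.zxyw + (-t) */
        r.v[i] = ay.v[i] * bz.v[i] + nt.v[i];
    return r;
}
```

Without swizzles, the same cross product needs explicit shuffle steps before each multiply, which is roughly the complaint about SSE.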

A complex setup for a four-texture rendering pass is way easier to

understand with a vertex program than with a ton of texgen/texture

matrix calls, and it lets you do things that you just couldn't do hardware

accelerated at all before. Changing the model from fixed function data

like normals, colors, and texcoords to generalized attributes is very

important for future progress.
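
Editorial illustration (not from the original post): object-linear texgen computes each generated coordinate as a dot product of the vertex position with a plane. Seen that way, a whole texgen setup for one texture unit is four dot products, which is the kind of thing a vertex program states directly. The sketch below is plain C with hypothetical names; it is not the fixed function API and not actual vertex program code.

```c
/* One generated texture coordinate set (s, t, r, q) from four planes.
   Type and function names are hypothetical, for illustration only. */
typedef struct { float x, y, z, w; } Float4;

float dot4(Float4 a, Float4 b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z + a.w*b.w;
}

/* Each output component is the vertex position dotted with one plane,
   the same arithmetic an object-linear texgen setup performs. */
Float4 texcoord_from_planes(Float4 position, const Float4 planes[4])
{
    Float4 tc;
    tc.x = dot4(position, planes[0]);  /* s */
    tc.y = dot4(position, planes[1]);  /* t */
    tc.z = dot4(position, planes[2]);  /* r */
    tc.w = dot4(position, planes[3]);  /* q */
    return tc;
}
```

For a four-texture pass you would call this once per unit with a different plane set, or feed it any other per-vertex attribute instead of the position, which is where the generalized attribute model starts to pay off.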

Here, I think Microsoft and DX8 are providing a very good benefit by

forcing a single vertex program interface down all the hardware

vendors' throats.

This one is truly stunning: the drivers just worked for all the new

features that I tried. I have tested a lot of pre-production 3D cards, and it

has never been this smooth.

The things that are indifferent:

I'm still not a big believer in hardware accelerated curve tessellation.

I'm not going to go over all the reasons again, but I would have rather

seen the features left off and ended up with a cheaper part.

The shadow map support is good to get in, but I am still unconvinced

that a fully general engine can be produced with acceptable quality using

shadow maps for point lights. I spent a while working with shadow

buffers last year, and I couldn't get satisfactory results. I will revisit

that work now that I have GeForce 3 cards, and directly compare it with my

current approach.

At high triangle rates, the index bandwidth can get to be a significant

thing. Other cards that allow static index buffers as well as static vertex

buffers will have situations where they provide higher application speed.

Still, we do get great throughput on the GF3 using vertex array range

and glDrawElements.
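
Editorial illustration (not from the original post): a back-of-the-envelope way to see why index traffic matters at high triangle rates. The function name and the example rate are assumptions, and the model only covers indexed triangle lists with three indices per triangle.

```c
/* Bytes of index data the application must feed per second when drawing
   indexed triangle lists (three indices per triangle, no strips or fans).
   The function name and any rates passed in are illustrative assumptions. */
double index_bytes_per_second(double triangles_per_second, int bytes_per_index)
{
    return triangles_per_second * 3.0 * bytes_per_index;
}
```

At a hypothetical 20 million triangles per second with 16-bit indices, that is 120 million bytes of indices per second that cannot live on the card if only the vertex data is static.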

The things that are bad about it:

Vertex programs aren't invariant with the fixed function geometry paths.

That means that you can't mix vertex program passes with normal

passes in a multipass algorithm. This is annoying, and shouldn't have

happened.
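
Editorial illustration (not from the original post): "invariant" here means that two paths given the same vertex produce bit-identical transformed positions, so later passes hit exactly the same depth values. The sketch below shows one way two mathematically equal transforms can stop being bit-identical: the same four products summed with different grouping. The function names are hypothetical, and this is an assumption about the general mechanism, not a description of what the GF3 hardware does.

```c
/* One output coordinate of a 4x4 transform, summed left to right. */
float transform_sequential(const float row[4], const float v[4])
{
    return ((row[0]*v[0] + row[1]*v[1]) + row[2]*v[2]) + row[3]*v[3];
}

/* The same products, summed as two pairs. Algebraically identical,
   but floating point addition is not associative, so for general
   inputs the result can differ in the last bits. */
float transform_pairwise(const float row[4], const float v[4])
{
    return (row[0]*v[0] + row[1]*v[1]) + (row[2]*v[2] + row[3]*v[3]);
}
```

A difference in the last bit of a depth value is enough to make an equal-depth test fail on a later pass, which is why mixing the two kinds of passes is not safe.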

Now we come to the pixel shaders, where I have the most serious issues.

I can just ignore this most of the time, but the way the pixel shader

functionality turned out is painfully limited, and not what it should have

been.

DX8 tries to pretend that pixel shaders live on hardware that is a lot

more general than the reality.

Nvidia's OpenGL extensions expose things much more the way they

actually are: the existing register combiners functionality extended to

eight stages with a couple tweaks, and the texture lookup engine is

configurable to interact between textures in a list of specific ways.

I'm sure it started out as a better design, but it apparently got cut and cut

until it really looks like the old BumpEnvMap feature writ large: it does

a few specific special effects that were deemed important, at the expense

of a properly general solution.

Yes, it does full bumpy cubic environment mapping, but you still can't

just do some math ops and look the result up in a texture. I was

disappointed on this count with the Radeon as well, which was just

slightly too hardwired to the DX BumpEnvMap capabilities to allow

more general dependent texture use.
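
Editorial illustration (not from the original post): what "do some math ops and look the result up in a texture" means, written as plain software. The table stands in for a 1D texture, and the dot product stands in for arbitrary per-pixel math. All names, the table size, and the choice of a power curve are assumptions made for the example.

```c
#define LUT_SIZE 256

/* Fill a 1D table standing in for a texture: x raised to an integer
   power, sampled on [0, 1]. The power curve is just an example. */
void build_power_lut(float lut[LUT_SIZE], int exponent)
{
    for (int i = 0; i < LUT_SIZE; ++i) {
        float x = (float)i / (float)(LUT_SIZE - 1);
        float p = 1.0f;
        for (int k = 0; k < exponent; ++k)
            p *= x;
        lut[i] = p;
    }
}

/* A general dependent read: compute a value with ordinary math, then
   use that value as the coordinate for a (nearest-sample) lookup. */
float dependent_lookup(const float lut[LUT_SIZE],
                       const float n[3], const float h[3])
{
    float d = n[0]*h[0] + n[1]*h[1] + n[2]*h[2];   /* the "math ops" */
    if (d < 0.0f) d = 0.0f;                        /* clamp to [0, 1] */
    if (d > 1.0f) d = 1.0f;
    int index = (int)(d * (float)(LUT_SIZE - 1) + 0.5f);
    return lut[index];                             /* the lookup */
}
```

The complaint in the text is that the hardware offers a fixed list of such combinations rather than letting the math step be arbitrary.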

Enshrining the capabilities of this mess in DX8 sucks. Other companies

had potentially better approaches, but they are now forced to dumb them

down to the level of the GF3 for the sake of compatibility. Hopefully

we can still see some of the extra flexibility in OpenGL extensions.

The future:

I think things are going to really clean up in the next couple years. All

of my advocacy is focused on making sure that there will be a

completely clean and flexible interface for me to target in the engine

after DOOM, and I think it is going to happen.

The market may have shrunk to just ATI and Nvidia as significant

players. Matrox, 3Dlabs, or one of the dormant companies may surprise

us all, but the pace is pretty frantic.

I think I would be a little more comfortable if there was a third major

player competing, but I can't fault Nvidia's path to success.