Doomsday + NVidia's Threaded Optimization = Slowdown

edited 2008 Jun 21 in Developers
<i>This post was originally made by <b>danij</b> on the dengDevs blog. It was posted under the categories: Engine, Platforms, Windows.</i>

As many of you will be aware, there is currently an issue with playing Doomsday on a machine with an NVidia video card when the driver's "Threaded Optimization" setting is turned on.

To date I've not had a chance to investigate the cause of the problem, so I thought I'd open this thread as an opportunity to discuss theories about the likely causes.

This is something we really should address before the next beta, as it is a very common problem (the number of posts on the forum and on IRC is testament to that).

Comments

  • It could be that the GL calls we make per frame are blocking the driver's GL thread at such intervals that it cannot run at full speed. We should check whether we have, for instance, some frequent glGet* calls that could be removed.
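
    For illustration, a one-off query cached at startup avoids this kind of per-frame synchronisation. This is only a rough sketch; the routine names are made up and not our actual code:

    ```c
    #include <GL/gl.h>

    /* Cached once at startup; never queried again during rendering. */
    static GLint maxTexSize = 0;

    void R_InitGLState(void)
    {
        /* One-off glGet*: the value cannot change at runtime, so there
           is no need to ask the driver for it every frame. */
        glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTexSize);
    }

    GLint R_MaxTextureSize(void)
    {
        /* Per-frame callers read the cached copy: no driver round-trip,
           so the driver's worker thread is never stalled by this. */
        return maxTexSize;
    }
    ```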
  • Fundamentally this is a bug in the Nvidia driver, and one that appears to be specific to the Windows drivers (it doesn't occur on, say, dual-core Linux boxes).

    In a nutshell, the driver is designed to take large chunks of OpenGL commands in a single shot. Older cards and drivers worked OK (though not optimally) with the more frequent, smaller command batches you submit, but it appears that on newer hardware you need to batch up bigger chunks.

    Bear in mind that the Nvidia threaded optimisation also breaks a lot of recent games.

    My suggestions would be (and I should stress this really should be looked at for 1.9.0beta7): batch up more of your OpenGL calls, use OpenGL display lists aggressively to speed up rendering (you can generate those for models on load, including interpolation; you then only need to scale, rotate, and call the display list - see the sketch at the end of this comment), and consider switching the particle system to point sprites where available.

    That should help alleviate the symptoms of this clearly low-priority Nvidia driver bug, and improve performance on other systems as well.
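
    As an illustration of the display-list suggestion, here is a rough sketch: the geometry is compiled once at load time, and each frame only the transform changes before the list is replayed in one driver-friendly chunk. The md2_model_t type and its fields are illustrative, not Doomsday's actual data structures:

    ```c
    #include <GL/gl.h>

    typedef struct {
        int     numTris;    /* triangle count */
        float  *verts;      /* numTris * 3 vertices, xyz */
        float  *texCoords;  /* numTris * 3 texture coordinates, st */
        GLuint  list;       /* compiled display list name */
    } md2_model_t;

    /* Called once, when the model is loaded. */
    void Mod_CompileList(md2_model_t *mo)
    {
        int i;

        mo->list = glGenLists(1);
        glNewList(mo->list, GL_COMPILE);
        glBegin(GL_TRIANGLES);
        for(i = 0; i < mo->numTris * 3; ++i)
        {
            glTexCoord2fv(&mo->texCoords[i * 2]);
            glVertex3fv(&mo->verts[i * 3]);
        }
        glEnd();
        glEndList();
    }

    /* Called per frame: position, rotate and scale, then replay the list. */
    void Mod_Draw(const md2_model_t *mo, float x, float y, float z,
                  float yaw, float scale)
    {
        glPushMatrix();
        glTranslatef(x, y, z);
        glRotatef(yaw, 0, 0, 1);
        glScalef(scale, scale, scale);
        glCallList(mo->list);
        glPopMatrix();
    }
    ```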
  • I agree that we should be making better use of more modern GL API features such as display lists. However, I don't think that our direct-mode usage is the cause of this problem. Hmm, I wonder if this has something to do with the vertex array buffer as, AFAIK, it is only being used under Windows?

    When I get time I'll do some experimentation, as it could be something really simple (like skyjake suggested, too frequent calls to the query routines).
  • Vertex arrays (via the OpenGL extension GL_EXT_compiled_vertex_array) have been in use on Win32, *NIX, and OSX the entire time I was with the project.

    If you do have access to hardware that shows the bug, try making a test OpenGL app that renders, e.g., an MD2 model through the various OpenGL submission modes and compares the fps (a rough skeleton of such a test follows at the end of this comment).

    While I do have a dual-core system with an Nvidia graphics card, I've just discovered that Windows 2000 won't even start up on it due to the amount of RAM I have installed.
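
    The shape of such a comparison could be as simple as the skeleton below. The Draw* routines and SwapBuffersAndPump() are placeholders for whatever the test app implements for each submission path; they are not existing functions anywhere:

    ```c
    #include <stdio.h>
    #include <time.h>

    #define TEST_FRAMES 1000

    /* Placeholders to be supplied by the test app. */
    extern void DrawImmediate(void);      /* glBegin/glEnd per triangle */
    extern void DrawVertexArrays(void);   /* glDrawArrays / compiled arrays */
    extern void DrawDisplayList(void);    /* glCallList of precompiled geometry */
    extern void SwapBuffersAndPump(void); /* swap buffers + window messages */

    typedef void (*drawfunc_t)(void);

    /* Render TEST_FRAMES frames with the given draw routine and return fps. */
    static double TimeFPS(drawfunc_t draw)
    {
        clock_t begin = clock();
        int     i;

        for(i = 0; i < TEST_FRAMES; ++i)
        {
            draw();
            SwapBuffersAndPump();
        }
        return TEST_FRAMES / ((double)(clock() - begin) / CLOCKS_PER_SEC);
    }

    void RunComparison(void)
    {
        printf("immediate mode : %.1f fps\n", TimeFPS(DrawImmediate));
        printf("vertex arrays  : %.1f fps\n", TimeFPS(DrawVertexArrays));
        printf("display list   : %.1f fps\n", TimeFPS(DrawDisplayList));
    }
    ```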
  • I noticed via <a href="http://www.newdoom.com/" rel="nofollow">NewDoom</a> that <a href="http://risen3d.newdoom.com/" rel="nofollow">Risen3D</a> had released a new version. Upon checking it out I was amused to find that one of the issues they have addressed in that release is the one being discussed here.

    A quick look at the source confirms that the cause of the problem is indeed the vertex array usage.
  • That is actually very interesting. I look forward to seeing the results of your changes.