
Quick tip – forcing your app to use the higher-performance GPU

I recently switched from a home-built desktop PC to a laptop with an external GPU enclosure, and was surprised to discover that bytopia immediately crashed on startup on this system.

It turns out the NVIDIA driver isn’t always too smart about choosing which GPU to assign to a particular app, and was giving me the integrated Intel chip, which lacks the OpenGL 4.5 features I’m using.1

You can of course solve this locally in the NVIDIA driver settings by forcing it to use the high-performance GPU, but I’d rather not have to ask every user to figure that out. So, after some Google searching (and a few false starts), I found this NVIDIA technical note, which explains the very hacky process by which you can force your app to use the high-performance GPU:

extern "C" {
    _declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
}

This needs to be in the executable itself; (annoyingly) it won’t work from a DLL. And, needless to say, this solution is Windows-only; I don’t know whether there’s an equivalent for macOS and/or Linux systems.

It turns out that AMD’s method is similar, except their variable is called AmdPowerXpressRequestHighPerformance.2 So, to cover all bases:

extern "C" {
    _declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
    _declspec(dllexport) DWORD AmdPowerXpressRequestHighPerformance = 0x00000001;
}
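
Since these exports only mean anything to the Windows drivers, you can compile them out everywhere else if your codebase builds on other platforms too. A minimal sketch (the #ifdef _WIN32 guard is my addition, not part of either vendor’s documentation):

#ifdef _WIN32
#include <windows.h> // for DWORD

extern "C" {
    // Exported globals the NVIDIA and AMD drivers look for when
    // deciding which GPU to hand the process.
    __declspec(dllexport) DWORD NvOptimusEnablement = 0x00000001;
    __declspec(dllexport) DWORD AmdPowerXpressRequestHighPerformance = 0x00000001;
}
#endif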

I don’t have an AMD card handy so I couldn’t test the AMD version, but I can verify the NVIDIA version worked for me, with one caveat: it doesn’t work with a debugger attached.3  So I still ended up having to force the GPU choice in the driver settings locally, but at least when I distribute the game it will work for other people.
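
As a side note, an easy way to check which GPU the driver actually picked is to log the vendor and renderer strings once your GL context is up. A quick sketch (glGetString is core OpenGL; LogActiveGpu is just an illustrative helper, and it assumes a context is already current and your function loader is initialized):

#include <cstdio>

// Assumes an OpenGL context is current and a loader (glad, GLEW, etc.)
// has been initialized; prints which GPU the driver handed us.
static void LogActiveGpu()
{
    const char* vendor   = reinterpret_cast<const char*>(glGetString(GL_VENDOR));
    const char* renderer = reinterpret_cast<const char*>(glGetString(GL_RENDERER));
    std::printf("GL_VENDOR: %s, GL_RENDERER: %s\n",
                vendor ? vendor : "(unknown)",
                renderer ? renderer : "(unknown)");
}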


1 I will likely support earlier OpenGL versions eventually, although the performance of the Intel chips I’d need it for is so poor I’m not sure it’s worth the effort. And using direct state access does make the rendering layer easier to read…
2 Via this thread on AMD’s community forums. I don’t have an AMD graphics card to test this with, though, so I’m taking their word for it.
3 Presumably the driver has to inspect the executable to see if it exports the NvOptimusEnablement variable, and having the debugger attached prevents that somehow. (Full disclosure: I don’t know much about how debuggers work. 😛 )
