Nvidia graphics driver causing noticeable frame stuttering
H

4

34

OK, I've been researching this issue for a few days now, so let me go over what I know so far, which leads me to believe this might be an issue with NVIDIA's driver and not my code.

Basically, my game starts stuttering after running for a few seconds (random frames take 70 ms instead of 16 ms, in a fairly regular pattern). This ONLY happens if a setting called "Threaded Optimization" is enabled in the NVIDIA Control Panel (latest drivers, Windows 10). Unfortunately, this setting is enabled by default, and I'd rather not make people tweak their settings to get an enjoyable experience.

  • The game is not CPU or GPU intensive (2 ms a frame with vsync off). It's not calling any OpenGL functions that need to synchronize data, and it's not streaming any buffers or reading data back from the GPU. It's about the simplest possible renderer.

  • The problem was always there; it only became noticeable when I added FMOD for audio. FMOD is not the cause of this (more on that later in the post).

  • Trying to debug the problem with NVidia Nsight made the problem go away. "Start Collecting Data" instantly causes stuttering to go away. No dice here.

  • In the profiler, a lot of CPU time is spent in "nvoglv32.dll". The driver thread running that code only spawns if Threaded Optimization is on. I suspected a synchronization issue, so I debugged with the Visual Studio Concurrency Visualizer.

  • A-HA! vsyncs

  • Investigating these blocks of CPU time on the nvidia thread, the earliest named function I can get in their callstack is "CreateToolhelp32Snapshot" followed by a lot of time spent in Thread32Next. I noticed Thread32Next in the profiler when looking at CPU times earlier so this does seem like I'm on the right track.

  • So it looks like periodically the nvidia driver is grabbing a snapshot of the whole process for some reason? What could possibly be the reason, why is it doing this, and how do I stop it?

  • This also explains why the problem became noticeable once I added FMOD: it's grabbing info for all of the process's threads, and FMOD spawns a lot of threads.

  • Any help? Is this just a bug in NVIDIA's driver, or is there something I can do to fix it other than telling people to disable Threaded "Optimization"?
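
For readers unfamiliar with the two functions named in the call stack, here is a minimal sketch of what a Toolhelp thread walk looks like. It is not the driver's code (NVIDIA does not publish it); `countOwnThreadsViaToolhelp` is a hypothetical name, and the function is stubbed out on non-Windows platforms. It only illustrates why such a walk gets more expensive as the thread count grows.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <tlhelp32.h>
#endif

// Counts the threads owned by the current process by walking a Toolhelp
// snapshot. Returns -1 if the snapshot fails or on non-Windows platforms.
int countOwnThreadsViaToolhelp()
{
#ifdef _WIN32
    // TH32CS_SNAPTHREAD captures every thread in the system; the process ID
    // argument is ignored for this flag, so filtering has to happen below.
    HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
    if (snapshot == INVALID_HANDLE_VALUE)
        return -1;

    THREADENTRY32 entry;
    entry.dwSize = sizeof(entry); // must be set before Thread32First

    const DWORD self = GetCurrentProcessId();
    int count = 0;
    if (Thread32First(snapshot, &entry))
    {
        do
        {
            if (entry.th32OwnerProcessID == self)
                ++count;
        } while (Thread32Next(snapshot, &entry)); // one call per thread
    }

    CloseHandle(snapshot);
    return count;
#else
    return -1; // Toolhelp is a Windows-only API
#endif
}
```

Because the snapshot covers all threads on the machine and `Thread32Next` is called once per entry, a walk like this does more work on a busy system or in a process with many threads, which is consistent with the stutter becoming visible after FMOD was added.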

edit 1: The same issue occurs with current nvidia drivers on my laptop too. So I'm not crazy

edit 2: the same issue occurs on version 362 (previous major version) of nvidia's driver
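
For anyone trying to reproduce this, spikes like the ones described (about 70 ms against a 16 ms budget) could be logged with something like the sketch below. `FrameSpikeLog` and `elapsedMs` are hypothetical helper names, not part of the original engine, and the threshold is whatever the caller picks.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: records which frames exceeded a time threshold.
class FrameSpikeLog
{
public:
    explicit FrameSpikeLog(double spikeThresholdMs)
        : thresholdMs_(spikeThresholdMs) {}

    // Call once per frame with the measured frame duration in milliseconds.
    void addFrame(double frameMs)
    {
        if (frameMs > thresholdMs_)
            spikeIndices_.push_back(frameCount_);
        ++frameCount_;
    }

    // Indices (0-based) of the frames that exceeded the threshold.
    const std::vector<std::size_t>& spikes() const { return spikeIndices_; }
    std::size_t frameCount() const { return frameCount_; }

private:
    double thresholdMs_;
    std::size_t frameCount_ = 0;
    std::vector<std::size_t> spikeIndices_;
};

// Converts the time between two steady_clock samples to milliseconds.
inline double elapsedMs(std::chrono::steady_clock::time_point begin,
                        std::chrono::steady_clock::time_point end)
{
    return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

Sampling `std::chrono::steady_clock::now()` at the same point in every frame (for example right after the buffer swap) and feeding the difference to `addFrame` gives a list of spike indices, which makes it easy to check whether the pattern really is periodic.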

Hime answered 30/4, 2016 at 19:51 Comment(10)
Hey, Tyler. :) Out of curiosity, do you have some debug drivers installed or something? I can't for the life of me guess why a graphics driver would need to grab that kind of information unless it's for some kind of debugging/logging reasons. - Harshman
Nope. They're the publicly available drivers from NVIDIA's website. I'm not even sure where I'd get a debug-mode driver... - Hime
Also, I should mention that I tried this on its own without Visual Studio running, just in case Visual Studio was injecting some debug code... same issue. - Hime
Have you tried creating a minimal reproduction for this? It might help people help you. - Harlen
Do you have a contact at NVIDIA that might be able to nail this? (Is this for Bombernauts? In which case, maybe blame Unity? ;) ) If not, and you think it could help, I have a semi-stale contact I could connect you with that could get you to the right department. - Harshman
Usually the only way to resolve these issues is to contact NVIDIA developer support ([email protected]?). Since NVIDIA doesn't supply symbols (randomascii.wordpress.com/2011/11/27/a-tale-of-two-call-stacks) and since the set of possible explanations is infinite and ever changing, there is no guarantee that you can resolve this on your own. Graphics drivers appear to be built of hacks layered on top of tradeoffs, and it is very easy to trigger bad behavior. Good luck! - Foulup
jim- no, this is for a game in my own engine. The issue doesn't happen with Unity because it uses DirectX. I'm working on getting in contact with NVIDIA about it (Tommy Refenes has some contacts there he's lending me), but if you know anyone who's directly involved in this, that would help too. - Hime
Oh, for sure he has better people to contact then. The guy I know is/was more involved with the Tegra hardware, so best case, he could point me/you to another person that is more involved with the drivers. - Harshman
Note that this has been a known issue in the NVIDIA drivers for many years, on both Windows and Linux, so I wouldn't hold out much hope of it getting fixed. - Neology
There is another issue with threaded optimization, namely that it appears to mess with the thread affinities of the application. I find these behaviors utterly unacceptable: you buy a HARDWARE acceleration device, and its driver not only spawns lots of threads that steal CPU time from the application, but also modifies threading settings that you have fine-tuned to perfection. Sigh. - Burcham
E
8

... or is there something I can do to fix it other than telling people to disable Threaded "Optimization"?

Yes.

You can create a custom "Application Profile" for your game using NVAPI and disable the "Threaded Optimization" setting in it.

There is a PDF on the NVIDIA site with some help and code examples for NVAPI usage.

To see and manage all your NVIDIA profiles, I recommend using NVIDIA Inspector. It is more convenient than the default NVIDIA Control Panel.

Also, here is my code example, which creates an "Application Profile" with "Threaded Optimization" disabled:

#include <stdlib.h>
#include <stdio.h>

#include <nvapi.h>
#include <NvApiDriverSettings.h>


const wchar_t*  profileName             = L"Your Profile Name";
const wchar_t*  appName                 = L"YourGame.exe";
const wchar_t*  appFriendlyName         = L"Your Game Casual Name";
const bool      threadedOptimization    = false;


void CheckError(NvAPI_Status status)
{
    if (status == NVAPI_OK)
        return;

    NvAPI_ShortString szDesc = {0};
    NvAPI_GetErrorMessage(status, szDesc);
    printf("NVAPI error: %s\n", szDesc);
    exit(-1);
}


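void SetNVUstring(NvAPI_UnicodeString& nvStr, const wchar_t* wcStr)
{
    // Zero the whole buffer so the copied string is always terminated
    for (int i = 0; i < NVAPI_UNICODE_STRING_MAX; i++)
        nvStr[i] = 0;

    // Copy at most NVAPI_UNICODE_STRING_MAX - 1 characters so nvStr is never overrun
    int i = 0;
    while (wcStr[i] != 0 && i < NVAPI_UNICODE_STRING_MAX - 1)
    {
        nvStr[i] = wcStr[i];
        i++;
    }
}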


int main(int argc, char* argv[])
{
    NvAPI_Status status;
    NvDRSSessionHandle hSession;

    status = NvAPI_Initialize();
    CheckError(status);

    status = NvAPI_DRS_CreateSession(&hSession);
    CheckError(status);

    status = NvAPI_DRS_LoadSettings(hSession);
    CheckError(status);


    // Fill Profile Info
    NVDRS_PROFILE profileInfo = {0};   // zero-initialize so fields not set below do not hold garbage
    profileInfo.version             = NVDRS_PROFILE_VER;
    profileInfo.isPredefined        = 0;
    SetNVUstring(profileInfo.profileName, profileName);

    // Create Profile
    NvDRSProfileHandle hProfile;
    status = NvAPI_DRS_CreateProfile(hSession, &profileInfo, &hProfile);
    CheckError(status);


    // Fill Application Info
    NVDRS_APPLICATION app = {0};       // zero-initialize so fields not set below do not hold garbage
    app.version                     = NVDRS_APPLICATION_VER_V1;
    app.isPredefined                = 0;
    SetNVUstring(app.appName, appName);
    SetNVUstring(app.userFriendlyName, appFriendlyName);
    SetNVUstring(app.launcher, L"");
    SetNVUstring(app.fileInFolder, L"");

    // Create Application
    status = NvAPI_DRS_CreateApplication(hSession, hProfile, &app);
    CheckError(status);


    // Fill Setting Info
    NVDRS_SETTING setting = {0};       // zero-initialize so fields not set below do not hold garbage
    setting.version                 = NVDRS_SETTING_VER;
    setting.settingId               = OGL_THREAD_CONTROL_ID;
    setting.settingType             = NVDRS_DWORD_TYPE;
    setting.settingLocation         = NVDRS_CURRENT_PROFILE_LOCATION;
    setting.isCurrentPredefined     = 0;
    setting.isPredefinedValid       = 0;
    setting.u32CurrentValue         = threadedOptimization ? OGL_THREAD_CONTROL_ENABLE : OGL_THREAD_CONTROL_DISABLE;
    setting.u32PredefinedValue      = threadedOptimization ? OGL_THREAD_CONTROL_ENABLE : OGL_THREAD_CONTROL_DISABLE;

    // Set Setting
    status = NvAPI_DRS_SetSetting(hSession, hProfile, &setting);
    CheckError(status);


    // Apply (or save) our changes to the system
    status = NvAPI_DRS_SaveSettings(hSession);
    CheckError(status);


    printf("Success.\n");

    NvAPI_DRS_DestroySession(hSession);

    return 0;
}
Expecting answered 4/6, 2016 at 17:12 Comment(4)
Thanks for the code. I'll probably end up having to use this if NVIDIA doesn't fix it by the time my game is done. - Hime
Note that profiles are loaded by the driver at application start (from its name), so you need to restart the application for your custom profile to take effect. Doing this in the installer is probably best. - Bluma
Shouldn't it be NVDRS_APPLICATION_VER instead of NVDRS_APPLICATION_VER_V1? - Clevis
@Bluma My tests seem to indicate that this is not the case and the profile is in fact activated immediately. Note: I use OpenGL via GLEW, which I only initialize after creating the profile. By the way: one should use NvAPI_DRS_FindApplicationByName to check if the profile already exists before creating it. - Clevis
R
1

Hate to state the obvious but I feel like it needs to be said.

Threaded optimization is notorious for causing stuttering in many games, even those that take advantage of multithreading. Unless your application works well with the threaded optimization setting, the only logical answer is to tell your users to disable it. If users are stubborn and don't want to do that, that's their fault.

The only bug in recent memory I can think of is that older versions of the nvidia driver caused applications w/ threaded optimization running in Wine to crash, but that's unrelated to the stuttering issue you describe.

Riti answered 1/5, 2016 at 5:46 Comment(2)
At this point, if it's not something NVIDIA wants to fix (I nailed down the Windows function causing the stuttering, I think), I'm just gonna switch over to DirectX, where the issue doesn't seem to be present. - Hime
"That's their fault"?! No, if your application doesn't work, it's your fault. - Bluma
P
1

Thanks to subGlitch for the answer above. Building on that proposal, I made a safer variant that lets you cache the current Threaded Optimization value, change it, and restore it afterward.

The code is below:

#include <stdlib.h>
#include <stdio.h>
#include <nvapi.h>
#include <NvApiDriverSettings.h>

enum NvThreadOptimization {
    NV_THREAD_OPTIMIZATION_AUTO         = 0,
    NV_THREAD_OPTIMIZATION_ENABLE       = 1,
    NV_THREAD_OPTIMIZATION_DISABLE      = 2,
    NV_THREAD_OPTIMIZATION_NO_SUPPORT   = 3
};

bool NvAPI_OK_Verify(NvAPI_Status status)
{
    if (status == NVAPI_OK)
        return true;

    NvAPI_ShortString szDesc = {0};
    NvAPI_GetErrorMessage(status, szDesc);

    // Print through a fixed format string; never pass external text as the format argument
    printf("NVAPI error: %s\n", szDesc);

    return false;
}

NvThreadOptimization GetNVidiaThreadOptimization()
{
    NvAPI_Status status;
    NvDRSSessionHandle hSession;
    NvThreadOptimization threadOptimization = NV_THREAD_OPTIMIZATION_NO_SUPPORT;

    status = NvAPI_Initialize();
    if(!NvAPI_OK_Verify(status))
        return threadOptimization;

    status = NvAPI_DRS_CreateSession(&hSession);
    if(!NvAPI_OK_Verify(status))
        return threadOptimization;

    status = NvAPI_DRS_LoadSettings(hSession);
    if(!NvAPI_OK_Verify(status))
    {
        NvAPI_DRS_DestroySession(hSession);
        return threadOptimization;
    }


    NvDRSProfileHandle hProfile;
    status = NvAPI_DRS_GetBaseProfile(hSession, &hProfile);
    if(!NvAPI_OK_Verify(status))
    {
        NvAPI_DRS_DestroySession(hSession);
        return threadOptimization;
    }

    NVDRS_SETTING originalSetting = {0};   // zero-initialize before setting the version
    originalSetting.version = NVDRS_SETTING_VER;
    status = NvAPI_DRS_GetSetting(hSession, hProfile, OGL_THREAD_CONTROL_ID, &originalSetting);
    if(NvAPI_OK_Verify(status))
    {
        threadOptimization = (NvThreadOptimization)originalSetting.u32CurrentValue;
    }

    NvAPI_DRS_DestroySession(hSession);

    return threadOptimization;
}

void SetNVidiaThreadOptimization(NvThreadOptimization threadedOptimization)
{
    NvAPI_Status status;
    NvDRSSessionHandle hSession;

    if(threadedOptimization == NV_THREAD_OPTIMIZATION_NO_SUPPORT)
        return;

    status = NvAPI_Initialize();
    if(!NvAPI_OK_Verify(status))
        return;

    status = NvAPI_DRS_CreateSession(&hSession);
    if(!NvAPI_OK_Verify(status))
        return;

    status = NvAPI_DRS_LoadSettings(hSession);
    if(!NvAPI_OK_Verify(status))
    {
        NvAPI_DRS_DestroySession(hSession);
        return;
    }

    NvDRSProfileHandle hProfile;
    status = NvAPI_DRS_GetBaseProfile(hSession, &hProfile);
    if(!NvAPI_OK_Verify(status))
    {
        NvAPI_DRS_DestroySession(hSession);
        return;
    }

    NVDRS_SETTING setting = {0};    // zero-initialize so fields not set below do not hold garbage
    setting.version                 = NVDRS_SETTING_VER;
    setting.settingId               = OGL_THREAD_CONTROL_ID;
    setting.settingType             = NVDRS_DWORD_TYPE;
    setting.u32CurrentValue         = (EValues_OGL_THREAD_CONTROL)threadedOptimization;

    status = NvAPI_DRS_SetSetting(hSession, hProfile, &setting);
    if(!NvAPI_OK_Verify(status))
    {
        NvAPI_DRS_DestroySession(hSession);
        return;
    }

    status = NvAPI_DRS_SaveSettings(hSession);
    NvAPI_OK_Verify(status);

    NvAPI_DRS_DestroySession(hSession);
}

Using the Get/Set pair above, you can save the original setting at startup and restore it when your application exits, so the change to Threaded Optimization is only in effect while your application is running.

static NvThreadOptimization s_OriginalNVidiaThreadOptimization = NV_THREAD_OPTIMIZATION_NO_SUPPORT;

// Set
s_OriginalNVidiaThreadOptimization =  GetNVidiaThreadOptimization();
if(    s_OriginalNVidiaThreadOptimization != NV_THREAD_OPTIMIZATION_NO_SUPPORT
    && s_OriginalNVidiaThreadOptimization != NV_THREAD_OPTIMIZATION_DISABLE)
{
    SetNVidiaThreadOptimization(NV_THREAD_OPTIMIZATION_DISABLE);
}

//Restore
if(    s_OriginalNVidiaThreadOptimization != NV_THREAD_OPTIMIZATION_NO_SUPPORT
    && s_OriginalNVidiaThreadOptimization != NV_THREAD_OPTIMIZATION_DISABLE)
{
    SetNVidiaThreadOptimization(s_OriginalNVidiaThreadOptimization);
}
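
The save/restore pattern above can also be tied to a scope so the restore is not forgotten on an early return. The following is only a sketch: `ScopedSettingOverride` is a hypothetical helper, and it takes the getter and setter as parameters so it compiles without the NVAPI SDK.

```cpp
#include <functional>
#include <utility>

// Hypothetical RAII helper: applies a temporary value on construction and
// puts the original value back when the guard goes out of scope.
template <typename T>
class ScopedSettingOverride
{
public:
    ScopedSettingOverride(std::function<T()> getter,
                          std::function<void(T)> setter,
                          T temporaryValue)
        : setter_(std::move(setter)), original_(getter()), changed_(false)
    {
        // Only touch the setting if it differs from what we want
        if (original_ != temporaryValue)
        {
            setter_(temporaryValue);
            changed_ = true;
        }
    }

    ~ScopedSettingOverride()
    {
        // Restore only if the constructor actually changed something
        if (changed_)
            setter_(original_);
    }

    ScopedSettingOverride(const ScopedSettingOverride&) = delete;
    ScopedSettingOverride& operator=(const ScopedSettingOverride&) = delete;

private:
    std::function<void(T)> setter_;
    T original_;
    bool changed_;
};
```

With the functions from this answer, a guard could be constructed as `ScopedSettingOverride<NvThreadOptimization> guard(GetNVidiaThreadOptimization, SetNVidiaThreadOptimization, NV_THREAD_OPTIMIZATION_DISABLE);` after first checking that the getter does not return NV_THREAD_OPTIMIZATION_NO_SUPPORT. Note that a destructor does not run if the process crashes or is killed, in which case the changed value stays in the driver settings.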
Pylle answered 30/3, 2017 at 4:54 Comment(1)
But wouldn't this also impact other applications that start while this app is running? - Clevis
O
0

Building off of @subGlitch's answer, the following checks whether an application profile already exists and, if so, updates the existing profile instead of creating a new one. It is also wrapped in a function that skips the logic if the NVIDIA API is not found on the system (AMD/Intel users) or if an issue is encountered that prevents modifying the profile:

#include <iostream>

#include <nvapi.h>
#include <NvApiDriverSettings.h>


const wchar_t*  profileName = L"Application for testing nvidia api";
const wchar_t*  appName = L"nvapi.exe";
const wchar_t*  appFriendlyName = L"Nvidia api test";
const bool      threadedOptimization = false;


bool nvapiStatusOk(NvAPI_Status status)
{
    if (status != NVAPI_OK)
    {
        // will need to not print these in prod, just return false

        // full list of codes in nvapi_lite_common.h line 249
        std::cout << "Status Code:" << status << std::endl;
        NvAPI_ShortString szDesc = { 0 };
        NvAPI_GetErrorMessage(status, szDesc);
        printf("NVAPI Error: %s\n", szDesc);

        return false;
    }

    return true;
}


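void setNVUstring(NvAPI_UnicodeString& nvStr, const wchar_t* wcStr)
{
    // Zero the whole buffer so the copied string is always terminated
    for (int i = 0; i < NVAPI_UNICODE_STRING_MAX; i++)
        nvStr[i] = 0;

    // Copy at most NVAPI_UNICODE_STRING_MAX - 1 characters so nvStr is never overrun
    int i = 0;
    while (wcStr[i] != 0 && i < NVAPI_UNICODE_STRING_MAX - 1)
    {
        nvStr[i] = wcStr[i];
        i++;
    }
}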

void initNvidiaApplicationProfile()
{
    NvAPI_Status status;

    // if status does not equal NVAPI_OK (0) after initialization,
    // either the system does not use an nvidia gpu, or something went
    // so wrong that we're unable to use the nvidia api...therefore do nothing
    /*
    if (!nvapiStatusOk(NvAPI_Initialize()))
        return;
    */

    // debugging variant with extra output; use the commented-out version above in prod
    if (!nvapiStatusOk(NvAPI_Initialize()))
    {
        std::cout << "Unable to initialize Nvidia api" << std::endl;
        return;
    }
    else
    {
        std::cout << "Nvidia api initialized successfully" << std::endl;
    }
        
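    // initialize session
    NvDRSSessionHandle hSession;
    if (!nvapiStatusOk(NvAPI_DRS_CreateSession(&hSession)))
    {
        NvAPI_Unload();
        return;
    }

    // load settings
    if (!nvapiStatusOk(NvAPI_DRS_LoadSettings(hSession)))
    {
        // release the session before unloading, as on the success path
        NvAPI_DRS_DestroySession(hSession);
        NvAPI_Unload();
        return;
    }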

    // check if application already exists
    NvDRSProfileHandle hProfile;
    
    NvAPI_UnicodeString nvAppName;
    setNVUstring(nvAppName, appName);

    NVDRS_APPLICATION app = {0};
    app.version = NVDRS_APPLICATION_VER_V1;

    // documentation states this will return ::NVAPI_APPLICATION_NOT_FOUND, however I cannot
    // find where that is defined anywhere in the headers...so not sure what's going to happen with this?
    //
    // This is returning NVAPI_EXECUTABLE_NOT_FOUND, which might be what it's supposed to return when it can't
    // find an existing application, and the documentation is just outdated?
    status = NvAPI_DRS_FindApplicationByName(hSession, nvAppName, &hProfile, &app);
    if (!nvapiStatusOk(status))
    {
        // if status does not equal NVAPI_EXECUTABLE_NOT_FOUND, then something bad happened and we should not proceed
        if (status != NVAPI_EXECUTABLE_NOT_FOUND)
        {
            NvAPI_DRS_DestroySession(hSession);
            NvAPI_Unload();
            return;
        }

        // create application as it does not already exist

        // Fill Profile Info
        NVDRS_PROFILE profileInfo = {0};
        profileInfo.version = NVDRS_PROFILE_VER;
        profileInfo.isPredefined = 0;
        setNVUstring(profileInfo.profileName, profileName);

        // Create Profile
        //NvDRSProfileHandle hProfile;
        if (!nvapiStatusOk(NvAPI_DRS_CreateProfile(hSession, &profileInfo, &hProfile)))
        {
            NvAPI_DRS_DestroySession(hSession);
            NvAPI_Unload();
            return;
        }

        // Fill Application Info, can't re-use app variable for some reason
        NVDRS_APPLICATION app2 = {0};
        app2.version = NVDRS_APPLICATION_VER_V1;
        app2.isPredefined = 0;
        setNVUstring(app2.appName, appName);
        setNVUstring(app2.userFriendlyName, appFriendlyName);
        setNVUstring(app2.launcher, L"");
        setNVUstring(app2.fileInFolder, L"");

        // Create Application
        if (!nvapiStatusOk(NvAPI_DRS_CreateApplication(hSession, hProfile, &app2)))
        {
            NvAPI_DRS_DestroySession(hSession);
            NvAPI_Unload();
            return;
        }
    }

    // update profile settings
    NVDRS_SETTING setting = {0};
    setting.version = NVDRS_SETTING_VER;
    setting.settingId = OGL_THREAD_CONTROL_ID;
    setting.settingType = NVDRS_DWORD_TYPE;
    setting.settingLocation = NVDRS_CURRENT_PROFILE_LOCATION;
    setting.isCurrentPredefined = 0;
    setting.isPredefinedValid = 0;
    setting.u32CurrentValue = threadedOptimization ? OGL_THREAD_CONTROL_ENABLE : OGL_THREAD_CONTROL_DISABLE;
    setting.u32PredefinedValue = threadedOptimization ? OGL_THREAD_CONTROL_ENABLE : OGL_THREAD_CONTROL_DISABLE;

    // apply the setting to the profile
    if (!nvapiStatusOk(NvAPI_DRS_SetSetting(hSession, hProfile, &setting)))
    {
        NvAPI_DRS_DestroySession(hSession);
        NvAPI_Unload();
        return;
    }

    // save changes
    if (!nvapiStatusOk(NvAPI_DRS_SaveSettings(hSession)))
    {
        NvAPI_DRS_DestroySession(hSession);
        NvAPI_Unload();
        return;
    }

    // disable in prod
    std::cout << "Nvidia application profile updated successfully" << std::endl;

    NvAPI_DRS_DestroySession(hSession);

    // unload the api as we're done with it
    NvAPI_Unload();
}

int main()
{
    // if building for anything other than windows, we'll need to not call this AND have
    // some preprocessor logic to not include any of the api code. No linux love apparently...so
    // that's going to be a thing we'll have to figure out down the road -_-
    initNvidiaApplicationProfile();
    
    std::cin.get();
    return 0;
}
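
The "preprocessor logic" mentioned in the comment in main() could look roughly like the sketch below. The macro names GAME_HAS_NVAPI_SDK and GAME_USE_NVAPI and the wrapper `tryInitNvidiaApplicationProfile` are hypothetical; the idea is just that non-Windows builds, or builds without the SDK, compile a no-op instead of the NVAPI code.

```cpp
// Hypothetical build switch: NVAPI code is compiled only on Windows builds
// that opt in by defining GAME_HAS_NVAPI_SDK (e.g. from the build system).
#if defined(_WIN32) && defined(GAME_HAS_NVAPI_SDK)
#define GAME_USE_NVAPI 1
#else
#define GAME_USE_NVAPI 0
#endif

#if GAME_USE_NVAPI
#include <nvapi.h>
#include <NvApiDriverSettings.h>
#endif

// Returns true only if the NVIDIA profile code was compiled in and invoked.
bool tryInitNvidiaApplicationProfile()
{
#if GAME_USE_NVAPI
    void initNvidiaApplicationProfile(); // defined as in the answer above
    initNvidiaApplicationProfile();
    return true;
#else
    return false; // other platforms, or builds without the SDK: do nothing
#endif
}
```

Calling `tryInitNvidiaApplicationProfile()` from main() instead of the function itself keeps the call site free of `#if` blocks, and the rest of the profile code would sit inside the same `#if GAME_USE_NVAPI` guard.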
Outlive answered 10/8, 2021 at 0:0 Comment(0)
