Possible help for the micro-stutters
-
@airtex2019 Thanks for that, I wasn’t sure what that column entry meant!
-
Some questions/remarks WRT the “casual frame stutters” that IIRC you have always had and that, even though reduced (good to hear that), are apparently still there:
-
Why are you running with V-Sync, at all? I mean today when the Nvidia drivers support FPS limiting out of the box, isn’t V-Sync a bit of an overkill that may or may not cause issues WRT frame timings?
-
4K resolution with a 1660 Ti GPU? Hmmm, even for a relatively “lightweight GFX” experience as 4.36 is, under some conditions it may simply become too heavy, no? For example that Benchmark TE: unless we are talking about the Rendering thread always being above 60 FPS, I wouldn’t think this GPU fits “4K demands”, in general. I wonder if the system will behave the same at e.g. FHD resolution with V-Sync disabled.
-
Are you running with RTT displays enabled? Because I’m fairly sure that can sometimes cause frame glitching. Copying the RTT surfaces via D3D11 mapping is a blocking operation (the D3D11 context isn’t thread safe, at least not the immediate context object), and even though we are talking about relatively small textures, copying stuff back from GPU VRAM to system RAM is the slowest operation (GPUs are wild animals that like being pushed forward, but don’t like to hold and read stuff back to system RAM), so I can imagine it sometimes causing micro frame-stutters due to potential pipeline stalls.
-
-
@I-Hawk said in Possible help for the micro-stutters:
Why are you running with V-Sync, at all? I mean today when the Nvidia drivers support FPS limiting out of the box
1 – I dunno it just seems logical, to want to flip frames when the monitor is ready, and thus drive the whole loop on that 60hz clock signal.
With fps-cap of 60 but v-sync off… isn’t there a chance of flipping (or blitting to DWM buffer) in the middle of scanout? I suppose not, if triple-buffer enabled. I’ll give it a try.
2 – At 4k x 60 fps my 1660 Ti runs about 75-85% in most weather conditions. 16xAF and 4xAA, all the default BMS graphics options (viz. basically everything except shadows-on-smoke)… default trees and grass sliders. 100-degree FOV. Cooling pretty good, temps stay around 60C.
The new storm-clouds and rain effects in 4.36 seem to max it out to 100% tho, and temps begin to climb near 80C. There’s some added CPU cost there too tho, so not sure how badly GPU constrained I really am. But it’s definitely close to the limit, yeah… now that prices have normalized I’ll start shopping … maybe there’ll be black friday sale on a 3060 Ti or 3070.
3 – no RTT overlays running for these tests… I do sometimes play around with it, but haven’t since making this C-state change. Oh, I do still have the cfg set to export them tho…
set g_bExportRTTTextures 1
Does that incur the blocking GPU-readback operation even without RTT running? Oops.
-
@airtex2019 said in Possible help for the micro-stutters:
set g_bExportRTTTextures 1
Does that incur the blocking GPU-readback operation even without RTT running? Oops.
Yes, set g_bExportRTTTextures 1 will cause the textures to be exported, regardless of whether RTT is active or not.
Regarding a new GPU, I’d recommend, for the sake of the future, buying something stronger, especially for 4K res (I would try to go 3080 and above, in fact).
-
@airtex2019 said in Possible help for the micro-stutters:
1 – I dunno it just seems logical, to want to flip frames when the monitor is ready, and thus drive the whole loop on that 60hz clock signal.
With fps-cap of 60 but v-sync off… isn’t there a chance of flipping (or blitting to DWM buffer) in the middle of scanout? I suppose not, if triple-buffer enabled. I’ll give it a try.
As a quick test in TR#3 … I do still like the 60hz v-sync+triple-buffer experience better than 60 fps-cap + triple-buffer. No tearing or glitching in either case, but I see less visible stutter with v-sync.
It’s entirely subjective, unfortunately, I don’t see any difference in the presentmon logs and it’s hard to reason why it would be visibly different. Looking at msBetweenPresents… in both cases, after frame N comes in 7ms late, frame N+1 comes in about 7ms early, keeping everything back in phase.
Pure conjecture… but I suspect it’s because presentmon records the Present() call timestamp before the fps-cap delay is imposed on the thread.
So it looks like N+1 comes in early to compensate but then it gets blocked immediately after that metric is recorded.
As opposed to v-sync mode, where the block doesn’t happen until after the Present call completes… the thread waits on the v-blank clock signal (or doesn’t block at all, if the v-blank event has already signaled). But again, just conjecture, not sure how to test or validate that.
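For anyone who wants to look for that "late frame, then early frame" pattern without graphing, here is a rough Python sketch. It assumes you have already pulled the msBetweenPresents values into a list; the function name, the 3 ms tolerance and the sample numbers are all made up for illustration, not taken from a real log.

```python
def find_compensated_stutters(ms_between_presents, target_ms=1000 / 60, tol_ms=3.0):
    """Return indices i where frame i arrives late and frame i+1 arrives
    early by roughly the same amount (within tol_ms)."""
    hits = []
    for i in range(len(ms_between_presents) - 1):
        late = ms_between_presents[i] - target_ms        # > 0 means frame i was late
        early = target_ms - ms_between_presents[i + 1]   # > 0 means frame i+1 was early
        if late > tol_ms and early > tol_ms and abs(late - early) <= tol_ms:
            hits.append(i)
    return hits


# Hypothetical values: steady 60 Hz with one ~7 ms late / ~7 ms early pair.
sample = [16.7, 16.6, 23.7, 9.7, 16.7, 16.8]
print(find_compensated_stutters(sample))
```

This only flags the pairs; it can't tell whether the "early" frame was really shown early or just recorded early, which is the open question above.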
-
@airtex2019 Hmmm… my frametime seems to be around 16-18ms for the Benchmark TE but I get an occasional one >20ms. To be honest, I don’t watch the game while testing, so I can’t tell whether I would’ve noticed it. The TE_14 test is below 10ms but again I get quite a few >20ms, about 6% of the total frametime entries. Oddly enough, for the Benchmark TE, although the average frametime is higher, the share of entries above 20ms is only 1.5%.
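In case it helps compare runs, this is roughly how a "percent of entries above 20ms" figure can be computed in Python. It is only a sketch: the function name is my own, and the sample list is invented to give a 6% result, it is not real log data.

```python
def share_above(frametimes_ms, threshold_ms=20.0):
    """Percentage of frametime entries strictly above threshold_ms."""
    if not frametimes_ms:
        return 0.0
    slow = sum(1 for ft in frametimes_ms if ft > threshold_ms)
    return 100.0 * slow / len(frametimes_ms)


# Hypothetical run: 94 fast frames and 6 slow ones.
sample = [9.0] * 94 + [22.0] * 6
print(f"{share_above(sample):.1f}% of frames above 20 ms")
```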
-
@Atlas you have to graph it to see the periodic 7ms stutter I’m referring to. and not sure it would show up clearly on Benchmark TE, there’s a lot going on there! I usually just drop into TR#3 for quick test on a partly cloudy day, with minimal load from things like AI wingmen. and practice my overhead-break at Kunsan.
This is what the msBetweenPresents graph looks like, on my rig.
-
@airtex2019 interesting… do you have a before/after graph of your stutters? How do you “see” these in the game? I wonder if it’s more visible on a 60Hz screen but less so on a faster-refresh screen?
Seems like I have these too…
What I do notice though is that your “normal” is right about in the middle between your min and max frametimes whereas mine is about 25-30% from the min frametime. Maybe that’s a factor?
-
@Atlas yeah that’s because your “normal” is so fast … looks like about 8ms? there’s not much slack, for the CPU to catch up… may take a few frames
my normal is 16.7ms (because 60hz v-sync) but the [alt+C][F] overlay shows my CPU time per frame is only about 5-10ms typically. So that leaves me ~7-12ms of slack
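The slack arithmetic, spelled out as a tiny Python sketch (the names are just for illustration; the 5 and 10 ms CPU times are the rough overlay readings mentioned above):

```python
REFRESH_HZ = 60
frame_budget_ms = 1000 / REFRESH_HZ  # about 16.7 ms per frame at 60 Hz


def slack_ms(cpu_ms, budget_ms=frame_budget_ms):
    """Time left in the frame budget after the CPU work for that frame."""
    return budget_ms - cpu_ms


for cpu_ms in (5, 10):
    print(f"CPU {cpu_ms} ms -> slack {slack_ms(cpu_ms):.1f} ms")
```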
-
@airtex2019 depends. On the Benchmark TE, like I said, it’s about 16-18ms.