Triple Buffering
-
I find myself struggling to understand triple-buffering and v-sync in BMS.
I’m a software engineer by trade, so I understand the high-level concepts, just not sure how they’re being applied in BMS… or how they overlap or intersect with the various related settings from Nvidia drivers.
So, what exactly is the new triple-buffering checkbox in BMS 4.35… is it the same as, or similar to, the NVidia Fast Sync feature (which sounds a lot like forced triple-buffering under the hood)? (If I understand correctly, the setting NVidia refers to as “triple-buffering” is an OpenGL-only feature.)
Is there a reason to prefer one over the other (Nvidia fast-vsync vs BMS triple-buffering)? Does it cause problems if both are enabled (poor stability, or added latency) or is it merely redundant… kinda like the AA/AF settings can be controlled at app or driver level?
Does triple-buffering in BMS imply v-sync (tear-free buffer flipping) in the same way the v-sync=Fast setting in Nvidia control panel does?
Does one need to set the Nvidia v-sync option to “application-controlled” for the triple-buffering to have effect?
-
At first I wasn’t seeing any tearing, but I just played a mission with visible tearing… so I guess it’s not (exactly) the same thing as NVidia Fast Sync. (Although maybe it is, if you turn on both triple-buffering and v-sync in-game?)
-
I don’t know much about graphics settings, but 4.35 caused constant screen freezes for me after some time flying, which I fixed by disabling VSync. Also, my old and faithful TrackIR 5 began to crash, which was fixed by disabling triple buffering in the NVidia panel.
-
I haven’t experienced any crashing, but it does sound like maybe some systems are having trouble related to triple-buffering.
Mainly I’m just looking for ways to minimize my input-latency … I have a decent frame rate (about 90fps in clear weather) but still I feel a sense of PIO in situations like aero-braking on landing, and aerial refueling.
Using an iPhone app to record the time difference between a button press and its response appearing on-screen, I measured my typical latency in the 80-100ms range. That seems bad to me (about 8 frames!) and I’m trying to understand where all that input-lag is coming from.
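As a sanity check on that “about 8 frames” figure, here’s the arithmetic in Python (the numbers are my own assumptions: ~90 fps and the 90ms midpoint of that latency range):

```python
# Convert a measured end-to-end latency into "frames of latency".
# Assumed numbers: ~90 fps, and 90 ms as the midpoint of the 80-100 ms range.
fps = 90
frame_time_ms = 1000 / fps                      # ~11.1 ms per frame
measured_latency_ms = 90
frames_of_latency = measured_latency_ms / frame_time_ms
print(round(frames_of_latency, 1))              # -> 8.1
```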
ie. Is BMS queuing up frames to be rendered … and does triple-buffering reduce that?
Is the Nvidia driver queuing up frames pre-render … does the new Low-latency mode reduce that?
Without a better understanding of what these options do, under the hood, I’m left to just experiment with different combinations… so far nothing has made a measurable impact to my latency.
-
I got something like an occasional ‘tearing’ in 4.35 when scanning the outside view from the cockpit. Not sure if it was tearing; it was more like a wavy effect. I read about VSync and checked it in BMS options, and that wavy effect disappeared. Then I read about Triple Buffering in the BMS manual. It’s basically something like rendering an additional frame into a buffer. I unchecked VSync to uncap FPS, applied Triple Buffering in BMS options, and the wavy ‘tearing’ did not appear, and FPS could go back up above 60 (in my case). Best way to find out is to test/try.
-
This article has the best explanation and diagrams that I’ve seen so far… although I can’t get the video animations to play back at the moment.
https://www.nvidia.com/en-us/geforce/guides/system-latency-optimization-guide/
Although, after reading it a few times, I still can’t explain the difference between Low-Latency-Mode “On” and “Ultra”.
-
For those playing along at home … I just discovered FrameView – a new profiler tool from NVidia. Looks like it’s been in beta for a while but was just released a few months ago.
It has a very nice FPS overlay (with avg/p90/p99 percentiles), a counter for dropped-frames, and a millisecond timer to track frame latency, all the way from the Present() call to the final render / buffer flip.
It can record CSV log file snapshots, including all that data, and much more.
The data seems broadly in agreement with, but noticeably different from, the in-game [alt+c,f] FPS counter… I suspect because FrameView is showing a running average/percentile over the past 100 frames, while the in-game display is more of a periodic once-per-second snapshot.
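For the curious, here’s a rough Python sketch of what I assume a FrameView-style counter is doing: running avg/p90/p99 FPS over a sliding window of the last 100 frame times. (The function and the sample numbers are my own illustration, not FrameView’s actual code.)

```python
# Sketch (my assumption) of running FPS percentiles over a 100-frame window.
from collections import deque

def fps_stats(frame_times_ms, window=100):
    """Average / p90 / p99 FPS over a sliding window of frame times (ms)."""
    recent = deque(frame_times_ms, maxlen=window)
    ordered = sorted(recent)                          # ascending frame times
    avg = 1000 / (sum(ordered) / len(ordered))
    p90 = 1000 / ordered[int(len(ordered) * 0.90)]    # slow frames -> low FPS
    p99 = 1000 / ordered[min(int(len(ordered) * 0.99), len(ordered) - 1)]
    return avg, p90, p99

# 99 smooth frames at 10 ms plus one 20 ms hiccup:
avg, p90, p99 = fps_stats([10.0] * 99 + [20.0])
print(round(avg), round(p90), round(p99))             # -> 99 100 50
```

Note how a single slow frame barely moves the average but craters p99, which is exactly why the percentile view is more useful for spotting stutter.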
Anyway, I still don’t know what triple-buffering means, but I did at least learn that on my system, running in fullscreen mode makes a huge difference in lowering latency.
Hope this tool may help others get more mileage out of their older rigs…!
-
I do use Vsync (60) and TripleBuff, no problems… but I’m on AMD… so… just so you know.
-
Hi airtex2019, as you mentioned in post #4 you have pretty good frame rates. BMS is not a first-person shooter, so is input latency really that important?
I’m not an expert in PC games, just want to learn more about this stuff.
-
You control the aircraft with a joystick, so yes, latency is VERY important for proper control. I have found that even plugging in through a powered USB hub adds enough latency to be noticeable and to affect fine control, most obviously when you are leveling out on the horizon.
-
-
Hi airtex2019, as you mentioned in post #4 you have pretty good frame rates. BMS is not a first-person shooter, so is input latency really that important? I’m not an expert in PC games, just want to learn more about this stuff.
I can feel it in 2 situations… 1) aero-braking on landing, i.e. trying to hold the nose stable around 10-13 degrees AOA, and 2) trying to hold a tight formation, e.g. AAR refueling. I end up with a lot of PIO (pilot-induced oscillation), which can be a pretty bad effect in real-life fly-by-wire aircraft, so maybe it’s a fair aspect of the sim… lol
And (3) yeah, I kinda do need every millisecond I can get when an SA-6 pops up on my RWR.
But mostly I’m just trying to tune my rig… and with the new DX11 engine maybe others will benefit from learning some more about what works best for their setups, too.
I’m a software engineer, but I’ve been doing server-side stuff for 10+ years, so my graphics knowledge is way out of date… it pays to catch up and learn a little about this stuff.
-
Here is what I’ve learned about triple-buffering and v-sync while playing with NVidia’s excellent new FrameView tool and trying to tweak my 8-year-old rig to get the best possible performance…
Since starting this thread I eventually ended up upgrading my entire system, so now I’ve had the opportunity to play and test 4.35 on an older/slower box, and a newer/faster box.
Old and busted: Core i7-3770, GTX 1050 Ti, G-sync monitor 60-100hz
New hotness: Core i7-8700, GTX 1660 Ti, same G-sync monitor
…both Windows 10 20H2 with the latest NVidia drivers.
[Disclaimer: I’m not a deep expert on DirectX game development, so take all this with a grain of salt… if it sounds like I know what I’m talking about, remember, I really don’t! I’m just trying to learn how stuff works. Anyone feel free to jump in and correct me if I get anything wrong. I started writing this down to help clarify my own thoughts, in pursuit of lower latency and smoother rendering, and I decided to share it here in case it helps anyone else searching the archives someday.]
- Borderless-windowed vs Fullscreen-exclusive (to DWM or not to DWM)
The first thing to learn is there is a pretty big structural difference in the rendering pipeline, depending on whether you are running windowed/borderless (where DWM composition owns the final display buffer) vs running fullscreen-exclusive, where the app is flipping the frame buffers to be displayed, directly.
The Windows DWM has the effect of making every double-buffered DirectX app behave inherently like a triple-buffered app – at the v-sync interval, the DWM “reaches in” to the app’s frame buffers and blits (copies) the latest of the 2 completed frames into the DWM-owned front-buffer. This blit is pretty fast, about 1-2ms, so it’s akin to a buffer-flip.
https://en.wikipedia.org/wiki/Desktop_Window_Manager
So it would seem the in-app triple-buffering is redundant, or even slightly harmful, when running in windowed/borderless modes. (In my testing, it seems to add 1 full frame of latency … in addition to the +1 frame of latency added by DWM). Not broken, but definitely not helpful.
- V-sync On vs Off
Another implication of running under the DWM, is that there will never be tearing – as far as I can tell, the DWM always waits for the v-blank signal before updating its front-buffer.
Without DWM, the app controls the GPU buffers directly and flips them whenever a frame is completed. With v-sync OFF there will be tearing, but very high throughput and low latency. With v-sync ON there will be no tearing, but throughput and latency will suffer because the DX app has to pause, briefly, potentially every frame, to wait for the next v-blank signal.
But with triple-buffering enabled, there are 2 back-buffers and 1 front-buffer. (Visualize the 2 back-buffers as side by side, not sequential – funneling down into 1 front-buffer… like two lanes on a freeway merging into one.) This allows the CPU and GPU to work continuously, without pause – processing frames as fast as they’re able, without blocking to wait for the v-blank signal – because it is always safe to overwrite the older of the 2 back-buffers, without risk of tearing.
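Here’s a toy Python sketch of that two-lane merge. (This is my own illustration of the general technique, not how BMS or the driver actually implements it.)

```python
# Toy model of triple buffering: the renderer never blocks, because the
# newest completed-but-not-displayed frame can always be overwritten.
class TripleBuffer:
    def __init__(self):
        self.front = None      # buffer currently being scanned out
        self.ready = None      # newest completed frame, waiting to flip

    def finish_frame(self, frame):
        # Renderer finished a frame: it simply replaces any older pending
        # frame -- no blocking on v-blank, and no tearing risk, since the
        # front buffer is untouched mid-scanout.
        self.ready = frame

    def on_vblank(self):
        # At v-blank, flip in the newest completed frame, if there is one.
        if self.ready is not None:
            self.front, self.ready = self.ready, None
        return self.front

tb = TripleBuffer()
tb.finish_frame("frame 1")
tb.finish_frame("frame 2")     # frame 1 is discarded, never displayed
print(tb.on_vblank())          # -> frame 2
```

The cost is visible in that last line: “frame 1” was rendered but thrown away, which is why triple buffering trades some wasted work (and a frame of potential latency variance) for smoothness.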
For BMS, it would seem that enabling triple-buffering should “imply” v-sync also enabled – after all there isn’t any point in doing triple-buffering if you’re just going to flip in the middle of a scanout (tearing) anyway! Indeed, on the NVidia Control Panel, the v-sync setting is a three-state (Off|On|Fast) where “Fast” is their implementation of tear-free triple-buffering.
(Strangely, with TB=on and VS=off, I do see a little bit of tearing happening when I look around quickly. Not sure why. But it doesn’t quite look like “normal” tearing… it looks more blocky, and seems limited to the cockpit, not the outside terrain or clouds. I have no idea what that means. This blocky-tearing doesn’t happen with NVidia’s v-sync=Fast triple buffering, on my systems – only with the BMS in-app triple-buffering.)
Alternatively, turning v-sync=on (with triple-buffering=on) affects the pacing of the BMS game loop – as one would expect, it lowers overall CPU and GPU utilization, and caps framerate to the monitor’s refresh rate. But in doing so, that wipes out most of the benefit of triple-buffering.
- Bottlenecks
Even with high-rez clouds and textures, shadows enabled, max AF and AA, my old rig is pretty heavily CPU-constrained. (CPU frame-time ~18ms, GPU render-time ~6ms.) It was hard to gain any value out of triple-buffering.
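The bottleneck arithmetic here is simple: the slowest pipeline stage sets the frame-rate ceiling. (The 18ms/6ms figures are the ones I measured above; the calculation itself is just illustrative.)

```python
# Why the old rig is CPU-bound: the slowest stage caps the frame rate,
# so a faster GPU wouldn't help this machine.
cpu_ms, gpu_ms = 18.0, 6.0            # measured frame-time / render-time
max_fps = 1000 / max(cpu_ms, gpu_ms)  # ceiling set by the bottleneck stage
print(round(max_fps))                 # -> 56
```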
But my newer rig is able to push 100+ fps pretty consistently, and triple-buffering seems to have the effect of smoothing things out nicely, especially when running at a lower monitor refresh rate, eg 60hz.
On both my old and new rigs (without triple-buffering) I sometimes see a flicker or stutter… I don’t know if it’s a hardware or software glitch but it is visible in FrameView/PresentMon logs… a hiccup in the timing of every 87th frame. I have no idea if this is due to some important calculation BMS is performing in the background, or some sort of runtime garbage collection, or just a bug. But every 87th frame takes about 5-10ms longer than average… and a 7ms stutter is enough to drop p99 framerate from 100 to below 60! This happens on both of my Windows 10 rigs, regardless of DWM mode on/off, G-sync on/off, v-sync on/off, high or low refresh rates… it’s a mystery.
But! Somehow triple-buffering seems to be effective at smoothing that out. Partially due to randomizing the time-to-display latency of every single frame, by about 7ms… so a 7ms stutter once every 87 frames becomes entirely unnoticeable. But also it seems to change something fundamental in the game loop – I no longer see an obvious 5-10ms stutter between Present() calls, in the PresentMon logs, with TB enabled.
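To put numbers on that p99 claim, here’s a little Python back-of-the-envelope using synthetic frame times that match the scenario above: a ~10ms (100fps) baseline with a +7ms hiccup on every 87th frame.

```python
# Synthetic demo: one +7 ms hiccup every 87 frames drags p99 FPS from
# ~100 down to below 60, even though the average barely moves.
def p99_fps(frame_times_ms):
    ordered = sorted(frame_times_ms)
    slow = ordered[int(len(ordered) * 0.99)]    # 99th-percentile frame time
    return 1000 / slow

base_ms = 10.0                                  # ~100 fps baseline
frames = [base_ms + (7.0 if i % 87 == 0 else 0.0) for i in range(870)]
print(round(p99_fps(frames)))                   # -> 59
```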
CONCLUSIONS
In general, the technique of triple-buffering is sound when you have upstream components capable of producing frames much faster than they can be consumed downstream (ie. a fast CPU and GPU, and low refresh-rate monitor). So, if you have a fairly modern, mid-to-high-end gaming PC, and are content with a modest ~60hz monitor refresh rate, it’s worth giving triple-buffering a try.
For me, either the in-app triple-buffering or NVidia’s v-sync=Fast setting really seems to smooth out the microstutters and hiccups I get in other modes, and lowers my overall average input-latency. (But, on my system, NVidia’s Fast Sync has less tearing/jaggies, and also seems to have less impact on latency when alt+enter swapping over to borderless-windowed mode.)
But if you have an older PC that’s not delivering 60fps… or a newer PC that you’re trying to max out the framerate, driving a 120hz or higher monitor… triple-buffering is probably just going to get in the way.
Lastly… if you aren’t trying to squeeze every last millisecond, and you want to avoid tearing, while also enjoying the convenience of running in windowed/borderless mode (eg. easy alt-tab’ing) then definitely leave triple-buffering OFF and just let Windows DWM do its thing… it’s essentially forced triple-buffering.
FOOTNOTES
a) The “triple buffering” setting in the NVidia control-panel appears to be only relevant for OpenGL games – not applicable to Falcon BMS at all.
b) The “v-sync” setting in NVCP only applies when running fullscreen-exclusive (this includes the “fast” and “adaptive” v-sync modes) and effectively acts as an override to whatever the in-app settings may be. Note that v-sync=Fast is, essentially, NVidia’s implementation of triple-buffering.
c) Don’t forget you can switch between borderless-window and fullscreen-exclusive at any time with [Alt+Enter].
-
c) Don’t forget you can switch between borderless-window and fullscreen-exclusive at any time with [Alt+Enter].
…if BMS doesn’t crash… oh, it does work, just not every time :twisted: (in 3D; on the 2D map it works… can’t say that it ever crashed) – but I’m on AMD, so…
-
@airtex2019…
the triple-buffering / v-sync overkill.
It’s so simple, and yet so complicated…
If I had to come to a conclusion after your marvelous post (which many will read, few will understand, while people keep saying “why do I get this, but I did this, and they say that” and so on and so forth), it would be:
try them, and stick with the fluent, smooth experience; don’t focus on it, live with it.
-
If you are on AMD, just lock your FPS with Chill to your monitor’s refresh rate; works like a charm.
-
That is a good Quadro card, which handles most of the visual work for your screen by itself and doesn’t need time from your computer’s CPU to aid it (plus the transfer time). It makes round things round on your screen without help from the CPU, etc.
DX cards use the computer’s CPU to do the same if they can’t keep up.
That’s just the simplified short story.
-
And with a G-Sync monitor, what is the best config? I know that it’s different from V-sync, but what is your experience?
Sent from my SM-A530F using Tapatalk
-
And with a G-Sync monitor, what is the best config? I know that it’s different from V-sync, but what is your experience?
Depends on your computer. If you set a lower refresh rate on demand, you’ll probably get more stable output on the monitor.
-
And with a G-Sync monitor, what is the best config? I know that it’s different from V-sync, but what is your experience?
I have a G-sync monitor too… did a lot of testing with it, enabled and disabled. In the context of triple-buffering, I found G-sync doesn’t add a lot of additional value. (Although the opposite is true: if you don’t have G-sync or Freesync, then TB or NVidia Fast Sync is probably worth trying.)
Fundamentally, the way G-sync interacts with DWM or fullscreen-exclusive triple-buffering isn’t all that different from how it works without G-sync. There is still a v-blank signal; it’s just that the OS or the app now has a much wider window of time in which to scan out the next frame to the monitor. But with TB enabled (and assuming your CPU/GPU are running faster than your monitor’s refresh rate, which is the whole point) there will almost always be a new frame ready and waiting at the start of the sync window, so G-sync doesn’t end up being much of a factor.