GPU BOUND Hitching (frametime disparity w/ higher GPU load)

Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
Hi everyone,
Being a stickler for flawless performance, I was very excited about this new patch. First impressions are great: there is less hitching, and I am 100% positive that texture memory problems are now a thing of the past. The precaching and new memory management definitely allow high textures with lower VRAM and no stuttering. Great stuff!

But of course there is still work to be done, and some odd hitches still exist. Following a cue from the CDT, I am going to document my experience with hitching, specifically hitching of a certain variety.

GPU Bound Hitching (High GPU utilization = worse, longer, increased bad frames)

I created two recordings to show what I am talking about: a 60-second demo/benchmark recorded with a 60 fps cap and one with a 200 fps cap. The graphical settings are the same in each demo (everything maxed @ 1080p, Vsync off).

60 fps cap frame time graph: GPU Utilization at 30-40%
fps_cap_606yjiz.png
200 fps cap frame time graph: GPU Utilization at 90-100%
fps_cap_2001mjdx.png

Things to notice:

1. With a 200 fps cap the frame times are much more erratic. Although the average FPS is high (134 / 7,56 ms), the average second is filled with frames of great disparity in rendering time. The subjective experience feels equally erratic.
2. With a 60 fps cap the frame times stay very close to the base 16,6 ms. Most stay around that frame time, with few dips beyond 20 ms, if any. The frame time is concentrated around the average (a forced average here).
Extra interesting Bench
Purposefully reduced GPU load by turning down settings. Ambient occlusion and shadows set to off. Otherwise same as above.

200 fps cap frame time graph: GPU Utilization at 60%
fps_cap_200_low7huiz.png

Things to notice:
1. The normal frame times concentrated around the average (166 fps / 6,02ms). Few if any dips beyond 20 ms or even 16,6 ms. Subjective feeling is of similar smoothness to a forced cap of 60 at higher settings.
Conclusions?

Something in NS2 is changing the per frame GPU load radically. 1 frame could easily be rendered out in 4,2 ms and the next can be 12,5 or even beyond 20,0 ms. This level of per frame erratic load seems to increase the more you make yourself GPU bound, aka, increasing your GPU utilization. This could mean turning up settings.
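
To put a number on that per-frame disparity, here is a minimal sketch in Python. It assumes the frametime log stores a running timestamp per frame (which is how I understand the Fraps frametimes CSV to work) rather than a per-frame duration. The sample timestamps are made up to mimic the 4,2 / 12,5 / 20+ ms pattern described above; they are not taken from the actual recordings.

```python
import statistics

def frame_times(timestamps_ms):
    """Per-frame durations from cumulative timestamps (ms)."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

def summarize(times_ms, budget_ms=1000.0 / 60.0):
    """Average, spread, and number of frames slower than the budget."""
    mean = statistics.fmean(times_ms)
    return {
        "avg_ms": mean,
        "avg_fps": 1000.0 / mean,
        "stdev_ms": statistics.pstdev(times_ms),
        "worst_ms": max(times_ms),
        "over_budget": sum(1 for t in times_ms if t > budget_ms),
    }

# Hypothetical timestamps, not real measurements.
timestamps = [0.0, 4.2, 16.7, 20.9, 41.4, 45.6]
times = frame_times(timestamps)
stats = summarize(times)
print(stats)
```

The point of reporting the standard deviation and the over-budget count next to the average is that two runs with a similar average fps can differ a lot in those two numbers, which is exactly what the graphs show.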

An interesting thing to look at, if possible, would be the GPU utilization per frame. For example, compare a frame that took 4,2 ms with the following frame that took 12 ms in the scenario where you are GPU bound. Did that 12 ms frame take 12 ms while the GPU was at 90% utilization? Or was there a sudden drop in GPU utilization, signifying something else?

What is causing the per-frame load to be so radically different all the time? Mind you, this same behavior shows up under any amount of CPU load (in a big server, empty server, whatever).

OK, hope this info helps anyone! I know Iron Horse will like this post at least!

Also, if anyone needs the demo files for each recording, I have them.

Comments

  • WasabiOneWasabiOne Co-Lead NS2 CDT Join Date: 2011-06-15 Member: 104623Members, NS2 Developer, NS2 Playtester, Squad Five Gold, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Gold, Reinforced - Diamond, Reinforced - Shadow, WC 2013 - Shadow, Subnautica Playtester, Pistachionauts
    Thank you for the detailed information, matso eats this stuff like candy! Also please remember this is just the first piece of the larger project; your comments and data help a lot.
  • ZEROibisZEROibis Join Date: 2009-10-30 Member: 69176Members, Constellation
    Does this also mean that we could get a smoother experience right now by setting our fps cap to something like 100 or right at our monitor refresh rate like 85fps for example instead of allowing it to go higher?
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    ZEROibis wrote: »
    Does this also mean that we could get a smoother experience right now by setting our fps cap to something like 100 or right at our monitor refresh rate like 85fps for example instead of allowing it to go higher?

    It could mean something like that. Although, since I have no idea how to look into the engine to see what is happening, it is hard to say whether what I point out correlates to something real. It is just something that happens.


    I have a 120hz monitor and although I would love to cap at 120fps, with High GPU utilization the frames are too erratic. Hence why I am currently capping at 60 even though only like 30-40% of my GPU is being used.

    I suggest benchmarking your game's performance and posting it in here, and I can help advise you!
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    edited August 2014
    Can you reproduce the poor graph with AO enabled but without shadows??
    AO is expensive and seems to add input delay as a result, so it wouldn't surprise me

    Glad you aren't seeing the hitching anymore, and to add better news: Not all the hitches are fixed, (there's some rifle UI display ones etc) but the rest are known and being worked on.
    Feel free to report any here though.

    Thanks dictator :)


    p.s. I am pretty sure that any method of capping fps to lower than the fps you would otherwise be experiencing can lead to input delay in exchange for smoother frame times.
    Essentially mimicking the downsides of Vsync without the upsides (if the number is different from your monitor's refresh rate)
    At least this is what i recall..

  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    IronHorse wrote: »
    Can you reproduce the poor graph with AO enabled but without shadows??
    AO is expensive and seems to add input delay as a result, so it wouldn't surprise me

    Glad you aren't seeing the hitching anymore, and to add better news: Not all the hitches are fixed, (there's some rifle UI display ones etc) but the rest are known and being worked on.
    Feel free to report any here though.

    Thanks dictator :)


    p.s. I am pretty sure that any method of capping fps to lower than the fps you would otherwise be experiencing can lead to input delay in exchange for smoother frame times.
    Essentially mimicking the downsides of Vsync without the upsides (if the number is different from your monitor's refresh rate)
    At least this is what i recall..
    Based on your suggestion I benched a ton of different GPU limited/ non-GPU limited scenarios.
    At your request:
    All settings high, 1080p, shadows off / AO high, FPS cap of 200
    fps_cap_200_aobrrum.png

    same as above, FPS cap 120
    fps_cap_120_ao2bra0.png

    Then a test for the shadows

    All settings high, 1080p, shadows on / AO off, FPS cap of 200
    fps_cap_200_shadowsb9ysb.png

    same as above, FPS cap 120
    fps_cap_120_shadowsa3zui.png

    Conclusions
    1. Fraps is reading the fps cap, but it is also reporting frames above it. This is probably due to where Fraps reads the frame time in the pipeline. I imagine turning on Vsync would prevent Fraps from reading frames above the cap.
    2. Both results (AO on/off, shadows on/off) show a much tighter spread around the average than the 200 fps cap / poor frames graph from my first post (everything on). This perhaps implies a connection to what I am describing.

    I will post more of the different benches I made which have very interesting results when I have a break during work.
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    IronHorse wrote: »
    p.s. I am pretty sure that any method of capping fps to lower than the fps you would otherwise be experiencing can lead to input delay in exchange for smoother frame times.
    Essentially mimicking the downsides of Vsync without the upsides (if the number is different from your monitor's refresh rate)
    At least this is what i recall..
    While you would have the downside of screen tearing, you would not have the Vsync downside (triple buffered or otherwise) of the frame rate halving or dropping by 1/3 whenever a frame takes longer than 16,6 ms or 8,3 ms.

    The best way to actually get a smooth framerate, with little tearing and no Vsync problems, is to cap your framerate externally with Nvidia Inspector and then turn on Vsync in the driver using the adaptive setting.

    This turns Vsync on for all frames which meet or exceed the targeted framerate/refresh rate, but turns it off for all frames which fall below the target. This means tear-free output at the refresh rate, but tearing when frames drop. It is a much better solution for the input lag problems associated with Vsync than triple or double buffering.
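
As a rough illustration of the difference described above, here is a small Python sketch of a simplified presentation model. It is an assumption-laden toy, not how any driver actually works: "vsync" is modelled as plain double buffering where a frame can only be shown on a refresh boundary, and "adaptive" as synced when the frame fits in one refresh and unsynced (tearing) otherwise. The function name and mode strings are made up for this example.

```python
import math

def displayed_interval_ms(render_ms, refresh_hz, mode):
    """How long a frame occupies the screen under a simplified model."""
    refresh_ms = 1000.0 / refresh_hz
    if mode == "vsync":
        # Wait for the next refresh boundary: the interval is always a
        # whole multiple of the refresh period (60 -> 30 -> 20 fps ...).
        # The tiny epsilon keeps exact multiples from rounding up.
        periods = max(1, math.ceil(render_ms / refresh_ms - 1e-9))
        return periods * refresh_ms
    if mode == "adaptive":
        # Synced while the frame fits into one refresh, otherwise shown
        # as soon as it is ready (with tearing).
        return refresh_ms if render_ms <= refresh_ms else render_ms
    raise ValueError(f"unknown mode: {mode}")

# An 18 ms frame on a 60 Hz display:
print(displayed_interval_ms(18.0, 60, "vsync"))     # two refresh periods
print(displayed_interval_ms(18.0, 60, "adaptive"))  # the raw 18 ms
```

In this model a frame that misses the 60 Hz budget by a little over 1 ms costs a whole extra refresh with plain Vsync, but only its own render time with the adaptive setting.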

    More benches which show interesting results
    Going on the assumption that the cause was not a specific setting (like AO or shadows, as the above tests show) but rather how GPU bound you are, I ran resolution-bound tests. This lets my GPU become the bottleneck without running the questionable AO effect.

    All settings high, 1080p, shadows on / AO off, FPS cap of 120
    fps_cap_120_shadowsa3zui.png

    All settings high, 1440p, shadows on / AO off, FPS cap of 120
    fps_cap_120_1440pmzar7.png

    All settings high, 4K, shadows on / AO off, FPS cap of 120
    fps_cap_120_4k6kbok.png


    1. The number of disparate frametimes increases the more GPU bound one is. With each increase in resolution the GPU utilization increased, and the behavior is similar to what is seen with an uncapped framerate and all settings cranked.
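
For a sense of how much extra GPU work each resolution step adds, here is a quick Python calculation of pixel counts relative to 1080p. It assumes "4K" means 3840x2160 and that per-frame GPU cost scales roughly with pixel count, which is only a first approximation.

```python
# Assumed pixel dimensions for each test resolution.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "4K": (3840, 2160),
}

def relative_pixel_load(name, baseline="1080p"):
    """Pixels per frame at `name`, as a multiple of the baseline."""
    w, h = RESOLUTIONS[name]
    bw, bh = RESOLUTIONS[baseline]
    return (w * h) / (bw * bh)

for name in RESOLUTIONS:
    print(f"{name}: {relative_pixel_load(name):.2f}x the pixels of 1080p")
```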

    Interesting extra bench

    To try and check whether there is an inherent problem with the AO I tested all settings on high but at a lower resolution to get rid of my GPU boundedness.

    All settings high, 720p, shadows on / AO high, FPS cap of 120
    fps_cap_120_ao_720py1y8q.png

    1. Take notice of the relatively stable frametimes. The subjective experience was also buttery smooth.
  • ZEROibisZEROibis Join Date: 2009-10-30 Member: 69176Members, Constellation
    For reference I was assuming vsync off as tearing does not happen on my crt. I assume that setting it lower would not cause any input delay as the connection is purely analog. Or is the delay occurring within the game itself?
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    ZEROibis wrote: »
    For reference I was assuming vsync off as tearing does not happen on my crt. I assume that setting it lower would not cause any input delay as the connection is purely analog. Or is the delay occurring within the game itself?

    The delay occurs at the level where the driver interacts with the game. The driver paces what the GPU puts out, perhaps delaying and even discarding "runt" frames, i.e. frames way below the budgeted frame time. There is math to the pacing, of course.

    Limiting FPS typically does not increase input lag beyond the allotted frame time, i.e. 16,6 ms for a 60 fps limit.
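
To make that bound concrete, here is a minimal sketch of a sleep-based frame limiter in Python. It is a generic illustration, not how NS2 or any driver-level limiter is implemented; the class name and the injectable `clock`/`sleep` parameters are choices made for this example so it can be tested without real waiting.

```python
import time

class FrameLimiter:
    """Caps the frame rate by sleeping until the next frame deadline."""

    def __init__(self, fps_cap, clock=time.perf_counter, sleep=time.sleep):
        self.interval = 1.0 / fps_cap
        self.clock = clock
        self.sleep = sleep
        self.deadline = clock() + self.interval

    def wait(self):
        """Call once per frame after rendering; returns seconds slept."""
        now = self.clock()
        slept = max(0.0, self.deadline - now)
        if slept > 0.0:
            self.sleep(slept)
        # Schedule from whichever is later, so a slow frame is not
        # "made up" by a burst of fast frames afterwards.
        self.deadline = max(self.deadline, now) + self.interval
        return slept
```

With this scheme a finished frame is never held back for longer than one interval (about 16,6 ms at a 60 fps cap), and a frame that already overran its budget is not delayed at all.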
  • CLARK_KENTCLARK_KENT Vancouver, Canada Join Date: 2002-11-21 Member: 9508Members, Reinforced - Silver
    Dictator93 wrote: »
    The best way to actually get a smooth framerate, with little tearing, and with no Vsync problems is to cap your framerate externally with Nvidia inspector, then to turn on Vsync in the driver but to use the adaptive setting.

    I've also noticed what you're describing in this build. It felt like constant hitching... or constant micro rubber-banding. Through experimentation, I came to the same general finding you did with regards to adaptive vsync. That helped me quite a bit.

  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    edited August 2014
    @Dictator93‌
    It's my understanding that if you start drawing an old image to the screen, (because you've thrown away frames) and a newer one is created, the newer one is ignored to maintain the same frames per second. As a result, you get input lag, as old frame data is used.

    "External framerate limiters of any kind adds lag in current 3D architectures." -Blur Busters

    It makes sense to me when i see graphs of buttery smooth frametimes when your FPS is being capped below the FPS you average without it, since the frames are being delayed or thrown away and essentially your GPU rests, or at least makes you no longer GPU bound.
    So yea.. GPU bound systems having poor frametimes makes sense when compared to the results of frame limiting.

    What about reproducing the issue with no frame limiting but also not being GPU bound? (the only way to be sure is that r_stats reports not waiting on GPU whatsoever)

    *Prays to the G-Sync Gods to deliver a monitor on my desk*

    edit: Also, way more accurate than fraps:
    http://www.anandtech.com/show/6862/fcat-the-evolution-of-frame-interval-benchmarking-part-1
    http://www.geforce.com/hardware/technology/fcat
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    I want that monitor, gimme the card to go with it!

    I should REALLY get to sleep. I did a few tests just now, but it didn't go quite how I planned with the logs, so I shall do some frametime runs myself tomorrow. I cannot even get close to 200 fps unless I disable a gazillion settings, but that's a bridge for tomorrow to cross.

    Thinking about how to see if I'm GPU bound at a certain moment. I mean, I can see it just fine with certain overlays, but I also want to record it and Fraps didn't do it... hmm.
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    Yeah, I would love to set up FCAT (FCAT did great things for multi-GPU drivers), but it does take some time. I will look into it.

    And what you are saying about purposefully delaying a frame is true, but at the ms rates we are talking about (8,3 for 120 fps) that would obviously be pretty hard to notice. We all just need to wait for adaptive displays to become better. I for one really want a 1440p IPS G-Sync monitor. Tired of TN panels...

    I think my purpose in showing that disparate frames increase the more GPU bound one is, is to point out how odd/broken the workload in NS2 apparently is. I am not sure it is by design or correct that the milliseconds per frame can be so erratic when the game environment is not changing radically. If the game environment were changing rapidly and radically per frame, then I would say that the load disparity makes sense (even then, I would be reluctant to say it is normal for any game to do that). But currently, just walking around a map induces these random and constant swings in per-frame rendering time.


    Here are two graphs following the new scheme:

    GPU Bound, everything High (AO / Shadows on) @ 1080p w/ no external frame-limiting (NS2 caps at 200)
    gpu_boundlclko.png

    CPU Bound, everything High (AO / Shadows on) @ 720p w/ no external frame-limiting (NS2 caps at 200)
    cpu_boundmczwd.png

    Proof of the two scenarios (since taking a screen in NS2 would compromise the framerate for that second):
    boundedosac3.png


    1. The subjective difference in smoothness is extremely apparent even though the average framerates are very similar (both are even higher than my refresh rate of 120 Hz: 137 fps and 154 fps respectively).
    2. Something IS UNDENIABLY occurring in NS2, probably not related to pre-caching, that causes either GPU stalling when running around or radically changing GPU load. I am positive standing still for 60 seconds would lead to a flat-line graph. Still, the change in visuals per frame does not justify such crazy behavior. Pretty sure it is not intended behavior.
    3. I for one think it has something to do with drawing new geometry or loading different map sections. The experience of a hitch is eerily similar to the hitch that used to occur when exiting a gorge tunnel. But I could of course be completely off base there.

    If anyone wants the demos for each of these recordings, I would happily provide them. Furthermore, I could p_log them too for the sake of data.
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    Awesome! Thanks dictator
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    Unfortunately I cannot reproduce this at the moment. Possibly because I tend not to overbudget so much that I get 200 fps. :P (And if I lower settings, I don't load my GPU enough, it seems.)
    I still wanted to look at my plog however..

    I can however share what I wanted to do with you.

    * bind the Fraps benchmark/video & the plog capture all to the same key. This will make them start at roughly the same instant, only a few frames apart.
    - confirmed if I bind them to F11 that the vid, fpstimes and plog all start.

    * compare plog to frametimes.
    - unable to do due to plog issues.. (being discussed)

    * compare frametimes to vid stutters.
    - video failed, as in stopped recording after 4 seconds.. :( (however I used the vid trick before so I know it can be done)


    My plan was to look for anything in the frametime log, video and plog which did not match up and compare. For the vid this may be done by a frame-by-frame comparison.
    (hence the need to start them as close together as possible)

    perhaps @Dictator93 can do such tests with his logs, as he experiences the problem.
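
A minimal sketch of the frametime side of that plan, in Python: it pulls the spike frames out of a frametime log together with their start times, so they can be looked up in the video or the plog. It assumes the log stores a running timestamp per frame, that all recordings started on the same key press, and that the video was captured at a fixed frame rate. The function names, the 2x-median threshold and the 60 fps video rate are all choices made for this example.

```python
import statistics

def find_spikes(timestamps_ms, factor=2.0):
    """Return (frame_index, start_ms, duration_ms) for every frame that
    took more than `factor` times the median frame time."""
    durations = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    median = statistics.median(durations)
    return [
        (i, timestamps_ms[i], d)
        for i, d in enumerate(durations)
        if d > factor * median
    ]

def video_frame_at(start_ms, video_fps=60.0):
    """Index of the recorded video frame on screen at a given log time,
    assuming both recordings started at the same instant."""
    return int(start_ms * video_fps / 1000.0)

# Hypothetical log: steady 5 ms frames with one 20 ms hitch.
log = [0.0, 5.0, 10.0, 15.0, 35.0, 40.0]
for index, start, duration in find_spikes(log):
    print(index, start, duration, video_frame_at(start))
```

Since the start offsets between the recordings are only approximately equal, the video frame index is a place to start looking rather than an exact match.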
  • matsomatso Master of Patches Join Date: 2002-11-05 Member: 7000Members, Forum Moderators, NS2 Developer, Constellation, NS2 Playtester, Squad Five Blue, Squad Five Silver, Squad Five Gold, Reinforced - Shadow, NS2 Community Developer
    edited August 2014
    Dictator93 wrote: »


    Something in NS2 is changing the per frame GPU load radically. 1 frame could easily be rendered out in 4,2 ms and the next can be 12,5 or even beyond 20,0 ms. This level of per frame erratic load seems to increase the more you make yourself GPU bound, aka, increasing your GPU utilization. This could mean turning up settings.

    I believe what you are seeing are probably related to how NS2 avoids having too many screens buffered in the video driver - it normally limits it to one full outstanding screen.

    So if you get GPU limited, ie NS2 manages to produce render data faster than the GPU can process it, the main render thread in NS2 simply sits in a wait loop until the GPU says "1" when asked for how many screens it has buffered.

    This ensures that your user input is always based on no more than one frame old data.

    In the p_log, this waiting is visible as "WaitForBufferedFrames" or something, so if you can get your PerfAnalyser going, I'm predicting you will see a lot of that (the other possibility is "D3D9::Present", where the DRIVER blocks NS2 from pushing too much data, but... I'm betting on "Wait...").

    Detecting that you are pushing too much data to the card and spreading out the load rather than cutting it hard might improve things a bit - especially now that there are so many threads running and doing useful things, spending CPU to produce more render data than the card can handle seems silly.

    Really, are we discussing frame quality for NS2 at 100+ fps? Need to check for flying pigs...
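
The wait-loop behaviour described in this post can be sketched as a small timeline simulation in Python. This is a toy model built only from the description above (the render thread submits a frame, then blocks until at most one frame is still outstanding on the GPU); it is not NS2 code, and the function and parameter names are made up.

```python
def simulate(cpu_ms, gpu_ms_per_frame, max_buffered=1):
    """Return the render thread's wait time (ms) after each frame.

    cpu_ms: time the render thread needs to build one frame.
    gpu_ms_per_frame: GPU cost of each frame, in submission order.
    """
    gpu_done = []   # time at which the GPU finishes each frame
    waits = []
    now = 0.0
    for i, gpu_ms in enumerate(gpu_ms_per_frame):
        now += cpu_ms                      # build and submit frame i
        gpu_start = max(now, gpu_done[-1]) if gpu_done else now
        gpu_done.append(gpu_start + gpu_ms)
        # Block until at most `max_buffered` frames are outstanding,
        # i.e. until frame i - max_buffered has finished on the GPU.
        oldest = i - max_buffered
        resume = max(now, gpu_done[oldest]) if oldest >= 0 else now
        waits.append(resume - now)
        now = resume
    return waits

print(simulate(2.0, [10.0] * 5))             # GPU bound: waits every frame
print(simulate(10.0, [2.0] * 5))             # CPU bound: never waits
print(simulate(5.0, [4.0, 4.0, 12.0, 4.0]))  # one heavy frame
```

In the last case the single 12 ms GPU frame does not stall the frame that submitted it but the one after, which is one way a short GPU-side spike can turn into an uneven frame time on the CPU side.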
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    In the plog I made, the very few (small) spikes I had were indeed WaitForBufferedFrames and D3D9::Present.
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    IronHorse wrote: »
    Awesome! Thanks dictator
    @IronHorse
    NO, Thanks for being patient with my neurotic obsession with frame quality!
    matso wrote: »
    I believe what you are seeing are probably related to how NS2 avoids having too many screens buffered in the video driver - it normally limits it to one full outstanding screen.

    So if you get GPU limited, ie NS2 manages to produce render data faster than the GPU can process it, the main render thread in NS2 simply sits in a wait loop until the GPU says "1" when asked for how many screens it has buffered.

    This ensures that your user input is always based on no more than one frame old data.

    In the p_log, this waiting is visible as "WaitForBufferedFrames" or something, so if you can get your PerfAnalyser going, I'm predicting you will see a lot of that (the other possibility is "D3D9::Present", where the DRIVER blocks NS2 from pushing too much data, but... I'm betting on "Wait...").

    Detecting that you are pushing too much data to the card and spreading out the load rather than cutting it hard might improve things a bit - especially now that there are so many threads running and doing useful things, spending CPU to produce more render data than the card can handle seems silly.

    Really, are we discussing frame quality for NS2 at 100+ fps? Need to check for flying pigs...

    @Matso
    Thanks for the response. I will go through the p_log immediately and try to figure out what happens at the various hitching frames. Like you said though... I imagine it is the buffering. Does r_sync still function?

    I tried out different values but it really did not seem to affect hitching at all. Was it hard locked a while back even though the console reports changes ("Now using 5 maximum buffered frames" or something)?

    Furthermore, if you would like, matso, I can provide you any data you need. Even though it sounds superfluous to some degree to be talking about frame quality at average framerates of 134 or so (yeah... pigs are flying), I really would love to have a smooth NS2 experience at this framerate. It always seemed strange to me that NS2 becomes less smooth the higher its framerate and GPU usage become (I reported this for the first time in December of last year or so, if you check my post history), and now there is a chance that it can be fixed... and well, I am very hopeful and ecstatic.

    Any other data you would like to see matso? DX9 v DX11? Anything to help!

    https://mega.co.nz/#!QwtTSRoK!zPgzgWQDYil715n-0hA0qSng4KTDUzyXIP9HrlMxaZw (link to plog download)
    Been going through my p_log... it appears that the spikes come from D3D9Device::Present (going from .01 ms to 7 ms in one frame) as well as children under Renderscene::GetVisibleObjects, RenderRenderer::Rendershadowmaps, and Renderscene::InternalRender (going from .25 to 7.5 ms in one frame!). The most interesting frames, which encapsulate the hitching, are probably 4268 - 4273. There you will see hitches emblematic of my experience: from loading world geometry... and from D3D9Device::Present. When the geometry-related hitches pop up (or ms per frame dramatically increases)... you also see RenderDevice::Lock pop up (frame 4271). GPU STALL!

    @DC_Darkling‌
    What is your rig Darkling? I think with my help we could dial in your settings to 100% replicate this behavior actually.

    The thing is... with a 120 Hz screen, seeing this behavior is soooo much easier than if it is locked at 60 Hz. Every frame difference is captured as long as it is below 120. The difference between a flat 120 and a jittery 120 is really painfully obvious.

    Preliminary conclusions
    1. Loading in new world geometry causes stalling and massive ms spikes (an increase of 7-10 ms in one frame is pretty big). This causes RenderDevice::Lock.
    2. D3D9Device::Present, more so than D3D9Device:WaitForBufferedFrames, seems to be my problem. This is apparently related to frame buffering as well?
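
The kind of frame-to-frame comparison done by hand above could be automated roughly as follows. This Python sketch assumes the profiler data has already been exported into one dictionary per frame mapping node name to milliseconds; that shape is purely hypothetical, since the real p_log format is not described here. The sample values reuse the jumps quoted in the post, not the actual log contents.

```python
def biggest_jumps(frames, min_jump_ms=5.0):
    """frames: list of {node_name: ms} dicts, one per frame, in order.

    Returns (frame_index, node_name, before_ms, after_ms) tuples for
    every node whose cost rose by at least `min_jump_ms` from one frame
    to the next, largest jump first.
    """
    jumps = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        for node, ms in cur.items():
            before = prev.get(node, 0.0)
            if ms - before >= min_jump_ms:
                jumps.append((i, node, before, ms))
    return sorted(jumps, key=lambda j: j[3] - j[2], reverse=True)

# Hypothetical two-frame excerpt using the numbers quoted above.
sample = [
    {"D3D9Device::Present": 0.01, "Renderscene::InternalRender": 0.25},
    {"D3D9Device::Present": 7.0, "Renderscene::InternalRender": 7.5},
]
for jump in biggest_jumps(sample):
    print(jump)
```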
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    My rig is far from the monster you have @Dictator93, although for my ends it suits well. But I shall humour you!

    CPU: I have a, at the moment not overclocked, firstgen i7 970 running on 3.2Ghz with enabled turbo.
    CPU Cooling: Corsair H80 with noctua fans. (the ones for radiators, too lazy to look it up now) & Arctic Silver 5 everywhere.
    Motherboard: Gigabyte X58A-UD7 v1 (latest non beta bios sufficed for the 970)
    Video: Nvidia Gainward Geforce 570GTX Phantom 1280MB (no additional OC yet beyond what gainward supplied)
    Ram: Corsair 3x4096MB DDR3 1600 CMZ12GX3M3A1600C9, worth 12GB. (this is triple channel!)
    O/S: Win7 Ultimate 64bit
    Monitor: Dell P2211H (connected by dvi).. I bothered to install the monitor in windows, so its detected as this one and not plug&play.
    Mouse: Logitech G400s
    Keyboard: Ducky Zero - blue
    Headset/Mic: Steelseries Siberia V2 - Blue

    PSU: Antec TP-750 750 Watt
    Disks: (see below)
    Data (and programs): 2x WD Caviar RE3 1 TB, 7200 Rpm in RAID1
    OS: 1x Crucial M4 128GB, pagefile & some games.
    Games: 2x Hybrid desktop drives ST2000DX001 in RAID1.

    Resolution: Full HD (1920*1080)
    As my monitor is 60Hz I do not bother to budget my cpu/gpu for too far over 60. I just don't see the difference.


    Gaming is usually done in Fullscreen windowed mode. (I recently set aero to off in the ns2 properties, see if I personally notice a difference).
    Ambient Occlusion is naturally off, same for shadows. Enabling this on my rig proved..... unwise as the 570gtx is already maxing out without it in late game.
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    Your computer is pretty beastly excluding the GPU. I mean... mine in comparison is...

    CPU: Core i7 930 @ 4.2 (you should look at overclocking your CPU especially since it is on water!)
    CPU Cooling : Noctua DH-14
    Motherboard: X58 SLI FTW3
    Video: GTX 570 tri-SLI (overclocking these is like turning on a furnace)
    Ram: Corsair 3x4096MB DDR3 1600 CMZ12GX3M3A1600C9, worth 12GB. (this is triple channel!) (copied this from you, because mine is the exact same, lol)
    OS: Win7 Pro x64
    Monitor: ASUS VG236H (3d / 120hz)
    Mouse: Razer Death Adder 3.5G
    Keyboard: POS
    Headset: some stereo headphones, whatever!

    PSU: Corsair HX 1050W Gold
    Storage:
    OS / games: OCZ Vertex 4 240GB,
    Data: WD Caviar Black 7200 RPM

    Any chance you could turn on Ambient Occlusion (to make yourself 100% GPU limited) and run a frametime test whilst running around summit or something? (The best place to see it is running around reactor core whilst facing the core.)

    More data and proof of what is happening always helps!
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    I can do that, after nomnom time. :P
    Although I'm fairly high on max GPU already. (I'm already waiting on GPU.)

    Before I shoved in this used 970 I ALSO had a 930.. Small world eh.. That one was OCed, although less severely.. I dislike heat coming out of my PC, but it's a necessity to keep running aye. ;) (I dislike heat)
    Yeah.. it's on the todo list to shove in a 2nd card in SLI.. but ya know.. budget. :P
    I meant beastly because you overclock a lot.. I try to limit OCs to a low OC to minimize heat. Also my SSD is relatively old by now, and up for replacement with a bigger, faster one.
    As for the memory.. good news.. I had a chat with Corsair before buying it, and it IS 100% guaranteed on first-gen i7. ;) (it was the ONLY one of that size though, just in case you ever wanna replace or add)


    As a second test, seeing as you run a higher resolution than I do.. what is your NS2 commit memory & system commit when running?
    My system commit reached around 6GB and NS2 around 1.6GB of commit memory. As some other folk have out-of-memory errors, I wonder how many may pass their commit limit, considering the Windows default is always way too low or way too high. :p
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    edited August 2014
    My understanding is that you can fix this by either having a better GPU that can handle the increased data, or by increasing the number of buffered frames (which incurs input delay ofc) and/or capping FPS.

    The typical solution for the engine would be "FPS smoothing" that some games employ, which just internally caps FPS to avoid any input delay (no 3rd party limiter), but also ensures that successive frames are not wildly different from one another to avoid frame time oscillation.
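
One simple way to read the "FPS smoothing" idea is sketched below in Python: hold every frame to the duration of the slowest recent frame, so successive frames never differ much. This is an illustrative assumption about how such smoothing could work, not a description of what any particular engine does; the class name, the 30-frame window and the 200 fps floor are made up for the example.

```python
from collections import deque

class SmoothingLimiter:
    """Picks a per-frame target time that tracks the slowest recent frame."""

    def __init__(self, window=30, floor_ms=1000.0 / 200.0):
        self.recent = deque(maxlen=window)  # raw render times, newest last
        self.floor_ms = floor_ms            # never run faster than this

    def target_ms(self, render_ms):
        """Record a raw render time and return how long the frame should
        be held so it matches the slowest frame in the window."""
        self.recent.append(render_ms)
        return max(self.floor_ms, max(self.recent))

smoother = SmoothingLimiter(window=3)
for raw in [7.5, 18.5, 7.5, 7.5, 7.5]:
    print(raw, "->", smoother.target_ms(raw))
```

The trade-off is visible in the example: a single 18,5 ms frame pulls the target for the whole window down to roughly 54 fps, so the more erratic the raw frame times are, the lower the smoothed frame rate ends up.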
  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    As I promised to humour you, here you go.
    With enough zoom 'spikes' do emerge, but they don't pass 18 ms from the looks of it. No way I see the difference.

    http://imgur.com/z0FEW5W,QSjczO1
    http://imgur.com/z0FEW5W,QSjczO1#1

    of course the one with higher fps is the one with no occlusion on.
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    IronHorse wrote: »
    My understanding is that you can fix this by either having a better GPU that can handle the increased data, or by increasing the number of buffered frames (which incurs input delay ofc) and/or capping FPS.

    The typical solution for the engine would be "FPS smoothing" that some games employ, which just internally caps FPS to avoid any input delay (no 3rd party limiter), but also ensures that successive frames are not wildly different from one another to avoid frame time oscillation.

    The thing is... frame rate smoothing works in games where the frames are not so disparate. The frame differences I am seeing go from 134 fps to a sudden 54 within the span of a frame. Framerate smoothing would mean turning my internal FPS cap down to 60 or so? AKA, less than 50% of my GPU being used, even though my average without it is more than 2x that. I get the feeling that the intended engine behavior should not be so erratic.

    Once again, it raises the question: what is so rapidly different between frame 1 at 134 fps and the next hitch frame at 50 fps? Normally I would say that such undynamic scenes in an unpopulated level should not reasonably have such rapid changes in fps. Rather, I would expect them when more objects are added on screen, more effects go off, etc...

    Not when just running around a completely normal empty level... that points to some architectural problem IMO.
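
To put the jump described above into frame-time terms, the short sketch below converts the two frame rates to milliseconds and flags frames that take much longer than the one before them. The helper names and the 2x threshold are assumptions for illustration only, not anything taken from the game or its profiler.

```python
def fps_to_ms(fps):
    """Frame time in milliseconds for a given frame rate."""
    return 1000.0 / fps


def find_hitches(frame_times_ms, ratio=2.0):
    """Return indices of frames that took at least `ratio` times the previous frame."""
    return [
        i
        for i in range(1, len(frame_times_ms))
        if frame_times_ms[i] >= ratio * frame_times_ms[i - 1]
    ]


steady = fps_to_ms(134)  # about 7.46 ms
hitch = fps_to_ms(54)    # about 18.52 ms
print(f"{steady:.2f} ms -> {hitch:.2f} ms, {hitch / steady:.2f}x as long")
# prints: 7.46 ms -> 18.52 ms, 2.48x as long
```

For example, `find_hitches([7.5, 7.4, 18.5, 7.5, 7.6])` returns `[2]`, the single frame that took more than twice as long as its predecessor.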

    I mean... many many other games perform at a high FPS whilst using High GPU utilization without having hitching (and not having incredible input lag through aggressive frame smoothing).

    Also... as my plog shows, the geometry being called in also incurs a really high per-frame ms cost. Is it designed so that it should cause such spikes? Why is the geometry loaded all in one frame? Questions about how and why these things are done are important, much like it was important to ask: how are textures being loaded, and how is texture memory handled?

    Somehow I find it hard to believe that the rendering is so intense when just running around empty levels that it causes such huge spikes, and that it is reasonable for the engine to induce that. It is not as if 3 GTX 570s are not powerful GPUs....

    naja!

    P.S. - it is possible to activate a driver level Vsync frame smoothing in Nvidia inspector. It lowers your average frame rate... and I have done it in NS2. Nonetheless... it does NOT get rid of the spikes that occur.
    As I promised to humour you, here you go.
    With enough zoom, 'spikes' do emerge, but they don't pass 18ms from the looks of it. No way I see the difference.

    http://imgur.com/z0FEW5W,QSjczO1
    http://imgur.com/z0FEW5W,QSjczO1#1

    of course the one with higher fps is the one with no occlusion on.
    Those spikes IMO are pretty periodic and would probably be pretty noticeable to me on my screen. But alas... 16,6 ms is all your screen shows :(. But thanks for your input and time taken!

  • DC_DarklingDC_Darkling Join Date: 2003-07-10 Member: 18068Members, Constellation, Squad Five Blue, Squad Five Silver
    Yeah, like I said, 16ms is beyond what I'd notice in a spike.
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    edited August 2014
    Yea, that's a valid question dictator..
    May have to do with how occlusion is setup currently

    Let's investigate further for matso :)
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    edited August 2014
    IronHorse wrote: »
    Yea, that's a valid question dictator..
    May have to do with how occlusion is setup currently

    Let's investigate further for matso :)
    Well, one can hope. That and the GPU stalling due to D3D9Device::Present are pretty brutal. I would rate the game as currently unplayable at high framerates. And even then... capping or running vsync @ 60 still induces this behavior.

    Actually IronHorse, do you think you could run an FPS test for me to try and reproduce some hitching? Essentially load up summit, go to reactor core and p_log while running around the reactor whilst facing it. That induces it for me in a purely periodic way. It is quite brutal.
    @matso
    What do you think can be done in the future?
  • matsomatso Master of Patches Join Date: 2002-11-05 Member: 7000Members, Forum Moderators, NS2 Developer, Constellation, NS2 Playtester, Squad Five Blue, Squad Five Silver, Squad Five Gold, Reinforced - Shadow, NS2 Community Developer
    Well, what you want to do is to produce frames at a rate the GPU can handle without stalling, so detecting "GPU overrun" and then lowering the rate at which frames are produced should do the trick... basically, a dynamic maxfps.

    In the meantime, experimenting with maxfps and setting it low enough to avoid GPU stalls in normal play should - hopefully - help.

    There is also the bug with double/triple buffering - they actually work, but you have to unset/set them in the options every time you start the game. If you run your monitor at 100Hz and use double buffering, it might give you a smooth 100Hz experience.

    Try it out and see if it helps.
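
The "dynamic maxfps" idea above could be sketched roughly as follows. This is a simplified illustration, not engine code: the class, its parameters and the back-off factors are hypothetical, and "GPU overrun" is approximated here by a frame taking much longer than the current cap allows, whereas a real engine would more likely measure GPU time or time spent blocked in Present.

```python
class DynamicMaxFps:
    """Adjusts an fps cap from measured frame times (hypothetical sketch)."""

    def __init__(self, max_fps=200.0, min_fps=30.0, overrun_ratio=1.5,
                 backoff=0.85, recover=1.02):
        self.ceiling = max_fps              # the user's configured maxfps
        self.floor = min_fps                # never cap lower than this
        self.overrun_ratio = overrun_ratio  # overrun if frame > budget * ratio
        self.backoff = backoff              # cap multiplier after an overrun
        self.recover = recover              # cap multiplier after a clean frame
        self.cap = max_fps

    def update(self, frame_ms):
        """Feed one measured frame time (ms); return the cap for the next frame."""
        budget_ms = 1000.0 / self.cap
        if frame_ms > budget_ms * self.overrun_ratio:
            # Frame took far longer than the cap allows: assume the GPU stalled.
            self.cap = max(self.floor, self.cap * self.backoff)
        else:
            # Clean frame: creep back up toward the configured ceiling.
            self.cap = min(self.ceiling, self.cap * self.recover)
        return self.cap
```

Backing off quickly and recovering slowly is a deliberate choice in this sketch: it keeps the cap from bouncing straight back into the range that caused the stall.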
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    edited August 2014
    @matso‌
    IIRC you have to re-enable it after every map change ;)
    Also Idk about using double buffering over triple, due to the high amount of input delay.
    Either option is still too much input delay for a competitive twitch shooter, imo.

    That dynamic maxfps idea sounds sexy..
  • Dictator93Dictator93 Join Date: 2008-12-21 Member: 65833Members, Reinforced - Shadow
    @matso‌
    @IronHorse‌
    @McGlaspie‌

    Any update on something to remedy this issue per-chance? Obviously, no expectations on my end :D Thanks for anything.
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    I don't think this has moved forward much in terms of finding a solution to radically changing FPS / frame times just fyi.
    I did briefly speak about Triple Buffering / dynamic maxfps / fps smoothing internally with Mats and McGlaspie the other day, but so far these are just ideas.

    269 needs to get out the door to address crashes, after that we still have tons already scheduled but maybe if Mats finds the time..
    In the interim try using that method he mentioned last, utilizing maxfps with the proper value