CPU and GPU both under 50% utilization while FPS is low?

mechtechmechtech Join Date: 2012-11-19 Member: 172376Members
edited November 2012 in Technical Support
Feedback wanted on this optimization observation

I was monitoring my CPU/GPU/FPS during recent games, and this is what I discovered:

My setup is an overclocked GTX 480 (GTX 570 performance), i5 750 @ 4.0 GHz, and 8 GB RAM.
I have infestation low, atmospherics/bloom/AA off, textures on high, 1920x1200 res, tex streaming off, multi-core rendering on. Drivers are up to date.

During a recent game with a very aggressive alien commander (lots of player models on screen, lerk fog, tons of infestation, etc.), my FPS was falling to 25 and I wanted to know why it was running so poorly on fairly low settings.

I took a look at my CPU/GPU readout, and the GPU was at 40% utilization (swinging from 40% to 100%, usually around 60% use) while my 4 CPU threads were each at 50% utilization. There's a very serious optimization problem here: over half of my computer's power was sitting unused while the game was chugging at 25-30 FPS. These weren't occasional dips into low utilization; the game was consistently leaving about half of my CPU and GPU sitting idle.

Normally the game runs at about 50 FPS and utilization is in a somewhat normal range, but as more "stuff" gets added onto the screen, especially alien stuff, it leaves more and more CPU+GPU power to waste, and FPS drops accordingly. Eventually it gets to a point where a chunk of resources is perpetually untouched.

I would like to see if anyone else observes this phenomenon, and if the devs are aware that this is what is sucking so much FPS off of otherwise powerful PCs.

Comments

  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    edited November 2012
    Same issue here. I even logged it to see how it was responding throughout the game.

    [Image: http://img.photobucket.com/albums/v140/Fatfool/NS2G2CPU.png]
    [Image: http://img.photobucket.com/albums/v140/Fatfool/NS2G2GPU.png]

    About 8 of my cores were being used, but at lower levels than yours. Overall, only about 20% of processor resources were being used.

    Apparently, there's a p_logall command to check that out in game.
  • GartGart Join Date: 2012-11-20 Member: 172544Members
    edited November 2012
    Same thing is happening to me. Haven't looked at GPU usage though; I'm running a 660 Ti. I guess I didn't need to make a new thread, but I didn't see this one.

    As I noted in my thread, my CPU stays throttled at 1.2 GHz and 50% usage. Did you guys check your CPU frequency too, or is your CPU locked at 4 GHz? I wonder if it's just coincidence that we have the same CPU.
  • mechtechmechtech Join Date: 2012-11-19 Member: 172376Members
    I was at 4 GHz CPU-wise. Generally games are GPU bound and will use up all of the GPU resources before they max out the CPU. NS2 seems well multithreaded and I suspect this is the case.

    The under-utilization of the GPU in heavy load scenarios is concerning though. Does this happen with ATI users as well?

    Either way, this should be top priority for the devs performance-wise. They need to tighten up the rendering code, because latency somewhere in the code is seriously hurting the engine. It looks like many of us have an extra 20+ FPS that can be unlocked if they can just convince the engine to use our extra clocks!
  • GartGart Join Date: 2012-11-20 Member: 172544Members
    QUOTE (mechtech @ Nov 19 2012, 11:51 PM)
        Generally games are GPU bound and will use up all of the GPU resources before they max out the CPU. NS2 seems well multithreaded and I suspect this is the case.

    I really have no idea what my issue is then.
  • A_PajanderA_Pajander Join Date: 2002-12-31 Member: 11695Members, Reinforced - Shadow
    That's interesting. What do the graphs look like if you run a different game?

    Also, Fatfool, are you running an 8-core Xeon? :o
  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    edited November 2012
    QUOTE (mechtech @ Nov 20 2012, 04:51 PM)
        I was at 4 GHz CPU-wise. Generally games are GPU bound and will use up all of the GPU resources before they max out the CPU. NS2 seems well multithreaded and I suspect this is the case.

        The under-utilization of the GPU in heavy load scenarios is concerning though. Does this happen with ATI users as well?

        Either way, this should be top priority for the devs performance-wise. They need to tighten up the rendering code, because latency somewhere in the code is seriously hurting the engine. It looks like many of us have an extra 20+ FPS that can be unlocked if they can just convince the engine to use our extra clocks!

    Well, I'm on a Radeon 5870. Same issue there, though it does sometimes peak close to 90%.

    QUOTE (snaga @ Nov 20 2012, 07:38 PM)
        That's interesting. What do the graphs look like if you run a different game?

        Also, Fatfool, are you running an 8-core Xeon? :o

    Nope. That's a 16-core/8-module system with dual Opteron 4280s @ 3.1 GHz (turbo).
    Something tells me I'm not short of processor resources to play this game...
  • A_PajanderA_Pajander Join Date: 2002-12-31 Member: 11695Members, Reinforced - Shadow
    Oh, so Bulldozer based? As I understand it, the 2 cores in a Bulldozer module are not actually 100% independent cores but sort of like extended Hyperthreading? So I'm wondering how that would show up in a graph like that -- if you can't actually utilize both cores at 100% since one has to occasionally wait for the other? I dunno, I might be way off.
  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    Yeah. They're Valencia processors, aka the desktop Bulldozer.

    The cores aren't totally independent and it is quite like hyperthreading, but with real resources. The 8 threads I mentioned are indeed on each module, even though you can't see it on the graph (LibreOffice only has colours for 12 data sets lol). However, some threads are only using like 20% of one core of a module (probably auxiliary tasks). Thus there are still lots of resources to exploit even if you consider the used core as a module itself. NS2 is very demanding on the processor though. In games like Diablo III, the processors just conserve energy by staying at 1.4 GHz.

    You can use 100% on both cores of a module though. It happens on tasks like encoding or benchmarks. What I'm more curious about is how the data is accessed from memory. It's probably spread over both processors' memory banks, which means they've got to share data over the HT link.

    But come on, I just had an awesome game and near the end I was slideshowing with this rig. It just doesn't add up.
  • uxlapogiuxlapogi Join Date: 2012-10-31 Member: 165238Members
    edited November 2012
    QUOTE (Fatfool @ Nov 20 2012, 07:58 AM)
        Same issue here. I even logged it to see how it was responding throughout the game.

        About 8 of my cores were being used, but at lower levels than yours. Overall only about 20% of processor resources were being used.

        Apparently, there's a p_logall command to check that out in game.

    What tool did you use to create those graphs?

    I'm also experiencing said phenomenon on a completely different system:

    Intel Core2Duo E8500 @ 3.16GHz
    8GB Ram
    Geforce GTX 590

    CPU doesn't go above ~70%
    GPU 1 and 2 stay close to 15%

    I have almost everything I can think of on low settings,
    when there's a lot going on FPS drops to 15,
    while the above values don't change
  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    I used AIDA64 to log the data, then manually opened it with LibreOffice Calc and plotted it. Engineering student here ;)
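For anyone who wants a quick summary of such a log without a spreadsheet, here is a minimal sketch that averages a per-core utilization log. The CSV layout (a `time` column followed by one percentage column per core) and the names `summarize_cpu_log`, `core0`, and `core1` are assumptions made for illustration. This is not AIDA64's actual export format, so a real log would likely need its columns renamed or filtered first.

```python
import csv
import io
from statistics import mean


def summarize_cpu_log(csv_text):
    """Return (per_core_mean, overall_mean) for a CSV utilization log.

    Assumes a header row, a column named 'time' that is ignored, and one
    column per core holding utilization percentages (hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = {}
    for row in reader:
        for name, value in row.items():
            if name == "time":
                continue
            samples.setdefault(name, []).append(float(value))
    per_core = {name: mean(values) for name, values in samples.items()}
    overall = mean(list(per_core.values()))
    return per_core, overall


# Made-up two-core log, two samples per core.
sample_log = "time,core0,core1\n0,40,20\n1,60,20\n"
per_core, overall = summarize_cpu_log(sample_log)
print(per_core, overall)
```

The overall figure here is a plain average across cores, which is the same kind of "about 20% of processor resources" number mentioned earlier in the thread.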
  • A_PajanderA_Pajander Join Date: 2002-12-31 Member: 11695Members, Reinforced - Shadow
    edited November 2012
    Hmm, how accurate are the graphs in task manager? I was playing around a bit with low settings (so GPU doesn't bottleneck), and this is what it looked like during the game:

    [Image: http://i.imgur.com/5gfkd.png]

    It was running at 40-50 fps mostly. "Wait for GPU" was always 0 ms, but the other wait time (CPU I guess?) was constantly something like 1-5 ms. So yeah, never hits 100% usage on any of the cores even though it seems to be limited by the CPU. I might try running some actual logs with better tools later.

    Phenom II x4 965 @ 3.8 GHz, GTX 560Ti, 8GB DDR3.

    Edit: I tried limiting NS2.exe to two cores and then one and this is what that looked like:

    [Image: http://i.imgur.com/KTtHK.png] [Image: http://i.imgur.com/ieCxS.png]

    Using two cores didn't cause a noticeable difference, but dropping it to one caused a big performance loss when it finally managed to saturate the core, and the wait times jumped to double digits.

    Sooo... with 4 cores the game obviously wants more clock cycles from somewhere (since it isn't running at 200 fps), but from where? :D GPU wait time was always 0 ms.
  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    That's a short sampling period though (Task Manager shows a smaller time period for every core you have. For mine, it only displays 44s lol!). Is it always like that, or does it drop in most portions of a game, or is that a peak?
  • A_PajanderA_Pajander Join Date: 2002-12-31 Member: 11695Members, Reinforced - Shadow
    Yeah obviously that's not the proper way to observe things, it was just a quick try. I was running the game windowed and kept an eye on the task manager and I'm pretty sure it never reached 90% usage on any core (when using four cores). Certainly not 100%. It was pretty much what you see there the whole time.
  • DghelneshiDghelneshi Aims to surpass Fana in post edits. Join Date: 2011-11-01 Member: 130634Members, Squad Five Blue, Reinforced - Shadow
    Long story short:

    You cannot magically calculate everything in parallel. Things have to be processed in the right order unless you want to violate the principle of cause and effect.

    The game does not reach 100% CPU utilization on more than one core since one thread has a much heavier workload than the others. The reason all cores appear equally stressed is that Windows constantly switches threads over to other cores. E.g. a single-threaded application will show up as roughly 25% load on each core of a quad-core machine.
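The scheduling effect described in this post can be illustrated with a small simulation. It is only a sketch under simplifying assumptions: the function name `per_core_utilization`, the per-thread load numbers, and the strict round-robin migration are all invented for illustration, and a real Windows scheduler is nowhere near this regular. In each time slice, thread i runs on core (i + slice) mod n_cores, so every core ends up hosting every thread equally often.

```python
def per_core_utilization(thread_loads, n_cores, n_slices=1000):
    """Average busy fraction of each core when threads rotate across cores.

    thread_loads: busy fraction (0.0 to 1.0) of each thread per time slice.
    Assumes len(thread_loads) <= n_cores so no core hosts two threads at once.
    """
    if len(thread_loads) > n_cores:
        raise ValueError("this sketch assumes at most one thread per core")
    busy = [0.0] * n_cores
    for s in range(n_slices):
        for i, load in enumerate(thread_loads):
            # Hypothetical round-robin migration: each thread hops one core per slice.
            core = (i + s) % n_cores
            busy[core] += load
    return [b / n_slices for b in busy]


# One fully busy thread on a quad core: every core shows 25%.
single = per_core_utilization([1.0], n_cores=4)

# One saturated main thread plus three lighter helpers (made-up loads).
mixed = per_core_utilization([1.0, 0.5, 0.3, 0.2], n_cores=4)

print([round(u, 2) for u in single])
print([round(u, 2) for u in mixed])
```

In the second case every core reads about 50% even though one thread never idles, which is the kind of readout reported at the top of this thread.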
  • A_PajanderA_Pajander Join Date: 2002-12-31 Member: 11695Members, Reinforced - Shadow
    edited November 2012
    So, at any given moment of time there's one thread that's bottlenecking the whole program... <strike>Couldn't it be given more clock cycles, since more are available?</strike> Of course the location of the bottleneck (among threads) can change numerous times per second... So I guess that's what optimizing code actually means. :)

    I guess what confused me is how Windows moves the threads to other cores, like you explained. Since it makes it look like not even the main thread is given a "full core's worth" of CPU time. Funny, you would think that constantly moving the threads (when there's no reason to, like in that single-threaded example) would in itself inflict some kind of performance hit, but I guess not (or it's insignificant).
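To put rough numbers on the "one thread bottlenecks the whole program" idea, here is a toy model. It assumes each thread runs on its own core and that a frame is finished only when the slowest thread is done, with no GPU wait or other overhead. The function name `frame_stats` and the per-thread timings are made up for illustration and are not measurements from NS2.

```python
def frame_stats(thread_ms, n_cores):
    """Return (fps, utilization) for one frame under a simple model.

    thread_ms: CPU time each thread needs per frame, in milliseconds.
    Assumes threads run in parallel on separate cores and the frame ends
    when the slowest thread finishes.
    """
    frame_ms = max(thread_ms)  # the slowest thread sets the frame time
    fps = 1000.0 / frame_ms
    # Fraction of all available core time that was actually spent working.
    utilization = sum(thread_ms) / (n_cores * frame_ms)
    return fps, utilization


# Made-up workload: a 40 ms main thread and three lighter helper threads.
fps, utilization = frame_stats([40.0, 20.0, 12.0, 8.0], n_cores=4)
print(f"{fps:.0f} FPS at {utilization:.0%} total CPU utilization")
```

In this model, speeding up the helper threads changes nothing; only shortening the 40 ms thread raises the frame rate, which is why half the CPU can sit idle while FPS stays low.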
  • RamirezRamirez Join Date: 2012-11-21 Member: 172752Members
    I'm getting the same trouble, wasn't lagging at all before patch 230, but now as soon as there is too much action on the screen, things start to get really bad.
  • DghelneshiDghelneshi Aims to surpass Fana in post edits. Join Date: 2011-11-01 Member: 130634Members, Squad Five Blue, Reinforced - Shadow
    QUOTE (snaga @ Nov 21 2012, 06:10 PM)
        Funny, you would think that constantly moving the threads (when there's no reason to, like in that single-threaded example) would in itself inflict some kind of performance hit, but I guess not (or it's insignificant).

    It does. I don't have benchmarks/numbers right now, but single-threaded programs forced onto one core can run significantly faster than when left on multiple cores. IIRC the thread switching is mostly to conserve power and reduce heat, and it also automatically balances the load across an arbitrary number of cores when multiple threads (and/or multiple programs) are running at the same time, so it kind of serves as a catch-all.
  • waflzwaflz Join Date: 2012-09-07 Member: 158459Members
    QUOTE (Ramirez @ Nov 21 2012, 09:35 AM)
        I'm getting the same trouble, wasn't lagging at all before patch 230, but now as soon as there is too much action on the screen, things start to get really bad.

    I'm in the same boat (see the Performance issues thread).

    Running a brand new core, mobo, RAM.
  • FatfoolFatfool Join Date: 2012-11-19 Member: 172399Members
    edited November 2012
    QUOTE (Dghelneshi @ Nov 22 2012, 05:36 AM)
        It does. I don't have benchmarks/numbers right now but singlethreaded programs forced on one core can run significantly faster compared to just leaving it on multiple cores. Iirc the thread switching stuff is mostly to conserve power and heat and also automatically balances out the load across an arbitrary number of cores when there are multiple threads (and/or multiple programs) running at the same time, so this kind of serves as a catch-all.

    NS2 isn't being shuffled around multiple cores. The data I logged showed that each used core was being loaded similarly throughout the game; they didn't dip after a while, leaving another core to pick up the load. Might I add that shuffling threads around across 2 processors would be really nasty, as the data might have to be shuffled to the other processor's memory banks too (to avoid incurring a performance hit accessing another socket's memory banks).
  • SkulkBaitsSkulkBaits Join Date: 2013-06-13 Member: 185549Members
    My Frames Per Second drops to 42 while my GPU usage drops to 53% at the same time. As my GPU usage increases my Frames Per Second do as well. My GPU only hits 100% utilization on the main menu.
  • wopwopwopwop Join Date: 2013-08-23 Member: 187037Members
  • IronHorseIronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    @skulkbaits start a new thread, this thread is too old to be relevant anymore.
    Closed.
This discussion has been closed.