224 tech changes, part 1

matsomatso Master of PatchesMembers, Forum Moderators, NS2 Developer, Constellation, NS2 Playtester, Squad Five Blue, Squad Five Silver, Squad Five Gold, Reinforced - Shadow, NS2 Community Developer Join Date: 2002-11-05 Member: 7000Posts: 1,554 mod
"You don't have to be insane. But it helps".
About 3 weeks ago, I had finished my last batch of performance improvements and was scanning through the latest playtest performance logs, looking for something to improve. And it was all dross - 0.5% here, another possible 0.3 percent there - so I was looking at spending days at the 0.5% improvement level, twiddling with minor tweaks here and there.

Boring.

So I decided to go insane instead.

Now, that sounds a bit worse than what it actually means - it simply means picking something from my list of "stuff that would be insane to do before 1.0 release". Insane because they would introduce new architectural concepts in the engine, so it's hard to figure out just how much they would destabilize everything.

However, there was this thing about movement prediction on the client that had been itching at the back of my head for a long time.

Some background info here ... the Spark Engine samples input before rendering every frame, generating a Move data structure (i.e., a "move"). It adds that move to the list of moves-not-yet-part-of-the-latest-server-update, then resets the world back to the latest server update and executes all the moves, using the final state of the world to render from.
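To make the loop above concrete, here is a minimal sketch in Python (the real thing runs in Lua inside Spark). Every name in it (`Move`, `World`, `apply_move`, `predict_frame`) is made up for illustration, and the "simulation" is a single position update standing in for the real movement code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Move:
    seq: int     # client-side sequence number of the sampled input (assumed)
    dx: float    # hypothetical input: velocity along one axis
    dt: float    # time covered by this move, in seconds


@dataclass(frozen=True)
class World:
    x: float = 0.0     # player position, the only predicted state in this sketch
    last_seq: int = 0  # newest move already included in this state


def apply_move(world: World, move: Move) -> World:
    """Stand-in for the expensive per-move simulation step."""
    return World(x=world.x + move.dx * move.dt, last_seq=move.seq)


def predict_frame(snapshot: World, pending: List[Move], new_move: Move) -> World:
    """One rendered frame: queue the new move, then replay every move
    that the latest server snapshot does not include yet."""
    pending.append(new_move)
    # Drop moves the server has already folded into its snapshot.
    pending[:] = [m for m in pending if m.seq > snapshot.last_seq]
    world = snapshot              # reset to the latest server update
    for move in pending:          # cost grows with the length of the queue
        world = apply_move(world, move)
    return world                  # the state to render from
```

The point to notice is that the `for` loop runs over the whole queue on every single frame, which is where the cost described next comes from.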

Each move is quite costly to run, at about 0.5-0.7ms or so, and the length of the queue grows with effective server latency. Typically, you have maybe 100ms net lag and 100ms interpolation lag for an effective lag of about 200ms. At 50 fps (20ms per frame), you are looking at running a minimum of 10 prediction frames every frame (this is the "Prediction" line at the bottom of the net_stats display). If you wanted to run at 100fps, you would need to run 20 prediction frames instead - every frame. Yea, that would be 10-15ms every 10ms. Kinda hard to do.
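The arithmetic behind those numbers, as a small Python sketch. The function name and parameters are made up for illustration; the inputs are the figures quoted above (200ms effective lag, 0.5-0.7ms per move).

```python
def prediction_cost(lag_ms: float, fps: float, move_cost_ms: float):
    """Return (moves replayed per frame, prediction ms per frame, frame budget ms)."""
    frame_budget_ms = 1000.0 / fps
    # One move is sampled per frame, so the queue holds one move
    # for every frame that fits inside the effective lag.
    moves_per_frame = lag_ms / frame_budget_ms
    return moves_per_frame, moves_per_frame * move_cost_ms, frame_budget_ms


for fps in (50, 100):
    for move_cost_ms in (0.5, 0.7):
        moves, cost, budget = prediction_cost(200, fps, move_cost_ms)
        print(f"{fps} fps, {move_cost_ms} ms/move: "
              f"{moves:.0f} moves, {cost:.1f} ms of a {budget:.0f} ms frame")
```

At 50 fps that is 10 moves and 5-7ms out of a 20ms frame. At 100 fps it is 20 moves and 10-14ms out of a 10ms frame, so prediction alone no longer fits.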

And that's the reason why fps goes down with latency. And why fps goes down when the server drops below 20 ticks per second - the queue gets longer. And why it's so hard to increase fps on the client - faster fps means you need to predict more moves, more times.

Now, the client doesn't strictly need to do it this way - it could just take the world it has already predicted to the previous frame, add only the latest move to the world and use it to render right away. Unfortunately, 20 times per second the server sends a new update, and you would need to run all the moves from that in order to get in sync - which would cause 20 frames every second to be MUCH longer than all the other frames, resulting in a really hitchy experience. Not a good thing.
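A rough Python sketch of why the incremental approach hitches. It compares the per-frame prediction cost of the two schemes over one second. The function and its parameters are made up for illustration, and it assumes the frame rate is a whole multiple of the server update rate and that the queue length stays constant.

```python
def frame_costs_ms(fps, updates_per_s, queue_len, move_cost_ms, incremental):
    """Prediction cost in ms of each frame over one second."""
    frames_per_update = fps // updates_per_s
    costs = []
    for frame in range(fps):
        resync = frame % frames_per_update == 0  # a server update just arrived
        if incremental and not resync:
            moves = 1            # only add the newest move
        else:
            moves = queue_len    # replay the whole queue from the snapshot
        costs.append(moves * move_cost_ms)
    return costs


full = frame_costs_ms(100, 20, 20, 0.6, incremental=False)
incr = frame_costs_ms(100, 20, 20, 0.6, incremental=True)
print(f"full replay: every frame costs {max(full):.1f} ms")
print(f"incremental: {sum(c > 1.0 for c in incr)} frames at {max(incr):.1f} ms, "
      f"the rest at {min(incr):.1f} ms")
```

The incremental scheme is far cheaper on average, but one frame in five takes about twenty times as long as its neighbours, and that unevenness is the hitching.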

As to why the Spark Engine runs this way? Well, to quote Max: "It was not supposed to be that slow". In other words, other engines avoid similar problems by running moves really fast in hardcoded C++. In Spark, it's run in Lua, allowing awesome flexibility (skulk wallwalking, jetpacks, sprinting, lerk flying - they are all coded in Lua) - at greater cost than was foreseen when the choice was made.

ENOUGH BACKGROUND .... back to the insanity.

The idea is actually quite simple. Instead of delivering a raw server snapshot to the main thread which it then has to run all the moves on, why not deliver an already predicted world to the main thread? Ie, give the snapshot to another Lua VM with almost identical code to the Client world, have it run all the moves in its own thread, and only deliver an updated world to the main thread.

That allows the main thread to just keep adding moves to its world, and now and then whenever the Prediction thread is finished preparing the server snapshot, it can just swap its state with it, at pretty much zero cost.
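The shape of that idea, as a minimal Python sketch. It is not how Spark does it: Spark uses a second Lua VM, and all names here are hypothetical. In particular, the step where the main thread re-applies the few moves it sampled while the worker was busy is my assumption about what a swap needs; the description above only says the states are swapped. CPython threads also do not run CPU-bound code in parallel, so this shows the structure, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class World:
    x: float
    last_seq: int  # newest move included in this state


Move = Tuple[int, float]  # (sequence number, displacement); hypothetical shape


def apply_move(world: World, move: Move) -> World:
    seq, dx = move
    return World(world.x + dx, seq)


def replay(world: World, moves: List[Move]) -> World:
    """Apply every move newer than what `world` already contains."""
    for move in moves:
        if move[0] > world.last_seq:
            world = apply_move(world, move)
    return world


def predict(snapshot: World, moves: List[Move]) -> Tuple[int, World]:
    """Worker job: turn a raw server snapshot into an already predicted world."""
    return snapshot.last_seq, replay(snapshot, moves)


class Client:
    def __init__(self, world: World):
        self.world = world
        self.pending: List[Move] = []
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._job = None

    def on_snapshot(self, snapshot: World) -> None:
        # Hand the snapshot and a copy of the queue to the prediction thread.
        self._job = self._pool.submit(predict, snapshot, list(self.pending))

    def wait(self) -> None:
        """Block until the worker is done (only used to make the demo deterministic)."""
        if self._job is not None:
            self._job.result()

    def frame(self, move: Move) -> World:
        self.pending.append(move)
        if self._job is not None and self._job.done():
            acked_seq, predicted = self._job.result()
            self._job = None
            self.pending = [m for m in self.pending if m[0] > acked_seq]
            # Swap in the predicted world, then catch up on the few moves
            # sampled while the worker was busy (assumed step, see above).
            self.world = replay(predicted, self.pending)
        else:
            # Normal frame: add only the newest move to the current world.
            self.world = apply_move(self.world, move)
        return self.world


client = Client(World(0.0, 0))
for seq in (1, 2, 3):
    client.frame((seq, 1.0))
client.on_snapshot(World(10.0, 2))  # server corrects the position at move 2
client.frame((4, 1.0))              # main thread keeps going meanwhile
client.wait()
print(client.frame((5, 1.0)))       # snapshot plus moves 3, 4 and 5
```

Whichever frame the swap lands on, the main thread ends up at the snapshot plus all unacknowledged moves, while a normal frame only ever pays for one move.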

Nice idea. And faced with < 0.5%/day improvements, I figured I might as well give it a try - if I spent 3 days on it and it turned out to not work, three days less wasn't going to make much difference to performance anyhow.

After three days of intense hacking, the prototype was finished and worked beyond expectation. Depending on latency and how built-up the area was, the FPS increase was 30-50%. Some minor bugs here and there, but it was good enough to present it to UWE. When I pitched it to Brian C, I could sense his "Is this guy insane? Introducing multithreading and multiple Lua VMs less than a month before release?" - but after testing it and tasting the FPS increase, it was pretty much ... "Yea, we have to do this".

This was Monday of the 223 release. Right after the 223 release, UWE switched to ironing out the bugs and unforeseen weirdness to be expected when doing something like that. It went pretty well, all things considered, and the new version was built and presented to the playtesters the following Monday.

At which time the ###### hit the server fan.

To be continued in part 2.
Member of CDT, Senior Spark Engine Hacker

Comments

  • ZeikkoZeikko Members, Squad Five Blue, NS2 Map Tester Join Date: 2007-12-16 Member: 63179Posts: 586 Fully active user
    Great job Matso!

    This was pretty much my biggest hurdle with the NS2 architecture for a long time because it made client optimizations hurt other parts of the system, especially on servers.

    I haven't been able to try out the new patch yet but it sounds awesome.

    Great work man!
  • ToothyToothy ir-regard-less Members, Constellation Join Date: 2003-02-12 Member: 13447Posts: 335 Fully active user
    But why haven't you added bullet holes?
    image - EUPT
  • _Necro__Necro_ Members, Reinforced - Shadow Join Date: 2011-02-15 Member: 81895Posts: 1,904 Fully active user
    edited October 2012
    Damn! This is so awesome! Great work matso! And thanks for the heads up. I really appreciate such posts. I can imagine how much work it had to be.

    Can't wait to play this build now. So excited. :D

    €dit: While I'm certain that the statement from Toothy must be sarcasm, I want to warn before the real trolls arrive: Don't feed them! Devs should never answer to posts containing bad manners or insults. Things like that only encourage the community to cry louder to get a dev-response.

    €dit2: Oh and would you mind to tell more about the background of the rubber-banding problem?
  • YuukiYuuki Members Join Date: 2010-11-20 Member: 75079Posts: 1,731
    Sounds great!

    So if I understood correctly, you decreased the computational complexity (total number of calculation) and improved parallelization at the same time ?
  • twilitebluetwiliteblue bug stalker Members, NS2 Playtester, Squad Five Blue Join Date: 2003-02-04 Member: 13116Posts: 1,994 Advanced user
    I love you matso, and everyone at UWE! My minimum FPS leaped from 20 to 30! I think everyone should be excited about build 224!
  • falcfalc Members Join Date: 2011-03-18 Member: 87128Posts: 133
    OK, now you've convinced me that I have to try the new build.

    I was already curious about the changes, as the server console stated that a server VM had been started.
    FRAGFEST.de | Steam Group | Twitter
    A german website and blog about games and stuff.
    NS2 Servers (i7 @ 3.8 GHz)
    Server #1 - 176.9.107.253:27015
    Server #2 - 176.9.107.253:27025
  • YoungTrotskyYoungTrotsky Members Join Date: 2007-03-09 Member: 60307Posts: 202
    I only understood about 10% of that but thank you very much for figuring it all out, you sound like some sort of mega-power-super genius and I am glad people like you are out there making it easier for people like me to procrastinate revel in the glory of computer games!
  • puzlpuzl The Old Firm Retired Developer, NS1 Playtester, Forum Moderators, Constellation Join Date: 2003-02-26 Member: 14029Posts: 4,112 mod
    The suspense is killing me.. what hit the server fan?
    Retired NS1 Developer, currently making myself useful as a forum moderator - message me for any mod related requests.
  • WilsonWilson Members Join Date: 2010-07-26 Member: 72867Posts: 1,397
    Awesome work and a really interesting read.
    In-game name: Wilson

    My Crosshair Pack: LINK
  • NurEinMenschNurEinMensch Members, Constellation Join Date: 2003-02-26 Member: 14056Posts: 1,352
    /subscribe
    This is the kind of thing that goes on behind closed doors on normal game releases, where they are optimizing it right up until the end. You guys are just getting to play with a version of the game that normally only the developers see.

    --Cory
  • G1RG1R Members Join Date: 2012-08-23 Member: 156275Posts: 86
    edited October 2012
    Any comment on future optimization? What can we hope for?
  • tk-421tk-421 Members, Reinforced - Shadow Join Date: 2006-11-03 Member: 58315Posts: 191
    Super-interesting. Thanks for sharing, can't wait for part 2.
  • Squirreli_Squirreli_ Members, Reinforced - Shadow Join Date: 2012-04-25 Member: 151046Posts: 343
    What a cliffhanger... You writing a soap opera or something? Give us the info already, I am holding my breath here ;)
    "To crush your enemies, to see them driven before you, and to hear the lamentations of their women." -Conan

    Steam profile
  • wirywiry Members Join Date: 2009-05-25 Member: 67479Posts: 507 Advanced user
    Thanks, very interesting read.
    :0
  • MisterYoonMisterYoon Members Join Date: 2012-08-18 Member: 155747Posts: 279 Fully active user
    QUOTE (matso @ Oct 25 2012, 02:43 AM) »
    As to why the Spark Engine runs this way? Well, to quote Max: "It was not supposed to be that slow".



    LOL. Max must have been embarrassed when you asked him. It's almost the only flaw (apart from animation, of course), but such a big one in his own home-made engine.
  • Onii-chanOnii-chan Members Join Date: 2002-11-05 Member: 7164Posts: 503
    That's some great stuff.
    Thank you, matso!

    QUOTE (Toothy @ Oct 25 2012, 12:56 PM) »
    But why haven't you added bullet holes?


    Because nanites repair bullet holes before they happen.
    It's not like I want to gore you or anything!
  • carlgmcarlgm Members, Constellation Join Date: 2004-08-26 Member: 30907Posts: 101
    But...but... you CANT LEAVE IT THERE!? :(
    Nice to hear about what you've been working on and I look forward to seeing if it's worked on my specific PC! Even if it hasn't, it looks to be a good improvement for others, which is good. Well done! :)
  • phoenixbbsphoenixbbs Members, Constellation, NS2 Playtester, Subnautica Playtester Join Date: 2003-02-10 Member: 13379Posts: 437 Advanced user
    Toothy's a playtester, and we're always ripping the back out of him, so he's not really a troll, he's a whipping boy for the rest of us :-)

    Great work Matso, you just need to ask Max where his LUA JIT VM is up to now...
    My nickname in-game is "phoe", and I'm an NS addict.

    I haven't managed more than two days without NS since 2002.

    My wife tells me I regularly jump in bed as though I've just been ambushed, and swear (or shout out orders) in my sleep :-}
  • phoenixbbsphoenixbbs Members, Constellation, NS2 Playtester, Subnautica Playtester Join Date: 2003-02-10 Member: 13379Posts: 437 Advanced user
    Oh, and Toothy - if you want bullet holes, stand behind a gorge when he heal sprays - I think a little bit comes out of both ends, and his "tails" stick out, revealing his rusty bullet hole :-)
  • VitdomVitdom Members, Reinforced - Supporter, Reinforced - Silver, Reinforced - Gold, Reinforced - Diamond, Reinforced - Shadow, WC 2013 - Shadow Join Date: 2012-04-30 Member: 151345Posts: 285 Advanced user
    edited October 2012
    Awesome job getting this done!

    QUOTE (matso @ Oct 25 2012, 11:43 AM) »
    Some background info here ... the Spark Engine samples input before rendering every frame, generating a Move data structure (ie, a "move"). It adds that move to the list of moves-not-yet-part-of-the-latest-server-update, then resets the world back to the latest server update and executes all the moves, using the final state of the world to render from.

    But why would anyone design a net game state system like this? Wouldn't it be simpler and faster to have both the server and the client run state update simulations/predictions, so that the server frame updates synchronize the client-server states and the server and client just keep simulating/predicting the game, sending synchronization data back and forth, instead of keeping a list of updates-not-part-of-the-latest-server-update and predicting what will happen over and over again? If both the server and the client have access to the same game state data (which they do), there would be no additional downsides to go with the improvements. Why would it, according to you, be a "hitchy experience" when that way of doing things can be implemented in a very nice and well-working way?

    Doesn't the Source engine work like that?
    Steam profile
    Hyperdata for breakfast. Hyperdata for lunch. Hyperdata for dinner. Hyperdata for supper. Hyperdata as snacks!
    Dream hyperdata! Speak hyperdata! Live hyperdata! Be hyperdata!


    Nothing like the smell of distress bacon and egg lock in the morning! -- KwisatzHaderach
  • _Necro__Necro_ Members, Reinforced - Shadow Join Date: 2011-02-15 Member: 81895Posts: 1,904 Fully active user
    edited October 2012
    I don't know if I understand you. But you seem to forget that the internet does not work with 100% guaranteed packet delivery. You can never know when or if you get an update from the server or if an update gets lost. At least with UDP.
  • FappuchinoFappuchino Members Join Date: 2012-10-10 Member: 162008Posts: 281 Fully active user
    edited October 2012
    QUOTE (Toothy @ Oct 25 2012, 01:56 AM) »
    But why haven't you added bullet holes?

    Seriously. Who cares about this mumbo jumbo?

    Bullet. Holes.
  • carlgmcarlgm Members, Constellation Join Date: 2004-08-26 Member: 30907Posts: 101
    QUOTE (_Necro_ @ Oct 25 2012, 09:14 AM) »
    You can never know when or if you get an update from the server or if an update gets lost. At least with UDP.

    https://twitter.com/NS2/status/253188640656719872 :D
  • creamsodasecreamsodase Members Join Date: 2012-10-17 Member: 162550Posts: 10
    i think he meant the "merde" hit the server fan.

    you know how this word is on high stakes lately
  • TimMcTimMc Members Join Date: 2012-02-06 Member: 143945Posts: 1,425
    Interesting thread. Looking forward to next one :)
    My Steam Workshop:
    Arms Limitation Treaty - Whips, Sentries, ARCs and Hallucinations count towards MAC/Drifter limits.
    No Power Usage - Marine structures no longer require power. Power nodes only affect lighting, but are much weaker.
  • shadershader Members Join Date: 2003-02-07 Member: 13247Posts: 70
    edited October 2012
    QUOTE (Vitdom @ Oct 25 2012, 07:58 AM) »
    Doesn't the Source engine work like that?


    The source engine and every single multiplayer FPS game since quakeworld works the same as NS2 in this regard.

    EDIT: here is a good explanation
    I found my old forum account. Most of my NS2 posts are under user shad3r
  • shadershader Members Join Date: 2003-02-07 Member: 13247Posts: 70
    Thanks for the post, matso.

    Was the client framerate increase from splitting the client prediction into a parallel thread what prompted the move command rate change?

    Or did that optimization come first?
  • TychoCelchuuuTychoCelchuuu Anememone Members Join Date: 2002-03-23 Member: 345Posts: 10,499
    QUOTE (Fappuchino @ Oct 25 2012, 08:18 AM) »
    Seriously. Who cares about this mumbo jumbo?

    Bullet. Holes.

    Toothy is making fun of someone I suspect :D
    QUOTE (MOOtant @ Sep 21 2012, 11:06 AM) »
    What is wrong with being a racist?

  • countbasiecountbasie Members Join Date: 2008-12-27 Member: 65884Posts: 824
    edited October 2012
    EDIT: I gotta learn to read.
  • TechercizerTechercizer 7th Player Members Join Date: 2011-06-11 Member: 103832Posts: 1,850
    Running a virtual environment server-side instead of loading in new static snapshots constantly is a pretty awesome upgrade, but I can't figure out how you're swapping states at zero cost without creating discrepancies. How do you resolve conflicts between server prediction and client information in a way that doesn't rubber-band the world into or out of some unintuitive configuration?

    Or is it that the client updates are coming in fast enough that your prediction's drift is negligible?

            Once the infestation reaches the Command Chair, the process begins. One Gorge enters the chair to provide the necessary height. Another climbs on its shoulders to access the controls.

            A Gorge Lab is quickly established, staffed by microscopic Gorges who work tirelessly to unlock the secrets of Frontiersman Technology, stopping only to change their lab coats when they become dirtied. Once the research progresses to a certain point, the Gorgecom gives the order. Nanites are called into service.

            The armature forms. A chosen Gorge, tested many times in the field of battle, enters the machine.

            Servos whir; miniguns spin up in diagnostics; an Exogorge is born.
