GetAllScriptActors() are evil, or why hydra spam kills a server

2

Comments

  • xposed-xposed- Join Date: 2007-09-23 Member: 62412Members, Constellation
    <!--quoteo(post=1840134:date=Apr 7 2011, 01:34 PM:name=Warmonger)--><div class='quotetop'>QUOTE (Warmonger @ Apr 7 2011, 01:34 PM) <a href="index.php?act=findpost&pid=1840134"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->I'm pretty sure Max had nothing to do with this code since Max is coding the engine (in C++). Charlie and Brian are responsible for the game-play elements (in Lua).<!--QuoteEnd--></div><!--QuoteEEnd-->

    Indeed, didn't mean to imply otherwise.. was just making the point that I'm sure someone is aware of it already, given they wrote it.
  • InsaneInsane Anomaly Join Date: 2002-05-13 Member: 605Members, Super Administrators, Forum Admins, NS1 Playtester, Forum Moderators, NS2 Developer, Constellation, NS2 Playtester, Squad Five Blue, NS2 Map Tester, Subnautica Developer, Pistachionauts, Future Perfect Developer
    edited April 2011
    Fantastic stuff, thank you for taking the time to look into this!
  • ThaldarinThaldarin Alonzi&#33; Join Date: 2003-07-15 Member: 18173Members, Constellation
    It's a bit off topic but someone just pointed your signature out to me RobB. Hilarious.
  • AdamaAdama Join Date: 2010-12-21 Member: 75826Members
    Certainly makes sense and definitely needs someone to look into the code behind it (which it seems you have done)

    It doesn't look like there is any solid testing on this apart from standalone LAN games, etc.

    I've added the code to all 3 NS2UK servers - if anyone would like to jump on and respond to this thread to see if the server tick rate has improved? I'm sure Matso and the rest of the community would like to know the results.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    <!--quoteo(post=1840145:date=Apr 7 2011, 05:23 AM:name=Asraniel)--><div class='quotetop'>QUOTE (Asraniel @ Apr 7 2011, 05:23 AM) <a href="index.php?act=findpost&pid=1840145"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->i would like to say that the real problem is that GetAllScriptActors returns all actors. There should be a method that only returns those in a certain range.<!--QuoteEnd--></div><!--QuoteEEnd-->

    The number of gameplay-relevant entities is small; hundreds at most. Creating a list culled by distance in C code is lightening fast(you don't need to calculate any square roots; just compare squared distance against squared max range).

    Culling the list by entity type is similarly quick.

    Just expose a 'GetCulledScriptActors' function which takes a max range(zero or negative means infinite range) and entity type flags( E.g. SA_Team1_player | SA_Team1_structure would return only players and structures from team1).

    What you need is just to do the culling before you get to LUA.

    <!--quoteo(post=1840145:date=Apr 7 2011, 05:23 AM:name=Asraniel)--><div class='quotetop'>QUOTE (Asraniel @ Apr 7 2011, 05:23 AM) <a href="index.php?act=findpost&pid=1840145"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->This list should be maintained with something like a quadtree or a octtree (max will understand me), so that the amount of actors that need to be queried are minimal.<!--QuoteEnd--></div><!--QuoteEEnd-->

    That seems like it would be more costly than just checking every script actor.
  • endarendar Join Date: 2010-07-27 Member: 73256Members, Squad Five Blue
    <!--quoteo(post=1840158:date=Apr 7 2011, 11:17 PM:name=Adama)--><div class='quotetop'>QUOTE (Adama @ Apr 7 2011, 11:17 PM) <a href="index.php?act=findpost&pid=1840158"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->I've added the code to all 3 NS2UK servers - if anyone would like to jump on and respond to this thread to see if the server tick rate has improved? I'm sure Matso and the rest of the community would like to know the results.<!--QuoteEnd--></div><!--QuoteEEnd-->
    Seems to have had a positive effect.


    <a href="http://img829.imageshack.us/i/352070.png/" target="_blank"><img src="http://img829.imageshack.us/img829/1193/352070.png" border="0" class="linked-image" /></a>
  • AdamaAdama Join Date: 2010-12-21 Member: 75826Members
    <!--quoteo(post=1840162:date=Apr 7 2011, 03:47 PM:name=endar)--><div class='quotetop'>QUOTE (endar @ Apr 7 2011, 03:47 PM) <a href="index.php?act=findpost&pid=1840162"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Seems to have had a positive effect.


    <a href="http://img829.imageshack.us/i/352070.png/" target="_blank"><img src="http://img829.imageshack.us/img829/1193/352070.png" border="0" class="linked-image" /></a><!--QuoteEnd--></div><!--QuoteEEnd-->

    Thanks - slightly confused as they're not the servers I was referring too :) - Have you changed these servers and tested? Looks good if you have.

    I'll pop on the NS2UK servers and see if others join and respond to this thread shortly.
  • LazerLazer Join Date: 2003-03-11 Member: 14406Members, Contributor, Constellation, NS2 Playtester
    <!--quoteo(post=1840145:date=Apr 7 2011, 03:23 AM:name=Asraniel)--><div class='quotetop'>QUOTE (Asraniel @ Apr 7 2011, 03:23 AM) <a href="index.php?act=findpost&pid=1840145"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->i would like to say that the real problem is that GetAllScriptActors returns all actors. There should be a method that only returns those in a certain range. This list should be maintained with something like a quadtree or a octtree (max will understand me), so that the amount of actors that need to be queried are minimal.<!--QuoteEnd--></div><!--QuoteEEnd-->

    <!--quoteo(post=1840161:date=Apr 7 2011, 08:38 AM:name=Soylent_green)--><div class='quotetop'>QUOTE (Soylent_green @ Apr 7 2011, 08:38 AM) <a href="index.php?act=findpost&pid=1840161"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->That seems like it would be more costly than just checking every script actor.<!--QuoteEnd--></div><!--QuoteEEnd-->

    Not exactly. Octrees are very useful especially when dealing with collision code or parsing sections of a map. I wrote one for a game engine I've been making in my spare time and have a decent understanding how it can be used so I will have to say this could be an improved implementation.

    For those not familiar, consider trying to collision check against every triangle in the map every frame. No way. So what they do is build a big cube around the entire map, assign all geometry to it and if there are more than X triangles (might have to play with the value to optimize) then the cube gets divided down into 8 smaller children and all geometry of the larger cube divided between them. This process keeps going until you have the map partitioned into the largest cubes possible holding no more than X triangles per cube. The process is automated and likely part of 'pre-processing world'.

    So how is this any better than before? Now as a player walks around the world, it first collision checks against nodes (cubes) of the octree and then proceeds to only collision check with the geometry they contain. This saves on performance tremendously but I'm assuming it has already been implemented for collision checks. What Asraniel was getting at (I think) is that if entities had their actor indices assigned to octree nodes as they traversed the map then a similar query can be made with a larger radius and all actors occupying octree nodes within a certain range will be returned instead of all actors in the map. Combine that solution with what matso has posted and you get very noticeable FPS gains (although either solution by themselves should show obvious improvement).
  • SkvateSkvate Join Date: 2002-11-25 Member: 9892Members, Reinforced - Shadow, WC 2013 - Silver
  • AdamaAdama Join Date: 2010-12-21 Member: 75826Members
    Okay so quick verdict is as follows (only quick 15minute test)

    10 players
    15 Hydras
    5 Turrets
    Various other buildings, etc.

    Between 8-12 tick rate.

    Overall, I think there is an improvement. Most noticeably the tick rate declines as more players join, anyhow thank you very much - I will keep this code running on these servers for the meantime.
  • endarendar Join Date: 2010-07-27 Member: 73256Members, Squad Five Blue
    Here's a test I did on a live server (passworded so only I was in it).
    I did 2 runs, first with vanilla, second with matso's files.
    I placed 12 turrets, and 63 hydras in close proximity (hydras are so much easier to spam).
    Used playeru's mod to receive the tickrate, over 180 seconds, heres the results:

    <a href="http://img13.imageshack.us/i/graphxk.png/" target="_blank"><img src="http://img13.imageshack.us/img13/1832/graphxk.png" border="0" class="linked-image" /></a>
    Average
    Mod = 24.4
    Vanilla = 20.3

    Maximum
    Mod = 27
    Vanilla = 23

    Minimum
    Mod = 22
    Vanilla = 17

    <a href="http://img831.imageshack.us/i/90114869.jpg/" target="_blank"><img src="http://img831.imageshack.us/img831/4109/90114869.jpg" border="0" class="linked-image" /></a>

    <a href="http://img97.imageshack.us/i/73570671.jpg/" target="_blank"><img src="http://img97.imageshack.us/img97/9301/73570671.jpg" border="0" class="linked-image" /></a>
  • PersianImm0rtalPersianImm0rtal Join Date: 2010-12-02 Member: 75414Members, Constellation, NS2 Map Tester
    That is a very nice performance boost! Is your server now running matso code? And if so, do i need to do anything in order to connect?
  • endarendar Join Date: 2010-07-27 Member: 73256Members, Squad Five Blue
    If that was asked of me, it is not running the code any longer. Saw a couple of new lua errors in the log while doing the test so I don't want to keep it live, just in case...
  • w0dk4w0dk4 Join Date: 2008-04-22 Member: 64129Members, Constellation, Reinforced - Shadow
    Also installed the patch, nice work!
  • EnceladusEnceladus Join Date: 2004-01-18 Member: 25442Members
    Unfortunately with that fix Hydras now try to shoot macs through walls. Haven't tried it with other stuff yet..
  • NurEinMenschNurEinMensch Join Date: 2003-02-26 Member: 14056Members, Constellation
    Good job, what a great community!

    One thing I was wondering, and I didn't check if it's accounted for: What are the events that cause the static target list to be rebuilt? Because there are a few things that may mandate that. Not just other buildings being built/destroyed that affect visibility, but also entities moving, doors opening/closing possibly elevators etc. Can map geometry change during runtime?
  • FehaFeha Join Date: 2006-11-16 Member: 58633Members
    <!--quoteo(post=1840164:date=Apr 7 2011, 04:00 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 04:00 PM) <a href="index.php?act=findpost&pid=1840164"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Not exactly. Octrees are very useful especially when dealing with collision code or parsing sections of a map. I wrote one for a game engine I've been making in my spare time and have a decent understanding how it can be used so I will have to say this could be an improved implementation.

    For those not familiar, consider trying to collision check against every triangle in the map every frame. No way. So what they do is build a big cube around the entire map, assign all geometry to it and if there are more than X triangles (might have to play with the value to optimize) then the cube gets divided down into 8 smaller children and all geometry of the larger cube divided between them. This process keeps going until you have the map partitioned into the largest cubes possible holding no more than X triangles per cube. The process is automated and likely part of 'pre-processing world'.

    So how is this any better than before? Now as a player walks around the world, it first collision checks against nodes (cubes) of the octree and then proceeds to only collision check with the geometry they contain. This saves on performance tremendously but I'm assuming it has already been implemented for collision checks. What Asraniel was getting at (I think) is that if entities had their actor indices assigned to octree nodes as they traversed the map then a similar query can be made with a larger radius and all actors occupying octree nodes within a certain range will be returned instead of all actors in the map. Combine that solution with what matso has posted and you get very noticeable FPS gains (although either solution by themselves should show obvious improvement).<!--QuoteEnd--></div><!--QuoteEEnd-->

    When you first mentioned quad/octtrees I didnt know what they were, but thanks for explaining. Makes alot of sense, and is similar to collisioncheking models (for those who doesnt know, you usually have a bounding box/sphere, which each model check collision with first, if they collide with something, they check their actuall collision models).

    I dont think GetAllScriptActors() should use that tho, maybe something like FindInSphere instead?
    Its what hydras and sentrys uses anyway, except they do the distance checking themself. Would just have to check what boxes the sphere intersects.
  • AsranielAsraniel Join Date: 2002-06-03 Member: 724Members, Playtest Lead, Forum Moderators, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Shadow, Subnautica Playtester, Retired Community Developer
    edited April 2011
    <!--quoteo(post=1840164:date=Apr 7 2011, 05:00 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 05:00 PM) <a href="index.php?act=findpost&pid=1840164"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Not exactly. Octrees are very useful especially when dealing with collision code or parsing sections of a map. I wrote one for a game engine I've been making in my spare time and have a decent understanding how it can be used so I will have to say this could be an improved implementation.

    For those not familiar, consider trying to collision check against every triangle in the map every frame. No way. So what they do is build a big cube around the entire map, assign all geometry to it and if there are more than X triangles (might have to play with the value to optimize) then the cube gets divided down into 8 smaller children and all geometry of the larger cube divided between them. This process keeps going until you have the map partitioned into the largest cubes possible holding no more than X triangles per cube. The process is automated and likely part of 'pre-processing world'.

    So how is this any better than before? Now as a player walks around the world, it first collision checks against nodes (cubes) of the octree and then proceeds to only collision check with the geometry they contain. This saves on performance tremendously but I'm assuming it has already been implemented for collision checks. What Asraniel was getting at (I think) is that if entities had their actor indices assigned to octree nodes as they traversed the map then a similar query can be made with a larger radius and all actors occupying octree nodes within a certain range will be returned instead of all actors in the map. Combine that solution with what matso has posted and you get very noticeable FPS gains (although either solution by themselves should show obvious improvement).<!--QuoteEnd--></div><!--QuoteEEnd-->

    exactly, didn't have the energy to write such a long explenation :)

    edit: what i also wanted to say is that a octtree is probably overkill for a game like ns2 that plays mostly in 2d space, so a quadtree would probably be better suited. But if the spark engine wants to be more general, a octtree is probably the way to go.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    edited April 2011
    <!--quoteo(post=1840164:date=Apr 7 2011, 11:00 AM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 11:00 AM) <a href="index.php?act=findpost&pid=1840164"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->For those not familiar, consider trying to collision check against every triangle in the map every frame. No way. So what they do is build a big cube around the entire map, assign all geometry to it and if there are more than X triangles (might have to play with the value to optimize) then the cube gets divided down into 8 smaller children and all geometry of the larger cube divided between them. This process keeps going until you have the map partitioned into the largest cubes possible holding no more than X triangles per cube. The process is automated and likely part of 'pre-processing world'.<!--QuoteEnd--></div><!--QuoteEEnd-->

    Here you do something fairly expensive(traversing the octree) to avoid doing something <b>extremely expensive</b>, brute forcing testing every object's bounding primitive(usually an AABB or OBB) for collision with the player's bounding primitive. It's even worse if you have a polygon soup of world polies. If you're using OBBs and you have 10000 static models, 30 ticks per second, 16 players you will perform at least 1 attempt to find a separating axis on each one and at most 15(this is rare, most objects are separated from the player by a great distance); you will be performing at least 16*30*10000 = 4.8 million tests per second, each test having a couple of dot products and an if statement.

    Insertion sort, which is a terrible, terrible, terrible O(n^2) sorting algorithm is much faster than heap sort, merge sort or any other kind of O(nlogn)-sort when you're only sorting 10 items.

    <!--quoteo(post=1840164:date=Apr 7 2011, 11:00 AM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 11:00 AM) <a href="index.php?act=findpost&pid=1840164"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->What Asraniel was getting at (I think) is that if entities had their actor indices assigned to octree nodes as they traversed the map then a similar query can be made with a larger radius and all actors occupying octree nodes within a certain range will be returned instead of all actors in the map.<!--QuoteEnd--></div><!--QuoteEEnd-->

    A plain array of all GameScriptActors in a level(a few hundred) has cache locality, it is _very_ cheap to maintain and traverse.

    To get a snug fit around the bounding sphere of GameScriptActors that are within range of the hydra/turret you will probably need to visit atleast ten nodes. And each GameScriptActor still needs to be range-checked and checked for being a GameScriptActor that we want to shoot at as before(if an object belongs to a node which touches the surface of the sphere the object is not necessarily inside the sphere).

    An octree needs to be repopulated each frame because players, whips, macs aren't static. Unless great care is taken it involves many small heap-allocated chunks of memory with little or no cache locality; this makes traversing the octree quite expensive. The octree is a much larger data structure than a simple array, this will pollute the cache; it takes 1-3 clocks to retrieve something from the tiny L1 cache, it takes 10-20 clocks to retrieve something from the larger L2 cache and it takes several hundred clocks if you have to go out to RAM to get something. If you have to go out to memory to get the simple array of GameScriptActors the linear access pattern will cause the CPU to predict what data you need next and precache it; whereas the traversing the octree has a more "pointer-chasing" kind of access pattern unless you are very careful; this is slow.

    I'm quite confident that an octree will have no benefit whatsoever compared to implementing a version of GetScriptActors that traverses the list and culling the undesired GameScriptActors before returning to LUA.

    ETA: missing decimal point
  • LazerLazer Join Date: 2003-03-11 Member: 14406Members, Contributor, Constellation, NS2 Playtester
    edited April 2011
    Just would like to add traversing an octree is not checking locality to every node in it. You start from the top node, find which of 8 you are in and repeat a couple times until you arrive at the node(s) you are contained in. It is actually a very quick search. Agreed maintaining the octree would require repopulating, but strictly with non-static actors (and of those only the ones that have moved since the previous frame). I could see where this may be overkill with just a few non-static actors, but if the numbers increase, it can help a lot (easy traversal to local nodes and then you will only be checking against necessary actors instead of cycling through them all). If part of the initial octree generation gives every octree node a list of all other octree nodes within a few different radii, then the octree traversal becomes even simpler and only requires finding the most local node and using its list of other local nodes instead of traversing all nodes within a radius and recreating the list every check.
  • AsranielAsraniel Join Date: 2002-06-03 Member: 724Members, Playtest Lead, Forum Moderators, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Shadow, Subnautica Playtester, Retired Community Developer
    i love tech talk :)

    indeed, it all depends on the number of actors and the number of movable actors.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    If I read the first post correctly it looks like a ray is traced to every GameScriptActor that is a possible target and is within targeting distance. Tracing rays sounds quite expensive, even though it's not implemented in LUA. It would be much cheaper if the targeting function can be simplified so that you don't have to trace quite as many rays.

    What you can do is to sort targets in order of juicyness before you start firing rays; as soon as you find a target that is in line of sight, you can ignore the remaining, lower priority targets.

    Try players first, sorted by distance; if none are in LoS, try MACs and ARCs sorted by distance; if none are in LoS attack the nearest enemy building(only need to fire one ray; since it's static the result can be precached).
  • FehaFeha Join Date: 2006-11-16 Member: 58633Members
    When it comes to target juiciness, I think that the distance is what would make most sense to new people, aswell as make tactics like cannonfodder possible (drifters and fades being sentry targets, while lerks can snipe in peace).
    Maybe make it persist a target until it is out of LoS or range, not sure about that one tho.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    edited April 2011
    <!--quoteo(post=1840203:date=Apr 7 2011, 05:21 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 05:21 PM) <a href="index.php?act=findpost&pid=1840203"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Just would like to add traversing an octree is not checking locality to every node in it. You start from the top node, find which of 8 you are in and repeat a couple times until you arrive at the node(s) you are contained in.<!--QuoteEnd--></div><!--QuoteEEnd-->

    All targets that are within range are inside a bounding sphere. This bounding sphere is large(firing distance is large).

    If you were to traverse the octree until you find the smallest node which fully contains the bounding sphere and grab all the GameScriptActors attached to that node there are a few problems.

    Often the sphere will stradle two nodes with sides much longer than the diameter of the sphere; the smallest node that fully contains the bounding sphere can be the root node if you are unlucky. Here's a quick and dirty sketch of the same problem in a quadtree, which is easier to draw:

    <img src="http://img543.imageshack.us/img543/9646/quadtree1.gif" border="0" class="linked-image" />

    Here the smallest node fully encompassing the bounding circle is the root node.

    No, you need to push down to a finer granularity, so that the bounding sphere is fully encompased in perhaps 10 nodes(less in a quadtree, since there are only two axes) or so:

    <img src="http://img33.imageshack.us/img33/6513/quadtree2.gif" border="0" class="linked-image" />

    Now you're traversing down multiple branches of the quadtree to achieve reasonable snugness around the bounding circle.

    It takes serious care to make sure that the octree is in one contiguous block of memory and avoid frequent dynamic memory allocation(expensive!) when lists shrink or expand. A reasonable approach may be a list that is realocated with double the memory you call insert on it while full and reallocated with half-size when you call delete on it when it is a quarter full.

    Another problem is that other entities may stradle nodes of the quad tree as well. So at any given level, the same GameScriptActor may be encompassed by several nodes. These duplicates would have to be stripped; this is most easily done by sorting the list of chosen GameScriptActors, so that all duplicates are adjacent, then traversing the list linearly and picking the unique ones.

    <!--quoteo(post=1840203:date=Apr 7 2011, 05:21 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 05:21 PM) <a href="index.php?act=findpost&pid=1840203"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->It is actually a very quick search.<!--QuoteEnd--></div><!--QuoteEEnd-->

    Is it as quick as going linearly through a an array of a few hundred items and doing a bitwise and, an integer compare, 3 FP mul, 2 FP add and 1 FP compare on each one?

    <!--quoteo(post=1840203:date=Apr 7 2011, 05:21 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 05:21 PM) <a href="index.php?act=findpost&pid=1840203"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Agreed maintaining the octree would require repopulating, but strictly with non-static actors (and of those only the ones that have moved since the previous frame).<!--QuoteEnd--></div><!--QuoteEEnd-->

    Repopulating the tree is not entirely trivial. To avoid new[] and delete[](expensive!) the easiest solution that springs to mind is the exact same process in which things are added to the octree, but with last frame's value, and removing from instead of adding to lists. After you've backed those out you have to put them back in with the new coordinates.

    <!--quoteo(post=1840203:date=Apr 7 2011, 05:21 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 05:21 PM) <a href="index.php?act=findpost&pid=1840203"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->I could see where this may be overkill with just a few non-static actors, but if the numbers increase, it can help a lot (easy traversal to local nodes and then you will only be checking against necessary actors instead of cycling through them all). If part of the initial octree generation gives every octree node a list of all other octree nodes within a few different radii, then the octree traversal becomes even simpler and only requires finding the most local node and using its list of other local nodes instead of traversing all nodes within a radius and recreating the list every check.<!--QuoteEnd--></div><!--QuoteEEnd-->

    If hundreds of hydras and turrets is a frequent occurence you've got bigger problems.

    Those list are extra data. It may pollute the cache, slowing the rest of the programme down more than it speeds up the octree traversal.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    edited April 2011
    There is a a much more primitive algorithm for finding objects within range that requires no additional memory and just a little bit of sorting.

    What you do is that you sort an array of GameScriptActors by their x-coordinate; ignoring y and z.

    When you traverse the list you find the first GameScriptActor with an x-coordinate greater than the x coordinate of the hydra minus the max firing distance(the smallest x-coordinate a GameScriptActor could have and still be in range). You do this by testing a GameScriptActor in the middle of the array; if its x-coordinate is larger than xmin you go one quarter of the array size downwards, if it is too small you go one quarter of the array size upward. Next step you do 1/8th of the array size, then 1/16th until the step size is one and you've homed in on the right value; this kind of searching is O(log n). When you've found your match you traverse the sorted list linearly until you encounter a GameScriptActor with an x-coordinate greater than Xmax.
  • CoolCookieCooksCoolCookieCooks Pretty Girl Join Date: 2003-05-18 Member: 16446Members, NS1 Playtester, Contributor, Constellation
    What.

    I felt clever for understanding this thread, then I got to these last posts and my brain melted.
  • LazerLazer Join Date: 2003-03-11 Member: 14406Members, Contributor, Constellation, NS2 Playtester
    edited April 2011
    <!--quoteo(post=1840208:date=Apr 7 2011, 04:26 PM:name=Soylent_green)--><div class='quotetop'>QUOTE (Soylent_green @ Apr 7 2011, 04:26 PM) <a href="index.php?act=findpost&pid=1840208"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->If hundreds of hydras and turrets is a frequent occurence you've got bigger problems.

    Those list are extra data. It may pollute the cache, slowing the rest of the programme down more than it speeds up the octree traversal.<!--QuoteEnd--></div><!--QuoteEEnd-->
    The goal is to handle as many turrets and hydras as possible without slowdown right? I guess we've got bigger problems by definition.

    Adding a list of nearby octree indices to each octree leaf node is not an incredible amount of data to store. It also eliminates the need to check all of the nodes that fall within a bounding sphere. Instead you simply run a point check to retrieve the most local node, and then use that node's precomputed list to find all actors contained nearby in the nodes from that list. This IS less computations to find a trimmed list of actors (specifically for the more populated scenarios we are trying to get the engine to handle better), although does require precomputation to make such lists. Clearly checking against the entire octree for every actor every frame is not going to work any better and that is not what I was suggesting.
  • Soylent_greenSoylent_green Join Date: 2002-12-20 Member: 11220Members, Reinforced - Shadow
    edited April 2011
    <!--quoteo(post=1840213:date=Apr 7 2011, 06:44 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 06:44 PM) <a href="index.php?act=findpost&pid=1840213"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->The goal is to handle as many turrets and hydras as possible without slowdown right? I guess we've got bigger problems by definition.<!--QuoteEnd--></div><!--QuoteEEnd-->

    No. The goal is to handle a reasonable number of turrets and hydras as fast as possible(say, 50 of each). Optimizing for 1000 hydras and turrets at the expense of being slower at handling 100 is not a good trade off.

    <!--quoteo(post=1840213:date=Apr 7 2011, 06:44 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 06:44 PM) <a href="index.php?act=findpost&pid=1840213"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->Adding a list of nearby octree indices to each octree leaf node is not an incredible amount of data to store.<!--QuoteEnd--></div><!--QuoteEEnd-->

    On intel processors L1 cache is 8 kB data, 8 kB code. On AMD it is 32 kb data, 32 kb code, with access latency being a bit higher.

    <!--quoteo(post=1840213:date=Apr 7 2011, 06:44 PM:name=Lazer)--><div class='quotetop'>QUOTE (Lazer @ Apr 7 2011, 06:44 PM) <a href="index.php?act=findpost&pid=1840213"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->It also eliminates the need to check all of the nodes that fall within a bounding sphere. Instead you simply run a point check to retrieve the most local node, and then use that node's list to find all actors contained nearby in the nodes from that list. This IS less computations to find a trimmed list of actors (specifically for the more populated scenarios we are trying to get the engine to handle better), although does require precomputation to make such lists.<!--QuoteEnd--></div><!--QuoteEEnd-->

    The goal isn't to minimize computation, it's to minimize the time taken. Minimizing time is mostly about being smart with memory; fitting inside caches, having high spacial and temporal coherency, streaming from memory to take advantage of precaching etc. It is not infrequent that it would be worth doing 50 floating point operations to avoid 1 memory access.
  • LazerLazer Join Date: 2003-03-11 Member: 14406Members, Contributor, Constellation, NS2 Playtester
    Cool I understand all that, reminds me of my CIS273 class from a couple years back. The point was that although an actor list can be cached, a smaller actor list can be achieved otherwise and therefore requires less positional and los checking once we have gotten the list of actors. Not to mention static actors are even better candidates into the octree as they are practically level geometry until they are destroyed. Regardless I'm more curious about hearing the 'other planned solution' that is going in place.
  • spellman23spellman23 NS1 Theorycraft Expert Join Date: 2007-05-17 Member: 60920Members
    <!--quoteo(post=1840215:date=Apr 7 2011, 04:00 PM:name=Soylent_green)--><div class='quotetop'>QUOTE (Soylent_green @ Apr 7 2011, 04:00 PM) <a href="index.php?act=findpost&pid=1840215"><{POST_SNAPBACK}></a></div><div class='quotemain'><!--quotec-->The goal isn't to minimize computation, it's to minimize the time taken. Minimizing time is mostly about being smart with memory; fitting inside caches, having high spacial and temporal coherency, streaming from memory to take advantage of precaching etc. It is not infrequent that it would be worth doing 50 floating point operations to avoid 1 memory access.<!--QuoteEnd--></div><!--QuoteEEnd-->

    ding!

    Hate those unknown memory access latencies. Screws with my ILP something fierce.
Sign In or Register to comment.