The Big Flaw of balancing games with 50/50 winrate

Comments

  • IronHorse Developer, QA Manager, Technical Support & contributor Join Date: 2010-05-08 Member: 71669 Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Subnautica Playtester, Subnautica PT Lead, Pistachionauts
    Handschuh wrote: »
    From what I've read for months is that somehow "50/50" is being the holy grail for balancing in ns2.

    Where the hell have you been reading this?

    Winrate by round time is at least a bit more indicative because it can highlight the ebb and flow and show you things overall winrates can't, such as tech timing issues, but it's still just a general pointer and a guideline at best.
    Overall winrates mean nothing, and this has been said and discussed for about 5 years now?
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    The urban myth that UWE balances only on win rates using overwatch mechanics.
  • coolitic Right behind you Join Date: 2013-04-02 Member: 184609 Members
    Nordic wrote: »
    The urban myth that UWE balances only on win rates using overwatch mechanics.

    Cringes at the mention of Overwatch
  • Aeglos Join Date: 2010-04-06 Member: 71189 Members
    It's not that hard. There is some conflation of server team balance and faction team balance.
    IronHorse wrote: »
    Handschuh wrote: »
    From what I've read for months is that somehow "50/50" is being the holy grail for balancing in ns2.

    Where the hell have you been reading this?

    Winrate by round time is at least a bit more indicative because it can highlight the ebb and flow and show you things overall winrates can't, such as tech timing issues, but it's still just a general pointer and a guideline at best.
    Overall winrates mean nothing, and this has been said and discussed for about 5 years now?

    Funny how the team celebrates the 50/50 win rate on release then. :trollface:
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited November 2017
    I think that sounds about right, but it also doesn't capture the full picture.
    Statistics never do by nature, and I still question the conditions I used to get those numbers. I am still open to suggestions. That was my best attempt at trying to estimate the amount.
    Nordic wrote: »
    Overall it seems that very few desperate rushes are successful, tunnel or otherwise. If we trust the conditions I have listed, then we can assume that tunnel rushes cause an unexpected alien win in less than 2% of games. I am pretty surprised by this myself. Less than 2% of games seems much lower than I expected. The conditions make sense though, but that doesn't mean they are right. What do you guys think of the conditions used?

  • NousWanderer Join Date: 2010-05-07 Member: 71646 Members
    edited November 2017
    Nordic wrote: »
    I think that sounds about right, but it also doesn't capture the full picture.
    Statistics never do by nature, and I still question the conditions I used to get those numbers. I am still open to suggestions. That was my best attempt at trying to estimate the amount.
    Nordic wrote: »
    Overall it seems that very few desperate rushes are successful, tunnel or otherwise. If we trust the conditions I have listed, then we can assume that tunnel rushes cause an unexpected alien win in less than 2% of games. I am pretty surprised by this myself. Less than 2% of games seems much lower than I expected. The conditions make sense though, but that doesn't mean they are right. What do you guys think of the conditions used?
    I don't have specific suggestions for you, but, again, here's the way I would encourage you to think about the problem.

    1. Aliens can have 2 hives, 3 harvesters, and be in an almost-as-bad-but-not-quite-as-bad situation as they would be on 1 hive, 3 harvesters. In either situation, the marine team's perceived and/or real advantage can still be eliminated with a single successful tunnel rush. In either situation, the marine team's perceived and/or real advantage is the cause for frustration when a rush results in a win in these cases.
    2. Most rushes do fail. But only one rush has to succeed in any given round where the marines haven't established a second base as an insurance policy. And marines simply don't develop second bases where players can spawn, phase from, or be beaconed to in a large number of games. This is the result of game design and a lack of positive incentive to do so.

    So the frustration from sneaky tunnel rushes does not come from a subconscious awareness we all have about some global alien tunnel rush success rate under certain conditions. The frustration comes from the fragility baked into the marine team's tech progression / team design. Marine teams can be dominant and winning in every visible way except for setting up a second base, and still fall prey to a game-winning sneaky tunnel rush. Responses which demand that marine teams simply set up a second base miss the point: the res investment is large, the gains are small, and the investment is almost always needed elsewhere during the phase of the game where it matters most (meds, upgrades, etc.).

    Preventing every single tunnel rush against a dedicated alien team [in servers with high player counts] requires coordinated lane control, multiple static exos on the field (lame), a very cautious commander willing to scan repeatedly, or intelligent marine players who check naturals / patrol lanes or zones. Oftentimes, those same intelligent marines would be more optimally used on the front lines. Nevertheless, this is non-optional because the marine base has multiple points of fragility (destroy the CC outright = win, destroy the power node = probable win, destroy the obs + main phase = probable win) and it takes only one successful tunnel rush to win. It's a constant psychological 'check' played against the marine team unless they commit to that insurance policy.
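    The "it takes only one successful tunnel rush" point can be put in rough numbers. What follows is a minimal sketch, assuming every rush attempt succeeds independently with the same probability. The 5% per-rush chance and the attempt counts are made-up illustrative values, not measurements from any server.

```python
def chance_of_at_least_one_success(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` rushes succeeds.

    Assumes (hypothetically) that attempts are independent and share the
    same per-attempt success probability `p_single`.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # Complement rule: 1 minus the chance that every single attempt fails.
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    # Illustrative only: a 5% per-rush success chance is an assumed value.
    for attempts in (1, 3, 5, 10):
        chance = chance_of_at_least_one_success(0.05, attempts)
        print(f"{attempts:2d} attempts -> {chance:.1%} chance of losing to a rush")
```

    Under those assumptions, even a rush that fails 95% of the time becomes a serious threat once aliens can attempt it repeatedly in the same round, which is why a low global success rate and a high level of frustration can coexist.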

    I'll add that the current "limit aliens to 2 tech points, get exos, roll the map" meta does limit this problem somewhat, because a steady stream of exos pushing out of marine start via both natural lanes thwarts a majority of (if not all) tunnel rushes before they can be established along the way. But that's solving one game mechanic's frustration points by introducing an entirely new set of them.
  • moultano Creator of ns_shiva. Join Date: 2002-12-14 Member: 10806 Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
    edited November 2017
    @Nordic, I don't know the resolution of your data. The pattern that I see in general is "marines are generally winning. Some combination of: bad laneblocking, failing to spot the gorge, vent cheese, slow beacon. Marines lose their most important forward base or main. Aliens win."

    If you could interrogate the data deeply, I'd say something like, "marines are up by 2 res nodes for the first 10 minutes of the game, but then lose."
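    As a rough sketch of what that condition could look like as a query: the code below assumes the data holds one resource tower count per team for every minute of a round, which may well be finer than what is actually recorded. The `Round` type and its field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Round:
    """Hypothetical per-round record; not an actual wonitor or hive schema."""
    winner: str                            # "marines" or "aliens"
    rts_by_minute: List[Tuple[int, int]]   # (marine_rts, alien_rts), one entry per minute


def marines_led_then_lost(rnd: Round, lead: int = 2, minutes: int = 10) -> bool:
    """True if marines held at least `lead` more RTs for the first `minutes` minutes, yet aliens won."""
    if rnd.winner != "aliens":
        return False
    early = rnd.rts_by_minute[:minutes]
    if len(early) < minutes:
        # The round ended before the window closed, so the condition cannot apply.
        return False
    return all(marine_rts - alien_rts >= lead for marine_rts, alien_rts in early)


if __name__ == "__main__":
    sample = Round(winner="aliens", rts_by_minute=[(5, 3)] * 10 + [(4, 4)] * 5)
    print(marines_led_then_lost(sample))
```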
  • Kouji_San Sr. Hινε Uρкεερεг - EUPT Deputy The Netherlands Join Date: 2003-05-13 Member: 16271 Members, NS2 Playtester, Squad Five Blue
    edited November 2017
    In a superficial statement, completely balanced games can actually be quite bland and boring after a while...

    NS2 is definitely guilty of having a lot of stuff that serves close to no purpose or was just added as fluff. And of course, like a lot of games, it is guilty of the "perfect build/techtree" phenomenon.



    Here's one for ya... I've played my most interesting rounds on maps in BF1942 (Forgotten Hope) where one team started out with a disadvantage (also enforced by flag ticket bleed). But it was still possible to outflank and get into areas to take over flags, getting access to slightly better equipment and better spawnpoints, and perhaps a timer for better stuff to spawn in after a while. As a result, it was a battle of attrition, which at first glance looked completely unbalanced in favor of the defending team. That was the very same team which actually had to hold on for dear life, because losing a base was permanent and could potentially cause a snowball effect. The interesting part came from the fact that rounds on these kinds of maps were always completely different. No two rounds were the same in essence, and even losing was fun. It all depended on which bases you could conquer and what kind of support you received.

    Adding a bit of randomness into your life makes things interesting.

    The most famous scenarios where this "push map" system was implemented were obviously the D-Day landings (Gold, Juno, Omaha, Utah, Sword). Some personal favorites of mine were Arnhem, which was a reverse pushmap with the Germans starting out with a big Panzer Gruppe, but set up in such a way that tanks weren't as effective, so the Brits did have a chance, and Foy, a push on a town from the foggy forest.




    To draw a comparison to NS2, Last Stand was basically a simplified version of this. Marines have the advantage, with a ticket bleed (a timer for aliens, who then ramp up their power based on that timer). However, some people came in and "nerfed" certain things, against most advice given, just to "balance" the game, without actually understanding what the mod was all about. As a result they took quite a lot of charm and the whole desperation vibe away from the game in the name of balance. Now that mod has turned into an empty and boring shell of what the buggy early version was: sure, quite a bit more polished, but also with a lot less of the character that made it stand out. It could've been a great test bed for NS2 vanilla mode to try things out, but instead it was made to be more balanced and boring :D I see the same kinda happened to the one trick pony teams in NS2, one tech tree to rule em all.

    Hmm, I think Charlie said it best. Paraphrasing here:
    "Tear down a game to its bare components, to the point that it is barebones. What's left is the core essence of the game, from that you can build it up to what it needs to be and definitely avoid adding fluff."



    This is just a quick jab at the "golden balance" religion that most game developers strive for. It isn't always in the best interest of the game's core and essence. A game needs hard counters, but not cheap kills. A game needs newbies to still be able to kill, but that shouldn't be easy for godlike players to exploit. Balance shouldn't be the ultimate end goal; games are much more interesting and fun when there are some unbalanced bits, which are themselves balanced out by counters.

    You should always avoid the repetitive nature of the one true techtree, which will eventually get quite boring after some time.


    NS2 has probably developed way beyond this point though and is now quite a bit too complex to simply twiddle with the knobs, the total lack of randomness in this game makes me think NS/NS2 players are control freaks based on the mandatory techtree choices and the "don't change mah game bruh" drama we've come to love and hate :D
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    moultano wrote: »
    Nordic, I don't know the resolution of your data. The pattern that I see in general is "marines are generally winning. Some combination of: bad laneblocking, failing to spot the gorge, vent cheese, slow beacon. Marines lose their most important forward base or main. Aliens win."

    If you could interrogate the data deeply, I'd say something like, "marines are up by 2 res nodes for the first 10 minutes of the game, but then lose."

    Between October 12th, 2016 and December 11th, 2016 there were 31,026 games played in total. I know that because that is the number of games recorded by hive in that period, and hive records all games. The stats in question were produced in January 2016 with data I had up to that point, which includes data before and after the period I have referenced. I chose the dates between October 12th, 2016 and December 11th, 2016 because I have hive data for that date range, so I can compare the total games recorded by wonitor against the games recorded by hive.

    My wonitor data is aggregated from several NS2 servers. Between October 12th, 2016 and December 11th, 2016 wonitor recorded 9,169 games. That means wonitor recorded about 29% of all games played during that period. Please note that >50% of players are rookies, and that >50% of games played are likely on rookie only servers. Most rookie only servers do not record wonitor data. 29% is not as low as it might seem, because that 29% comes from the cohort of games we want. It is more like 29/50, or about 58% of all games that would be valid for the query. This is an assumption that I cannot back up, but I think it is a valid assumption.
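    The coverage arithmetic above can be checked with a short sketch. The two game counts are the ones quoted in this post. The 50% share of games on non-rookie servers is the unverified assumption already mentioned, and the variable names are only illustrative.

```python
HIVE_GAMES = 31_026      # all games recorded by hive, Oct 12 to Dec 11, 2016
WONITOR_GAMES = 9_169    # games recorded by wonitor in the same window

# Assumption (not backed by data): about half of all games are played on
# servers that are not rookie only, which is the cohort the query targets.
ASSUMED_ELIGIBLE_SHARE = 0.50

raw_coverage = WONITOR_GAMES / HIVE_GAMES
adjusted_coverage = raw_coverage / ASSUMED_ELIGIBLE_SHARE

if __name__ == "__main__":
    print(f"raw coverage:      {raw_coverage:.1%}")
    print(f"adjusted coverage: {adjusted_coverage:.1%}")
```

    Working from the unrounded ratio, the raw coverage comes out slightly above the 29% quoted and the adjusted figure slightly above 58%, so the rounded numbers in the post are, if anything, on the conservative side.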

    The following servers had recorded all, or nearly all, of their games from that period. There were 20 individual server names recorded, but I cut out what looked like duplicates of the same server, leaving 10 servers. There is a rookie only server in this list, but its data was not included in the query because I removed all rookie only games.
    ENSL.org | DE Comp #1 | Palaven
    ENSL.org | SE Comp #1 | by Thirsty Onos
    Survival of the Fattest
    Survival of the Fattest #RookieOnly
    TacticalGamer.com - 8v8 - 6 Spec
    The Thirsty Onos # 18
    The Thirsty Onos # 22
    The Thirsty Onos # Rookies only
    Wooza's Hamster Wheel
    Wooza's Mod Box (Vanilla + Custom Modes)

    The statistics I referenced were created from data in the fall of 2016, which could have more or less than 29% of games recorded. I restricted the data further by limiting it to games not on rookie only servers and with player counts between 12 and 24. That restriction left 7,406 games, or about 23% of all games played, or about 81% of all games recorded by wonitor in that period.
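    For anyone who wants to reproduce the shares of the restricted sample, here is a minimal sketch using only the counts quoted in this post. The constant names are illustrative.

```python
HIVE_GAMES = 31_026      # all games in the window, per hive
WONITOR_GAMES = 9_169    # games recorded by wonitor in the window
FILTERED_GAMES = 7_406   # wonitor games left after dropping rookie only servers
                         # and keeping player counts between 12 and 24

share_of_all_games = FILTERED_GAMES / HIVE_GAMES
share_of_wonitor_games = FILTERED_GAMES / WONITOR_GAMES

if __name__ == "__main__":
    print(f"share of all games played:   {share_of_all_games:.1%}")
    print(f"share of wonitor recordings: {share_of_wonitor_games:.1%}")
```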

    More servers record games using wonitor now, so the percentage of games recorded has improved in more recent months. For example, the TA servers now record wonitor data, and have produced the most data even though they have been recording for the least amount of time.

    I hope that answers your question regarding the data resolution. As quoted earlier, here are the conditions I used during that time to produce those statistics.
    Nordic wrote: »
    Given the data sources I have, Nintendows and I tried to figure out a rough estimate.

    Using wonitor data there are 48,408 non-rookie only games between 12-24 players. 24,834 of those games were alien wins. Aliens won 51.3% of games in this sample.
    Aliens hold 3.5 resource towers on average. Marines hold 4.5 resource towers on average. There were 403 alien wins in games ending with aliens on 1 hive, aliens having <4 RT's, marines having >=4 RT's, and were over 10 minutes long.
    403/24,834 = 2%
    An estimated less than 2% of alien wins ended in a desperate rush going by wonitor data

    Using sponitor data there are 21,287 non-rookie only games between 12-24 players. 11,136 of those games were alien wins. Aliens won 52.3% of games in this sample.
    There were 332 alien wins ending with marines having >=50% more res than aliens, and were over 10 minutes long.
    332/11,136 = 3%
    An estimated less than 3% of alien wins ended in a desperate rush going by sponitor data.

    I limited to games over 10 minutes, because bilebomb typically comes out between 7 and 9.5 minutes.

    Overall it seems that very few desperate rushes are successful, tunnel or otherwise. If we trust the conditions I have listed, then we can assume that tunnel rushes cause an unexpected alien win in less than 2% of games.

    I am pretty surprised by this myself. Less than 2% of games seems much lower than I expected. The conditions make sense though, but that doesn't mean they are right. What do you guys think of the conditions used?
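    To make the conditions easier to critique, here is a minimal sketch of the wonitor filter together with the resulting shares. The `GameEnd` record and its field names are hypothetical stand-ins, not the actual wonitor schema. The counts are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass
class GameEnd:
    """Hypothetical end-of-round snapshot; field names are assumptions."""
    winner: str            # "aliens" or "marines"
    length_minutes: float
    alien_hives: int
    alien_rts: int
    marine_rts: int


def is_desperate_rush_win(game: GameEnd) -> bool:
    """Wonitor conditions from the post: alien win, 1 hive, <4 alien RTs, >=4 marine RTs, over 10 minutes."""
    return (
        game.winner == "aliens"
        and game.length_minutes > 10
        and game.alien_hives == 1
        and game.alien_rts < 4
        and game.marine_rts >= 4
    )


# Counts quoted in the post for each data source.
SAMPLES = {
    "wonitor": {"games": 48_408, "alien_wins": 24_834, "desperate_wins": 403},
    "sponitor": {"games": 21_287, "alien_wins": 11_136, "desperate_wins": 332},
}


def shares(sample: dict) -> dict:
    """Alien win rate, plus desperate wins as a share of alien wins and of all games."""
    return {
        "alien_win_rate": sample["alien_wins"] / sample["games"],
        "of_alien_wins": sample["desperate_wins"] / sample["alien_wins"],
        "of_all_games": sample["desperate_wins"] / sample["games"],
    }


if __name__ == "__main__":
    for source, sample in SAMPLES.items():
        result = shares(sample)
        print(source, {name: f"{value:.2%}" for name, value in result.items()})
```

    Note that the percentages quoted above are shares of alien wins. Measured against all games in each sample the share is roughly half as large, so the "less than 2% of games" conclusion holds either way.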

    I should run this query again on more recent data to make it more relevant to this discussion, and because wonitor now records a higher percentage of all games played. I believe the statistics lack validity because of the conditions used, rather than the data quality. Those conditions are my best attempt at capturing games where desperate rushes would occur.