Forum Game: Who can hand balance better than shuffle?

NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
edited February 2016 in NS2 General Discussion
A while back I tried to beat the shuffle algorithm with hand-crafted teams based on hive skill only. I was unable to do so; shuffle still made better teams than mine. So I thought, why not make a little game out of this? Who can beat the shuffle?

How does this game work?
I will provide a list of 16 players. You will try to balance the teams by hand to the best of your ability. In about a week, more or less depending on interest, I will post the teams that shuffle would make. Shuffle tries to balance by minimizing the difference between the two teams' average skill and standard deviation, so I recommend you try to do the same. If you balance without using average skill or standard deviation, please state why you think the teams should be balanced that way.

If you have the ability to use a computer algorithm to balance these teams, please refrain from using it. This is a contest of hand crafted teams.

What do you win? Internet cookies and maybe an awesome.

For this exercise we are assuming that hive skill values are fairly accurate. In reality that is not always the case, but for the intents and purposes of this forum game we will assume so.

Try to balance these 16 players.
3094
2715
1887
1730
1691
1622
1432
1420
1281
1276
1202
1191
1130
1061
902
520

To help you, I made a little Google Sheet you can test your teams with. Please do not troll it, and delete your work afterwards so others can't use it.
https://docs.google.com/spreadsheets/d/1xaTg5QVkxVej3Dg_VwkL5EIb6jBEnRjqJUWIyDjZ0D0/edit?usp=sharing
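For anyone who would rather check a split offline, here is a minimal Python sketch of the two numbers this game is about: each team's average skill and standard deviation, plus the gaps between the teams. The helper names `team_stats` and `compare` are made up for illustration, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the sheet's exact formula is not stated here.

```python
from statistics import mean, pstdev

# The 16 hive skill values from the list above.
PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def team_stats(team):
    """Return (average, population standard deviation) for one team."""
    return mean(team), pstdev(team)

def compare(team_a, team_b):
    """Return the absolute gaps in average and in standard deviation."""
    avg_a, sd_a = team_stats(team_a)
    avg_b, sd_b = team_stats(team_b)
    return abs(avg_a - avg_b), abs(sd_a - sd_b)

# Deliberately lopsided split: top eight against bottom eight.
marines, aliens = PLAYERS[:8], PLAYERS[8:]
avg_gap, sd_gap = compare(marines, aliens)
print(f"average gap: {avg_gap:.1f}, standard deviation gap: {sd_gap:.1f}")
```

A good hand-made split should bring both gaps as close to zero as possible.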

Comments

  • NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited February 2016
    As an example of how to show your teams, I will post my worst attempt at balance.
    Marine
    3094
    2715
    1887
    1730
    1691
    1622
    1432
    1420
    Average: 1948.9 Standard Deviation: 577.7
    
    Alien
    1281
    1276
    1202
    1191
    1130
    1061
    902
    520
    Average: 1070.4 Standard Deviation: 238.1
    
  • AeglosAeglos Join Date: 2010-04-06 Member: 71189Members
    edited February 2016
    1. 3094 2715
    2. 1730 1887
    3. 1691 1622
    4. 1420 1432
    5. 1281 1276
    6. 1191 1202
    7. 1130 1061
    8. 520 902

    Edit - This is basically a 1-2-2-2 placing. With the exception of the first and last couple, the skills are virtually identical. It's really not that hard to balance. It gets harder with a greater variance in skill.
  • Person8880Person8880 Join Date: 2013-01-02 Member: 177167Members, Squad Five Blue
    Aeglos wrote: »
    1. 3094 2715
    2. 1730 1887
    3. 1691 1622
    4. 1420 1432
    5. 1281 1276
    6. 1191 1202
    7. 1130 1061
    8. 520 902

    Edit - This is basically a 1-2-2-2 placing. With the exception of the first and last couple, the skills are virtually identical. It's really not that hard to balance. It gets harder with a greater variance in skill.
    Spoiler alert: Shuffle comes up with a better composition than you have here. :)
  • IxianIxian Denmark Join Date: 2014-03-16 Member: 194783Members, Squad Five Blue
    Agreed, Aeglos. However, no method I've ever witnessed works with greater variance in skill. Elo in a small player base where people don't fight their peers usually turns out like what is happening here in NS2 (wrong discussion for this subject, I know). Aeglos' method (the captain method, the gather method, whatever you want to call it) might not be numerically balanced, but it comes a lot closer to fair. The pressure to perform then scales with Elo, which also seems more realistic than stacking that pressure or need to perform, as the current system often seems to do.

    The problem is still the same as it is now - what if one guy is 3k elo and the rest 900-1300 elo? The team with the 3k elo, will have a significant advantage - however, the pressure to perform isn't there. Which leaves the question - does the 3k guy wanna be a pub-stomper? A question the elo can have no say in.

    Hmm.... If it needs to be more numerically satisfying, the aforementioned method might work for the first 3-5 "picks", to distribute the pressure of performing, after which the rest are either put in random teams or distributed using the current system. The number of "picks" could be variable, continuing until the remaining players' standard deviation drops below a set threshold (could be a server-decided number?). This would only work if it's true that higher Elo means higher pressure to perform, and that the lower part actually has only a low impact.

    NB: By pressure of performing this is meant: You can not expect greenies to have +25% LMG accuracy marine side, whereas you can more reasonably expect it from high skilled competitive players. Therefore the pressure of performing has nothing to do with your skill level in relation to other players, but to do with you performing to your own skill level, which elo in an ideal world would be an indicator of.
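The hybrid idea in this post could be sketched roughly as follows. This is only one possible reading: the top of the list is drafted 1-2-2 style while the remaining pool is still spread out, and the rest is filled in at random. The function name `hybrid_teams`, the threshold of 300, and the random fill (standing in for "random teams") are all assumptions for illustration, not anything a server actually runs.

```python
import random
from statistics import pstdev

PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def hybrid_teams(skills, threshold, rng=None):
    """Draft from the top while the pool is spread out, then fill randomly."""
    rng = rng or random.Random()
    pool = sorted(skills, reverse=True)
    teams = ([], [])
    picker, picks = 0, 1  # first captain picks once, then two picks per turn
    while len(pool) > 1 and pstdev(pool) > threshold:
        for _ in range(min(picks, len(pool))):
            teams[picker].append(pool.pop(0))
        picker, picks = 1 - picker, 2
    rng.shuffle(pool)  # assumed stand-in for "put in random teams"
    for skill in pool:
        # Always top up the smaller team so the sizes end up even.
        smaller = 0 if len(teams[0]) <= len(teams[1]) else 1
        teams[smaller].append(skill)
    return teams

marines, aliens = hybrid_teams(PLAYERS, threshold=300, rng=random.Random(0))
print(marines)
print(aliens)
```

With this player list and a threshold of 300, the draft stops after five picks, which lands inside the 3-5 range suggested above. A different threshold would move that cut-off.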
  • dePARAdePARA Join Date: 2011-04-29 Member: 96321Members, Squad Five Blue
    A shuffle might try to balance teams by numbers (it's the only variable it has), but behind the numbers are humans.
    Is a 1900 player automatically better than a 1400?
    The system says yes, reality says no.
    A 2000 wooza player is far away from a 2000 gather player, for example.

    THAT'S why shuffle can't end in perfectly balanced rounds.
    Shuffle is good for breaking obvious stacks, that's all.

    You can play around with the numbers whichever way you like, but you can't predict the human factor.
  • AsranielAsraniel Join Date: 2002-06-03 Member: 724Members, Playtest Lead, Forum Moderators, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Shadow, Subnautica Playtester, Retired Community Developer
    I don't think the idea is to predict the exact outcome of a match. The idea of the team shuffle/balancing is to have, on average, more balanced games. That's all.
  • Person8880Person8880 Join Date: 2013-01-02 Member: 177167Members, Squad Five Blue
    @Ixian Not sure why you disagreed with me. I wasn't saying "shuffle's teams feel better", I mean shuffle literally produces statistically better teams than what he posted. Given I wrote the thing, I ran the numbers and got better results.

    That doesn't mean I claim there's a "real" balance from the teams it produces, but you can't expect a computer to be able to do more than number crunching.
  • NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited February 2016
    dePARA wrote: »
    A shuffle might try to balance teams by numbers (it's the only variable it has), but behind the numbers are humans.
    Is a 1900 player automatically better than a 1400?
    The system says yes, reality says no.
    A 2000 wooza player is far away from a 2000 gather player, for example.
    THAT'S why shuffle can't end in perfectly balanced rounds.
    Shuffle is good for breaking obvious stacks, that's all.
    You can play around with the numbers whichever way you like, but you can't predict the human factor.
    For this exercise we are assuming that hive skill values are fairly accurate. In reality that is not always the case, but for the intents and purposes of this forum game we will assume so.
    Aeglos wrote: »
    Edit - This is basically a 1-2-2-2 placing. With the exception of the first and last couple, the skills are virtually identical. It's really not that hard to balance. It gets harder with a greater variance in skill.
    I am really glad you balanced with 1-2-2-2-2 because we can then compare it with Shine's results. These are relatively easy teams to balance, which is why I chose this set of players. Hand balancing as you have done produces pretty good teams, but Shine does give an even better team composition.

    Shuffle balance will be statistically better, but for the second part of this thread I want to compare them qualitatively. @Ixian, when I do share the team composition that Shine gives, I expect you to give an explanation of why Aeglos's teams are better. You seem very adamant without even knowing the shuffle composition.

  • Soul_RiderSoul_Rider Mod Bean Join Date: 2004-06-19 Member: 29388Members, Constellation, Squad Five Blue
    I don't see the point of this 'game'. You want to test us against an algorithm, but won't let us use anything other than pen & paper, then you'll proceed to gloat because a computer algorithm trumps humans at maths...

    Oh my god, shocking surprise.

    Here's a game that people might actually be interested in:

    Write an algorithm that shuffles better than shuffle..
  • TheriusTherius Join Date: 2009-03-06 Member: 66642Members, Reinforced - Shadow, WC 2013 - Supporter
    The first thing that came to mind would indeed be the "captain draft" method, where the first team to pick a member picks once, followed by a series of two picks every round, and finally the team that picked first gets the last person left to pick. Now, assuming that the hypothetical captains have a perfect knowledge of the players' skills and that those skills are accurate, the person with the highest skill left will always be picked. This results in the following teams:
    Team 1
    
    3094
    1730
    1691
    1420
    1281
    1191
    1130
    520
    
    Average: 1507.1
    Standard Deviation: 744.2
    
    --------------------
    
    Team 2
    
    2715
    1887
    1622
    1432
    1276
    1202
    1061
    902
    
    Average: 1512.1
    Standard Deviation: 577.4
    

    As expected, the average skill values are almost exactly the same. However, the standard deviations are wildly different. This is due to the team picking first getting both the best individual player as well as the worst individual player. Thus, Team 1 has both extremes within its ranks, inevitably leading to larger standard deviation.
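The draft described above can be sketched in a few lines of Python. The function name `captain_draft` is made up for illustration. The sample standard deviation (`statistics.stdev`, dividing by n - 1) is used here because it reproduces the figures quoted in this post; that choice is an inference from the numbers, not something the post states.

```python
from statistics import mean, stdev

PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def captain_draft(skills):
    """1-2-2-...-2-1 draft where each captain takes the best players left."""
    pool = sorted(skills, reverse=True)
    teams = ([], [])
    picker = 0
    teams[picker].append(pool[0])  # the first captain picks once
    i = 1
    while i < len(pool):
        picker = 1 - picker
        teams[picker].extend(pool[i:i + 2])  # then two picks per turn
        i += 2  # the final slice holds the one player left over
    return teams

team_1, team_2 = captain_draft(PLAYERS)
for name, team in (("Team 1", team_1), ("Team 2", team_2)):
    print(f"{name}: average {mean(team):.1f}, standard deviation {stdev(team):.1f}")
```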

    I then hypothesised another way to create the teams. To avoid one team having both extremes, I started with Team 1 getting the best and the second-worst player, while Team 2 got the second-best and the worst player. Now, obviously Team 2 is worse off at this point, since both of its players are worse than the corresponding players in Team 1. To compensate for this, the rest of the player pool is divided into pairs starting from the top, and for every pair, the better player goes to Team 2. The teams become the following:
    Team 1
    
    3094
    1730
    1622
    1420
    1276
    1191
    1061
    902
    
    Average: 1537.0
    Standard Deviation: 686.6
    
    --------------------
    
    Team 2
    
    2715
    1887
    1691
    1432
    1281
    1202
    1130
    520
    
    Average: 1482.3
    Standard Deviation: 643.5
    

    The standard deviations are much closer to each other than in the first example. This is done at the expense of a larger difference in average skill, but the difference isn't huge.
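The second method can be sketched the same way. The function name `extremes_split` is hypothetical, and the sketch assumes an even number of players (at least four). As with the figures in this post, the sample standard deviation is used.

```python
from statistics import mean, stdev

PLAYERS = [3094, 2715, 1887, 1730, 1691, 1622, 1432, 1420,
           1281, 1276, 1202, 1191, 1130, 1061, 902, 520]

def extremes_split(skills):
    """Split the extremes first, then give Team 2 the better of each pair."""
    pool = sorted(skills, reverse=True)
    team_1 = [pool[0], pool[-2]]  # best and second-worst
    team_2 = [pool[1], pool[-1]]  # second-best and worst
    middle = pool[2:-2]
    for i in range(0, len(middle), 2):
        team_2.append(middle[i])  # better player of the pair
        if i + 1 < len(middle):
            team_1.append(middle[i + 1])
    return sorted(team_1, reverse=True), sorted(team_2, reverse=True)

team_1, team_2 = extremes_split(PLAYERS)
for name, team in (("Team 1", team_1), ("Team 2", team_2)):
    print(f"{name}: average {mean(team):.1f}, standard deviation {stdev(team):.1f}")
```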

    I have absolutely no theory behind any of these calculations, just two draft processes that struck my mind first. Out of these two options, I'd choose the latter purely on gut feeling, since I have no way to judge the trade-off between mean and stdev meaningfully. How shuffle optimises this trade-off, I don't know, but the trade-off is always there.

  • Soul_RiderSoul_Rider Mod Bean Join Date: 2004-06-19 Member: 29388Members, Constellation, Squad Five Blue
    Too many different skill threads, but I am going to post this here, Hive is meaningless until the stats are reset. Why? Everyone talks about the 1000 point rookies vs 0 point rookies, which is obviously a problem, but the real problem with the system is the 1000 starting point vets vs all the other 0 starting point players.

    The reason my score is crashing so hard is because I was an 1000 score player, as were most others. We are now playing against even experienced and better players with lower score than us, causing this balance to happen. The trouble is, most people 1000-1800, should in reality be 0-800, although, obviously, that is not entirely accurate, they must be roughly average to still be around the thousand mark.

    Realistically, 0-800 is not likely, and the range would probably be 400-1200 or so. The problem is, with all the new players scoring points from 0, older players losing are going to suffer huge points losses as the system tries to converge on itself. But with the small playerbase, and the separatism in play between pub and comp, essentially, this convergence is never going to happen.

    The only way to start getting meaningful information from data is to have accurate data. What we have from the current hive system is anything but that. That is without even mentioning the whitelisting issue.

    Essentially, all the hive analysis stuff is just practice in analysis, because the data you formulate is from an inaccurate representation. A total hive reset is the only way, with non-rookies' info somehow able to pass over to mark them as non-rookies so they don't need to do the tuts etc.
  • NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    I understand your point about hive skill not always being very accurate, and that we have players starting from different points. This is valid.

    I created this thread separate from the others to focus on shuffle itself, apart from the skill system. For the purpose of this thread I would like us to assume hive skill is fairly accurate.

    So let's keep the hive skill discussion in the other threads, at least for now.
  • simbasimba Join Date: 2012-05-06 Member: 151628Members
    NS2 has too many cooks.
  • SupaDupaNoodleSupaDupaNoodle Join Date: 2003-01-12 Member: 12232Members
    Sigh.

    The problem is not the shuffle or the Hive.

    It is the total lack of players.
  • CalegoCalego Join Date: 2013-01-24 Member: 181848Members, NS2 Map Tester
    Assuming Balance works (neither here nor there on this, because of the following):

    People don't like being told what to do, so they'll either leave, or switch teams > wrecking balance. Sure it takes into account where you are now for most of the players, but that one it switched to make the teams balanced might just decide that "fk it. I don't really want to play aliens, and I really should be doing <suchforth>." So they leave. There's literally nothing anything can do about this. In my experience most games turn into stomps only when people start leaving.

    Now, many people leave when the game's a stomp. So there's that too.

    Shuffle's not ideal, but that's not its fault, and it's better than an intentional stack, always has been, always will be.
  • FrozenFrozen New York, NY Join Date: 2010-07-02 Member: 72228Members, Constellation
    I was talking to a computer scientist about the Kentucky Derby last year at a bar. He was claiming that weather conditions and other factors about the horses known through experience have nothing to do with the outcome and that each horse has an equal chance, so what's the point of betting?

    I couldn't talk to him for very long.
  • xDragonxDragon Join Date: 2012-04-04 Member: 149948Members, NS2 Playtester, Squad Five Gold, NS2 Map Tester, Reinforced - Shadow
    Is there any shuffle currently that wouldn't consider these balanced teams?

    Team1:
    1000
    1000
    1000
    3000

    Team2:
    1500
    1500
    1500
    1500
  • Person8880Person8880 Join Date: 2013-01-02 Member: 177167Members, Squad Five Blue
    xDragon wrote: »
    Is there any shuffle currently that wouldn't consider these balanced teams?

    Team1:
    1000
    1000
    1000
    3000

    Team2:
    1500
    1500
    1500
    1500
    Those team sizes are a little too small for it to work well. It would also depend on whether the teams started like that, or whether the players were distributed differently before the shuffle occurred. If they start like that, it won't change them. If they don't, it'll find solutions that give a better standard deviation difference (i.e. not having an entire team of 1500) but that make the average difference worse.

    There's no real "good" way to balance those teams anyway, all the solutions end up sacrificing something, be it average difference or standard deviation. That's probably the only real issue in team sorting and it's not something that anything can solve without literally excluding the outliers from the game entirely.
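The trade-off is easy to see by listing every possible split of these eight players. The sketch below enumerates them and prints both gaps for each. It uses the population standard deviation as an assumption, and `distinct_splits` and `gaps` are names made up for illustration; this is a brute-force check, not how shuffle itself searches.

```python
from itertools import combinations
from statistics import mean, pstdev

PLAYERS = [1000, 1000, 1000, 3000, 1500, 1500, 1500, 1500]

def distinct_splits(skills):
    """Yield every distinct way to split the players into two equal teams."""
    seen = set()
    indices = range(len(skills))
    for chosen in combinations(indices, len(skills) // 2):
        team_a = tuple(sorted(skills[i] for i in chosen))
        team_b = tuple(sorted(skills[i] for i in indices if i not in chosen))
        key = tuple(sorted((team_a, team_b)))  # ignore which side is which
        if key not in seen:
            seen.add(key)
            yield key

def gaps(team_a, team_b):
    """Return (average gap, population standard deviation gap)."""
    return (abs(mean(team_a) - mean(team_b)),
            abs(pstdev(team_a) - pstdev(team_b)))

results = sorted((gaps(a, b), a, b) for a, b in distinct_splits(PLAYERS))
for (avg_gap, sd_gap), a, b in results:
    print(f"avg gap {avg_gap:6.1f}  sd gap {sd_gap:6.1f}  {a} vs {b}")
```

For these eight players there are only four distinct splits, and every step that narrows the standard deviation gap widens the average gap.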
  • NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited February 2016
    I guess I might as well show how shine would shuffle the teams in the OP.
    Marines:
    2715
    1887
    1730
    1691
    1432
    1202
    902
    520
    Average: 1509.875. Standard Deviation: 625.10267906561.
    
    Aliens:
    3094
    1622
    1420
    1281
    1276
    1191
    1130
    1061
    Average: 1509.375. Standard Deviation: 620.95248157568.
    
    In comparison, here was the 1-2-2-2 split.
    Aeglos wrote: »
    Marines:
    3094
    1730
    1691
    1420
    1281
    1191
    1130
    520
    Average: 1507.125 Standard Deviation: 744.211
    
    Aliens:
    2715
    1887
    1622
    1432
    1276
    1202
    1061
    902
    Average: 1512.125 Standard Deviation: 577.3871
    

    Edit - This is basically a 1-2-2-2 placing. With the exception of the first and last couple, the skills are virtually identical. It's really not that hard to balance. It gets harder with a greater variance in skill.

    Now I ask, is shuffle's team composition better or worse than the 1-2-2-2 split? Could shuffle's team composition be improved?
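One thing to watch when comparing the two splits: the standard deviations quoted above appear to follow two different conventions. Shine's figures match the population standard deviation (divide by n), while the 1-2-2-2 figures match the sample standard deviation (divide by n - 1). That is an inference from recomputing the numbers, not something stated in the thread. The sketch below recomputes both splits under both conventions so they can be compared like for like; the names `SHINE`, `DRAFT_1222` and `gaps` are made up for illustration.

```python
from statistics import mean, pstdev, stdev

# Team lists copied from the post above.
SHINE = (
    [2715, 1887, 1730, 1691, 1432, 1202, 902, 520],
    [3094, 1622, 1420, 1281, 1276, 1191, 1130, 1061],
)
DRAFT_1222 = (
    [3094, 1730, 1691, 1420, 1281, 1191, 1130, 520],
    [2715, 1887, 1622, 1432, 1276, 1202, 1061, 902],
)

def gaps(teams, deviation):
    """Return (average gap, standard deviation gap) for one convention."""
    a, b = teams
    return abs(mean(a) - mean(b)), abs(deviation(a) - deviation(b))

for label, deviation in (("population", pstdev), ("sample", stdev)):
    for name, teams in (("shine", SHINE), ("1-2-2-2", DRAFT_1222)):
        avg_gap, sd_gap = gaps(teams, deviation)
        print(f"{label:10} {name:8} avg gap {avg_gap:5.2f}  sd gap {sd_gap:7.2f}")
```

Under either convention Shine's split keeps both gaps smaller than the 1-2-2-2 split, so the conclusion does not change. The raw figures are only directly comparable once both use the same formula.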




    xDragon wrote: »
    Is there any shuffle currently that wouldn't consider these balanced teams?

    Team1:
    1000
    1000
    1000
    3000

    Team2:
    1500
    1500
    1500
    1500

    Even the example of "balanced" teams that I give I don't find that balanced. It is just the best team composition one can make based on hive skill alone.
  • develdevel Join Date: 2014-09-13 Member: 198444Members
    Can't shuffle without nicknames.
  • NordicNordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    devel wrote: »
    Can't shuffle without nicknames.

    Why not?
  • Kouji_SanKouji_San Sr. Hινε Uρкεερεг - EUPT Deputy The Netherlands Join Date: 2003-05-13 Member: 16271Members, NS2 Playtester, Squad Five Blue
    Ricky Jay could shuffle your Ace players... :trollface:
  • MendaspMendasp I touch maps in inappropriate places Valencia, Spain Join Date: 2002-07-05 Member: 884Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Gold, NS2 Map Tester, Reinforced - Shadow, WC 2013 - Shadow, Retired Community Developer
    I don't know why people argue so much about this topic, the basic concept is flawed because it's based around the idea that players have the same skill (or impact in a round) as aliens and marines, doesn't take into account how much/if they command, etc.

    PS: No "Hive 2.0 will address this" answers, please. As long as that's not released, it might as well not exist for these type of discussions.
  • TheriusTherius Join Date: 2009-03-06 Member: 66642Members, Reinforced - Shadow, WC 2013 - Supporter
    Most smart people are not saying that there are no problems with the skill system whatsoever. They're arguing against people saying that you can't measure skill with win/loss. These people will most likely continue their brainless crusade even after separate skill ratings for marines/aliens/commanders are introduced.
  • UncleCrunchUncleCrunch Mayonnaise land Join Date: 2005-02-16 Member: 41365Members, Reinforced - Onos
    ...and remember... there is no stacking.

  • AliteAlite Join Date: 2007-03-02 Member: 60188Members, Reinforced - Shadow
    Soul_Rider wrote: »
    Too many different skill threads, but I am going to post this here, Hive is meaningless until the stats are reset. Why? Everyone talks about the 1000 point rookies vs 0 point rookies, which is obviously a problem, but the real problem with the system is the 1000 starting point vets vs all the other 0 starting point players.

    The reason my score is crashing so hard is because I was an 1000 score player, as were most others. We are now playing against even experienced and better players with lower score than us, causing this balance to happen. The trouble is, most people 1000-1800, should in reality be 0-800, although, obviously, that is not entirely accurate, they must be roughly average to still be around the thousand mark.

    Realistically, 0-800 is not likely, and the range would probably be 400-1200 or so. The problem is, with all the new players scoring points from 0, older players losing are going to suffer huge points losses as the system tries to converge on itself. But with the small playerbase, and the separatism in play between pub and comp, essentially, this convergence is never going to happen.

    The only way to start getting meaningful information from data is to have accurate data. What we have from the current hive system is anything but that. That is without even mentioning the whitelisting issue.

    Essentially, all the hive analysis stuff is just practice in analysis, because the data you formulate is from an inaccurate representation. A total hive reset is the only way, with non-rookies' info somehow able to pass over to mark them as non-rookies so they don't need to do the tuts etc.

    I get that you think that people new to the system should start at the same skill rating, however I don't understand how some people starting at different points absolutely breaks all of the skill values.

    The whole premise of this system is that the more you play, the more your skill rating will go towards where it should be, hence new people starting at 0 vs 1000 shouldn't cause such a huge problem assuming they play enough games to get where they should be. It all balances out in the end...
  • SquishpokePOOPFACESquishpokePOOPFACE -21,248 posts (ignore below) Join Date: 2012-10-31 Member: 165262Members, Reinforced - Shadow
    Team one:
    3094
    2715
    1887
    1730
    1691
    1622
    1432
    1420

    Team two:
    1281
    1276
    1202
    1191
    1130
    1061
    902
    520

    At the end of round (should take about 5 minutes), swap teams.

    Marine wins: 1

    Alien wins: 1

    50% win percentage for both sides.

    Perfect balance.
  • sotanahtsotanaht Join Date: 2013-01-12 Member: 179215Members
    xDragon wrote: »
    Is there any shuffle currently that wouldn't consider these balanced teams?

    Team1:
    1000
    1000
    1000
    3000

    Team2:
    1500
    1500
    1500
    1500

    It's the best anyone or anything could do, but the 3000 is going to mop the floor with the 1500s.
  • SantaClawsSantaClaws Denmark Join Date: 2012-07-31 Member: 154491Members, Reinforced - Shadow
    xDragon wrote: »
    Is there any shuffle currently that wouldn't consider these balanced teams?

    Team1:
    1000
    1000
    1000
    3000

    Team2:
    1500
    1500
    1500
    1500
    This is how I would do it.

    1500 1500
    1000 1500
    1000 1500
    1000

    The server should have skill caps to prevent 3k players from joining in the first place. Skill segregation is the only way to achieve proper team balance.
  • NovoReiNovoRei US Join Date: 2014-11-18 Member: 199718Members
    edited February 2016
    First shuffle is to provide Aliens with a better team core. Marines have advantage at tails.
                                Marines   Aliens   Difference
    Players                    3094     2715
                               1730     1887
                               1432     1691
                               1420     1622
                               1276     1281
                               1191     1202
                               1061     1130
                                902      520
    Average Skill            1513.3   1506.0          7.3
    Standard Deviation        642.2    603.2         39.0
    

    Second shuffle is more traditional, trading each tail.
                                Marines   Aliens   Difference
    Players                    3094     2715
                               1730     1887
                               1691     1622
                               1432     1420
                               1281     1276
                               1202     1191
                               1130     1061
                                520      902
    Average Skill            1510.0   1509.3          0.8
    Standard Deviation        695.4    541.1        154.2
    

    I bet an Alien win on both by a thin margin.
    If the 520/902 on the 2nd shuffle were swapped, I would bet Marine win.