competitive ELO ratings

wey Cineastè Join Date: 2003-06-01 Member: 16910 Members, NS1 Playtester, Contributor, Constellation
I had a little fun with Python and glued together a quick script to calculate relative rankings between current clans using the ELO system. I mainly parsed results from the Exertus and archaea pages, so there are a lot of matches missing. Still, the current rankings are:

 #  team            ELO
 1  archaea      1140.6
 2  Exertus      1056.2
 3  SuperPaxBros 1047.4
 4  Duplex        991.9
 5  Team #156     986.5
 6  HBZ           981.1
 7  Skulkrush     980.2
 8  Mr.           980.1
 9  DarkSide      979.6
10  All-in        974.1
11  Mix           964.2
12  RwD           960.1
13  420           958.1
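
For readers curious how a table like this can be produced, here is a minimal sketch of an ELO pass over a list of results. It is not the script used for the rankings above: the starting rating of 1000, the K-constant of 32, the 400-point scale, and scoring a match as the share of rounds won are all assumptions, and `rate_matches` is a hypothetical helper name. The three sample results are taken from the Duplex match history posted further down in this thread and are only there to make the example run.

```python
from collections import defaultdict

START_RATING = 1000.0  # assumed starting ELO
K = 32.0               # assumed K-constant


def expected_score(rating_a, rating_b):
    """Standard ELO expectation for A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rate_matches(matches, start=START_RATING, k=K):
    """Apply results in order; each match is (team_a, team_b, rounds_a, rounds_b)."""
    ratings = defaultdict(lambda: start)
    for team_a, team_b, rounds_a, rounds_b in matches:
        total = rounds_a + rounds_b
        if total == 0:
            continue  # nothing was played, nothing to rate
        actual_a = rounds_a / total  # assumption: score = share of rounds won
        delta = k * (actual_a - expected_score(ratings[team_a], ratings[team_b]))
        ratings[team_a] += delta
        ratings[team_b] -= delta  # zero-sum: B loses what A gains
    return dict(ratings)


# Sample results (oldest first) from the Duplex history later in the thread.
matches = [
    ("Duplex", "exertus", 1, 3),  # 11/05/2012
    ("Duplex", "All-In", 0, 4),   # 12/05/2012
    ("Duplex", "exertus", 4, 0),  # 13/05/2012
]

ratings = rate_matches(matches)
ranked = sorted(ratings.items(), key=lambda kv: -kv[1])
for rank, (team, elo) in enumerate(ranked, 1):
    print(f"{rank:2d}  {team:<12} {elo:7.1f}")
```

Because every update in this sketch is zero-sum, the average rating stays at the starting value, which is consistent with the posted ratings averaging about 1000.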

If there's interest, this could be improved a lot by using more match data (all those Twitch TV weekend matches, etc.), experimenting with different values for the starting ELO and the K-constant, and potentially even switching to the more advanced TrueSkill algorithm.

It can also be used to predict matches, e.g. for tonight's archaea vs SPB: archaea has an expected score of 63%, which would approximately translate to a 3-1. But there are only 2 SPB matches in my dataset at the moment.
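
The 63% figure is consistent with the standard ELO expected-score formula applied to the ratings in the table above (archaea 1140.6, SuperPaxBros 1047.4). A minimal sketch, assuming the usual 400-point scale; `expected_score` is an illustrative name, not necessarily what the script uses:

```python
def expected_score(rating_a, rating_b):
    """Expected share of the score for A against B under standard ELO."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


archaea, spb = 1140.6, 1047.4  # ratings from the table above
p = expected_score(archaea, spb)
print(f"archaea expected score: {p:.0%}")        # 63%
print(f"expected rounds out of 4: {4 * p:.1f}")  # 2.5
```

An expected 2.5 rounds out of 4 sits between a 2-2 and a 3-1, so "approximately 3-1" is a rounding of the prediction rather than an exact outcome.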

Comments

  • Dghelneshi Aims to surpass Fana in post edits. Join Date: 2011-11-01 Member: 130634 Members, Squad Five Blue, Reinforced - Shadow
    edited May 2012
    I like the idea, though it will probably just introduce a lot of discussions and hate, since the game just isn't balanced yet.
    If team A is objectively better than team B and they play a balanced build, team A will have a higher score. If they play a more imbalanced build where one side almost always wins, they will have an equal score. This might cause problems, especially if one team just happens to play more on balanced builds and another more on imbalanced builds.
  • Ryne Join Date: 2012-02-25 Member: 147408 Members, NS2 Map Tester
    edited May 2012
    Cool idea!

    I would be selective about which games you input, though. A lot of games have ringers, or don't field the "A-Team" (a random 6 teammates, not the best 6), etc. Wouldn't want the rankings to start affecting teams scrimming! (i.e. Team Z cancels a scrim because Person A isn't available and they don't want to lose and lower their ranking.)
  • Floodinator [HBZ] Member Join Date: 2005-02-22 Member: 42087 Members, Reinforced - Shadow
    Well, with a correct ELO ranking I would open an NS2 betting platform!
    Jokes aside, I would love to have an ELO ranking!
  • TrC Join Date: 2008-11-30 Member: 65612 Members
  • wey Cineastè Join Date: 2003-06-01 Member: 16910 Members, NS1 Playtester, Contributor, Constellation
    At this point in the beta, this is obviously little more than a gimmick. But with more and more data to stabilize the rankings in more balanced versions, this might be a hint at skill, or just a funny statistic.

    I've removed the matches vs ringer teams and all matches before April of this year, and added the matches from this weekend. The current sources are: http://pastebin.com/p3cBQgDJ

    But if this is something that should continue, there should be a more automated way to obtain match results. Or someone more connected in the competitive community could help out.

    Updated rankings as of today:
     1  Exertus      1079.9
     2  SuperPaxBros 1076.3
     3  archaea      1070.0
     4  Duplex       1015.4
     5  Skulkrush     980.2
     6  Mr.           977.4
     7  DarkSide      976.8
     8  420           958.0
     9  RwD           957.9
    10  HBZ           957.9
    11  All-in        950.1
  • fanatic This post has been edited. Join Date: 2003-07-23 Member: 18377 Members, Constellation, Squad Five Blue
    edited May 2012
    Yeah, good idea testing this out -- could have some value when tournaments start happening. Obviously using practice matches for ELO ratings is pretty pointless.

    Publishing practice match results is pretty tacky (cough exertus cough duplex cough archaea), but I guess it's fair enough until actual tournaments start happening.
  • Zeikko Join Date: 2007-12-16 Member: 63179 Members, Squad Five Blue, NS2 Map Tester
    QUOTE (fanatic @ May 14 2012, 12:05 AM): "Publishing practice match results is pretty tacky (cough exertus cough duplex cough archaea), but I guess it's fair enough until actual tournaments start happening."

    Yeah, they're just practice matches after all. Teams are testing new players, new lineups and new strategies. It's quite far-fetched to draw any rankings from this data. But then again, you as duplex do this too, and I don't see anything wrong with it.
  • swalk Say hello to my little friend. Join Date: 2011-01-20 Member: 78384 Members, Squad Five Blue
    Here is the match history we have lying around from the duplex site; some of it might overlap with the Archaea and exertus scores (look at the dates).
    For each match, the first number after the date is our score and the second number is the enemy score.
    Note that pub.eu is Archaea's old name.
    Interesting idea you got here.

    Type  Opponent   Date        Our score  Enemy score
    PCW   Archaea    13/05/2012  1          1
    PCW   exertus    13/05/2012  4          0
    PCW   All-In     12/05/2012  0          4
    PCW   exertus    11/05/2012  1          3
    PCW   HBZ        10/05/2012  2          0
    PCW   Archaea    09/05/2012  2          4
    PCW   exertus    09/05/2012  3          1
    PCW   exertus    07/05/2012  0          5
    PCW   exertus    01/05/2012  1          4
    PCW   exertus    28/04/2012  3          1
    PCW   Archaea    27/04/2012  0          2
    PCW   exertus    26/04/2012  1          1
    PCW   420        07/04/2012  3          1
    PCW   Archaea    06/04/2012  1          3
    PCW   exertus    04/04/2012  4          0
    PCW   D|S        27/02/2012  2          0
    PCW   Archaea    23/02/2012  4          0
    PCW   OHNOS      14/02/2012  5          1
    PCW   OHNOS      06/02/2012  3          1
    PCW   Archaea    17/01/2012  1          3
    PCW   Pub.eu     21/11/2011  2          2
    PCW   HBZ        20/11/2011  2          1
    PCW   Pub.eu     16/11/2011  2          2
    PCW   pub.eu     15/11/2011  2          0
    PCW   pub.eu     08/11/2011  2          2
    PCW   Inversion  05/11/2011  0          4
    PCW   pub.eu     01/11/2011  1          1
    PCW   pub.eu     12/10/2011  2          0
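
To feed results like these into a rating script, the records first need to be read into a uniform structure. A minimal sketch, assuming each record consists of match type, opponent, date (DD/MM/YYYY), own score and enemy score, separated by spaces or line breaks; `parse_history`, `ALIASES` and the lower-casing of team names are illustrative choices, not part of any existing tool:

```python
import re
from datetime import datetime

# Per the post, pub.eu is Archaea's old name; keys are lower-cased.
ALIASES = {"pub.eu": "archaea"}

# opponent, date, own score, enemy score; \s+ also spans line breaks
RECORD = re.compile(r"(\S+)\s+(\d{2}/\d{2}/\d{4})\s+(\d+)\s+(\d+)")


def parse_history(text, own_team="duplex"):
    """Return (date, own_team, opponent, own_score, enemy_score), oldest first."""
    matches = []
    for opponent, date, own, enemy in RECORD.findall(text):
        name = opponent.lower()
        name = ALIASES.get(name, name)
        played = datetime.strptime(date, "%d/%m/%Y").date()
        matches.append((played, own_team, name, int(own), int(enemy)))
    matches.sort(key=lambda m: m[0])  # ELO updates depend on match order
    return matches


sample = "PCW Archaea 13/05/2012 1 1\nPCW pub.eu 12/10/2011 2 0"
for match in parse_history(sample):
    print(match)
```

Sorting by date matters because ELO is order-dependent, and merging the old and new team names keeps one rating per team instead of two.
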
  • Hugh Cameraman San Francisco, CA Join Date: 2010-04-18 Member: 71444 NS2 Developer, NS2 Playtester, Reinforced - Silver, Reinforced - Onos, WC 2013 - Shadow, Subnautica Developer, Pistachionauts
    This is spectacularly cool - I'm very interested in taking this further
  • wey Cineastè Join Date: 2003-06-01 Member: 16910 Members, NS1 Playtester, Contributor, Constellation
    Yeah, I'm still not sure how to proceed. ELO rankings need a lot of input so that the scores converge faster than the teams' strength changes, which is not the situation we have (new versions, players, strategies, etc.). And I agree that practice matches are not a good source. But if only tournaments are used, that's barely any data. So maybe weight tournament matches higher, like 3 times as much or something? Or only use practice matches when both teams agree.

    And srsly, if there is interest, other people need to collect scores. The results on team pages are all horribly inconsistent and only show those of the big teams.
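
One way to read the weighting idea above is as a per-match multiplier on the K-constant. A minimal sketch, assuming K = 32, a 400-point scale, and a match score equal to the share of rounds won; the weight table and the `update` function are hypothetical, with the tournament weight of 3 taken from the suggestion in the post:

```python
K = 32.0  # assumed base K-constant
MATCH_WEIGHT = {"practice": 1.0, "tournament": 3.0}  # hypothetical weights


def expected_score(rating_a, rating_b):
    """Expected share of the score for A against B under standard ELO."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, rounds_a, rounds_b, kind="practice"):
    """Return both teams' new ratings after one match of the given kind.

    Assumes at least one round was played.
    """
    actual_a = rounds_a / (rounds_a + rounds_b)
    surprise = actual_a - expected_score(rating_a, rating_b)
    delta = K * MATCH_WEIGHT[kind] * surprise
    return rating_a + delta, rating_b - delta


# Two equally rated (made-up) teams, a 3-1 result:
print(update(1000.0, 1000.0, 3, 1, "practice"))    # (1008.0, 992.0)
print(update(1000.0, 1000.0, 3, 1, "tournament"))  # (1024.0, 976.0)
```

A larger effective K makes ratings react faster to each result but also makes them noisier, so the multiplier would need tuning against real match data.
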
  • GORGEous Join Date: 2012-02-19 Member: 146762 Members, NS2 Map Tester
    There's a huge difference between "scrims" and "matches."

    Scrims should not be used for ELO (or any other kind of) rankings. They are practice games where Team A can play against Team B, often trying new players/strats.

    Matches are for-serious games. Tournaments are usually a series of matches. Having an ELO ranking for these games would be cool. Teams need to agree that a game is a match for it to actually be a match.

    As far as I understand, the weekend casted games have not been matches. I've always thought of them as scrims or show-matches, not for-serious matches.
  • Wilson Join Date: 2010-07-26 Member: 72867 Members
    Matches are serious business.
  • mikeditka Join Date: 2012-03-31 Member: 149764 Members, NS2 Map Tester
    I would like to see only matches on it.

    Scrims are for practicing.
  • fanatic This post has been edited. Join Date: 2003-07-23 Member: 18377 Members, Constellation, Squad Five Blue
    QUOTE (wey @ May 22 2012, 11:06 AM): "And srsly, if there is interest, other people need to collect scores. The results on team pages are all horribly inconsistent and only show those of the big teams."
    You're never going to get accurate results from practice matches, there isn't even any point in trying. Part of it is that teams do all sorts of silly things during practice and part of it is that there's no authority on what was the actual result of a match -- it's up to whomever decides to publish the score on their website (like exertus recently posting a 2-1 win against duplex, that was actually a 2-2).
  • TrC Join Date: 2008-11-30 Member: 65612 Members
    QUOTE (fanatic @ May 23 2012, 02:08 PM): "You're never going to get accurate results from practice matches, there isn't even any point in trying. Part of it is that teams do all sorts of silly things during practice and part of it is that there's no authority on what was the actual result of a match -- it's up to whomever decides to publish the score on their website (like exertus recently posting a 2-1 win against duplex, that was actually a 2-2)."
    Fixed point still valid!