Basic Statistics of the Hive Skill System

Comments

  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited March 2015
    krOoze wrote: »
    Nordic wrote: »
    That is impossible because afaik it is not even recorded by hive. What I do have for data is skill, time(h), level, score, wins, losses, kills, deaths, assists, W/L, and K/D for 59,000 players. I have given moultano all the same data.
    Well it is sent to the server after winning a game. It just can't be accessed by public (if it's stored).

    The raw data I pulled the skills etc. from does have alien time and marine time, but there is no way of knowing which games were won as alien or as marine.

    If you see some information verbatim on a hive profile, I could probably pull it out.
  • krOoze Join Date: 2014-04-24 Member: 195593 Members
    edited March 2015
    I know. Just saying, the game sends after win which side you were on. But I don't think the server would actually store it (it does show last ten of your matches though)... I assume you are using the webAPI and do not have direct access to the Hive databases. It would be nice to coax the data out of Unknown Worlds. Or perhaps make a mod that sends the copy of the data elsewhere too, to some less restrictive server. ns2stats.com does track it, if they are still active, but the sample size is lower (only those servers that use it)
  • GhoulofGSG9 Join Date: 2013-03-31 Member: 184566 Members, Super Administrators, Forum Admins, Forum Moderators, NS2 Developer, NS2 Playtester, Squad Five Blue, Squad Five Silver, Reinforced - Supporter, WC 2013 - Supporter, Pistachionauts
    edited March 2015
    krOoze wrote: »
    I know. Just saying, the game sends after win which side you were on. But I don't think the server would actually store it (it does show last ten of your matches though)... I assume you are using the webAPI and do not have direct access to the Hive databases. It would be nice to coax the data out of Unknown Worlds. Or perhaps make a mod that sends the copy of the data elsewhere too, to some less restrictive server. ns2stats.com does track it, if they are still active, but the sample size is lower (only those servers that use it)

    The hive profile page is basically showing all data available in the player table of the hive db. UWE uses/used for balance stats tracking another service called "sponitor", but nobody knows if that is even working anymore.

    In the past (even though ns2stats.com has seen better days), ns2stats and "sponitor" were always pretty much equal "balance data"-wise.

  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    More graphs from cleaned data.
    peCoMIL.png
  • [AwE]Sentinel Join Date: 2012-06-05 Member: 152949 Members
    You have no access to such a tool? How much communication is there between uwe and cdt?
  • moultano Creator of ns_shiva. Join Date: 2002-12-14 Member: 10806 Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
    Finally got a chance to start digging in!

    One interesting stat so far. Hive has recorded 2637171 losses and 3000245 wins. This means that the losing team on average has 12% fewer players than the winning team. I believe that wins and losses are still counted if you quit before the game ends. If that's true and doesn't just represent quitting players, that's a pretty big effect.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited May 2015
    moultano said:
    Finally got a chance to start digging in!

    One interesting stat so far. Hive has recorded 2637171 losses and 3000245 wins. This means that the losing team on average has 12% fewer players than the winning team. I believe that wins and losses are still counted if you quit before the game ends. If that's true and doesn't just represent quitting players, that's a pretty big effect.
    That is very interesting. Something pretty basic too with big meaning. Can't wait for more.
  • moultano Creator of ns_shiva. Join Date: 2002-12-14 Member: 10806 Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
    edited May 2015
    Hmm, there's something going on here. Hive has recorded 42788230 deaths and 48802609 kills, which is about the same ratio 87.6%. How did you sample the data? Is it possible that it was biased towards better players? Otherwise it seems like there might be something wrong with how it's recording things.
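
    As a quick check of the arithmetic, the totals quoted in these posts can be compared directly. This is only a sketch: the variable names are made up for illustration, and the four numbers are the ones quoted above. Both ratios come out a little under 0.88.

```python
# Totals quoted in the posts above (from the March 2015 scrape).
wins, losses = 3_000_245, 2_637_171
kills, deaths = 48_802_609, 42_788_230

# If the scrape covered every player evenly, one would expect recorded
# deaths to roughly match recorded kills. A ratio well below 1.0 hints
# that some losing / dying players are missing from the sample.
loss_to_win = losses / wins
death_to_kill = deaths / kills

print(f"losses / wins  = {loss_to_win:.3f}")
print(f"deaths / kills = {death_to_kill:.3f}")
```

    The two ratios being so close to each other is what suggests a single cause, such as a biased sample, rather than two unrelated effects.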
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited May 2015
    I will message you on slack, or steam. It only has about 77% of all player data.
  • moultano Creator of ns_shiva. Join Date: 2002-12-14 Member: 10806 Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
    edited May 2015
    We think we know what's going on. The scrape is probably missing the worst 23% of players, so that's likely the source of the discrepancy. Moving on. Check out this noob stomping.

    En2bTDrl.png

    Blue is total number of deaths. Red is total number of kills. X axis is total time sqrt scaled. Noobs get stomped a lot.
  • moultano Creator of ns_shiva. Join Date: 2002-12-14 Member: 10806 Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
    N60FBsQ.png

    Here's a correlation table for skill, time, winrate (wins / (wins + losses)), killrate (kills / (kills + deaths)) for players with more than 100 hours.

    It's interesting that all 4 are correlated, but time is correlated with skill more than with either killrate or winrate. This suggests perhaps that people keep improving as they put in more time, but they tend to also play with similarly skilled players, which deflates their kill and win statistics.
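
    A table like this can be sketched in a few lines. The rows below are invented for illustration (they are not from the hive data), the column layout is an assumption, and the Pearson coefficient is used because it is the usual choice; the post does not say which coefficient was actually computed.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical rows: (skill, hours, wins, losses, kills, deaths),
# all with more than 100 hours to mirror the filter described above.
players = [
    (1200, 150, 310, 290, 4100, 3900),
    (1650, 420, 900, 700, 15000, 9000),
    (1000, 110, 200, 230, 2500, 3100),
    (2100, 800, 1700, 1100, 40000, 18000),
    (1350, 260, 520, 480, 8000, 7000),
]

columns = {
    "skill": [p[0] for p in players],
    "time": [p[1] for p in players],
    "winrate": [p[2] / (p[2] + p[3]) for p in players],
    "killrate": [p[4] / (p[4] + p[5]) for p in players],
}

# Full symmetric table: one coefficient for every pair of columns.
names = list(columns)
table = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

print(" " * 9 + "".join(f"{n:>9}" for n in names))
for a in names:
    print(f"{a:>9}" + "".join(f"{table[a][b]:9.3f}" for b in names))
```

    The diagonal is always 1 (each column against itself), so only the off-diagonal entries carry information.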
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited May 2015
    Lots of rookies being stomped seems to conflict with people playing around their skill level. It makes sense for people to want to play with those of similar skill level, and I would expect some sort of emergent self-sorting, which that correlation seems to confirm. Lots of rookies being stomped may just be, as is already known, an issue with the small player base. It is not hard for a new player to run into a person of extraordinary skill.
  • CCTEE Join Date: 2013-06-20 Member: 185634 Members, Reinforced - Shadow
    So basically we can conclude that "L2P noob" is pretty good advice after all.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    CCTEE said:
    So basically we can conclude that "L2P noob" is pretty good advice after all.
    To an extent. If you look at his graph, or mine from earlier, of skill vs hours recorded, it is not perfectly correlated. If it were perfectly correlated, the points would fall on a line. You can measure how strongly two variables are correlated with the correlation coefficient, which he has calculated on the top right. I wish I could tell which graph each coefficient was representing. The closer to 1 the more correlated, the closer to 0 the less. So 0.135 is weakly correlated, but 0.654 is rather correlated.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    I regained interest in looking at the data again. I have some new data in the form of time played as aliens and marines. I also have the last recorded match in hive. This is all with the same data from March that I was using before, just new code that gave me new information.

    First off, the most recent match does not give a good idea of how many people play ns2. This is because very few servers have hive enabled. From January 1 to March 10, 2015, I only have 7,617 profiles. My data only has 77% of all profiles to begin with, so I am missing players too.
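
    Selecting profiles by their most recent match can be sketched like this. The rows are invented, and how the last-match date is actually stored in the scrape is an assumption; here it is taken to be a plain calendar date.

```python
from datetime import date

# Hypothetical (skill, most_recent_match) pairs; the date format is assumed.
players = [
    (1000, date(2014, 6, 2)),
    (1310, date(2015, 1, 15)),
    (0, date(2015, 3, 9)),
    (1875, date(2015, 3, 10)),
    (990, date(2014, 12, 31)),
]

START, END = date(2015, 1, 1), date(2015, 3, 10)

# Keep only profiles whose last recorded match falls inside the window
# (both end dates inclusive).
recent = [(skill, last) for skill, last in players if START <= last <= END]

print(f"{len(recent)} of {len(players)} profiles played in the window")
```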

    Here are just some basic stats from the time frame of January 1 to March 10, 2015.
    h23DlFX.png

    I started looking at team preferences, which team is played more.

    People who prefer playing aliens have an average hive skill of 964. People who prefer playing marines have an average hive skill of 918. This means that, on average, more skilled players may prefer aliens.
    Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
    If I take just the players who have played in 2015 up to March 10, the average skill of people who prefer marine is 784 and the average skill of people who prefer alien is 984.
    People who have an above average hive skill value tend to play alien more than marine.
    People with less than 50 hours recorded (rookies) tend to play marines more than people with 50 or more hours recorded (veterans).

    Looking at just those 7,617 players from January 1 to March 10, 2015, I was wondering if the skill distribution would be more relevant to what we actually see today across servers. The people who have played in this time frame are more likely to still be playing. Looking at this small set of players also cleans out all the skill reset players with exactly 0 and exactly 1000 hive skill. This can give a better picture of what is really seen in games.
    Ebs2XXg.png

    I did not expect this many rookies still in the beginning of 2015 alone. The mode for skill is 0, with 106 players with exactly 0 skill just in this date range.
    hgpbfkc.png
  • Benson Join Date: 2012-03-07 Member: 148303 Members, Reinforced - Shadow, WC 2013 - Shadow
    edited July 2015
    Something to note on all the Hive data:

    Hive has about 1/5 of my hours recorded.

    http://steamcommunity.com/id/BensonTheCasual/

    628 hours on my steam profile.

    http://hive.naturalselection2.com/profile/9002354

    92 hours on my Hive profile.

    I don't think this will skew things too much since the threshold is +- 50 hours, but just something to note!


    P.S. I don't idle in the menu, I've played since before Hive was a thing :)
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    Benson wrote: »
    Something to note on all the Hive data:

    Hive has about 1/5 of my hours recorded.

    http://steamcommunity.com/id/BensonTheCasual/

    628 hours on my steam profile.

    http://hive.naturalselection2.com/profile/9002354

    92 hours on my Hive profile.

    I don't think this will skew things too much since the threshold is +- 50 hours, but just something to note!
    I did note in my last post that there are few hive enabled servers. Hive hours are also only the time recorded while actually playing in a game. They include nothing from the main menu, waiting for a server, etc. Both of those apply, along with you being a veteran who probably played before hive was a thing.

    The fact that hive has only 92 of your hours recorded means little if the skill value converges fast enough.

    If you said this because of my surprise that there are so many rookies, and you think the hours might be off: the players with low hours correlate with low skill pretty well.
    5lDs3BV.png
    0d8ESzt.png

    Really these are just fun stats, nothing to make decisions off of. That is why I am playing with them. Take them with a grain of salt. The stuff moultano puts out, those are more serious stats.
  • Frozen New York, NY Join Date: 2010-07-02 Member: 72228 Members, Constellation
    I'm almost positive that Hive Hours are based entirely on an addition of game times, not having client open. That skews the gap between hours as well.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    mattji104 wrote: »
    I'm almost positive that Hive Hours are based entirely on an addition of game times, not having client open. That skews the gap between hours as well.

    Correct. That is part of what I was trying to say in my previous post. I know I have said it sometime before in this thread too. It is recorded hours of play in game, not having the client open. By in game I mean from when the timer starts until a team wins.
  • Cannon_FodderAUS Brisbane, AU Join Date: 2013-06-23 Member: 185664 Members, Squad Five Blue, Squad Five Silver, Reinforced - Shadow
    Benson wrote: »
    Something to note on all the Hive data:

    Hive has about 1/5 of my hours recorded.

    http://steamcommunity.com/id/BensonTheCasual/

    628 hours on my steam profile.

    http://hive.naturalselection2.com/profile/9002354

    92 hours on my Hive profile.

    I don't think this will skew things too much since the threshold is +- 50 hours, but just something to note!


    P.S. I don't idle in the menu, I've played since before Hive was a thing :)

    @Benson can I add you to my Steam list for NS2 games??
  • Benson Join Date: 2012-03-07 Member: 148303 Members, Reinforced - Shadow, WC 2013 - Shadow
  • SupaDupaNoodle Join Date: 2003-01-12 Member: 12232 Members
    What sample size are all these graphs based on?
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    What sample size are all these graphs based on?
    Depends on the graph. If you give me a specific graph I can tell you what I used exactly.

    The data in total is over 59,000 players, which is about 77% of all players recorded in hive at the time (March). The initial graphs were me playing with all the data. 77% of all profiles should be a more than adequate sample size. I haven't re-read through the thread, but I probably mentioned what I changed about the data whenever I posted a graph that was not from all the data.

    I soon realized that tens of thousands of players had exactly 0 or 1000 skill and had nearly nothing recorded in hive. They were messing with the numbers, so I decided to attempt to remove them. I took out about 30,000 profiles that had a level of 0 or 1 in hive. Graphs from these numbers are titled something like "Skill Distribution Cleaned."

    Just recently I got new code that pulled out more data, including hours played as marines, hours played as aliens, and the most recently played match. The most recent graphs I have posted have all been from January 1, 2015 to March 10, 2015. The graphs of these 7,617 players' data have the date range on the graph. Even though the most recent graphs cover only about 13% of the total profiles I have, I think they are the most representative because these are players who most likely are still playing.
  • Frozen New York, NY Join Date: 2010-07-02 Member: 72228 Members, Constellation
    @Nordic
    I don't know if this is possible or interesting enough to spend time on, but you could do a distribution-of-growth chart. It would show, from say -4000 to +4000 on the independent, representing how much a player's skill has gone up or down over the period, and number of players as the dependent. I'm curious if we'll see a bell curve, and if so where it's shifted to.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    mattji104 wrote: »
    @Nordic
    I don't know if this is possible or interesting enough to spend time on, but you could do a distribution-of-growth chart. It would show, from say -4000 to +4000 on the independent, representing how much a player's skill has gone up or down over the period, and number of players as the dependent. I'm curious if we'll see a bell curve, and if so where it's shifted to.

    I am not entirely sure I understand what you are asking, so I am going to paraphrase and see how close I get.

    You want the skill value range of -4000 to 4000 on the bottom or X axis to show how much skill has changed over time. You want number of players on the left or Y axis. You think there will be a bell curve, and this will show how the total hive skill values have gone up or down.

    The aim of this graph would be to see if, on average, people's skill values are increasing or decreasing?

    The biggest thing I am unsure of is what you mean by "number of players as dependent." This makes me think you mean total players. You also say "representing how much a player's skill has gone up or down over the period" which implies a singular player.

    From what I do understand, I don't think this is possible with the data I have. I have a single snapshot of the hive skill system from March. I cannot show change over time. If it is a single player, my data is anonymous. I don't know who is who.

    What I do have is an excel file with the following headers: Skill Value, Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.
    Team preference is alien time - marine time. If the number is positive, that individual plays aliens more often and I assume prefers aliens. If the number is negative I assume the opposite.
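
    The team preference column and the per-group averages can be sketched as follows. The rows are invented for illustration and the field order is an assumption; only the rule itself (alien time minus marine time, positive means aliens) comes from the post.

```python
from statistics import mean

# Hypothetical rows: (skill, alien_hours, marine_hours).
players = [
    (1400, 60.0, 35.0),
    (800, 5.0, 12.0),
    (1100, 20.0, 20.0),
    (950, 8.0, 30.0),
    (1250, 44.0, 10.0),
]

def team_preference(alien_hours, marine_hours):
    """Positive means more alien time, negative means more marine time."""
    return alien_hours - marine_hours

# Players with exactly equal time on both teams fall into neither group.
alien_skills = [s for s, a, m in players if team_preference(a, m) > 0]
marine_skills = [s for s, a, m in players if team_preference(a, m) < 0]

print("prefers aliens :", len(alien_skills), "players, mean skill", mean(alien_skills))
print("prefers marines:", len(marine_skills), "players, mean skill", mean(marine_skills))
```

    A gap between the two means only shows that the groups differ in hive skill value; on its own it does not say why.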
  • Frozen New York, NY Join Date: 2010-07-02 Member: 72228 Members, Constellation
    Yea, if the data is entirely anonymous, then there's no way you could compare one set to another from a few months down the road.

    But yea the thought was seeing the average increase/decrease of hive skill over time. I'm not really assuming it would actually turn out to be a bell curve though
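
    For reference, the chart being discussed would need two snapshots that share a stable player identifier, which the anonymous data in this thread does not have. Under that assumption, and with invented ids and skill values, the bucketing could look like this:

```python
from collections import Counter

# Hypothetical snapshots mapping a stable player id to hive skill.
march = {"a": 1000, "b": 1450, "c": 800, "d": 2100, "e": 1200}
july = {"a": 1130, "b": 1390, "c": 1010, "d": 2080, "f": 900}

BIN = 100  # width of each histogram bucket, in skill points

# Only players present in both snapshots have a measurable change.
deltas = [july[pid] - march[pid] for pid in march.keys() & july.keys()]

# Floor each change to the lower edge of its bucket.
histogram = Counter((d // BIN) * BIN for d in deltas)

for lower in sorted(histogram):
    print(f"{lower:+5d} to {lower + BIN - 1:+5d}: {'#' * histogram[lower]}")
```

    Skill change goes on the X axis as buckets, and the count of players in each bucket is the Y value.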
  • Wheeee Join Date: 2003-02-18 Member: 13713 Members, Reinforced - Shadow
    Nordic wrote: »
    I started looking at team preferences, which team is played more.

    People who prefer playing aliens have an average hive skill of 964. People who prefer playing marines have an average hive skill of 918. This means that, on average, more skilled players prefer aliens.
    If I take just the players who have played in 2015 up to March 10, the average skill of people who prefer marine is 784 and the average skill of people who prefer alien is 984.

    I'm going to take huge issue with your analysis here.

    The only thing that this tells us is that the average hive score for majority-alien players is higher than that of majority-marine players. Since hive score depends most heavily on W/L ratio, and since aliens have a higher winrate, it only proves a correlation. There is zero relation shown by these hive scores between player preference and player skill.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    mattji104 wrote: »
    Yea, if the data is entirely anonymous, then there's no way you could compare one set to another from a few months down the road.

    But yea the thought was seeing the average increase/decrease of hive skill over time. I'm not really assuming it would actually turn out to be a bell curve though

    You could do that for yourself easily enough. Your hive profile shows the last 20 matches you have played on hive enabled servers. You could copy those 20 numbers into excel over a few weeks, months, etc. and do exactly what you want. It might be interesting. I believe mine would act more like a wave around 1900 +- 100. I believe this has to do with me playing marines for a while, and then aliens for a while.
    Wheeee wrote: »
    Nordic wrote: »
    I started looking at team preferences, which team is played more.

    People who prefer playing aliens have an average hive skill of 964. People who prefer playing marines have an average hive skill of 918. This means that, on average, more skilled players prefer aliens.
    If I take just the players who have played in 2015 up to March 10, the average skill of people who prefer marine is 784 and the average skill of people who prefer alien is 984.

    I'm going to take huge issue with your analysis here.

    The only thing that this tells us is that the average hive score for majority-alien players is higher than that of majority-marine players. Since hive score depends most heavily on W/L ratio, and since aliens have a higher winrate, it only proves a correlation. There is zero relation shown by these hive scores between player preference and player skill.
    I came to the same conclusion and I mentioned it just a little bit below. What you quoted was me typing out my thought process.
    Nordic wrote:
    Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.

    I could have worded that whole post better though, you got me there. I have been working a lot, and seem to not be catching errors as well as I usually can. That specific post was stream of consciousness.

    This thread is a mess, and not really organized to begin with. Maybe I will consolidate all the graphs into the OP one of these days.
  • Nordic Long term camping in Kodiak Join Date: 2012-05-13 Member: 151995 Members, NS2 Playtester, NS2 Map Tester, Reinforced - Supporter, Reinforced - Silver, Reinforced - Shadow
    edited July 2015
    This thread has been a mess, so I decided to organize it and put all the graphs and statistics into one post. I will be copying this into the OP, yes @yojimbo, copy pasting. This will also directly answer @SupaDupaNoodle's question about what the sample sizes are for each graph. I believe this has every graph that moultano and I have posted so far. It should also clear up confusion because I try to be more clear about what I did.

    I had taken it upon myself to collect some basic statistics from the hive system just for myself. Others have expressed interest in such statistics so I thought I would share. I am only sharing the graphs and statistics you will see in this thread, not the data itself. The data I have is in CSV format and has the headers: Skill Value, Time(h), Hive Level, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists. As you can see, the data is anonymous as I have no way of knowing who is who.
    Hive is enabled on very few of all ns2 servers, so even if I had the data from 100% of ns2 players recorded at the time, the data would still only give statistics, not parameters.

    I should note that any reference to hours recorded, or hive hours, is hours in a live game. These are not the same hours as steam. These hours are only from when a live game's timer starts until the timer stops with a team winning. Hive only counts games if there are enough players, so low player count games are not recorded.

    If there is anything else you would like me to do with the data tell me and I might do it. If it is something complicated you should also tell me how to do it. I am working with excel right now.
    I am just doing these stats for fun. I have only taken one college course in statistics. They are real numbers from the hive system but I would urge you to take any analysis I do with a grain of salt.


    The first round of graphs and other statistics I have collected into this spoiler. All the information here is from all of the original data I had collected on March 10th, 2015. There are only 59,587 players in the data, which was only ~77% of all ns2 players recorded in hive.
    Of all the 59,587 players, I can assume most do not play ns2 anymore, so I originally decided to break down the statistics by hours played. I also thought this might give more representative numbers because the hive skill system gets more accurate the more time it has had to collect data on a given player.

    For all data:
    Count = 59587, Average Skill = 938, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Maximum Skill = 3753, Average Hours = 29, Median Hours = 29, Mode of Hours = 3, Minimum Hours = 2, Maximum Hours = 1580

    For Players with 5+ hours of recorded play:
    73% of players have 5+ hours, Average skill = 915, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 39, Median Hours = 39, Mode of Hours = 5

    For Players with 10+ hours of recorded play:
    45% of players have 10+ hours, Average skill = 923, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 59, Median Hours = 59, Mode of Hours = 10

    For Players with 20+ hours of recorded play:
    25% of players have 20+ hours, Average skill = 1003, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 96, Median Hours = 96, Mode of Hours = 20

    For Players with 50+ hours of recorded play:
    12% of players have 50+ hours, Average skill = 1148, Median Skill = 1128, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 167, Median Hours = 167, Mode of Hours = 51

    For Players with 100+ hours of recorded play:
    7% of players have 100+ hours, Average skill = 1301, Median Skill = 1202, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 245, Median Hours = 245, Mode of Hours = 104

    For Players with 200+ hours of recorded play:
    3% of players have 200+ hours, Average skill = 1505, Median Skill = 1445, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 358, Median Hours = 358, Mode of Hours = 218
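
    A breakdown like the one above can be reproduced with a small helper. This is a sketch only: the rows are invented stand-ins for the real CSV, and the function name and output keys are made up for illustration (the original numbers were produced in excel).

```python
from statistics import mean, median, mode

# Hypothetical (skill, hours) pairs standing in for the real CSV rows.
players = [(1000, 3), (1000, 3), (0, 4), (950, 12), (1000, 25),
           (1130, 60), (1210, 140), (1480, 260), (1000, 7), (1700, 400)]

def summarize(rows, min_hours):
    """Summary statistics for players with at least `min_hours` recorded."""
    subset = [(s, h) for s, h in rows if h >= min_hours]
    skills = [s for s, _ in subset]
    hours = [h for _, h in subset]
    return {
        "share": len(subset) / len(rows),
        "mean_skill": mean(skills),
        "median_skill": median(skills),
        "mode_skill": mode(skills),
        "min_skill": min(skills),
        "mean_hours": mean(hours),
        "median_hours": median(hours),
    }

for threshold in (0, 5, 10, 20, 50, 100, 200):
    stats = summarize(players, threshold)
    print(f"{threshold:>3}+ h: " + ", ".join(f"{k}={v:.2f}" for k, v in stats.items()))
```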

    In the following spoiler I have put all the graphs in order of when they were posted. As time goes on you can notice I go from having excel auto-make a graph to customizing the graph so it looks a little better. You will notice some repeated graphs I am sure.

    One flaw with this set of graphs is that they take into account all players who have ever been entered into the hive system, not just active players. There are thousands of players who have not played since the skill system was reset and have exactly 1000 skill. I think we can assume that those with 50+ hours are more representative than all the players.
    BjRFeq8.png
    mmLZrbP.png

    These three graphs below appear to be similar to a left skewed normal distribution, exactly what we would expect it to look like. I think this supports that the skill values are fairly accurate.
    Yl00d9z.png

    Since the skill system was reset and players were given a default value of 1000 skill, you can really see the effect of that here. There are a lot of players with exactly 1000 skill who haven't played much.
    Gqe9LaJ.png

    I am amazed by some of these outliers.
    w4zFDSd.png

    Made this graph at the request of krooze.
    mbGm9Jg.png

    More Graphs! I don't even know what this one could mean but I made it.
    DgVQjl0.png

    h1jWp7X.png
    LcIX4ab.png


    Looking at all the previous data I decided to try and cull out as many people as I could who had not played since the last reset. I made a new file, deleted all data of players with exact skill of 1000 and a hive level of 0 or 1. I also cut out all players with exact skills of 0 who had either 0 wins, or 0 losses, or 0 kills, or 0 deaths.
    The total players in the new "cleaned" data is 29,915. Only 1181 players in the culled data have exactly 1000 skill, and all of those players have over 10 hours recorded in hive. Only 102 players in the culled data have exactly 0 skill.
    I believe this cuts out the majority of the poor quality data from subsequent player data resets.
    4UTTMyH.png
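
    The two culling rules described above translate into a simple filter. This is a sketch under assumptions: the `Player` fields and the sample profiles are invented for illustration, and the original cleaning was done by hand in a spreadsheet rather than with code like this.

```python
from dataclasses import dataclass

@dataclass
class Player:
    skill: int
    level: int
    wins: int
    losses: int
    kills: int
    deaths: int

def keep(p: Player) -> bool:
    """Return True if the profile survives the culling rules described above."""
    # Rule 1: exactly 1000 skill with a hive level of 0 or 1.
    if p.skill == 1000 and p.level in (0, 1):
        return False
    # Rule 2: exactly 0 skill with any of wins, losses, kills or deaths at 0.
    if p.skill == 0 and 0 in (p.wins, p.losses, p.kills, p.deaths):
        return False
    return True

# Hypothetical profiles for illustration only.
players = [
    Player(1000, 1, 0, 0, 0, 0),            # dropped by rule 1
    Player(1000, 7, 40, 38, 500, 480),      # kept: 1000 skill but higher level
    Player(0, 2, 0, 6, 3, 40),              # dropped by rule 2 (no wins)
    Player(0, 3, 2, 9, 15, 80),             # kept: 0 skill but complete record
    Player(1430, 12, 210, 190, 3000, 2500), # kept: ordinary active profile
]

cleaned = [p for p in players if keep(p)]
print(f"kept {len(cleaned)} of {len(players)} profiles")
```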

    The average win rate being about 1 is a really good sign that I have enough data for good statistics.

    These are graphs from the cleaned data. That spike at about 1000 skill shows that my cleaned data still has a lot of inactive players who have not played much since their skill was reset to 1000. The lines on the W/L graph disappeared though, so that is progress. I also changed the dots to loops because it better shows concentration.
    qPgHfsA.png

    peCoMIL.png


    I regained interest in looking at the data again. I have some new data in the form of time played as aliens and marines. I also have the last recorded match in hive. This is all with the same data from March that I was using before, just new code that gave me new information. My excel file now has Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.

    First off, the most recent match does not give a good idea of how many people play ns2. This is because very few servers have hive enabled. From January 1 to March 10, 2015, I only have 7,617 profiles. My data only has 77% of all profiles to begin with, so I am missing players too.
    h23DlFX.png

    I decided to look at the statistics of team preferences.

    From all of the ~59,000 players, people who prefer playing aliens have an average hive skill of 964. People who prefer playing marines have an average hive skill of 918. This means that, on average, more skilled players may prefer aliens.
    Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
    If I take just the players who have played in 2015 up to March 10, the average skill of people who prefer marine is 784 and the average skill of people who prefer alien is 984.
    People who have an above average hive skill value tend to play alien more than marine.
    People with less than 50 hours recorded (rookies) tend to play marines more than people with 50 or more hours recorded (veterans).

    Looking at just those 7,617 players from January 1 to March 10, 2015, I was wondering if the skill distribution would be more relevant to what we actually see today across servers. The people who have played in this time frame are more likely to still be playing. Looking at this small set of players also cleans out all the skill reset players with exactly 0 and exactly 1000 hive skill. This can give a better picture of what is really seen in games today.
    Ebs2XXg.png

    5lDs3BV.png
    0d8ESzt.png

    In this spoiler is a bit of stats on players who have exactly 0 skill in the date range of January 1, 2015 to March 10, 2015. Just kind of interesting. These players had a minimum of 4 hours recorded, which means that most new players who try the game give it a good shot.
    I did not expect this many rookies still in the beginning of 2015 alone. The mode for skill is 0, with 106 players with exactly 0 skill just in this date range.
    hgpbfkc.png

    Any and all data I have had I shared with moultano. These are some stats and graphs he has put in this thread. Worthwhile info so thought I should add it here.
    moultano wrote: »
    Finally got a chance to start digging in!

    One interesting stat so far. Hive has recorded 2637171 losses and 3000245 wins. This means that the losing team on average has 12% fewer players than the winning team. I believe that wins and losses are still counted if you quit before the game ends. If that's true and doesn't just represent quitting players, that's a pretty big effect.
    moultano wrote: »
    Hmm, there's something going on here. Hive has recorded 42788230 deaths and 48802609 kills, which is about the same ratio 87.6%. How did you sample the data? Is it possible that it was biased towards better players? Otherwise it seems like there might be something wrong with how it's recording things.
    moultano wrote: »
    We think we know what's going on. The scrape is probably missing the worst 23% of players, so that's likely the source of the discrepancy. Moving on. Check out this noob stomping.

    En2bTDrl.png


    Blue is total number of deaths. Red is total number of kills. X axis is total time sqrt scaled. Noobs get stomped a lot.

    moultano wrote: »
    N60FBsQ.png


    Here's a correlation table for skill, time, winrate (wins / (wins + losses)), killrate (kills / (kills + deaths)) for players with more than 100 hours.

    It's interesting that all 4 are correlated, but time is correlated with skill more than with either killrate or winrate. This suggests perhaps that people keep improving as they put in more time, but they tend to also play with similarly skilled players, which deflates their kill and win statistics.
  • Wheeee Join Date: 2003-02-18 Member: 13713 Members, Reinforced - Shadow
    lol that one guy with hive skill 1000 and a kdr of 28:1, i wonder if he played a 1v1 round and set up turrets for the whole game