That is impossible because afaik it is not even recorded by hive. What I do have for data is skill, time(h), level, score, wins, losses, kills, deaths, assists, W/L, and K/D for 59,000 players. I have given moultano all the same data.
Well, it is sent to the server after winning a game. It just can't be accessed by the public (if it's stored at all).
The raw data I pulled the skills etc. from does have alien time and marine time, but there is no way of knowing whether a given win was as alien or as marine.
If you see some information verbatim on a hive profile, I could probably pull it out.
I know. Just saying, the game sends after win which side you were on. But I don't think the server would actually store it (it does show last ten of your matches though)... I assume you are using the webAPI and do not have direct access to the Hive databases. It would be nice to coax the data out of Unknown Worlds. Or perhaps make a mod that sends the copy of the data elsewhere too, to some less restrictive server. ns2stats.com does track it, if they are still active, but the sample size is lower (only those servers that use it)
The hive profile page is basically showing all data available in the player table of the hive db. UWE uses/used for balance stats tracking another service called "sponitor", but nobody knows if that is even working anymore.
In the past (even though ns2stats.com has seen better days), ns2stats and "sponitor" always showed pretty much the same balance data.
moultano, Creator of ns_shiva. Join Date: 2002-12-14. Member: 10806. Members, NS1 Playtester, Contributor, Constellation, NS2 Playtester, Squad Five Blue, Reinforced - Shadow, WC 2013 - Gold, NS2 Community Developer, Pistachionauts
Finally got a chance to start digging in!
One interesting stat so far. Hive has recorded 2637171 losses and 3000245 wins. This means that the losing team on average has 12% fewer players than the winning team. I believe that wins and losses are still counted if you quit before the game ends. If that's true and doesn't just represent quitting players, that's a pretty big effect.
That is very interesting. Something pretty basic too with big meaning. Can't wait for more.
moultano (edited May 2015)
Hmm, there's something going on here. Hive has recorded 42788230 deaths and 48802609 kills, which is about the same ratio (87.7%). How did you sample the data? Is it possible that it was biased towards better players? Otherwise it seems like there might be something wrong with how it's recording things.
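The two ratios can be checked directly from the totals quoted in the thread. A minimal sketch (the helper name `shortfall` is only for illustration):

```python
# Totals quoted in the thread (Hive scrape, March 2015).
wins, losses = 3_000_245, 2_637_171
kills, deaths = 48_802_609, 42_788_230


def shortfall(smaller: int, larger: int) -> float:
    """Fraction by which `smaller` falls short of `larger`."""
    return 1 - smaller / larger


loss_ratio = losses / wins
death_ratio = deaths / kills

print(f"losses / wins  = {loss_ratio:.3f} ({shortfall(losses, wins):.1%} fewer)")
print(f"deaths / kills = {death_ratio:.3f} ({shortfall(deaths, kills):.1%} fewer)")
```

Both ratios land within a fraction of a percentage point of each other, which is what makes a sampling explanation plausible.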
moultano (edited May 2015)
We think we know what's going on. The scrape is probably missing the worst 23% of players, so that's likely the source of the discrepancy. Moving on. Check out this noob stomping.
Blue is total number of deaths. Red is total number of kills. X axis is total time sqrt scaled. Noobs get stomped a lot.
moultano
Here's a correlation table for skill, time, winrate (wins / (wins + losses)), and killrate (kills / (kills + deaths)) for players with more than 100 hours.
It's interesting that all 4 are correlated, but time is correlated with skill more than with either killrate or winrate. This suggests perhaps that people keep improving as they put in more time, but they tend to also play with similarly skilled players, which deflates their kill and win statistics.
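For anyone wanting to reproduce this kind of table, here is a rough sketch. It is not moultano's actual code, and the player rows are made-up placeholders rather than Hive data. It derives winrate and killrate as defined above and computes pairwise Pearson correlations by hand:

```python
from math import sqrt


def rate(successes: int, failures: int) -> float:
    """successes / (successes + failures), or 0.0 when there is no data."""
    total = successes + failures
    return successes / total if total else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical players: (skill, hours, wins, losses, kills, deaths).
players = [
    (900, 120, 150, 160, 1800, 2400),
    (1300, 250, 330, 300, 5200, 4700),
    (1700, 400, 520, 450, 9800, 7600),
    (2100, 650, 900, 700, 18000, 11000),
    (1100, 180, 210, 230, 3000, 3300),
]
# Same cut as in the post: only players with more than 100 hours.
players = [p for p in players if p[1] > 100]

columns = {
    "skill": [p[0] for p in players],
    "time": [p[1] for p in players],
    "winrate": [rate(p[2], p[3]) for p in players],
    "killrate": [rate(p[4], p[5]) for p in players],
}
names = list(columns)
table = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

for a in names:
    print(a.ljust(9), " ".join(f"{table[a][b]:6.3f}" for b in names))
```

The diagonal is always 1 and the table is symmetric, so only the six off-diagonal pairs carry information.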
Lots of rookies being stomped seems to conflict with people playing around their skill level. It makes sense for people to want to play with those of similar skill level. I would expect some sort of emergent self sorting as that seems to confirm. Lots of rookies being stomped may just be, as known, an issue with the small player base. It is not hard for a new player to find a person of extraordinary skill.
So basically we can conclude that "L2P noob" is pretty good advice after all.
To an extent. If you look at his graph, or mine from earlier, at skill vs hours recorded it is not perfectly correlated. If it was correlated it would mean that the graph would be more like a line. You can mathematically see how correlated a line is with the correlation coefficient which he has calculated on the top right. I wish I could tell which graph the coefficients were representing. The closer to 1 the more correlated, the closer to 0 the less. So 0.135 is less correlated, but 0.654 is rather correlated.
I regained interest in looking at the data again. I have some new data in the form of time played as aliens and marines. I also have the last recorded match in hive. This is all with the same data from March that I was using before, just new code that gave me new information.
First off, the most recent match does not give a good idea of how many people play ns2. This is because very few servers have hive enabled. From January 1 to March 10, 2015, I only have 7,617 profiles. My data only has 77% of all profiles to begin with, so I am missing players too.
Here are just some basic stats from the time frame of January 1 to March 10, 2015.
I started looking at team preferences, which team is played more.
On average, people who prefer playing aliens have a hive skill of 964, and people who prefer playing marines have a hive skill of 918. This suggests that more skilled players may prefer aliens.
Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
If I take just the players who have played in 2015 (January 1 to March 10), the average skill of people who prefer marines is 784 and the average skill of people who prefer aliens is 984.
People who have an above average hive skill value tend to play alien more than marine.
People with less than 50 hours recorded (rookies) tend to play marines more than people with 50 or more hours recorded (veterans).
Looking at just those 7,617 players from January 1 to March 10, 2015, I was wondering if the skill distribution would be more relevant to what we actually see today across servers. The people who have played in this time frame are more likely to still be playing. Looking at this small set of players also cleans out all the skill reset players with exactly 0 and exactly 1000 hive skill. This can give a better picture of what is really seen in games.
I did not expect this many rookies still in the beginning of 2015 alone. The mode for skill is 0, with 106 players with exactly 0 skill just in this date range.
I don't think this will skew things too much since the threshold is +- 50 hours, but just something to note!
I did note in my last post that there are few hive enabled servers. Hive hours are also only time recorded in game, actually playing. They do not include the main menu, waiting for a server, etc. Both of those things, plus you being a veteran who probably played before hive was a thing, would explain the gap.
The fact that you have 92 hours recorded by hive means little if the convergence is fast enough.
If you said this because of my surprise that there are so many rookies, and you think the hours might be off: the players with low hours correlate with low skill pretty well.
Really these are just fun stats, nothing to make decisions off of. That is why I am playing with them. Take them with a grain of salt. The stuff moultano puts out, those are more serious stats.
I'm almost positive that Hive Hours are based entirely on an addition of game times, not having client open. That skews the gap between hours as well.
Correct. That is what I was trying to say in part, from my previous post. I know I have said it sometime before in this thread too. It is recorded hours of play in game, not having the client open. By in game I mean from when the timer starts until a team wins.
Depends on the graph. If you give me a specific graph I can tell you what I used exactly.
The data in total is over 59,000 players, which is about 77% of all players recorded in hive at the time (March). The initial graphs were me playing with all the data. 77% of all profiles should be a more than adequate sample size. I haven't reread the thread, but I probably mentioned what I changed about the data if I posted a graph not from all the data. I soon realized that tens of thousands of players had exactly 0 or 1000 skill, and had nearly nothing recorded in hive. They were skewing the numbers, so I decided to try to remove them. I took out about 30,000 profiles who had a level of 0 or 1 in hive. Graphs from these numbers are titled something like "Skill Distribution Cleaned." Just recently I got new code that pulled out data including hours played as marines, hours played as aliens, and the most recently played match. The most recent graphs I have posted have all been from January 1st, 2015 to March 10, 2015. The graphs of these 7,617 players' data have the date range on the graph. Even though the most recent graphs cover only about 12% of the total profiles I have, I think they are the most representative because these are players who most likely are still playing.
Frozen, New York, NY. Join Date: 2010-07-02. Member: 72228. Members, Constellation
@Nordic
I don't know if this is possible or interesting enough to spend time on, but could you do a distribution of growth chart? It would show, from say -4000 to +4000 on the independent axis, how much a player's skill has gone up or down over the period, with number of players as the dependent. I'm curious if we'll see a bell curve, and if so where it's shifted to.
I am not entirely sure I understand what you are asking, so I am going to paraphrase and see how close I get.
You want the skill value range of -4000 to 4000 on the bottom, or X axis, to show how much skill has changed over time. You want number of players on the left, or Y axis. You think there will be a bell curve, and this will show how the total hive skill values have gone up or down.
The aim of this graph would be to see if, on average, people's skill values are increasing or decreasing?
The biggest thing I am unsure of is what you mean by "number of players as dependent." This makes me think you mean total players. You also say "representing how much a players skill has gone up or down over the period" which implies a singular player.
From what I do understand, I don't think this is possible with the data I have. I have a single snapshot of the hive skill system from March, so I cannot show change over time. If you mean a single player, my data is anonymous. I don't know who is who.
What I do have is an excel file with the following headers: Skill Value, Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.
Team preference is alien time - marine time. If the number is positive, that individual plays aliens more often and I assume prefers aliens. If the number is negative I assume the opposite.
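As a rough illustration of that definition, here is a small sketch that splits players by the sign of alien time minus marine time and averages skill within each group. The rows are invented placeholders, not values from the Hive data, and players with equal time on both teams are simply left out of both groups (an assumption; the post does not say how ties were handled):

```python
from statistics import mean

# Hypothetical rows: (skill, alien_hours, marine_hours).
rows = [
    (1200, 40.0, 25.0),
    (800, 5.0, 12.0),
    (1500, 90.0, 60.0),
    (950, 10.0, 30.0),
    (1000, 8.0, 8.0),  # tie: counted for neither team
]


def team_preference(alien_hours: float, marine_hours: float) -> float:
    """Positive means more alien time, negative means more marine time."""
    return alien_hours - marine_hours


alien_skill = [s for s, a, m in rows if team_preference(a, m) > 0]
marine_skill = [s for s, a, m in rows if team_preference(a, m) < 0]

print("prefer aliens :", len(alien_skill), "players, mean skill", mean(alien_skill))
print("prefer marines:", len(marine_skill), "players, mean skill", mean(marine_skill))
```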
Frozen
Yea, if the data is entirely anonymous, then there's no way you could compare one set to another from a few months down the road.
But yea, the thought was seeing the average increase/decrease of hive skill over time. I'm not really assuming it would actually turn out to be a bell curve though.
I started looking at team preferences, which team is played more.
On average people who prefer playing aliens, have an average skill of 964. On average people who prefer playing marines, have an average hive skill of 918. This means that on average, more skilled players prefer aliens.
If I take just the players who have played in 2015 (January 1 to March 10), the average skill of people who prefer marines is 784 and the average skill of people who prefer aliens is 984.
I'm going to take huge issue with your analysis here.
The only thing that this tells us is that the average hive score for majority-alien players is higher than that of majority-marine players. Since hive score depends most heavily on W/L ratio, and since aliens have a higher winrate, it only proves a correlation. There is zero relation shown by these hive scores between player preference and player skill.
You could do that for yourself easily enough. Your hive profile shows the last 20 matches you have played on a hive enabled server. You could copy down those 20 numbers into excel over a few weeks, months, etc. and do exactly what you want. It might be interesting. I believe mine would act more like a wave around 1900 +- 100. I believe this has to do with me playing marines for a while, and then aliens for a while.
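If excel is not handy, the same bookkeeping is a few lines of Python. This is only a sketch: the skill values are made up, and the 50 point bucket width is an arbitrary choice.

```python
from collections import Counter

# Hypothetical skill values copied by hand from a profile, oldest first.
skill_log = [1850, 1880, 1910, 1960, 1990, 1940, 1900, 1830, 1800, 1870]

# Change in skill between consecutive entries.
deltas = [later - earlier for earlier, later in zip(skill_log, skill_log[1:])]

# Group the changes into 50-point buckets, keyed by the lower edge of each bucket.
bucket = 50
histogram = Counter((d // bucket) * bucket for d in deltas)

net_change = skill_log[-1] - skill_log[0]

for lower_edge in sorted(histogram):
    print(f"{lower_edge:+5d} to {lower_edge + bucket - 1:+5d}: {'#' * histogram[lower_edge]}")
print("net change over the period:", net_change)
```

With many players logged this way, the per-player `net_change` values are what would go on the X axis of the chart Frozen described.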
I came to the same conclusion and I mentioned it just a little bit below. What you quoted was me typing out my thought process.
Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
I could have worded that whole post better though, you got me there. I have been working a lot, and seem to be not catching errors as well as I can. That specific post was stream of consciousness.
This thread is a mess, and not really organized to begin with. Maybe I will consolidate all the graphs into the OP one of these days.
This thread has been a mess, so I decided to organize it and put all the graphs and statistics into one post. I will be copying this into the OP; yes @yojimbo, copy pasting. This will also directly answer @SupaDupaNoodle's question about what the sample sizes are for each graph. I believe this has every graph that moultano and I have posted so far. It should also clear up confusion because I try to be clearer about what I did.
I had taken it upon myself to collect some basic statistics from the hive system just for myself. Others have expressed interest in such statistics so I thought I would share. I am only sharing the graphs and statistics you will see in this thread, not the data itself. The data I have is in CSV format and has the headers: Skill Value, Time(h), Hive Level, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists. As you can see, the data is anonymous as I have no way of knowing who is who.
Hive is enabled on very few of all ns2 servers, so even if I had the data from 100% of the ns2 players recorded at the time, the data would still only give statistics, not parameters.
I should note that any reference to hours recorded, or hive hours, is hours in a live game. These are not the same hours as steam. These hours are only from when a live game's timer starts until the timer stops with a team winning. Hive only counts games if there are enough players, so low player count games are not recorded.
If there is anything else you would like me to do with the data tell me and I might do it. If it is something complicated you should also tell me how to do it. I am working with excel right now.
I am just doing these stats for fun. I have only taken one college course in statistics. They are real numbers from the hive system but I would urge you to take any analysis I do with a grain of salt.
The first round of graphs and other statistics I have collected into this spoiler. All the information here is from all of the original data I had collected on March 10th, 2015. There are only 59,587 players in the data, which was only ~77% of all ns2 players recorded in hive.
Of all the 59,587 players I can assume most of them do not play ns2 anymore. So I originally decided to break down the statistics on hours played. I also thought this might give more representative numbers because the hive skill system gets more accurate the more time is had to collect data on a given player.
For all data:
Count = 59587, Average Skill = 938, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Maximum Skill = 3753, Average Hours = 29, Median Hours = 29, Mode of Hours = 3, Minimum Hours = 2, Maximum Hours = 1580
For Players with 5+ hours of recorded play:
73% of players have 5+ hours, Average skill = 915, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 39, Median Hours = 39, Mode of Hours = 5
For Players with 10+ hours of recorded play:
45% of players have 10+ hours, Average skill = 923, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 59, Median Hours = 59, Mode of Hours = 10
For Players with 20+ hours of recorded play:
25% of players have 20+ hours, Average skill = 1003, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 96, Median Hours = 96, Mode of Hours = 20
For Players with 50+ hours of recorded play:
12% of players have 50+ hours, Average skill = 1148, Median Skill = 1128, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 167, Median Hours = 167, Mode of Hours = 51
For Players with 100+ hours of recorded play:
7% of players have 100+ hours, Average skill = 1301, Median Skill = 1202, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 245, Median Hours = 245, Mode of Hours = 104
For Players with 200+ hours of recorded play:
3% of players have 200+ hours, Average skill = 1505, Median Skill = 1445, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 358, Median Hours = 358, Mode of Hours = 218
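For reference, a breakdown like the one above can be produced with a short script instead of excel. This is a sketch under assumptions: the `(skill, hours)` rows are invented placeholders, and `summarize` is an illustrative helper, not the code used for the numbers in this post.

```python
from statistics import mean, median, mode

# Hypothetical (skill, hours) rows standing in for the real CSV.
players = [
    (1000, 2), (1000, 3), (0, 4), (850, 6), (1000, 12),
    (1150, 25), (1300, 60), (1500, 110), (1900, 240),
]


def summarize(rows, min_hours):
    """Summary statistics for players with at least `min_hours` recorded."""
    subset = [(s, h) for s, h in rows if h >= min_hours]
    skills = [s for s, _ in subset]
    hours = [h for _, h in subset]
    return {
        "share": len(subset) / len(rows),
        "avg_skill": mean(skills),
        "median_skill": median(skills),
        "mode_skill": mode(skills),
        "min_skill": min(skills),
        "avg_hours": mean(hours),
        "median_hours": median(hours),
    }


for threshold in (0, 5, 10, 20, 50, 100, 200):
    s = summarize(players, threshold)
    print(
        f"{threshold:>3}+ h: {s['share']:.0%} of players, "
        f"avg skill {s['avg_skill']:.0f}, median skill {s['median_skill']:.0f}, "
        f"mode {s['mode_skill']}, min {s['min_skill']}, avg hours {s['avg_hours']:.0f}"
    )
```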
In the following spoiler I have put all the graphs in order of when they were posted. As time goes on you can notice I go from having excel auto make a graph, to where I customize the graph and make it look a little better. You will notice some repeated graphs I am sure.
One flaw with this set of graphs is that they take into account all players who have ever been entered into the hive system, not just active players. There are thousands of players who have not played since the skill system was reset, and have exactly 1000 skill. I think we can assume that those with 50+ hours are more representative than all the players.
These three graphs below appear to be similar to a left skewed normal distribution, exactly what we would expect it to look like. I think this supports that the skill values are fairly accurate.
Since the skill system was reset and players were given a default value of 1000 skill, you can really see the effect of that here. There are a lot of players with exactly 1000 skill who haven't played much.
I am amazed by some of these outliers.
Made this graph at the request of krooze.
More Graphs! I don't even know what this one could mean but I made it.
Looking at all the previous data I decided to try and cull out as many people as I could who had not played since the last reset. I made a new file, deleted all data of players with exact skill of 1000 and a hive level of 0 or 1. I also cut out all players with exact skills of 0 who had either 0 wins, or 0 losses, or 0 kills, or 0 deaths.
The total players in the new "cleaned" data is 29,915. Only 1181 players in the culled data have exactly 1000 skill, but out of those players they all have had over 10 hours recorded in hive. Only 102 players in the culled data have exactly 0 skill.
I believe this cuts out the majority of the poor quality data from subsequent player data resets.
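The culling rule described above can be written as a simple filter. This is a sketch of the rule as stated in this post, not the actual excel steps; the `Player` fields and sample rows are hypothetical stand-ins for the CSV columns.

```python
from dataclasses import dataclass


@dataclass
class Player:
    skill: int
    level: int
    wins: int
    losses: int
    kills: int
    deaths: int


def keep(p: Player) -> bool:
    """Return False for profiles that look like leftovers from a skill reset."""
    # Exactly 1000 skill with a hive level of 0 or 1: never really played since the reset.
    untouched_default = p.skill == 1000 and p.level in (0, 1)
    # Exactly 0 skill with no wins, or no losses, or no kills, or no deaths.
    empty_zero = p.skill == 0 and 0 in (p.wins, p.losses, p.kills, p.deaths)
    return not (untouched_default or empty_zero)


# Hypothetical sample rows.
players = [
    Player(skill=1000, level=1, wins=2, losses=3, kills=10, deaths=30),   # dropped
    Player(skill=1000, level=5, wins=40, losses=42, kills=600, deaths=700),  # kept
    Player(skill=0, level=2, wins=0, losses=6, kills=12, deaths=90),      # dropped
    Player(skill=0, level=3, wins=4, losses=20, kills=50, deaths=300),    # kept
    Player(skill=1420, level=9, wins=200, losses=180, kills=4000, deaths=3500),  # kept
]

cleaned = [p for p in players if keep(p)]
print(f"kept {len(cleaned)} of {len(players)} profiles")
```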
The average W/L ratio being about 1 is a really good sign that I have enough data for good statistics.
These are graphs from the cleaned data. That spike at about 1000 skill shows that my cleaned data still has a lot of inactive players who have not played much since their skill was reset to 1000. The lines on the W/L graph disappeared though, so that is progress. I also changed the dots to loops because it better shows concentration.
I regained interest in looking at the data again. I have some new data in the form of time played as aliens and marines. I also have the last recorded match in hive. This is all with the same data from March that I was using before, just new code that gave me new information. My excel file now has Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.
First off, the most recent match does not give a good idea of how many people play ns2. This is because very few servers have hive enabled. From January 1 to March 10, 2015, I only have 7,617 profiles. My data only has 77% of all profiles to begin with, so I am missing players too.
I decided to look at the statistics of team preferences.
Of all the ~59,000 players, people who prefer playing aliens have an average hive skill of 964, and people who prefer playing marines have an average hive skill of 918. This suggests that more skilled players may prefer aliens.
Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
If I take just the players who have played in 2015 (January 1 to March 10), the average skill of people who prefer marines is 784 and the average skill of people who prefer aliens is 984.
People who have an above average hive skill value tend to play alien more than marine.
People with less than 50 hours recorded (rookies) tend to play marines more than people with 50 or more hours recorded (veterans).
Looking at just those 7,617 players from January 1 to March 10, 2015, I was wondering if the skill distribution would be more relevant to what we actually see today across servers. The people who have played in this time frame are more likely to still be playing. Looking at this small set of players also cleans out all the skill reset players with exactly 0 and exactly 1000 hive skill. This can give a better picture of what is really seen in games today.
In this spoiler is a bit of stats on players who have exactly 0 skill in the date range of January 1, 2015 to March 10, 2015. Just kind of interesting. These players had a minimum of 4 hours recorded, which means that most new players who try the game give it a good shot.
I did not expect this many rookies still in the beginning of 2015 alone. The mode for skill is 0, with 106 players with exactly 0 skill just in this date range.
Any and all data I have had I shared with moultano. These are some stats and graphs he has put in this thread. Worthwhile info so thought I should add it here.
One interesting stat so far. Hive has recorded 2637171 losses and 3000245 wins. This means that the losing team on average has 12% fewer players than the winning team. I believe that wins and losses are still counted if you quit before the game ends. If that's true and doesn't just represent quitting players, that's a pretty big effect.
Hmm, there's something going on here. Hive has recorded 42788230 deaths and 48802609 kills, which is about the same ratio (87.7%). How did you sample the data? Is it possible that it was biased towards better players? Otherwise it seems like there might be something wrong with how it's recording things.
We think we know what's going on. The scrape is probably missing the worst 23% of players, so that's likely the source of the discrepancy. Moving on. Check out this noob stomping.
Blue is total number of deaths. Red is total number of kills. X axis is total time sqrt scaled. Noobs get stomped a lot.
Here's a correlation table for skill, time, winrate (wins / (wins + losses)), and killrate (kills / (kills + deaths)) for players with more than 100 hours.
It's interesting that all 4 are correlated, but time is correlated with skill more than with either killrate or winrate. This suggests perhaps that people keep improving as they put in more time, but they tend to also play with similarly skilled players, which deflates their kill and win statistics.
Comments
Hive has about 1/5 of my hours recorded.
http://steamcommunity.com/id/BensonTheCasual/
628 hours on my steam profile.
http://hive.naturalselection2.com/profile/9002354
92 hours on my Hive profile.
I don't think this will skew things too much since the threshold is +- 50 hours, but just something to note!
P.S. I don't idle in the menu, I've played since before Hive was a thing.
The fact that you have 92 hours recorded by hive means little if the convergence is fast enough.
If you said this because of my surprise that there are so many rookies, and you think the hours might be off: the players with low hours correlate with low skill pretty well.
Really these are just fun stats, nothing to make decisions off of. That is why I am playing with them. Take them with a grain of salt. The stuff moultano puts out, those are more serious stats.
Correct. That is what I was trying to say in part, from my previous post. I know I have said it sometime before in this thread too. It is recorded hours of play in game, not having the client open. By in game I mean from when the timer starts until a team wins.
@Benson can I add you to my Steam list for NS2 games??
heh, sure
The data in total is over 59,000 players, which is about 77% of all players recorded in hive at the time (March). The initial graphs were me playing with all the data. 77% of all profiles should be a more than adequate sample size. I haven't reread the thread, but I probably mentioned what I changed about the data whenever I posted a graph that was not from all the data. I soon realized that tens of thousands of players had exactly 0 or 1000 skill and had nearly nothing recorded in hive. They were messing with the numbers, so I decided to attempt to remove them. I took out about 30,000 profiles that had a level of 0 or 1 in hive. Graphs from these numbers are titled something like "Skill Distribution Cleaned." Just recently I got new code that pulled out more data, including hours played as marines, hours played as aliens, and the most recently played match. The most recent graphs I have posted have all been from January 1, 2015 to March 10, 2015. The graphs of these 7,617 players' data have the date range on the graph. Even though the most recent graphs cover only about 13% of the total profiles I have, I think they are the most representative because these are players who most likely are still playing.
I don't know if this is possible or interesting enough to spend time on, but could you do a distribution of growth chart? It would show, from say -4000 to +4000 on the independent axis, how much a player's skill has gone up or down over the period, with number of players as the dependent. I'm curious if we'll see a bell curve, and if so where it's shifted to
I am not entirely sure I understand what you are asking, so I am going to paraphrase and see how close I get.
You want the skill value range of -4000 to 4000 on the bottom or X axis to show how much skill has changed over time. You want number of players on the left or Y axis. You think there will be a bell curve, and this will show how the total hive skill values have gone up or down.
The aim of this graph would be to see if on average people's skill values are increasing or decreasing?
The biggest thing I am unsure of is what you mean by "number of players as the dependent." This makes me think you mean total players. You also say "how much a player's skill has gone up or down over the period," which implies a single player.
From what I do understand, I don't think this is possible with the data I have. I have a single snapshot of the hive skill system from March. I cannot show change over time. If you mean a single player, my data is anonymous. I don't know who is who.
What I do have is an excel file with the following headers: Skill Value, Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.
Team preference is alien time - marine time. If the number is positive, that individual plays aliens more often and I assume prefers aliens. If the number is negative I assume the opposite.
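To make the team preference rule above concrete, here is a minimal sketch in Python rather than Excel. The rows and the function names `team_preference` and `mean_skill_by_preference` are made up for illustration and are not taken from the actual spreadsheet. Players with exactly equal time on both teams are left out, which is an assumption since the post does not say how ties are handled.

```python
from statistics import mean

# Hypothetical rows: (skill, alien_hours, marine_hours). All values are made up.
players = [
    (1200, 30.0, 10.0),
    (800, 5.0, 20.0),
    (1000, 12.0, 12.0),
    (1500, 40.0, 35.0),
    (600, 1.0, 9.0),
]


def team_preference(alien_hours, marine_hours):
    """Alien time minus marine time: positive leans alien, negative leans marine."""
    return alien_hours - marine_hours


def mean_skill_by_preference(rows):
    """Average skill of alien-leaning and marine-leaning players.

    Players with a preference of exactly 0 are skipped (assumption).
    """
    alien = [skill for skill, a, m in rows if team_preference(a, m) > 0]
    marine = [skill for skill, a, m in rows if team_preference(a, m) < 0]
    return {
        "alien": mean(alien) if alien else None,
        "marine": mean(marine) if marine else None,
    }


result = mean_skill_by_preference(players)
print(result)
```

A difference between the two averages only describes the two groups. It does not say why the groups differ.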
But yeah, the thought was seeing the average increase/decrease of hive skill over time. I'm not really assuming it would actually turn out to be a bell curve though
I'm going to take huge issue with your analysis here.
The only thing that this tells us is that the average hive score for majority-alien players is higher than that of majority-marine players. Since hive score depends most heavily on W/L ratio, and since aliens have a higher winrate, it only proves a correlation. There is zero relation shown by these hive scores between player preference and player skill.
You could do that for yourself easily enough. On your hive profile it shows the last 20 matches you have played on a hive enabled server. You could copy down those 20 numbers into excel over a few weeks, months, etc. and do exactly what you want. It might be interesting. I believe mine would act more like a wave around 1900 +- 100. I believe this has to do with me playing marines for a while, and then aliens for a while.
I came to the same conclusion and I mentioned it just a little bit below. What you quoted was me typing out my thought process.
I could have worded that whole post better though, you got me there. I have been working a lot, and seem to be not catching errors as well as I can. That specific post was stream of consciousness.
This thread is a mess, and not really organized to begin with. Maybe I will consolidate all the graphs into the OP one of these days.
I had taken it upon myself to collect some basic statistics from the hive system just for myself. Others have expressed interest in such statistics so I thought I would share. I am only sharing the graphs and statistics you will see in this thread, not the data itself. The data I have is in CSV format and has the headers: Skill Value, Time(h), Hive Level, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists. As you can see, the data is anonymous as I have no way of knowing who is who.
Hive is enabled on very few of all ns2 servers, so even if I had the data from 100% of ns2 players recorded at the time, the data would still only give statistics, not parameters.
I should note that any reference to hours recorded, or hive hours, is hours in a live game. These are not the same hours as steam. These hours are only from when a live game's timer starts until the timer stops with a team winning. Hive only counts games if there are enough players, so low player count games are not recorded.
If there is anything else you would like me to do with the data tell me and I might do it. If it is something complicated you should also tell me how to do it. I am working with excel right now.
I am just doing these stats for fun. I have only taken one college course in statistics. They are real numbers from the hive system but I would urge you to take any analysis I do with a grain of salt.
The first round of graphs and other statistics I have collected into this spoiler. All the information here is from all of the original data I had collected on March 10th, 2015. There are only 59,587 players in the data, which was only ~77% of all ns2 players recorded in hive.
For all data:
Count = 59587, Average Skill = 938, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Maximum Skill = 3753, Average Hours = 29, Median Hours = 29, Mode of Hours = 3, Minimum Hours = 2, Maximum Hours = 1580
For Players with 5+ hours of recorded play:
73% of players have 5+ hours, Average skill = 915, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 39, Median Hours = 39, Mode of Hours = 5
For Players with 10+ hours of recorded play:
45% of players have 10+ hours, Average skill = 923, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 59, Median Hours = 59, Mode of Hours = 10
For Players with 20+ hours of recorded play:
25% of players have 20+ hours, Average skill = 1003, Median Skill = 1000, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 96, Median Hours = 96, Mode of Hours = 20
For Players with 50+ hours of recorded play:
12% of players have 50+ hours, Average skill = 1148, Median Skill = 1128, Mode of Skill = 1000, Minimum Skill = 0, Average Hours = 167, Median Hours = 167, Mode of Hours = 51
For Players with 100+ hours of recorded play:
7% of players have 100+ hours, Average skill = 1301, Median Skill = 1202, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 245, Median Hours = 245, Mode of Hours = 104
For Players with 200+ hours of recorded play:
3% of players have 200+ hours, Average skill = 1505, Median Skill = 1445, Mode of Skill = 1000, Minimum Skill = 19, Average Hours = 358, Median Hours = 358, Mode of Hours = 218
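For anyone who wants to reproduce this kind of per-threshold breakdown outside of Excel, here is a minimal sketch using only the Python standard library. The rows are made up and `summarize` is a hypothetical helper, not the method used for the numbers above.

```python
from statistics import mean, median, multimode

# Hypothetical rows: (skill, hours). All values are made up.
players = [
    (1000, 3), (1000, 3), (900, 6), (1100, 12),
    (1000, 25), (1400, 60), (1600, 120), (0, 5),
]


def summarize(rows, min_hours):
    """Summary stats for players with at least `min_hours` recorded hours.

    Returns None when no player meets the threshold.
    """
    subset = [(skill, hours) for skill, hours in rows if hours >= min_hours]
    if not subset:
        return None
    skills = [skill for skill, _ in subset]
    hours = [h for _, h in subset]
    return {
        "share": len(subset) / len(rows),    # fraction of all players
        "avg_skill": mean(skills),
        "median_skill": median(skills),
        "mode_skill": multimode(skills),     # list, since ties are possible
        "min_skill": min(skills),
        "avg_hours": mean(hours),
        "median_hours": median(hours),
        "mode_hours": multimode(hours),
    }


for threshold in (5, 10, 20, 50):
    print(threshold, summarize(players, threshold))
```

`multimode` returns every most-common value, so a tie between two skill values shows up as a list of two numbers instead of silently picking one.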
In the following spoiler I have put all the graphs in order of when they were posted. As time goes on you can notice I go from having excel auto make a graph, to where I customize the graph and make it look a little better. You will notice some repeated graphs I am sure.
One flaw with this set of graphs is that they take into account all players who have ever been entered into the hive system, not just active players. There are thousands of players who have not played since the skill system was reset, and have exactly 1000 skill. I think we can assume that those with 50+ hours are more representative than all the players.
These three graphs below appear to be similar to a left skewed normal distribution, exactly what we would expect it to look like. I think this supports that the skill values are fairly accurate.
Since the skill system was reset and players were given a default value of 1000 skill, you can really see the effect of that here. There are a lot of players with exactly 1000 skill who haven't played much.
I am amazed by some of these outliers.
Made this graph at the request of krooze.
More Graphs! I don't even know what this one could mean but I made it.
Looking at all the previous data I decided to try and cull out as many people as I could who had not played since the last reset. I made a new file, deleted all data of players with exact skill of 1000 and a hive level of 0 or 1. I also cut out all players with exact skills of 0 who had either 0 wins, or 0 losses, or 0 kills, or 0 deaths.
The total number of players in the new "cleaned" data is 29,915. Only 1,181 players in the culled data have exactly 1000 skill, and all of those players have over 10 hours recorded in hive. Only 102 players in the culled data have exactly 0 skill.
I believe this cuts out the majority of the poor quality data from subsequent player data resets.
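The culling rules described above can be sketched as a simple filter. This is an illustration with made-up rows, not the actual Excel steps. The dictionary keys and the `is_stale` helper are hypothetical names.

```python
# Hypothetical rows with made-up values; the keys are illustrative names.
players = [
    {"skill": 1000, "level": 0, "wins": 0, "losses": 0, "kills": 0, "deaths": 0},
    {"skill": 1000, "level": 1, "wins": 1, "losses": 2, "kills": 4, "deaths": 9},
    {"skill": 1000, "level": 5, "wins": 12, "losses": 12, "kills": 90, "deaths": 95},
    {"skill": 0, "level": 2, "wins": 0, "losses": 4, "kills": 3, "deaths": 20},
    {"skill": 0, "level": 3, "wins": 2, "losses": 9, "kills": 15, "deaths": 60},
    {"skill": 1350, "level": 8, "wins": 40, "losses": 31, "kills": 300, "deaths": 280},
]


def is_stale(p):
    """True if a row matches either culling rule from the post."""
    # Rule 1: exactly 1000 skill with a hive level of 0 or 1.
    reset_default = p["skill"] == 1000 and p["level"] in (0, 1)
    # Rule 2: exactly 0 skill with 0 wins, or 0 losses, or 0 kills, or 0 deaths.
    empty_zero = p["skill"] == 0 and 0 in (
        p["wins"], p["losses"], p["kills"], p["deaths"]
    )
    return reset_default or empty_zero


cleaned = [p for p in players if not is_stale(p)]
print(len(players), "->", len(cleaned))
```

Note that a player with exactly 1000 skill and a higher hive level is kept, which matches the 1000-skill players that remain in the culled data.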
The average W/L being about 1 is a really good sign that I have enough data for good statistics.
These are graphs from the cleaned data. That spike at about 1000 skill shows that my cleaned data still has a lot of inactive players who have not played much since their skill was reset to 1000. The lines on the W/L graph disappeared though, so that is progress. I also changed the dots to loops because that better shows concentration.
I regained interest in looking at the data again. I have some new data in the form of time played as aliens and marines. I also have the last recorded match in hive. This is all with the same data from March that I was using before, just new code that gave me new information. My excel file now has Time(h), Hive Level, Alien Time(h), Marine Time(h), Team Preference, Total Score, W/L, Total Wins, Total Losses, K/D, Total Kills, Total Deaths, Total Assists, Most Recent Match.
First off, the most recent match does not give a good idea of how many people play ns2. This is because very few servers have hive enabled. From January 1 to March 10, 2015, I only have 7,617 profiles. My data only has 77% of all profiles to begin with, so I am missing players too.
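Selecting profiles by most recent match, as described above, could look like the sketch below. The rows are made up, and it assumes the most recent match has already been parsed into a `date`, since the raw format in the hive data is not shown in this thread. `active_in_range` is a hypothetical helper name.

```python
from datetime import date

# Hypothetical (skill, most_recent_match) rows; all values are made up.
players = [
    (1000, date(2014, 6, 1)),
    (1450, date(2015, 1, 1)),
    (820, date(2015, 2, 14)),
    (0, date(2015, 3, 10)),
    (1900, date(2013, 11, 30)),
]

START = date(2015, 1, 1)
END = date(2015, 3, 10)


def active_in_range(rows, start, end):
    """Keep rows whose most recent match falls within [start, end], inclusive."""
    return [(skill, last) for skill, last in rows if start <= last <= end]


recent = active_in_range(players, START, END)
share = len(recent) / len(players)
print(f"{len(recent)} of {len(players)} profiles ({share:.0%})")
```

With the counts from this thread, 7,617 out of 59,587 profiles works out to about 12.8%.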
I decided to look at the statistics of team preferences.
Across all of the ~59,000 players, people who prefer playing aliens have an average hive skill of 964, and people who prefer playing marines have an average hive skill of 918. This means that on average, more skilled players may prefer aliens.
Most people already assumed that rookies liked marines because it is familiar, and this appears to be true. I think there is more going on here though, at least with the variation in average hive skill value from marine to alien. Since aliens have a 55% win rate, I wonder if this has any effect on this. It has been assumed that alien hive skill values are inflated above marine because of the win rate imbalance. I think this supports that.
If I take just the players who have played from January 1 to March 10, 2015, the average skill of people who prefer marine is 784 and the average skill of people who prefer alien is 984.
People who have an above average hive skill value tend to play alien more than marine.
People with fewer than 50 hours recorded (rookies) tend to play marines more than people with 50 or more hours recorded (veterans).
Looking at just those 7,617 players from January 1 to March 10, 2015, I was wondering if the skill distribution would be more relevant to what we actually see today across servers. The people who have played in this time frame are more likely to still be playing. Looking at this small set of players also cleans out all the skill reset players with exactly 0 and exactly 1000 hive skill. This can give a better picture of what is really seen in games today.
In this spoiler is a bit of stats on players who have exactly 0 skill in the date range of January 1, 2015 to March 10, 2015. Just kind of interesting. These players had a minimum of 4 hours recorded, which means that most new players who try the game give it a good shot.
Any and all data I have had I shared with moultano. These are some stats and graphs he has put in this thread. Worthwhile info so thought I should add it here.