[Locked] Matchmaking Feedback Update – May 7

OP ZaedynFel

LUKEPOWA wrote:
Are there some undocumented changes in TS2 that you feel comfortable making the seemingly gross statement that "the new system is right and the old one was wrong all along"? - because the paper doesn't really support that claim. The paper, at best, claims that the new method is right more often- but I don't see how anyone can claim more than that.
He can correct me if I'm wrong, but I think he was just simplifying things to make it easier to understand and to get his point across more definitively. If you look at the bolded question that prompted that response, it's easy to understand why he would say it that way: some people just don't get it and will keep saying they were such-and-such rank before, so why are they this rank now, even after he gives a detailed explanation.
I completely get that- and given what the paper actually says, I think it's a misrepresentation. I hope people can understand "accurate more often" vs "old system wrong." I am also capable of understanding detailed arguments. Neither case meshes right now and making global claims to subpopulations that may not be applicable is poor science.

And his recent tweet w/ Mikwen is a perfect example of why it is a misrepresentation:
https://twitter.com/joshua_menke/status/993610852698767360
For all of the training data- it has Mikwen at Platinum after 3 games, despite there being no reset in MMR.

The last game that dropped him from Diamond to Platinum? It was this one:
https://www.halowaypoint.com/en-us/games/halo-5-guardians/xbox-one/mode/arena/matches/ee511ea7-d02f-457c-9178-b45d4d5e2b1f/players/mikwen?gameHistoryMatchIndex=0&gameHistoryGameModeFilter=Arena

It was a 100-0 Strongholds game (that they won) where half of the other team quit. The entire game had 6 kills. I'm guessing, because I don't see the data, that a lot of high end players are getting royally hosed if they have multiple games where the opponents quit out and they don't meet expected kill rates.

There seem to be some pretty clear indications that there are flaws inside the black box. Maybe it's just at the high end of the curve. But "the new system is right and the old one was wrong all along" makes it easy to sweep away obvious shortcomings, and we can do better.
Why can we not just have an actual 1-50 Ranking system!? A clear and concise expression of skill. Everyone starts at rank 1 and works their way up to 50 like Halo 3's ranking system. This will separate the good players from the bad.

I don't want a system that "predicts" my losses or kill/death ratio... I want a ranking system that measures W/L. Not all games require K/Ds. Look at capture the flag or other objective type gameplay. Players do not win off of kills alone.
Skerpy wrote:
like Halo 3's ranking system.
Search Dr Menke's posts for the answer but the tl;dr is that this IS the Halo 3 ranking system, reskinned. And allowing the good kids to stomp the bad kids on their way up is demoralizing and resulted in a lot of attrition.

This is better for everyone.
LUKEPOWA wrote:
it's easy to understand why he would say it that way: some people just don't get it and will keep saying they were such-and-such rank before, so why are they this rank now, even after he gives a detailed explanation.
maybe I need to walk back my disagreeing that people could understand the difference...

Skerpy wrote:
Why can we not just have an actual 1-50 Ranking system!? A clear and concise expression of skill. Everyone starts at rank 1 and works their way up to 50 like Halo 3's ranking system. This will separate the good players from the bad.
I just played my 10 placement games for slayer. Only one game was 50-40 or closer. Three were worse than 50-25. This new system feels atrocious. It might be better at guessing individual skill but the matches it makes are godawful. I can't see myself playing much if 90% of games are blowouts..
so i read your new update info and everything. yes i see where it makes sense however with better odds and stats in my arena match than my slayer match why have i placed so differently?

slayer 5.1 kda 8 wins 7 loss 61.5 win rate 43.1 accuracy Diamond 3/4
Team arena 5.9 kda 7 wins 3 losses 70 win rate 41.9 accuracy Platinum 5

if your info is true then if i win my next game with good kda ill shoot up to Diamond 2 or 3 because my mmr is high?
Why do I repeatedly get stomped in warzone? Idk how to really run stats, but I know I'm decent if not average. Then why do I constantly get placed with people who'll end a match with 4-10 kills? Out of my last 20 games I've only won 6. Constantly and I mean constantly I find myself getting stomped; not just beaten but utterly dominated. Day in and day out whenever I play warzone with my friend (who is total garbage) or alone, we/I get annihilated. Why? If I could pull up all the numbers it would make absolutely no sense why I'm being paired up with less than average or dare I say it- pathetically bad teammates. I played 3 games where once they captured the first point it was just spawn kill after spawn kill. The other team was on req 7 before I was even at 4 and I was the 3rd from top player on my team! Someone help me understand this, because after they ruined breakout for me (which I'm slightly above average at) there's nothing now I have interest in.
I just played my 10 placement games for slayer. Only one game was 50-40 or closer. Three were worse than 50-25. This new system feels atrocious. It might be better at guessing individual skill but the matches it makes are godawful. I can't see myself playing much if 90% of games are blowouts..
The matches you got were not made the way the system would have preferred. The matchmaker tried, but was unable to.

The new system predicted you would win 30% of your Slayer matches, and you did. So it's doing pretty well at knowing your skill. It also predicted your kills well, you had 10 per game, it predicted 11.

So the new system definitely knows your skill better than the old one, but it's up to the matchmaker and population to actually be able to create fair matches.

You are also Onyx, which means the matchmaker can match you against the full range of the top skilled players in the game. This is the life of Onyx players.
Are there some undocumented changes in TS2 that you feel comfortable making the seemingly gross statement that "the new system is right and the old one was wrong all along"? - because the paper doesn't really support that claim. The paper, at best, claims that the new method is right more often- but I don't see how anyone can claim more than that.

According to the paper most of the predictive gains are for squads larger than the max in any ranked playlist (think Warzone, BTB). Squads less than or equal to 4 show minimal predictive improvement over TS1.

Despite the incorporation of kill rate correlations, the expected change of a 10 kill game vs 20 kill game over a full time 4v4 slayer game is <5%.

And while I've seen many players dragged on Twitter who are perhaps rightfully annoyed that their 2 year investment of time was "wrong all along" - I can think of two obvious examples where TS2 didn't predict skill within a couple games. Commonly's 10-0 in HCS is the most public example. Another player I played my placement (tongue twister) matches with, GT: NomsEmoo, placed Diamond 4 in Slayer, is now Champ with a 1.7 k/d and absurd 85-5 record. That's a substantial deviation that does not scream "got carried."

I applaud the addition of improved systems - it just seems we are overselling it with global statements that can't be applied globally.
You are misunderstanding the reporting in the paper.

The subpopulation splits don't represent improvements in accuracy, they represent improvements in precision. They are also only relevant to the specific subpopulations that they describe. They aren't relevant to the whole. The party improvement is a nice one, but by far not the most impactful on the results.

The subpopulation reporting is also very rough, just to show the inclusion of the improvements meets the target that originally drove them, but that's not its only purpose.

The inclusion of all of the improvements leads to a massive improvement in prediction accuracy. On the exact same matches, the old system was guessing the correct winner only just over half the time.

The new system guesses the correct winner almost 70% of the time. And this is on data where the old system was trying to matchmake, so it's already biased to be really hard to predict. Yet the new system drastically surpasses it anyways.

That's a huge meaningful gap in accuracy.

That same gap bears out player to player. When I grab specific players and check, I see the same thing. The old system is wrong half the time, whereas the new one is right 70%+ of the time.

That's a big difference.

The new system also consistently predicts who will win, each player's number of kills and deaths, and their likelihood to quit, all simultaneously and with amazing precision. In cases where it's not able to predict the right answer, it knows why. It can predict its own accuracy. It is less accurate on less predictable games, which is also as it should be.

The old system could not do most of that. It predicted only win/loss, and did that much worse.

The old system also took a very long time to find a player's skill. Up to 100+ matches in 4v4, compared to less than 10 matches for the new system. This alone makes a huge improvement in predictions since it will always be more accurate on players that have few matches, of which there are many.

Anyways, you're misreading the paper. Focusing on the subpopulation precision results won't get you anywhere near the actual accuracy improvement.

In summary, the new method is consistently more accurate, match to match, at predicting the winner, and so is also more correct at measuring player skill.
Why do I repeatedly get stomped in warzone? Idk how to really run stats, but I know I'm decent if not average. Then why do I constantly get placed with people who'll end a match with 4-10 kills? Out of my last 20 games I've only won 6. Constantly and I mean constantly I find myself getting stomped; not just beaten but utterly dominated. Day in and day out whenever I play warzone with my friend (who is total garbage) or alone, we/I get annihilated. Why? If I could pull up all the numbers it would make absolutely no sense why I'm being paired up with less than average or dare I say it- pathetically bad teammates. I played 3 games where once they captured the first point it was just spawn kill after spawn kill. The other team was on req 7 before I was even at 4 and I was the 3rd from top player on my team! Someone help me understand this, because after they ruined breakout for me (which I'm slightly above average at) there's nothing now I have interest in.
You are winning more than expected (40% vs. 20%) in Warzone. Killing about as expected (20 vs. 18). You are maybe top 7% in skill compared to the average over all time (not compared to just recent players).

Why are you getting matches predicted at 80/20? Looks like because of quits. The Matchmaker aims for 60/40, but 1/3 of the players per match are quitting out, at which point there's nothing the MM can do.
TheIrisWIL wrote:
so i read your new update info and everything. yes i see where it makes sense however with better odds and stats in my arena match than my slayer match why have i placed so differently?

slayer 5.1 kda 8 wins 7 loss 61.5 win rate 43.1 accuracy Diamond 3/4
Team arena 5.9 kda 7 wins 3 losses 70 win rate 41.9 accuracy Platinum 5

if your info is true then if i win my next game with good kda ill shoot up to Diamond 2 or 3 because my mmr is high?
It means you played easier opponents in the Team Arena matches, so those kills and wins don't count for as much.
LUKEPOWA wrote:
it's easy to understand why he would say it that way: some people just don't get it and will keep saying they were such-and-such rank before, so why are they this rank now, even after he gives a detailed explanation.
maybe I need to walk back my disagreeing that people could understand the difference...

Skerpy wrote:
Why can we not just have an actual 1-50 Ranking system!? A clear and concise expression of skill. Everyone starts at rank 1 and works their way up to 50 like Halo 3's ranking system. This will separate the good players from the bad.
The 1-50 system is just classic TrueSkill which is already proven worse at separating the good from the bad.

Though you could skin any system to be 1-50. Just do (CSR - 900)/300*(25/3) + 25 and you'll get a 1-50 number just like Halo 3's.
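A minimal Python sketch of that rescaling. The function name, the rounding, and the clamp to the 1-50 range are assumptions added for illustration; only the formula itself comes from the post.

```python
def csr_to_halo3_rank(csr: float) -> int:
    """Rescale a CSR value to a Halo 3 style 1-50 number."""
    # Formula as quoted in the post: (CSR - 900) / 300 * (25 / 3) + 25
    raw = (csr - 900) / 300 * (25 / 3) + 25
    # Assumption: round to a whole rank and clamp into 1..50.
    return max(1, min(50, round(raw)))


if __name__ == "__main__":
    for csr in (0, 900, 1500, 1800):
        print(csr, csr_to_halo3_rank(csr))
```

With this mapping a CSR of 900 lands on 25 and a CSR of 1800 lands on 50, so each 300 CSR is worth a little over 8 ranks.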

But starting at the bottom and working your way up to the top is provably bad at separating players because you have both good and bad players at the same Ranks near the bottom while the good players still work their way up. This can take many games in 4v4. This is also bad for all but the top players because they get stomped and quit the game while the other players are on their way up. Feels great if you're 40-50 in skill, but not so great to everyone else, which is the majority of your players.
ZaedynFel wrote:
I just played my 10 placement games for slayer. Only one game was 50-40 or closer. Three were worse than 50-25. This new system feels atrocious. It might be better at guessing individual skill but the matches it makes are godawful. I can't see myself playing much if 90% of games are blowouts..
The matches you got were not made the way the system would have preferred. The matchmaker tried, but was unable to.

The new system predicted you would win 30% of your Slayer matches, and you did. So it's doing pretty well at knowing your skill. It also predicted your kills well, you had 10 per game, it predicted 11.

So the new system definitely knows your skill better than the old one, but it's up to the matchmaker and population to actually be able to create fair matches.

You are also Onyx, which means the matchmaker can match you against the full range of the top skilled players in the game. This is the life of Onyx players.
My point wasn't to question the matchmaker's ability to predict my skill. It feels right on target for that; I've played a decent amount of both slayer and swat and in the old system I would have said my skill is probably high platinum or low diamond for slayer (old system had me high platinum and I always felt it was a bit low) and low onyx for swat (old system repeatedly let me climb to onyx 1700+ which I thought was too high; it even let me climb to ~1900 and champion once). In fact, part of my frustration is that I know the new system is accurate and so I can directly blame the game when it consistently makes lousy matches.

If it's not possible to get an actual fun game experience, why bother with all this? If the current population of a list doesn't support competitive gameplay, it shouldn't be a ranked list. In my 10 slayer games, the average score was 50-30.5. That is not acceptable in a ranked playlist. And this is at diamond, not onyx, where theoretically the player population should be healthier.

Under the old system, I might have thought "well, the matchmaker guessed wrong on some of these games and I'll keep trying, it will probably get it right eventually". Under this new system, I am thinking "this system knows what it's doing and the best it can do is create coin toss games where I'm going to either get destroyed by the other team or completely steamroll them". If the slayer playlist population is at fault, then slayer needs to become a social list and the included player skill on each team broadened.

--edit--
I am also trying to understand why the matchmaker cannot just shuffle teams around when it predicts a landslide victory. If it's predicting a 50-30 game, surely teams could have solo players swapped around to make a 50-40 or better predicted outcome.
ZaedynFel wrote:
I just played my 10 placement games for slayer. Only one game was 50-40 or closer. Three were worse than 50-25. This new system feels atrocious. It might be better at guessing individual skill but the matches it makes are godawful. I can't see myself playing much if 90% of games are blowouts..
The matches you got were not made the way the system would have preferred. The matchmaker tried, but was unable to.

The new system predicted you would win 30% of your Slayer matches, and you did. So it's doing pretty well at knowing your skill. It also predicted your kills well, you had 10 per game, it predicted 11.

So the new system definitely knows your skill better than the old one, but it's up to the matchmaker and population to actually be able to create fair matches.

You are also Onyx, which means the matchmaker can match you against the full range of the top skilled players in the game. This is the life of Onyx players.
My point wasn't to question the matchmaker's ability to predict my skill. It feels right on target for that; I've played a decent amount of both slayer and swat and in the old system I would have said my skill is probably high platinum or low diamond for slayer (old system had me high platinum and I always felt it was a bit low) and low onyx for swat (old system repeatedly let me climb to onyx 1700+ which I thought was too high; it even let me climb to ~1900 and champion once). In fact, part of my frustration is that I know the new system is accurate and so I can directly blame the game when it consistently makes lousy matches.

If it's not possible to get an actual fun game experience, why bother with all this? If the current population of a list doesn't support competitive gameplay, it shouldn't be a ranked list. In my 10 slayer games, the average score was 50-30.5. That is not acceptable in a ranked playlist. And this is at diamond, not onyx, where theoretically the player population should be healthier.

Under the old system, I might have thought "well, the matchmaker guessed wrong on some of these games and I'll keep trying, it will probably get it right eventually". Under this new system, I am thinking "this system knows what it's doing and the best it can do is create coin toss games where I'm going to either get destroyed by the other team or completely steamroll them". If the slayer playlist population is at fault, then slayer needs to become a social list and the included player skill on each team broadened.
Again, this is a <1% problem, not a 99% one. In fact, the top 10% are going 50/50 just fine. As are the bottom 10%.

Your experience is an exception and not the rule, and just part of upper Onyx life.

Slayer has one of the smallest end-game kill gap differences in the game right now.
--edit--
I am also trying to understand why the matchmaker cannot just shuffle teams around when it predicts a landslide victory. If it's predicting a 50-30 game, surely teams could have solo players swapped around to make a 50-40 or better predicted outcome.
It won't shuffle a pre-made party, and it also won't move players after a quit. Most imbalances happen because of early quits though, not parties.

But you're higher than the top 1% where the matchmaker ends up capping your skill. At a certain skill level, it ignores all other skill gaps above it and treats everyone as equal. So the matchmaker sees everything as fine, but the skill system knows better.

This is necessary to ensure matchmaking can happen in a timely manner. It can make life rougher for some Onyx players, but that's Onyx life yo.
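To illustrate the capping described above, here is a rough Python sketch. The cap value, the skill numbers, and the function names are all hypothetical; the matchmaker's actual internals are not given in this thread.

```python
def capped(skill: float, cap: float) -> float:
    """Skill as the matchmaker would see it: anything above the cap counts as the cap."""
    return min(skill, cap)


def gap_seen_by_matchmaker(a: float, b: float, cap: float) -> float:
    """Skill gap between two players after both are capped."""
    return abs(capped(a, cap) - capped(b, cap))


if __name__ == "__main__":
    cap = 2000.0                       # hypothetical cap, arbitrary skill units
    low_onyx, champ = 2050.0, 2600.0   # hypothetical skill values
    print(abs(low_onyx - champ))                          # gap the skill system knows about
    print(gap_seen_by_matchmaker(low_onyx, champ, cap))   # gap the matchmaker acts on
```

Under these assumed numbers the two players look identical to the matchmaker even though the skill system still sees a large gap between them, which is the trade-off being described: faster matches at the top in exchange for rougher ones.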
LUKEPOWA wrote:
Are there some undocumented changes in TS2 that you feel comfortable making the seemingly gross statement that "the new system is right and the old one was wrong all along"? - because the paper doesn't really support that claim. The paper, at best, claims that the new method is right more often- but I don't see how anyone can claim more than that.
He can correct me if I'm wrong, but I think he was just simplifying things to make it easier to understand and to get his point across more definitively. If you look at the bolded question that prompted that response, it's easy to understand why he would say it that way: some people just don't get it and will keep saying they were such-and-such rank before, so why are they this rank now, even after he gives a detailed explanation.
I completely get that- and given what the paper actually says, I think it's a misrepresentation. I hope people can understand "accurate more often" vs "old system wrong." I am also capable of understanding detailed arguments. Neither case meshes right now and making global claims to subpopulations that may not be applicable is poor science.

And his recent tweet w/ Mikwen is a perfect example of why it is a misrepresentation:
https://twitter.com/joshua_menke/status/993610852698767360
For all of the training data- it has Mikwen at Platinum after 3 games, despite there being no reset in MMR.

The last game that dropped him from Diamond to Platinum? It was this one:
https://www.halowaypoint.com/en-us/games/halo-5-guardians/xbox-one/mode/arena/matches/ee511ea7-d02f-457c-9178-b45d4d5e2b1f/players/mikwen?gameHistoryMatchIndex=0&gameHistoryGameModeFilter=Arena

It was a 100-0 Strongholds game (that they won) where half of the other team quit. The entire game had 6 kills. I'm guessing, because I don't see the data, that a lot of high end players are getting royally hosed if they have multiple games where the opponents quit out and they don't meet expected kill rates.

There seem to be some pretty clear indications that there are flaws inside the black box. Maybe it's just at the high end of the curve. But "the new system is right and the old one was wrong all along" makes it easy to sweep away obvious shortcomings, and we can do better.
Yes, we do not tune the system to handle the whims of a handful of top pros returning from sweating their lives out for Worlds, and it would never make sense to tune for them, for at the very least these two reasons: a) they are a tiny handful that has no impact on the overall system performance, and b) we already know they're the best players, so we don't need the ranking system to prove it. If Mikwen wants to spartan charge his buddies for kicks instead of getting more kills, great! But that's not worth modeling because of its pretty much zero impact on everything else, and it's also not the common story among everyone else.
ZaedynFel wrote:
--edit--
I am also trying to understand why the matchmaker cannot just shuffle teams around when it predicts a landslide victory. If it's predicting a 50-30 game, surely teams could have solo players swapped around to make a 50-40 or better predicted outcome.
It won't shuffle a pre-made party, and it also won't move players after a quit. Most imbalances happen because of early quits though, not parties.

But you're higher than the top 1% where the matchmaker ends up capping your skill. At a certain skill level, it ignores all other skill gaps above it and treats everyone as equal. So the matchmaker sees everything as fine, but the skill system knows better.

This is necessary to ensure matchmaking can happen in a timely manner. It can make life rougher for some Onyx players, but that's Onyx life yo.
Diamond 2 is higher than top 1%? I'm not onyx and never have been in slayer. And only one out of the 10 games had a quitter.
ZaedynFel wrote:
--edit--
I am also trying to understand why the matchmaker cannot just shuffle teams around when it predicts a landslide victory. If it's predicting a 50-30 game, surely teams could have solo players swapped around to make a 50-40 or better predicted outcome.
It won't shuffle a pre-made party, and it also won't move players after a quit. Most imbalances happen because of early quits though, not parties.

But you're higher than the top 1% where the matchmaker ends up capping your skill. At a certain skill level, it ignores all other skill gaps above it and treats everyone as equal. So the matchmaker sees everything as fine, but the skill system knows better.

This is necessary to ensure matchmaking can happen in a timely manner. It can make life rougher for some Onyx players, but that's Onyx life yo.
Diamond 2 is higher than top 1%? I'm not onyx and never have been in slayer. And only one out of the 10 games had a quitter.
Sorry, looked at the wrong results after seeing so many today. So your experience is even less common then. Then again, you've only played 10 Slayer matches. 3-7 is a pretty common record even in just coinflips. You'll have to play a lot more matches to see if there's any real bias.

I'm seeing 50/50 at your skill level on average, and tight matches.
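The coinflip point can be checked with a short binomial calculation. This is a sketch under the assumption that every match is an independent 50/50; the function names are made up for illustration.

```python
from math import comb


def prob_exact_wins(wins: int, games: int, p_win: float = 0.5) -> float:
    """Probability of exactly `wins` wins in `games` independent matches."""
    return comb(games, wins) * p_win**wins * (1 - p_win) ** (games - wins)


def prob_at_most_wins(wins: int, games: int, p_win: float = 0.5) -> float:
    """Probability of `wins` or fewer wins in `games` independent matches."""
    return sum(prob_exact_wins(k, games, p_win) for k in range(wins + 1))


if __name__ == "__main__":
    print(prob_exact_wins(3, 10))    # chance of going exactly 3-7
    print(prob_at_most_wins(3, 10))  # chance of going 3-7 or worse
```

Under that assumption, exactly 3-7 comes up about 12% of the time and 3-7 or worse about 17% of the time, so a 10-game sample says little about whether the matches were biased.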
ZaedynFel wrote:
ZaedynFel wrote:
--edit--
I am also trying to understand why the matchmaker cannot just shuffle teams around when it predicts a landslide victory. If it's predicting a 50-30 game, surely teams could have solo players swapped around to make a 50-40 or better predicted outcome.
It won't shuffle a pre-made party, and it also won't move players after a quit. Most imbalances happen because of early quits though, not parties.

But you're higher than the top 1% where the matchmaker ends up capping your skill. At a certain skill level, it ignores all other skill gaps above it and treats everyone as equal. So the matchmaker sees everything as fine, but the skill system knows better.

This is necessary to ensure matchmaking can happen in a timely manner. It can make life rougher for some Onyx players, but that's Onyx life yo.
Diamond 2 is higher than top 1%? I'm not onyx and never have been in slayer. And only one out of the 10 games had a quitter.
Sorry, looked at the wrong results after seeing so many today. So your experience is even less common then. Then again, you've only played 10 Slayer matches. 3-7 is a pretty common record even in just coinflips. You'll have to play a lot more matches to see if there's any real bias.

I'm seeing 50/50 at your skill level on average, and tight matches.
Guess I'll give it another go. It felt really frustrating to have consistently worse matches in slayer at diamond than I was getting at onyx in swat. I expect all sorts of silliness in swat because I'm onyx - it's not fun to play against champs but I expect it and understand why it happens. I *don't* expect to have games be consistently awful (win or lose) at diamond in slayer.
ZaedynFel wrote:
...
You are also Onyx, which means the matchmaker can match you against the full range of the top skilled players in the game. This is the life of Onyx players.
So once I cross that skill line into Onyx (CSR/MMR) I'm open to match any Onyx level player? All the way up to Champ? Hell, I'm right on that border in the most populated playlist (Team Slayer) which indicates to me that it has perceived me as playing at an Onyx MMR level occasionally and I very well could reach an Onyx CSR in that playlist fairly soon and likely will be one in several of the lesser populated playlists. I'm definitely not looking forward to games against amateur pro and actual pro players because I clearly know where I stand in comparison to them (I suck). Since you updated and fine tuned the previous matchmaker settings/restrictions in 2017, and continuously did so throughout 2017-18, I've been bouncing around between Gold - Platinum under the previous TrueSkill system. I realize that system has no bearing on my current more accurate rating, but I took comfort knowing that the matchmaker wouldn't allow the upper level players (amateur pros and actual pros within the Onyx/Champ level) to match-make down to my level unless I was in a Social match and the overall team balance gave my team a somewhat equally talented player or players to compensate.

Have the matchmaking settings/restrictions been re-tuned with TrueSkill2 in mind along with the 8% increase to the Onyx group to reflect a similar restriction as before? Onyx is now supposed to be 10% of the population instead of 2%, right? If so, shouldn't that have meant that a majority of the people who were legit upper Gold and lower-mid Platinum level players previously (under the old and less accurate TS system) wouldn't reach the Onyx level within the more accurate TrueSkill2 system simply because the majority of legit upper Platinum and all Diamond players from that previous system should, in all likelihood, be filling out that 8% increase in population well before someone like myself could reach it? While I'm flattered that the system is telling me I'm almost Onyx in relation to all the players who have ever played the playlist, it just doesn't seem right if the Onyx level is restricted to 10% of the population who have ever played that playlist. I just can't fathom how my, in all honesty, "Average" skill level would be seen as being in the top 10% for the lifetime of the Team Slayer playlist. I expected an inflated ranking because of all the players who originally bought Halo 5 at release and quickly moved on when they couldn't get good or just didn't like the game-play, but I didn't think I'd be inflated into the top 10%. I actually thought I'd see myself at that Platinum/Diamond border, but then I noticed some of the people I play with at the mid-upper Diamond rankings and thought, geez, I wonder if I'll be at that rank too or higher. Now, I'm worried about bad gaming experiences becoming a norm if I reach the Onyx level.