Topic: DSG Rating Suggestion
Replies: 26   Views: 98,414   Pages: 2   Last Post: Feb 6, 2004, 12:29 AM by: vitals

mightybyte

Posts: 10
Registered: Dec 16, 2001
DSG Rating Suggestion
Posted: Feb 4, 2004, 6:31 AM

I recently perused the DSG Ratings: Fact and Fiction thread with interest. I have a degree in math and was delighted to find a good explanation of the rating process. I didn't spend much time analyzing the methods, but I have a couple of comments.

First, it seems wrong for a player's rating to go up if the player loses and down if he wins. Even if your rating is provisional it is still wrong. Something should be done about this.

Second, it also seems wrong that playing two games against an opponent has a different effect depending on which seat you play first. I think this should be changed as well. I don't think there should be a requirement for the players to play a "set" though. Each and every game should have an effect independent of all other games. I suggest that the rating system should ignore the advantage for going first. With the tournament rule, this advantage is reduced and I think the advantage is small enough to not make enough of a difference to warrant screwing up the rating system to account for it.

I would like to know if chess ratings give an advantage to the first player. If chess doesn't give an advantage, then I don't think pente should either.

Third, I think the whole idea of a provisional rating is stupid. It makes a person's rating a non-continuous function.

Because of these suggestions, I propose a different rating system. I suggest that a player's rating consist of two numbers, the rating, and the Rating Deviation (or just RD for short). The RD is essentially a measure of how accurate your rating is. The concept can also be explained as follows. A player's rating number is just an approximation of that player's "real rating". It is impossible for the rating number to be exactly the same as their "real rating", so it is useful to know how close the rating is to the "real rating". The RD says that the person's real rating is probably within a certain distance of the displayed rating. Let's say a player has a rating of 1400 and an RD of 50. This tells us that the player's real rating is most likely somewhere between 1350 and 1450, and about 95% of the time within two RDs, between 1300 and 1500. When a player plays a game, both the rating and the RD are modified. Every time a player plays a game, the player's RD goes down to show that their rating is probably more accurate. This makes the rating function a nice continuous function which is a better reflection of the behavior of ratings.

The rating system that I have described is called Glicko. There is also a new rating system that improves on the original Glicko system. It is called Glicko-2. I suggest that one of these two systems be implemented.

For more information about these rating systems, go to:

http://www.glicko.com
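
For anyone who wants to see the rating-plus-RD idea in numbers, here is a minimal Python sketch of the Glicko-1 update as I understand it from Glickman's published description. The function names (g, expected, glicko1_update) are my own, and nothing here is DSG code. It only covers the update after games; the part where RD grows again during inactivity is left out.

```python
import math

Q = math.log(10) / 400  # scaling constant used throughout Glicko-1


def g(rd):
    """Discount factor: the less certain the opponent's rating, the smaller g."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_opp, rd_opp):
    """Expected score against an opponent, discounted by the opponent's RD."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko1_update(r, rd, results):
    """Return the new (rating, RD) after a batch of games.

    results is a list of (opp_rating, opp_rd, score) tuples,
    where score is 1 for a win, 0.5 for a draw and 0 for a loss.
    """
    if not results:
        return r, rd
    inv_d2 = 0.0  # accumulates 1/d^2, the information gained from the games
    delta = 0.0   # accumulates the discounted (actual - expected) scores
    for r_opp, rd_opp, score in results:
        e = expected(r, r_opp, rd_opp)
        inv_d2 += Q**2 * g(rd_opp) ** 2 * e * (1 - e)
        delta += g(rd_opp) * (score - e)
    denom = 1 / rd**2 + inv_d2
    return r + (Q / denom) * delta, math.sqrt(1 / denom)


# A 1500 player with RD 200 beats a 1400 (RD 30), then loses to
# a 1550 (RD 100) and a 1700 (RD 300).
new_r, new_rd = glicko1_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
print(round(new_r), round(new_rd, 1))
```

The sample call uses the same numbers as the worked example in Glickman's write-up, so the result should land at roughly 1464 with an RD near 151. Note that the RD always shrinks after a rated game, whatever the result.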


up2ng

Posts: 542
Registered: May 9, 2002
From: Northeast USA
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 8:18 AM

Hey mightybyte, it's great to hear more views on this controversial subject. I predict that you may be absolutely barraged by responses to this post so I'll try to be brief.

First, I don't have a degree in math so it would be cool if you could re-check some of my "fuzzy math" near the end of my original ratings description.

Next, I tend to agree instinctively that you should gain points when you win and lose points when you lose. This is precisely what happens once you are established. While provisional, you are generally encouraged to play against players reasonably close to your skill level to help generate a more accurate rating once established. The reasons why the provisional rating formula is the way it is aren't purely mathematical. As Dmitri pointed out, it is much easier to commit ratings fraud by taking advantage of a wild jump in rating after your first game if the formula is modified to be more mathematically pleasing.

The whole idea of provisional ratings is to quickly "zero in" on a reasonably accurate rating by the time you become established. I know that it "seems" at first like this is just causing many more problems than it solves, but consider what happens if you eliminate the provisional rating system and every new player (whether he's never heard of pente or if he's the world champion) becomes rated 1600. Well, higher rated players would not be adequately "protected" if he turns out to be the world champion, and lower rated players become inaccurately rewarded (inflated) if it turns out this player is a zoo monkey banging on a keyboard. The provisional system basically takes the place of your RD, and does it in a vastly less complicated way.

With regards to your second point, I think you may have misunderstood my original post. The rating system here at DSG does NOT inherently give an advantage to one player or the other. It adjusts your rating EXACTLY the same way, whether you were playing as player 1 or as player 2. And my contention is that THAT'S MY POINT! In other words, there's nothing in the calculation of ratings that gives player 1 an advantage. Player 1 has an advantage because of the rules of the game of pente. Furthermore, the fact that when playing two games it is more advantageous to sit in one seat first rather than the other does NOT have anything to do with the ratings formula, it is COMPLETELY a side effect of the fact that individual games are rated instead of sets. If you played and were rated based on sets, this problem ceases to exist -- BUT, a VAST many other problems also are solved by playing in sets as well.
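
To make the seat-order effect concrete, here is a small Python sketch. It assumes the established formula with K = 32 and two equally rated players who each win one game of a pair. The rate_set function is only one hypothetical way to rate a set (score the whole set against the pre-set ratings); it is not something DSG implements.

```python
def expected(r_a, r_b):
    """Standard Elo expected score for player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))


def rate_game(r_a, r_b, score_a, k=32):
    """Rate a single game; score_a is 1 if A won, 0 if A lost."""
    change = k * (score_a - expected(r_a, r_b))
    return r_a + change, r_b - change


def rate_games_one_by_one(r_a, r_b, scores_a, k=32):
    """Rate each game as it finishes, so game 2 uses the ratings left by game 1."""
    for score in scores_a:
        r_a, r_b = rate_game(r_a, r_b, score, k)
    return r_a, r_b


def rate_set(r_a, r_b, scores_a, k=32):
    """Hypothetical set rating: every game is judged against the pre-set ratings."""
    e_a = expected(r_a, r_b)
    change = k * sum(score - e_a for score in scores_a)
    return r_a + change, r_b - change


# A wins the first game and loses the second.
print(rate_games_one_by_one(1600, 1600, [1, 0]))
print(rate_set(1600, 1600, [1, 0]))
```

Rated one game at a time, whoever wins first ends the pair slightly below where they started, because their second game is played at an inflated rating. Rated as a set, a 1-1 split between equal players changes nothing.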

Now, I often make an analogy to chess to make various points, but I think you are using it here in the wrong context. You are saying that chess has the same degree of advantage for "moving the first piece" as pente does for "placing the first piece". Nothing could be further from the truth, and besides various common occurrences which actually lead to an unavoidable TIE in chess, it's also comparing apples to oranges to consider a game where you have the same number of pieces throughout to a game where one player always has an extra piece.

Now, let me be the first of I'm sure many many people to make this point: Player 1 has a SIGNIFICANT advantage to win the game of pente. I mean seriously significant. Now I want to let you know that as I graduated from being a self-proclaimed average player to a good player to even a very good player, I vigorously denied the fact that player one has a noticeable advantage when using the tournament rule. I became rated pretty high at DSG and was still completely dismissing this fact as myth. Well, once you play the game enough and think about it enough, the light just goes on one day and you realize that the player 1 advantage in pente is as clear and as true as 2 + 2 = 4. It's like Neo battling his *** off against agent Smith inside the Matrix, struggling and struggling, and one day, BAM, he becomes unbeatable -- to even fight him is simply foolish. Well, fortunately the game is complex and the human mind makes mistakes so hence the term "player 1 advantage" instead of "player 1 sure thing". But, to ignore that this fact renders the current rating system obsolete is to just put one's head in the sand and hope the problem goes away.

Changing the system to rate sets instead of games solves so many problems that I should probably receive the Nobel Prize of pente for coming up with it (of course, I probably wasn't the first to come up with it but that's ok). When new players join DSG with the software changed to a "set friendly atmosphere", they won't even realize that it was once any other way. And when the geniuses of the game who have retired due to the "futility of overcoming the player 1 advantage" learn through the grapevine that the rating system has been fixed, they will come back in droves.

To end this "brief" response on a positive note, I really do like the ideas presented about having a standard deviation calculation factor into ratings adjustments. However, I'm afraid that the consensus will surely be that those alternative rating systems are just too complex to be understood, and too unwieldy to be implemented.

I just realized that this has a harsh tone to it -- that was not my intention. I think you may have just misunderstood some of the previous points posted and I was just trying to clarify.

Always,
up2ng

dmitriking

Posts: 375
Registered: Dec 16, 2001
Age: 40
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 1:33 PM

mightybyte: "First, it seems wrong for a player's rating to go up if the player loses and down if he wins. Even if your rating is provisional it is still wrong. Something should be done about this."

you offered no justification for this statement. just stating it does not make it so. I have clearly explained why the rating should go down after a win and no one has refuted my explanation in any way.

"With the tournament rule, this advantage is reduced and I think the advantage is small enough to not make enough of a difference to warrant screwing up the rating system to account for it. "

this is an indefensible position. the player 1 advantage is overwhelming. I defy anyone to show me a win for player 2. point out any of my few losses as player 1, and I will quickly show the winning sequence for player 1 that was missed.

"Third, I think the whole idea of a provisional rating is stupid. It makes a person's rating a non-continuous function."

nonsense. the provisional system is useful and its advantages have been discussed. to say it is a non continuous function is injecting meaningless math jargon as a means of confusing people. first, your statement is vague at best; the rating is still continuous, it just changes at a different rate. second, if it were discontinuous, who cares? are we taking derivatives of ratings? what does continuity of the function have to do with anything?

there is a simple way to avoid losing points after a win: DON'T PLAY SOMEONE RATED MORE THAN 400 POINTS BELOW YOU WHILE PROVISIONAL! this is the time to establish oneself, so he should play people close to him in skill.

not having the provisional system at all is no good. the ratings would not be protected this way. someone who plays 1 game and wins should not immediately be established as 1800! if he were, then a 1800 established player could come along and gain 16 points by beating him. this does not make sense.

If I do not accept a game invite right away, it means I will once I have fewer games in progress.
mightybyte

Posts: 10
Registered: Dec 16, 2001
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 6:23 PM

First of all, in regards to making something so by stating it. I never intended for my ideas to be "made so". In no way were my comments intended to be comprehensive defenses of fact or opinion. They are just ideas. If I stated something as fact, it was because I did not feel like typing "I think" in front of every sentence that wasn't a concrete fact.

If you don't agree that it is wrong for a player's rating to go down when they win and up when they lose, that's fine. I'm not going to present a mathematical argument because it is a waste of my time to prove something that is common sense.

I'll give you the first player advantage. The way I would suggest dealing with it would be to do something like the Free Internet Chess Server. When you play against someone, it automatically chooses the sides for you and makes sure that you switch off. You can override its choice if you want, but that is usually not done. It's harder to do with DSG's format of sitting at a table though.

The idea of a provisional rating is a kludge that people used before the glicko rating system was developed. It has seemed to work well and I'm not going to argue that. My point is that there is already something better. If there is controversy about the current rating system, then why not try one of the better systems?

dmitriking: "not having the provisional system at all is no good. the ratings would not be protected this way. someone who plays 1 game and wins should not immediately be established as 1800! if he were, then a 1800 established player could come along and gain 16 points by beating him. this does not make sense."

You completely missed my point about the RD. When a new player comes in, he automatically starts with a high RD which basically labels him as what we now call provisional. Because of his high RD, his score would be able to fluctuate a lot more to find the right value. If an 1800 established player played an 1800 new player, then the established player would not gain as much because of the relatively low confidence in the new player's rating.

I'm not saying we should simply throw away provisional ratings. I'm saying that we should throw them away and replace them with something better. I would also suggest that you look at the site I referenced before you comment on what my suggestion can and can't do.
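
To put rough numbers on the RD point, here is a compact Python sketch of a single-game Glicko-1 update. The RD values are assumptions for illustration: 350 is the starting RD Glickman suggests for an unrated player, and 50 is my stand-in for a well-established player.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant

ESTABLISHED_RD = 50  # assumed RD of a player with a long history
NEW_RD = 350         # assumed RD of a brand-new player


def g(rd):
    """Discount factor for an opponent whose rating is uncertain."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def single_game_change(r, rd, r_opp, rd_opp, score):
    """Glicko-1 rating change for one game (score: 1 win, 0 loss)."""
    e = 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))
    inv_d2 = Q**2 * g(rd_opp) ** 2 * e * (1 - e)
    return Q / (1 / rd**2 + inv_d2) * g(rd_opp) * (score - e)


# An established 1800 player beats a brand-new player listed at 1800 ...
gain_vs_new = single_game_change(1800, ESTABLISHED_RD, 1800, NEW_RD, 1)
# ... versus beating another established 1800 player.
gain_vs_established = single_game_change(1800, ESTABLISHED_RD, 1800, ESTABLISHED_RD, 1)
# The newcomer's own rating moves a lot after that single loss.
newcomer_change = single_game_change(1800, NEW_RD, 1800, ESTABLISHED_RD, 0)

print(gain_vs_new, gain_vs_established, newcomer_change)
```

Under these assumptions the established player gains less for beating the newcomer than for beating another established player, while the newcomer's rating drops by far more than either gain. That is the job the provisional period does now, done without a hard cutoff at 20 games.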

elena

Posts: 6
Registered: Dec 5, 2002
From: CA
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 9:00 PM

If you want to see how ratings would be calculated using Glicko rating system, you can play with it on http://www.users.on.net/bjcox/downloads/glicko.xls

This is for Glicko-1, not for improved Glicko-2, but still gives some idea.

I hope it helps a bit

cicerolove

Posts: 46
Registered: Feb 1, 2002
From: Little Elm
Age: 32
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 9:08 PM

Okay, I went to glicko.com and I read through the paper he wrote concerning the deficiencies in the Elo system and how his system addresses it. His basic premise is that an historical element should be included in the ratings system to give a better "trustworthiness" to players' ratings. I have to say that I find this all very interesting but completely unnecessary. Let me to the hounds...

First of all, a rating system will never be more than a best guess of someone's ability within the narrow universe of the participants in that rating system. That is to say, that by and large any rating system will do so long as it applies the same rules to all participants in that rating system. Having said that, any rating system is accurate so long as it is equitable because rating systems are there merely to give a rough guide of a person's ability in relation to his opponents and his opponents' relations to other opponents etc. I understand that Mr. Glickman is trying to basically eliminate latency in moving ratings to be better reflective of a player's skill.

My suggestion for the site is to not touch the ratings system as we have it. The only thing we should do is better report the exceptions and allow people to decide the ability of their opponent. So, in my opinion, we should have a compound rating which is the player's performance in both seats and then a white rating and a black rating which is merely the performance of a player in the respective seats. If a player has a high white rating and a very low black rating then you know that they either a) don't play black very well at all (not bloody likely) or b) they play primarily when they know they have the advantage. As Dmitri suggested to me in IM, another statistic would be the avg rating of your opponent from each seat. This would tell you whether they play hard opponents from black on a regular basis which may account for their low black rating. Either way, trying to come up with a single number, no matter what system you try, will always result in exceptions and problems. The best thing is to give enough data for players to form their own opinions about the strength of a player.

As to whether a player should lose points when he wins or gain points when he loses, I think we can separate out two issues here. I think your rating should go down if you beat someone who is 400 rating points below you. That is a ridiculous game to count and if you persist in beating up on weaker players, you should be punished. I find it distasteful that anyone should defend a player gaining points from playing weaker players regularly. However, conversely, one should never gain points for losing even against substantially better players. At worst, there should be no change for the losing player. Now you may say this will stifle play between vastly different skill sets. I beg to differ. There's always unrated play. It's the responsibility and duty for upper players to make a concerted effort to play lower players, but those games don't have to be rated and probably shouldn't.

And finally, we have to keep in mind that we are talking about a very short time period for most players at the beginning of their Pente play. I would hate to think we should change an entire and complete system for the sake of 20 games per player. Kludge or not, everyone's rating seeks its own level mysteriously. I mean you can look at the ratings now of established players and I would tell you that every player within 25-50 ratings points are equally matched for the most part. Why mess with that?
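
As a rough illustration of the per-seat reporting idea, here is a Python sketch that tracks results and average opponent rating separately for each seat. The class and field names are made up for this example, and it reports plain statistics (win rate, average opponent rating) and does not compute a separate Elo-style number per seat.

```python
from dataclasses import dataclass, field


@dataclass
class SeatStats:
    """Running totals for one seat (white = player 1, black = player 2)."""
    games: int = 0
    wins: int = 0
    opp_rating_total: float = 0.0

    def record(self, opp_rating, won):
        self.games += 1
        self.wins += int(won)
        self.opp_rating_total += opp_rating

    @property
    def win_rate(self):
        return self.wins / self.games if self.games else None

    @property
    def avg_opp_rating(self):
        return self.opp_rating_total / self.games if self.games else None


@dataclass
class PlayerProfile:
    """Hypothetical profile holding one SeatStats per seat."""
    name: str
    seats: dict = field(
        default_factory=lambda: {"white": SeatStats(), "black": SeatStats()}
    )

    def record(self, seat, opp_rating, won):
        self.seats[seat].record(opp_rating, won)


profile = PlayerProfile("example_player")
profile.record("white", 1500, True)
profile.record("white", 1700, False)
profile.record("black", 1900, False)
for seat, stats in profile.seats.items():
    print(seat, stats.games, stats.win_rate, stats.avg_opp_rating)
```

A low black win rate next to a high average black opponent rating would suggest the player is taking on tough opponents from the weaker seat, which is the distinction the extra statistic is meant to reveal.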

dweebo

Posts: 1,032
Registered: Dec 16, 2001
From: Powell, OH
Age: 37
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 10:10 PM

Ok, I've kept pretty silent about these ratings discussions until now. I'll just say a few things that have popped into my head while skimming these posts.

1. Some of you have too much free time
2. I think these discussions are great!
3. I am not against changes in the DSG ratings calculations. I am not in love with the current system, it was something I found online at the beginning of the site, almost all other code has changed since then.
4. I agree that P1 has an extreme advantage and "should" always win. This is probably up2ng's main argument for set-based ratings and it is a good argument.
5. Practically I think set-based is kind of weird. Problems probably already brought up that would need to be resolved:
a. Players who ditch after losing game 1. Could keep track of how many times players have done this I guess as a warning to other players.
b. Coding wise it's probably a bit of work, but could be done, except I don't have much free time!
c. Player confusion. If set-based is optional that would help.
6. Different people view the ratings in different ways.
a. Some view ratings as rankings. I tend to be one of these people, I want to have a high rating, but really I want to be one of the top 5 pente players.
b. Some view ratings as a way to find other players to play with similar abilities.
c. Some hate ratings and just want to play.
7. For people like me in the 6a camp, maybe a ladder is really what we want. I've always wanted a ladder at DSG.
8. I'm not necessarily against complicated formulas for ratings. I think "most" people won't care how it's done as long as it is perceived as fair.
9. We could experiment and have 2 or 3 ratings for each player, each rating using a different formula. Or implement set-based as an optional feature.
10. I am not really looking into the math here, but perhaps another option would be to tweak the "weight" of the rating changes depending on the level of the two players' ratings. This follows from the argument that p1 should always win. Of course usually the people who believe that have higher ratings. So if 2 players are playing with high rankings, say over 1800 for an example, then for them, if white wins, the ratings shouldn't change much, since that is what is expected. If black wins then the ratings should change A LOT because it is an upset. However, for players with lower ratings, there probably isn't much of an advantage for white, so there wouldn't be much difference between the "weights" for a black or white win. Obviously this idea isn't easily codified into a formula and other issues remain, such as what about when a high ranking player plays a lower ranking player. It's just an idea I came up with as I'm writing this. Maybe one of you math majors can think about its implementation.
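
One possible way to codify idea 10, purely as a sketch: give white (player 1) a rating-point head start inside the expected-score formula, and let that head start grow with the average rating of the two players. Every number below (the 1400 floor, the 2000 ceiling, the 200-point maximum edge) is a made-up placeholder, not a measured value.

```python
def p1_edge(avg_rating, floor=1400, ceiling=2000, max_edge=200):
    """Hypothetical head start for white, in rating points.

    Zero at or below `floor`, rising linearly to `max_edge` at `ceiling`.
    """
    t = (avg_rating - floor) / (ceiling - floor)
    return max_edge * min(1.0, max(0.0, t))


def seat_aware_update(r_white, r_black, white_won, k=32):
    """Elo-style update where white is expected to win more often at high levels."""
    edge = p1_edge((r_white + r_black) / 2)
    e_white = 1 / (1 + 10 ** ((r_black - r_white - edge) / 400))
    change = k * ((1 if white_won else 0) - e_white)
    return r_white + change, r_black - change


# Two strong players: a white win moves little, a black win is a big upset.
print(seat_aware_update(1900, 1900, white_won=True))
print(seat_aware_update(1900, 1900, white_won=False))
# Two weaker players: the seat makes no difference.
print(seat_aware_update(1200, 1200, white_won=True))
print(seat_aware_update(1200, 1200, white_won=False))
```

This handles the mixed case (a high-rated player against a low-rated one) by using the average of the two ratings, which is just one arbitrary choice among several. The real difficulty is that the size of the edge would have to be estimated from actual game results.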

Pente Rocks!
mightybyte

Posts: 10
Registered: Dec 16, 2001
Re: DSG Rating Suggestion
Posted: Feb 4, 2004, 11:51 PM

First let me say, that I think the current rating system is pretty good. I used to have zero confidence in the ratings, but that was back when Dweebo's was much younger. Now I think the ratings are much better descriptions of the player's strength.

Regarding the issue of differences for player 1 and player 2, I really like cicerolove's idea of having a separate rating for player 1 games and player 2 games. If you try to build some kind of player 1 bias into the rating system, there will always be a question as to whether it is correct. If you have separate ratings, then players can interpret it however they want. I also like the idea of having a composite rating.

Like cicero said, we're trying to paint a picture of the person's skill whether it be relative or absolute. Multiple numbers communicate more information (that might not be useful) and could do a better job of communicating a player's skill if done right. I like the idea of the RD from the glicko rating system because it communicates another piece of information that is more specific than just the number of games a player has played.

I think a rating system with the following attributes would be nice:

Player 1 rating
Player 2 rating
Overall rating
Overall RD

It can be argued that this is too complicated and will confuse people. In that case, the Overall rating is what those types of people are looking for.

I also think it would be good to have two rating categories. One for fast games and one for slow games. That would eliminate the situation with some people having separate speed names and some people using only one name.

Of course, I don't expect that this will be done. I am just putting it as an idea for a future improvement.

dmitriking

Posts: 375
Registered: Dec 16, 2001
Age: 40
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 12:11 AM

mightybyte said:

"I think a rating system with the following attributes would be nice:

Player 1 rating
Player 2 rating
Overall rating
Overall RD

It can be argued that this is too complicated and will confuse people. In that case, the Overall rating is what those types of people are looking for.

I also think it would be good to have two rating categories. One for fast games and one for slow games. That would eliminate the situation with some people having separate speed names and some people using only one name."

i think these are all good ideas. Greg and i discussed having different ratings for player 1, player 2, and overall.

making speed games separate is good also -- too many people use their "speed" name for normal, thus falsely affecting their and other people's ratings.

mightybyte, to address your comment: "If you don't agree that it is wrong for a player's rating to go down when they win and up when they lose, that's fine. I'm not going to present a mathematical argument because it is a waste of my time to prove something that is common sense."

you're right, i don't use "common" sense, that is for common thinkers, i like to think i use extraordinary and insightful sense, which can cause confusion among those who are not equipped to handle this level of thought. Now, you do seem very capable, so that means i might have done a less than perfect job of relaying my thoughts, so I'll cut and paste greg's thoughts here as a retort to you:

"As to whether a player should lose points when he wins or gain points when he loses, I think we can separate out two issues here. I think your rating should go down if you beat someone who is 400 rating points below you. That is a ridiculous game to count and if you persist in beating up on weaker players, you should be punished. I find it distasteful that anyone should defend a player gaining points from playing weaker players regularly."

now, can you seriously argue with that? if so please defend your position, don't just say "it's wrong." Greg and i have given several compelling reasons for our position.

If I do not accept a game invite right away, it means I will once I have fewer games in progress.
up2ng

Posts: 542
Registered: May 9, 2002
From: Northeast USA
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 1:12 AM

Well, I told myself that my last post was my last post on this topic, but after reading all the subsequent responses, I just can't help myself.

Let me summarize here in case you don't want to read further:

1. KISS
"Keep It Simple Stupid": In general, if you start presenting vastly complicated solutions to a simple problem you are probably taking the wrong approach.

2. Pente should be played in sets. Period.
Basically every single "problem" that everyone has raised over the last several threads having to do with ratings can be 100% solved by playing pente in sets. Open your mind and take a few minutes to really think about this.

Specifics:
To mightybyte:
I do agree with the notion that it seems against common sense to lose points when you win and gain points when you lose. However, the solution of simply eliminating this behavior as a special case of the current provisional rating formula is a bad idea. For brevity, I will not explain further. Instead, since dweebo is actually open to possible formula changes (which I'm frankly surprised about), let me open a whole new can of worms and propose (or rather brainstorm) that the provisional rating formula be replaced by one that takes on the same characteristics as the established formula. Example -- same formula, different parameters:

R1new = R1 + K * (w - (1 / (1 + (10^((R2 - R1)/400)))))
where K = 320 / ((n / 2) + 1)
n/2 may or may not be integer division, n is number of games previously played.
In this case, players start out with a maximum gain/loss ability of 320 points on their first game, and this decreases gradually over 20 games until K = 32 when the player becomes established.
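
The formula above can be tried out with a short Python sketch. It assumes w is 1 for a win and 0 for a loss (the post does not define w), and it takes the integer-division reading of n/2 by default; both are my assumptions.

```python
def provisional_k(n, integer_division=True):
    """K factor after n previously played games: K = 320 / ((n / 2) + 1)."""
    half = n // 2 if integer_division else n / 2
    return 320 / (half + 1)


def expected(r1, r2):
    """Expected score of player 1 against player 2 (standard Elo curve)."""
    return 1 / (1 + 10 ** ((r2 - r1) / 400))


def new_rating(r1, r2, w, n):
    """R1new = R1 + K * (w - expected); w is 1 for a win, 0 for a loss."""
    return r1 + provisional_k(n) * (w - expected(r1, r2))


# K schedule across the provisional period.
for n in (0, 2, 10, 18, 19):
    print(n, provisional_k(n), provisional_k(n, integer_division=False))

# First-ever game between two 1600 players.
print(new_rating(1600, 1600, 1, 0), new_rating(1600, 1600, 0, 0))
```

With integer division, K reaches exactly 32 at n = 18 and n = 19 (the 19th and 20th games), which matches the description. With true division it dips slightly below 32 on the 20th game. Because K is always positive and the expected score is always strictly between 0 and 1, a win can never lower a rating and a loss can never raise one under this formula.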

Another small point -- In light of many suspected instances of ratings fraud behaviors, I might consider placing a provisional CAP at 2000, or 1900 or whatever is decided upon. A player should not be able to claim the #1 spot on the board after completing exactly 20 games, and only ever play 600 rated players from then on in the future to remain at #1. A "provisional CAP" would at least 99% solve this issue. And hey, if the guy becomes established at 2000 and really deserves to be rated 2500, he can play a few dozen established games against top players to become #1. No big deal.

Now I'm really digressing out of order here but this leads into dweebo's thoughts about implementing a ladder, which I like pretty much. For example, once a player cracks the top 10, it might be interesting if he had to play other top 10 (or 20) players to improve his RANKING (perhaps ratings and rankings would not necessarily correspond at the top of the ladder). At the same time, higher ranked top players, such as #1, should be encouraged to play top players regularly or risk slipping in rank slightly over time. This is similar to boxing or tennis. I think this is much lower priority than improving the rating system however.

Next to mightybyte:
You mentioned that a chess site picks sides for you and switches off for you. Well, a set-based DSG game room would enforce switching off or do so automatically. Picking the side for you is unimportant. If you were suggesting that this is done for "one game only" then that's ludicrous. Pente should be played in sets. Comparing the player 1 advantage to the one experienced in chess is apples and Volkswagens. Enforce set play, this problem goes away.

Finally, I was originally more polite about the RD suggestion, but I want to state here that I really do not like this idea. Remember, KISS. Having more and more data and parameters makes the whole system less and less meaningful -- people want to see a number, not a spreadsheet. Plus, the RD system is vastly more complex to calculate adjustments. Finally, if you try to make it "appear" simple by not publishing other players' RD values, then ratings adjustments after games will appear random and unfair. Let's not go down that road. Nor do I like the idea of a separate rating for white vs black. KISS. Play pente in sets, this problem goes away.

To Cic:
I agree with a great many of your comments. But, here is an excerpt from your post where I disagree with almost every word:

"So, in my opinion, we should have a compound rating which is the player's performance in both seats and then a white rating and a black rating which is merely the performance of a player in the respective seats. If a player has a high white rating and a very low black rating then you know that they either a) don't play black very well at all (not bloody likely) or b) they play primarily when they know they have the advantage. As Dmitri suggested to me in IM, another statistic would be the avg rating of your opponent from each seat. This would tell you whether they play hard opponents from black on a regular basis which may account for their low black rating. Either way, trying to come up with a single number, no matter what system you try, will always result in exceptions and problems. The best thing is to give enough data for players to form their own opinions about the strength of a player."

Two ratings = bad. Less meaningful, more complex, just bad. Adding more and more information such as opponent's average rated opponent -- ugh, I'm getting a headache. I wanna see my opponent's rating, and I want that number to mean something. Keep it simple. Play pente in sets, this problem goes away.

Furthermore, the statement that a single rating cannot be achieved without exceptions and problems -- well, play pente in sets -- you will see that this solves many problems. Let me repeat again, since this fact seems to be bouncing off a lot of skulls, what the real underlying problem is that is causing so much frustration and ends up chasing the best players out of the game:

PLAYER 1 HAS A SIGNIFICANT ADVANTAGE IN PENTE

The reason why this causes so many problems in terms of ratings is kind of subtle I guess -- you really have to think about it for a few minutes. Let me spare you. If you have an even playing field environment, you can use the results of the competition to directly generate ratings (and rankings). Let's say you are a world class track and field runner and specialize in the 100 meter dash. You line up 8 competitors next to each other and they race to the finish line. The first to cross is ranked #1, the second to cross is ranked #2, etc. Now suppose you take those same 8 competitors for another 100 meter race. But this time you let the first 4 competitors run along a 10 degree downslope. The other 4 competitors are forced to run along a 10 degree upslope. How do you accurately rank the 2nd to cross the downslope against the 4th to cross the upslope??? I challenge you to tell me which runner would be faster in a fair race. Using a single formula in this case is just hogwash. No matter what you say or how you tell me that it all evens out, it doesn't. It becomes worse and worse and worse and worse and worse and worse and worse -- until the top players get frustrated and quit the game.

Now imagine the nirvana that is the even playing field. Imagine "fair" elections and unbiased courtrooms. Imagine *gasp* a pente match that is completely, totally and indisputably FAIR. What? Omigosh! What about the player 1 advantage? What? *whisper whisper grumble whisper* You mean ... there is such a thing as a FAIR pente match? There really is?? Really??? Really?????

Yes. Keep it simple. Play pente in sets. Fix ratings. Bring back the top players.

This is so critical that if I knew the email address for the CEO of Winning Moves I would email him tonight and demand that playing pente in sets be added to the official rulebook of the new games to be distributed this year. Convincing him to do so should not be hard when he realizes that game sales will increase by thousands of dollars by doing so. It is that important.

To Dweebo:
Thankfully, thus far it appears that your thoughts are right in line with mine on most topics. To respond:
1. Agree
2. Agree
3. Agree
4. Strenuously Agree
5. Agree that there would be coding challenges. Thus, even if you decided to make the change to set-based matches, I honestly do not expect to see this change actually happen. Certainly not soon anyway.
a. This can be solved.
b. agree
c. Disagree. I think there will be significantly less confusion among new players than you'd think -- especially when compared to many of the highly complex alternate solutions proposed above by others. There will also be significant new life breathed into the game amongst the veteran expert players who understand that the 2000 - 2100 barrier is propagated by the player 1 advantage and the inability of the rating system to deal with it. I strenuously object to making set-based play "optional". I would rather see ratings and rankings completely abolished than to see ratings determined inconsistently. The next step would be to implement a ratings system that says, 1. Flip a coin 2. If heads, use the current formula, if tails, add 1000 to your score. Meaningless would be a gross understatement.
6. Agree
7. Agree -- a well thought out ladder system would be interesting. Perhaps independent of ratings (such as how it was done for the penteaddiction league at case's ladder).
8. Disagree, although fairly indifferent. I would care about how ratings are determined, but I'm not sure why and perhaps most people wouldn't -- although it is one of the questions newbies ask me most often.
And the formula and system of implementation should be simple. KISS. Play pente in sets, this problem goes away.
9. Generally I don't like these ideas, but not the end of the world to experiment.
10. "Tweaking" ratings adjustments based on white or black is a bad idea. Any time you are setting arbitrary parameters trying to find the "right balance" and designing a whole system of computation around it, you are probably taking the wrong approach. KISS. Play pente in sets, this problem goes away.

TO SUMMARIZE:
Keep it simple.
Play pente in sets, change ratings based on sets.

This is my last post on the subject I promise

Always,
up2ng

mightybyte

Posts: 10
Registered: Dec 16, 2001
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 3:45 AM

I completely understand why you think it is good for a player's rating to go down when they win a game. You are trying to prevent people from abusing the rating system. However, I just don't agree and here's why.

First of all, if you make a player's rating go down when they win a game, they have no incentive to play that game in the first place. They would have been better off not playing it at all. This type of system stifles the overall desire to play pente games. Is this something we want to do? No. THAT is common sense. And it will not be eclipsed by any kind of so-called "extraordinary sense".

Now, I also understand that this counterintuitive behavior only happens when a player has a provisional rating. The purpose here is to encourage players to play similarly rated people during the time their rating is provisional. However, during that time, they probably have no idea how the rating system works and will be shocked to see that their rating behaves differently than it intuitively should. Hmmm, this just serves to frustrate new people. That's not something we want to do either.

There is another problem with your reasoning. You are assuming that by playing someone much lower than you, you are beating up on them. That is not always the case. It could be that the better player is teaching the weaker player during the game. In this case, it is a very good game. However, the game does not have any meaning for the players' ratings. Therefore, the strong player's rating should not go up and the weak player's rating should not go down. They should stay the same. There is nothing wrong with this behavior.

You can argue that those two players should play the game unrated in that case. That is a fine option, but it won't always happen. It makes sense for the rating system to recognize a meaningless game--meaningless in that it doesn't generate any new rating information unless there is a huge upset--and make sure that ratings are not modified because of it.

Both of these problems can be fixed by making the player's rating go up 0 points when they win a game against a much lower rated player. Now, there is no benefit to players who abuse the system by playing people much lower than them. The only thing in it for them is the risk of losing a bunch of rating points if they lose the game. This is enough risk to serve as a deterrent to beating up on weak players.

Your argument is that a "sometimes wrong" rating system prevents people from beating up on weak players. In the process, I have demonstrated that this behavior has overall negative effects on the pente community.

My argument is that by taking out the backwards rating behavior and making the ratings not change, you still can remove the benefit of beating up on weaker players, but retain the risk that this activity will significantly reduce their rating. It accomplishes the same goals without the downsides.
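
A minimal sketch of that rule, assuming whatever formula the site uses has already produced a raw rating change for the game. The function name `clamp_rating_change` and the sample numbers are made up for illustration; this is not DSG's actual code.

```python
def clamp_rating_change(raw_change: float, won: bool) -> float:
    """Never let a win lower a rating or a loss raise it.

    raw_change is the adjustment produced by the underlying rating
    formula (a hypothetical input; the real DSG formula is not shown here).
    """
    if won:
        return max(raw_change, 0.0)  # a win moves the rating up or not at all
    return min(raw_change, 0.0)      # a loss moves the rating down or not at all


# A provisional player beats a much weaker opponent and the raw formula
# says -35: the clamped change is 0, so the win costs nothing.
print(clamp_rating_change(-35.0, won=True))
# Losing that same game still costs the full amount.
print(clamp_rating_change(-120.0, won=False))
```

The benefit of beating up on weak players is gone (the gain is 0), and the risk stays exactly as large as before.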

However, I essentially agree with your goals of creating a rating system that is not easily exploitable and provides a good estimation of the player's ability. To be honest, I believe that DSG has a good system right now (besides the occasional counterintuitive behavior). I suggested the Glicko system because it provides a generalized solution to the problem without requiring the separate cases of provisional and established ratings.

dmitriking

Posts: 375
Registered: Dec 16, 2001
Age: 40
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 4:12 AM

i think we're going around in circles. keep in mind, the rating drop after a win happens ONLY during the first 20 games and ONLY when playing a player whose rating is more than 400 points below the other player. you said this stifles play, i say that's silly, no one should have trouble finding opponents within 400 points of him!

you say maybe the "good" player is teaching the "bad" player a lesson? that's really reaching for lame ways of supporting a weak argument. he can teach the player A) with rating turned off and B) after he is established. you are using an unlikely scenario as supporting evidence.

now you are saying (and not for the first time) that the rating should simply not change. i have addressed this many times so I'll use caps this time:

BEATING A 1600-RATED-PLAYER WHO HAS NO GAMES PLAYED, AND THEN WINNING 19 AGAINST A 600-RATED-PLAYER SHOULD NOT CREATE AN ESTABLISHED RATING OF 1800!

YET, THAT IS EXACTLY WHAT YOUR PROPOSED SYSTEM DOES.

to use your words, THAT DOESN'T FIT WITH COMMON SENSE!

not everything has to make sense on first inspection, as long as it makes sense in practice.

If I do not accept a game invite right away, it means I will once I have fewer games in progress.
mightybyte

Posts: 10
Registered: Dec 16, 2001
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 5:00 AM

I have never proposed a system where the scenario you propose will happen. If a new player beats a 1600 rated player and then wins 19 games against a 600 rated player, then his rating should be slightly above 1600. If the current system makes that player 1800 then it is not accurate.

Also, those 19 games that he played against the 600 player should not count toward the 20 games to become established. However even if our current system gave the player the correct rating of just over 1600, it would still have made the mistake of making the player established.

My whole argument is that as currently implemented, the provisional system is very inflexible and incapable of properly dealing with some situations.

The glicko system is much better equipped to intelligently handle a wide variety of situations. If the player plays a bunch of games against someone much lower than them, then it will not reduce their RD. Essentially, it would still be saying that the player was provisional.

That's why it's a good system. It doesn't stick a fixed meaning to the idea of provisional and established ratings. It uses a number that can accurately represent a range of provisionalness or establishedness.

The math is more complicated, but that's to be expected since it is a more robust system that can effectively deal with more varied situations.
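
For anyone curious what that math looks like, here is a rough sketch of a one-game update using the published Glicko-1 formulas. The function names and the sample ratings are only illustrative, and the sketch treats each game as its own rating period and leaves out the step where RD grows back during inactivity.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd: float) -> float:
    """Discount factor for an opponent whose own rating is uncertain."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def glicko_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; score is 1 for a win, 0 for a loss."""
    g_opp = g(opp_rd)
    expected = 1 / (1 + 10 ** (-g_opp * (r - opp_r) / 400))
    # Information gained from the game: largest when expected is near 0.5,
    # close to zero when the result was almost certain beforehand.
    info = Q**2 * g_opp**2 * expected * (1 - expected)
    new_var = 1 / (1 / rd**2 + info)
    new_r = r + Q * new_var * g_opp * (score - expected)
    return new_r, math.sqrt(new_var)


# A 1600 player with RD 200 wins one game (sample numbers only).
print(glicko_update(1600, 200, 600, 50, 1))   # opponent rated far below
print(glicko_update(1600, 200, 1600, 50, 1))  # opponent rated the same
```

With these sample numbers, the win over the 600 player gains less than a point and leaves the RD just under 200, while the win over the equally rated player gains roughly 86 points and pulls the RD down to about 174. A win never lowers the rating, and a lopsided game barely counts toward becoming "established".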

thad

Posts: 54
Registered: Feb 21, 2003
From: Hawaii
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 7:22 AM

A ranking/rating is supposed to reflect a player's skill level. When a player beats another player who is ranked much lower than he is, no new information about that player's skill level has been gained, so his rating should not be adjusted AND IT CERTAINLY SHOULD NOT BE LOWERED!!

Here it is:
Walt and Larry play.
Walt beats Larry.
Walt is the winner.
Larry is the loser.

If Walt's rating is already much greater than Larry's, then their ratings already reflect the fact that Walt will/should/could beat Larry and their ratings should not be modified.

If their ratings are close, then Walt's should be increased and Larry's should be lowered.

There is NO information that indicates that Walt's rating should be lowered. There is NO information that indicates that Larry's rating should be raised.

It doesn't matter how many other games either player has played. There is no justification for lowering Walt's rating and no justification for raising Larry's. Period.
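
To put rough numbers on this, here is a small sketch using the standard Elo expected-score formula. This is not DSG's formula; the K-factor of 32 and the sample ratings are assumptions picked only for illustration.

```python
def expected_score(rating: float, opp_rating: float) -> float:
    """Standard Elo expected score for the player rated `rating`."""
    return 1 / (1 + 10 ** ((opp_rating - rating) / 400))


def elo_change_for_win(rating: float, opp_rating: float, k: float = 32) -> float:
    """Points the winner gains; k=32 is an assumed K-factor."""
    return k * (1 - expected_score(rating, opp_rating))


# Walt wins against opponents rated further and further below him.
walt = 1900
for larry in (1900, 1700, 1500, 1000):
    print(larry, round(elo_change_for_win(walt, larry), 2))
```

The gain shrinks toward zero as the gap grows, but it never goes negative: an expected win adds almost nothing, and nothing in the formula ever takes points away from the winner.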

Thad

thad

Posts: 54
Registered: Feb 21, 2003
From: Hawaii
Re: DSG Rating Suggestion
Posted: Feb 5, 2004, 7:39 AM

up2ng,

While your suggestion to always play Pente in sets solves many rating/rankings problems, it introduces more problems than it solves.

Playing sets only is no good when:

A group of players at a table are playing 'I've got winner'. As it is now, whoever wins plays whoever's next. Pretty simple, pretty fun, etc. It won't work with sets because there'd be too many ties and those waiting would get sick of waiting.

When I want to work on a new line & try it on some players, sometimes I only want to be Player 2 (or sometimes I only want to be Player 1).

Quite often, when I come to DSG, there are only one or two other players at the site. If I'm ranked a good bit above (or below) them, we can make things more even by my only playing as Player 2 (or only as Player 1 if I'm ranked way below).

If I only have time for one more game.

If I happen to have played an odd number of games and need to leave the site to take care of something else going on at home/work.

I'm sure you get the idea here. The point is, sets solve the ratings/rankings problem by forcing everyone to play an even number of games as P1 & P2, but in the big picture, forcing everyone to play an even number of games as P1 & P2 is a bad thing.

Thad

PS: I know you said your most recent post was your last on the subject, but I don't mind one more post (or a private email) if you care to comment further.
