Topic: DSG Ratings -- Fact and Fiction
Replies: 40   Views: 151,868   Pages: 3   Last Post: Jun 19, 2008, 5:17 AM by: zoeyk

vitals

Posts: 39
Registered: Dec 16, 2001
From: ~THE BACK WOODS~
Re: DSG Ratings -- Fact and Fiction
Posted: Feb 4, 2004, 4:03 AM

Sounds WONDERFUL NG, I am all for it! Bring it one! Let it happen!
*~*~*~*~*~*~WOOOOOOOOHOOOOOOOO~*~*~*~*~*~*

Perhaps some type of poll could be added to the home page here at DSG? That would at least present this idea to the players that do not check the forums or receive the posts via email.

Peace Out
-V

vitals

Posts: 39
Registered: Dec 16, 2001
From: ~THE BACK WOODS~
Re: DSG Ratings -- Fact and Fiction
Posted: Feb 4, 2004, 4:17 AM

******Bring it ON! - even*******

emerald

Posts: 58
Registered: Jan 19, 2002
From: Peoria Arizona
Age: 43
Home page
Re: DSG Ratings -- Fact and Fiction
Posted: Feb 8, 2004, 6:34 AM

His rating probably went down because he was still provisional. That is possible, and I have seen it a few times.

jasonholl

Posts: 1
Registered: Feb 11, 2003
Age: 34
Re: no no no no no no
Posted: Mar 6, 2004, 5:22 PM

I too have played here and won a game, and my score got lower. Since then it has put me off playing here too much. Once I have played 25 games of the same type, would my score still get lower if I play someone with a lower rating than me (even if I win)? I have been on here a few times and turned down games because the opponent had a lower rating.

dmitriking

Posts: 375
Registered: Dec 16, 2001
Age: 40
Re: no no no no no no
Posted: Mar 8, 2004, 3:34 AM

Jasonholl, after completing 20 games, your rating would never go down after a win.

But, after your first win, your rating did go down, from 1600 to 1510. This is odd, because your opponent was rated 1020. I was under the impression that only an 800 point difference could cause a ratings drop after a win. We should see what dweebo has to say about it. But, for you to say it has put you off from playing here too much, don't you think you're overreacting a bit? Most opponents will be rated such that you would not lose points for a win.

If I do not accept a game invite right away, it means I will once I have fewer games in progress.

dweebo

Posts: 1,032
Registered: Dec 16, 2001
From: Powell, OH
Age: 37
Home page
Re: no no no no no no
Posted: Mar 8, 2004, 5:52 PM

Hmm, yes this appears to be another interesting "issue/bug/feature" of the ratings formula that only applies in the FIRST game played.

If you follow the formula as coded (earlier in this post) it does work out. The problem seems to be that the starting rating of 1600 isn't factored into the new rating. This is because of this line in the formula:

rating = (rating * getTotalGames() + gameValue) / (getTotalGames() + 1);

In the first game, getTotalGames() returns 0. Maybe it should return 1? In other words it should include the current game in the total game count. The formula is unclear about that but it does seem like what it is doing now is wrong.

The other consequence of this is that if you are established and play a provisional player in his/her first game, your rating will not change AT ALL whether you win or lose, which is not a big deal but is also wrong.
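To make the first-game behavior concrete, here is a minimal Python sketch of the quoted update line. The function name `update_provisional` and the sample numbers are illustrative assumptions, not the site's actual code; the only thing taken from the post is the averaging line itself.

```python
def update_provisional(rating, total_games, game_value):
    """Running average from the quoted line; total_games excludes the current game."""
    return (rating * total_games + game_value) / (total_games + 1)

# First game: total_games is 0, so the 1600 starting value is multiplied by zero.
first_game = update_provisional(1600, 0, 1510)

# A later game: the previous rating now carries the weight of one game.
second_game = update_provisional(1000, 1, 1100)

# Hypothetical variant: count the current game too, as suggested above.
counted_first_game = update_provisional(1600, 1, 1510)

print(first_game, second_game, counted_first_game)
```

With a count of 0 the new rating is simply the gameValue; with a count of 1 the 1600 would be averaged in as if it were a played game.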

Thanks for bringing this additional issue to my attention.

Pente Rocks!

dmitriking

Posts: 375
Registered: Dec 16, 2001
Age: 40
Re: no no no no no no
Posted: Mar 8, 2004, 7:18 PM

I am not even sure I would call this a mistake in the program. In the first game, if the player's rating "drops" from 1600 to 1510, as Jasonholl's did, is it REALLY dropping? Not really, because the starting point of 1600 is arbitrary anyway. If each person's rating was listed as ---- initially instead of 1600, then it would not be a problem at all. Internally, the figure 1600 could still be used, but externally it could show "Jasonholl's rating has gone from ---- to 1510," thereby making it appear as if his initial rating is 1510 instead of 1600 and then 1510.

As for an established person's rating NOT changing at all when playing someone with zero games, I see no problem with that--because again, the 1600 starting point is arbitrary, and really, the player's rating is infinite or undefined at that point. He may well be a 2100 player for all we know, so adjusting his opponent's rating based on the arbitrary 1600 starting point of the new player isn't really a good idea (even though it would only be counted by a factor of 1/20).

There is one possible way to do this to please almost everyone: Scrap the provisional system ONLY for the provisional player. In other words, start a new player at 1600, and from there his rating goes up and down by the normal 32 points or some fraction thereof. This way they have to earn their 2000 ratings, unlike certain phony second IDs (bugnone, catgirl004) who did not and who then adversely affect the entire ratings system. Established players have to claw their way to the top and so should provisional players.

On the flip side of this coin, the provisional system SHOULD remain intact for the established player. By that I mean the established player's rating should only be affected by a factor of n/20, with n being the number of games played by the new player.

I honestly cannot see any possible objection to this system, except from those who have chosen to abuse the ratings system for one reason or another.

If I do not accept a game invite right away, it means I will once I have fewer games in progress.

up2ng

Posts: 542
Registered: May 9, 2002
From: Northeast USA
The Ratings Solution
Posted: Mar 9, 2004, 2:34 PM

Ok, it's high time for me to comment on this issue once again. Here I will explain, very briefly, some of the drawbacks of the current ratings system (some already mentioned, others not yet mentioned), followed by my full solution to completely fixing the system.

-------------------------------

But first of all, I'd like to address some of the recent posts directly. First, the reason that jasonholl dropped from 1600 to 1510 after playing someone rated 1020 comes directly from the formula -- not because of any bug and not because 1600 is not being included in the calculation. What everyone has been missing is that the "magic number" is NOT 800 when playing another provisional player -- it's only 400! In this case, the calculation is: average the two players' ratings -- (1600 + 1020) / 2 = 1310. Now, after a victory, since his opponent was also provisional, you ONLY ADD 200! This yields 1510, a discouraging, and perhaps devastating, result for a first time player trying out the site.

Next, I encourage Dweebo to check and see at what point he increments the counter on getTotalGames() for the player and the opponent. After looking through everything, I think you are right, and it is indeed 0, but I also do not think this is an error. The point of the provisional system is that you are averaging 20 different results for gameValue and that average gives you an established rating. On your first game there is no previous gameValue calculated, so you are basically just pushing your first one directly onto the queue, without doing any averaging or modifications of any kind. However, the initial value of 1600 IS INDEED factored into the new rating and is actually very important. It does not show up as rating, but it does get factored into the calculation of the gameValue -- just like the example above with jasonholl.

The part that bothers me is that for the established player, K = 0 when playing a NEW player. I believe this fact is what has spurred many players to "help out" the new players by purposely losing their first game to the new player. This is a horrible practice and I wish there were some consequence for it, but again I do not think that this behavior is an error. The reason is that the established player's rating is adjusted based on his opponent's current rating -- and when playing a new player this current rating of 1600 is completely and totally meaningless with regards to affecting that established player's rating. It would be grossly unfair to everyone if that 1600 was somehow factored into that established player's rating adjustment. I can't think of anything mathematical that would solve that problem -- but at least that's a fairly minor problem.

Lastly, I can understand the argument for scrapping the provisional system. It seems to cause nothing but problems. But I believe that the purpose and intent of the provisional system is sound and if a good provisional system were in place it would have its advantages. Mostly, it is meant to put players much closer to their true rating much earlier on so that the effects on other players of grossly inaccurate ratings can be minimized. With that said, on to my post...

-------------------------------

By now it should be clear that we have a fairly sophisticated formula that is used to adjust ratings for established players and a positively atrocious formula used to adjust ratings for provisional players. So, most of my comments will be about the provisional system.

As already debated throughout this post, the major downfall of the current provisional system is that it is possible to gain points when you lose and lose points when you win. Despite some very passionate remarks stating that this behavior is not only normal but is a necessary part of the system -- in fact, this behavior doesn't make any sense at all. Not only that, but it has actually discouraged some new players from playing at this site which is a cardinal sin!

Now, it has been mentioned previously that in order for this to happen, the difference between the two players must exceed the "magic number" of 800. This is only partially correct! As stated previously, gameValue is ONLY increased or decreased by 400 when playing against an established player! When playing against another provisional player, gameValue is increased or decreased by just 200 points. Thus, the "magic number" when two provisional players are playing against each other is only 400!!! So, if you are provisional, you can LOSE points for a WIN against another provisional player rated more than 400 points below you. This is a catastrophe. The reason why this is so bad is that, quite frequently, three or four or five brand new players will enter the site at the same time. These new players may (and often do) gravitate towards the same table and end up playing each other. Now, if four new players (starting out at provisional 1600) start playing a few games and rotating who plays whom and continue playing for any significant period of time at all, it is HIGHLY LIKELY that this terrible behavior will occur and be witnessed by SEVERAL NEW players! This may cause ALL of these new players to quit the site completely! I would argue that this situation is an emergency and must be dealt with ASAP. NOTE: many of the examples in the previous posts that involve two hypothetical provisional players are INCORRECT, since the wrong calculation was used.
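The 400-point threshold can be checked with a short Python sketch of the current provisional rules as this post describes them (average the two ratings, then add or subtract 400 against an established opponent or 200 against a provisional one). The function names and the 5-games-played example are illustrative assumptions.

```python
def game_value(own, opp, won, opp_established):
    """Average of both ratings, shifted by 400 (established opp) or 200 (provisional opp)."""
    swing = 400 if opp_established else 200
    return (own + opp) / 2 + (swing if won else -swing)

def play_provisional_game(rating, games_played, opp, won, opp_established):
    """Average the new gameValue into the provisional player's rating."""
    gv = game_value(rating, opp, won, opp_established)
    return (rating * games_played + gv) / (games_played + 1)

# jasonholl's first game: a win over a provisional 1020.
first = play_provisional_game(1600, 0, 1020, won=True, opp_established=False)

# A provisional 1600 with 5 games played beats provisional opponents
# rated 450 and 350 points lower.
win_vs_450_lower = play_provisional_game(1600, 5, 1150, won=True, opp_established=False)
win_vs_350_lower = play_provisional_game(1600, 5, 1250, won=True, opp_established=False)

print(first, round(win_vs_450_lower, 2), round(win_vs_350_lower, 2))
```

The win against the opponent 450 points lower pulls the rating under 1600, while the win against the opponent 350 points lower raises it, which is the 400-point "magic number" in action.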

Also, about calculations in previous posts, many other calculations were also INCORRECT because the task of averaging gameValues was ignored. One particular example was the provisional 2000 against the provisional 1000. First, let's change the provisional 2000 to an established 2000, otherwise the calculations are wrong as per my last note above. Next, this is NOT either player's first game played (since on your first game played your current rating is 1600). So, when the 1000 loses, he will NOT achieve an 1100. Let's say it was only his second game played. Well then gameValue = 1100 and the average of all previous gameValues = 1000 (ie, rating = 1000, gamesPlayed = 1). The 1100 must be averaged in against all previous gameValues, whose average over 1 game = 1000, thus his new rating would be 1050, NOT 1100.

With that said, consider this DISASTROUS, albeit unlikely scenario: New player enters the site, finds another provisional player rated at 2300 after only 2 games (this is possible! Do the math yourself and you will see what is required to make this possible) (Hint, see what happens when a new player wins a set against an established 1920 player). The new player loses a game against this 2300 player -- he becomes rated 1750!!! Now he loses to another 2300 player. He moves up again to 1787.50! Now he loses again to another 2300 player. He moves up again to 1806.25! Lose again. 1818! Lose again. 1826! In fact, this player could establish himself at a record of 0 - 20 and improve his rating every game until he is established at nearly 1900!!! What if instead of playing 2300 provisionals every time he managed to find 2400 provisionals every time! (a new player pulls off two sets, 4 games, against that 1920 established player). Well, in that case, a player can go 0-20 and be established at nearly 2000!!!!!!!
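The losing streak above can be replayed with a few lines of Python. This is only a sketch of the current provisional rules as described in this post, assuming every opponent is a provisional player rated exactly 2300; the names are made up for illustration.

```python
def lose_to_provisional(rating, games_played, opp):
    """One loss to a provisional opponent under the current provisional rules."""
    gv = (rating + opp) / 2 - 200
    return (rating * games_played + gv) / (games_played + 1)

rating = 1600.0
history = []
for games_played in range(20):
    rating = lose_to_provisional(rating, games_played, 2300)
    history.append(rating)

# Rating after each of the first five losses, then after all twenty.
print([round(r, 2) for r in history[:5]], round(history[-1], 2))
```

Every loss moves the rating up, and the sequence creeps toward 1900 without ever reaching it, because a rating of 1900 is exactly where the gameValue for a loss to a 2300 equals the current rating.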

Lastly, let me point out one of the major flaws in the provisional formula that is actually solved in the established formula! The provisional formula states that gameValue is adjusted by 200 points if you are playing a provisional player and by 400 points if you are playing an established player (since established ratings are more reliable and accurate). But, if you are playing a NEW player with 0 games played OR you are playing a provisional player with 19 games played, your gameValue is STILL adjusted by exactly 200 points in both cases! Even though the player with 19 games played has a rating which is MUCH more accurate and reliable than a new player! So, just "how provisional" is that player anyway? Well, according to the provisional formula, we don't know and we don't care! Wrong!!! Just take a look at the established formula now, which gets it right. If an established player plays an opponent who is provisional, his K value is scaled by n/20, where n is the number of games that the provisional opponent has played. The "more provisional the opponent is" the more this behavior protects the established player -- as it should since the increase in the level of protection is proportional to the level of inaccuracy that the provisional rating is showing (the fewer games played, the MORE inaccurate the rating). So the question arises, should this protection be given to established players and not to provisional players? Of course not! This is an important point and I'll be coming back to this when I present the solution.

Consider the form of the two equations. What behaviors were achieved by choosing each of these forms? First, the established formula has a complex exponential term. The purpose of this portion of the equation is specifically meant to solve and ensure the behavior that, as an established player, you will ALWAYS gain points when you win and you will ALWAYS lose points when you lose. Next, notice that this exponential term always ends up yielding a fraction -- and the current rating is always directly adjusted by that fraction of the maximum possible adjustment. This computation is based on finding the "difference in skill levels of the two opponents". The "actual overall skill level" of the game is not considered since the adjustment to the rating is a relative adjustment -- the new rating is computed relative to the old rating. In other words, if a 1200 player beats a 1000 player, his adjustment is the same as if an 1800 player beats a 1600 player. The relative difference in skill level is the same, and the rating system is set up such that, in theory, a 1200 player should beat a 1000 player the same percentage of the time as an 1800 player will beat a 1600 player. Finally, notice that the maximum possible adjustment (K) is scaled down based on "how provisional" the opponent is. If the opponent is NEW or very provisional, the adjustment to the established player's rating is rightly minimized since his opponent's rating is so unreliable (more chance of being highly inaccurate).
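The properties listed in this paragraph can be illustrated with a small Python sketch. It assumes the established formula has the usual Elo shape (K of 32, an expected score based on 10^(difference/400)) with K scaled by n/20 for a provisional opponent, as described in this thread; the site's real code is not shown here, so treat the function names and details as assumptions.

```python
def expected_score(own, opp):
    """Assumed Elo-style expected score for the player rated `own`."""
    return 1 / (1 + 10 ** ((opp - own) / 400))

def established_update(own, opp, won, opp_games_played, k=32):
    """Assumed established update: K scaled by n/20 for a provisional opponent."""
    scale = min(opp_games_played, 20) / 20
    result = 1.0 if won else 0.0
    return own + scale * k * (result - expected_score(own, opp))

# Same 200-point gap at two different skill levels gives the same adjustment.
gain_low = established_update(1200, 1000, True, 20) - 1200
gain_high = established_update(1800, 1600, True, 20) - 1800

# Equal ratings: half of K. Brand-new opponent (0 games): no change at all.
gain_even = established_update(1700, 1700, True, 20) - 1700
gain_vs_new = established_update(1700, 1600, True, 0) - 1700

print(round(gain_low, 2), round(gain_high, 2), gain_even, gain_vs_new)
```

Because the result term minus the expected score is always positive after a win and negative after a loss, this form can never take points away for a win.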

How does this differ from the provisional formula? First, notice that computing the gameValue is based primarily on considering the "actual overall skill level" of the game instead of the "difference in skill levels of the two opponents". For example, it does not matter at all whether a 1500 player plays a 1500 player or an 1800 player plays a 1200 player -- the basis for the gameValue is the average between the two -- or the overall skill level of the game being played -- which is 1500 in both cases. This means that the approach here is to determine an absolute number for the provisional player's rating based on his performance in a certain skill level match. Specifically, the idea is that there is NO starting rating to be able to compute relative computations from, so we instead find many separate absolute rating approximations (20 in all) and average them together to form an established rating. (In practice, these absolute rating snapshots are averaged into the player's rating as he goes along, but this yields the same result.) So, what is actually happening here is like throwing 20 separate darts at a dart board, which give you data all over the map -- but if you take the avgX, avgY of all these points, you can find the center of mass and claim to have found the exact spot which you were aiming for. But what appears to happen in practice is more of a feedback system, like trying to find the moon in a powerful telescope with coarse adjustment knobs. As you hunt around you see the moon zip through your field of view and out the other side. So you stop, adjust, and head back the other way and, zip, out it goes the other way. So you continue making smaller and smaller adjustments in a wave-like pattern until you have finally "zeroed-in" on the target. So, as you become "less and less provisional", the adjustments to your rating become less and less wild, reflecting the fact that you are becoming "less and less provisional".

So it appears that these are two vastly different types of formulas meant to solve vastly different behaviors. But are the behaviors really so different??? Why can't I "zero-in" on a target using "relative" adjustments instead of averaging absolute calculations? Let's say I'm searching for the moon. I have to start somewhere, right? So what if the starting point is arbitrary! Just start making adjustments! If you make enough approximate adjustments, it doesn't matter where you start, you will eventually end up where you intend to end up. Just look at golf! You gotta end up putting a little ball into a tiny cup! First you make VERY LARGE adjustments, hitting the ball 300 yards. Eventually you make small putts until you reach your target. The point is, the same desired effect can be achieved by making relative adjustments from an arbitrary starting point. Some of you probably already know where I'm going with this so I will delay no more.


RATING SYSTEM SOLUTION:


1) Change the system so that ALL rated games are played in SETS. If a set results in a tie, neither player's rating should change. THIS IS SO IMPORTANT THAT IF THIS IS NOT DONE IT WILL DESTROY COMPETITIVE PENTE FOREVER -- END OF STORY. But this point has already been beaten to death previously...

2) Set a CAP on the maximum rating allowed as a provisional player. I suggest that this CAP be NO HIGHER than 1850, and it's possible that it should be even lower. This is to prevent ratings fraud that has become all too prevalent. THIS MIGHT BE THE MOST IMPORTANT POINT OF THIS POST, SO DON'T MISS IT!

3) Change the provisional formula to the following:

newRating = oldRating + p/20 * (11/(0.5 * (gamesPlayed + 2))) * 32 * (w - (1 / (1 + (10^((opponentOldRating - oldRating)/400)))))

This is the same as the established formula where p = the number of provisional games your opponent has played (so if he is an established player, p = 20) -- this adjustment factors in "how provisional your opponent is", just like in the established formula. In addition, K, set to 32, is also modified by "how provisional YOU are". This gives the same behavior as in the current provisional system where the rating adjusts wildly in the first couple of games and adjusts progressively less as the rating "zeroes-in" on the true rating. In the first game, K = 32 * 11 = 352 (slightly less than the 400 used currently), and by the last provisional game, K = 32 * 11 / 10.5 = 33.52. Keep in mind, K (adjusted) represents the MAX adjustment in a given game. Just like in the established formula, the rating will be adjusted by only half of K if both players have the same rating, and substantially less if you are the favorite. As far as the CAP goes, the actual rating should be kept track of behind the scenes instead of simply truncated -- such that if the player loses a game, but only loses a few points such that his rating would still be above the cap, then he will not lose points below the cap at this stage. However, once the player becomes established, if he has a rating at that point which is still above the CAP, this rating gets truncated down to the CAP, and established rating adjustments will begin, starting with that CAP value as the rating.
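For anyone who wants to try the numbers, here is a minimal Python sketch of the proposed provisional formula exactly as written above. The function and variable names are illustrative, the established opponent is represented as having 20 games played, and the display cap of 1850 is only the upper bound suggested in item 2, so all of that is assumption rather than site code.

```python
CAP = 1850  # suggested upper bound for a displayed provisional rating (assumption)

def adjusted_k(games_played):
    """K of 32 scaled by how provisional YOU are."""
    return 32 * 11 / (0.5 * (games_played + 2))

def proposed_update(old, opp, won, games_played, opp_games_played):
    """Proposed provisional update; p is the opponent's game count, capped at 20."""
    p = min(opp_games_played, 20)
    expected = 1 / (1 + 10 ** ((opp - old) / 400))
    w = 1.0 if won else 0.0
    return old + (p / 20) * adjusted_k(games_played) * (w - expected)

def displayed_rating(actual):
    """The real value is kept behind the scenes; only the display is capped."""
    return min(actual, CAP)

first_k = adjusted_k(0)    # first provisional game
last_k = adjusted_k(19)    # twentieth provisional game
after_win = proposed_update(1600, 1450, True, 0, 20)

print(first_k, round(last_k, 2), round(after_win, 2), displayed_rating(1912.0))
```

The adjusted K shrinks from 352 in the first game to about 33.52 in the last provisional game, which is the "zeroing-in" behavior described above.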

Now, let's see how the behavior of this system stacks up against the current provisional system with an example. Suppose a new player joins DSG and plays 5 games. They are all sitting there at the same table at the same time and our new player gets his pick of opponents each time. Let's look at two possibilities for the ORDER which he plays his games:

Order A:
--------
Defeats Established 1450
Loses to Provisional (3 games played) 1770
Defeats Provisional (15 games played) 1680
Loses to Established 1510
Loses to Provisional (10 games played) 1650

Order B:
--------
Loses to Established 1510
Loses to Provisional (10 games played) 1650
Defeats Established 1450
Loses to Provisional (3 games played) 1770
Defeats Provisional (15 games played) 1680


Now I will show the results of playing these same opponents in order A and in order B under the current provisional system. Next I will show the results of order A and order B under the new system, but simply setting P = 10 for all provisional opponents (this should simulate the current system's simple behavior of adjusting results against established opponents by double the amount of ANY provisional opponent). Finally, I will show the results of order A and order B under my new proposed system in full.


Current A:
----------
1600
1925
1786.25
1835.21
1694.56
1650.10


Current B:
----------
1600
1155
1178.75 (*** NOTE: Rating increased after a loss! ***)
1357.29
1358.88 (*** NOTE: Rating increased after a loss! ***)
1430.99


New (P = 10) A:
---------------
1600
1704.41
1656.69
1703.64
1597.62
1572.68


New (P = 10) B:
---------------
1600
1379.40
1358.99
1469.53
1458.93
1504.76


New A:
------
1600
1704.41
1690.09
1754.17
1641.10
1612.52


New B:
------
1600
1379.40
1358.99
1469.53
1466.35
1534.44
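The six tables above can be recomputed with the following Python sketch. It encodes the current provisional rules and the proposed formula as described in this post; established opponents are represented as having 20 games played, and all names are illustrative assumptions. Results may differ from the tables in the last decimal place because of rounding.

```python
# (opponent rating, opponent games played, did the new player win)
ORDER_A = [
    (1450, 20, True),
    (1770, 3, False),
    (1680, 15, True),
    (1510, 20, False),
    (1650, 10, False),
]
ORDER_B = [ORDER_A[3], ORDER_A[4], ORDER_A[0], ORDER_A[1], ORDER_A[2]]

def current_system(games):
    """Current provisional rules: average the ratings, +/-400 or +/-200, then average in."""
    rating, out = 1600.0, [1600.0]
    for n, (opp, opp_games, won) in enumerate(games):
        swing = 400 if opp_games >= 20 else 200
        gv = (rating + opp) / 2 + (swing if won else -swing)
        rating = (rating * n + gv) / (n + 1)
        out.append(rating)
    return out

def new_system(games, fixed_p=None):
    """Proposed formula; fixed_p forces one p value for every provisional opponent."""
    rating, out = 1600.0, [1600.0]
    for n, (opp, opp_games, won) in enumerate(games):
        if opp_games >= 20:
            p = 20
        else:
            p = opp_games if fixed_p is None else fixed_p
        k = 32 * 11 / (0.5 * (n + 2))
        expected = 1 / (1 + 10 ** ((opp - rating) / 400))
        rating += (p / 20) * k * ((1.0 if won else 0.0) - expected)
        out.append(rating)
    return out

for label, games in (("A", ORDER_A), ("B", ORDER_B)):
    print("Current", label, [round(r, 2) for r in current_system(games)])
    print("New (P = 10)", label, [round(r, 2) for r in new_system(games, fixed_p=10)])
    print("New", label, [round(r, 2) for r in new_system(games)])
```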


Hopefully these results show that this new system provides the same basic behavior of "zeroing-in" on a true rating based on the results of provisional games played, just as the old system was meant to do. In addition, its accuracy is improved. Under the new system, it is impossible to lose points when you win or gain points when you lose. The new system reduces the importance of the first game played and reduces the wild swings generated from this and other provisional results while still being flexible enough to allow for greater adjustments than an established player gets. The new system makes distinctions and "protections" based on "how provisional the opponent is", as well as reducing the amount of fluctuations based on "how provisional YOU are". From a behavior standpoint, I can see no flaws in this new system, but I see many many benefits.

I hope that all three of these items are strongly considered at some point in the future. Feel free to offer feedback.

Always,
up2ng

watsu

Posts: 1,487
Registered: Dec 16, 2001
Home page
Re: DSG Ratings -- Fact and Fiction
Posted: May 26, 2004, 1:17 PM

About whether or not there is a limit to how high a ranking one can get here: it has been a long time since I studied infinite series and I no longer have my textbook around, so I'm not going to attempt to determine whether the series is divergent or not, but... if someone were to get a 3200 spread between themselves and their opponent they would need to win approximately 3,000,000 games to gain a point. If they had a spread of 6400 it would take around 300,000,000,000,000 wins to gain a point and with a spread of 12800 it would take around 3x10^30 wins to gain a point. Apologies if my numbers are off by a factor of ten or so. I think speaking practically about physical capabilities we can say that it is highly unlikely that any mortal human will ever have an earned rating of 5,000 or higher (or even 3,500 for that matter). In fact, I suspect that there is a much lower ceiling for pente as it is currently rated (lower even than that of chess, because ratings are not calculated by sets). Garry Kasparov is currently rated at 2817 FIDE, which I think would be about 2917 or so USCF (I'm not sure because the comparison I saw was for different figures than the current numbers).
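These orders of magnitude can be checked with a short Python sketch. It assumes an Elo-style gain per win of K / (1 + 10^(spread/400)) with K = 32 for the higher-rated player, which matches the form discussed in this thread but is not taken from the site's code; the function names are illustrative.

```python
def gain_per_win(spread, k=32):
    """Points gained by the higher-rated player for one win at the given rating spread."""
    return k / (1 + 10 ** (spread / 400))

def wins_per_point(spread):
    """How many consecutive wins at that spread add up to one rating point."""
    return 1 / gain_per_win(spread)

for spread in (3200, 6400, 12800):
    print(spread, f"{wins_per_point(spread):.2e}")
```

Under that assumption the counts come out a little above 3 million, 3x10^14 and 3x10^30 wins per point, in line with the figures quoted.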


Message was edited by: watsu at May 26, 2004 7:27 AM


Message was edited by: watsu at May 26, 2004 7:44 AM


Message was edited by: watsu at May 26, 2004 7:46 AM


Message was edited by: watsu at May 26, 2004 7:49 AM


Message was edited by: watsu at May 26, 2004 7:58 AM


Message was edited by: watsu at May 26, 2004 8:42 AM


Retired from TB Pente, but still playing live games & exploring variants like D, poof and boat

up2ng

Posts: 542
Registered: May 9, 2002
From: Northeast USA
Re: DSG Ratings -- Fact and Fiction
Posted: May 26, 2004, 5:49 PM

Yes, this is what I was getting at with my previous posts. Based strictly on the formula, I came to the conclusion that there is no mathematical limit on ratings currently. However, as you said, there are some additional constraints. First, you can eventually get to the point where the fraction of a point gained is so small that the computer rounds this number to 0 and you really do get no gain from your win. Second, a human can only physically play so many games in a lifetime. Third, no one in the real world is superhuman enough to win EVERY game, and when you lose you take a HUGE hit at the top of the spectrum. This is especially the case if you play games as Player 2, in which case whether you win or lose is out of your control when the game first begins.

Let me repeat a few figures from my previous post. Rspread is taken to mean the difference between your rating and your opponent's rating (assume your rating is higher for this discussion). DeltaR is the amount you will gain when you win.

Rspread   DeltaR
500       1.70
1000      0.10
1500      0.0057
2000      0.00032
Now, suppose god decides to play pente from an early age, as often as practical, always plays as Player 1, and never loses. He always plays the highest caliber opponents; assume opponents are rated 2000. He is able to play about 100000 games during his lifetime.

He will very quickly attain a rating above 2000. Relatively quickly, he will reach a rating of 2500. From here, he will earn about 1.7 points per victory or less. Over the course of another 500 games or so he can get up into the high 2000s approaching 3000. Upon reaching 3000, he will only be earning 0.10 points per victory. Another 1000 victories in a row will yield well under 100 more points. I believe, within his lifetime it is possible that he could reach 3500, but by this time he is earning a minuscule 0.0057 points per victory. This means that it takes roughly 175 victories to earn a single point -- thousands of victories to earn 10 points, and before long, his lifetime quota of 100000 games played has expired. The limit, with these constraints, seems to be between 3500 and 4000 points.
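This thought experiment is easy to simulate. The Python sketch below assumes the Elo-style established formula with K = 32, a player who starts at 2000 as an already established player, wins every game, and always faces opponents rated exactly 2000; it ignores the provisional phase and any rounding the site may apply. All names are illustrative.

```python
def gain_on_win(own, opp, k=32):
    """Points gained for a win under the assumed Elo-style established formula."""
    return k * (1 - 1 / (1 + 10 ** ((opp - own) / 400)))

rating = 2000.0
milestones = {}  # rating mark -> number of games needed to reach it
for game in range(1, 100_001):
    rating += gain_on_win(rating, 2000)
    for mark in (2500, 3000, 3500):
        if rating >= mark and mark not in milestones:
            milestones[mark] = game

# Per-win gain at the spreads listed in the table above.
table = {spread: gain_on_win(2000 + spread, 2000) for spread in (500, 1000, 1500, 2000)}

print(round(rating), milestones)
print({spread: f"{gain:.5f}" for spread, gain in table.items()})
```

Under these assumptions the final rating after 100000 straight wins lands between 3500 and 4000, which agrees with the estimate above.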

Remember that as soon as you consider a player always playing sets, under the CURRENT rating system, the limit is substantially less. It is probably impossible, for all practical purposes, for an established player to maintain a rating above 2500, since our ratings calculations here at DSG DO NOT ACCOUNT FOR PLAYING PENTE IN SETS. Once DSG is changed so that ratings are adjusted based on the result of a set, you will see players' ratings spread out considerably -- and more importantly, top players will start playing rated games again.

zoeyk

Posts: 2,241
Registered: Mar 4, 2007
From: San Francisco
Age: 45
Home page
Re: DSG Ratings -- Fact and Fiction
Posted: Jun 19, 2008, 5:17 AM

ahhh yes here it is......the ratings thread

Scire hostis animum - Intelligere ludum - Nosce te ipsum - Prima moventur conciliat - Nolite errare