Posts:
542
Registered:
May 9, 2002
From:
Northeast USA
DSG Ratings -- Fact and Fiction
Posted:
Jan 31, 2004, 10:13 AM
Greetings Pente Players!
Today I was in the mood to do a little math, don't ask me why. I was also in the mood to discuss, once again, pente ratings in general and the rating system used here at DSG. So, hopefully, this post will combine these whims into a useful discussion which will point out some facts about DSG ratings and dispel some myths as well.
First, some basics. Ratings are used at DSG to track the relative skill levels of pente players. This is useful and important for many reasons. I encourage everyone to search for and read my previous post titled 'Should Pente "sets" be the norm?', which discusses exactly what ratings are, what the advantages of having a rating system are, why they are important and what they are used for. In addition to the uses I listed in that post, it was also pointed out that ratings are often important in determining seedings for big tournaments. You should also know that ratings are adjusted after every "rated" game you play, and are adjusted the same way whether you played the game as white or as black. You can also choose to play "unrated" games at any time, and such games do not affect your rating. See the aforementioned post for a very detailed explanation of why ratings should be adjusted based on the results of playing "sets of games" -- i.e., playing the same opponent one game as white and another game as black.
Now, when new or even not so new players begin having a discussion about ratings, probably the most often asked question is: Well, how are the ratings calculated anyway? Where can I find the formula? The formula is indeed posted, but it is not necessarily easy to find unless you search around for it a little bit. Here is the location:
From the main page --> (Documentation) FAQs --> General FAQ --> How are the Ratings Calculated?
There are in fact two separate formulas, and I'll list them both here. First, understand that there is a difference between "provisional" vs. "established" players for the purposes of adjusting your ratings. Provisional players are those players who have not yet played at least 20 rated games at DSG. This is a period of "zeroing in" on your "true rating". It is recommended, but not required, that you play rated games against multiple opponents and multiple skill levels during this provisional period and that you alternate (as always) between playing as player 1 and player 2. During this provisional period, your rating is adjusted using a different formula than that for established players, one that allows for larger variations to more quickly help focus in on your true rating. When an established player plays against a provisional player, his own rating adjustment formula gets modified to allow for less variation to compensate for the fact that the provisional player's rating is most likely inaccurate.
Ratings formula for provisional players:
In the FAQ page for DSG it only mentions that every game the provisional player plays has a value:
value = (R1 + R2) / 2 + w * 200 + e * 200
Unfortunately it is never explained what happens to this "value" and it does not define e. R1 is your (provisional player's) rating, R2 is your opponent's rating, and w = 0 for a loss and w = 1 for a win. I will take a guess here -- WARNING, these are assumptions and may not be correct: I will assume that e = 0 for a win and e = -1 for a loss. The resulting value is "weighted" or averaged in with all previous such "values". Thus, if you are playing your 10th game, your previous 9 values are stored someplace, and when your 10th game is finished, the "value" is computed for game 10, then it is averaged with the other 9 values and your new rating is calculated and displayed. Note that under these assumptions the value attained for a given game is simply the average of the two players' ratings + 200 if you win, or the average of the two players' ratings - 200 if you lose. This value makes your rating fluctuate the most after your first game, since this value IS your rating (averaged against no other values), and every game played thereafter makes your rating fluctuate less wildly since the value counts as only a fraction of your new rating, depending on how many games you have played. Before playing any games, your rating is 1600. (Interestingly, if you play your first game against an opponent rated over 2000, your rating will go UP even if you lose, and if you play your first game against an opponent rated under 1200, your rating will go DOWN even if you win -- thus, as will be discussed later, even a provisional player should look to play games against opponents who are at roughly their same rating.)
Ratings formula for established players:
R1new = R1 + K * (w - (1 / (1 + 10^((R2 - R1)/400))))
This looks kind of nasty written this way, but it really isn't that bad and will be used throughout the rest of this discussion in presenting some ratings facts and dispelling some ratings myths. R1new is your new rating after playing your rated game. R1 is your rating at the start of the game, R2 is your opponent's rating at the start of the game, w = 1 if you win, w = 0 if you lose, and K is a constant, chosen by Dweebo to be 32. When playing against a provisional player, K is multiplied by n/20, where n is an integer between 0 and 20, equal to the number of games the provisional player has played thus far.
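For anyone who would rather read code than nested parentheses, here is a minimal Python sketch of the established-player formula as described above. The function and parameter names (expected_score, established_update, opponent_games) are made up for illustration and are not taken from the DSG code; the only values assumed are K = 32 and the n/20 scaling against provisional opponents.

```python
K = 32  # constant chosen by Dweebo, per the post


def expected_score(r1, r2):
    """The fraction subtracted from w: 1 / (1 + 10^((R2 - R1) / 400))."""
    return 1.0 / (1.0 + 10.0 ** ((r2 - r1) / 400.0))


def established_update(r1, r2, won, opponent_games=20):
    """Return R1new for an established player.

    opponent_games (hypothetical name) is the number of rated games the
    opponent has played; below 20 the opponent is provisional and K is
    scaled by n/20.
    """
    n = min(max(opponent_games, 0), 20)
    k = K * n / 20.0
    w = 1.0 if won else 0.0
    return r1 + k * (w - expected_score(r1, r2))


print(established_update(1600, 1600, won=True))   # 1616.0
print(established_update(1600, 1600, won=False))  # 1584.0
```

Two equally rated established players move 16 points each way, exactly half of K.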
RATINGS FACTS:
Ok, let's jump into the math. Notice that a LARGE factor in how your rating changes is the exponent, (R2 - R1)/400. This means that your rating changes depending on how much difference there is between your rating and your opponent's rating, and whether your rating is higher or your opponent's is higher. I will often refer to R1new - R1 as deltaR (change in R), and refer to R2 - R1 as Rspread (the spread, or difference between the two opponents' ratings). Notice that the large term that gets multiplied by K is at least -1 and at most 1. Working backwards, this is because the term that is subtracted from w is between 0 and 1. The denominator is between 1 and infinity -- the exponential part is between 0 and infinity.
MYTH BUSTER #1:
Amazingly, there is a widespread rumor that if you are rated very much higher than your opponent, that even if you win your rating might go down! This is simply UNTRUE AND IMPOSSIBLE! Let's just look at the formula and follow the logic just presented above. As Rspread approaches negative infinity (you are WAY higher rated than your opponent), the exponent also approaches negative infinity and the exponential term approaches 0 -- thus the denominator is so close to 1 that the fraction is still less than, but approaches 1 -- but it NEVER becomes more than 1! So, when you win, there will always be some positive change to your rating.
MORE FACTS:
If you look at it the other way, you are rated WAY lower than your opponent, then Rspread approaches infinity, the exponential term approaches infinity and the fraction is still greater than, but approaches 0. So, when you win, you will NEVER win MORE than 32 points if you are an established player. Likewise, if you lose, you will always lose points, but NEVER more than 32 points.
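The limits argued above can be spot-checked numerically. This is only a sketch over a handful of Rspread values that I picked arbitrarily, with K = 32 and a hypothetical helper name; it illustrates the bounds but does not prove them.

```python
def win_and_loss_delta(rspread, k=32.0):
    """Return (deltaR if you win, deltaR if you lose) for a given Rspread = R2 - R1."""
    fraction = 1.0 / (1.0 + 10.0 ** (rspread / 400.0))
    return k * (1.0 - fraction), k * (0.0 - fraction)


for rspread in (-1200, -800, -400, 0, 400, 800, 1200):
    gain, loss = win_and_loss_delta(rspread)
    # A win should always show a positive change, a loss a negative one,
    # and neither should reach 32 points in size.
    print(f"Rspread {rspread:+5d}: win {gain:+8.4f}  loss {loss:+8.4f}")
```

Every row should show a win gain strictly between 0 and 32 and a loss strictly between -32 and 0.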
MYTH BUSTER #2:
Another popular contention is that playing against provisional players will hurt your rating more than just playing established players. THIS IS FALSE! In fact, notice that if you are playing against someone who's playing their first game, YOUR RATING WILL NOT CHANGE AT ALL! That's right, if the player hasn't played a match yet, then n = 0, and so K is multiplied by 0, which equals 0 -- and if the K term is 0, then the whole expression is 0 since K is multiplied by the rest of the expression, so your deltaR (change in rating) is 0! Now, we understand that the provisional player may not have an accurate rating -- it's possible that after only a couple of games, a very strong (1800+ caliber) player might have a provisional rating of 1000. This is what folks refer to by saying that it will be more harmful to play provisional players. But take heart! Although a 2000 player losing to a 1000 player normally loses very close to the maximum 32 points, you will lose FAR less by playing a similarly rated provisional player! Suppose this strong player somehow managed to lose his first game against a player rated only 800 and thus was bumped right down to 1000. Well, if you play this player and lose because he was strong after all (ie, the provisional rating was grossly inaccurate), then you (the 2000 player) WILL NOT lose nearly 32 points! Not even close! Since K is "scaled" down by n/20, and n was only 1, the MOST you will lose is 32/20 = 1.6 points! Surely this is not a great risk as advertised. Furthermore, as K is scaled down "less", the provisional rating MORE THAN compensates itself by becoming more accurate faster than the "protection" granted to the established player is lessened. This is the whole reason for such protection, and it's more than adequate enough to make this myth just that, a myth.
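To put numbers on this protection, here is a small sketch of what a 2000-rated established player would lose to a 1000-rated opponent who has played n rated games. It assumes the n/20 scaling of K exactly as described above; the function name is hypothetical.

```python
def established_delta(r1, r2, won, opponent_games):
    """Rating change for an established player; K = 32 is scaled by n/20
    when the opponent has played fewer than 20 rated games."""
    k = 32.0 * min(opponent_games, 20) / 20.0
    fraction = 1.0 / (1.0 + 10.0 ** ((r2 - r1) / 400.0))
    w = 1.0 if won else 0.0
    return k * (w - fraction)


# A 2000 player loses to a "1000" player with n games played so far.
for n in (0, 1, 5, 10, 20):
    print(n, round(established_delta(2000, 1000, False, n), 2))
```

With n = 0 the change is zero, with n = 1 it stays under 1.6 points, and only at n = 20 does it reach the nearly full 32-point loss.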
FACT:
IT IS MORE ADVANTAGEOUS TO PLAY ONE GAME AS PLAYER 2 FOLLOWED IMMEDIATELY BY PLAYING THE SAME OPPONENT AS PLAYER 1 THAN PLAYING AS PLAYER 1 FIRST FOLLOWED BY PLAYER 2.
This is an example of a massive flaw in the current ratings system that is indisputable fact and can be proven mathematically. This is discussed at length in my other post, 'Should pente "sets" be the norm?' along with a mathematical proof which I will not repeat here -- I refer you to the other post. The basic idea though is that Player 1 has an inherent advantage in pente. Regardless of your rating or your opponent's rating, you have more of a chance of beating that opponent as player 1 than you do as player 2. Hence the term "Player 1 advantage". Well, when you combine this fact with the fact that ratings are adjusted after every game instead of after every set, you end up with the following situation. You should reasonably expect that, due to the player 1 advantage, you will win your game playing as player 1 and lose your game playing as player 2. In fact, it can be argued that this expectation carries MORE WEIGHT than the players' difference in skill as depicted by the ratings, and therefore more weight than the ratio of points to be gained or lost would justify. Purely mathematically speaking, the whole concept of ratings SHOULD exactly even out these forces if everyone always played in sets. But ratings are adjusted based on games, not sets. This leads us to the question: if I can reasonably expect to win my game as player 1 and lose my game as player 2, does it make any difference if I play as player 1 first or as player 2 first?
The surprising answer: ABSOLUTELY IT DOES! As a very simple example, suppose two established players are rated at exactly 1600. If you win your first game, you are adjusted to 1616 and your opponent is adjusted to 1584. (Notice that we considered previously what happens to the formula as Rspread approaches negative infinity and positive infinity; in this case Rspread is exactly 0. The exponent is 0 and the exponential term becomes 1, thus the fraction is 1/2, and either 1/2 or -1/2 is multiplied by 32 depending on whether you win or lose, so your rating will change by 16, exactly half of the maximum. Furthermore, two additional ratings facts follow: in any one game the winner gains exactly what the loser loses, and for any given Rspread the amount you would gain by winning added to the amount you would lose by losing ALWAYS adds up to 32.) Now you play again and you lose. However, your ratings this time are NO LONGER EQUAL, THEY HAVE ALREADY BEEN ADJUSTED! So, in this case you do not lose 16 points, you lose 17.47 points! You have tied the match against a player rated EXACTLY the same as you, yet your rating has gone DOWN to 1598.53 and your opponent's rating has gone UP to 1601.47. Conversely, if you simply sat in the player 2 seat first, YOUR rating would now be higher and your opponent's rating would be lower.
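The split-set example can be replayed in a few lines. This sketch assumes two established players and K = 32; the helper name play is hypothetical. It relies on the fact that, under the established formula, the winner gains exactly what the loser loses.

```python
def play(r_winner, r_loser, k=32.0):
    """Apply the established formula to one game; returns (new winner rating, new loser rating)."""
    expected_win = 1.0 / (1.0 + 10.0 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected_win)
    return r_winner + delta, r_loser - delta


you, opp = 1600.0, 1600.0
you, opp = play(you, opp)   # game 1: you win as player 1
opp, you = play(opp, you)   # game 2: your opponent wins as player 1
print(round(you, 2), round(opp, 2))  # 1598.53 1601.47
```

Swapping the order of the two games swaps the final ratings, which is the whole complaint.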
THIS IS FACT AND IT IS A CATASTROPHE. To fix this problem, I have already proposed that ratings should be adjusted only when one player wins a SET. If the players split the set, they should TIE, and the ratings be left alone -- like it never even happened. Match? What match? I don't remember any match! Now, there will be naysayers that claim that if a lower rated player "manages" to TIE the higher rated player, he should be rewarded. Well, if the game were completely neutral, this might make some sense. But pente has a player 1 advantage, there is no getting around it. So, ties should be expected, even when the relative skill levels are not exactly equal. If you don't agree with this, fine, just always remember to give me the 2 seat first.
MYTH BUSTER #3:
Well, I've saved the toughest for last. There is a VERY pervasive rumor that there is some hard and fast LIMIT of what is the maximum rating that can possibly be achieved at DSG. Of course, no one can point to exactly what this limit actually is. That is because this is FALSE. There is no maximum limit on ratings at DSG.
To attempt to prove this fact, I will have to use a bit of "fuzzy math". First, let's make a simplifying assumption. Suppose that there were only 2 people in the world that have ever played at DSG. There was no provisional status and they both started rated 1600 and only ever played each other. Also, they were so amazingly fast that they could play millions of games. Now suppose that one player was an amazing pente GOD, and the other player was an atrocious braindead monkey. Needless to say, the god wins every match for eternity and the monkey keeps coming back for more punishment. Hey, there has recently been a player at DSG to win nearly 60 games in a row -- what's to say the pente god couldn't have a streak of 100, or 1000, or 1,000,000? Good, I'm glad you concur.
Now, for the fuzzy math: In math, there's such a thing as a series, where you add up the terms of an infinite sequence generated by some function. For example, the geometric series is the sum of the terms r^n as n increases from 0 to infinity, and its sum can "sometimes" be determined. When it can be determined, the sum "converges" to some value, always approaching but never quite reaching that value. The geometric series converges only when the absolute value of r < 1, and in that case the sum of the infinite number of terms is 1/(1-r). In all other cases the sum "diverges", meaning that no finite value can be determined; for r >= 1, the sum simply keeps growing without bound as you keep adding terms.
Now consider the following -- the sum of all terms 1/n as n increases from 1 to infinity. In other words: 1 + 1/2 + 1/3 + 1/4 + ...
Now, this is important: You can sit here and add these terms for quite a while and not reach a very large number. But, it is mathematically proven to DIVERGE. Meaning, if you add ENOUGH terms, this sum will approach infinity.
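To get a feel for how slowly this sum grows, here is a short sketch that adds up the first n terms of 1 + 1/2 + 1/3 + ... and prints ln(n) next to it for comparison. The cutoffs are arbitrary choices of mine.

```python
import math


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / i for i in range(1, n + 1))


for n in (10, 1_000, 1_000_000):
    # The partial sum keeps growing, but only about as fast as ln(n).
    print(n, round(harmonic(n), 4), round(math.log(n), 4))
```

Even a million terms add up to less than 15, yet the sum never stops growing.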
Back to pente and the 1600 god versus the 1600 monkey. We can simplify (and label) our above formula for a win (w = 1, K = 32) to say:
(1) deltaR = 32 * (1 - (1 / (1 + 10^(Rspread/400))))
Ok, let's analyze. Examine the deltaR column. This is how deltaR changes as you become a larger and larger "favorite" to win according to the ratings of you and your opponent. Now, in the case of god vs the monkey, we can further see that the way you calculate pente god's rating after n games is by adding the n deltaR values that were generated in each of the n games, and add that sum to the original rating of 1600. So, we can hypothesize here that if the sum of all deltaR values generated, as the number of games played approaches infinity, has a finite value or approaches but never exceeds some specific finite value, then there is indeed a mathematical limit as to how high your DSG rating can climb. In this case, the sum in question would be said to "converge" upon a limit. If instead, it can be shown that the sum of all deltaR values generated, as the number of games played approaches infinity, has a value approaching infinity, then this sum "diverges" and there would be no limit to how high your DSG rating can climb.
It is my fuzzy conclusion here that the terms in the top sum (the deltaR values) are decreasing at a SLOWER RATE than the terms in the bottom sum (1 + 1/2 + 1/3 + ...). And if the terms of your sum shrink more slowly than the terms of a DIVERGING sum, then you may conclude that your sum diverges as well. Thus, there is no limit to the rating which can be attained at DSG.
Now, I'll admit here that I don't know these math concepts as well as I could, so there is probably a better way to prove or disprove this theory. For example, the way that our bottom series is known to diverge is by comparison with the integral from 1 to infinity of 1/x dx = ln(x) evaluated from 1 to infinity = infinity. Now, using our formula (1) above, we could attempt something like: integral from 1 to infinity of 32 * (1 - (1 / (1 + 10^(Rspread/400)))) dRspread. However, remember that both deltaR and Rspread include R1 as part of their makeup: deltaR = R1new - R1; Rspread = R2 - R1. I'm not sure if this matters or not, and even if it didn't I'm not sure how to solve the above integral, so until then, this conclusion of mine will have to remain "fuzzy". Additionally, one could write a computer program to crunch through a couple hundred iterations to find more deltaR terms and compare them against the corresponding terms of the known diverging series, and see if they start "converging faster than" the diverging series, in which case I suppose that the conclusion would change from "diverging" to "inconclusive".
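The computer program suggested here is short enough to sketch. This version makes the same simplifying assumptions as the thought experiment: two established players, both starting at 1600, K = 32, the god wins every game, and ratings are kept as unrounded floating point numbers. The function name and the choice of 100,000 games are mine.

```python
def simulate(games, k=32.0, start=1600.0):
    """God beats monkey every game; returns the god's final rating and every deltaR."""
    god, monkey = start, start
    deltas = []
    for _ in range(games):
        expected = 1.0 / (1.0 + 10.0 ** ((monkey - god) / 400.0))
        delta = k * (1.0 - expected)
        god += delta      # the winner gains delta ...
        monkey -= delta   # ... and the loser drops by the same amount
        deltas.append(delta)
    return god, deltas


god, deltas = simulate(100_000)
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    # Columns: game number, deltaR for that game, n * deltaR
    print(n, round(deltas[n - 1], 5), round(n * deltas[n - 1], 2))
print("god's rating:", round(god, 2))
```

The column to watch is the last one. If n * deltaR levels off at some positive constant instead of shrinking toward zero, then deltaR behaves like a constant multiple of 1/n, which is the same family as the diverging series. A run like this is evidence, not a proof.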
A final point on this. While there may be no mathematical limit on how high your rating can climb, there may be some artificial limit that gets imposed when the increases gained by winning become so small that they get rounded by dweebo or by the computer to 0.
FINAL FACT:
Let's end this book with some useful facts. Below is a chart of various Rspread values along with the corresponding deltaR values that result from winning the game. To obtain deltaR values for positive Rspread values (if you are the underdog), simply compute (32-deltaR).
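A chart like this can be regenerated from the established formula. The sketch below assumes K = 32 and uses Rspread steps of 100 from 0 down to -800; those steps are my own choice and may not match the rows of the original chart.

```python
def win_delta(rspread, k=32.0):
    """deltaR for a win, where Rspread = R2 - R1 (negative when you are the favorite)."""
    return k * (1.0 - 1.0 / (1.0 + 10.0 ** (rspread / 400.0)))


print(f"{'Rspread':>8} {'deltaR (win)':>13}")
for rspread in range(0, -801, -100):
    print(f"{rspread:>8} {win_delta(rspread):>13.2f}")
```

For the underdog case, win_delta(400) comes out equal to 32 - win_delta(-400), matching the (32-deltaR) shortcut.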
Re: DSG Ratings -- Fact and Fiction
Posted:
Jan 31, 2004, 5:59 PM
please note-- with one hand using caps is too inconvenient, hence, no caps in this post.
dean, your math and figures all seem accurate to me, and I agree wholeheartedly with everything you said.
I want to add a few final thoughts on ratings.
1. the ratings matter. they are the best tool for evaluating players in a variety of ways and for various purposes.
2. The only reason to have a separate speed name from a regular name is to separate the rankings for each specific type of game. but i have noticed that most players don't bother using their speed name only for speed and vice versa. this bugs me, although i fully expect to be in the vast minority on that one.
3. multiple IDs playing rated games: some people seem to have no problem with this, but i just see no reason for it. at best, you might confuse, mislead, or upset people, and at worst, you might appear to be involved in deception, cheating, ratings fraud, or some other undesirable act.
4. because of my seemingly strict stance on ratings, some have suggested i am obsessed with my rating and that i am inordinately worried about it, that it is "just a game" and "it's only a rating," or some other similarly hackneyed statement. To that i wonder why i see so many people, including those making these comments, playing UNRATED games. the only reason to play unrated is to protect the rating! conversely, i never play unrated games. now surely someone who is "obsessed" with his rating would frequently play unrated to protect against losing the full 32 points to low ranked players.
i guess that is all i have to say about ratings. actually i have one final thought. any act that falsely affects the ratings of even one player has a trickle down effect that affects every other player in subsequent games. this is certainly something to think about.
If I do not accept a game invite right away, it means I will once I have fewer games in progress.
Posts:
54
Registered:
Feb 21, 2003
From:
Hawaii Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 12:06 PM
Aloha,
I played several games on or around 12/30/03. I won two against an opponent (sorry, I don't recall who) who had a much lower rating than I did and MY SCORE DID GO DOWN. It went down one point in each of the two games I won against that player!!
Posts:
54
Registered:
Feb 21, 2003
From:
Hawaii Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 12:11 PM
up2ng,
I think the easiest and best way to solve your issues with rankings would be to have two scores, one as P1 and the other as P2. Or you could have some kind of (weighted) average of those two numbers. That would be completely fair as I see it and it would resolve all of the issues you have presented. Correct?
Posts:
542
Registered:
May 9, 2002
From:
Northeast USA
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 7:03 PM
Thanks for all input so far, and I hope to see even more responses on this issue. I agree with the additional points about ratings made by Dmitri, and here I'd like to answer Thad's posts.
First, if you follow the above math and logic, you will see that it SHOULD be impossible for an established player to EVER lose points after a win. If any established player notices this happening to them, then this is a programming bug, and should be brought to Dweebo's attention -- send him the exact game in question, along with the exact date and time of the occurrence.
IF there is a bug here, I will take a shot in the dark at how this might be happening. Notice in my original post that there IS a chance that a provisional player may gain points when losing a game and lose points when winning a game. Now, this is important: IF THIS OCCURS, IT SHOULD NOT AFFECT WHETHER OR NOT THE ESTABLISHED OPPONENT GAINS OR LOSES POINTS. My suspicion here is that the ratings adjustment portion of the program may be calculating the adjustment of one player's rating, then simply subtracts from 32 (or calculates an absolute value then scales by n/20 if appropriate) AND determines the SIGN of the second player's adjustment based on the result of the first player's calculation. IF this is the case, then THIS WOULD BE AN ERROR IN THE PROGRAM. The reason is that if the provisional player loses a game and STILL gains points according to the provisional rating formula, the established player who defeated him should NOT lose points!
However, if the calculations are determined completely independently, assuming there are no other bugs, it is IMPOSSIBLE for an established player to lose points when he wins -- be absolutely sure that this has occurred, then send these occurrences to Dweebo ASAP.
Finally, your alternate solution of having 2 separate ratings -- one as player 1 and one as player 2 -- is certainly interesting. I'm not sure how I feel about it at this time; it will require more thought. I hope that many other people will chime in on this point and express what they think about this solution. However, I continue to maintain that the BEST solution is to determine and adjust ratings based on the result of a SET of games, and a tie should not affect the ratings at all. I understand that this would require a significant reprogramming of the game room, the ratings adjustment code and perhaps many other aspects of the software, so I do not expect this change to occur in the near future at DSG. But for any future rating systems, such as the tournament system being developed at playpente.com, or perhaps some day a live play rating if live tournament play allows for one, it is important to recognize that competitions should be based on SETS instead of games.
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 10:18 PM
Thad-- it should be impossible for your rating to drop. I have played 1200 games and never had my rating drop after a win. If you did, you should email dweebo. He can check your ratings after each match and verify the accuracy of your claim.
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 10:34 PM
Thad - I believe you are mistaken. I checked the game logs. No such occurrence can be found. At no point after you became established did you win a game and lose rating points.
Posts:
54
Registered:
Feb 21, 2003
From:
Hawaii Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 11:11 PM
It might have been before I was established.
I went and looked at the games history. I played several games right at the end of the year because Dweebo told me I needed to have 20 games in by the end of the year in order to play in Tournament 5. Maybe it was that my ratings went UP following a loss. I can't remember the specifics, but something WEIRD did happen with the ratings. I remember both my opponent & I commenting on it. Is there a way to look back at the comments made during a game? I assume those are not stored.
Posts:
1,032
Registered:
Dec 16, 2001
From:
Powell, OH
Age:
37 Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 1, 2004, 11:50 PM
I have heard reports from several people about this bug, so I will check the code to make sure it is working correctly.
I seem to remember that it "could" happen even with those formulas, but I could be wrong. I took the formulas from a chess website back when I started the site and haven't really messed with them much since then, so it could be buggy.
The only change I've made came from Gary Barnes's suggestion to store the ratings as floating point instead of integers, because the ratings as a whole were slowly decreasing due to rounding off after each game.
Posts:
1,032
Registered:
Dec 16, 2001
From:
Powell, OH
Age:
37 Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 2, 2004, 12:20 AM
Quoting up2ng here: In the FAQ page for DSG it only mentions that every game the provisional player plays has a value:
value = (R1 + R2) / 2 + w * 200 + e * 200
Unfortunately it is never explained what happens to this "value" and it does not define e. R1 is your (provisional player's) rating, R2 is your opponent's rating, and w = 0 for a loss and w = 1 for a win. I will take a guess here -- WARNING, these are assumptions and may not be correct: I will assume that e = 0 for a win and e = -1 for a loss.
I guess I need to update the FAQ a little. value is just the player's final rating after the game. w does stand for a win or loss, but it is 1 for a win and -1 for a loss.
e really is about whether your opponent is established or not. If your opponent is provisional, e = 0; if established, e = w. e's job is basically to double the rate of increase or decrease of your rating after playing an established player.
As an example let's look at the case of you and your opponent having a rating difference greater than 800 (the magic number of the formula), let's say 1000 and 2000, and you're both provisional. Then the average of your two ratings is (1000 + 2000) / 2 = 1500. The rating for the player who won is 1500 + 200 (w = 1) + 200 (e = 1) = 1900! So their rating went from 2000 to 1900 even with a win. Conversely, the loser's rating is now 1500 - 200 (w = -1) - 200 (e = -1) = 1100.
So that solves that "myth". up2ng is correct for games where established players are playing; the ratings will never "reverse" in that case.
The 2000 rated player's rating is now 1900... He lost points while winning?
The key would seem to be that when the average (R1 + R2)/2 is more than 400 below your own rating, the drop cannot be made up by the 400 points (w*200 + e*200) you get for the win. The average rating pulls you down too much. This seems like a bad idea. If I missed something, let me know.
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 2, 2004, 2:52 AM
Well, since I myself used the wrong formula in the previous post, I have to ask the question: When an established player plays an unestablished player, does it use the correct formulas for both?
So the winner, if it is the established player, would use the established formula and the loser would use the unestablished formula?
Just a possibility. Otherwise with the established way of computing ratings there should be no way to drop on a win. Sorry about the confusion.
Posts:
542
Registered:
May 9, 2002
From:
Northeast USA
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 2, 2004, 9:48 AM
This is certainly interesting to know, although it makes less sense to me than when I thought I had gotten it right.
So, first of all, the w in the provisional formula has different values (1 or -1) than in the established formula (1 or 0). And it seems that I also had the idea for e wrong, this value is used to double the gain or loss if the provisional player plays an established player.
Ok, what I really don't get then is that value is calculated fresh after each provisional game and the provisional player's rating is completely independent of previous games played? I'd like you to check that again dweebo because that seems different from what happens in observation. It appears that the purpose (and observed effect) of provisional status is to help "zero in on" a reasonably accurate rating, varying widely at first, then varying less and less with each game until by game 20 it is reasonably close to a true rating. The way you just described it makes it seem like your first 19 games are meaningless: if you are rated 1000 going into your 20th game (let's say after losing 19 straight games against fairly highly rated players) and you then beat a 2000 player, your rating would jump from 1000 to 1900 and you'd have a record of 1 - 19. I just don't remember seeing this happen in practice, so I assumed that these "values" were averaged into other such "values", thus helping to lessen the amount of wild fluctuation each game until you were zeroed in on your true rating after game #20. Please clarify, thanks.
Quick note: about the possible bug -- just be sure that if an established player plays a provisional player, that the SIGN is correct for the established player when the provisional player gains points after losing. I have personally never witnessed this phenomenon but if you have already gotten many such reports in the past maybe there is something to it.
Posts:
1,032
Registered:
Dec 16, 2001
From:
Powell, OH
Age:
37 Home page
Re: DSG Ratings -- Fact and Fiction
Posted:
Feb 2, 2004, 3:33 PM
Oh yeah, you're right up2ng, forgot to add the part about averaging in past games. My earlier statement about "value" was incorrect.
The last part of the formula is this: rating = (rating * total games + value) / (total games + 1);
As for questions about when one player is established and the other provisional: the provisional formula will be used for the provisional player and the established formula for the established player.
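Putting the clarified pieces together, the provisional update can be sketched as below. This is not the site's Java code. It assumes w = 1 or -1, e = w against an established opponent and 0 against a provisional one, that R1 in the value formula is the player's current rating, and that "total games" counts the rated games played before this one. All names are hypothetical.

```python
def provisional_update(rating, games_played, opp_rating, won, opp_established):
    """Return a provisional player's new rating after one rated game."""
    w = 1 if won else -1
    e = w if opp_established else 0
    value = (rating + opp_rating) / 2 + w * 200 + e * 200
    # Average this game's value in with the games already played.
    return (rating * games_played + value) / (games_played + 1)


# First rated game ever: the value becomes the rating outright.
print(provisional_update(1600, 0, 1600, True, True))   # 2000.0
# up2ng's scenario: rated 1000 after 19 games, then a win over an established 2000.
print(provisional_update(1000, 19, 2000, True, True))  # 1045.0
```

Under these assumptions the 20th game moves the rating from 1000 to 1045 rather than to 1900, which matches the "zeroing in" behavior seen in practice.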
And here is all the Java code so anyone can verify it again.
The method updateRating() is called for each player after the game finishes. Also note that a copy of player 1's data is made before its rating is updated, and that copy is used to calculate player 2's rating change; this is so both players' ratings are updated "at the same time". Here is that code; calls to gameOver() then call updateRating().