What is FargoRate?
Unlike high jumpers, who have height, swimmers, who have time, and javelin throwers, who have distance, pool players (pocket billiard players) have no absolute measure of performance. Skill at pool, like skill at chess, must be based on relative performance: upon who beats whom.
FargoRate rates pool players worldwide on the same scale based on games won and lost against opponents of known rating. We compute the optimum set of ratings—also known as maximum likelihood ratings—as those that best predict the outcome of all of the games amongst all of the players.
Professional players generally have ratings between 700 and 800. A random company holiday party might have many players rated between 50 and 200. Most people who play pool in leagues and tournaments are between these ranges, i.e., between 200 and 700. There is no top and no bottom to the scale.
The rating difference between two players determines the chance each will win a game.
Two players with the same rating, i.e., a 300 and another 300, or a 600 and another 600, have equal chances of winning a game between them. If the two players play multiple games, they will tend to win them in a ratio of 1:1 (one to one).
When two players are 100 points apart, say a 300 versus a 400, the ratio of game wins will be near 1:2, as in 5 games to 10 games, or 50 games to 100 games.
A 200-point gap leads to a game win ratio of 1:4.
A 300-point gap leads to a game win ratio of 1:8.
A 400-point gap leads to a game win ratio of 1:16.
Two players with a 34-point gap, like a 530 and a 564, will win games in a 4:5 ratio. A 50-point gap predicts a 5:7 win ratio.
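The ratios above all follow from one rule: each 100 points of rating gap doubles the win ratio. A minimal sketch of that rule follows; the function names `win_ratio` and `win_probability` are chosen here for illustration and are not part of any FargoRate software.

```python
def win_ratio(rating_gap: float) -> float:
    """Games won by the higher-rated player per game won by the lower-rated one.

    Assumes the doubling-per-100-points rule described in the text.
    """
    return 2.0 ** (rating_gap / 100.0)


def win_probability(rating: float, opponent_rating: float) -> float:
    """Chance that a player rated `rating` wins one game against `opponent_rating`."""
    ratio = win_ratio(rating - opponent_rating)
    return ratio / (1.0 + ratio)


print(win_ratio(100))             # 2.0, i.e. a 1:2 ratio for the lower-rated player
print(win_ratio(34))              # roughly 1.27, close to 5:4
print(win_ratio(50))              # roughly 1.41, close to 7:5
print(win_probability(400, 300))  # roughly 0.667
```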
A new player can establish a rating by performance against an opponent of any rating. For instance, a new player who consistently wins 2 out of 3 games against a 350 is performing like a 450. That is, the two win games in a 2:1 ratio and thus are separated by about 100 points. A group of players who are well coupled to one another, like in a local league, can become coupled to the rest of the world by a few players or even a single player playing outside the group.
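The single-opponent case can be worked out by inverting the doubling rule: the rating gap is 100 times the base-2 logarithm of the win ratio. The sketch below assumes all games were played against one opponent of known rating; `performance_rating` is an illustrative name, not a FargoRate function.

```python
import math


def performance_rating(opponent_rating: float, games_won: int, games_lost: int) -> float:
    """Rating implied by a win/loss record against a single opponent of known rating."""
    if games_won <= 0 or games_lost <= 0:
        # With no wins or no losses the implied gap is unbounded.
        raise ValueError("need at least one win and one loss")
    return opponent_rating + 100.0 * math.log2(games_won / games_lost)


# Winning 2 out of 3 games against a 350 is a 2:1 ratio, about 100 points higher.
print(round(performance_rating(350, 2, 1)))  # 450
```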
Games are added to our dataset every day. And a new rating optimization, coupling everybody together around the globe, is performed every day.
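To give a feel for what a global optimization of this kind involves, here is a small sketch of a maximum likelihood fit under the doubling-per-100-points rule. It uses a standard iterative update for paired-comparison models and is not FargoRate's actual implementation. The function name, the input format, and the use of a single anchor player to fix the scale are all assumptions made for illustration. It also assumes every player has at least one win and one loss and that the players are all connected through common opponents.

```python
import math
from collections import defaultdict


def fit_ratings(results, anchor, anchor_rating, iterations=200):
    """Fit ratings to a list of (winner, loser) game records.

    Returns {player: rating}, shifted so that `anchor` has `anchor_rating`.
    """
    wins = defaultdict(int)    # games won by each player
    games = defaultdict(int)   # games played between each pair of players
    players = set()
    for winner, loser in results:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1
        players.update((winner, loser))

    # Strength s relates to rating r by s = 2 ** (r / 100).
    strength = {p: 1.0 for p in players}
    for _ in range(iterations):
        updated = {}
        for p in players:
            denom = 0.0
            for pair, n in games.items():
                if p in pair:
                    (q,) = pair - {p}
                    denom += n / (strength[p] + strength[q])
            updated[p] = wins[p] / denom
        strength = updated

    offset = anchor_rating - 100.0 * math.log2(strength[anchor])
    return {p: 100.0 * math.log2(s) + offset for p, s in strength.items()}


# A and B split games 20:10, B and C split games 8:4; B is assumed known at 500.
records = [("A", "B")] * 20 + [("B", "A")] * 10 + [("B", "C")] * 8 + [("C", "B")] * 4
ratings = fit_ratings(records, anchor="B", anchor_rating=500)
print({p: round(r) for p, r in sorted(ratings.items())})
```

In this toy data set, C never plays A, yet C still receives a rating on the same scale because both are coupled through B.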
The result is a system that is as useful for rating two-dozen players in a small-town league as it is for rating players in a regional tournament tour as it is for rating world-class competition. And a byproduct is each of these groups knows exactly where it stands relative to the others.
While Fargo Ratings are defined by relative performance, they very much take on an absolute meaning. Once players in a region become accustomed to Fargo Ratings, the ratings become central to discussions of player ability.
There is no top or bottom to the scale
800 | A top world-class player. Fewer than 20 players worldwide have ratings that exceed 800. |
700 | A top regional player in the US and a threat to run six in a row if the break is working. There are about 300 players at this level in the United States. A world-class female player. |
600 | Has run three in a row multiple times and maybe four in a row a time or two. High run in 14.1 of 50-60. There are generally around 30 players at this level per million population. |
500 | A good local league player. Runs out first time at the table in about 5% of games. Close to the median of players in the FargoRate system. |
400 | Runs out first time at the table in about 1% of games, once or twice a league season. |
300 | A common level of play for a league player. Maybe has run a table, and maybe not. |
200 | Beginner level, modestly coordinated. Most likely has never run an 8-ball table. |
100 | Beginner level, somewhat uncoordinated. |
Fargo ratings are on a logarithmic scale like the Richter scale for earthquakes. What that means is that for each gap of 100 points, the higher rated player is twice as good as the lower rated player in the sense that a fair match between them would be 8-4 or 10-5, i.e., the higher rated player wins twice as many games as the lower rated player.
This would also be true for any other 100-point gap, such as from 550 to 650. Tables are easily constructed that show fair matches for any rating difference. For instance, when the stronger player goes to 9 games, a fair match is one for which the weaker player goes to the following number of games:
Rating difference | Weaker player goes to |
17 | 8 |
36 | 7 |
58 | 6 |
85 | 5 |
117 | 4 |
158 | 3 |
217 | 2 |
317 | 1 |
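The table can be reproduced from the doubling rule. The sketch below assumes, as the 8-4 and 10-5 examples suggest, that a match is fair when the two race lengths are in the same ratio as the expected game wins. The function name `gap_for_fair_race` is illustrative only.

```python
import math


def gap_for_fair_race(stronger_goes_to: int, weaker_goes_to: int) -> float:
    """Rating gap at which the given race lengths match the expected win ratio."""
    return 100.0 * math.log2(stronger_goes_to / weaker_goes_to)


# Stronger player goes to 9; print the gap for each shorter race.
for weaker in range(8, 0, -1):
    print(round(gap_for_fair_race(9, weaker)), "|", weaker)
```

The same function covers other race lengths, for example `gap_for_fair_race(7, 5)` for a "7-5" match.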
Robustness is a measure of the reliability of a player’s Fargo Rating. For now, it is simply the number of games a player has played that contribute to his or her rating. A robustness of 200 is a minimum standard for us to consider a rating “established.” In general, a rating is more reliable not only by being based on more games but also by more of those games being recent and by more of those games being against opponents with established ratings. Robustness will likely incorporate these latter two factors in the future, and that is why we don’t simply call it number of games. Players with a robustness under 200, i.e., those with an unestablished rating, have an official rating that may be influenced by a starter rating. [see What is a starter rating?]
Many. The basic relation between rating differences and win probability is characteristic of Elo schemes. Arpad Elo was a Hungarian-born American physicist who first applied these ideas to rate chess players several decades ago. These ideas are still core to chess ratings and also form the basis for world ratings in football (soccer), NFL football, baseball, a variety of competitive video games, the game Go, and many others. These equations even made an appearance in the movie The Social Network (the Facebook movie) as part of Facemash, an Elo-based scheme to rate the attractiveness of female university students at Harvard.
There were two implementations of Elo-type schemes in the 1990s. One was by Ron Shepard, a scientist at Argonne National Laboratory outside of Chicago, who implemented the scheme for 8-ball players in the Argonne Pool League. The other was by Bob Jewett, who used an Elo-type scheme as the basis of the NPL (National Pool League) rankings for 9-ball players largely in the San Francisco area. More recently the idea of an Ab Initio Global Optimization of Elo-type ratings was described by Michael Page in a 2002 Billiards Digest article, Sizing up with the Pros. Fargo Ratings were later implemented without the global optimization at Fargo Billiards in Fargo, ND.
Most handicapping systems or rating systems are easily manipulated. And such manipulation is a serious problem. A small number of unscrupulous players begin trying to cheat any new system. Then when other players hear about this they feel they need to join in or be played the fool. Soon it becomes an industry. Many of those who make it to national events are those most adept at working the system. While tens of thousands of people play 8-ball every week, there is another group for which 8-ball is only one of the games they’re playing. The secret rating algorithm depends prominently on inning counts. And any player capable of running out against a weaker opponent is also capable of bunting balls around for a couple of innings like a cat plays with a mouse, padding the inning count while still winning the game. Making the secret, proprietary formula more complex to try to stem this problem is tempting, but it just fuels the game and the true gamers.
The best way to deal with this problem is to devise a system that is open, transparent, and naturally resistant to manipulation. While the possibility of manipulating the system can never completely be eliminated, the fact that every single game against every opponent contributes to a player’s rating makes that manipulation much more difficult. Also, there are a number of features of the system described here that mitigate the problem. A player generally cannot get intentional losses in the system without paying for them. For example, to enter a double-elimination tournament with the intent of losing two matches comes at the expense of the tournament entry fee. Furthermore, if a player does this three tournaments in a row, he or she has squandered three tournament entry fees only to find all that nefarious effort thwarted by a single good tournament, where the player plays six or eight matches rather than the two in the losing tournaments.
The biggest deterrent, though, is a consequence of the players believing in the system, which they do. It is clear after using this system for over five years that the vast majority of players strive for a higher rating. Nearly every 485 wants to be a 500. Nearly every 685 wants to be a 700.
In the USA Pool League (USAPL) 8-ball match format, players are awarded one point for each ball pocketed and 7 points for the 8-ball. So the player who wins a game always gets 14 points, and the player who loses a game gets between 0 and 7 points. Fargo Ratings predict the likelihood of each player winning a game, so it is straightforward to use Fargo Ratings to predict which player gets 14 points, but Fargo Ratings is silent on the number of points the losing player is expected to get for a loss.
Fortunately, this can be determined from statistics on tens of thousands of 8-ball games played by players of different skill. The average score for a losing player is a little more than 4 points. But by drilling down a little more it is determined that a player earns more points when losing a game against a weak opponent than by losing a game against a strong opponent. A strong opponent is more likely to run out leaving many of your balls on the table. A weak opponent is more likely to win the game when you have tried and failed to get out and consequently have few balls on the table. This dependence is accounted for in the USAPL matchups.
Fair USAPL matchups are determined by the rate each of the players is expected to earn points, so the only thing needed to determine the actual matchup is the length of the match, and that is determined by the desired number of games, or the desired length in time of the match.
If p is the probability a player wins a game and q=(1-p) is the probability the same player loses a game, the expected points earned in a game is p*14 + q*(points per loss), and points-per-loss depends on the opponent’s rating.
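As a small worked example of this formula, the sketch below computes expected points per game. The points-per-loss value used in the example is a placeholder near the roughly 4-point average mentioned above, not an actual USAPL figure, and the function name is illustrative.

```python
def expected_points_per_game(p_win: float, points_per_loss: float,
                             points_per_win: float = 14.0) -> float:
    """Expected points in one game: p * 14 + q * (points per loss), with q = 1 - p."""
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    return p_win * points_per_win + (1.0 - p_win) * points_per_loss


# A player with a 2/3 chance of winning each game, assuming 4.0 points per loss.
print(expected_points_per_game(2.0 / 3.0, 4.0))
```

In a real matchup the points-per-loss input would differ for the two players, since it depends on the opponent's rating.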
Rating players and handicapping matches are two separate things. Many people are interested only in rating players to compare performance. But ratings may also be used to generate fair matchups between players of different skill. The easiest way to do this is to have matches where the two players must win a different number of games to win the match. So in a “9-7” match, the higher-rated player must win 9 games before the lower-rated player wins 7 games to win the match.
It is tempting simply to drop off old games in favor of new games for players that play regularly such that the rating is always reflecting a recent skill level. But that strategy is a mixed bag. The reason is that each game impacts the rating of both players. And while a particular game may seem old or unimportant for one player, that same game might be a key ingredient of the rating of the opponent, who maybe hasn’t played many games.
To address this issue, the weight of a match towards a player's rating diminishes over time. The end result is that more recent matches have a greater influence on a player's rating than do older matches.
A starter rating—aka a starter guess—is part of an optional approach to incorporate local knowledge/prior knowledge in assigning a useful preliminary rating for players who don’t yet have a Fargo Rating. It is not part of the FargoRate system.
The FargoRate system computes a performance rating based on data. When that performance rating is based on 200 or more games, it is called a Fargo Rating. Because a performance rating based on only a few games is unreliable as a measure of skill, it can be supplemented with prior knowledge to generate a sensible guess of skill for players without a Fargo Rating.
The preliminary rating the player sees is a weighted blend of the performance rating (with influence determined by the number of games it is based upon) and the starter rating (with influence based on the remaining games to 200.)
For instance, a player with performance rating of 580 based upon 50 games and a starter rating of 540 will see a preliminary rating of 550.
Once a player has 200 games the starter rating is ignored.
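The blend can be sketched as a weighted average. The linear weighting below is an assumption: it is consistent with the 580/540 example above, but the exact weighting FargoRate uses is not spelled out here. The names `preliminary_rating` and `ESTABLISHED_GAMES` are illustrative.

```python
ESTABLISHED_GAMES = 200  # games needed for an established rating


def preliminary_rating(performance_rating: float, games_played: int,
                       starter_rating: float) -> float:
    """Blend performance and starter ratings, weighted by games played (assumed linear)."""
    if games_played >= ESTABLISHED_GAMES:
        return performance_rating  # starter rating is ignored
    weight = games_played / ESTABLISHED_GAMES
    return weight * performance_rating + (1.0 - weight) * starter_rating


# 50 games at a 580 performance, with a starter rating of 540.
print(preliminary_rating(580, 50, 540))  # 550.0
```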
After finishing a post-doctoral research fellowship with the US National Research Council, Michael spent eight years as a scientist in the Laboratory for Computational Physics at the US Naval Research Laboratory. He followed this with eighteen years as Associate Professor of Chemistry at North Dakota State University, where he taught graduate and undergraduate courses in physical chemistry and ran an active research program that produced several PhD scientists. He is the author of over 40 published articles and book chapters in the areas of quantum chemistry and computational molecular physics. He left his academic career in 2008 to become the proprietor of Fargo Billiards & Gastropub in Fargo, ND.