Introducing the Inaugural KYH Elo Ratings System

By Lambert Chu


Here’s an age-old question in ZED: how does my horse really stack up to the rest of the competition?

ZED’s built-in class points system does a decent job of approximating horse ability, but it’s far from perfect. The point changes are always the same after each race, no matter how strong the horses you competed against were. On top of that, the point changes are controlled completely by ZED and have already been overhauled once.

We need a better rating system! It should be one that assigns more intelligent points changes after each race and takes into account the strength of the field.


The Classic Elo System


First, let’s get an idea of how a typical Elo system works and why it doesn’t quite work well for ZED.

Elo is used to calculate the relative skill levels of players or teams in zero-sum, two-player games. It’s famous for rating chess players, but it’s expanded to serve other purposes too - like FiveThirtyEight’s NBA team ratings.

After every game, the winning player takes points from the losing one, and the number of points is determined by the difference in the players’ ratings.

  • If the higher-rated player wins, a few points are taken from the lower-rated player.
  • If the lower-rated player wins, a lot of points are taken from the higher-rated player.
  • If it’s a draw, the lower-rated player gains a few points from the higher-rated player.
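
To make those three cases concrete, here’s a minimal sketch of the textbook two-player Elo update. The K-factor of 32 and the 400-point scale are common defaults that we’re assuming here for illustration, and the function names are made up for this example; they are not necessarily the settings behind the KYH ratings.

```python
def expected_score(rating_a, rating_b, d=400.0):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / d))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    k=32 and d=400 are assumed defaults, not values from this article.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum: whatever A gains, B loses (and vice versa).
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: a 1200 favorite against a 1000 underdog.
favorite_wins = update_ratings(1200, 1000, 1)    # small transfer to the favorite
underdog_wins = update_ratings(1200, 1000, 0)    # large transfer to the underdog
draw = update_ratings(1200, 1000, 0.5)           # underdog gains a little
```

The size of the transfer depends only on how surprising the result was, which is exactly the behavior described in the bullets above.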


But how can we calculate Elo ratings when there are more than two players like in ZED races? That gets a bit trickier. We’ll have to extend the classic Elo logic so that coming in first place is better than finishing in second place, which is better than finishing in third place, and so on. Losing (coming in last place) would not improve your rating.

In order to generalize this calculation to multiplayer races, we can treat a single, multiplayer race as a combination of all pairwise 1-on-1 matchups between every horse involved.

For example: if your horse finished in 2nd place, that means you lost to the 1st place horse, beat the 3rd place horse, beat the 4th place horse, etc.
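
As a rough sketch of that idea, the snippet below breaks a finishing order into its 1-on-1 matchups. The function name and the horse names are placeholders for illustration only.

```python
from itertools import combinations


def pairwise_results(finish_order):
    """Split one race into 1-on-1 results.

    finish_order lists the horses from 1st place to last place.
    Returns (winner, loser) tuples, one per pair of horses.
    """
    # combinations() keeps the input order, so the first horse in each
    # pair always finished ahead of the second.
    return list(combinations(finish_order, 2))


# Hypothetical four-horse race, listed from 1st to 4th.
results = pairwise_results(["A", "B", "C", "D"])
wins_for_b = sum(1 for winner, _ in results if winner == "B")
losses_for_b = sum(1 for _, loser in results if loser == "B")
```

Here the 2nd place horse "B" ends up with one loss (to "A") and two wins (over "C" and "D"). A full 12-horse race produces 66 of these matchups.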



Problems with Classic Elo in ZED


As you probably know by now, finishing last in ZED isn’t necessarily a bad thing! Having a lot of last place finishes is a strong indicator that you have a horse with high deviation, aka a U-shaped horse. This is one of the strongest archetypes in the game, since you’ll have a lot of finishes at the top of the pack (in the money), yet finishing at the back of the pack doesn’t hurt you at all compared to finishing in the middle of the pack.

In a traditional Elo system, horses that finish in the back of the pack are considered losers to all 11 other horses, including the ones in the middle of the pack. As a result, U-shaped horses are excessively punished for finishing last and are assigned Elo ratings that are too low.

Conversely, horses that consistently finish in the middle of the pack are ranked as average horses, when in reality they’re considered to be bad racers. Therefore, we have an issue where back-of-pack placements are excessively punished.

A second problem is that finishing in 1st place is disproportionately favorable in ZED. That’s because the game’s prize structure skews heavily towards the front. Here’s the breakdown of the prize pool payout as of November 2021:

1st place: 60%
2nd place: 25%
3rd place: 15%

Because of the payout structure, the 1st place horse beating the 2nd place horse is a “more significant win” than the 2nd place horse beating the 3rd place horse. It’s certainly a more meaningful win than the 9th place horse beating the 10th place horse. The problem is that consecutive placements should not be weighted linearly.

To have a system that’s truly better at evaluating ZED horses, we’ll need to find a way to resolve these two issues.



Changes to the System


One way to reduce the penalty of back-of-pack placements is to tighten up the distribution of possible placements. Instead of a 12th place horse losing to 11 other horses, why don’t we lower the number of L’s that it takes?

In this system, we’ll make one major tweak to horse placements. Any horses that finish from 4th to 6th place are considered tied, and any horses that finish from 7th to 12th place are considered tied.

Horses that are in the money (1st, 2nd, 3rd) are still ranked separately. Horses in the middle of the pack (4th to 6th) are ranked in the next tier, and the remaining horses are in the bottom tier. This structure wasn’t chosen completely arbitrarily - it’s consistent with ZED’s latest class progression structure.


We revamp the final horse positions for each race. There are multiple horses “tied” for positions 4 and 5.
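
The tiering above can be sketched as a simple mapping from the raw finish to the position used for rating purposes. The function name is our own for this illustration, and it assumes a standard 12-horse race.

```python
def tiered_position(place):
    """Map a raw finish (1-12) to the position used for rating purposes."""
    if not 1 <= place <= 12:
        raise ValueError("place must be between 1 and 12")
    if place <= 3:
        return place   # in the money: ranked separately
    if place <= 6:
        return 4       # 4th to 6th: tied for position 4
    return 5           # 7th to 12th: tied for position 5


tiers = [tiered_position(p) for p in range(1, 13)]
# Count how many horses the last place finisher is now considered to have lost to.
losses_for_last = sum(1 for t in tiers if t < tiers[-1])
```

With this mapping, the 12th place horse takes 6 losses (to the horses in 1st through 6th) and ties with the other five back-of-pack horses, instead of taking 11 losses.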


Now, how about the problem of consecutive placements?

We can fix this by assigning an exponential score function. This rewards finishing at the top of the pack a lot more - especially the 1st place horse.

In a standard Elo system, the scores are weighted linearly. With a higher scoring base, the top placements earn exponentially higher ratings. You can see in the graph below how the exponential functions (red and green lines) have much higher rewards for the top of the pack compared to the classic linear function (blue line).

See the Python package below for the image source.
For the model implementation, I used a modified version of this Python package. Special thanks to djcunningham0!
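
To give a feel for linear versus exponential scoring, here’s a small sketch of one common way to define the score curve. The formula, the function name, and the base of 1.5 are assumptions for illustration; the exact base used for the KYH ratings isn’t stated here, and this is not the package’s own code.

```python
def placement_scores(n, base=1.0):
    """Share of the total score awarded to places 1..n (shares sum to 1).

    base == 1 gives the classic linear curve; base > 1 gives an
    exponential curve that favors the top of the pack.
    Last place always scores 0.
    """
    if n < 2:
        raise ValueError("need at least two finishers")
    if base < 1:
        raise ValueError("base must be >= 1")
    if base == 1:
        raw = [n - place for place in range(1, n + 1)]
    else:
        raw = [base ** (n - place) - 1 for place in range(1, n + 1)]
    total = sum(raw)
    return [score / total for score in raw]


linear = placement_scores(12)             # classic, evenly spaced steps
exponential = placement_scores(12, 1.5)   # hypothetical base of 1.5

# Gap between 1st and 2nd versus gap between 9th and 10th.
top_gap = exponential[0] - exponential[1]
mid_gap = exponential[8] - exponential[9]
```

On the linear curve every step between consecutive places is worth the same amount. On the exponential curve, top_gap is far larger than mid_gap, so beating the 2nd place horse counts for much more than the 9th place horse beating the 10th.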


The Elo System in Action


How about we see the entire thing play out for a sample race?

This is a table that shows Elo ratings before and after this race from back in December 2020.
Horses that finish 1st through 3rd gain Elo points, while the rest of the field lose Elo points. The amount won or lost is dependent on their final position and the ratings of horses they beat or lost to.



In this specific race, notice how Hozier gets hit particularly hard for finishing 10th. It went into the race with an 1192 rating (the highest of the field), so it lost more points than any other horse in this race.

That's the beauty of taking strength of field into account!
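
Here’s a stripped-down sketch of why a favorite gets hit harder for a bad finish. To keep it short, it uses linear scores with no ties, a K-factor of 32 scaled by the number of opponents, and made-up ratings. All of those are assumptions for illustration, not the live KYH settings or the real numbers from the race above.

```python
def race_deltas(ratings, positions, k=32.0, d=400.0):
    """Rating change for each horse after one race.

    ratings:   pre-race ratings, one per horse
    positions: finishing position per horse (1 = winner, no ties)
    """
    n = len(ratings)
    pairs = n * (n - 1) / 2
    deltas = []
    for i in range(n):
        # Expected share: classic Elo win probability against every rival,
        # normalized so the whole field's expectations sum to 1.
        expected = sum(
            1.0 / (1.0 + 10 ** ((ratings[j] - ratings[i]) / d))
            for j in range(n) if j != i
        ) / pairs
        # Actual share: linear score, 0 for last place, shares sum to 1.
        actual = (n - positions[i]) / pairs
        deltas.append(k * (n - 1) * (actual - expected))
    return deltas


# Hypothetical four-horse races where the first horse finishes last.
favorite_last = race_deltas([1200, 1000, 1000, 1000], [4, 1, 2, 3])
even_field_last = race_deltas([1000, 1000, 1000, 1000], [4, 1, 2, 3])
```

The first horse loses more points in favorite_last than in even_field_last, because more was expected of it going in. The changes across the field still sum to zero in both cases.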



How do the Final Ratings Look?


The final ratings are now live on KYH!! Click here to see the full list. Keep in mind these are KYH's very own, in-house ratings and are not an official ZED stat.

In this first version, we are using all-time race history, which means we consider all races across prize pools and distances.

As an overall summary, here are the average Elo ratings and ZED ratings per class. It makes plenty of sense that the higher class horses have higher Elo and ZED ratings - but notice how Class 1 horses have very noticeably higher ZED ratings.


This comes from the elite horses in the game really skewing the average higher, since they continuously race, make dough, and creep up in overall class points. On the other hand, our system does a better job of calculating rating changes when these top horses compete against each other in Class 1.

Let’s also plot ZED’s ratings against our own Elo ratings:

In ZED’s rating system (x-axis), almost all of the horses are clustered at the low end of the scale, within a tight range between 0 and 100. Our Elo system (y-axis) does a better job of dispersing these low-end horses across a wider range of numbers - which means it should be easier to tell whether one horse is better than another.

With an Elo system, you can also get a better idea of how good relatively new horses are. If a new horse races several times against monsters and beats them, it'll gain points faster - as it should!



Conclusion


Now that Elo ratings are live, they’ll open the door for even more opportunities. We can display ratings through the KYH Chrome extension, so you can gauge your competition even more effectively before entering a race; we can flag races that have a ton of monsters (high average rating) or have a light field (low average rating). We can also quickly spot races with crazy upsets, where a low-rated horse beats a high-rated horse.

On top of that, we’ll use these ratings as targets for models that predict racing ability for lightly-raced horses. We should even be able to use them to predict breeding ability, one of the hottest questions in the game. And of course, it’ll be a lot of fun to watch ratings climb for your horses for bragging rights.

One next step we'll definitely take is creating separate ratings lists per distance funnel. Notable distance specialists - such as Princess of Power - are hurt in ratings when they purposely lose races to downclass. Her losses in sprints should not discredit her from being one of the best marathon runners in the game.

There is still some subjectivity in the methodology, and slightly adjusting some of the knobs can yield a different set of rankings. As a result, take everything with a grain of salt. However, we stand by this system as an excellent gauge of true horse performance that’s free from personal biases.

Stay tuned for more adjustments going forward. If you feel like you see glaring errors, please feel free to reach out… but if you ask us to bump up the rating of your favorite horse, you’re getting a lump of coal for Christmas!

