The Beverz Rating: Dodgeball's first ever rating system
Have you ever wondered what dodgeball and chess have in common? Well, not a lot actually... until now!
Alex Bembridge, from Team GB and Manchester Killer Bees, isn't just a dodgeball player; he's a self-confessed nerd too. Since the first lockdown, he's been crunching numbers to create dodgeball's very first ELO-based rating system, aptly named the Beverz Rating System. In case you didn't know, an ELO rating system is used in chess!
For those who don't know what an ELO rating system is, simply put, it is a system that allocates a rating to a player or team based on their performance. Over the course of the season, a team's rating can increase or decrease depending on their results. By comparing two teams' ratings, you can then estimate the probability of a team winning and even the score of the match.
So, we've teamed up with Alex to predict match results and scores throughout the league season, and we'll be analysing the most anticipated line-ups. Stay tuned for the first predictions of the season!
Read on if you dare... this is where things start to get deeper. In the rest of the blog, Alex will explain how the system works and, fundamentally, how he will predict the results.
The Beverz Rating
What is ELO?
ELO is a rating system that originated in chess, named after its creator Arpad Elo. Yes, that is where I got the idea - I told you I was nerdy. However, numerous other sports use an ELO-based rating system. FIFA use it for the international team rankings, and a variation is used in esports like League of Legends and Rocket League.
What is the Beverz Rating?
It is an ELO system for all of the dodgeball teams in the UK, including men’s, women’s, mixed and university. I will now break down how I've built the system and how I will use it going forward.
Step 1. Starting Point
I have used all the scores from the 2019-2020 "half season", all 2021 results (that I have access to) and educated estimations to generate a rating for each team. This gives me a basic rating. For example, Super League teams would be given a higher starting rating because they play at the highest level.
Step 2. The Maths
Oh, don’t worry, this article doesn’t go into the detail of the maths, except for this graph:
This graph shows how the difference between the two teams’ ratings translates into a win probability. For example, if your rating is 400 points higher, there is roughly a 0.9 or 90% chance that you will win.
Based on your rating paired against another team's, I can statistically predict roughly the probability you will win a set!
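For anyone who does want a peek at the maths, here is a minimal sketch of the kind of curve the graph describes. It assumes the classic chess formula with a 400-point scale, which matches the "400 points higher is roughly 90%" example; the article does not state the exact formula behind the Beverz Rating, and the function name `win_probability` is made up for illustration.

```python
def win_probability(rating: float, opponent_rating: float, scale: float = 400.0) -> float:
    """Standard Elo expected score: chance that `rating` beats `opponent_rating`.

    Assumption: the classic chess curve with a 400-point scale. The exact
    formula used by the Beverz Rating is not given in the article.
    """
    diff = rating - opponent_rating
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


# Equal ratings give an even match; a 400-point lead gives roughly 0.9.
print(round(win_probability(1000, 1000), 2))  # 0.5
print(round(win_probability(1400, 1000), 2))  # 0.91
```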
But it doesn't stop there. Based on the probability of a team winning a set, paired with the duration of the match, I can also predict the score.
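One simple way to turn a set-win probability into a predicted score is sketched below. It rests on two assumptions that are not in the article: that the match duration has already been converted into an expected number of sets, and that every set has a winner (no drawn sets). The name `predicted_score` is hypothetical.

```python
def predicted_score(p_set: float, sets_played: int) -> tuple[int, int]:
    """Split `sets_played` sets between two teams given one team's per-set win chance.

    Assumptions: the number of sets is known in advance and no set is drawn.
    """
    if not 0.0 <= p_set <= 1.0:
        raise ValueError("p_set must be between 0 and 1")
    team_sets = round(p_set * sets_played)
    return team_sets, sets_played - team_sets


# A team with a 75% chance per set, over a 12-set match.
print(predicted_score(0.75, 12))  # (9, 3)
```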
Step 3. New Rating
Once the match has been played, I compare the predicted result to the actual result: I calculate the percentage of sets your team won and compare it to the win probability. If your team does better than expected, your rating goes up; if it does worse than expected, your rating goes down.
The amount it goes up or down by depends on the difference between the actual and predicted scores, the type of tournament/fixture, and the accuracy of the rating. Ratings are less reliable for new teams than for established teams: because of the lack of data, their initial rating changes are more volatile. For example, Leamington Spartans have played in numerous league events and opens, so their accuracy would be higher than that of a team like Snoop Dodge, who have only ever played one tournament.
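The update described above can be sketched roughly as follows. This is an illustration only: the step size (often called the K-factor in ELO systems), the fixture weight, the new-team boost and the five-event threshold are all placeholder values invented for the example, not the numbers used by the Beverz Rating.

```python
def k_factor(events_played: int, fixture_weight: float = 1.0,
             base_k: float = 32.0, new_team_boost: float = 2.0,
             settle_after: int = 5) -> float:
    """Hypothetical step size: larger for teams with little history.

    Every number here is a placeholder, not a real Beverz Rating value.
    """
    boost = new_team_boost if events_played < settle_after else 1.0
    return base_k * fixture_weight * boost


def updated_rating(rating: float, expected_share: float,
                   actual_share: float, k: float) -> float:
    """Move the rating in proportion to (actual - expected) share of sets won."""
    return rating + k * (actual_share - expected_share)


# Same result (50% of sets won when 30% was expected), different history:
# the newer team's rating moves further.
new_team = updated_rating(1000, 0.30, 0.50, k_factor(events_played=1))
established = updated_rating(1000, 0.30, 0.50, k_factor(events_played=10))
print(new_team > established)  # True
```

A lower `fixture_weight` could play the role of the tournament/fixture type, so that a friendly open moves ratings less than a league round.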
Step 4. Let's apply this to a real example
Let's take Meteors vs Bath Bombers at the start of the season, which Meteors won 10-8. The table below shows the details of the game as it was at the time.
It was Bath's first result, so they started on 1000, while Meteors were at a lofty 1562 after an undefeated 2020 season. Bath performed better than expected and Meteors worse, so their ratings increased and decreased respectively.
|Meteors rating|Bath rating|Meteors chance of winning|Bath chance of winning|Predicted score|Actual score|Bath change of rating|Meteors change of rating|
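To see why Bath's rating rose after a defeat, the comparison can be sketched with the figures quoted in the text (ratings of 1562 and 1000, and a 10-8 result). The win probability here assumes the classic 400-point ELO curve, so the numbers are indicative and may differ from the ones in the original table.

```python
meteors, bath = 1562, 1000        # ratings quoted in the article
meteors_sets, bath_sets = 10, 8   # actual result quoted in the article

# Assumption: standard Elo curve with a 400-point scale.
expected_meteors = 1 / (1 + 10 ** ((bath - meteors) / 400))
expected_bath = 1 - expected_meteors

total_sets = meteors_sets + bath_sets
actual_meteors = meteors_sets / total_sets
actual_bath = bath_sets / total_sets

print(f"Meteors: expected {expected_meteors:.2f}, actual {actual_meteors:.2f}")
print(f"Bath:    expected {expected_bath:.2f}, actual {actual_bath:.2f}")
```

Under that assumption Meteors were expected to take about 96% of the sets but took 56%, while Bath took 44% against an expected 4%, which is why the ratings move in opposite directions to the match result.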
By the end of the season, match predictions will be more accurate because more information will have been gathered, particularly for Bath, who were a new team at the start of the season. We can therefore adjust their rating closer to their "true" rating based on the new data we obtain from them playing.
Step 5. Rinse and Repeat
As every team's rating fluctuates after each event, the ratings will become more accurate. This means I will be able to observe patterns emerging and better predict results. For example, now that Bath have entered 5 events, exceeding most expectations, their rating is at a modest 1276. If they were to come up against Meteors again, Meteors would still be the favourite, but is Bath's rating still too low, or even Meteors' too high?
|Meteors rating|Bath rating|Meteors chance of winning|Bath chance of winning|Predicted score|Actual score|Bath change of rating|Meteors change of rating|
There are other systems, why this one?
I have chosen this method because it gives a more accurate win percentage. Chess results treat the whole game as a "win" (100%), "draw" (50%) or "loss" (0%), whereas the above system uses the probability of winning each set rather than the whole game. This allows for more accurate predictions, as there are more possible percentages, making it more likely the real result falls close to the predicted result. This makes for a more accurate and stable rating system overall.
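The difference between the two ways of scoring a result can be illustrated with a short sketch. The function names are made up for the example, and the scorelines other than 10-8 are invented.

```python
def match_outcome(sets_for: int, sets_against: int) -> float:
    """Chess-style result: 1 for a win, 0.5 for a draw, 0 for a loss."""
    if sets_for > sets_against:
        return 1.0
    if sets_for < sets_against:
        return 0.0
    return 0.5


def set_share(sets_for: int, sets_against: int) -> float:
    """Fraction of sets won, which is the finer-grained result."""
    return sets_for / (sets_for + sets_against)


# A narrow 10-8 win and a 17-1 thrashing look identical as whole-match
# results, but very different as a share of sets.
for score in [(10, 8), (17, 1), (9, 9)]:
    print(score, match_outcome(*score), round(set_share(*score), 2))
```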
Does the rating reset at any point?
It does not. The aim of the system is to produce accurate ratings, and the best way to do that is to feed in more data. It also means that you are comparing the current team with its previous versions. And that’s what this rating system is about: watching the trends in a team's performance. For example, this would be highly applicable to a university team that may lose players at the end of the season. Their subsequent new team could be better or worse, and this will be reflected in their trend over time.
We enter mixed ability teams rather than full strength teams. Will that have an effect on our rating?
Yes, this is the biggest problem with the system as it currently stands. Several teams have taken some big rating hits because the team that turns up over- or under-performs relative to what the rating expects. If I am informed of this, I can introduce a modifier so that this bias affects the rating less.
The other major problem is collaboration teams. For these, I start with a rating that is closer to the real value by taking the average rating of the parent clubs. For example, for Dripping in Finesse, a team that included Amie, Elisa, Amanda, Immy (ofc), Devah and Evie, I used the ratings of Spartans, Queen Bees, Beagles and Stafford, i.e. each player's most recent club with a current rating. The predicted rating was 1209 rather than 1000 going into the North West Open. This meant the predicted results would be closer to the actual results, but it is still not an ideal solution to the problem.
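As a rough sketch of that averaging step: the ratings below are invented purely for illustration (the real club ratings behind the 1209 figure are not given here), and the choice to fall back to 1000 when no parent rating exists is an assumption based on the starting rating for new teams.

```python
DEFAULT_RATING = 1000.0  # starting rating for a brand-new team, per the article


def collaboration_rating(parent_club_ratings: list[float]) -> float:
    """Starting rating for a collaboration team: mean of its parent club ratings.

    Assumption: falls back to the default when no parent club has a rating.
    """
    if not parent_club_ratings:
        return DEFAULT_RATING
    return sum(parent_club_ratings) / len(parent_club_ratings)


# Hypothetical parent club ratings; not real Beverz values.
print(collaboration_rating([1300, 1250, 1150, 1100]))  # 1200.0
```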
Finally, can I ask what my team's rating is?
I know it's a burning question for you all, but you'll have to be patient and wait for the first league event for the grand reveal. But here is a taster for now!
As we expected, Bath Bombers have had the biggest increase in rating out of everyone, due to a strong inaugural season. Another good indication that this system is working is that Rhondda and Norwich have both entered seven tournaments/league rounds and their ratings have not fluctuated much. This suggests their ratings reflect their current ability.
So, what next?
This is a bit of fun and has not been created for teams to live and die by their rating. So please just play your normal game and let the ratings do their thing.
Over the course of the season I will release predictions and let you know about any movers and shakers. And of course, it would be great to hear your comments and feedback!