…often cause wild discussions. There are many chess engine rating lists, and they differ somewhat. The problem is that chance has a large statistical influence on the results.
To cope with that problem, Prof. Arpad Elo invented a rating system for chess tournaments in 1960. The Elo system makes it possible to estimate how reliable a tournament result is.
For example, when comparing two approximately equal engines, a result of 7-3 does not necessarily mean that the first chess engine is stronger. On the contrary, a tournament of just 10 games is not even an indication; its outcome is dominated by chance. To get meaningful results, hundreds of games must be played.
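To make the 7-3 example concrete, here is a minimal Python sketch. It rests on simplifying assumptions that are not part of the text above: draws are ignored, every game is independent, and two equal engines each win with probability 0.5. The function names prob_at_least and elo_difference are hypothetical and do not come from any rating tool; the Elo conversion uses the common logistic formula with a scale of 400.

```python
from math import comb, log10


def prob_at_least(wins: int, games: int, p: float = 0.5) -> float:
    """Probability of winning at least `wins` of `games` when each game is
    won independently with probability `p` (assumption: no draws)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


def elo_difference(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1),
    using the usual logistic Elo formula with a scale of 400."""
    return 400 * log10(score / (1 - score))


# Chance that one of two EQUAL engines scores 7 or more out of 10.
print(prob_at_least(7, 10))         # 0.171875

# The same 70% score over 100 games is far less likely to be luck.
print(prob_at_least(70, 100) < 0.001)  # True

# Taken at face value, a 70% score would suggest this Elo gap.
print(round(elo_difference(0.7)))   # 147
```

Under these assumptions, an engine of exactly equal strength still reaches 7-3 or better in about 17% of 10-game matches, so the apparent gap of roughly 147 Elo points cannot be trusted. Over 100 games the same percentage score would be very unlikely to arise by chance.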
The following rating lists are based on large numbers of games and also follow certain additional conditions to make the results even more reliable: