AYSO Soccer's Mistake with Team Balancing:
Why the Fireballs' 36 Points Doesn't Equal the Dragons' 36 Points
or
Why the Dragons beat the Fireballs 9-0
by Steve Hampton
Davis, CA
November, 2003 (revised October, 2009)
One of the main tenets of AYSO soccer is "balanced teams". Despite this,
Davis AYSO has had problems achieving this goal. In the past (pre-2003), the
problem was so chronic that the league was notorious as being one of the
most un-balanced leagues in town. Most families I know experienced
dismal or dominating seasons, with few competitive games. In contrast,
the local baseball little league, where players try out and coaches draft
players, seems to be much more successful. After 2003, AYSO adopted the
"sprinkling" approach promoted here, which improved things considerably.
However, many still don't understand it, and the annual comparison of teams' "average" rankings
suggests Davis may slide back into problems.

It's another slow day for the Dragons' keeper.
Will he get to touch the ball this quarter?
Two Main Methodologies for Team Balancing
The current team balancing method is for coaches to rate players 1.0
through 5.0, at 0.5 intervals, with 5.0 being the very best players.
Ignoring the obvious problem of variability in coach ratings, let's
assume the coaches generally get it right and understand exactly who
is a "3", who is a "4", and so on. The real problem comes after the
ratings, when the teams are formed. There are two general approaches
to using these ratings in team formation:
· the aggregate sum approach
· the sprinkling approach
In the aggregate sum approach, players are assigned to teams (typically
by a computer) such that each team ends up with the same total number of
points. For example, the Dragons may have players rated 5-5-5-4-3-3-3-2-2-2-1-1 =
36 points, while the Fireballs may end up with players rated
4-4-4-3-3-3-3-3-3-2-2-2 = 36 points. These would be considered balanced.
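As a quick check of this example, the short Python sketch below adds up the two rosters and then counts how many players each team has at each rating. The variable names are illustrative only; the ratings are the ones given above.

```python
from collections import Counter

# Example rosters from the text above.
dragons = [5, 5, 5, 4, 3, 3, 3, 2, 2, 2, 1, 1]
fireballs = [4, 4, 4, 3, 3, 3, 3, 3, 3, 2, 2, 2]

# The aggregate sums match...
print(sum(dragons), sum(fireballs))  # 36 36

# ...but the makeup of the teams does not.
for name, roster in (("Dragons", dragons), ("Fireballs", fireballs)):
    counts = Counter(roster)
    print(name, {rating: counts[rating] for rating in (5, 4, 3, 2, 1)})
```

The totals agree, yet the Dragons have three 5's and the Fireballs have none.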
In the sprinkling approach, players within each rating are sprinkled across
teams (often manually) in a snake-like fashion. First the 5's are passed
out, then the 4's, and so on. Unrated players are also spread evenly across
teams. The net result is similar to the aggregate sum approach, in that all
teams end up with similar aggregate point values. The difference here is
that the team compositions are also similar.
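One way to picture the sprinkling approach is the minimal Python sketch below. It is an assumption about how a snake-style deal might be coded, not the procedure any league actually uses: the function name `sprinkle` and the choice of one continuous snake over the whole sorted list are illustrative. Unrated players are represented as `None` and dealt out last.

```python
def sprinkle(ratings, num_teams):
    """Deal players to teams in snake order, highest rating first.

    `ratings` holds one entry per player: a number, or None if unrated.
    """
    rated = sorted((r for r in ratings if r is not None), reverse=True)
    unrated = [r for r in ratings if r is None]
    teams = [[] for _ in range(num_teams)]
    for i, rating in enumerate(rated + unrated):
        pass_no, pos = divmod(i, num_teams)
        # Reverse direction on every other pass (the "snake").
        idx = pos if pass_no % 2 == 0 else num_teams - 1 - pos
        teams[idx].append(rating)
    return teams

# The 24 players from the Dragons and Fireballs example, pooled together.
pool = [5] * 3 + [4] * 4 + [3] * 9 + [2] * 6 + [1] * 2
team_a, team_b = sprinkle(pool, 2)
for rating in (5, 4, 3, 2, 1):
    print(rating, team_a.count(rating), team_b.count(rating))
```

For this pool, the two teams end up within one player of each other at every rating. A single continuous snake does not guarantee that for every possible pool, so a careful implementation would check the per-rating counts afterward.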
Two Problems with Adding Up Ratings
The sprinkling approach is vastly superior to the aggregate sum
approach, which is deceptive for the following two reasons.
1. The Treatment of Ordinal Rankings
In the most general sense, the player ratings are ordinal rankings.
That is, the ratings simply put the players into ordered categories
that happen to have numbers as labels. They do not necessarily imply
that the numbers have meaning as values and are related to each other
in a linear fashion. All we know is that the 5's are more valuable
than the 4's, the 4's are more valuable than the 3's, and so on.
However, we do not necessarily know how much more valuable each rating
is than the one below it. For example, it may not be true that a 5 is
20% better than a 4, a 4 is 20% better than a 3, and so on. One could
posit that a 5 is a dominant player who can virtually always beat a 4 to
get off a shot, or stop a 4 on the defensive end, and is thus more like
a 7 or an 8 relative to a 4, in terms of actual value to the team.
Figure 1 shows some possible relationships between player rating categories
and actual value to a team.
The aggregate sum approach, by treating the ratings as numbers,
inherently imposes a linear relationship on them, thus assuming
path A. In fact, we don't know the actual relationship between the
player ratings and their actual value to a team. Under the aggregate
sum approach, if path B or C is more accurate, serious team imbalances
may occur.
As an example, let's return to the two teams described earlier.
If path A is true, then the ratings reflect precisely their numerical
value to the team (e.g., a 5 contributes 20% more than a 4, etc.; or,
put another way, trading a 5 and a 2 for a 4 and a 3 is an equal trade).
However, let's also evaluate the situation in which path B is a more
accurate representation of reality.
DRAGONS | | | FIREBALLS | | |
Player Rating | Actual Value (Path A) | Actual Value (Path B) | Player Rating | Actual Value (Path A) | Actual Value (Path B) |
5 | 5 | 8 | 4 | 4 | 4 |
5 | 5 | 8 | 4 | 4 | 4 |
5 | 5 | 8 | 4 | 4 | 4 |
4 | 4 | 4 | 3 | 3 | 2 |
3 | 3 | 2 | 3 | 3 | 2 |
3 | 3 | 2 | 3 | 3 | 2 |
3 | 3 | 2 | 3 | 3 | 2 |
2 | 2 | 1 | 3 | 3 | 2 |
2 | 2 | 1 | 3 | 3 | 2 |
2 | 2 | 1 | 2 | 2 | 1 |
1 | 1 | 0.2 | 2 | 2 | 1 |
1 | 1 | 0.2 | 2 | 2 | 1 |
Total: | 36 | 37.4 | Total: | 36 | 27 |
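The totals in the table can be reproduced with a few lines of Python. The path B values (8, 4, 2, 1, 0.2) are taken from the table; the dictionary and function names are illustrative only.

```python
# Value of each rating under the two paths, as listed in the table.
PATH_A = {5: 5, 4: 4, 3: 3, 2: 2, 1: 1}
PATH_B = {5: 8, 4: 4, 3: 2, 2: 1, 1: 0.2}

dragons = [5, 5, 5, 4, 3, 3, 3, 2, 2, 2, 1, 1]
fireballs = [4, 4, 4, 3, 3, 3, 3, 3, 3, 2, 2, 2]

def team_value(ratings, path):
    """Sum the actual value of a roster under a given rating-to-value path."""
    return sum(path[r] for r in ratings)

print(team_value(dragons, PATH_A), team_value(fireballs, PATH_A))  # 36 36
# Rounded to one decimal place to hide floating-point noise from the 0.2 values.
print(round(team_value(dragons, PATH_B), 1), team_value(fireballs, PATH_B))  # 37.4 27

# The "equal trade" of a 5 and a 2 for a 4 and a 3 under each path.
print(PATH_A[5] + PATH_A[2], PATH_A[4] + PATH_A[3])  # 7 7
print(PATH_B[5] + PATH_B[2], PATH_B[4] + PATH_B[3])  # 9 6
```

Under path A the trade is even; under path B the side receiving the 5 comes out well ahead.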
It's not difficult to see the outcome if indeed path B is more realistic.
In this case, the three 5's on the Dragons dominate the game and give
their team a lopsided advantage over the Fireballs. The Fireballs,
lacking any 5's, may have match-up problems as well and be all the
more hard-pressed to defeat teams with 5's.
The sprinkling approach inherently avoids this problem of not knowing
the relationship between rating and value. With the sprinkling approach,
paths A, B, or C may represent reality and the teams will still be balanced.
The key advantage here is that we don't have to know which path is correct:
the method works for any functional form (any potential path).
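The reason can be written out in one line. The notation here is an assumption added for illustration and does not appear in the original analysis: let v(k) be the unknown actual value of a player rated k, and let n_{T,k} be the number of players rated k on team T.

```latex
V_T = \sum_{k} n_{T,k}\, v(k),
\qquad
n_{A,k} = n_{B,k} \ \text{for all } k
\;\Longrightarrow\;
V_A = V_B \ \text{for any } v .
```

The aggregate sum approach only ensures that the rating totals match, that is, that the sum of n_{A,k} k equals the sum of n_{B,k} k. That implies V_A = V_B when v is linear in k (path A), but not in general.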
2. The Treatment of Unrated Players
Whenever the focus is on the aggregate sum of the ratings, there is a
question of how to treat players who have no prior rating.
Players with no rating are typically assigned a specific point value
(both 3 and 0 have been used by Davis AYSO).
There are two significant problems with this: 1) we have no idea what
their actual rating should be (my own experience suggests that using 3
is too high and 0 is obviously too low); 2) there is a much larger
variance (i.e. degree of uncertainty) associated with this category
than with any other category (e.g., I've seen unrated players turn out
to be everything from 1 to 5). With the aggregate sum approach, the
method is blind to the number of unrated players assigned to any one
team (I was once given 6 unrated players on a team of 12). Obviously,
putting a disproportionate number of these players on any one team
increases the uncertainty surrounding that team. The goal here is to
spread the uncertainty as thinly as possible. Under the sprinkling
approach, unrated players are spread evenly across all teams.
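A rough sketch of how that uncertainty scales is given below. It assumes, purely for illustration, that each unrated player could turn out to be anything from a 1 to a 5, as described above; the function name and the comparison figure of 2 unrated players per team are hypothetical.

```python
def outcome_range(num_unrated, low=1, high=5):
    """Return the lowest and highest combined rating that a group of
    unrated players could turn out to have (illustrative 1-to-5 range)."""
    return num_unrated * low, num_unrated * high

# 6 unrated players on one team (the case mentioned above) versus a
# hypothetical 2 per team if they were spread evenly.
for n in (6, 2):
    lowest, highest = outcome_range(n)
    print(f"{n} unrated: anywhere from {lowest} to {highest}, "
          f"a spread of {highest - lowest}")
```

The spread grows in direct proportion to the number of unrated players on a team, which is why handing one team a large share of them makes that team's strength so hard to predict.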
Conclusions and Recommendations
Since 2004, Davis and most other AYSO leagues around the nation have used the sprinkling method.
However, most of these leagues still mention re-balancing
measures they may take (for example, when players drop or join) and
describe the aggregate sum of the player ratings as their primary criterion
for deciding whether or not a team is balanced. In using this criterion,
they fall into the trap of assuming path A is the correct one.
I offer the following recommendations:
1. Strictly employ the sprinkling approach, taking care that each
team has, to the extent possible, equal numbers of 5's, 4's, 3's, 2's, 1's,
and unrated players.
2. Do not fall into the trap of summing or averaging the player ratings. To avoid
this mistake, rate the players A through E rather than 1 through 5. With the
sprinkling approach, simply spread each category across the teams.
3. Treat unrated players as unrated players. Do not assign a point
value or rating of any kind to them. Perhaps call them U's. Spread
them evenly across teams.
4. Because there are usually two age groups in each division,
consider creating more rating categories by dividing the A through E's
into A-E "first year in division" (where their rating comes from
play in a younger division) and A-E "second year in division". (This is the current practice in Davis.)
As long as
there are at least as many players in each rating category as there
are teams, more categories can be created.
Send correspondence to Steve Hampton at hamptons_at_sbcglobal.net
About the Author:
Steve Hampton grew up playing recreational soccer and has been coaching AYSO soccer
since 1997 (U6 through U19). When not thinking about team defensive
strategies, he ponders how math affects the lives of children.