In the soccer Twitterverse and Blogosphere the San Jose Earthquakes come in for a lot of stick as a dirty team. A Google search of the following four words "San Jose Earthquakes dirty" turned up the following:
"San Jose play dirty, though, so I don't look for much to change."
"The San Jose Earthquakes have acquired the Supporters Shield playing cheap and dirty."
"San Jose plays somewhere between excessively aggressive and dirty. Their elbows are usually up, they grab and push, and they have no qualms about running through opponents."
So much so that in 2012 Frank Yallop felt the need to defend his squad in the press:
"Getting across people and being strong, the last time I looked - and I played the game for a number of years - you're actually allowed to do that."
A few of us science and engineering (some might say nerdy) types were sitting on the bleachers of Buck Shaw after our ritual pre-game tour of the Curry Up Now and Speedy Panini food trucks. After we settled whose turn it was to buy the beer, the postprandial discussion turned to the San Jose Earthquakes' reputation around the league as a dirty team. After a few chugs, we began to wonder how one might set about proving this reputation to be true - or not.
We came up with a strategy to determine what we named The Dirty IndexTM, a hitherto uninvented (we believe) calculation that would incorporate player statistics over the season. By the time we drained our glasses of ale, we had come up with a formula based on minutes played, fouls committed, yellow cards and red cards, using the following assumptions to calculate our Dirty IndexTM:
- A player gets yellow carded for about every three fouls (our consensus estimate; n = 8 beer-buzzed fans), so we decided to multiply the number of yellow cards by a factor of three.
- Naturally, since red cards are awarded for a second yellow, we would multiply the number of red cards by a factor of six. These multiplication factors would be our method to account for the intensity of the fouls committed.
- We wanted to prorate the dirty activity of the teams on a per game basis (90 min) and factored in the minutes played by each of the eligible players in the 2013 MLS season.
- It was assumed that any referee selection bias, home and away effects, climate and altitude effects would average out over the course of a season.
- We could not distinguish between a red card for a leg-breaking tackle and one for a weak second yellow. We also did not separate out yellow cards for taking off jerseys. Our reasoning was that there are so few red cards and jersey-removing yellows relative to fouls and yellows overall that they are not statistically significant over the course of the whole season.
Our data set was pulled from the season stats for each player on every team for the 2013 season, available from the MLS web site http://www.mlssoccer.com/stats/season. Slim wrote a script (the computer geek kind) that searches the page source to extract the data from each page, which Sue and Nerdy then entered into Microsoft Excel.
Players who had played more than 180 minutes were considered eligible for inclusion - equivalent to 2 out of the 34 games on the 2013 MLS schedule. The team stats for minutes played (M), fouls committed (FC), yellow cards (YC) and red cards (RC) were collated and the Dirty IndexTM (DI) was calculated on a per game basis (on the back of a greasy napkin) as follows:
DI = (FC + (YC x 3) + (RC x 6)) x (90 / M)
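For anyone who would rather not use a greasy napkin, here is a minimal Python sketch of the formula above. The `PlayerSeason` class, the function names and the sample players are all made up for illustration, and pooling a team's totals before dividing by minutes is our reading of "collated", not necessarily the exact spreadsheet layout.

```python
from dataclasses import dataclass

MIN_MINUTES = 180  # eligibility cutoff: more than 180 minutes played


@dataclass
class PlayerSeason:
    """One player's season line (hypothetical container, not an MLS data format)."""
    name: str
    minutes: int   # M
    fouls: int     # FC
    yellows: int   # YC
    reds: int      # RC


def dirty_index(minutes, fouls, yellows, reds):
    """DI = (FC + 3*YC + 6*RC) * (90 / M), i.e. weighted infractions per 90 minutes."""
    if minutes <= 0:
        raise ValueError("minutes played must be positive")
    return (fouls + 3 * yellows + 6 * reds) * 90 / minutes


def team_dirty_index(players):
    """Pool the stats of eligible players, then apply the DI formula to the totals.

    Assumption: team DI is computed from summed stats over summed minutes.
    """
    eligible = [p for p in players if p.minutes > MIN_MINUTES]
    return dirty_index(
        sum(p.minutes for p in eligible),
        sum(p.fouls for p in eligible),
        sum(p.yellows for p in eligible),
        sum(p.reds for p in eligible),
    )


# Invented players, purely to exercise the arithmetic.
squad = [
    PlayerSeason("Player A", minutes=900, fouls=20, yellows=3, reds=1),
    PlayerSeason("Player B", minutes=1800, fouls=10, yellows=2, reds=0),
    PlayerSeason("Player C", minutes=90, fouls=5, yellows=1, reds=0),  # not eligible
]

print(round(dirty_index(900, 20, 3, 1), 3))  # Player A alone: 3.5
print(round(team_dirty_index(squad), 3))     # A and B pooled: 1.7
```

Player C drops out because 90 minutes does not clear the 180-minute bar, so the team figure comes from 51 weighted infractions over 2,700 minutes.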
All it took was a few cups of coffee, a dataset of 462 players and a modicum of understanding of basic statistics to turn up some interesting and unexpected observations.
The DI values for each team ranged from 1.327 for the 'cleanest' team, the LA Galaxy (sorry Quakes fans, it's true - and we did double-check the math), to 1.908 for the 'dirtiest' team, the 2013 MLS Cup finalists Sporting Kansas City. The median DI value of 1.627 was best represented by FC Dallas. The results for all of the infractions by each team are presented in Figure 1.
Figure 1: Dirty IndexTM by MLS Team for 2013.
Here are some of our favorite highlights from the 2013 season:
- Top of the league in red cards was the Seattle Sounders (10). DC United was the only team that did not acquire any red cards through the entire 2013 season.
- Top of the league in yellow cards was MLS Cup finalists Sporting Kansas City (66; Oriol Rosell & Aurelien Collin combined for 120 fouls [of 504] and 23 Yellows [of 198]) and the fewest went to the LA Galaxy (35; hard to fathom for a team fielding Marcelo Sarvas with a league-leading 80 fouls).
- There's no apparent correlation between the success of a team and its Dirty IndexTM - Sporting KC and Chivas USA had very similar Dirty Indices.
After our first steps onto the field, our next question was: Is there any difference in the DI of the various positions played? A quick data sort organized the data by defender, midfielder, forward and goalkeeper (if a player was listed at two positions, the first position listed was used). The results for each position are presented in Figure 2.
Figure 2: Dirty IndexTM by field position in the 2013 MLS season.
Not surprisingly, goalies were the cleanest players, with only 4 red cards among the 34 players throughout the season - their DI was a pristine 0.167. Midfielders came top with a DI of 1.96, followed closely by the forwards with a DI of 1.864. As a corollary, we noticed that the ratios of minutes played indicate that the overall MLS formation for 2013 is roughly 4-4-2.
What about the original question of the Earthquakes fans: just how dirty is our team? Let's break down their ranking in the calculations that we've just presented. Ranking the dirtiest team as #1 of the 19 teams, here's where the San Jose Earthquakes place:
With the 6th position on the Dirty IndexTM, it appears that the Quakes are a little dirtier than average, but they do lie within the middle third of the range of DI stats (Range: 1.521-1.714). The Quakes' DI also lies within one standard deviation of the median (Range: 1.465-1.791) - a range that excludes the three cleanest (LA, HOU, VAN) and three dirtiest teams (KC, CHV, TOR) as outliers. Thus, the Earthquakes are more average than outlier - as defined by being in the middle third or within one standard deviation.
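The "middle third" band quoted above can be rebuilt from the two extreme team values given earlier (1.327 for the LA Galaxy and 1.908 for Sporting KC). This is a small sketch of that arithmetic only; the helper name `middle_third` is ours.

```python
def middle_third(lowest, highest):
    """Return the bounds of the middle third of the interval [lowest, highest]."""
    third = (highest - lowest) / 3
    return lowest + third, highest - third


# Cleanest and dirtiest team DI values from the article.
low, high = middle_third(1.327, 1.908)
print(f"{low:.3f} to {high:.3f}")  # 1.521 to 1.714
```

Any team DI between those two bounds sits in the middle third of the league's spread.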
Undeniably, the number of Earthquakes red cards is high and ranked fourth for the 2013 season. Of course, a sending off is a much more noticeable event during a game - usually it's a rare event. Overall though, the high number of red cards earned by the Earthquakes is balanced out by the lower than average number of fouls (15th of 19), and a roughly average number of yellow cards (8th of 19).
So, though we cannot argue against their higher than average Dirty IndexTM, it's not significantly different than that of the majority of MLS teams. While the reputation for dirty play of the San Jose Earthquakes is not fully deserved (we feel), the high visibility of their red card-worthy infractions likely contributes to their reputation among opposing fans.
So, what is the reputation of the five dirtier teams: New England Revolution, Columbus Crew, Toronto FC, Chivas USA and Sporting KC? Repeating the Google search of each team name with the word 'dirty', just as we had done for the Earthquakes, turned up no similar comments on the cleanliness of each team - by all means let us know if we've missed any. Sporting KC have already shown us that they might repeat as dirtiest MLS team in 2014, after three yellow cards and 25 fouls in the season opener in Seattle - the highest number of fouls that the Seattle Sounders have ever suffered in a single game. Prorated for 34 games, Sporting KC would commit 850 fouls for the 2014 season, easily surpassing the 504 total for 2013.
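The 850 figure is a straight-line projection from a single game, which a couple of lines of Python can reproduce. The `prorate` helper is a hypothetical name, and it assumes every remaining game looks like the games already played (a generous assumption after one match).

```python
def prorate(total_so_far, games_played, season_games=34):
    """Scale a running total up to a full season at the current per-game rate."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total_so_far * season_games / games_played


projected_fouls = prorate(25, 1)  # 25 fouls in the season opener
print(projected_fouls)            # 850.0
print(projected_fouls > 504)      # True: above the 2013 total of 504
```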
In summary - there was a great deal of banter and fun arguing statistically significant and insignificant statistics. We invite MLS fans to enjoy their own debate of how these variables should be managed, and preferably over a glass of their preferred adult beverage.
As far as these three fans of the San Jose Earthquakes are concerned, we believe we now have an inkling that the red cards earned by the Earthquakes might explain the reputation of our team, which has been singled out around the rest of MLS as a dirty team. Stay tuned to Quake Rattle and Goal for The Dirty IndexTM Part 2: the Bash Brothers, in which we analyze the individual statistics for some of your heroes and nemeses. And yes, it is statistically certain that Steven Lenhart's name will come up.
Authors: Nerdy Gales, Sue Lull-Berms and Slim Rivet-Gore