Dave Laidig is back with the second part of his series looking at the use of statistics and numbers in soccer. For Part I, click here.

In Part 1, we reviewed Opta’s Castrol Index ratings of MLS players and an adjusted index that minimizes statistical anomalies. In this part, we use the objective performance data (i.e., the Castrol Index and the adjusted ratings) to analyze the relative contribution of the different positions, and to serve as the basis for determining the value of each position.

The beginning of Part 2 relies heavily on the work of Benjamin Leinwand and Chris Anderson, available at http://www.soccerbythenumbers.com. In short, I attempted to replicate their work and then extend it to new areas. I am indebted to their efforts, and appreciate their willingness to share their results with the public, and to start a conversation that consumes large chunks of my free time.

Using both the Castrol Index and my adjusted scores, I set about replicating the Leinwand-Anderson analysis. Their analysis created a regression equation using a team’s average Castrol Index rating for each position and the team’s league points (i.e., a mathematical model roughly summarized as: Constant + Fwd Rating + Mid Rating + Def Rating + GK Rating = Expected League Points, with each rating multiplied by its own regression coefficient).
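Written out with explicit coefficients, a model of this form would look roughly like the equation below. The symbols are illustrative labels chosen here, not notation taken from the Leinwand-Anderson analysis: the barred terms are team t’s average rating at each position, and the betas are the fitted regression coefficients.

```latex
\[
\widehat{\mathrm{Pts}}_t
  = \beta_0
  + \beta_F \,\overline{F}_t
  + \beta_M \,\overline{M}_t
  + \beta_D \,\overline{D}_t
  + \beta_G \,\overline{G}_t
\]
```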

The Leinwand-Anderson analysis reported that Defenders have higher average Castrol Index scores than the other positions. The Defender position was also the only one significantly related to league points in a multiple regression equation that included a team’s average Forward, Midfielder, Defender, and GK ratings (which are 7.07, 7.09, 7.59, and 6.53 respectively). Considering that forwards are paid more than defenders, this suggests that investing in defenders may be a more productive use of resources. Using publicly available data, I was able to recreate their result, although some of my interim calculations were a touch different. Replicating their process as best I could, I obtained an R-squared of .62, meaning this model explains about 62% of the variation in a team’s league points. Perhaps more importantly, and as originally reported, only one position group had a significant relationship with the league table.
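For readers who want to see the mechanics, here is a minimal sketch of fitting this kind of model and computing R-squared with NumPy. It is not the code used for the analysis above. The function name `fit_ols` is hypothetical, and the ratings and points are synthetic placeholders (not MLS data), built so that points are an exact linear function of the ratings.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares.

    Returns (coefficients, r_squared); coefficients[0] is the constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes a constant term.
    design = np.column_stack([np.ones(len(y)), X])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)    # variation left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in points
    return coef, 1.0 - ss_res / ss_tot


# Synthetic placeholder data, NOT real ratings: 18 hypothetical teams,
# columns are average Fwd, Mid, Def, and GK ratings.
rng = np.random.default_rng(0)
demo_ratings = rng.uniform(6.0, 8.0, size=(18, 4))

# Points constructed as an exact linear function of the ratings,
# so the fit should recover these made-up coefficients.
true_coef = np.array([10.0, 2.0, 1.0, 3.0, 0.5])
demo_points = true_coef[0] + demo_ratings @ true_coef[1:]

coef, r_squared = fit_ols(demo_ratings, demo_points)
print(coef, r_squared)
```

Because the placeholder points contain no noise, the fit here is essentially perfect. With real league data the unexplained variation is not zero; an R-squared of .62 corresponds to the residual sum of squares being 38% of the total.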

As stated in Part 1, I believe the Castrol Index can be improved, especially among players with less playing time. Accordingly, I ran my adjusted index scores through the same process as the Leinwand-Anderson analysis. Notably, defenders still had a higher average rating than midfielders and forwards (7.89 versus 7.58 and 7.67 respectively), but the position averages moved closer together under the adjusted index. Using the adjusted ratings, the R-squared value was .72, which is 10 percentage points higher than the model using the original Castrol Index ratings. Consequently, the adjusted ratings were a better predictor of team performance.

Next, I attempted to address a soccer-specific feature of the regression equation. The Leinwand-Anderson equation treats Forwards, Midfielders and Defenders as equal units. But we know that, depending on the formation and game situation, there are uneven numbers at each position. Thus, I modified the equation to account for the relative time contribution of each position. To do this, I took the average position score from the first analysis and weighted it by that position’s share of the overall team minutes. As a math illustration, the LA Galaxy forwards may have an adjusted index average of 7.17 and contribute 20% of the team’s minutes. I would then report .2 * 7.17, or 1.43, as that position’s point contribution to the team.
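The weighting step can be sketched in a few lines. The function name `position_contribution` is a hypothetical label for illustration; the inputs are the illustrative LA Galaxy figures from the paragraph above.

```python
def position_contribution(avg_rating, minutes_share):
    """Weight a position's average rating by its share of team minutes.

    minutes_share is a fraction between 0 and 1 (e.g. 0.20 for 20%).
    """
    if not 0.0 <= minutes_share <= 1.0:
        raise ValueError("minutes_share must be between 0 and 1")
    return avg_rating * minutes_share


# Illustrative figures from the text: forwards rated 7.17 on average,
# playing 20% of the team's minutes.
galaxy_forwards = position_contribution(7.17, 0.20)
print(round(galaxy_forwards, 2))  # 1.43
```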

By weighting each position by playing time, the multiple regression equation for the position’s adjusted point contribution was significant for Forwards, Midfielders, and Defenders, with a non-significant p-value of .14 for Goalkeepers. Thus, this model permits analysis of all field positions, not just defenders. In this model, defenders have a larger regression coefficient than midfielders, who in turn have a larger coefficient than forwards. This would support the notion that defenders contribute more to wins than other positions. Further, the R-squared value was .78; by explaining 78% of the variance in league points, this is the model with the greatest explanatory power (compared to .62 for the Leinwand-Anderson model and .72 for the Leinwand-Anderson model using adjusted index ratings).

Although this mathematical model shows each position’s contribution to league points, the model is clumsy for team use. Contracts are determined player by player, not position group by position group. Thus, to make the equation useful, the regression coefficients need to reflect the value of one player. Consequently, I considered league data and determined that forwards accounted for 19.0% of all league minutes, midfielders 39.5%, defenders 32.2%, and goalkeepers 9.1%. With eleven players on the field, I calculated how many “players” were assigned to each position group. Using forwards as an example, 19.0% of eleven is about 2.1 players’ worth of league minutes, and that figure was applied to the position’s regression coefficient.
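The conversion from minute shares to “players” is simple arithmetic, sketched below using the league shares quoted above. The names `LEAGUE_MINUTE_SHARES` and `player_equivalents` are hypothetical labels, and the sketch covers only this step, not how the result is then applied to the regression coefficients.

```python
# League-wide share of minutes by position, as quoted in the text.
LEAGUE_MINUTE_SHARES = {
    "forward": 0.190,
    "midfielder": 0.395,
    "defender": 0.322,
    "goalkeeper": 0.091,
}
PLAYERS_ON_FIELD = 11


def player_equivalents(shares, players_on_field=PLAYERS_ON_FIELD):
    """Convert each position's share of minutes into a number of 'players'."""
    return {pos: share * players_on_field for pos, share in shares.items()}


equivalents = player_equivalents(LEAGUE_MINUTE_SHARES)
for position, count in equivalents.items():
    print(f"{position}: {count:.2f} players")
```

Note that the quoted shares add up to 99.8% rather than 100%, presumably because of rounding, so the equivalents total slightly less than eleven.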

As a result, the estimated impact on a team’s league points of inserting a field player with a higher adjusted index score can be calculated. Recall that in our model the defender position group had the highest contribution to league points. But when comparing a single player to a single player, the value of a forward was greater than that of a defender. This result is due in large part to the defender group consisting of more players than the forward group; the defender’s contribution is diluted by the extra players incorporated into its effect.

To illustrate, I went back to my adjusted index scores to provide some examples for each position.

Upgrading from a median forward to an 80th percentile forward (represented by Zusi) would be expected to yield an additional 7.98 league points. Upgrading from a median defender to an 80th percentile defender (AJ DeLaGarza) would be expected to yield an additional 4.68 points. And upgrading from a median midfielder to an 80th percentile midfielder (Zach Lloyd) would be expected to yield an additional 3.03 points.
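These per-player estimates can be combined to compare hypothetical roster moves, for example two defender upgrades against one forward upgrade. The sketch below assumes the gains from separate upgrades simply add together, which is a simplification and not something the model guarantees. The names `EXPECTED_GAIN` and `total_gain` are hypothetical.

```python
# Expected extra league points for a median-to-80th-percentile upgrade,
# taken from the figures quoted above.
EXPECTED_GAIN = {"forward": 7.98, "defender": 4.68, "midfielder": 3.03}


def total_gain(upgrades):
    """Sum expected point gains for a dict like {"defender": 2}.

    Assumes gains are additive across players (a simplifying assumption).
    """
    return sum(EXPECTED_GAIN[pos] * count for pos, count in upgrades.items())


two_defenders = total_gain({"defender": 2})
one_forward = total_gain({"forward": 1})
print(f"Two defender upgrades: {two_defenders:.2f} points")
print(f"One forward upgrade:   {one_forward:.2f} points")
```

Under that additive assumption, the two defender upgrades come out slightly ahead of the single forward upgrade; whether that is the better deal depends on what each upgrade costs.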

This model can also be adapted to a team’s own system of player valuation (instead of the adjusted index or Castrol Index) by assuming that the player rating distributions are similar. And while a team’s rating distributions may not exactly replicate the adjusted index, this method provides structure for roster management decisions.

Upon review, one is justified in saying defenders contribute more to wins. On closer inspection, we see less variability among defenders compared to other positions. And on a player-by-player basis, one can justify paying more for a forward upgrade because of the greater expected results. In sum, forwards may be more valuable for their combination of rating and scarcity.

However, this analysis alone does not justify current expenditures. In Part 3, we will consider how salary affects team success and whether salary is associated with player performance.


potter1959, on April 10, 2012 at 8:41 am said: Good post, Dave.

I came to the same conclusion here

http://thepowerofgoals.blogspot.co.uk/2012/02/is-mls-spending-its-money-correctly.html

MLS defenders as a unit probably do contribute more to team success than strikers, but strikers are significantly underrepresented in terms of on-field numbers. The same probably applies to most other leagues.

Also agree that scarcity of striking talent is a factor.

Looking forward to part 3.

Mark

Dave L, on April 10, 2012 at 4:28 pm said: Mark, thanks for the link. I look forward to checking it out.

And position descriptions are only part of the analysis. One needs to use the adjusted index values (which applies to about 75% of the players), and then compare options among actual players. Because money does not track with performance, great value is possible without resorting to generic position labels. But speaking in generalities, there is a good chance that two Defender upgrades (median to 80+ percentile) would be cheaper than one forward upgrade (median to 80+ percentile). The 2 Defenders would expect to yield an extra 9 points, while the one forward would yield about 8 points.

I enjoy playing with the hypothetical moves, but I think the important point is to structure player management decisions (using whatever evaluation criteria a team will accept) in a way that supports the desired outcome. And if a desired player’s cost is greater than the benefit on wins – he better be able to sell jerseys and put bottoms in seats in order to justify the expenditure.

Jersey1, on April 10, 2012 at 6:42 pm said: This makes sense in a soccer sense, but fans don’t buy Sean Franklin jerseys, they buy Donovan jerseys. These clubs are a business after all, so they have to make some of these moves with an eye towards the bottom line, especially since seat sales in MLS don’t always track success on the field.
