Superb video here.
Archive for January, 2013
January camp is likely to begin sometime next week, with USSF expected to release a full roster perhaps as early as Monday.
Here are a few bets on the roster: (A * denotes the invitation/attendance has already been confirmed.)
G: Nick Rimando, Sean Johnson, Bill Hamid
DEF: Matt Besler, Omar Gonzalez, George John, Steven Beitashour, Sheanon Williams, Heath Pearce, Amobi Okugo, Chance Myers
MID: Alfredo Morales*, Kyle Beckerman, Alejandro Bedoya, Graham Zusi, Dax McCarty, Nick DeLeon
FW: Chris Pontius, Freddy Adu, Josh Gatt, Benny Feilhaber, Mix Diskerud, Brek Shea
STR: Eddie Johnson*, CJ Sapong, Alan Gordon, Juan Agudelo
* Wildcard: Maurice Edu. Edu has played 11 minutes all season for Stoke City. Might he get released to Klinsmann’s mothership for January? Probably not, but…
This piece originally ran in August of 2012
What happens when TSG reaches out to a bunch of nerds in high places who like numbers, triangle passes, Barcelona analogy misappropriations, Dax McCarty data, Roger Espinosa heat maps and deal in things called “machine learning” & “regression analyses?”
A massively data-erotic soccer column on game and player analysis that lies somewhere just left of Shelter Island and A Beautiful Mind, but well beyond Good Will Hunting.
Now our actors around the roundtable. NASA “Curiosity” seekers don’t have squat on these guys. They are:
» Steve Fenn: Steve is a mathematics lunatic (complimentary) and tweets at the poignant handle Optahunt. Steve also now writes for TSG & Big D Soccer … when Opta Chalkboards are not online. Welcome Steve, the John Nash of these proceedings. (Follow on twitter)
» Rui Xu: Rui is currently in his second season as the Performance Analyst with Sporting Kansas City. He graduated from the University of Southern California with a degree in Economics. Rui is like the Stu Ungar of this roundtable, pre-extracurriculars of course. (Follow on twitter)
» Alex Olshansky: Alex runs numbers for an investment fund in LA by day and makes soccer spreadsheets at night. This makes him the Bernie Madoff Blue Horseshoe (about to become the) Warren Buffett of the soccer world. He already is here in this column. Alex also writes for TSG and on his own blog, Tempo Free Soccer. (Follow on twitter)
» Devin Pleuler: Devin is a computer science graduate from Wentworth Institute of Technology in Boston, where he played on the men’s varsity team as a goalkeeper. He writes the Central Winger analytics column for MLSsoccer.com. He’s a TSG alum and now a member of the MIT men’s soccer coaching staff. So he’s obviously Will Hunting here. (Follow on twitter)
TSG: Okay, what is the biggest misconception in what appears to be the growing market of data processing and statistical evaluation in soccer?
Xu: I think the biggest one is the expectation that we’re going to get the granularity that we get in baseball. I don’t think we will, and I don’t think statistics will surpass traditional scouting in importance when it comes to opponent scouting.
The BEST case scenario for soccer statistics is getting to where defensive statistics are in baseball, with Ultimate Zone Rating (UZR) and Defensive Runs Saved (DRS), which are known to take a long time to stabilize and are very iffy to use in small sample sizes.
Olshansky: I don’t know if it is a misconception exactly, but it seems like the analytical community is only just scratching the surface of what we can learn about soccer. The field is in a nascent stage, and I think some perspective regarding just how far there is to go is needed. Graham MacAree of SB Nation had a great treatise on the state of the field and what still must be accomplished.
For example, only recently have some analysts started to realize just how few crosses are actually completed (for some teams only 10%). What does this mean exactly? Is crossing now a bad strategy? It’s just one example of how little we really understand.
Fenn: Underestimating the value of context. Biggest culprit within that issue is pass completion without adjusting for pass difficulty or value. Devin’s recent Central Winger columns have been nice steps toward better understanding.
TSG: What is the one stat–in your expert opinion or through empirical review–that is the most mis-used stat in soccer?
Olshansky: I’m not the first one to say it, nor will I be the last, but possession % is still used far too often to denote who is outplaying whom. Devin has written about how it is often more a defensive statistic than an offensive one and is highly dependent on game situation.
Fenn: For me, the most misused is also the most used. Whether he actually said it or not, I agree with the Jonathan Wilson quote, “Goals are overrated.” Again, context is key, and Sam Green’s work on shot quality is a great way to gain better insight into scoring and goalkeeping.
Pleuler: Possession is undoubtedly the most mis-used stat in soccer. Arguably, this is because it is significantly more nuanced and complicated than it is often made out to be. Since the number of possessions during a game is finite, it is in the best interest of the more efficient team to increase the rate of possessions. This is simply because the larger the sample size, the less variation there is. Conversely, after a team has scored, it’s in their best interest to decrease the number of remaining possessions in the game. The fewer possessions, the fewer chances there will be for your opponent to score.
Barcelona does this by gaining and holding possession for long periods of time. Stoke does this by sitting deep in their own zone and becoming comfortable with their opponents possessing the ball for long periods of time. Barcelona and Stoke are attempting to accomplish exactly the same thing and yet we view one style as artistic and the other as anti-soccer. One of these styles artificially inflates possession percentages, and one artificially deflates them. To talk about possession statistics without acknowledging this kind of context is really really bad.
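Pleuler's sample-size argument can be checked with a little arithmetic. The sketch below is a toy model, not anything from Devin's columns: it assumes both teams get the same number of possessions, that each possession is an independent chance to score, and it uses made-up per-possession scoring rates. The function name `prob_outscores` is hypothetical.

```python
from math import comb

def binom_pmf(n, p):
    """P(k goals in n possessions) for k = 0..n, with scoring chance p per possession."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def prob_outscores(n, p_a, p_b):
    """Chance that team A scores strictly more than team B over n possessions each.

    Assumption: possessions are independent and both teams get exactly n of them.
    """
    goals_a = binom_pmf(n, p_a)
    goals_b = binom_pmf(n, p_b)
    total = 0.0
    b_scores_fewer = 0.0  # running P(B scores fewer than k goals)
    for k in range(n + 1):
        total += goals_a[k] * b_scores_fewer
        b_scores_fewer += goals_b[k]
    return total

# Illustrative rates only: the "efficient" side scores on 2% of possessions,
# the other side on 1%.
for n in (50, 100, 150):
    print(n, round(prob_outscores(n, 0.02, 0.01), 3))

# A team already a goal up loses that lead only if the opponent outscores it
# over the remaining possessions, so fewer remaining possessions helps it.
for n in (20, 100):
    print(n, round(prob_outscores(n, 0.015, 0.015), 3))
```

In this toy model the more efficient side's chance of outscoring its opponent rises as possessions are added, while the chance that a trailing team of equal quality outscores the leader falls as possessions are removed. That is the same incentive seen from both sides of the scoreline.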
Xu: By a GIGANTIC margin, possession percentage. First of all, it doesn’t even describe what it claims to; it actually has nothing to do with ‘time;’ it’s calculated using team passes divided by total passes. The data providers are using pass volume as a proxy for time in possession because it’s easier to calculate. Graham MacAree has a great article on that here.
Secondly, nobody ever provides any context on possession percentage. Statements like “When the score is tied, the average team’s possession percentage is 50%. When they’re up by one, the average team’s possession is 40%. When they’re down by one, the average team’s possession is 60%” are actually pretty easy to find, and they tell WAY more of the story than just possession percentage by itself.
Finally, it’s unclear a) whether or not possession percentage is reflective of the scoreline at all, and b) whether possession percentage is reflective of the scoreline, or the scoreline is reflective of possession percentage. Did the team lose 1-0 because they had a lower possession percentage, or did they have a lower possession percentage because they lost? There are just too many unknowns and ambiguities for it to be as ubiquitous as it is.
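Xu's first two points fit in a few lines of code. This is a minimal sketch under stated assumptions: the event format (a passing team plus the home side's goal margin at the moment of the pass), the function names, and the pass counts are all invented for illustration and are not any data provider's actual schema or numbers.

```python
from collections import defaultdict

def possession_share(team_passes, opponent_passes):
    """Possession as described above: team passes divided by total passes."""
    total = team_passes + opponent_passes
    if total == 0:
        raise ValueError("no passes recorded")
    return team_passes / total

def possession_by_game_state(pass_events, team):
    """Split pass-share possession by whether `team` was ahead, tied, or behind.

    pass_events: iterable of (passing_team, margin) pairs, where margin is
    `team`'s goal difference when the pass was played (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [team passes, opponent passes]
    for passer, margin in pass_events:
        state = "ahead" if margin > 0 else "behind" if margin < 0 else "tied"
        counts[state][0 if passer == team else 1] += 1
    return {state: possession_share(own, opp) for state, (own, opp) in counts.items()}

# Made-up match: level on passes while tied, then out-passed after going ahead.
events = (
    [("home", 0)] * 30 + [("away", 0)] * 30
    + [("home", 1)] * 20 + [("away", 1)] * 30
)
overall = possession_share(50, 60)
by_state = possession_by_game_state(events, "home")
print(round(overall, 3), by_state)
```

The single overall figure hides the split: in this invented match the home side is level on passes while the score is tied and is out-passed once it leads, which is the kind of context the headline number throws away.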
Xu: Well, goal differential, but I’m guessing that’s not what you’re asking.