Friday, January 14, 2011

Friday thoughts

Not really sure where I'm going with today's post, but just sitting here riffing on some stats.

One of the great conceits of being a sports fan is that we all think we know more than ______, and that if the coaching staff only listened to us, our team would be that much better.  Some of us even start blogs to make sure we can get the word out.

But, after a few years of blogging, it starts to become clear that maybe we don't know as much as we'd like to think.

Here's an example:  I can make a simple model using the four factors to estimate Georgetown's offensive efficiency for each game played so far this year, and from this I can give an estimate of how important each factor is to Georgetown's offense.
Factor      Weight
eFG%         0.64
OReb%        0.18
TO%          0.13
FTM/FGA      0.01
Total        0.96
That's to say, about 64% of the game-to-game variability in Georgetown's offensive efficiency can be explained by how well they shoot (eFG%). Once we account for that, another 18% or so of the variability is due to how well they offensive rebound, etc.
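
For anyone who wants to try this at home, here is a minimal sketch of one way weights like these could be produced: add the factors to a least-squares fit one at a time and record how much R² goes up at each step. This is an assumption about the method, not a description of the exact model used above, and the data below is randomly generated stand-in data, not Georgetown's game logs. The function name `sequential_r2` is made up for illustration.

```python
import numpy as np

def sequential_r2(X, y, names):
    """Add predictors one at a time; record the gain in R^2 at each step."""
    n = len(y)
    total_ss = np.sum((y - y.mean()) ** 2)
    design = np.ones((n, 1))            # start with an intercept-only model
    prev_r2 = 0.0
    gains = {}
    for j, name in enumerate(names):
        design = np.column_stack([design, X[:, j]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        r2 = 1.0 - np.sum(resid ** 2) / total_ss
        gains[name] = r2 - prev_r2      # share of total variance added by this factor
        prev_r2 = r2
    gains["Total"] = prev_r2
    return gains

# Synthetic stand-in data (NOT real box scores): 16 games, 4 standardized factors.
rng = np.random.default_rng(0)
names = ["eFG%", "OReb%", "TO%", "FTM/FGA"]
X = rng.normal(size=(16, 4))
y = (105 + 8 * X[:, 0] + 4 * X[:, 1] - 3.5 * X[:, 2] + 1 * X[:, 3]
     + rng.normal(scale=2, size=16))

gains = sequential_r2(X, y, names)
for name, value in gains.items():
    print(f"{name:<10}{value:6.2f}")
```

One caveat with this approach: the gains depend on the order in which the factors are added, so a factor entered later only gets credit for what the earlier ones left unexplained.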

I can even make a nifty graph to show how well the model works:

I can also repeat the exercise for the defense (no nifty graph this time), for a slightly surprising result [this table has been revised]:
Factor      Weight
eFG%         0.36
TO%          0.46
DReb%        0.11
FTA/FGA      0.02
Total        0.95

But here's the thing - what can we do with this information?

I can demonstrate that Georgetown's offense is most dependent upon how well the team shoots from the floor, but I can't tell you how the Hoyas can improve their shooting accuracy from this exercise.

More importantly, just because their defense this year is most dependent upon how well they turn over the opponent doesn't mean the team should be running a full-court press all the time.  The law of unintended consequences applies: opponents could just as easily end up shooting more layups against the press (and therefore improve their shooting accuracy), and Georgetown's defense wouldn't improve at all.

But it sure seems tempting.


A switch seems to have been thrown once conference play began, and the once-great Georgetown Hoyas are now mostly mediocre.  One idea that I thought would be interesting was to compare how different lineups have played in the out-of-conference (OOC) and conference portions of the schedule.

The problem with this is that there just haven't been a lot of conference games played yet, so once you get past the top 2 or 3 most-used lineups, the sample sizes get small enough that the stats are mostly meaningless.

But that's never stopped me before, so let's take a look.

.                                                Conference games only         Non-conference games only
.                                              Offense    Defense    Net       Offense    Defense    Net
Lineup                                        Poss Rate  Poss Rate  Rate      Poss Rate  Poss Rate  Rate
Clark--Freeman--Thompson--Vaughn--Wright        79  106    76   96    10       174  135   173   94    41
Clark--Freeman--Lubick--Sims--Wright            20   85    25  140   -55        68  131    68   75    56
Clark--Freeman--Lubick--Vaughn--Wright          23   96    21  110   -14        30   90    28   89     1
Benimon--Clark--Freeman--Vaughn--Wright         20  130    18  144   -14        50  120    49  114     6
Clark--Freeman--Sims--Thompson--Wright          17   94    17   82    12        51  133    48   88    46
Clark--Lubick--Thompson--Vaughn--Wright         13  123    12   83    40         2  150     2  200   -50
Freeman--Lubick--Thompson--Vaughn--Wright       11  127     8   50    77        10  150    12  108    42
Clark--Freeman--Lubick--Starks--Vaughn           7   29     9  144  -116         2  100     3  133   -33
Benimon--Clark--Freeman--Sims--Wright            7  143     7   57    86        37   97    41   88    10
Clark--Freeman--Lubick--Sims--Starks             7  129     7  100    29         6  100     7   57    43
Freeman--Lubick--Sims--Thompson--Wright          7   86     7   71    14        10   60    11   55     6
Benimon--Clark--Freeman--Thompson--Wright        7  157     7  171   -14         5  140     6   67    73
Clark--Freeman--Sanford--Sims--Wright            6  117     7   57    60         3  100     4  125   -25
Clark--Lubick--Sims--Thompson--Wright            6  150     5  120    30         8  100    12   67    33
Freeman--Lubick--Sanford--Sims--Wright           5   40     5  120   -80        27  122    25  136   -14

The first five stats columns show how well the lineup has played in Big East play; the next five columns show how well the lineup played before conference play.

I've highlighted in green the four lineups most improved in conference play, and in pink the four lineups that have declined the most in conference play.

Nothing special jumps out at me, other than noting that the first substitution of the game (Lubick and Sims in for Vaughn and Thompson) was a great idea before Big East play, but not so much now.
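
To rank the swings yourself, here is a rough sketch that takes the possession counts and Net Rate columns straight from the table above and sorts lineups by the change in net rate from non-conference to conference play. Treating "most improved" as the biggest change in net rate is an assumption, and the names `LINEUPS` and `net_swings` are made up for illustration.

```python
# (lineup, conf. offensive possessions, conf. net rate,
#  non-conf. offensive possessions, non-conf. net rate), copied from the table
LINEUPS = [
    ("Clark--Freeman--Thompson--Vaughn--Wright", 79, 10, 174, 41),
    ("Clark--Freeman--Lubick--Sims--Wright", 20, -55, 68, 56),
    ("Clark--Freeman--Lubick--Vaughn--Wright", 23, -14, 30, 1),
    ("Benimon--Clark--Freeman--Vaughn--Wright", 20, -14, 50, 6),
    ("Clark--Freeman--Sims--Thompson--Wright", 17, 12, 51, 46),
    ("Clark--Lubick--Thompson--Vaughn--Wright", 13, 40, 2, -50),
    ("Freeman--Lubick--Thompson--Vaughn--Wright", 11, 77, 10, 42),
    ("Clark--Freeman--Lubick--Starks--Vaughn", 7, -116, 2, -33),
    ("Benimon--Clark--Freeman--Sims--Wright", 7, 86, 37, 10),
    ("Clark--Freeman--Lubick--Sims--Starks", 7, 29, 6, 43),
    ("Freeman--Lubick--Sims--Thompson--Wright", 7, 14, 10, 6),
    ("Benimon--Clark--Freeman--Thompson--Wright", 7, -14, 5, 73),
    ("Clark--Freeman--Sanford--Sims--Wright", 6, 60, 3, -25),
    ("Clark--Lubick--Sims--Thompson--Wright", 6, 30, 8, 33),
    ("Freeman--Lubick--Sanford--Sims--Wright", 5, -80, 27, -14),
]

def net_swings(rows, min_conf_poss=0):
    """Return (lineup, swing) pairs, most improved first.

    swing = conference net rate minus non-conference net rate.
    Lineups with fewer than min_conf_poss conference possessions are dropped.
    """
    swings = [
        (name, conf_net - ooc_net)
        for name, conf_poss, conf_net, ooc_poss, ooc_net in rows
        if conf_poss >= min_conf_poss
    ]
    return sorted(swings, key=lambda pair: pair[1], reverse=True)

swings = net_swings(LINEUPS)

print("Most improved in conference play:")
for name, swing in swings[:4]:
    print(f"  {swing:+4d}  {name}")

print("Biggest declines in conference play:")
for name, swing in swings[-4:][::-1]:
    print(f"  {swing:+4d}  {name}")
```

Passing something like `min_conf_poss=17` keeps only the handful of lineups with a non-trivial number of conference possessions, which is one crude way to respect the sample-size warning above.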


  1. There is a correlation between the offensive eFG% and the defensive TO% and eFG%. Turnovers and defensive rebounds can lead to easier transition baskets, and the defense can press or trap and put more organized pressure on the ball after made baskets. What we have missed are a couple of nice runs each game where we get a flurry of stops and buckets.

    From the lineups, it is interesting that Lubick in for Thompson has been so bad in BE play, but Lubick in for Clark has been effective all season. Also, Sims and Thompson have not been on the floor together very often, but the numbers are very good when they have been.

  2. I should clarify that the defensive numbers have been very good with Thompson and Sims both in the game. The offense has been average.

  3. Hey, thanks for reading.

    I did look at off. eFG vs. def. TOs, and it's a correlation essentially driven by two points - the Pitt and St. John's games.

    We have a saying at my day job: if you can put your thumb on a scatter plot and the story changes completely, you're not describing a relationship. I think that's what's going on here.

    Moreover, if I look at either OeFG or OEff vs. steals rate, there's really no correlation. It's these live-ball turnovers that lead to fast-break layups, so they're what you'd want to use. And this result actually surprised me.

    I hadn't looked at OeFG vs DeFG, but now that I have it mostly looks like a single point driving it - the Missouri game. So the correlation is actually positive, in that higher DeFG means higher OeFG. But I don't think the relationship is meaningful.

    It's funny, because the OeFG vs. DTO looked important to me too, until I actually went through the exercises described above.
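
The thumb test above can be approximated in code by dropping one point at a time and recomputing the correlation. This is a minimal sketch with made-up numbers (eight unremarkable points plus one extreme one), not the actual Georgetown game data; the function name `leave_one_out_corr` is hypothetical.

```python
import numpy as np

def leave_one_out_corr(x, y):
    """Return the full-sample Pearson r and the r obtained with each point removed."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    full_r = np.corrcoef(x, y)[0, 1]
    loo = np.array([
        np.corrcoef(np.delete(x, i), np.delete(y, i))[0, 1]
        for i in range(len(x))
    ])
    return full_r, loo

# Made-up data: the last point sits far away from the rest of the cloud.
x = [1, 2, 3, 4, 5, 6, 7, 8, 20]
y = [3, 1, 4, 1, 5, 2, 6, 2, 20]

full_r, loo = leave_one_out_corr(x, y)
print(f"all points: r = {full_r:.2f}")
for i, r in enumerate(loo):
    print(f"without point {i}: r = {r:.2f}")
```

If covering a single point moves r from strong to weak, as it does for the last point here, the correlation is being driven by that one game rather than by a relationship across the whole sample.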