Hoya Prospectus: offensive efficiency

Wednesday, February 1, 2012

Is Georgetown overrated?

Yes.

The Hoyas came into this week with their AP ranking dropping from 9th to 14th (or 18th on KenPom, if you prefer a rational metric), and fans across the Hoya-nation spectrum are fretting that their plans of a Final Four trip might not happen.

I think it might be time to lower your expectations a bit.

As our regular reader knows, we don't update this blog nearly as often as we used to - hell, it took me two weeks to get all the stats pages updated. But there is one set of stats that we do track religiously around here, and that's the "Big East Snapshot" page. Not so much because it's a better set of stats than the others, but because I actually bothered to make those stats more automated than the rest.

You probably don't check that page often, if at all, but you should. It provides you with two main pieces of information: a set of summary tables (more on those in a minute) and a handy chart for each Big East team. Here's the Georgetown chart (click to enlarge):

There's a previous post where I explain how to read these charts, but here's a brief summary:

The top of the chart shows how well the Hoyas played in any game, accounting for the quality of the opponent and the venue. A black dot means a win, a grey dot means a loss. We rate the team's performance by the final score, and with some tricky math figure what rank you'd give the team based only on that single game.
The middle and bottom of the chart are how well the team played on offense and defense, respectively, again accounting for opponent and venue. These are the offensive and defensive efficiencies (points per 100 possessions) that Ken Pomeroy made famous.
Home games are in all caps; since Chaminade is not a Division-I team, that game doesn't get rated.

Now if you put your thumb over that dot that represents the win at St. John's, you should notice a disturbing trend - since the second win against Memphis, the Hoyas have begun an inexorable slide throughout January. Note: the chart is dynamic and will update as the season progresses, so the above discussion may or may not be valid in a few weeks.

Why?

Well, if I knew exactly what was wrong I probably wouldn't be posting from my mother's basement, but I think the chart shows two clear issues:

The defense stopped playing lights-out after the Memphis game (excluding the win vs. Providence), although there seems to be a trend towards getting back to the early season prowess.
The offense is going into the tank (and you probably didn't need a chart to know that).

Putting this in some context after the jump

It's way too early to start looking at stats

http://www.savagechickens.com/2009/03/love-stats.html

But I've started to wake up the stats pages anyway (if you're new here, the stats pages are linked at the tool bar at the top of the page).

Right now, I've got the team and player advanced stats, along with player plus/minus and shot selection up and running. Let's wait until next weekend (after the 'Bama game) to wake up the plots for net points and the performance charts - I do have the table on the net points page live, though.

The stats pages will get updated weekly, depending upon what else is going on with my schedule. Now that Ken Pomeroy has moved most of his site behind a pay wall, I suspect that the pages here will be in a bit higher demand. Feel free to prod me (e-mail is at the upper-right) if you just can't wait for something. Also, do let us know if you see a mistake, don't understand what a stat means, or have just thought of the greatest new basketball statistic of all time.

A few thoughts from what is now available:

The biggest surprise for me so far is the low turnover rate Georgetown has yielded on offense [TO rate = 15.6%, rank = 13/345 versus Div-I], including against their first two quality opponents [13.9%]. Now the Tigers haven't been forcing turnovers much this year [17.4%, 308/345], but the Jayhawks have [23.5%, 107/345]. The Hoyas have managed to keep their turnover rate below 20% for an entire season only once [2005-6: 18.8%, 35/334] since Coach Thompson arrived with his pass-heavy offense. I have no expectation of this keeping up, but it sure would be nice if it did.

Henry Sims and Jason Clark are using a huge number of possessions so far [34% and 27%, respectively vs. Kansas and Memphis], and their efficiency on offense is suffering a bit because of it [ORat = 104 and 107, respectively]. Hollis Thompson and Markel Starks managed much better offensive ratings by being more selective [125/16% and 141/14%].

Otto Porter is playing really well, whether against all teams or just the top-100 (I don't think you needed the stats to know that). He's been the best defender on the team while using his possessions efficiently [130/18%].

The reason some people hate plus/minus is that the stat can be very misleading, even over the course of an entire season. There's a good example of that currently on the Hoyas: plus/minus hates Hollis right now [-11 net / 40 minutes]. Net points says he's been playing fine [+11 net / 40 minutes]. Call me biased - since I'm about the only one who posts Dean Oliver's net points - but I think the stat-head version is a bit more accurate.

Wednesday, February 16, 2011

Introducing a new stats page

I've been both sick and out in the field for the past week, so things have been more quiet than normal around here. Most stats pages are updated - the rest will have to wait until the weekend.

Meanwhile, I've been sitting on a new stats page that has been gestating for the past couple of weeks - it's certainly not in a finished state (I've only been able to incorporate a couple of suggestions), but I figured I'd just get it out there with minimal explanation, and come back and clean it up when I get the chance.

The new page is currently called "Big East Snapshot" and can be found on the tabs at the top of the page. It is a re-packaging of Ken Pomeroy's adjusted offensive and defensive efficiency statistics, so if margin-of-victory statistics are not of much interest to you, you should probably stop reading right here.

The purpose of the page is two-fold:

Give my reader an idea of how well each Big East team has played within conference so far this season, both over the entire conference season and over the last five games (the "snapshot").
Provide something of a new-and-improved version of the performance charts that I generate for Georgetown after each game, but now for all Big East teams.

This is a follow up on a previous post here, where I criticized John Gasaway for using unadjusted efficiency stats for his Tuesday Truths columns. It occurred to me that it wouldn't be hard to generate the adjusted stats myself, and from that flowed the conference snapshot.

The idea is simply to correct each game's margin-of-victory statistics for both the quality of the opponent and the game venue. For instance, in Georgetown's last game, the Hoyas beat Marquette 69-60 in a 68 possession game. This translates into offensive and defensive efficiencies of 102 and 89, respectively, for the Hoyas (or 89 and 102 for Marquette).

But if I account for who the Hoyas were playing and that it was a home game, we get adjusted efficiencies of 111 and 86 for Georgetown and 109 and 90 for Marquette.

That is to say, if Georgetown had played equally well, but on a neutral court and against the average Div-I opponent (right now St. Peter's), we'd expect the Hoyas to end with efficiencies of 111 and 86 for the game; if it were a 68-possession game, that'd be a final score of 75-58. For Marquette, if they played as well as they did last Sunday, they would also beat St. Peter's, but with a final score of 74-61 in a 68-possession game.
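The step from efficiencies to a final score is plain arithmetic, and a minimal sketch of it might look like the following. The function names are made up for illustration, and the opponent-and-venue adjustment itself is not shown, since the post doesn't spell it out.

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


def expected_score(off_eff, def_eff, possessions):
    """Turn a pair of efficiencies into a (points for, points against) score."""
    points_for = round(off_eff * possessions / 100.0)
    points_against = round(def_eff * possessions / 100.0)
    return points_for, points_against


# Adjusted efficiencies quoted in the post, in a 68-possession game.
georgetown = expected_score(111, 86, 68)
marquette = expected_score(109, 90, 68)
print(georgetown, marquette)
```

Running the raw 69-60 score through `efficiency()` with exactly 68 possessions gives about 101.5 and 88.2, so the 102 and 89 quoted above presumably come from an unrounded possession count.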

The biggest difference between this analysis and the "Performance" stats is that here, I'm now giving equal credit to both teams for each game's result. For instance, in the shellacking that Seton Hall put on Syracuse, I'm giving equal credit to the Pirates for playing out of their minds and the Orange for mailing in the game.

That's it in a nutshell, and I'm working this through for all games played by all Big East teams. The main caveat here is that Ken also employs a weighting factor for more recent games, which I'm currently not doing. I think this is a very small effect (well under 1%), so the stats I'll be posting are very close to what Ken would have.

There are also a couple of technical reasons why I'm unveiling this page, which you likely won't be much interested in. Suffice it to say that the underlying statistics for the new page are generated in a more automated fashion, so it takes only a few keystrokes to update the tables, and a few more to update the charts shown after the tables.

That's about all the time I've got right now, so I'll just go ahead and hit the publish button for now. I'll try to come back and add another post discussing why I think the new page is kind of interesting.

Friday, January 14, 2011

Friday thoughts

Not really sure where I'm going with today's post, but just sitting here riffing on some stats.
--------------------------------------------------------------

One of the great conceits of being a sports fan is that we all think we know more than ______, and that if the coaching staff only listened to us, our team would be that much better. Some of us even start blogs to make sure we can get the word out.

But, after a few years of blogging, it starts to become clear that maybe we don't know as much as we'd like to think.

Here's an example: I can make a simple model using the four factors to estimate Georgetown's offensive efficiency for each game played so far this year, and from this I can give an estimate of how important each factor is to Georgetown's offense.

Factor      Weight
eFG%         0.64
OReb%        0.18
TO%          0.13
FTM/FGA      0.01
Total        0.96

That's to say, about 64% of the game-to-game variability in Georgetown's offensive efficiency can be explained by how well they shoot (eFG%). Once we account for that, offensive rebounding explains about another 18% of the total variability, etc.
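The post doesn't say exactly how the weights were computed. One possible reading is that each weight is the extra R-squared gained when a factor is added to a least-squares fit, in the order listed. Here is a rough sketch under that assumption; the eight games of data are invented purely for illustration and every name is hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the columns (centering stands in for the intercept)."""
    xs = [centered(col) for col in columns]
    yc = centered(y)
    k = len(xs)
    xtx = [[sum(p * q for p, q in zip(xs[i], xs[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(xs[i], yc)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * xs[j][g] for j in range(k)) for g in range(len(yc))]
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(yc, fitted))
    ss_tot = sum(obs ** 2 for obs in yc)
    return 1.0 - ss_res / ss_tot


def incremental_weights(factors, y):
    """Share of variance each factor adds, entering in the order given."""
    weights, used, previous = {}, [], 0.0
    for name, column in factors:
        used.append(column)
        current = r_squared(used, y)
        weights[name] = current - previous
        previous = current
    return weights


# Invented per-game numbers (NOT Georgetown's actual stats).
factors = [
    ("eFG%", [52.1, 47.3, 58.0, 61.2, 49.5, 55.4, 44.8, 57.6]),
    ("OReb%", [35.0, 41.2, 30.5, 38.1, 33.3, 29.8, 36.7, 40.0]),
    ("TO%", [18.2, 22.5, 16.0, 19.9, 24.1, 17.3, 21.0, 15.5]),
    ("FTM/FGA", [25.0, 31.0, 22.5, 40.2, 28.4, 35.1, 19.8, 30.3]),
]
off_eff = [108.0, 96.5, 121.0, 125.5, 94.0, 113.2, 90.1, 124.8]

weights = incremental_weights(factors, off_eff)
full_r2 = r_squared([column for _, column in factors], off_eff)
for name, weight in weights.items():
    print(f"{name:8s} {weight:.2f}")
print(f"{'Total':8s} {full_r2:.2f}")
```

Note that with this approach the order in which factors enter the fit changes the individual weights, though not the total.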

I can even make a nifty graph to show how well the model works:

I can also repeat the exercise for the defense (no nifty graph this time), for a slightly surprising result [this table has been revised]:

Factor      Weight
eFG%         0.36
TO%          0.46
DReb%        0.11
FTA/FGA      0.02
Total        0.95

But here's the thing - what can we do with this information?

I can demonstrate that Georgetown's offense is most dependent upon how well the team shoots from the floor, but I can't tell you how the Hoyas can improve their shooting accuracy from this exercise.

More importantly, just because their defense this year is most dependent upon how well they turn over the opponent doesn't mean the team should be running a full-court press all the time. The law of unintended consequences reminds us that the result could just as easily wind up that teams would end up shooting more layups (and therefore improve their shooting accuracy) as a result, and Georgetown's defense wouldn't improve at all.

But it sure seems tempting.

Run Hoyas, Run? Part 2

Last time, I discussed the distribution of Georgetown's possessions last season as a function of possession length.* Today, I'm going to look at the efficiency of Georgetown's offense (and defense) as a function of possession length. Then, I'll combine the two to find out whether the Hoyas were leaving points on the table.
*here, possession length = time until first action; see previous article for details.

A few years ago, Ken Pomeroy posted a plot of possession efficiency for each second of possession length, derived from five years of play-by-play stats for Division I college basketball. I re-posted that figure last time, so here I'll just show my re-plot of his data, along with possession efficiencies for three-second aggregate bins (0-3s, 3-6s, etc.) as I did last time for possession length:

For now, just focus on the solid gray line, which represents Ken's original data - I'll come back to the bins further down the page.

As Ken discussed in his original article, there are three areas of interest on the figure:

For possession lengths less than 12 seconds, there is a large increase in scoring efficiency compared to possessions that last longer. There is a sharp peak at 3-4 seconds (typically fast breaks after steals) where the average D-I team is scoring better than 1.2 points per possession (ppp) - and remember that a middle-of-the-road team will likely average right at 1.0 ppp overall. However, the improved efficiency drops slowly from the peak, and finally reaches that 1.0 ppp baseline only at 12 seconds into the possession. Teams benefit greatly from scoring off the break, but continue to benefit well into the possession time as the defense scrambles to get set.

For possession lengths of 12-30 seconds, there is very little variability in efficiency, as teams average 1.008 ± 0.010 ppp (yeah, I'm actually reporting a standard deviation here - get over it). There is a slight reward for scoring earlier in the possession: 12-24 seconds into the possession, teams average 1.013 ± 0.005 ppp, while 24-30 seconds into the possession the average drops to 0.996 ± 0.007 ppp. It's a subtle and not statistically significant difference. Here, we've effectively reached an even match between the offense and defense.

For possession lengths greater than 30 seconds, efficiency decreases quickly with added time. As the last few seconds wind off the shot clock, scoring efficiency approaches 0.8 ppp, which is a very poor number. By now, the defense holds the advantage, as the offense loses its selectivity in an effort to get any sort of shot up at the basket.
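For anyone who wants to build the three-second bins from their own play-by-play data, a minimal sketch might look like this. The input format (a list of possession length in seconds and points scored) and the sample possessions are assumptions for illustration, not Ken's data or the Hoyas'.

```python
from collections import defaultdict


def ppp_by_bin(possessions, bin_width=3):
    """Average points per possession for each bin (0-3s, 3-6s, ...).

    `possessions` is an iterable of (seconds_until_first_action, points).
    A possession lasting exactly 3 seconds lands in the 3-6s bin.
    """
    totals = defaultdict(lambda: [0, 0])  # bin start -> [points, possessions]
    for seconds, points in possessions:
        start = int(seconds // bin_width) * bin_width
        totals[start][0] += points
        totals[start][1] += 1
    return {start: pts / count for start, (pts, count) in sorted(totals.items())}


# Invented sample possessions.
sample = [(2, 2), (4, 0), (5, 3), (13, 2), (14, 0), (33, 0)]
binned = ppp_by_bin(sample)
for start, ppp in binned.items():
    print(f"{start:2d}-{start + 3:2d}s  {ppp:.2f} ppp")
```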

So how did the Hoyas and their opponents fare compared to Ken's aggregate?

(more after the jump)

Examining the slump -- offense or defense?

One of the first things I do when dissecting why a team is winning or losing is simply to check whether the offense or defense is more to blame. People inherently focus on scoring -- there will be a game where the team loses 97-95 and their fans will scream about a turnover or missed shot, all the while ignoring the sieve that was the defense.

So what about the Hoyas? The team is slumping badly with its most recent loss to a good WVU squad on the road (which, as Brian points out, was mostly caused by offensive turnovers and a distinct lack of defensive turnovers).

One simple way to look at a team is to look at offensive and defensive efficiencies. As of the WVU game, Georgetown has an adjusted offensive efficiency of 114.5, good for 20th in the country.

What does this mean? That Georgetown could be expected to score 114.5 points in 100 possessions against an average D-I team. Georgetown's defensive efficiency is 92.4 (43rd), which means it could expect to give up 92.4 points versus that same average team.

In short, the offense has been better than the defense right now.

But that hasn't always been the case.

I went game by game and took Georgetown's offensive and defensive efficiency in each game, adjusted for home/away games and the opponent's current efficiency for the season, to see how well Georgetown's offense and defense performed over the course of the season.

To put this in the context of a typical Georgetown game, I converted those efficiency numbers into points, based on a 67-possession game.

In short, the numbers below are the number of points per game that Georgetown played above or below an average NCAA team in each game. The average NCAA team is someone like Loyola Marymount, Lipscomb or Indiana this year; so being average here is not a good thing, and decidedly worse than being an average team in the Big East.
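As a rough sketch of that conversion: take the gap between an adjusted efficiency and the Division I average, then scale it to a 67-possession game. The average efficiency and the single-game numbers below are placeholders I've assumed for illustration; the post doesn't list them.

```python
def points_vs_average(adj_off, adj_def, avg_eff, possessions=67):
    """Points scored above average and points allowed below average.

    Positive values are good on both sides of the ball.
    """
    scale = possessions / 100.0
    offense = (adj_off - avg_eff) * scale  # points scored above an average team
    defense = (avg_eff - adj_def) * scale  # points allowed below an average team
    return offense, defense


# Hypothetical single-game adjusted efficiencies and league average.
offense, defense = points_vs_average(adj_off=112.0, adj_def=95.0, avg_eff=101.0)
print(f"offense {offense:+.1f}, defense {defense:+.1f}")
```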

How is this different than what Brian did here?

Well, Brian was looking at how Georgetown performed relative to its own expectation. I'm trying to evaluate Georgetown's performance against a standard baseline (i.e. the average NCAA team) to see, overall, how the offense and defense have performed, given the competition. It's really just an offset between the two analyses.

.                 (Above Avg. Team)  (Below Avg Team)
Games               Points Scored     Points Allowed
First 7 Games             7.8               9.2
Second 7 Games            6.3               8.2
Third 7 Games            12.1               4.8
Fourth 7 Games            8.7               4.2

In the first seven games, the Hoyas had some creampuffs, the Temple offensive debacle and a rather balanced performance versus Butler.

Starting with the Butler game and running through much of the second 7 games (including Washington), the defense carried Georgetown for the most part. In the Old Dominion game, the offense was actually below NCAA-average.

Starting with Seton Hall, the team started to get hot on offense, but also a bit schizophrenic. In four of the seven games in this quartile, the Hoyas posted offenses 18+ points above NCAA-average in a 67-possession game. Yes, NCAA-average isn't very good, but an 18-point differential due to the offense in a regular Georgetown game is pretty darn good.

The defense started to decline, though, and in the South Florida game, neither the offense nor defense was any good.

By the last set of games, the offense had come back to earth but the defense hadn't revived.

Looking at the losses:

Game              Offense        Defense
ODU                 -1              5
South Florida       -1             -4
at Rutgers           8             -5
Notre Dame           3            -11

at Marquette         9              7
at Villanova         5             11
at Syracuse          0              7
Syracuse             6              6
West Virginia        6              5

I divided them up into perceived "bad" losses and "acceptable" losses. It's worth noting that versus Marquette and Nova, Georgetown played better than in some of their wins. The Syracuse games were pretty bad; at home the Hoyas played better than versus Temple but that's about it. Overall, the defense plays as well as or better than the offense in losses against marquee opponents.

In the three "bad" in-conference losses, the defense has been more to blame. But aside from the Rutgers game, the offense wasn't exactly pulling its weight, either. Still, 11 points worse than an average D-I (read: Indiana) defense versus Notre Dame? Ughh. Overall, the defense plays terribly in losses against "easier" opponents.

Here's the whole season, along with the season to date running average after the seventh game -- you can see how the defense has slowly declined:

Game                   Offense  Defense  Total  O Avg  D Avg
Tulane                   12         5      17   
Temple                  - 9        19      10   
Savannah St.              6        10      16   
Lafayette                23         1      24   
Mount St. Mary's         10         6      16   
American                  2        13      16     7.5    8.9 
Butler                    9        11      20     7.8    9.2 
Washington                1        19      20     6.9   10.4 
Old Dominion            - 1         5       4     6.1    9.7 
Harvard                   6         9      15     6.1    9.7 
St. John's                9         3      12     6.3    9.1 
DePaul                   15        10      25     7.0    9.1 
Marquette                 9         7      15     7.2    8.9 
Connecticut               6         6      11     7.1    8.7 
Seton Hall               18         0      18     7.8    8.1 
Villanova                 5        11      17     7.6    8.3 
Pittsburgh               21         5      27     8.4    8.2 
Rutgers                  20         3      23     9.0    7.9 
Syracuse                - 0         7       7     8.5    7.9 
Duke                     22        10      32     9.2    8.0 
South Florida           - 1       - 4     - 4     8.7    7.4 
Villanova                16         7      23     9.1    7.4 
Providence                7        13      20     9.0    7.6 
Rutgers                   8       - 5       3     9.0    7.1 
Syracuse                  6         6      12     8.8    7.0 
Louisville               15        14      29     9.1    7.3 
Notre Dame                3       -11     - 8     8.8    6.6 
West Virginia             6         5      11     8.7    6.6

Notice that the offense got better as the season progressed before stabilizing in February, while the defense has been steadily trending downward since the Washington game.
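The season-to-date columns are just running averages, which can be sketched in a few lines. The input here is the rounded Offense column for the first seven games in the table above; the function name is my own.

```python
from itertools import accumulate


def running_average(values):
    """Season-to-date average after each game."""
    return [total / count for count, total in enumerate(accumulate(values), start=1)]


# Offense column for the first seven games, as rounded in the table.
offense = [12, -9, 6, 23, 10, 2, 9]
averages = running_average(offense)
print(f"{averages[-1]:.1f}")
```

This gives about 7.6 after the Butler game rather than the 7.8 shown in the table, presumably because the table's averages were computed from unrounded game values.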

Thursday, February 18, 2010

Chris Wright's Scoring and Georgetown's Win/Loss Record

As Brian and others have duly noted, it’s impossible at this point to discuss Georgetown basketball without hearing the platitude “as goes Chris Wright, so go the Hoyas.” The talking heads can’t get over the fact that when Chris Wright scores more than 10 points the Hoyas are 16-0 and when he scores less than 10 they are a lackluster 2-6.

But how much truth is there to this statistic? Is it coincidence? Is it only an easy-to-digest, TV-friendly fallacy? Could it really be that simple?

Full disclosure: I came into this analysis as a healthy skeptic. I couldn’t possibly believe that the key to Georgetown winning was Chris scoring in double digits. What I looked for, though, was something larger. What does Wright’s scoring mean in terms of the offense as a whole? If he’s not scoring, what is he doing (or not doing) instead?

I went to the fantastic Individual Win/Loss splits here to find out, and the results were more than a little surprising.

The most obvious win/loss split:

Wright shoots 27/60 (45%) from 3FG in wins
Wright shoots 1/23 (4%) from 3FG in losses.
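For the record, the arithmetic behind that split is as simple as it looks. A tiny sketch, with a helper name of my own choosing:

```python
def pct(made, attempted):
    """Shooting percentage from makes and attempts."""
    return 100.0 * made / attempted


wins = pct(27, 60)    # 3FG in wins
losses = pct(1, 23)   # 3FG in losses
print(f"wins: {wins:.0f}%  losses: {losses:.0f}%")
```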

Ouch. That’s certainly not good, but it can’t be the whole story, can it?

I started by looking at shooting metrics among the starting five to see if there were any other similar trends (click any figure to enlarge):

Nothing too shocking or revealing, outside of Chris. But what’s more startling isn’t the shooting percentage; it’s the shot selection. Here’s a graph of 3pt FG attempts as a percentage of total FGs attempted:

First of all, I had no idea just how much of a sniper Jason Clark has turned into. 60% of all of his shots are 3s. But when you’re making them at a 45% clip, go right ahead!

It also follows that if you’re trying to get back into a game, you’re going to pop a few more from behind the line (see: Austin Freeman and Jason), but look at that 23% jump from Chris. In losses, he's taking more than half of his shots from behind the arc, and we already covered how many of those he’s making.

To me, this appears to be a fundamental and philosophical change in his game. He’s not just taking a few more 3s to try to shoot the team back into a game: Wright averages 12.9 2FGA / 100 possessions played in wins, but only 7.2 2FGA / 100 possessions in losses. If he’s also less inclined to try to score off of the dribble, it would be expected that he’s not getting into the lane and to the FT line as much either:

That’s not a precipitous drop, but it’s a drop, especially compared with the posts (well done, Mr. Monroe) and Jason on the wing.

So if he’s not scoring in the lane and he’s not drilling three-balls, is he deferring to his teammates more?

WOW.

He’s passing a lot more - that is a staggering jump. Chris’s overall ARate this year is 22.1, or 363rd best in the country. The 35.6 he posts in losses would be good for 23rd(!) overall in D-I.

The only problem? It’s not helping win ball games.

At this point, I’m convinced. Chris Wright’s play is critical to the success of the team. However, the low scoring in losses is a symptom, not the problem.

The problem is that, in games Georgetown has lost, he’s changing his game to be more deferential and it’s impacting the rest of the offense (not just his own), and the offense can’t function properly if Chris isn’t a threat in the lane.

Zones can spread out on the shooters and teams in man can double Greg more easily. Wright needs to drive and shoot, especially against zones, to pull defenders in and give open looks to Austin and Jason, either as a direct assist or after ball movement. It also frees up Greg from facilitating so much at the high post, keeping him down low for better looks and chances at offensive rebounds (Monroe's OR% is 5 points higher in wins).

Chris is the team's lead threat in terms of slashing to the hoop. Unlike the “more than 10 point” bromide, his role isn't to be the leading scorer every night, but when he looks to score more off the dribble it opens up other options. It makes him a multidimensional threat. If being more aggressive driving means he scores over 10 a night that's great, but he doesn’t need to carry the load himself.

It's also interesting because it goes against the early-season conventional wisdom that Chris needed to become a better pass-first PG to lead this team. It actually looks the opposite - when he takes that dimension of his game away it neuters his effectiveness and stagnates the offense.

So drive, Chris, drive, and as the inimitable Bill Raftery would put it, let’s hope to see a little more lingerie on the deck in the future.

Thursday, December 31, 2009

Looking back in anger, individually

Brian's been looking at last year's meltdown and how likely it is to repeat. I thought I might see something at the player level that could shed some illumination.

To be honest, I didn't expect much. The sample I had time to gather is incredibly small, and team stats are more or less aggregations of individual stats, so how much could I learn?

Well, at first blush, maybe the great collapse of 2008-09 could have been predicted.

Let's look at the year before and at % change of Offensive Rating from non-conference to conference play. In addition to a positive % change, a small negative change would actually imply relative improvement, as unlike the team stats, these aren't adjusted for competition.
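The % change here is presumably the usual one, measured against the non-conference rating. A short sketch under that assumption, with invented ratings rather than any player's actual numbers:

```python
def pct_change(non_conf_ortg, conf_ortg):
    """Percent change in Offensive Rating from non-conference to conference play."""
    return 100.0 * (conf_ortg - non_conf_ortg) / non_conf_ortg


# Hypothetical player: ORtg of 120 out of conference, 102 in conference.
change = pct_change(120.0, 102.0)
print(f"{change:+.0f}%")
```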

Here's the year before the meltdown:

2007-08
Player             % Change   Class/Left?
Crawford, Tyler       16%     Sr/Graduated
Sapp, Jessie          -1%     Jr
Hibbert, Roy          -2%     Sr/Graduated
Wallace, Jonathan     -6%     Sr/Graduated
Ewing, Patrick        -7%     Sr/Graduated
Freeman, Austin      -16%     Fr
Summers, DaJuan      -21%     So
Macklin, Vernon      -23%     So/Transferred
Rivers, Jeremiah     -46%     So/Transferred

See any incredibly obvious trends? There's a small sample here so I'm loath to commit to anything until I get a bigger sample, but man, it seems to me that you could take this a few directions:

Do upperclassmen retain their offensive value better? Should this have been a warning sign?
Are there style issues at hand? The "halfcourt" players seem to maintain more of their offensive value.
Lastly, we could have seen it coming in that of the four players who played often and retained most of their effectiveness into conference play, three were graduating.

And now last year:

2008-09
Player                  % Change   Class/Left?
Mescheriakov, Nikita       89%     Fr/Transferred
Sims, Henry                30%     Fr
Freeman, Austin            -5%     So
Clark, Jason               -9%     Fr
Wattad, Omar              -11%     So/Transferred
Wright, Chris             -13%     So
Sapp, Jessie              -14%     Sr/Graduated
Monroe, Greg              -14%     Fr
Summers, DaJuan           -18%     Jr/NBA
Vaughn, Julian            -22%     So

I'm not sure this helps my theories much, except maybe style of play. There certainly needs to be a bit more work done here, but there's little doubt that when only one starter -- Freeman -- maintains their level into conference play, you have an issue.

Over the next few days, I'll try to look at a bigger sample as well as what elements of these players' play created a drop.

Just off of this, I'd be a bit worried about Vaughn (can he bully the BE?) and Monroe (lack of go to low post moves) more than anyone. Given how mediocre the offense has been so far this year, we need to see improvement in conference play, not regression.

Wednesday, December 30, 2009

Looking back in anger, part 2

Last night I posted some tables showing the adjusted offensive and defensive efficiencies for the Hoyas over the past six seasons, broken into two groups: early season (Nov. - Dec.) and late season (Jan. - Apr.).

Tonight I'll go ahead and run through the four factors tables that underlie each of the efficiencies we discussed.

Again here, I've adjusted each of the stats to account for the level of competition, but this is a much less certain trick than when I adjust efficiencies. Without getting too technical, I have KenPom's adjusted efficiencies available for all of Georgetown's opponents over the past six seasons, but I have no adjusted stats for the four factors. Think of the RPI, which uses record (25%), opponents' record (50%) and opponents' opponents' record (25%) - since I can't account for opponents' opponents for these stats, they are missing that component equivalent to about 25% with the RPI.
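To make the RPI analogy concrete, here is the weighting written out as a small sketch. The winning percentages are invented, and the function name is my own; the point is only to show which term the four-factor adjustment has no equivalent for.

```python
def rpi(wp, owp, oowp):
    """Basic RPI weighting: 25% record, 50% opponents', 25% opponents' opponents'."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Hypothetical team: .800 record, opponents at .600, opponents' opponents at .500.
full = rpi(0.800, 0.600, 0.500)
opponents_opponents_share = 0.25 * 0.500  # the piece with no four-factor analogue
print(f"RPI {full:.3f}, of which {opponents_opponents_share:.3f} is opponents' opponents")
```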

Also, I suspect that this will quickly turn into a stats dump, as it's getting late tonight. I'll try to come back and flesh this out a bit more tomorrow. At inspection before posting, this article looks like a mess, but I need to get to bed!

To start, let's take a look at the offense:

Offense - Early Season
Season    O. Eff.     eFG %     TO Rate    OReb %     FT Rate
2003-04    103.9      50.4       19.6       39.1  *    42.4
2004-05    106.9      53.8  *    20.8       36.7       32.4
2005-06    109.9      54.8  *    18.0       32.6       32.9
2006-07    113.8 *    55.6  *    21.7       39.8  *    34.5
2007-08    119.1 *    59.2  !    18.8       36.0       35.4
2008-09    117.4 *    57.2  *    20.1       36.8       54.7  !

What I've done is tag each column if the team was performing significantly better or worse than an average team. Roughly put:

! = Top 10    * = Top 25    x = Bottom 50    X = Bottom 25
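The tagging rule could be sketched roughly as below. Since the legend is only "roughly put", the exact cutoffs and the number of Division I teams are assumptions here, and `n_teams` would need to be set per season.

```python
def tag(rank, n_teams):
    """Return the marker for a national rank (1 = best) out of n_teams."""
    if rank <= 10:
        return "!"  # Top 10
    if rank <= 25:
        return "*"  # Top 25
    if rank > n_teams - 25:
        return "X"  # Bottom 25
    if rank > n_teams - 50:
        return "x"  # Bottom 50
    return ""       # unremarkable


# Hypothetical ranks in a 340-team Division I.
for rank in (5, 20, 100, 300, 330):
    print(rank, repr(tag(rank, 340)))
```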

Before I delve into this table, I'll go ahead and post the late season stats for offense:

Offense - Late Season
Season    O. Eff.     eFG %     TO Rate    OReb %     FT Rate
2003-04     96.2      44.7  x    21.0       30.2  x    32.8
2004-05    113.2 *    55.0  *    22.2       33.9       32.8
2005-06    117.7 *    54.4  *    18.9       36.4       35.9
2006-07    125.9 !    58.2  !    20.6       42.1  !    38.6
2007-08    114.3 *    55.6  *    22.3  x    33.7       35.8
2008-09    108.3      52.5       23.4  x    34.6       38.2

At this point, I think there are a few truisms that become apparent:

For a team to operate at a very good (Top 25) or elite (Top 10) efficiency on offense, it needs to shoot extremely well. Doing another thing very well is useful, but not necessary.
Being very good or elite at one of these skills in the early season is likely to translate to conference play, but not a guarantee.
Generally, expect performance to decline a bit in conference play. This seems most likely with turnovers.
Getting offensive rebounds is nice, but not committing turnovers is nicer.

Now let's run the defense:

Defense - Early Season
Season    D. Eff.     eFG %     TO Rate    OReb %     FT Rate
2003-04    91.6       49.2       28.1  !    39.6  X    28.4
2004-05    97.3       47.1       23.0       37.2  X    39.5  x
2005-06    92.9       47.1       20.7       28.3  *    29.9
2006-07    89.1  *    44.6  *    21.1       29.1       36.3
2007-08    87.2  *    42.1  *    18.4       30.8       25.1  *
2008-09    82.6  *    39.8  !    25.1  *    36.7  x    27.5

Defense - Late Season
Season    D. Eff.     eFG %     TO Rate    OReb %     FT Rate
2003-04    92.8       50.5       24.8  *    34.4       43.1  x
2004-05    94.4       47.6       21.1       33.2       39.6  x
2005-06    92.6       47.7       20.0       31.1       27.5
2006-07    88.7  *    43.2  *    19.5       34.6       28.5
2007-08    85.3  *    41.8  *    20.7       31.1       41.4  x
2008-09    95.0       48.9       21.5       34.4       36.9

Here, the conclusions are similar:

Preventing teams from making shots is the best way to run an efficient defense.
Usually, early season field goal defense will translate well into conference play.
Bad rebounding early season can be corrected by conference play.
You can still have a great defense despite giving up fouls.

So what happened last season? A few things.

The team shot less accurately from the field in conference than early season. This wasn't dramatic, and I'd guess tied to the 3FG shooting (~~I'm too lazy to look right now~~). I couldn't let that stand, so I've appended another table at the end with shooting percentages for all seasons. Turns out it was as much 2FG shooting as 3FG shooting. Shows what I know.
Offensive turnovers went up quite a bit.
Items #1 and #2 also happened in 2007-8, but were a bit more severe last year.
An elite ability to get to the FT line evaporated in conference play. This may have been a bit of a crutch propping up the offense early on.
The defense was defending field goals at <40% eFG early season, but this fell to Esherick-level in conference. This was the single biggest change in the factors for offense or defense last year, and was fundamental to the collapse in conference.
Better rebounding couldn't make up for the drop in turnovers generated by the defense.

Finally, how do the team's early season stats look, heading into the game vs. St. John's tomorrow night?

Season    O. Eff.     eFG %     TO Rate    OReb %     FT Rate
2009-10    114.4 *    57.7  !    23.2  x    39.3  *    37.7

.         D. Eff.     eFG %     TO Rate    OReb %     FT Rate
           81.9  *    42.5  *    20.9       27.8  *    25.7  *

The offense is currently rated as very good, but I expect that to drop down a bit in conference. As likely as not, the shooting accuracy will drop down a bit against taller Big East clubs, and that lousy turnover rate is most likely going to increase even more.

Rebounding on both ends is very good so far, and while a small decline would be expected in each case, I don't think either will become the worry that we saw last year.

The overall adjusted def. efficiency is very good so far this year, and there isn't a single underlying stat that raises a red flag of unsustainability. I suspect that this season, much like 2007-8, the Hoyas will go as far as their defense will take them.

Edited to add these tables:

Offense

.           Early Season                Late Season
Season    2FG%   3FG%   FT %          2FG%   3FG%   FT %
2003-04   49.0   40.3   70.2          40.7   31.9   72.3
2004-05   51.6   37.9   69.7          51.8   35.8   70.9
2005-06   56.9   37.5   69.3          52.1   34.6   71.3
2006-07   59.7   36.9   70.8          56.8   37.1   71.2
2007-08   60.1   41.0   60.2          54.2   37.1   68.1
2008-09   58.1   34.6   75.3          53.1   32.4   68.2

2009-10   54.0   36.1   71.1

Defense

.           Early Season                Late Season
Season    2FG%   3FG%   FT %          2FG%   3FG%   FT %
2003-04   43.9   32.9   62.6          50.9   31.9   69.3
2004-05   43.2   33.9   66.7          45.8   33.2   73.8
2005-06   41.4   37.7   64.1          48.0   33.8   70.8
2006-07   42.8   30.6   71.0          43.4   30.2   71.0
2007-08   38.3   29.9   65.8          41.6   29.4   68.4
2008-09   36.8   28.8   71.6          49.5   34.0   70.5

2009-10   41.3   30.3   72.5

Tuesday, December 29, 2009

Looking back in anger

As the Big East regular season gets under way, it comes time to wonder what we've learned about the Hoyas so far this year.

About this time last season I also wondered what November and December had demonstrated - heading into a home game against Pittsburgh, Georgetown was ranked #1 overall by Ken Pomeroy, with a net adjusted efficiency (adj. off. eff. - adj. def. eff) of + 42. That is, we expected that the Hoyas would outscore the median Div-I team (e.g. Holy Cross or Hofstra in 2009) by 42 points in 100 possessions.
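
To put that +42 in game terms, here is a minimal sketch of the arithmetic. The net efficiency is points per 100 possessions, so the expected margin just scales with the number of possessions. The 65-possession game is my assumption for a typical pace, not a figure from KenPom, and `expected_margin` is a made-up helper name.

```python
def expected_margin(net_efficiency, possessions):
    """Expected scoring margin, given net efficiency in points per 100 possessions."""
    return net_efficiency * possessions / 100.0


# Net adjusted efficiency of +42 versus the median Div-I team
print(expected_margin(42, 100))  # margin over 100 possessions
print(expected_margin(42, 65))   # margin in an assumed 65-possession game
```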

I was so excited that I made one of those fancy-pants graphs to demonstrate visually just how good Georgetown was (click any figure to enlarge):

2008-09 Big East Aerial (31-Dec-08)

You'll need to read this post to understand everything in this figure, but simply: upper-right = good; lower-left = bad.

The Hoyas went 6-14 the rest of the way. Obviously, that figure - and the Hoyas' early season performance - didn't tell us what was to come, only what had happened so far.

Now I could simply knock out this year's version and let it rest, or I could stare at a bunch of numbers on a spreadsheet and try to understand why the early season performance by the Hoyas was such a poor predictor of the rest of the season.

Let's do both.

First, here's this year's aerial, through games played Monday:

2009-10 Big East Aerial (28-Dec-09)

As conference play gets underway, Syracuse and West Virginia appear to be the class of the Big East, both with fairly balanced teams (offense vs. defense efficiencies).

Georgetown, surprisingly, is a solid third by KenPom's ratings, although the team is highly dependent upon its defense (#1 in the conference) as the offense is merely average.

Following is an enormous cluster of teams, headed by Marquette and Villanova and followed by St. John's, Pitt, UConn, Louisville, Seton Hall, Cincinnati and South Florida. That's nine teams all grouped together.

Notre Dame - which almost fell off the chart with its extreme of the league's best offense and worst defense - and Providence are slightly behind the peloton, and Rutgers and DePaul hold the final two spots.

A few points to make:

Note that the scaling on this year's aerial is slightly different than last year's. The offensive scale doesn't go quite so high and the defensive scale doesn't go quite so low. For instance, Notre Dame had better adj. off. and def. eff. stats last year, while this plot may lead you to think that their offense is actually better this year.
The elite teams this year aren't quite as elite as those from last season headed into January. Since four Big East teams made it to the Elite 8 last year, a drop-off this time isn't unexpected.
No team last season made it to the NCAA tournament from behind the isopleth that crosses the solid diagonal at OE=107, DE=93. This year, S. Florida sits just behind that line. Three teams that were ahead of that line last year (Georgetown, Notre Dame and Cinci) failed to make the tourney, although they were bubblicious until the end. If that's any sort of bellwether, it would mean 12 Big East teams are capable of making the NCAAs this year (although obviously some will not).
A few teams may look out of place:

Georgetown's pedestrian offense may seem unusual, but did you know that the Hoyas have finished higher than 8th in off. efficiency in the Big East only two times in JTIII's first five seasons (2006 & 2007)?
Louisville's offense is better than their defense? The Cardinals have finished 4th, 2nd, 1st and 1st in def. efficiency in conference their first four seasons. If the offense can tread water, expect Louisville to improve as its defense does.
Pitt was one of the most lopsided teams last year - they had a great offense and mediocre defense. It's the opposite so far this year.
To my eye, the most dramatic year-over-year improvement is S. Florida, but I'm not sure how they'll do without Gus Gilchrist for the foreseeable future. St. John's is also far ahead of last year.

----------------------------------------------------------------------------------

Okay, that was fun, but what about the real issue - how well can we predict the rest of the season from how well a team plays in November and December?

Here's Georgetown's won-loss record for the past 6 seasons in the early part (all games in Nov. and Dec.) and late part (Jan. - Apr. games) of the schedule:

.                Early                  Late
Season       W   L     Net         W   L     Net 
2003-04      9   0   + 28.8        4  15   - 11.8
2004-05      8   3   + 13.7       11  10   +  1.1
2005-06      8   2   + 21.2       15   8   +  7.7
2006-07     10   3   + 23.1       20   4   + 15.7
2007-08     10   1   + 31.9       18   5   + 12.1
2008-09     10   1   + 29.6        6  14   -  3.6

That third column for each segment ("Net") refers to the difference between off. and def. efficiency during those games. It turns out that the best two W/L records, and the 2nd and 3rd best net efficiencies from early season, preceded two epic collapses - yes, this dataset goes back to the final Esherick season, although that is just a coincidence, since this is as far back as KenPom's database goes.

Now hopefully, if you're my regular reader or you've bothered to make it this far down the article, you'll want to dig a bit deeper into those numbers above. For instance, what was the quality of competition like during each early season? If we adjust the underlying off. and def. efficiencies for that competition, what would the net eff. numbers look like?

Let me tackle those one at a time. First, I'll re-post those net eff. numbers with the opponents' average KenPom rating:

.             Early                    Late
Season      Net   Rating             Net  Rating
2003-04   + 28.8   258             - 11.8   63
2004-05   + 13.7   175             +  1.1   55
2005-06   + 21.2   190             +  7.7   49
2006-07   + 23.1   150             + 15.7   48
2007-08   + 31.9   160             + 12.1   59
2008-09   + 29.6   119             -  3.6   61

Clearly, JTIII does not subscribe to Esherick's (and by proxy, his father's) scheduling philosophy. The 03-04 team was simply getting fat on a bunch of low-major teams early season, and never had its mettle tested until conference play got rolling. The 08-09 bunch, however, played the toughest early-season schedule so far - remember that the UConn game last year was played before New Year's, so it is included. The huge drop in net efficiency last year came after the team looked to be able to handle just about anyone early on.

Let's now take a look at each of the individual efficiencies, now adjusted for competition during the segment. This is analogous to looking at KenPom's Adj. Efficiencies, rather than raw efficiencies. As I mentioned at the top, when we discuss adj. efficiencies, what we mean is how would we expect the team to perform against an average Div-I team.

Here we go, looking at offense first:

.                Off. Eff.
Season     Early  Late   Diff.
2003-04    103.9   96.2  -7.7
2004-05    106.9  113.2   6.3
2005-06    109.9  117.7   7.7
2006-07    113.8  125.9  12.1
2007-08    119.1  114.3  -4.8
2008-09    117.4  108.3  -9.1

The last Esherick team was the worst offensive team in the group, both early and late season, once you adjust the stats for the level of competition.

One of the great mythical concepts of the Georgetown/Princeton offense is that it is massively complex and takes considerable practice time and game experience to master. The first three seasons under JTIII certainly give evidence to that, as the team improved from early season to late season. This wasn't the case in 07-08, where the team was stocked with upperclassmen and operated at a high level even with the step back in conference. But last year's team showed the biggest overall drop intra-season, and the roster had only 1 junior and 1 senior - surely an improvement should have come.

It's also worth noting that, in spite of the collapse last year, the team was still scoring more than 12 points per 100 possessions more than the 03-04 club.

.                Def. Eff.
Season     Early  Late   Diff.
2003-04     91.6   92.8   -1.2
2004-05     97.3   94.4    2.9
2005-06     92.9   92.6    0.3
2006-07     89.1   88.7    0.3
2007-08     87.2   85.3    1.9
2008-09     82.6   95.0  -12.4

Excluding last season, this table just makes a lot more sense to me. I can understand offenses going into slumps, for example due to a prolonged stretch of poor outside shooting, but I'd expect that a team's defensive ability will be fairly well established in the first two months of the season, once the level of competition is accounted for.

An important trend, and one that I don't think most fans or analysts are picking up on, is that each Hoya team in the JTIII era had been better defensively than the previous year's team, and had improved from early-season to late-season. Heading into Jan. 2009, Coach Thompson was beginning to look like some sort of defensive genius.

Then the wheels came off.

If we use the Esherick team as an archetype for one that had lost its confidence, or perhaps better a coach that had lost his team, we still see only a small drop in the quality of defensive play. Last season, there were no significant injuries, yet the defense on the court in conference play not only didn't resemble the early season team, it broke with a general trend over a five-year period.

.                Net Eff.
Season     Early  Late   Diff.
2003-04     12.3    3.4   -8.9
2004-05      9.6   18.8    9.2
2005-06     17.1   25.1    8.0
2006-07     24.8   37.2   12.4
2007-08     31.9   29.0   -2.8
2008-09     34.8   13.3  -21.4

Finally (for tonight), I'll summarize the previous tables with the net adj. efficiencies for the past six seasons. I think it's worth noting that, as much as we like to decry the late season swoon last year, the early season performance was about as good as the Hoyas have played since JTIII arrived, just a touch below the 06-07 team during its run to the Final Four. The shock wasn't so much how poorly they were playing by the end of the season, but rather how far they had fallen in such a short period of time.

While the Esherick team's collapse wasn't nearly as large, that team simply wasn't good enough to start with to be able to collapse very far.
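
If you want to check my arithmetic, the net table is nothing more than offense minus defense for each segment, with the late-minus-early difference tacked on. Here's a minimal sketch that rebuilds it from the rounded numbers in the Off. Eff. and Def. Eff. tables above. A few entries land 0.1 away from the published table, presumably because those were computed before rounding. The names `OFF`, `DEF` and `net_table` are just for illustration.

```python
# (early, late) adjusted efficiencies copied from the two tables above
OFF = {
    "2003-04": (103.9, 96.2),
    "2004-05": (106.9, 113.2),
    "2005-06": (109.9, 117.7),
    "2006-07": (113.8, 125.9),
    "2007-08": (119.1, 114.3),
    "2008-09": (117.4, 108.3),
}
DEF = {
    "2003-04": (91.6, 92.8),
    "2004-05": (97.3, 94.4),
    "2005-06": (92.9, 92.6),
    "2006-07": (89.1, 88.7),
    "2007-08": (87.2, 85.3),
    "2008-09": (82.6, 95.0),
}


def net_table(off=OFF, deff=DEF):
    """Return {season: (early net, late net, late - early)}, rounded to 0.1."""
    table = {}
    for season, (off_early, off_late) in off.items():
        def_early, def_late = deff[season]
        early = round(off_early - def_early, 1)
        late = round(off_late - def_late, 1)
        table[season] = (early, late, round(late - early, 1))
    return table


for season, (early, late, diff) in net_table().items():
    print(f"{season}  {early:6.1f} {late:6.1f} {diff:+7.1f}")
```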

Tomorrow, I'll dig a bit deeper into the underlying stats, and take a look at what we've learned so far this year.

Sunday, October 25, 2009

Season Preview: Austin Freeman

Edited: [10-26, 10pm] Crap. Well, apparently it's preseason for the bloggers, too.

I had a typo in one of my spreadsheet formulas, which was screwing up the possession usage data in the Georgetown players' skill curves. I've corrected the figures and accompanying text - the story has changed a bit now, especially as it relates to Austin Freeman and the other returning players, so if you've already read this article, you might want to re-read the last section.

-------------------------------------------------------------------------------------------------

For better or for worse, the media, the fans and even the Georgetown Athletic Department have embraced the notion that this season's Hoya team will be led by its three McDonald's All-Americans: Greg Monroe (soph), Chris Wright (junior) and Austin Freeman (junior).

These are the only three returning players that we credit with positive net points (created more points than they allowed) from last season, so it seems natural that they would become the core of the team.

A trio of players leading the team is not new. During the JTIII era, there have been typically three players who use ≥22% of available possessions each season. If all players shared the ball equally, we'd expect a possession usage of 20%, so in effect the offense is usually dominated by three guys.

.       Player         %Poss    %Min    ORat
2004-5  Bowman          24.2    82.7    112.4
.       Green           23.8    84.0    111.5
.       Hibbert         25.3    39.3     89.2
.       Cook            20.3    80.1    102.3

2005-6  Green           25.4    80.7    102.7
.       Hibbert         25.6    59.6    120.9
.       Bowman          24.6    70.7    101.0
.       Cook            18.1    76.8    113.0

2006-7  Green           24.9    83.0    114.4
.       Hibbert         22.8    65.7    130.8
.       Summers         22.0    65.7    101.8
.       Wallace         18.9    80.2    119.7

2007-8  Hibbert         25.9    66.0    120.5
.       Summers         23.8    67.5    104.0
.       Sapp            22.7    66.4    105.5
.       Wright          21.9    42.4     97.7*
.       Freeman         18.1    63.9    115.9

2008-9  Summers         24.4    72.0    104.0
.       Monroe          22.9    76.0    110.9
.       Wright          21.3    81.5    107.2
.       Freeman         19.4    74.3    115.6

*Wright missed 18 games his freshman year, so his usage stats aren't easily compared to his teammates.
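
As a quick sanity check on the "three guys" claim, here's a small sketch that counts the players at or above the 22% mark in each season, using the %Poss column from the table above. The 20% equal-share figure is just 100% split five ways. Note that, going strictly by the table, last season had only two players over the line (Wright was at 21.3%), which is why I say "typically". The dictionary and function names are mine.

```python
# %Poss values copied from the table above
USAGE = {
    "2004-5": {"Bowman": 24.2, "Green": 23.8, "Hibbert": 25.3, "Cook": 20.3},
    "2005-6": {"Green": 25.4, "Hibbert": 25.6, "Bowman": 24.6, "Cook": 18.1},
    "2006-7": {"Green": 24.9, "Hibbert": 22.8, "Summers": 22.0, "Wallace": 18.9},
    "2007-8": {"Hibbert": 25.9, "Summers": 23.8, "Sapp": 22.7,
               "Wright": 21.9, "Freeman": 18.1},
    "2008-9": {"Summers": 24.4, "Monroe": 22.9, "Wright": 21.3, "Freeman": 19.4},
}

EQUAL_SHARE = 100 / 5   # five players sharing possessions equally -> 20%
THRESHOLD = 22.0        # the "possession eater" cutoff used in the text


def possession_eaters(players, threshold=THRESHOLD):
    """Names of players whose usage rate is at or above the threshold."""
    return [name for name, pct in players.items() if pct >= threshold]


for season, players in USAGE.items():
    eaters = possession_eaters(players)
    print(season, len(eaters), ", ".join(eaters))
```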

Usually, the next man in line for possessions is much more efficient offensively than at least one of his more aggressive teammates. Last season the "next man in line" was more efficient than all three players who were ahead of him.

That man was Austin Freeman. It's also worth noting that he was able to keep a high offensive rating despite having his 3FG shooting accuracy drop from 40% to 31% from his freshman to sophomore season. One could reasonably hope that he will be even more proficient this year.

There are two fundamental hurdles that he - and any player looking to step into a bigger role - must overcome. We'll call them inertia and marginalism. Each of these concepts is fundamental to a pair of questions we'll ask about Austin Freeman coming into this season:

Can Austin Freeman increase the rate at which he uses possessions, to become a go-to offensive player rather than just a complementary one?
Will there be a cost in his offensive efficiency if he does use more possessions?

Inertia

A couple of years ago, Ken Pomeroy posted an article on Basketball Prospectus noting that

[o]nce a player demonstrates himself to be a role player, it's unlikely he'll ever be a go-to guy and, therefore, a superstar. It's not quite a law in college basketball, but players who are not very involved in the offense tend to stay that way. Any major changes in a player's usage are usually the result of filling the hole left by a departing possession eater.

I found this point compelling, so much so that I wrote about this each of the past two pre-seasons, and here I am doing it again.

As an aside, an important point to keep in mind during this discussion is that we are discussing usage rate (a percentage), not possessions used (a counting stat). As players receive more minutes of playing time, their counting stats will naturally increase. But here we are concerned with how their rate statistics change, which should better indicate a change in behavior or ability.

Greg Monroe and Chris Wright appear naturally predisposed toward using possessions - Wright has used ~22% of available possessions each of the first two seasons, and Monroe was using more than 23% last year. This was a good thing last season, as both were more efficient than the team overall, especially when looking at performance versus Top 100 opponents. In fact, they were the second and third best option on offense in those games. The most efficient offensive player, whether you look at vs. Top 100 teams, conference games or even all games, was Austin Freeman.

Can we expect that Freeman will use significantly more possessions this year? First, let's see if we learned anything from his freshman to sophomore growth.

From the table above, we see that the Hoyas went into last year with only one possession-eater lost (Hibbert) and three returning (Summers, Sapp and Wright). So possessions were available, but there wasn't a wholesale change at the top.

To understand the year-to-year change in possession usage a typical Big East player experiences, we can take a look at all Big East players from 2005-2008 and fit a line through their possession usage rates from one year to the next. I've attached a figure from last year's article - you'll need to go back and read that post to understand all of what's going on in it, but for now all we care about is the solid black line that is fitted to the circles (click on the figure to enlarge).

The typical Big East player will increase his usage from one year to the next, so long as he used less than 22% of possessions in the previous year. Players who used more than 22% of possessions the previous season tend to use less. Moreover, we can use that fitted black line to actually estimate how many more possessions a player would be expected to use the next season.

Austin Freeman went into last season having used 18.1% of possessions as a freshman. Based on historical Big East growth rates, we expected him - on average - to use 18.9% of possessions as a sophomore. He actually exceeded that by a bit (19.4%). So it looks like Freeman is fairly well-described by our little model, or perhaps we're being a bit conservative.

This season, the Hoyas again have lost one possession eater (Summers) and return two (Wright and Monroe), so we'd expect about the same change or increase in usage from the returning players.

If we apply the model towards next season, we'd only expect Freeman to use 20.0% of available possessions, which would frankly be a bit disappointing in light of his offensive ability. Let's take this a bit further. Because we are über-geeks here, we can actually predict what his usage rates would be under favorable (75th percentile) and extraordinary (95th percentile) conditions, just as Pomeroy did.

.                       Year 2
Year 1: 19.5%         Expectation
.  Average               20.0
75th percentile          22.0
95th percentile          25.5

Assuming the model is good for Freeman, an increased usage to the magical 22% rate - both the seeming natural usage rate for players and the top tier for players in the Georgetown offense - has about a 1 in 4 chance of happening this season. It's tempting to say that he'll likely take more than 20% of possessions, since he used more than predicted last year (or to say that there is better than a 1 in 4 chance he'll get to 22%), but I'm a bit hesitant to draw this conclusion from one data point (his change from freshman to sophomore year).
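
For the curious, the average expectation can be roughly reproduced without the full dataset. This is a back-of-the-envelope sketch, not the actual fit: I'm assuming the fitted line is straight and crosses the no-change point at exactly 22%, and I back out its slope from the one worked example given above (18.1% as a freshman, 18.9% expected as a sophomore). The percentile bands can't be recovered this way, so they are left out. `PIVOT`, `SLOPE` and `expected_usage` are illustrative names.

```python
PIVOT = 22.0  # assumed usage rate at which the line predicts no change

# Slope backed out from the worked example: 18.1% -> 18.9% expected
SLOPE = (18.9 - PIVOT) / (18.1 - PIVOT)


def expected_usage(prev_usage):
    """Average next-season usage rate under the assumed straight-line model."""
    return PIVOT + SLOPE * (prev_usage - PIVOT)


for prev in (18.1, 19.5, 22.0, 25.0):
    print(f"{prev:5.1f} -> {expected_usage(prev):5.1f}")
```

A slope below 1 is what produces the behavior described above: players under 22% drift up toward it, and players over 22% drift back down.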

Marginalism

Throughout the above discussion, we were only concerned with the percentage of possessions Austin Freeman might use this year, with the hope that he might increase his usage rate more than expected. The assumption is that a sharp increase in possession usage by Freeman would help the team because he is the team's most efficient scorer. Taking some of Summers' and Sapp's possessions and scoring on them at Freeman's rate will help the offense.

But if Freeman takes more possessions and shots, would he remain as efficient a scorer? As a player takes more and more possessions from his teammates, does his efficiency decrease, and by how much?

The law of marginal utility (i.e. "diminishing returns") should be familiar to anyone who's had to suffer through an economics class. Simply, as a resource is increasingly available or used, the utility of each quantum of the resource decreases. In plainer English, the more abundant an item, the less its value. Think crop prices, or water rates.

To my best knowledge, this idea of marginal return was first applied to basketball by Dean Oliver, who wondered if players were more offensively efficient when they used fewer possessions. He discusses this in his book Basketball on Paper, and, to this end, he looked at three NBA players: Jerry Stackhouse, Michael Jordan and Georgetown's very own Allen Iverson via what he calls "skill curves" (I've reproduced his plot here):

To my way of thinking, he's got the axes backwards (usage rate is the independent variable and therefore should be on the X-axis) but the conclusions from the data are still clear. I'll flip the axes to make my point, though (and ignore that red line for a moment):

As players increase their usage - the percentage of possessions they use - they become less efficient.

However, it's not a smooth curve, but rather a sigmoidal fit (an S-curve), so that there is a big jump between efficient usage and inefficient usage. That notch varies from player to player, and Jordan's greatness shows up by where his notch is: he can produce a 120 offensive rating (1.2 pts. per poss. used) even while using more than 30% of available possessions.

There is a common criticism of Oliver's work, summarized recently by Kevin Pelton over at Basketball Prospectus:

Most past efforts [to understand efficiency vs. usage in the NBA] were tripped up by the problem of looking at usage on a game-by-game basis. Naturally, players will use more possessions on nights where they have a more favorable matchup, so it is not surprising that these studies actually found that players' efficiency rose as their usage increased.

More recently, Eli Witus expanded greatly upon this pioneering work by comparing high-usage and low-usage lineups for the 2007-8 NBA season, to find a relationship between player usage rates and efficiency without the confounding effect described by Pelton. I won't go into much detail here - the article may be a bit advanced for non-geeks - but the upshot was that he found that, if a player increases his usage rate by 1%, his efficiency will decrease by 1.25 points. This result is that red line added to the graph above. While it doesn't apply to Jordan, this new analysis actually shows good agreement with Oliver's work with "normal" NBA superstars.
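
To make Witus' rule of thumb concrete, here's a minimal sketch of the linear trade-off: each added percentage point of usage costs 1.25 points of offensive rating. Plugging in Freeman's sophomore numbers (19.4% usage, 115.6 rating) and a jump to 22% is purely an illustration on my part. The slope comes from NBA lineups, and whether it applies to a college player at all is exactly the open question. `projected_rating` is a made-up helper.

```python
WITUS_SLOPE = -1.25  # rating points per +1 percentage point of usage (NBA, per Witus)


def projected_rating(rating, usage_now, usage_new, slope=WITUS_SLOPE):
    """Offensive rating after a usage change, assuming a straight-line trade-off."""
    return rating + slope * (usage_new - usage_now)


# Illustration only: Freeman's 2008-9 line, bumped up to 22% usage
print(round(projected_rating(115.6, 19.4, 22.0), 1))
```

By this yardstick, the step up to 22% would cost a little over 3 points of offensive rating, which would still leave him well above 100.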

This is all well and good, but is this information applicable to Austin Freeman, or the Hoyas more generally?

To find out, I compiled efficiency vs. usage stats for the past three seasons for Georgetown, much like Oliver did. I don't have the energy, and probably not the skills either, to redo Witus' work. Here, I simply compiled offensive rating vs. poss. usage rate for each player in each game, using my HD Box Score program, which should be more accurate than using the traditional box score calculations.

The data tends to be quite a bit more noisy than Oliver's plots, mainly because there aren't nearly as many games to sort through. Oliver looked at 2 NBA seasons (164 games), while I have data for 88 Georgetown games over the last three years. I've also used relatively narrow "bins" or ranges of possession usage to average - I'm using increments of 2.5% (e.g. averaging games with 15% - 17.5% poss. used). I've done this so each player's skill curve will have at least 8 points. I've included standard deviations for each bin to help indicate that noise - a point with no error bars is from a single game.
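
The binning itself is simple enough to sketch. The version below groups per-game (usage %, offensive rating) pairs into 2.5%-wide bins, then reports the mean rating and standard deviation for each bin, with no standard deviation when a bin holds a single game. Two caveats: the three games listed are made-up placeholders rather than real HD Box Score output, and the choice of an unweighted mean with a sample standard deviation is my assumption for this sketch.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

BIN_WIDTH = 2.5  # percentage points of possession usage per bin


def skill_curve(games, width=BIN_WIDTH):
    """Bin (usage %, off. rating) pairs and summarize each bin.

    Returns a list of (bin low, bin high, mean rating, std dev or None, n games).
    """
    bins = defaultdict(list)
    for usage, rating in games:
        low = math.floor(usage / width) * width
        bins[low].append(rating)

    curve = []
    for low in sorted(bins):
        ratings = bins[low]
        # A single-game bin gets no error bar
        spread = stdev(ratings) if len(ratings) > 1 else None
        curve.append((low, low + width, mean(ratings), spread, len(ratings)))
    return curve


# Hypothetical games, for illustration only
games = [(16.0, 120.0), (17.0, 110.0), (21.0, 95.0)]
for low, high, avg, spread, n in skill_curve(games):
    print(low, high, avg, spread, n)
```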

We'll start with Roy Hibbert and Jon Wallace, combining their junior and senior seasons. Here, Witus' expected decline rate is now indicated by the dashed gray line.

We don't see the notch - the big and sudden drop in efficiency at high usage rates - but there also aren't the extremely high usage rates that the NBA stars can reach. What we do see is that the decline looks very different for the two players.

Hibbert - a high usage player - was incredibly efficient at virtually every usage rate (and I have no idea why he has that drop when he used less than 10%), good for about a 130 Off. rating when using between 12% and 33% of possessions. His efficiency finally starts to drop at extremely high usage rates (>35%), but even this part of the curve is being driven by a single game (against Michigan, Nov. 2007).

Wallace - a low usage player - has a very different skill curve. There's a lot more noise in his data, which I believe is attributable to his high dependence on 3pt shooting. He also suffered from a much steeper drop in efficiency as he used a higher percentage of possessions. If we fit a line to his curve (not shown) we'd see that his expected offensive rating drops below 100 around 25% of possessions used. And since he was surrounded by other skilled offensive players, it makes intuitive sense that we'd not want him to use much more than 20% of possessions, which was his natural behavior.

Next up are Sapp and Summers, for whom I have the last three seasons. I've left the Witus line at the same location as for the Hibbert/Wallace plot, to allow for easy comparison.

While Summers was a forward and Sapp a guard, the slopes of their efficiency curves are quite similar. They both shot about half of their shots from outside (Sapp: 428/811 3FGA/FGA = 52.8%, Summers: 411/838 = 49%) at about equal proficiency (Sapp: 34.5% 3FG, Summers: 35.1%) over their careers, so this may not be entirely surprising. Once again, we see no notch in their curves, but a decline in efficiency at increasing usage not as steep as for Wallace. Sapp, especially, showed a steady drop paralleling the Witus line, although he seems to have an upward notch at the 25% usage rate. I wonder if this is the effect Pelton discussed; Sapp - who I think was always under-appreciated for his basketball sense - may have been more adept at recognizing and exploiting a favorable matchup.

At even moderate usage (>15%), neither player showed an area of high efficiency (>120 off. rating), but Summers did post some very high off. ratings at the lowest usage bins (although those were highly variable). This is not to say that these were poor offensive players - a 120 off. rating is very good - but neither looked to be a consistently great offensive player, even when not required to carry the load.

Now that we've got some context, let's take a look at how Austin Freeman has performed over the last two seasons.

Freeman's curve is a bit harder to make sense of, as he's got that big drop in efficiency when using 17.5-20% of possessions. In a bit of a statistical fluke, most (7 of 9) of the games that make up this bin are from his freshman year, and that dip seems to be due to his freshman games (his two sophomore games in the bin are amongst the three best of the bin). More on year-to-year improvement below.

Ignoring that dip, we see that Freeman can be an elite offensive player when he's using less than ~22% of possessions, operating at the level of Hibbert and Wallace rather than Summers and Sapp. Also, it's apparent that Freeman does not do well when he takes on a higher load - above 22% of possessions used his off. rating drops below 100, i.e. to a mediocre level.

So here we are faced with a conundrum - Freeman has been anointed to be one of the big 3 players for the Hoyas this season, but his offensive game suffers greatly when he steps into the high usage (>22%) role.

I'll now add Wright and Monroe to Freeman's graph:

As you can see, Monroe also has the drop in his skill curve, although his looks to drop below a 100 off. rating somewhere around 27% of possessions used.

Chris Wright's curve is a complete mess. That huge drop at low usage rate is the average of two games against Pitt, including the 2008 BET when he put up 0 points created in 30 possessions played. But even ignoring that point, his skill curve just doesn't seem to obey the rules of efficiency vs. usage. I don't know if this is a result of the 18 games he missed during his freshman year or his inconsistent outside shooting, but I'll refrain from further comment until we get another season to add to the database.

Am I underselling Freeman's potential for this year?

There is one critical point that I've been ignoring here: year-to-year improvement. Unlike Oliver and Witus, we aren't discussing mature NBA players, but college kids who are still developing their skill sets and learning a complicated offensive scheme.

To address this, I've come up with a simplistic plot. I've taken all Big East players for the 2005-2008 seasons who played at least 10% of available minutes, and found the difference between their current and previous year's poss. usage and off. rating. For example, looking at Austin Freeman:

Season   Poss %   ORat
2007-8    18.1    115.9
2008-9    19.4    115.6
Diff.     +1.3     -0.3

I've compiled all available player-seasons (n=274) in this graph:

The markers are color-coded by Year-2 offensive rating and sized by Year-2 percent minutes played. The fitted line (with the fit weighted by % min) is the black line, with the 75% and 95% prediction bands in blue and gray, respectively.

The evidence is not promising. That line has a negative slope, just as Witus saw for NBA players. Ours has a gentler slope, but still shows that a 1 percentage point increase in possession usage from one season to the next will cost an average Big East player about 0.78 points in off. rating.
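For anyone who wants to try this sort of fit at home, here is a minimal sketch of a straight-line fit weighted by percent minutes. The function name and the five data points are made up for illustration; they are not the 274 player-seasons in the graph, and this is not necessarily how the fitted line was computed (the prediction bands are left out entirely).

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total_w = sum(w)
    # Weighted means of x and y
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted sums of squares and cross-products about the means
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical player-seasons (illustrative values only):
delta_usage = [1.5, -0.5, 3.0, -2.0, 0.5]       # change in poss. used (%)
delta_ortg = [-1.0, 4.0, -4.0, 2.5, 1.0]        # change in off. rating
pct_minutes = [60.0, 70.0, 40.0, 25.0, 55.0]    # Year-2 % minutes (the weights)

slope, intercept = weighted_linear_fit(delta_usage, delta_ortg, pct_minutes)
print(f"slope = {slope:.2f} ORtg points per point of usage")
```

Players who barely saw the floor get little say in where the line goes, which is the whole point of weighting by minutes.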

All I can offer is that the correlation is extremely weak: the 1σ uncertainty of that slope is 0.73, which is to say that it is just barely significant. To put it another way, of the 274 player-seasons we're looking at here, 80 showed an improvement in offensive rating while increasing possession usage. Or take a look at Chris Wright, who improved his offensive rating 9.5 points (97.7 to 107.2) with a drop of only 0.6 points in usage (21.9 to 21.3).
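To put numbers on "just barely significant", here is the arithmetic as a quick sketch. The variable names are mine; the values are the ones quoted above.

```python
slope = -0.78        # fitted change in ORtg per point of added usage
slope_sigma = 0.73   # 1-sigma uncertainty of that slope

# How many standard errors the slope sits away from zero
z = abs(slope) / slope_sigma

improved = 80        # player-seasons with higher usage AND higher ORtg
n = 274              # all player-seasons in the sample
share = improved / n

print(f"slope is {z:.2f} sigma from zero")   # slope is 1.07 sigma from zero
print(f"{share:.1%} improved anyway")        # 29.2% improved anyway
```

A slope about one standard error from zero is nothing to bet the season on.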

Could inherent talent (using, e.g. RSCI ranking as a metric) help some players to improve offensively in spite of increased usage? That study will have to wait for another day.

Sunday, November 16, 2008

Another stats gimmick, and J'ville preview

Excuse this interruption of SFHoya99's season preview, but I thought I'd chime back in to introduce another stats feature that I've been working on behind the scenes.

If you're looking for the Jacksonville preview, you'll need to scroll down quite a bit.

My regular reader may have noticed by now that I've been loath to assign credit or blame to specific players during a single game, but rather tend to present team stats. I do this in part because I think that it is difficult to evaluate individual play (especially defense) with a simple basketball box score.

There are tools available to glean some additional information when you look at a single game, notably the individual net score box that Dean Oliver describes in Basketball on Paper. Henry Sugar over at Cracked Sidewalks is a particular proponent of this, and has been providing Marquette fans with his version (which he calls "Individual Player Ratings") for most of last season. Here's an example from last year's game between MU and Villanova (hope he doesn't mind me linking):

Note that I've previously discussed this game when I introduced my version of the HD Box Score.

I won't explain Mr. Sugar's work here, but I will point to an excellent post he wrote last season covering the basics of each stat column listed. The bottom line for most fans is in columns 5 and 7 - points produced and net points added. This gives us an idea, based on tempo-free stats, of just how many points each player contributed towards the game result (in this case, a 10 point win for Marquette).

There are some limitations to this work.

Without going into too much detail here, I can assure you that the defensive rating assigned to each player for this game is just loosely tied to reality. Defensive stats are not available for most basketball games (NBA too) at the detail-level needed, so it is somewhere between difficult and impossible to assign blame for each player's defensive effort.

But more generally, the calculations used for the stats in the table above are underpinned by a large number of estimates, which should improve as we aggregate data over the course of a season, but which can be quite a bit off during an individual game. Here are just some examples of missing information needed to make the calculations for the stats above:

How many possessions did a player have on offense? Defense?
How many offensive/defensive possessions ended in a score?
What percentage of field goals made by a player were off of an assist?
How often are a player's missed shots rebounded by a teammate?
How well did the team rebound while the player was on the court?
How often did a player end a possession by making at least 1 free throw?
How often does a player give a foul, and the opponent miss at least 1 free throw (e.g. Hack-a-Shaq)?

None of these questions - and others I haven't posed - can be answered by looking at the game box score. So the only recourse is to make estimates, based on a series of formulas introduced by Dean Oliver (and presumably used by Henry Sugar).

However, all of the questions asked above can be answered by parsing the available play-by-play from the game. And that is what I propose to do.
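As a rough idea of what "parsing the play-by-play" buys you, here is a minimal sketch that tallies what fraction of each player's made field goals were assisted. The event format (a list of dicts with "type", "player" and "assist" keys) is purely hypothetical; real play-by-play transcripts look nothing this tidy, and this is not my actual parser.

```python
from collections import defaultdict


def assisted_fg_share(events):
    """Return {player: fraction of made field goals that were assisted}."""
    made = defaultdict(int)
    assisted = defaultdict(int)
    for ev in events:
        if ev["type"] != "made_fg":
            continue  # only made field goals matter for this tally
        made[ev["player"]] += 1
        if ev.get("assist"):
            assisted[ev["player"]] += 1
    return {player: assisted[player] / made[player] for player in made}


# Hypothetical events, already cleaned up into dicts
events = [
    {"type": "made_fg", "player": "Player A", "assist": "Player B"},
    {"type": "missed_fg", "player": "Player A"},
    {"type": "made_fg", "player": "Player A", "assist": None},
    {"type": "made_fg", "player": "Player B", "assist": "Player A"},
]

shares = assisted_fg_share(events)
print(shares)
```

The same pattern (walk the events, keep a counter per player) covers most of the questions in the list; the hard part is getting clean events out of the transcript in the first place.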

A few points to consider:

While I can improve the accuracy of the final stats by replacing estimates with actual tallies of various components of the calculations, I'm not modifying the philosophy (or math) of the final stats. That is, if you don't think individual player Offensive Rating is a good measure of how a player contributes on offense, there is little here to convince you otherwise. Of course, if your main quibble is with D. Oliver's many underlying estimates, keep reading.

As I've said before, the drawback of using play-by-play data is that there are inevitably errors in the transcript, which can lead to uncertainty in assigning credit or blame. However, I am not convinced that these same errors aren't also in the official box score, but are just hidden from view. Just for Georgetown, I know of at least one instance where Ken Pomeroy found an error in the play-by-play that propagated to the box score.

I am not exploiting the play-by-play fully yet, because it takes a lot of work. I've written over 5000 lines of code so far (yes, that was a brag) and my wife keeps mentioning how much time I spend working on the program, and something about a divorce (at least I think that's what she said, I wasn't really paying attention). For instance, I could record the shooting percentage of each player making an assisted basket, but I don't yet. I could distinguish between assisted dunks, layups and jumpers, but I don't yet.

A bigger point, and it goes back to an early post, is that I don't really believe in D. Oliver's defensive stats, and frankly I don't think he does either. They are merely an estimate, using an exceedingly limited toolbox. Here's what I wrote there to briefly explain his Defensive Rating stat:

Defensive rating is an attempt to estimate the contribution of each player to the team's defensive efficiency. It is calculated as team defensive efficiency, plus one-fifth of the difference between team defensive efficiency and individual player stops per 100 possessions played. Player individual stops are estimated from the number of blocks, steals and defensive rebounds each player has, plus some team stats. Since it is not a simple ratio, it is more like being graded on a curve, such as that it is limited to the range of 80% - 120% of team defensive efficiency. So, a player who literally refused to play defense (e.g. Donte Greene) could score no worse than 80% of his team's efficiency. I would describe this stat as a very rough estimate of actual defensive worth . . .
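For the curious, here is a minimal sketch of that calculation, using the form of Oliver's Defensive Rating as it is commonly written out. The function and argument names are mine, and the stop % and points-per-scoring-possession inputs are taken as given rather than estimated from the box score.

```python
def defensive_rating(team_def_eff, stop_pct, opp_pts_per_scoring_poss):
    """Sketch of Dean Oliver's individual Defensive Rating.

    team_def_eff: team points allowed per 100 possessions
    stop_pct: share of the player's individual possessions that end in a stop (0 to 1)
    opp_pts_per_scoring_poss: opponent points per scoring possession
    """
    # Points per 100 possessions if every possession went through this player
    individual = 100.0 * opp_pts_per_scoring_poss * (1.0 - stop_pct)
    # Only one-fifth of the gap to the team number is credited to the player
    return team_def_eff + 0.2 * (individual - team_def_eff)


# Illustrative inputs: team efficiency of 100, 2.0 points per scoring possession
best_case = defensive_rating(100.0, 1.0, 2.0)    # a stop every time
worst_case = defensive_rating(100.0, 0.0, 2.0)   # never gets a stop
print(best_case, worst_case)
```

With those round inputs the two extremes land at roughly 80 and 120, which is where the "graded on a curve" range comes from.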

Later in that same post, I discussed an alternative method, which was simply to use available plus/minus stats to calculate the team's defensive efficiency while the player was on the court, and use that (less the team's defensive efficiency while the player was off the court) to rate that player's defensive ability.

The drawback to this method, pointed out on this thread on Hoyatalk, is that the quality of one's teammates can have a big effect.
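The on/off comparison itself is simple enough to sketch. The function names and the sample numbers below are hypothetical, just to show the arithmetic.

```python
def def_efficiency(points_allowed, possessions):
    """Points allowed per 100 defensive possessions."""
    return 100.0 * points_allowed / possessions


def on_off_def_diff(on_pts, on_poss, off_pts, off_poss):
    """Team defensive efficiency with the player on the court, less
    the same figure with him off the court (negative is good)."""
    return def_efficiency(on_pts, on_poss) - def_efficiency(off_pts, off_poss)


# Hypothetical game: 40 points allowed in 45 possessions with the player on,
# 30 points allowed in 25 possessions with him on the bench
diff = on_off_def_diff(40, 45, 30, 25)
print(f"on/off difference: {diff:+.1f} points per 100 possessions")
```

Nothing in that number says who was on the floor with him, which is exactly the teammate problem.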

So here, I'm proposing a new method: I am using Dean Oliver's basic statistics for player offensive and defensive rating, but the data I am feeding into the underlying equations are only those generated by his team while the player was on the court. This should especially help with defensive stats, in that the base team defensive efficiency used is now the def. efficiency while the player was on the court (i.e. the player receives no credit or penalty for great or lousy defense played by his teammates while he sat on the bench). The remainder of Dean Oliver's def. rating calc. (stops, stop %, scoring poss., etc.) is used as originally described. Additionally, as stated earlier I am removing as many of the estimates used by Oliver as I can, when I have time. The seven listed above are all incorporated, along with a few others (e.g. is a blocked shot recovered by the shooter's team?). I'll try to write up a FAQ covering all of the gory details at some point this season - likely when my wife is out of town.

As a test case, I've run the Marq/Nova game mentioned at the top of this post. Here's what I get:

INDIVIDUAL NET POINTS STATS

Marquette             Off    Poss           Individ     Def             Individ                             
Player                Poss   Used    ORtg   Pts Prod    Poss    DRtg   Pts Allow   Net Pts
HAYWARD, Lazar         59    12.5   111.2    13.9        59    100.9     11.9       +2.0                  
BARRO, Ousmane         51     3.5   149.3     5.2        51     95.4      9.7       -4.6                  
JAMES, Dominic         69    18.0   140.5    25.3        70     97.3     13.6      +11.7                  
MCNEAL, Jerel          66    18.6    79.0    14.7        67     96.4     12.9       +1.8                 
MATTHEWS, Wesley       42    11.7    92.8    10.8        42     86.0      7.2       +3.6                    
ACKER, Maurice         23     4.7   181.0     8.4        23     81.8      3.8       +4.7                  
FITZGERALD, Dan        16     0.3   280.0     0.9        17    104.5      3.6       -2.7                   
CUBILLAN, David        31     3.1    74.8     2.3        32    134.1      8.6       -6.3                   
BURKE, Dwight           6     0.0     -       0.0         7     62.9      0.9       -0.9             
MBAKWE, Trevor         12     2.0   100.0     2.0        12    124.4      3.0       -1.0                  
TOTALS                 75    74.3   112.4    83.5        76     98.7     74.6       +8.9          

Villanova             Off    Poss           Individ     Def             Individ                         
Player                Poss   Used    ORtg   Pts Prod    Poss    DRtg   Pts Allow   Net Pts
Pena, Antonio          62    12.9    75.6     9.8        60    123.0     14.8       -5.0                 
Cunningham, Dante      61    12.0    93.3    11.2        62    112.0     13.9       -2.7                     
Reynolds, Scottie      60    16.1    85.4    13.7        57    123.9     14.1       -0.4                     
Fisher, Corey          62    17.4    76.4    13.3        59    112.0     13.2       +0.1                 
Anderson, Dwayne       54     6.6   154.1    10.1        53    125.3     13.3       -3.1                    
Redding, Reggie        25     2.0   223.2     4.4        28     80.9      4.5       -0.1                   
Clark, Shane            8     0.8   333.3     2.5         9     70.2      1.3       +1.2                
Stokes, Corey          48     7.6   121.7     9.3        47    106.7     10.0       -0.7                 
TOTALS                 76    75.3    98.7    74.3        75    113.3     85.1      -10.8

The actual score of the game was MU 85, VU 75.

Several of the columns here are the same as Henry Sugar's above, but there are a few new ones as well. Briefly:

Off/Def Poss - the number of offensive or defensive possessions that a player was on the court; I think this is more useful than minutes played.

Poss Used - the number of offensive possessions used by a player (partial credit due to assists and offensive rebounds).

Off. Rating - the number of individual points produced, divided by the number of offensive possessions used, multiplied by 100. This is an estimate of the number of points a player would produce (not simply score) in 100 possessions.

Points Produced - similar to possessions used, it is an estimate of the team points scored that can be credited to an individual player; again, partial credit due to assists and offensive rebounds.

Def. Rating - An estimate of the number of points a player would allow in 100 possessions. See the discussion above the table for the details.

Points Allowed - The actual number of points allowed by the player - again an estimate.

Net Points - The difference between points produced and points allowed.
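Here is how those columns hang together, sketched with Lazar Hayward's row from the Marquette table. The function names are mine. The one-fifth split in points allowed is an inference that happens to reproduce the table (five defenders share each possession); treat it as an assumption rather than a stated formula.

```python
def off_rating(points_produced, poss_used):
    """Points produced per 100 possessions used."""
    return 100.0 * points_produced / poss_used


def points_allowed(def_rating, def_poss, players_on_court=5):
    """Player's share of points allowed while on the court.

    Assumption: each of the five defenders is charged one-fifth of the
    points implied by his Def. Rating over his defensive possessions.
    """
    return def_rating * def_poss / 100.0 / players_on_court


def net_points(points_produced, pts_allowed):
    return points_produced - pts_allowed


# HAYWARD, Lazar: 12.5 poss used, 13.9 pts produced, 59 def poss, 100.9 DRtg
ortg = off_rating(13.9, 12.5)
allowed = points_allowed(100.9, 59)
net = net_points(13.9, allowed)
print(f"ORtg {ortg:.1f}, Pts Allow {allowed:.1f}, Net Pts {net:+.1f}")
```

The same arithmetic can be run on any other row, give or take a tenth for rounding.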

I've also included a totals line for all stats, so you can actually check my work.

The total Off Poss & Def Poss are the actual number of possessions in the game.

The total number of possessions used by each team agrees very well with reality - for my data parser, total possessions used are typically within 5% of actual possessions played, but this game worked exceptionally well.

Total points produced for each team are also very close to actual points scored. These should be within 10%, and often within 5%.

The summed points produced divided by total possessions used gives an estimate of team off. efficiency. This is the value listed as the total of ORtg. The estimated team offensive efficiencies (112.4 & 98.7) agree extremely well with actual off. efficiencies for each team (113.3 & 98.7).
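If you want to check that comparison yourself, the arithmetic is short. The helper name is mine; the inputs are the TOTALS lines and the final score (MU 85, VU 75).

```python
def efficiency(points, possessions):
    """Points per 100 possessions."""
    return 100.0 * points / possessions


# Estimated: summed points produced / summed possessions used (TOTALS lines)
mu_est = efficiency(83.5, 74.3)
vu_est = efficiency(74.3, 75.3)

# Actual: final score / actual offensive possessions
mu_actual = efficiency(85, 75)
vu_actual = efficiency(75, 76)

print(f"Marquette: estimated {mu_est:.1f}, actual {mu_actual:.1f}")
print(f"Villanova: estimated {vu_est:.1f}, actual {vu_actual:.1f}")
```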

At least for this game, it appears that my method is giving a quite satisfactory measure of what happened on offense. It won't always be so accurate, but this is why I want to give these totals - it will allow my reader to decide for himself (do any women read this blog?) how well the stats analysis is working.

Defensive stats are more tightly coupled to team, rather than individual, data so the totals here aren't quite so useful. The DRtg totals are simply team defensive efficiencies, calculated as team points allowed divided by defensive possessions.

Here, the summed individual points allowed for each team agree within 1 point of the actual score, another excellent result - I find typically they will agree within 5 points.

Finally, the net points totals give two estimates of the margin of victory (or loss). The average of the two [(8.9 + 10.8)/2] ≈ 9.9 is almost exactly the true margin. It usually doesn't work quite this well!

I think this method compares favorably to the "classic" method proposed by Dean Oliver. I will keep working at it to remove additional estimated values and fix any bugs (e.g. I wasn't counting missed dunks until last week), but I think the basic framework is now in place. Any feedback would be appreciated.

Edited to add: A year later, and I did incorporate some feedback into net points. See here for the gory details.

-----------------------------------------------------------------------

Jacksonville

Finally tonight, I thought I'd take a look at last year's game vs. Jacksonville, which the Hoyas won 87-55. That link will take you to my post-game post from last season, which includes the tempo-free and HD box scores (both will be part of each post-game analysis this season, when available). Here, I'll post the net points stats from last year's game - I've bolded and italicized any player who should play tomorrow.

INDIVIDUAL NET POINTS STATS
 
Georgetown            Off    Poss           Individ     Def             Individ                        
Player                Poss   Used    ORtg   Pts Prod    Poss    DRtg   Pts Allow   Net Pts
Wallace, Jonathan      26     9.3    71.7     6.6        25     91.3      4.6       +2.1                         
Summers, DaJuan        39     8.8   125.0    11.0        36    101.8      7.3       +3.7                           
Sapp, Jessie           36     9.2    61.8     5.7        35     75.0      5.2       +0.4                    
Ewing, Patrick         26     2.0   101.6     2.0        26     55.5      2.9       -0.9                      
Hibbert, Roy           24     8.7   115.6    10.0        24     84.4      4.1       +6.0                    
Macklin, Vernon        46     4.4   143.3     6.4        45     97.8      8.8       -2.4                       
Wright, Chris          40    10.0   137.2    13.7        40     74.5      6.0       +7.7                     
Rivers, Jeremiah       28     4.7   152.7     7.1        28     83.6      4.7       +2.4                        
Jansen, Bryon           4     0.0     -       0.0         4     80.0      0.6       -0.6                     
Freeman, Austin        42     4.3   255.1    10.8        43     76.6      6.6       +4.3                       
Crawford, Tyler        29     4.5   123.0     5.5        29     95.5      5.5       +0.0                       
Wattad, Omar           10     2.1   129.3     2.7        10     90.9      1.8       +0.9                    
TOTALS                 70    67.9   120.3    81.6        69     79.7     58.1      +23.5                 

Jacksonville          Off    Poss           Individ     Def             Individ                          
Player                Poss   Used    ORtg   Pts Prod    Poss    DRtg   Pts Allow   Net Pts
SMITH, Ben             54    16.8    64.3    10.8        55    115.1     12.7       -1.9                      
HARDY, Ayron           34     5.4    73.4     4.0        37    125.0      9.3       -5.3                    
MCMILLAN, Andre        37     5.6   143.8     8.0        37    119.7      8.9       -0.8                       
COLBERT, Lehmon        40     9.0    87.8     7.9        40    105.8      8.5       -0.6                           
ALLEN, Marcus          30     3.8    95.6     3.6        30    126.3      7.6       -4.0                         
COHN, Travis           16     3.4    62.0     2.1        16    135.0      4.3       -2.2                        
GILBERT, Brian         30     3.1    97.2     3.0        30    143.6      8.6       -5.6                          
KOHIHEIM, Paul         26     3.8    20.9     0.8        25    120.5      6.0       -5.2                      
BROOKS, Aric           19     5.9    80.9     4.8        19    116.6      4.4       +0.4                        
LUKASIAK, Szymon       33     5.0    79.1     3.9        35    138.1      9.7       -5.7                            
JEFFERSON, Evan        26     5.1    59.6     3.1        26    139.5      7.3       -4.2                       
TOTALS                 69    66.9    77.8    52.0        70    124.3     86.9      -34.9

DaJuan Summers had a great offensive game, but a lousy defensive game against the Dolphins, while Jessie Sapp was just the reverse (bad O, great D). Austin Freeman was his typical efficient self on offense but didn't use up a lot of possessions (~10%), while Chris Wright was player of the game on both ends of the court. Even Omar Wattad did his thing on the offensive end (1-1 2FG, 1-2 3FG).

I won't go into the Jacksonville players (you can see how they played last year).

The Dolphins lost to Florida State on Saturday, 59-57. J'ville was trailing 57-40 with 3:30 left and proceeded to go on a 15-1 run to bring the score to 58-55 with :20 left in the game, thanks in part to 2-8 FT shooting by FSU.
