Sunday, March 13, 2011

Thinking about NCAA Seeding

The NCAA tournament bracket was announced tonight, and Georgetown received a 6 seed in the Southwest Region, playing the winner of the Southern California v. Virginia Commonwealth play-in game.

While there is always the normal sturm und drang about included and excluded teams after the brackets are announced, I was curious how the actual seeding played out versus Ken Pomeroy's rating system.

Now there's nothing magical about Ken's system (although John Gasaway would have you believe otherwise), but it provides a nice objective metric against which to judge the seeding.  In particular, Ken's system will struggle with teams that were missing an important player for a significant chunk of time, but will have that player back in time for the tournament ("I wonder who he means").

One thing to keep in mind is that this isn't the same sort of analysis as the log5 odds calculations that Tom provides here, and that Ken has just posted for the entire NCAA field over at Basketball Prospectus.  Rather, here we're just concerned with how hard each sub-region is, relative to the others.

Here's a breakdown of the Southwest region:
Seed   Team                  Mis-seed
  1    Kansas                    0
 16    Boston U                  0
  9    Illinois                 -4
  8    Nevada Las Vegas         -2
  5    Vanderbilt                3
 12    Richmond                 -1
 13    Morehead St               1
  4    Louisville               -1
  3    Purdue                   -1
 14    St. Peter's               0
 11    Southern California       0
       Virginia Commonwealth     2
  6    Georgetown                2
  7    Texas A&M                 4
 10    Florida St                0
 15    Akron                     0
  2    Notre Dame                1
The first two columns should be self-explanatory; I've color-coded matchups to make the table a bit easier to read, with the play-in game using its own color.

The third column, labeled "mis-seed", is simply a comparison of the actual seed assigned by the tournament committee with the seed found by ranking the 68 tournament teams via KenPom's ratings.  A positive number means a team is over-seeded (the team received a better seed, i.e. a lower seed number, from the tournament committee than Ken's system would have assigned) and a negative number is an under-seed.

For example, the 8/9 game in the Southwest pits Illinois against UNLV.  Through Sunday's games, Ken ranks Illinois and UNLV as the 20th and 22nd best teams in the country.  Since every team ahead of them also made the tournament, their expected seeds from KenPom are simple enough to calculate [ = KP rank / 4, rounded up].  So Ken would have given Illinois a 5 seed, and UNLV a 6 seed.  The mis-seed for Illinois is KP seed - NCAA seed [ = 5 - 9], which gives a mis-seed value of -4; UNLV earns a -2 this way [= 6 - 8].
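
For anyone who wants to play along at home, here is a minimal Python sketch of the arithmetic in the worked example. The function names and the `field_rank` parameter are my own for illustration; the rule is just the simple "rank / 4, rounded up" one, so it assumes the rank is counted among tournament teams only and makes no adjustment for the extra play-in slots.

```python
import math


def kenpom_seed(field_rank: int) -> int:
    """Seed implied by a team's rank within the tournament field (1 = best).

    Assumes four teams per seed line: rank / 4, rounded up.
    """
    return math.ceil(field_rank / 4)


def mis_seed(field_rank: int, ncaa_seed: int) -> int:
    """KP seed minus NCAA seed: positive = over-seeded, negative = under-seeded."""
    return kenpom_seed(field_rank) - ncaa_seed


# Ranks and seeds from the 8/9 example in the Southwest.
illinois = mis_seed(field_rank=20, ncaa_seed=9)
unlv = mis_seed(field_rank=22, ncaa_seed=8)
print("Illinois:", illinois)
print("UNLV:", unlv)
```

For Illinois and UNLV the national rank and the rank within the field are the same, since every team ahead of them is in the tournament; for teams further down, the national rank would first have to be converted to a rank among the 68.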

The upshot of all this is that, regardless of who comes out of the 8/9 game, Kansas (assuming the win over BU) will be playing a team that Ken Pomeroy would tell you is at least two seed lines better than a typical 8/9 opponent.

At the other extreme, 10-seed Florida State gets a break in playing 7-seed Texas A&M, which Ken would predict is actually the worse team of the pair, by a hair.

If you take this further and break the bracket down into upper and lower sub-regions, it is quickly apparent that Kansas was done no favors by the committee, while Notre Dame has a far easier draw.

Here's a summary of the difficulty for each of the 1 and 2 seeds, found by averaging all of the other teams' mis-seed values in the upper and lower halves of each region.  I've included each team's own mis-seed stat, as well:
.                           avg. opp.                         avg. opp.
Region       1-seed         mis-seed       2-seed             mis-seed
East         Ohio St. [0]     -1.1         North Carolina [2]   -0.1
Southeast    Pittsburgh [1]   -1.1         Florida [3]           0.4
Southwest    Kansas [0]       -0.6         Notre Dame [1]        0.9
West         Duke [0]          0.6         San Diego St. [0]     0.3

By this metric, the East region is the hardest in the tournament as both OSU and UNC have a sub-region which is under-seeded, although the Buckeyes have a lot more to complain about.  Meanwhile, the West with Duke and San Diego St. is the easiest region, with both teams' opponents over-seeded on average.

Pitt is done no favors in the Southeast, while Florida - who KenPom would have pegged a 5-seed - plays in a slightly soft sub-region.  And Notre Dame's luck continues with the most over-seeded sub-region of all the top seeds.
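
The sub-region averages can be reproduced from the regional tables. Here is a rough sketch using the Southwest numbers from this post; the helper name and the tuple layout are just my choices for illustration, and I'm assuming the split is simply the first eight teams in bracket order (the 1-seed's half) versus the rest (the 2-seed's half, which here includes both play-in teams).

```python
# (NCAA seed, team, mis-seed) for the Southwest, in bracket order.
southwest = [
    (1, "Kansas", 0), (16, "Boston U", 0),
    (9, "Illinois", -4), (8, "Nevada Las Vegas", -2),
    (5, "Vanderbilt", 3), (12, "Richmond", -1),
    (13, "Morehead St", 1), (4, "Louisville", -1),
    (3, "Purdue", -1), (14, "St. Peter's", 0),
    (11, "Southern California", 0), (11, "Virginia Commonwealth", 2),
    (6, "Georgetown", 2), (7, "Texas A&M", 4),
    (10, "Florida St", 0), (15, "Akron", 0),
    (2, "Notre Dame", 1),
]


def avg_opponent_mis_seed(sub_region, top_team):
    """Average mis-seed of every team in the sub-region except top_team."""
    others = [m for _, team, m in sub_region if team != top_team]
    return sum(others) / len(others)


upper, lower = southwest[:8], southwest[8:]
kansas_avg = avg_opponent_mis_seed(upper, "Kansas")
notre_dame_avg = avg_opponent_mis_seed(lower, "Notre Dame")
print(f"Kansas: {kansas_avg:.3f}  Notre Dame: {notre_dame_avg:.3f}")
```

To one decimal place these come out to the -0.6 and 0.9 listed for Kansas and Notre Dame.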

And finally, for the sake of completeness, here are the mis-seed stats, averaged as a function of seed-line:
Seed    Mis-seed       Seed    Mis-seed
  1       0.3           16       0.0
  2       1.5           15       0.0
  3       0.5           14      -0.3
  4      -1.8           13      -2.0
  5       2.3           12      -2.8
  6       2.0           11      -1.4
  7       1.8           10       0.5
  8       0.5            9      -0.3
The stats from this table - if they are representative of most of the NCAA tourneys since the expansion to 64 teams - give a fairly good explanation of why those 5/12 upsets are so common:  the 5-seed is the most over-seeded in the bracket, and the 12-seed is the most under-seeded.  Plan accordingly.
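
As a quick check on the 5 and 12 lines, here is a small sketch that averages the mis-seed values by seed line. The numbers are copied from the four regional tables in this post (the 12 line has five teams because of the play-in game); the dictionary layout is only for illustration.

```python
from statistics import mean

# Mis-seed values by seed line, in East, Southeast, Southwest, West order.
mis_seeds_by_line = {
    5: [1, 3, 3, 2],         # West Virginia, Kansas St, Vanderbilt, Arizona
    12: [0, -6, -8, -1, 1],  # UAB, Clemson, Utah St, Richmond, Memphis
}

# Average mis-seed for each seed line.
line_avg = {seed: mean(vals) for seed, vals in mis_seeds_by_line.items()}

for seed, avg in sorted(line_avg.items()):
    print(f"{seed:>2}-seed: {avg:+.2f}")
```

The same loop extends to the other seed lines by adding their values from the regional tables.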

After the jump, the rest of the region mis-seed tables.

East region:
Seed   Team                  Mis-seed
  1    Ohio St                   0
 16    Texas San Antonio         0
       Alabama St                0
  9    Villanova                -2
  8    George Mason             -1
  5    West Virginia             1
 12    UAB                       0
       Clemson                  -6
 13    Princeton                 0
  4    Kentucky                 -2
  3    Syracuse                  0
 14    Indiana St                0
 11    Marquette                -3
  6    Xavier                    3
  7    Washington               -3
 10    Georgia                   2
 15    Long Island               0
  2    North Carolina            2

Southeast region:
Seed   Team                  Mis-seed
  1    Pittsburgh                1
 16    NC Asheville              0
       Arkansas Little Rock      0
  9    Old Dominion              2
  8    Butler                    3
  5    Kansas St                 3
 12    Utah St                  -8
 13    Belmont                  -8
  4    Wisconsin                -1
  3    Brigham Young             1
 14    Wofford                  -1
 11    Gonzaga                  -4
  6    St. John's                3
  7    UCLA                      4
 10    Michigan St               0
 15    UC Santa Barbara          0
  2    Florida                   3

West region:
Seed   Team                  Mis-seed
  1    Duke                      0
 16    Hampton                   0
  9    Tennessee                 3
  8    Michigan                  2
  5    Arizona                   2
 12    Memphis                   1
 13    Oakland                  -1
  4    Texas                    -3
  3    Connecticut               2
 14    Bucknell                  0
 11    Missouri                 -2
  6    Cincinnati                0
  7    Temple                    2
 10    Penn St                   0
 15    Northern Colorado         0
  2    San Diego St              0


  1. Illinois has 13 losses. They do not deserve a better seed.

    Pomeroy's ratings do not translate into seedings if seedings are rewards for winning games. Pomeroy compares teams based on their scoring margins for a predictive model but a team that wins one game by 15 and loses another by 5 is not as successful as a team that plays the same games and wins both by 5. The 2-0 team may not necessarily be better but it earned a higher seed than the 1-1 team.

  2. Um, okay. I actually have no idea how the committee determines seeding.

    I'm not saying that Illinois should have received a higher seed. All I am saying is that MOV stats indicate that Illinois is better than you'd expect for a 9-seed and, by extension, could make Kansas' 2nd round game a lot closer than you might expect.

  3. Okay. I agree with that.

    It is not that Illinois was mis-seeded. Pomeroy shows that Illinois is a better team than its won-loss record.

    Now, Utah State and Belmont were mis-seeded.

  4. and Florida.

    Seems like it's a semantic argument, and I'll be the first to admit "mis-seed" is probably not the right term, but I couldn't come up with anything clever at 11pm.