

IMDb Top 250 Movies List Analysis, 4th Edition

The distribution graph for the IMDb Top 250 Movies list as of 10/3/2011

It’s October, which means it’s time to subject the IMDb Top 250 Movies list to a level of quantitative scrutiny it probably doesn’t deserve. For those of you who are new to this series, here’s a quick recap: I’m four years into an effort to analyze changing movie tastes through lists of top movies.

In 2008, I compared the AFI Top 100 movies lists of 1998 and 2007 to the IMDb Top 250 movies list as of September 2008. I found that the industry list (AFI) skewed towards older movies compared to the fan list (IMDb), and that the contents of the AFI list had advanced by only 5 years in the 9 years between its two editions.

In 2009 and 2010 I returned to the IMDb lists by taking snapshots of the list at around the same time of year as the initial analysis (late September/early October) and attempted to extrapolate some meaning from the changes in the lists’ composition over the years.

And now, in 2011, I’m adding a fourth year of IMDb data to the analysis! Will any trends emerge? Can we predict the future of the IMDb Top 250 list? Read on to find out.

Before we begin, I should state the obvious limitations to using the IMDb list as a tool for analyzing changes in movie tastes over time. The IMDb list is far from authoritative, and the potentially skewed demographics of its voters cast even more doubt on the validity of the list. Furthermore, the very exercise of assigning a single numerical rating to a movie is more than reductive; it’s borderline absurd. But put all of that aside for a moment. The IMDb list, for all its flaws, is well-known, and its ratings are accepted as being decent rough indicators of movie quality.

With that being said, let’s pick up where we left off last year, when I asserted that the IMDb list’s perceived bias towards newer movies was real, and getting worse over time. From last year’s article:

When I fired up Excel to do this analysis, I was pretty sure that I would find that the overall shift of the dataset in terms of median year would be greater than the concurrent shift in time. And I was right:

  • Median Year of IMDb Top 250 List as of 9/30/2008: 1975
  • Median Year of IMDb Top 250 List as of 10/18/2009: 1977 (jumps 2 years after 1 year)
  • Median Year of IMDb Top 250 List as of 9/26/2010: 1981.5 (jumps 4.5 years after 1 year)

Yup, you read that right. Over the course of the last year, the median year of the IMDb Top 250 movies list increased by 4.5 years, which suggests that not only is the list’s bias towards newer movies still present, it’s intensifying over time.

So how does the most recent sampling fit in with this trend? The list as of 10/3/2011 had a median year of 1983.5, or a jump of 2 years after 1 year. So the tilt towards newer movies wasn’t as severe as it was from 2009 to 2010, but it was still more than what you’d expect if the movies were evenly distributed from the minimum to the maximum for each year.

Let’s see how that plays out in chart and graph form:

List Year | Min Year | Max Year | Actual Median | Theoretical Median
2008 | 1920 | 2008 | 1975 | 1964
2009 | 1921 | 2009 | 1977 | 1965
2010 | 1921 | 2010 | 1981.5 | 1965.5
2011 | 1921 | 2011 | 1983.5 | 1966

(Note: the “theoretical median” = what the median year of the list would be if the movies were evenly distributed between the oldest movie on the list and the newest.)
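For anyone who would rather check that column in code than in Excel, here is a minimal Python sketch of the calculation. The numbers are copied from the table above; the names `snapshots`, `theoretical_median`, and `skew_report` are made up for this illustration and are not part of the original spreadsheet.

```python
# Figures copied from the table above:
# list year -> (oldest film year, newest film year, actual median year)
snapshots = {
    2008: (1920, 2008, 1975.0),
    2009: (1921, 2009, 1977.0),
    2010: (1921, 2010, 1981.5),
    2011: (1921, 2011, 1983.5),
}


def theoretical_median(oldest, newest):
    """Midpoint of the range: the median if films were spread evenly across it."""
    return (oldest + newest) / 2


def skew_report(data):
    """Map each list year to (theoretical median, actual minus theoretical)."""
    report = {}
    for list_year, (oldest, newest, actual) in sorted(data.items()):
        expected = theoretical_median(oldest, newest)
        report[list_year] = (expected, actual - expected)
    return report


for list_year, (expected, excess) in skew_report(snapshots).items():
    print(f"{list_year}: theoretical median {expected}, actual is {excess} years newer")
```

The second number in each pair is how many years newer the real list runs than an evenly spread list would.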

I thought it would be fun to make a rudimentary projection of what would happen to this list based on the trend of the last 4 years. So I looked at how the difference between list year and median year changed: from 2008 to 2009, the difference shrank from 33 to 32; from 2009 to 2010, from 32 to 28.5; and from 2010 to 2011, from 28.5 to 27.5. Each change can also be stated as a reduction factor (e.g., 32/33 = .9697). Averaging these three factors produces a single average reduction factor of .9417. Apply that factor to future years and you get the shocking prediction that the list’s median year will essentially equal the year of the list in about 70 years:
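The arithmetic behind that projection can be sketched in a few lines of Python. This is a reconstruction of the steps described above, not the original Excel workbook. The half-year cutoff used for “essentially equal” is an assumption made for this sketch, and the function names are invented for illustration.

```python
# (list year, actual median year) for the four snapshots discussed above
medians = [(2008, 1975.0), (2009, 1977.0), (2010, 1981.5), (2011, 1983.5)]


def gaps(points):
    """Difference between the list year and the median year, per snapshot."""
    return [list_year - median for list_year, median in points]


def average_reduction_factor(gap_values):
    """Mean of the year-over-year ratios (later gap divided by earlier gap)."""
    factors = [later / earlier for earlier, later in zip(gap_values, gap_values[1:])]
    return sum(factors) / len(factors)


def years_until_gap_below(start_gap, factor, threshold=0.5):
    """Count how many times the factor must be applied before the gap drops
    below the threshold. The 0.5-year default is an assumed stand-in for
    "essentially equal"; the article does not state a cutoff."""
    years = 0
    gap = start_gap
    while gap >= threshold:
        gap *= factor
        years += 1
    return years


gap_values = gaps(medians)
factor = average_reduction_factor(gap_values)
print("gaps:", gap_values)
print("average reduction factor:", round(factor, 4))
print("years until the gap is under half a year:",
      years_until_gap_below(gap_values[-1], factor))
```

With the assumed half-year cutoff, the loop stops a little short of 70 years, in the same ballpark as the figure quoted above. A stricter cutoff pushes the date further out, which is one more hint at how soft this kind of projection is.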

OK, calm down folks. You don’t have to be a statistician to see the problems with this approach. First, intuitively, it makes no sense. Can you imagine a top movies list in the year 2070 that has as many movies on it from all years prior to 2069 as it does for the years 2069 and 2070? Second, and more importantly, a sample size of 4 is way too small to make this sort of projection. (Also, my little trick of averaging reduction factors probably isn’t sound math, but it worked in that it produced the nice, albeit erroneous, graph you see above.)

About that sample size. Unfortunately, I’ve only been doing this for four years. Now, if only there were some way to go back in time and capture the status of the list from previous years. If only…

But wait! Such a thing exists. Some fortuitous Google searches led me to the “IMDB Top 250 History” website, which has snapshots of lists going all the way back to April 1996.

Jackpot! I took additional snapshots from the same time frame, added them to the analysis, and…

…was disappointed to find that the trend totally disappeared with the expanded dataset:

List Year | Min Year | Max Year | Actual Median | Theoretical Median
1996 | 1925 | 1996 | 1986 | 1960.5
1997 | 1927 | 1997 | 1986.5 | 1962
1998 | 1925 | 1998 | 1983 | 1961.5
1999 | 1922 | 1999 | 1975.5 | 1960.5
2000 | 1922 | 2000 | 1975.5 | 1961
2001 | 1922 | 2001 | 1978.5 | 1961.5
2002 | 1922 | 2002 | 1976 | 1962
2003 | 1922 | 2003 | 1975 | 1962.5
2004 | 1922 | 2004 | 1976 | 1963
2005 | 1920 | 2005 | 1978.5 | 1962.5
2006 | 1920 | 2006 | 1974.5 | 1963
2007 | 1920 | 2007 | 1976.5 | 1963.5
2008 | 1920 | 2008 | 1975 | 1964
2009 | 1921 | 2009 | 1977 | 1965
2010 | 1921 | 2010 | 1981.5 | 1965.5
2011 | 1921 | 2011 | 1983.5 | 1966

Turns out that in its earlier days, the IMDb Top 250 list was even more skewed towards newer movies than it is today, both in relative and absolute terms. And even with the larger sample size, the swings in median year make any sort of projection unfeasible, whether it’s with my fuzzy math method or a more formal linear regression analysis. And we haven’t even factored in IMDb’s changes and tweaks to its ranking algorithm over the years.
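To make the linear regression point concrete, here is a rough Python sketch of an ordinary least-squares fit of median year against list year, using the sixteen rows in the table above. It is a plain textbook fit written by hand for illustration, not necessarily the method behind this article, and the helper name `least_squares` is invented here.

```python
# Actual median year for each annual snapshot, 1996 through 2011 (table above)
list_years = list(range(1996, 2012))
median_years = [1986, 1986.5, 1983, 1975.5, 1975.5, 1978.5, 1976, 1975,
                1976, 1978.5, 1974.5, 1976.5, 1975, 1977, 1981.5, 1983.5]


def least_squares(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = least_squares(list_years, median_years)
print(f"slope: {slope:.3f} median-years per list-year")
print(f"R^2:   {r_squared:.3f}")
```

On these sixteen points the fitted slope comes out slightly negative and the R² is roughly 0.09, meaning a straight line explains almost none of the year-to-year swings. That is the numerical version of “the trend totally disappeared.”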

Sorry, folks, we can’t predict the future of the IMDb Top 250 Movies List through statistics. Or even make an educated guess. But we can still have a lot of fun analyzing the changes that have occurred to the list. Read on for more:

Here’s the master distribution graph for the 2011 list, and the ones for the previous three years for comparison. The solid lines in the graphs are all 4th order polynomial trend lines.

2011

The distribution graph for the IMDb Top 250 Movies list as of 10/3/2011

2010

2009

2008

Now, for specific changes between the 2010 and 2011 lists:

New movies on the 2011 list

Title | Year | Rank | Rating
Drive | 2011 | 114 | 8.2
Harry Potter and the Deathly Hallows: Part 2 | 2011 | 133 | 8.2
A Separation | 2011 | 207 | 8
Black Swan | 2010 | 113 | 8.2
The King’s Speech | 2010 | 125 | 8.2
The Social Network | 2010 | 220 | 8
Shutter Island | 2010 | 238 | 8
Elite Squad: The Enemy Within | 2010 | 246 | 8
Mary and Max | 2009 | 195 | 8
Ip Man | 2008 | 241 | 8
Spring, Summer, Fall, Winter… and Spring | 2003 | 247 | 8
A Beautiful Mind | 2001 | 242 | 8
The Celebration | 1998 | 233 | 8
Beauty and the Beast | 1991 | 249 | 8
Fanny and Alexander | 1982 | 210 | 8
Stalker | 1979 | 243 | 8
Persona | 1966 | 200 | 8
The Man Who Shot Liberty Valance | 1962 | 237 | 8
Tokyo Story | 1953 | 248 | 8
His Girl Friday | 1940 | 250 | 8
The Passion of Joan of Arc | 1928 | 212 | 8
Sherlock Jr. | 1924 | 221 | 8

Where are the recent summer movies? Many of the big winners of the 2010 Academy Awards have made their way onto the 2011 list, but none of the 2010 summer fan favorites, and only one from the 2011 summer season. Compare that to the 2010 summer hits that made it onto the 2010 list: Inception, Toy Story 3, How to Train Your Dragon, and Kick-Ass. Conclusion: the 2011 summer movie season was weak. But you didn’t need statistics or the IMDb list to tell you that.

Movies on the 2010 list that dropped off the 2011 list

Title | Year | Rank | Rating
Kick-Ass | 2010 | 195 | 8
Letters from Iwo Jima | 2006 | 214 | 8
Little Miss Sunshine | 2006 | 244 | 7.9
Crash | 2004 | 225 | 8
Mulholland Dr. | 2001 | 247 | 7.9
Wo hu cang long | 2000 | 240 | 7.9
Toy Story 2 | 1999 | 229 | 8
The Nightmare Before Christmas | 1993 | 241 | 7.9
Edward Scissorhands | 1990 | 250 | 7.9
The Conversation | 1974 | 233 | 8
Planet of the Apes | 1968 | 230 | 8
Bonnie and Clyde | 1967 | 223 | 8
Spartacus | 1960 | 242 | 7.9
Anatomy of a Murder | 1959 | 236 | 8
The African Queen | 1951 | 221 | 8
Harvey | 1950 | 202 | 8
Brief Encounter | 1945 | 217 | 8
Arsenic and Old Lace | 1944 | 249 | 7.9
Shadow of a Doubt | 1943 | 215 | 8
The Adventures of Robin Hood | 1938 | 246 | 7.9
King Kong | 1933 | 208 | 8
Duck Soup | 1933 | 224 | 8

So much for Kick-Ass. Great movie, but…Top 250 great? (Duke it out in the comments.)

Not much else to say here, except to repeat the annual ritual of lamenting the cruelty of the IMDb algorithm that pushes out classic movies to make room for new “classics.” Notably, the original Planet of the Apes has left the list in the same year that a new Planet of the Apes movie received high critical and audience praise. Fellow gorilla-themed classic King Kong is also gone, meaning that these Damn Dirty Apes have gotten their paws off of the IMDb Top 250 List…for now.

CGI? We don't need no stinkin' CGI.

Conclusion

Looking back at the comments on these articles over the past few years, it’s clear that these lists and their content get people really worked up, in spite of all the caveats about how little authority the IMDb list, or any ranking built on subjective grounds, may carry.

So rather than try to tell people not to read into this too much, I’m going to give the opposite advice for those wishing to comment on this article: read into it all you want. If you think the data says something about the changing tastes of our times, the intensifying fanboy effect on IMDb rankings, the downfall of the art of film, or the downfall of Western Civilization, let it be known. If you think this is a pointless exercise and that I should be using Excel for more interesting pop culture analysis, let it be known.

Oh, and speaking of Excel, if you want to do your own analysis, you can either download my datasets for 1996-2011 or grab some for yourself at the IMDB Top 250 History site.
