Episode 557: Pizzeria Quattro

On the Overthinking It Podcast, we tackle algorithms, automated recommendation, Bayesian statistics, and whether to pass up a delicious cinnamon roll.

by Matthew Wrather
March 4th, 2019

Peter Fenzel and Matthew Wrather start with an inappropriate recommendation from Netflix and head down the rabbit hole of algorithmic recommendations, machine learning models, behavior tracking, agency, and volition. Remember: Nobody is going to take care of your brain but you.

joeposner Member Mar 5th 2019 2:11 am #

As a former Netflix employee, I’m going to “well actually” your comments about their recommender and ML algorithms: “well, actually, you were right on all counts.” I need to hand-wave a bit since, while my knowledge of their systems is 10 years out of date, I can’t go into all the details. But I can discuss things that are publicly disclosed or trivially obvious.
Their DVD recommendations took a few different things into account. The number of stars you were expected to give was one factor. How fast Netflix could ship you that DVD if you put it in your #1 slot was another. And what it cost Netflix in licensing fees to ship that movie was another factor. The best recommendation for a user was one that the user was expected to give 5 stars to, that could ship overnight, and that had zero licensing fees. Since nothing was perfect, the “level of recommendation” for a given movie was a mathematical combination of those (and other) aspects.
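A minimal sketch of what such a "mathematical combination" could look like, assuming a simple weighted sum. The weights, the caps, and every name here (`Candidate`, `recommendation_level`, `rank`) are hypothetical and chosen only for illustration; nothing below reflects Netflix's actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights and caps: a business decision, not disclosed values.
STAR_WEIGHT = 0.6
SPEED_WEIGHT = 0.25
COST_WEIGHT = 0.15
MAX_EXTRA_SHIP_DAYS = 7   # delays beyond this count as equally bad
MAX_LICENSE_FEE = 2.0     # fees beyond this count as equally bad


@dataclass
class Candidate:
    title: str
    predicted_stars: float  # expected user rating, 1.0 to 5.0
    ship_days: int          # 1 means the DVD can ship overnight
    license_fee: float      # per-shipment licensing fee in dollars


def recommendation_level(c):
    """Combine the three factors into one score between 0.0 and 1.0."""
    stars = (c.predicted_stars - 1.0) / 4.0
    extra_days = min(max(c.ship_days - 1, 0), MAX_EXTRA_SHIP_DAYS)
    speed = 1.0 - extra_days / MAX_EXTRA_SHIP_DAYS
    fee = min(max(c.license_fee, 0.0), MAX_LICENSE_FEE)
    cost = 1.0 - fee / MAX_LICENSE_FEE
    return STAR_WEIGHT * stars + SPEED_WEIGHT * speed + COST_WEIGHT * cost


def rank(candidates):
    """Return candidates ordered from strongest to weakest recommendation."""
    return sorted(candidates, key=recommendation_level, reverse=True)
```

Under these assumed weights, a 5-star title that ships overnight with no licensing fee scores the maximum, and anything slower or costlier ranks below it.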
This leads to another point that you touched on. The definition of “good” in any Machine Learning system is a mathematical representation of how important some human values are, decided on by the business. Whether that’s converting future profit to net present value or deciding how people compare “on time airplane arrival” with “low turbulence” and “lost luggage”, a Machine Learning optimization requires a human definition of “best” to compare its calculations against. The input values are also, explicitly and implicitly, set by human choices. We don’t throw “number of baseball games played today” into ML algorithms to optimize mortgage default rates because doing so (a) dramatically increases computation time and (b) produces false positives (with enough silly attributes in the mix, random chance means a few will show correlation to “best”). So humans set the input, which means humans make decisions like “we will use the calendar definition of winter months rather than average temperature for that zip code when deciding seasonality.”
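The false-positive point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the features and the target are pure random noise, the 0.3 correlation threshold is arbitrary, and the helper names (`pearson`, `count_spurious`) are made up for this example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_spurious(n_features, n_rows=50, threshold=0.3, seed=0):
    """Count pure-noise features whose correlation with a pure-noise
    target reaches the threshold anyway, purely by chance."""
    rng = random.Random(seed)
    target = [rng.random() for _ in range(n_rows)]
    hits = 0
    for _ in range(n_features):
        feature = [rng.random() for _ in range(n_rows)]
        if abs(pearson(feature, target)) >= threshold:
            hits += 1
    return hits
```

Comparing `count_spurious(10)` with `count_spurious(1000)` shows the effect: none of the features carries any information, yet the more of them you throw in, the more of them clear the bar.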

Jens Mar 5th 2019 11:03 am #

Thank you for this episode. As a psychologist working in research, the topic of inferential statistics, and specifically behavior prediction, is close to my interests.
Now, I’m fully aware that you’ll never read this comment on the podcast because Captain Marvel is coming up and everyone (including myself) would rather listen to you talking about that one as opposed to about some rando’s musings about statistics, but I’d still like to share some thoughts that I had when listening to the episode:
– A personal anecdote: it’s hard to overstate how much of a phenomenon (and how sudden) Harry Potter was back in the early 2000’s (especially after the release of the first movie). Amazon was already a thing (almost exclusively for books, though) and I noticed at the time that no matter what item I looked at, its algorithm recommended one of the available Harry Potter books. In that case, the algorithm was clearly overwhelmed by the increased base probability of someone buying HP books, which overrode the information it had on file from my prior consumer history.
The same thing must certainly occur in the context of Netflix as well: my main reason for watching Marie Kondo was that everyone else was watching it. Recommending the show to me at a certain time, regardless of my actual interests, does not seem unreasonable. So in a way [drink], the algorithm needs to have a loophole built in that will lead it to ignore its own mechanism if something becomes popular enough to override it.
– Similarly, it reminds me of a weird story out of Germany: like in many large cities, you can buy merchandise that says “Hamburg” in the eponymous city. At one point, a producer of tourist beanies erroneously printed “HAMBRUG” on them (‘u’ and ‘r’ switched). Unexpectedly, people loved this (presumably as a sarcastic comment on producer/tourism culture) and bought them up, so that soon enough they had to produce deliberately misspelled HAMBRUG merchandise.
My point is: I don’t believe there is anything that any behavioral prediction algorithm could have done to see this coming. A mechanism that expected people to act reasonably in any sort of way would have missed this trend, and subsequently would have left money on the table.
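To make the earlier Harry Potter point concrete, here is a rough sketch of how a high base rate can swamp personal history. It assumes a score of the form base rate times personal lift; the catalog, the numbers, and the function names are all invented for illustration and are not a description of Amazon's system.

```python
def purchase_score(base_rate, personal_lift):
    """Estimated chance of a purchase: how often anyone buys the item,
    scaled by how much this user's history raises or lowers that."""
    return base_rate * personal_lift


def top_recommendation(catalog):
    """Return the title with the highest purchase score.

    catalog maps title -> (base_rate, personal_lift).
    """
    return max(catalog, key=lambda title: purchase_score(*catalog[title]))


# Hypothetical numbers: the user is half as likely as average to buy the
# blockbuster and twenty times as likely as average to buy the textbook.
during_the_craze = {
    "Harry Potter": (0.30, 0.5),
    "Statistics textbook": (0.001, 20.0),
}
after_the_craze = {
    "Harry Potter": (0.01, 0.5),
    "Statistics textbook": (0.001, 20.0),
}
```

With these made-up numbers, `top_recommendation(during_the_craze)` picks the blockbuster despite the user's below-average interest, while `top_recommendation(after_the_craze)` falls back to the textbook.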
Thank you again, and looking forward to the next episode!

John C Mar 10th 2019 10:15 am #

Apology: Since the episode had a lot of important points to it, this comment is going to be long-winded and only-hopefully-not-meandering. We’ll pretend that’s not my normal state.

That said, it’s not so much that algorithms or machine learning systems aren’t biased, since…see the assorted systems that think black people are invisible or not human, insist Asian people have their eyes closed, parrot right-wing nonsense, or sentence minorities more harshly for convictions. They are, as the good book (OK, A good book) says, “though it has many omissions and contains much that is apocryphal, or at least wildly inaccurate, it scores over the older, more pedestrian work in two important respects. First, it is slightly cheaper…”

The key to remember, I think, is that it’s cheaper because of the troubling mass surveillance aspect. It’s harder to build a one-user, local recommendation system, because it’s much more difficult to guess what you want and haven’t seen, if you don’t have information on other things that people more generally have seen and rated, without knowing a lot about your tastes and how to break down the candidates.

And, of course, both systems fail miserably when there’s something genuinely new, because there’s no way to judge whether anybody will like it, and so most algorithms won’t recommend it. Netflix seems to suffer from something like this, where a non-traditional show isn’t strongly recommended, but their ratings algorithm requires people to know about something, so it fails because they refuse to put it into the auto-playing “hero recommendation” slot. Or, because it doesn’t seem to take much context into account, it keeps recommending the thing you’ve just watched.
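A toy sketch of that "genuinely new" problem, assuming a recommender that only counts what other viewers with overlapping histories have watched. The `recommend` function and the titles used with it are hypothetical; real systems are far more elaborate.

```python
from collections import Counter


def recommend(histories, user_history, top_n=3):
    """Suggest unseen titles watched by viewers who share titles with the user.

    A title that appears in nobody's history gets no score at all, so it
    can never be suggested, no matter how good it is.
    """
    seen = set(user_history)
    scores = Counter()
    for other in histories:
        other_titles = set(other)
        overlap = len(seen & other_titles)
        if overlap == 0:
            continue  # this viewer tells us nothing about the user
        for title in other_titles - seen:
            scores[title] += overlap
    return [title for title, _ in scores.most_common(top_n)]
```

For example, with histories `[["A", "B", "C"], ["A", "C"], ["B", "D"]]` and a user who has watched only `"A"`, the sketch suggests `"C"` and then `"B"`. A brand-new show that nobody has watched yet cannot appear until someone watches it some other way.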

(I’m specifically thinking of the rebooted “One Day at a Time,” which they claim to want to keep making, but aren’t seeing the required numbers of views…but also aren’t prioritizing curating it over the more generic fare. The show’s a definite recommendation, though. I don’t know why “we have a show with Rita insert-excessive-gerund-for-emphasis-here Moreno, so go watch it” isn’t the first thing they tell everyone who logs in. They’re still trying to push Marie Kondo on me, despite my not watching reality TV, though…)

Anyway, advertising-wise, it’s worth pointing to Edward Bernays, who starts his book “Propaganda” (1928) suggesting that “the conscious and intelligent manipulation of the organized habits and opinions of the masses is an important element in democratic society.” That’s seriously the first sentence of the book. Emily Fogg-Meade made harsher versions of those comments in 1901, describing advertising as “a subtle, persistent, unavoidable presence that creeps into the reader’s inner consciousness. A mechanical association is formed and may frequently result in an involuntary purchase.” And what’s a recommendation engine but a kind of advertising that manipulates your habits directly by making your choices? It’s also maybe worth pointing out that the United States Supreme Court nearly banned advertising in 1967 based on antitrust arguments, until the FTC pulled out and decided that advertisements were (secretly?) usefully informative, with a big retreat under the Reagan administration.

Add in companies engineering their products to be addictive (and sometimes even describing their products in terms of addiction, like “binge” or “crack”), and all this combines to the big Internet companies optimizing to get any emotional reaction out of you so that you’ll be more receptive to the ads. So, suddenly, YouTube somehow can’t figure out that watching a specific review for a socially-progressive movie doesn’t necessarily mean that you want to then watch an infinite parade of man-babies shouting about how feminism is going to destroy all of civilization or maybe a “proof” that the Earth is flat. And it’s also why I’ve soured on Quora over the years, as its algorithms seem to have decided that the best way to keep me engaged is to recommend conversations to be angry at…or the assortment of “what’s the stupidest thing you’ve seen at your job” questions that I literally never read, but people I follow on Quora apparently do, so that’s more important.

That also gets at the context problem, that as far as the algorithms are concerned, I’m not allowed to be interested in certain aspects of a person or topic. The granularity is either all or nothing. Forget about if you want different content depending on your mood.

Related: One version of the story I heard about Netflix changing its rating system (from someone who provided content that proved controversial) was that the out-of-five-star system was at least partly dropped to stymie coordinated attacks on original shows. The current approach isn’t as meaningful, so it’s harder to game. Of course, it’s also less useful for exactly the same reasons.

But worse, when you start combining all of these little facets (the bias, the surveillance, the intentional manipulation “for democracy,” the ease of outside manipulation, the addiction, and the emotional provocation, not even getting to the poor vetting of advertisers), it suddenly gets into even more troubling territory during that whole “Mark Zuckerberg may or may not be planning to run for President” debate a couple of years back. And it at least makes me wonder if the recent interactions between politics and algorithmic social media have been exactly as designed, except that the wrong people pulled the trigger too early.

That goes all “trouble in River City,” though, so I’m going to Amazon Prime me some foil, just to be safe! (The last initial is P, which stands for Propaganda…and sure, pool.)

Unrelated topic: Individual member feeds sound absurdly hard, by the way, and entirely reasonable to not want “leaks” of the premium content to be completely trivial. I’ll definitely use an RSS feed when it inevitably works, but as a possible compromise, would it be easier to have a way to mark Digital Library items as listened-to and a filter to hide those links from the listing? It wouldn’t be an end-to-end solution, and I don’t know if it’d be useful to anybody but me, but it’d remove the need to remember where I left off, because I can be bad at that working backwards on an infinite scroll.

Oh, and at least for the tutorials I’ve seen (and the online class I tried to take that was too similar to the elective I took in college twenty-something years ago), more than superficial ability in math or Python seems like it’d be overkill.
