12.3.14

Remaining Illegible in an Information Age

There's a longstanding joke about statistical and quantitative analysis: a man loses his keys one night, and is looking for them under a streetlight. When asked why he's looking only there and not elsewhere, he replies, "Because that's where the light is." So it goes: people interested in data have a vested interest in reducing the world to data and ignoring everything else. In political science, despite the increasing sophistication of statistical methods, there is very little that can be robustly modeled, and what can be modeled is usually of little value in predicting the future. There is plenty of bluster, but the promises are rarely delivered on.

The same seems to be true of concerns over the internet and privacy. People are all too happy to reduce individuals to their data, where data means "those things a person buys or publicly associates themselves with." There also seems to be a great deal of concern that those with the technological ability to refine the software that makes this possible are embracing this future without thinking too carefully about it. 

I think it's rather the other way around: our technological masters have convinced themselves that the small part of a person's life that is directly measurable constitutes the whole of it, and have therefore confined themselves to creating finer-grained data over an ever smaller portion of that life.

Three reasons for this:

1. The "like," which I take to be the gold standard for people voluntarily surrendering information about themselves, has only been around for a short time. There are limits to the information that can be gained from a like (anyone remember "can this pickle get more fans than Nickelback"?), and the other information sources do not inspire confidence in the quality of the algorithms used: Facebook keeps asking me when I got engaged to my wife, despite the fact that it was listed as an event on Facebook when it happened. 
And this is the quality of information that is made readily available. Most data is not in any usable form. I was an active participant in online R.E.M. fan groups in the mid-90s, under my own name, all of which is available on the internet (as are 12 years of blog archives) to anyone so enterprising as to search. There are large stretches of my life that are pretty useless for data-mining purposes, even though the data exists. Which segues into the larger point: tastes and interests change over time, which means even if you have the data and even if it's in a form that can be used, all I have to do is not like, say, How I Met Your Mother anymore, and that information is useless.

2. The recommendations data can generate are limited, and bad more frequently than they should be. Amazon has nearly 15 years of my purchase history, and it rarely recommends something I'm interested in. The relevant information is not available to them: I buy a Javier Cercas novel because Roberto Bolaño mentioned him in an essay and made the novel sound interesting; I read Roberto Bolaño because I was bored with my then-current options and my mother thought I might like it. I buy a novel but don't like it; I buy a movie because it's cheap and worth a shot. There's no one reason I consume the way I do, and the connections between one thing and another are often illegible to data-collecting schemes. If someone possessed all of my information (credit card, browser history, Amazon purchases, Netflix history), then they might possibly be able to find patterns and predict. But no one entity has all that information, and there are pretty solid reasons to assume the relevant parties are unlikely to consolidate.
Also, Netflix is always going to suggest House of Cards to me, even though I've never watched it. They have to, since advertising their own shows is in their interest, but it compromises the purity of the data, and thus its usefulness (see also buying your way into a Google search).

3. Technology and behavior are opaque. I briefly considered pursuing "professional internet writer" as a career, but backed off when I took a serious look at the economic prospects. Since then, I am more involved on Twitter, less on blogs and Facebook. But here's the thing: even when I was more involved overall, most of my life was happening offline. Significant aspects of my life get no showing at all; the self that is exposed is intentionally chosen, and thus a persona, and thus not a complete reflection of who I am. And all this as one of the technologically linked-in: most people I know conduct even less of their life online, to say nothing of people whose jobs and lives don't permit this level of interaction: CBS is still the #1 network, even though I don't know anyone who watches it, and hit movies, music, etc., are all things far outside my interest. (To say nothing of the new generation of apps for which privacy and anonymity are the point: Snapchat et al.)

So it seems appropriate to allow the Zuckerbergs and the Bezoses of the world to continue their attempts to find their keys under their very bright streetlights, and not worry very much about it.
