December 11, 2009

The Goldilocks Problem of Privacy in Public

Filed under: commons, events, musings, networks, politics, privacy — wseltzer @ 8:55 am

One of the very interesting sessions at Supernova featured a pair of speakers on aspects of privacy and publicity: danah boyd on “visibility” and Adam Greenfield on “urban objects.” Together, their talks set me thinking about the functions of privacy: how do we steer a course between too much and too little information-sharing?

danah pointed to the many places where we don’t learn enough. We “see” others on social media but fail to follow through on what we learn. She described a teen whose MySpace page chronicled abuse at her mother’s hands for months before the girl picked up a weapon. After the fact, the media jumped on “murder has a MySpace,” but beforehand, no one had used that public information to help her out of the abuse. In a less dramatic case of short-sighted vision, danah showed Twitter users responding to trending black names after the BET Awards with “what’s happening to the neighborhood?” Despite the possibilities networked media offer, we often fail to look below the surface, to learn about those around us and make connections.

Adam, showing the possibilities of networked sensors in urban environments, described a consequence of “learning too much.” Neighbors in a small apartment building had been getting along just fine until someone set up a web forum. In the half year thereafter, most of the 6 apartments turned over. People didn’t want to know so much about those with whom they shared an address. Here, we might see what Jeffrey Rosen and Lawrence Lessig have characterized as the problem of “short attention spans.” We learn too much to ignore, but not enough to put the new factoid in context. We don’t pay attention long enough to understand.

How do we get the “just right” level of visibility to and from others? And what is “just right”? danah notes that we participate in networked publics; Helen Nissenbaum talks of contexts. One challenge is tuning our message and understanding to the various publics in which we speak and listen, knowing that what we put on Facebook or MySpace may be seen by many and understood by few. Like danah, Kevin Marks points out the asymmetry between the publics we speak to and those we listen to.

Another challenge is to find connections among publics and build upon them to engage with those who seem different: what Ethan Zuckerman calls xenophilia. The ’Net may have grown past the stage where Internet use alone was conversation-starter enough, but spaces within it take a common interest and create community. Socializing in World of Warcraft or in a blog’s comments section can make us more willing to hear our counterparts’ context.

Finally, our largest public, here in the United States, is our democracy. We need to live peacefully with our neighbors and reach common decisions. Where our time is too limited to bestow attention on all, do we need to deliberately look away? John Rawls, in Political Liberalism, discusses political choices supported by an “overlapping consensus” from people with differing values and comprehensive views of “the good.” I wonder whether this overlapping consensus depends on a degree of privacy and a willingness to look away from differences outside the consensus.

December 8, 2009

Personalized Search Opacity

Filed under: Internet, code, search — wseltzer @ 6:11 am

Google announced Friday that it would now be “personalizing” all searches, not just those for signed-in users. If your browser has a Google cookie and you haven’t explicitly opted out, your search results will be customized based on your search history.

Danny Sullivan, at Search Engine Land, wonders why more people aren’t paying attention:

On Friday afternoon, Google made the biggest change that has ever happened in search engines, and the world largely yawned. Maybe Google timed its announcement that it was personalizing everyone’s search results just right, so few would notice. Maybe no one really understood how significant the change was. Whatever the reason, it was a huge development and deserves much more attention than it has received so far.

I agree this is a big deal, even if it’s only the next step in a trend begun by customized search for signed-in users years ago. And except for here, I won’t even mention the P-word, “privacy.” Because on top of the implications of storing all a user’s search history, I wonder about the transparency of personalized search. How do we understand what search looks like to the world as it gets sliced up by history, location, and other inferences search providers make about their searchers?

As users, we’ve basically come to terms with the non-transparency of the search algorithms that determine which results to show and how to order them. We use the engine that mostly gets us relevant results (or perhaps the one that offers shopping discounts). If we’re dissatisfied with the results Google returns, we can use Yahoo or Bing.

We also have some degree of trust that search isn’t systematically discriminating against particular pages or providers for undisclosed reasons. When Google received copyright takedown demands from the Church of Scientology years ago, prompting it to remove many links to “Operation Clambake,” Google sent the takedowns to Chilling Effects and linked them from its search pages so searchers could see why the search had apparently become more pro-Scientology in its results. More recently, the search engine has worked with the Berkman Center’s StopBadware to flag malware distribution points and let searchers know why sites have been flagged “harmful.” When a racist image appeared in searches for “Michelle Obama,” Google used an ad to explain why, but did not tweak algorithms to remove the picture.

How do we verify that this trust is warranted, that page visibility is a relative meritocracy? With open source, we could read the code or delegate that task to others. With a closed platform where we can’t do that, our next best alternative is implicit or explicit comparison of results with others. Investigative journalists might follow a tip-off that liberal media seemed to rank higher than conservative media, run some comparisons, ask some questions, and report back. Search engine optimizers, motivated to improve their own pages’ rankings, might alert us to biases that cause unfair demotions; we can believe we’re seeing a reasonable mix of digital camera stores because proprietors would complain if they were omitted. If something “feels wrong” to enough people, chances are it will bubble up through the crowd for verification (or debunking; see the complaints that the iTunes “shuffle” feature isn’t random, from listeners who confuse randomness with a non-random even distribution). If a search engine failed to disclose payment-induced bias, the FTC might even follow with a complaint.
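As an aside on that shuffle complaint, here is a minimal sketch in Python (with placeholder artist names, nothing drawn from iTunes itself) of why genuinely random play rarely looks “even”: independent random picks produce back-to-back repeats and lopsided counts, which listeners then read as bias.

# Hypothetical sketch: truly random picks (with replacement) versus the
# "even spread" listeners expect. Artist names are placeholders.
import random
from collections import Counter

artists = ["A", "B", "C", "D", "E"]
plays = [random.choice(artists) for _ in range(20)]  # independent random picks

repeats = sum(1 for i in range(1, len(plays)) if plays[i] == plays[i - 1])
print("play sequence:", " ".join(plays))
print("back-to-back repeats:", repeats)   # usually more than zero
print("play counts:", Counter(plays))     # rarely a perfectly even 4-4-4-4-4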

With personalized search, these crowd-sourced modes of verification will work less well. We won’t know if the biases we encounter in search are also seen by others, or if the store shuffles its end-caps when it sees us walk in. It would be easier for an Evil search provider to subtly tweak results to favor paying clients or ideologies, unnoticed.
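To make that concrete, here is a minimal sketch in Python of the kind of crowd-sourced check personalization undercuts: two searchers run the same query and measure how much their top results overlap. The result URLs below are hypothetical stand-ins; in practice the lists would come from two different browsers or searchers. Once results are tailored per person, a low overlap no longer distinguishes bias from personalization.

# Hypothetical sketch: Jaccard overlap of two searchers' top-k results for
# the same query. Result URLs are placeholders.

def overlap_at_k(results_a, results_b, k=10):
    """Share of top-k results the two searchers have in common (Jaccard)."""
    top_a, top_b = set(results_a[:k]), set(results_b[:k])
    if not (top_a or top_b):
        return 1.0
    return len(top_a & top_b) / len(top_a | top_b)

user_a = ["example.com/review", "news.example/story", "shop.example/deal"]
user_b = ["shop.example/deal", "blog.example/post", "news.example/story"]

print("top-3 overlap:", round(overlap_at_k(user_a, user_b, k=3), 2))  # 0.5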

Finally, I’m reminded of the “ants” in Cory Doctorow’s excellent Human Readable — an automated adaptive system so complex even its creators can’t debug it or determine its patterns. If someone is paying off the ants, society can’t trace the payments.

When I put a version of this transparency question to the “real-time search” panel at Supernova, Barney Pell of Bing suggested that users don’t want to know how the search works, only that it gets them useful results. Part of my utility function, though, is fairness. I hope we can reconstruct that broader view in a world of ever-more-personalized search.
