
Much Ado: MS Search Data

Hold onto your hats, folks, cuz there's a PR/marketing storm sweeping the lands! It seems that Microsoft has decided to take the "bold" step of removing the IP address associated with a search query once that query hits the 6-month mark. Ooooo, how exciting. (That was cynicism.) Actually, I couldn't care less. Well, OK, I think this is a good thing, but let's be honest here: it's so minor and trivial that it's just the thing for PR/marketing, not a meaningful privacy improvement.

So, why all my cynicism on the topic? Allow me to draw your attention to the AOL search data leak of August 2006. For a quick background on that story, check out these links (don't worry, I'll wait):
* Wikipedia
* TechCrunch
* EFF

Done reading? Great! So, let's look at the problem...

A unique tracking ID
First and foremost, despite assertions about removing the IP and "de-identification," Microsoft (and presumably every other search engine in the world) is still retaining indexed data. The index value appears to be a hash or some other unique number. No matter how you do it, if all queries from the same person carry the same identifier, then voilà! you can correlate search data. From the MS description, it appears that they're referencing your searches by the hash value in your cookie.

So, the problem here is tracking queries to a common index, because it leads directly to correlation. Why is that important? Well, because the content itself is far more interesting than the alleged PII associated with the query.
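To make the correlation point concrete, here's a rough Python sketch. Everything in it is made up for illustration: the log records, the field names (`ip`, `cookie_id`, `query`), and the helper functions. I have no idea what Microsoft's actual log format looks like. The point is only that dropping the IP does nothing if a per-user identifier stays behind.

```python
from collections import defaultdict

# Hypothetical log records; field names and values are invented for illustration.
LOG = [
    {"ip": "192.0.2.10", "cookie_id": "a3f9", "query": "pizza near main street"},
    {"ip": "192.0.2.10", "cookie_id": "a3f9", "query": "jane q public"},
    {"ip": "198.51.100.7", "cookie_id": "77c2", "query": "weather tomorrow"},
    {"ip": "192.0.2.44", "cookie_id": "a3f9", "query": "knee surgery recovery"},
]

def strip_ip(records):
    """Drop the IP field and keep everything else (the assumed 'de-identification' step)."""
    return [{k: v for k, v in r.items() if k != "ip"} for r in records]

def correlate(records):
    """Group queries by whatever per-user identifier is still retained."""
    by_id = defaultdict(list)
    for r in records:
        by_id[r["cookie_id"]].append(r["query"])
    return dict(by_id)

# Even with every IP removed, each identifier still maps to a full search history.
profiles = correlate(strip_ip(LOG))
for cookie_id, queries in profiles.items():
    print(cookie_id, queries)
```

Note that the third `a3f9` query came from a different (made-up) IP than the first two, and it still lands in the same profile. The IP was never the thing doing the linking.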

The search content itself
One major lesson we learned from the AOL data leak incident was that people tend to search on things we in the infosec community would not necessarily expect. Things like their own name, their own credit card number, their own address, and lots of other personally identifying information.

So, combine a unique tracking ID that ties together all of a single person's searches with the search data itself. Voilà! You can find all sorts of interesting and scary details about a given person.
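As a rough illustration of how little effort that takes, here's a Python sketch that scans one ID's search history for anything that looks like a payment card number. The sample queries, the regex, and the function names are all my own assumptions, and the card number is the well-known `4111 1111 1111 1111` test value. A real analysis would look for names, addresses, and plenty more; this is just one easy check.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def flag_card_queries(queries):
    """Return the queries that contain a Luhn-valid card-like number."""
    flagged = []
    for q in queries:
        for match in CARD_RE.finditer(q):
            digits = re.sub(r"\D", "", match.group())
            if luhn_ok(digits):
                flagged.append(q)
                break
    return flagged

# Hypothetical search history for one tracking ID (invented for illustration).
history = [
    "weather tomorrow",
    "4111 1111 1111 1111 unauthorized charge",
    "knee surgery recovery",
]
print(flag_card_queries(history))
```

Once a card-number query sits in the same profile as a name search and a medical search, the "anonymous" ID isn't anonymous in any useful sense.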

Conclusion
Is it nice that Microsoft is now removing IPs at the 6-month mark? Sure. But why do they need that information at all? As long as they're tracking queries to a unique ID and then retaining that data, our privacy is still undermined. Never mind that they still have a 6-month window to work with. I'm rather unclear on why this is being hailed as a major breakthrough. It all seems like much ado about nothing.


About

This page contains a single entry from the blog posted on January 19, 2010 10:54 AM.


Creative Commons License
This weblog is licensed under a Creative Commons License.