Thursday, 20 September 2012

Wikipedia to Start Releasing Full but Anonymous Search Data


Wikipedia has always been about sharing, but now it is sharing more than the knowledge of others with the world: it is sharing everyone's searches, too.

The site has started releasing complete search data daily (i.e., every query from the previous day), both to help improve its own search efforts and, possibly, to help researchers who may be interested in a large data dump like this.

Wikipedia does its best to anonymize the data: there are no IPs, no user names, nothing of that sort. Wikipedia also filters out things like credit card numbers, email addresses, and social security numbers.
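
To give a rough idea of what this kind of filtering might look like, here is a minimal sketch in Python. It is not Wikipedia's actual method, which isn't described here. The patterns, the names `looks_sensitive` and `filter_queries`, and the choice to drop a matching query entirely (rather than mask part of it) are all assumptions made for illustration.

```python
import re

# Hypothetical patterns, chosen for illustration only; the real filters
# used for the released data are not described in this post.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")        # US-style 123-45-6789
CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")    # 13 to 19 digits, optional spaces or dashes


def looks_sensitive(query: str) -> bool:
    """Return True if the query matches any of the assumed patterns."""
    return any(p.search(query) for p in (EMAIL, SSN, CARD))


def filter_queries(queries):
    """Keep only the queries that do not look sensitive (assumed policy: drop, not mask)."""
    return [q for q in queries if not looks_sensitive(q)]


if __name__ == "__main__":
    sample = ["history of rome", "jane@example.com", "4111 1111 1111 1111"]
    print(filter_queries(sample))
```

Pattern matching like this only catches data that has an obvious shape. A name, an address, or a run of related searches can still point to a person, which is why filters alone rarely make a data set truly anonymous.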

Still, efforts like this always walk a fine line since data is rarely as anonymous as it may seem. In fact, Wikipedia was forced to temporarily pull down the data to improve its anonymization methods.

Via: Wikipedia to Start Releasing Full but Anonymous Search Data
