The internet has made possible what power structures have tried, and usually failed, to do throughout history: to know everything there is to know about their citizens. Now there is no need for anyone to ask us for information. We gladly give it away, and we are doing so in larger and larger numbers.
Think about the first time you connected to the internet: the sheer amount of information available was dazzling. You could travel the world in a few seconds without leaving your home. Every other human being seemed closer than ever, and things looked pretty safe, as you were just one of the millions of people connected to the internet. Anonymity instilled a sense of freedom that was not possible in real life.
Years have passed, the internet has improved, the number of connected users has increased exponentially, and some companies started wondering: who are these users, and how much can we find out about them? Apparently, quite a lot.
We had a look at how companies use large data sets to improve their marketing efforts. Facebook and Google are some of the biggest data handlers in the world, and by offering free web services such as social networking or search, these companies attract hundreds of millions of users on a daily basis. These users have certain interests, profiles and friend connections, and they are willing to give away all this information without much thought.
If, say, a Coca-Cola representative approached us on the street and started asking about our personal data, our interests in different areas and information about our friends, we would be rather skeptical, wouldn’t we?
Yet that is basically what happens every time we search for something on Google, update our Facebook profile, read an article on the web or simply send an email. Even though some information is anonymized, and even though there are laws meant to prevent privacy breaches, the truth is that some companies hold real-time information on large masses of people, information that can be used at either a macro or a micro scale. From individuals to countries, such companies know a lot and have become increasingly good at harnessing the power that lies in these bits of data.
Micro and macro implications
On a micro level, individuals basically offer some of their most intimate information to companies that use it for marketing purposes. As the saying goes, “if you are not paying for it, you are the product being sold”. That holds true for both companies mentioned above. It is no secret that Facebook and Google make most of their revenue through advertising. To increase advertising efficiency, both companies need to know as much as possible about the person viewing the ad.
Both companies thrive on the information users offer, knowingly or not. Whether it is the page you are viewing or the information on your Facebook profile, you tell advertisers how to better sell their products.
Another implication of sharing so much data is that you become predictable. Even though we look at ourselves as unique, special individuals, the fact is we are not. We are creatures of habit, and habits turn into patterns. When important events happen in our lives, our behavior becomes even more predictable. Target used customer data to find out when its buyers were expecting a child. Based on a series of products that future mothers are more likely to purchase, the company managed to target those exact customers, sometimes even before their friends or family found out.
You might think that companies and other organizations can track you only if you choose to use your real identity. Actually, no. Several techniques have been developed to help deanonymize internet users. One of them is based on stylometry, the analysis of writing style. Although work on this subject dates back a few centuries, the internet has made it possible to analyze very large volumes of text.
Be it your blog, your Facebook profile or movie reviews you posted online, the fact is that you leave traces on the internet through your writing style. Even if you publish a text anonymously and make sure you are not traceable by classic means, stylometry can point towards you. Arvind Narayanan, a computer scientist focused on “breaking data anonymization, and more broadly […] digital privacy, law and policy”, has explained how this can happen, which steps are necessary and what the technological requirements are.
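To make the idea more concrete, here is a minimal sketch of how stylometric matching might work. It is not the method described by Narayanan; it is a toy illustration that assumes a tiny, hypothetical list of function words as style markers and compares texts with cosine similarity. The names `FUNCTION_WORDS`, `style_profile`, `closest_author` and the sample texts are all made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of style markers (an assumption for
# this sketch); real studies use far more features and far more text.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that",
                  "is", "it", "but", "i", "you", "very", "really"]


def style_profile(text):
    """Return the relative frequency of each marker word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_author(anonymous_text, known_texts):
    """Pick the known author whose style profile is most similar."""
    target = style_profile(anonymous_text)
    scores = {
        author: cosine_similarity(target, style_profile(text))
        for author, text in known_texts.items()
    }
    return max(scores, key=scores.get), scores


# Made-up sample texts, only for illustration.
known_texts = {
    "alice": "I really think that it is very nice, but I really do not know.",
    "bob": "The analysis of the data in the report is a summary of the results.",
}
anonymous = "I really believe that it is very good, but you really never know."

author, scores = closest_author(anonymous, known_texts)
print(author, scores)
```

With texts this short the result means very little. The point is only that habits such as favorite filler words survive even when the name on the text is removed, and that comparing them requires nothing more than counting.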
Although the micro implications are interesting, they are just the tip of the iceberg. With enough data, anything is possible. And I mean anything. Think stock markets, the military, health and epidemic research, the economy, global intelligence. Basically, all the power structures our civilization depends upon may find large data sets extremely interesting.
In 2010 China routed traffic intended for some very discreet US organizations through its servers for roughly 18 minutes. “Not too long”, one might think, but enough to result in a 300-page report for the US Congress. If 18 minutes' worth of internet traffic routing caused such a stir, imagine how much of an impact the information passing through Facebook and Google has on the global political scene.
With enough data, stock market crashes and bubbles could be predicted, and social movements could be news before they even happen, just like military strikes or economic crises. One thing is for sure: there is great power in the data users provide online.