Author Archives: Dawud Gordon

Verizon Smart Rewards not so Smart

A while back Verizon introduced their Smart Rewards program, which offers bonuses to users.  All of the marketing for this program makes it seem like you just sign up and get stuff, that’s it.  But it’s not until you get down to the very fine print that you see you must enroll in “Verizon Selects” to participate, and that you are paying for those bonus points (of questionable value) with your data:

“Participation in Smart Rewards may require enrollment in Verizon Selects, which personalizes marketing customers may receive from Verizon and other companies by using information about customers’ use of Verizon products and services including location, web browsing and app usage data.”

On the Verizon Selects website they say a little about what data they use:

“Simply put, Verizon Selects will use location, web browsing and mobile application usage data, as well as other information including customer demographic and interest data”

So whether or not you agree with the transparency, it is still an interesting concept to reward interested individuals for their data instead of just taking it.  It appeared to be a step in the right direction from our point of view. That was until it came out that the Verizon Selects opt-out didn’t actually opt you out, and that even those who never opted in were still surrendering their data without rewards.  Jacob Hoffman-Andrews blew the lid off of this by doing a little snooping in the information that Verizon mobile browsers were putting out there.

Earn “points” for surrendering your personal data, while Verizon circumvents any privacy you had on your mobile device.

To make matters worse, the really bad part was not that Verizon was jacking your data (they were), but that they were circumventing all of your privacy protections by making you completely trackable to every website you visited.  More or less, every time your phone talks to the outside world, they insert a marker into that conversation (at the cell tower, mind you) which tells the other party who you are.  It’s like Verizon was trying to shoot itself in the foot from a customer trust standpoint, but had tiny, child-like feet and a bow-and-arrow, so they had to work really, really hard before they managed it.  The New York Times also reported that AT&T has/had a similar program in the works.  I bet the conversation in upper management there went from “whose fault is it that we didn’t do this first?” to “thank Zeus we didn’t do that” in a single heartbeat.

91% of Americans feel they are not in control of their info, and think it’s their own fault.

From the BBC Article on the Pew Research Privacy Report

The Pew Research Center released a study on how Americans feel about their privacy in a post-Snowden era.  Not surprisingly, the results show that 91% of Americans feel they don’t have control over how their data is used, and for good reason.  The information they worry most about is consistent with other studies: data which can be used to defraud or impersonate them has the highest priority, followed by behavioral information, with social and demographic data at the lowest priority.

However, what is interesting is the extremely high importance the new study revealed for health care information, which as far as I can tell is not dangerous in terms of fraud or impersonation, but just … well … private.  Similarly, inter-personal communications such as emails, calls and texts are also considered highly private.  The increasing importance of the privacy of communication could very well be attributed to the events surrounding Snowden and the NSA.

How Individuals Prioritize Who They Feel is Responsible for Safeguarding their Privacy Across Different Countries and Continents. 1 means “Most Responsible” – Taken from a Lightspeed GMI Presentation at MRMW’14

But this only becomes really interesting when looking at the Pew survey in context.  A recent study from Lightspeed GMI surveyed where individuals lay responsibility for preserving privacy: “whose job is it to keep this stuff private?”  Germany, Mexico and India all put the mandate on the mobile providers, and then the government, to regulate the use of private data and guard individual privacy.  In the US, however, people believe it is their own job (followed by providers, marketing companies, and only then government) to keep their information private.  Yet most people do not even hold the data they feel responsible for protecting, nor do they understand how analytics can extract sensitive information from seemingly unrelated data (some health information can be decoded from accelerometer data, for example, as the sketch below illustrates).
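To make that last point concrete, here is a minimal sketch, entirely my own, of how plain accelerometer readings can be turned into behavioral information. The thresholds are made up for illustration; real systems learn them from labeled data:

```python
import math

def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer reading."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def classify_activity(window):
    """Guess an activity from the variance of acceleration magnitudes.

    `window` is a list of (x, y, z) readings covering a few seconds.
    """
    mags = [magnitude(s) for s in window]
    mean = sum(mags) / len(mags)
    variance = sum((m - mean) ** 2 for m in mags) / len(mags)
    if variance < 0.05:
        return "resting"
    elif variance < 1.0:
        return "walking"
    return "running"

# A phone lying still reads roughly (0, 0, 9.81): gravity only.
print(classify_activity([(0.0, 0.0, 9.81)] * 50))  # -> "resting"
```

Chain enough of these guesses together over days and weeks and you get sleep patterns, activity levels, and hints about health, all from a sensor nobody thinks of as sensitive.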

Beyond Quantified Self: Quantified Community

The next big buzzword in Quantified Self is out, and it didn’t come from Quantified Selfers. “Quantified Community” is the term real estate moguls are using when they refer to the newest high-end real estate projects in New York City.  Real estate companies would like to quantify everything they can about the individuals within a space, as well as environmental parameters of that space.  The goal is to present investors and renters/buyers with a complete documentation of the space and its inhabitants to help them reach a decision.

Hudson Yards Developer’s Concept

The Hudson Yards real estate project on Manhattan’s West Side is the proving ground for this new concept.  They want to measure everything from air quality in the environment to individual step counts and other quantifiers on residents’ smartphones, all opt-out of course. Even quantifying individuals who are not participating seems to be within the scope.  A suggestion to use Google Glass has been made, which will surely prompt outrage; perhaps “Community of Glass-holes” will be trending soon.

The resulting data will be used not only for marketing purposes, but also to improve city planning in the immediate future.  Even reducing the energy footprint of these communities appears possible with accurate usage data. Users are expected to want to contribute their data, since their participation will have a positive impact on their daily environment.  As always, the success of the project will live and die with the willingness of users to share.  And that depends on how well they understand data analytics, how much they trust the entity collecting the data, and how well they understand what it will, and will not, be used for.

Sandy Pentland on the Open Platform for Personal Data with Privacy

This summer MIT’s Media Lab introduced the OpenPDS platform with the goal of putting users in control of their data.  It is coming out of Sandy Pentland‘s Human Dynamics Lab, and he is involved in the project, although his role in any business plans is still unclear.  We have spoken about him before; he is a great proponent of personal data empowerment and a powerful voice with the ear of scientists, business and politicians in equal measure. The concept was introduced in a paper that details an architecture which is meant to allow users to benefit from their own data without sacrificing their privacy.

OpenPDS Architecture, taken from the MIT Media Lab (http://openpds.media.mit.edu/images/overview.png)

“Only answers, no raw data”

How does it work? The main idea is that users can allow 3rd party applications to ask “questions” of their personal data store, without giving those applications access to the data itself.  This works along the lines of how an app can ask Android or iOS what the user is doing (walking, running, cycling, in transport, etc.) without accessing location or motion sensor data directly.  OpenPDS lets the user control the computation of the answer (they call it a “SafeAnswer”) on the personal data store, and only the content required for correct 3rd party app behavior is delivered.
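As a rough illustration of the idea (my own toy code, not the openPDS implementation; all class and method names are invented), an answers-only data store could look like this:

```python
class PersonalDataStore:
    """Toy 'answers only, no raw data' store in the spirit of openPDS."""

    def __init__(self, records):
        self._records = records          # raw data never leaves this object
        self._approved_questions = {}    # question name -> computation

    def approve_question(self, name, compute):
        """The owner whitelists a computation (the 'SafeAnswer')."""
        self._approved_questions[name] = compute

    def ask(self, name):
        """A third party gets the answer, never the underlying records."""
        if name not in self._approved_questions:
            raise PermissionError(f"Question '{name}' was not approved")
        return self._approved_questions[name](self._records)

# Example: a fitness app may learn the *count* of workouts,
# but not when or where they happened.
store = PersonalDataStore([
    {"type": "workout", "place": "gym", "time": "2014-11-01T18:00"},
    {"type": "workout", "place": "park", "time": "2014-11-03T07:30"},
])
store.approve_question(
    "workouts_this_month",
    lambda recs: sum(1 for r in recs if r["type"] == "workout"),
)
print(store.ask("workouts_this_month"))  # -> 2
```

The third party never touches the raw records; all it ever sees is the number 2.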

“A New Deal on data is needed”

From a business standpoint, the concept is very much aligned with that of Personal, which we have discussed here before.  They also create a secure personal data safe, and then give the user tools to use that data to personalize their internet environment.  While Personal has been running a business since 2009, OpenPDS appears to be launching the first real competition now. What remains to be seen is: will a group of motivated, personal-data-empowered individuals be enough to revolutionize the personal data ecosystem?  Or is a collaborative platform needed, where the power of all of their personal data is combined, in order to effect real change?

Cutting Edge Research Shows the Future of Wearables

18th International Symposium on Wearable Computers (ISWC) 2014 from Sept. 13th to 18th in Seattle, WA

The International Symposium on Wearable Computers (ISWC) is the leading academic conference for top-tier research in wearable computing. The 18th annual event is taking place in two weeks in Seattle, sponsored by Microsoft Research, Google, Intel and Yahoo!, to name a few.

Topics range from applications for the space program (Amy Ross, leader of NASA’s space suit program, is giving a keynote!) to bleeding-edge artificial intelligence, and even tongue tracking!

Check out the program and stay up to date on the latest breakthroughs that you will see coming out on wearable devices in the next few years here:

www.iswc.net/iswc14/

Also don’t miss two talks from twosen.se on how to infer group behavior from smart phone/watch sensors on Monday and Tuesday!

Your Data is Worth $100 a Month

Luth Research is offering users $100 a month to completely invade their privacy.  For that price they put software on your phone that tracks your location, your conversations, your web browsing behavior: basically everything you do with your phone.  Luth then examines how users interact with their devices in order to gauge purchasing behavior.  For example, they found out that people who visit a car dealership are already prepared to buy, which I guess is a big deal.  Apparently some of the obvious assumptions in marketing have never been empirically proven before, and data science is slowly confirming them.

Purportedly, 20,000 browser users have agreed to this deal, with 6,000 mobile users enrolled as well.  That’s a lot!  One caveat seems to be that the $100 is an “up to” value which depends on how many surveys you fill out.  I would guess that a good deal of active participation is required alongside the passive trackers to get your full paycheck. Recently DataCoup, a New York startup, started offering users $8 a month to acquire their data.

If Luth is paying $100 and DataCoup is paying $8, is Luth overpaying, or is DataCoup not paying you what your data is worth?  Or perhaps DataCoup is not quite as invasive as Luth’s solution.  In the end, users will decide where the tradeoff between currency and privacy lies.

What Wearables Mean for Privacy

Currently the world is experiencing a surge in wearable technology. Every week a new press release hits the front pages about some major company bringing out its new line of sleek wearables that automatically interface with everything else you own.  At the same time, app ecosystems are springing up, with everyone desperate to make the killer application that adds value to our lives.  Those of us who were researching wearable devices long before they became hype know that the really interesting aspect of wearables is not only what they can do for us, but how well situated they are to help our technological environments better understand us.

However, the current app ecosystem for mobile devices dictates that “he who builds the software, owns the data.”  We can either decide to use the software and accept the fine print, or not use it at all.  The crux: the people who make our technology now have the ability to understand us better than we understand ourselves, in a fashion which goes beyond our own powers of self-reflection.  And the consumer has no way of knowing how much information they are really giving away, and currently no way of controlling it.

Wearable devices are perfectly positioned to observe our behavior, actions, interactions, and even character and emotional state.

Alex (Sandy) Pentland is a researcher at MIT who is one of the most highly cited computer science researchers of all time.  I’ve cited him myself, often. His work is mostly focused on using technology, especially wearables, to understand the complex facets of human social interaction. A recent article from The Verge names him the “Godfather of Wearables,” a title which he certainly deserves. The field is a hybrid of psychology, sociology, mathematics, physics and computer science, which researchers have now begun to refer to as “Computational Social Science.”

“Algorithms, computers and sensors can give us insight into aspects of social interaction, and into ourselves, that are beyond the abilities of our own perception”

Sandy Pentland’s research is so interesting because, for the first time, he demonstrated that algorithms, computers and sensors can give us insight into aspects of social interaction, and into ourselves, that are beyond the abilities of our own perception.  In essence, these devices can understand us better than we understand ourselves! The Verge article mentions his “Sociometer,” which extracts cues from human speech patterns and behavior to discover the outcome of a social interaction before the individuals involved know it themselves: for example, whether a salary negotiation will be successful or not. I remember a great keynote talk of his where he demonstrated the power of these analytics to evaluate the effectiveness of a company based only on inter-department, person-to-person interaction.  Even more interesting, the work resulted in Sandy’s book “Honest Signals,” where the insight gained from machines about humans is given back to the humans.  And, even more spectacular, some of it is actionable! Reading that book helped me improve my presentation style, for example, by understanding how I can appear excited about the content and sure of myself at the same time.

But Pentland is not the only one working in this direction.  Scientists all over the world are working to make sense of the tremendous amount of data generated by mobile and wearable devices. The results are astonishing.  For example, monitoring how you interact with your phone and which apps you use allows your personality type to be inferred. A smartwatch with a sensor on its back can recognize your stress level at any given time.  Tracking eye movements can tell us not only what you are looking at, but what kind of information you are consuming and how you are consuming it.  An app on your phone, watch, hearing aid or other wearable device can tell someone whether you would be a good future employee, based only on your physical, non-verbal behavior.   Wearable devices can also be used to detect depression, which could be a very useful diagnostic tool. If you would like to stay abreast of what is happening at this level, keep an eye on the annual International Symposium on Wearable Computers (ISWC), coming up this year in Seattle in September.

“it is almost impossible to know what an app knows about you based on the permissions it requests”

So what does this all mean?  The first thing to note is that any device you wear or carry with you can tell someone else amazing and incredibly insightful things about you.  The second is that, as a lay person, it is near impossible for you to tell what can be gleaned about your psyche from any basic type of raw data (motion sensing, skin resistance, location, app usage, etc.): it is almost impossible to know what an app knows about you based on the permissions it requests. For the scientists working on this technology, it is incredibly exciting and unbelievably scary.  We create the technology that makes all of this possible, but are not the ones who decide how it will be used.  If used correctly, the devices and systems we use have the power to make our lives more enjoyable and improve our experience of the world.  If used incorrectly, they make us completely naked in the eyes of an observer whom we may not be aware of, know personally, or trust in any fashion.

“Having our internal psychological workings exposed, means we can be manipulated by everybody”

And it is not that we would be physically naked.  However embarrassing or uncomfortable that might be for some people, this type of naked would be far worse: our psyche would be naked. As demonstrated in the research, we don’t need to get at your brain waves to figure out what you are thinking, because your body displays this information in a humanly imperceptible, but technically measurable, way, like a lie detector for “honest signals.”  As children, we learn to pick up on these signals and understand what is going on in the minds of the people around us. We also learn to use that knowledge to manipulate people (every child knows to put their parents in a good mood before asking for something big or risky).  But having our internal psychological workings exposed means we can be manipulated by everybody.  In a previous article I talked about how personal information is acquired and resold by data brokers on the personal information market. Imagine if telemarketers could pay to be alerted by your wearable devices when you are in a minor depression, and then use that to their advantage.  Do we really want that kind of information on the market?

From a privacy perspective that sounds very alarmist and menacing.  But the technology itself is neither intrinsically good nor bad; it only becomes bad when it is not combined with a conscious decision-making process.  I am personally thrilled that wearables are finally finding the traction they deserve, and I use them myself.  But I think it should be clear to the consumer that a danger is there, and that he or she is probably not capable of recognizing it. The potential is very real to create a world that very few would want to live in, and for those in the know, it is also very possible with the technology we have now. The solution lies in control over the flow of data and where the power of decision making lies.  Pentland himself advocates this: users should decide what data goes where, instead of the “agree vs. decline,” “all or nothing” approach that we currently have in the mobile app ecosystem.

But it is not that simple, since the person making the call about who can obtain what data is usually not aware of the latent meaning within that data.  It is necessary to empower users with their own data, and also with the tools and understanding they need to weigh the value of giving someone their data against the services and software they receive in return.  The difficulty lies in making individuals aware of a danger they cannot perceive, and convincing them that it is something they need to pay attention to.  The problem is of a social, economic and political nature, and therefore does not have a technological solution.  Rather, a grass-roots consumer movement, driven by an awakening, must be brought about to enact the economic and political change necessary to realize true personal data empowerment.

Selling your data means sacrificing your privacy. Is it worth it?

We as internet users can empower ourselves with respect to our private data by becoming the source for that data and selling it ourselves.  Several startups are testing software to help us, offering different degrees of anonymization.  However, any data scientist will tell you that this anonymity can be broken with a couple of minutes of analysis.  So the question is: do users understand this, and more importantly, do they care?

Personal data generates trillions of dollars per year in revenue, but the individuals who generate that data are only rewarded with the smallest trinkets.  Their data is collected by 3rd parties, and is stored, sold and used outside of their control.  So now the question is, how can we as users empower ourselves to take back control of our data, and with it our portion of the revenue generated by it?

The first step to empowerment is to have access to your own data.

As is almost always the case with novel ideas and technological frontiers, startups are leading the way.  The first step in empowering ourselves is to be able to access our own data. Reclaiming our data from all of the websites and retailers who have collected it is definitely impractical, probably impossible, and maybe even illegal.  But collecting it ourselves at least allows us to use it for our own purposes.  For example, if I have a list of all products I bought in the last week, it doesn’t take that information away from Amazon, but it does let me give or sell it to someone else if I want to.  The grass-roots “quantified self” movement has provided us with apps such as Moves, Sleep As Android, App Usage Tracker, etc., which record where you went, what you did there, how often you work out, how much and how well you sleep, and a whole host of other aspects that you might or might not want to be aware of.  While this is a step in the right direction, one of the problems is that each app only targets one or a few aspects of you, and doesn’t necessarily grant you full freedom outside the app for using or accessing that data.

One startup called Personal is helping users aggregate all this information in a so-called “personal data vault.”  Their concept is to empower you to provide your data to third parties of your choosing, in order to make your internet experience a more personalized one.   For example, if I allow a website to access the information that I pay for a gym membership (but don’t ever go), I might be able to get rid of ads which are trying to convince me to sign up.  Another thing they can help you with is automatically filling in forms, through a paid add-on app.  However, the company is not interested in helping users sell their data, as they see this as counterproductive: it would inevitably compromise the privacy of users.

Recently a whole flood of startups aimed at marketing personal data have made it through a founding round or two, opened their doors and announced betas.  These are the likes of DataCoup, Handshake and Data Fairplay, all of which have a similar concept: to allow you to sell your own data through them to whoever out there is interested.  All of these startups also offer different levels of anonymity, where your name, email address, phone number, etc. can be removed before sale.  As we mentioned previously, the more anonymous your data is, the less it is worth.  But anonymizing protects you, as the information you are selling may often reveal far more about you than you are aware of.

Here’s the problem.  As any data scientist will tell you, data of this kind can’t truly be anonymized. In the same way that your web browser can be uniquely identified by how you configured it, your personal information identifies you even without your name on it, just like your fingerprint does.  For example, 87% of individuals in the US can be identified by their birthdate, sex and ZIP code alone.  Furthermore, anyone who has access to other sources of information about you, often even public records, can easily find out who you are based on similarities between the public data sets and your anonymously purchased data.
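A toy uniqueness check shows how little analysis this takes. The records below are invented, but the same few lines run over a real “anonymized” data set would tell you exactly which rows are one-of-a-kind on those three attributes:

```python
from collections import Counter

def fraction_unique(records, quasi_identifiers):
    """Fraction of records singled out by a combination of attributes.

    A record is re-identifiable here if no other record shares its
    values for the given quasi-identifiers.
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)

people = [
    {"birthdate": "1985-02-10", "sex": "F", "zip": "02139"},
    {"birthdate": "1985-02-10", "sex": "F", "zip": "02139"},
    {"birthdate": "1990-07-21", "sex": "M", "zip": "94103"},
    {"birthdate": "1971-11-03", "sex": "F", "zip": "10001"},
]
print(fraction_unique(people, ["birthdate", "sex", "zip"]))  # -> 0.5
```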

When it comes to personal data, does privacy trump empowerment or are the monetary returns worth the sacrifice?

Selling your data, anonymized or not, is selling someone information about yourself, not just information about some anonymous individual.  In other words, when it comes to the personal data economy, empowerment currently comes at the cost of privacy. Since all of the enablers here are startups in a beta phase, it is still unclear whether this is a trade people want to make given the risks.  The questions now are whether users understand the risks involved in selling their own “anonymized” data, and whether they care.

Let us know what you think in the comments!  Will you be selling your data?

What is your digital value? Part 2: what is your data worth to you?

The value we put on our personal data is not so much based on what we think its monetary value is. We assign value far more based on how exposed the data leaves us, who is buying it, and what they will do with it.  In essence, the most important factor deciding the subjective value of our data is trust! Here is why.

In our first post we established that personal data has an enormous value in the internet and marketing ecosystems, around $860 to $1,200 per internet user per year (roughly $70 to $100 a month) on average.  Despite its intrinsic value, we still give it away in return for trinkets such as reward points or miles, or for services like Facebook, Foursquare or Google.  The organizations on the receiving end of that deal then turn a profit by monetizing that data in some way. In general, most people would be interested in exchanging their data for cash. However, that is conditioned on transparency as to what is done with the data, and by whom.  So now we ask the question: what is the monetary value which people attach to their own personal data?

There is no established market for individuals to actually sell their own data, so most of the readily available information on this topic comes from exploratory research, not from business experience. One of the methods used in research to establish a subjective estimation of the value of personal data is the “willingness-to-protect” method.  Subjects are asked to imagine a specific product, such as a new smartphone.  They are told there are two versions: one which is free but collects and “leaks” their personal data, and one which is a paid product but does not collect any data.  The users are then informed of the price of the paid version, and asked to select either the data-collecting free version or the paid variant.  By varying the hypothetical cost of the paid version, and the type of data collected, it is possible to generate statistics for the subjective value of different types of data (a toy version of this estimation is sketched below).
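As a toy version of that estimation (the numbers are hypothetical, and real studies use more careful statistical models), you can interpolate the price at which half the subjects still choose the paid, non-leaking version:

```python
def indifference_price(choices):
    """Estimate the price at which 50% of subjects still pay to protect.

    `choices` maps an offered monthly price to the fraction of subjects
    who chose the paid (non-leaking) version at that price.  We linearly
    interpolate where acceptance crosses 50%.
    """
    points = sorted(choices.items())
    for (p0, a0), (p1, a1) in zip(points, points[1:]):
        if a0 >= 0.5 >= a1:  # acceptance falls as the price rises
            return p0 + (a0 - 0.5) * (p1 - p0) / (a0 - a1)
    raise ValueError("Acceptance never crosses 50% in the tested range")

# Hypothetical survey results for one data type: acceptance by price.
acceptance = {10: 0.90, 30: 0.70, 50: 0.45, 70: 0.20}
print(round(indifference_price(acceptance)))  # -> 46 (dollars a month)
```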

Using the willingness-to-protect technique shows that we sort our data into three different value categories.  Data which can be used to steal your identity or impersonate you is the most valuable, at around $150 – $250 a month.  Data which documents your behavior in some way, such as your location, your browser history or your medical records, is in the medium value category, worth $40 – $60 a month to protect.  Information which is already available on the internet, either because we put it there or because we assume others have already collected it (profile information, demographics, purchase history, etc.), is only worth $3 – $6 a month to protect. This seems clear: our estimation of the value of our data hinges on how much damage we think it can do to us.

How much people would pay to protect different types of personal data. Image from http://designmind.frogdesign.com/.

But this is not the whole picture.  Our valuation of our own data also depends heavily on how well we understand the uses for that data.  For example, people put a low value on their electricity bill and energy consumption information.  This is because they do not see it as a threat to their security, or understand the amount of insight it provides into their behavior.  In reality, fine-grained power consumption information can be used to observe the behavior of people in their own homes with astonishing accuracy.   So our valuation of our own data is also based on our understanding of the informational insight it represents.  However, we seem to have a poor understanding of how much information is actually in there!

The willingness-to-protect approach to evaluating the subjective value assigned to personal information does not show the whole picture either.  The question remains: if we would pay $X to protect our data, does that mean we would also sell it for $X?  It turns out that, depending on the type of personal data in question, the amount for which people would sell their data is 4 – 5 times what they would pay to protect it.

One way to find out what price individuals really put on their personal data is to trick them into thinking they are actually selling it.  In one such study, individuals put prices of around 10 GBP ($15) a month on their location.  Location information was in the middle pricing segment of behavioral information ($40 – $60 a month to protect), and selling should command a factor of 4 – 5 over protecting.  So why is this value so far below what we would expect the price to be based on the other studies?

User Values for their Location Data for Academic Purposes (Left) and Commercial Interest (Right), in GBP (multiply by 1.65 for USD) *(How Much is Location Privacy Worth?, Danezis et al.)

One reason is that in that case the users trusted the people they were giving their data to: the scientists running the study.  Under the same circumstances, when people are told that there is commercial interest in the data, they increase the price they want for it.  When people believe that data will be given to unknown entities or used for unknown purposes, their price increases.  This increase is about a factor of two, putting the value squarely in the middle of the “protect” bracket for that category of information, but still far under what would seem to be the expected selling point (a factor of 4 – 5). Why is that?  Who knows.  But while these principles (that value scales with understanding, concern and transparency of privacy) hold true across different countries and cultures, the values themselves (the dollar amounts assigned to data) vary widely, even within Europe.  Perhaps the varying locations of the different studies affect the scale of what is of which value to whom.

Here is the wrap-up.

How we value our data depends on how we understand its informational content, and how damaging it can be to us. Information which can be used to steal our identity is expensive, information which documents our behavior is in the middle, and demographics and profile information already available on the net is fairly cheap.  However, if we can’t see all of the players in the personal data game, don’t understand their motives, or don’t agree with their goals, that price skyrockets.  What is the solution?  

If you want to access our data, you need to win our trust through transparency!

What is your digital value?

PART 1: HOW MUCH IS YOUR DATA WORTH TO OTHERS?

The recent NSA spying sh*tstorm has demonstrated that our digital data matters.  It is not just data; it is a digital representation of our physical selves.  Our identity is now digital as well as physical.  Platforms such as Second Life offered us the possibility to spend time on the virtual side of this identity, or even to create a new identity we preferred over reality, but this concept never really caught on.  Maybe it came too early, or people just didn’t want it.  But Facebook, blogs, personal websites and the whole mess of social networking platforms show clearly that part of our identity is increasingly in digital form.

What is your private data?

This identity consists of a whole soup of different types of information flotsam and jetsam which is your private data.  Your private data (a.k.a. “personal data”) is any piece of information which documents any aspect of yourself.  This could be bits of information which document your physical bag of meat, blood and bones: height, hair color, IQ, etc.  It may also contain information about your behavior, such as your workout routine or where you are.  This behavior can also be online activity or consumer behavior, such as where you shop, what you buy, or which websites you routinely visit.  Other types of information could be about your history, such as your education, life events, or moving house.  You get the point: everything-and-anything.

From a more practical standpoint, NIST defines what they call “personally identifiable information” (PII).  This is any information which can be used to identify you, either alone or in combination with other information.  That is extremely broad. Even anonymous and seemingly innocuous pieces of information can be used to identify you when brought together.  This happens when the combination of the anonymous information leads to a unique configuration, even if each piece is not unique.  For example, the way you configure your browser is often unique enough to identify you, like a fingerprint.  Don’t believe me? See for yourself!  Furthermore, even a few simple hints can narrow you down fast: 87% of people in the US can be uniquely identified by gender, ZIP code and date of birth alone.  So everything is private data, but in the U.S. there is almost no legislation with respect to what can and can’t be done with that data in the private sector.  Parts of Europe have far more restrictive privacy laws, with the wealthier northern Europe leading the way.
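The arithmetic behind that kind of uniqueness is worth seeing once. Singling out one person among roughly seven billion takes about 33 bits of identifying information, and every attribute contributes bits according to how rare its value is. A small sketch (the attribute frequencies are rough illustrations, not measured values):

```python
import math

def surprisal_bits(frequency):
    """Bits of identifying information in an attribute value shared by
    a given fraction of the population."""
    return -math.log2(frequency)

# ~33 bits are enough to single out one person among ~7 billion.
print(math.log2(7e9))  # -> ~32.7

# Illustrative (made-up) frequencies for three common attributes:
attributes = {
    "ZIP code":      1 / 40_000,  # roughly 40k ZIP codes in the US
    "date of birth": 1 / 36_500,  # roughly 100 years of possible days
    "sex":           1 / 2,
}
total = sum(surprisal_bits(f) for f in attributes.values())
print(round(total, 1))  # -> ~31.4 bits: nearly enough on their own
```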

How is your data created?

Any time you fill out an online form, open a webpage, sign up for something, or even just buy a product with a credit or debit card, you are creating your own digital footprint, or private data.  Sometimes it’s for your own use, such as when you enter your height into the scale in your bathroom so it can calculate your BMI.  Other times you entrust it to an online retailer when you create an account with them.   In many instances, the fine print allows them to use it for marketing purposes, which may even include resale.

https://panopticlick.eff.org/ shows you how unique your digital fingerprint is.

Loyalty programs are a great example of this: for a small incentive, you provide the offering company with all the information they need to connect all your purchases to your name or ID.  In some cases and countries, membership is not even needed, and retailers can track you and create a profile using the card you used for payment.  This allows them to improve their service to you, but also to find out all kinds of stuff about you as well.  Sometimes it gets super creepy. For example, the giant retailer Target figured out that a teenage girl was pregnant based on changes in her purchasing behavior, and sent her ads for baby paraphernalia. The urban legend goes that Target knew before she did, which makes an awesome story, but it looks like in reality they figured it out after she did but before daddy was clued in. He blew a gasket, thinking the retailer was encouraging rather curious behavior in his daughter.  Severe awkwardness ensued.

Many online companies track your browsing behavior using so-called “beacons“.  These are pieces of code within webpages that alert a central location that you have visited the page.  And they are everywhere (if you don’t like it, check out Ghostery). This is all legal.  There are laws protecting what can and cannot be done with this data, although they are pretty weak in the States, especially when compared with Germany, for example.
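Conceptually a beacon is tiny. Here is a stripped-down sketch of the receiving end (my own toy, not any real tracker’s code): the page embeds an invisible one-pixel image, and the server logs every browser that loads it.

```python
# The tracked page embeds something like
#   <img src="http://tracker.example/pixel.gif" width="1" height="1">
# and every page view pings this server.  Standard library only.
from http.server import BaseHTTPRequestHandler, HTTPServer

# Smallest valid transparent GIF.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00!"
         b"\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
         b"\x00\x02\x02D\x01\x00;")

class Beacon(BaseHTTPRequestHandler):
    def do_GET(self):
        # The Referer header tells the tracker which page you were on;
        # a cookie (not set here) would tie all visits to one browser.
        print("hit from", self.client_address[0],
              "on page", self.headers.get("Referer", "(unknown)"))
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.end_headers()
        self.wfile.write(PIXEL)

if __name__ == "__main__":
    HTTPServer(("", 8000), Beacon).serve_forever()
```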

However, your personal data is often collected or shared illegally.  For instance, institutions which you have entrusted with your personal data can be hacked.  Google the words “personal data leak hacked” and you’ll see how many examples of this there are. Hackers can then resell that data for a profit.  Sometimes, large companies “inadvertently” collect data when they shouldn’t be, such as an LG TV which just happened to send home highly valuable viewer information even if you explicitly opted out.  That’s right: they integrated the opt-out option and then ignored what you entered.  Google Street View drove around collecting data from unencrypted WiFi networks all over the world “by accident,” which got them in some serious hot water.

How much are people paying for and profiting from this data?

Data from a single individual is sold on the market for anywhere from less than a single cent up to around $5, depending on who you ask and the content of the data.  The data is sold in batches containing data from multiple users, and although the price for a batch can be in the thousands of dollars, this boils down to very little per individual.

Interestingly enough, although the data is bought for peanuts, it is estimated that the value gained from this data is around $1,200 per individual, about 240,000 times more.  This is a simple estimation based on the total revenue generated by internet advertising per year, divided by the number of internet users. I’m sure in reality it is not that simple, and there are other factors at play here as well.  Another take on the value of private data puts the total at $430bn (315bn EUR) for 2011 and projects an increase to $1.4tn (1tn EUR) by 2020 in Europe alone! Assuming 500m internet users in Europe in 2011, that’s around $860 (630 EUR) per person. This begs the question: why the discrepancy between what the data is worth and its price?
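The back-of-the-envelope math is simple enough to write down; the inputs are the figures quoted above:

```python
# Per-person value of European personal data, using the quoted figures.
eu_value_2011 = 430e9          # $430bn total value in Europe, 2011
eu_internet_users = 500e6      # assumed 500m internet users in Europe
print(eu_value_2011 / eu_internet_users)  # -> 860.0 dollars per person

# Compare with what the data actually fetches on the market:
price_per_user = 0.005         # half a cent, near the low end quoted
value_per_user = 1200.0        # estimated value extracted per individual
print(value_per_user / price_per_user)    # -> 240000.0 times more
```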

One reason is that your data is sold many times over, not just once.  I couldn’t find any figures on how often one individual’s data is bought and sold per year, but it is probably a large number. The total amount of money paid for the data of a single person may therefore be a lot closer to $1,200 than suggested.  The data is also gathered in many different seedy ways, and is therefore often incorrect, making it less valuable.

What decides the value of your data?

Your data becomes more valuable if you have certain conditions, such as diabetes or ADHD.  Recent life events, or events in your immediate future, such as having a baby or getting divorced, can also increase how much your data is worth.  Obviously, the more information about you that is available, the more the set of all of your data is worth.

Aside from collection, sale and resale, there are also other ways to turn a profit using your data.  Companies like Facebook, Google and Foursquare use your data to target you with ads which are better suited to your needs and preferences.  While it might sound creepy, from your point of view it just means that you like more of what you see when using their web services.  Not so bad, is it? In essence, we are paying for the services provided by these companies with our private data.

How much of their profit is generated using your data is unclear.   Facebook and Google make $5 and $20 per year per user respectively, and some would argue that most or all of this is generated through user data.  On the flip side, their wide reach and massive number of active users would not be worthless even if they were not collecting user data.

The important thing to note is that while they make unfathomable amounts of money every year using personal data, they don’t make this money by selling it.  Just the opposite: they go to great lengths to protect it.  And they have good reason to.  Their business is built on the fact that users surrender their data in order to use the services these companies provide.  And the premise is that users will do this if the service is useful and if they trust the company offering it.  These days most people know these companies are amassing a wealth of their personal data, and seem to be generally OK with it, presumably because they trust the web-giant powers that be not to betray them.

What does it all mean?

Your data is big bucks.  But not for you.  Your private data is acquired:

  • in return for services rendered like social networking, email, cloud storage (remember the old saying “if you’re not paying for it, you’re the product”)
  • in exchange for small rewards such as loyalty programs and cashback programs,
  • through indirect monitoring via web beacons and single sign-ons,
  • or illegal activities such as viruses and hacking of otherwise trustworthy sources.

This data was worth 315bn EUR in revenue in 2011 in Europe alone, but almost none of that landed in the pockets of the people who originally owned it.  There are laws which protect private data, with northern Europe leading the way.  However, as long as people continue to give this data away willingly for almost nothing, it seems difficult to foresee any change in the near future.

And now, in conclusion, I will make an outrageous statement based on loosely coupled facts taken completely out of context.  If you are an active user of Facebook (or of any other company with the same business plan), you can assume that they have amassed almost all of your data, either because you gave it to them directly or through their ability to infer it from what you did give them. If your data is worth $860 to $1,200 a year, then that means your Facebook bill is $70 – $100 a month!!!

Does that seem worth it to you? Tune in to part two of this series where I will look at what your data is worth to you.

One final note.  If these numbers are even approximately correct, then Facebook and Google are very inefficient at turning that value into profit ($5 and $20 per user per year respectively). Now…is that a good or a bad thing?