Accessing Your Personal Data. The Extensive and Often Surprising Data… | by Jeff Braun

The Extensive and Often Surprising Data that Companies Have about You, Ready and Waiting for You to Analyze

Image created with the assistance of DALL-E 2

Data privacy laws are appearing in countries all over the world and are creating a unique opportunity for you to learn how others view you while also gaining insights into yourself. Most laws are similar to the European Union’s General Data Protection Regulation, commonly known as “GDPR”. The GDPR includes provisions requiring organizations to tell you the type of personal data they store about you, why they are storing it, how they are using it, and the length of time they store it.

But the laws also include an often overlooked requirement commonly known as data portability. Data portability requires organizations to give you a machine-readable copy of the data they are currently storing about you upon request. In the GDPR, the right to obtain a copy of your data is defined in Article 15, “Right of access by the data subject”, and the right to receive it in a structured, machine-readable format is defined in Article 20, “Right to data portability”. The data that organizations hold often includes a rich and varied set of features and is clean, making it ripe for a range of data analysis, modelling, and visualization tasks.

In this article, I share my journey of requesting my data from a few of the companies with whom I routinely interact. I include tips for requesting your data as well as ideas for using your data in data science and for personal insights.

Think you have a solid grasp on your taste in music? I thought I had broad and varied musical tastes. According to Apple, though, I am more of a die-hard rocker.

Want to refine your geographic data mapping skills? These data sources provide a spectacular amount of geocoded data to work with.

Plot of a walk through Universal Studios — Image by Author

Care to try your time series modelling skills? Multiple data sets come with fine-grained time series observations.

Forecast of exercise time using Apple health data — Plot by author

The best news of all? This is your data. No license or permissions needed.

Fasten your seat belt — the variety of data you will receive is broad. The types of analyses and modelling you can do are non-trivial. And the insights you gain about yourself and how others view you are intriguing.

To keep the focus on insights from the data and in the interest of brevity, I do not include code in this article. Everybody likes code, though, so here is a link to a repo with several of the notebooks I used to analyze my data.

Getting the Data

If you make a list of organizations that have data about you, you will quickly realize the list is large. Social media companies, online retailers, cellular phone carriers, internet service providers, home automation and security services, and streaming entertainment providers are just a few categories of organizations storing data about you. Requesting your data from all of these groups can be quite time-consuming.

To make my analysis manageable, I limited my data requests to Facebook, Google, Microsoft, Apple, Amazon and my cellular carrier, Verizon. Here is a table summarizing my experience with the data request and response process: