We at Arrka are often asked, “What is the big deal about identifying Personal Information in my organization? It is obvious and clear!”

Let us first understand what exactly Personal Information (PI) is.

While the precise definition of Personal Information (also referred to as ‘Personal Data’ (PD), Personally Identifiable Information (PII), etc.) varies across geographies and laws, the general understanding of what comprises personal information is largely similar. It is basically any information that can be used to identify an individual – directly or indirectly.

What is generally assumed to comprise PI:

Until recently, Personal Information was understood to cover the following key categories:

  • Demographic & Identity Information: Your name, address, contact details, family details, educational background, memberships & affiliations, etc.
  • Financial Information: Your bank details, financial transaction history, investment details, etc.
  • Health & Biometric Information: Your medical records, your biometrics, etc.

PI now comprises the following too:

As the world has become increasingly digital with Social Media, Mobile, Analytics & Cloud (SMAC) pervading most of our daily activities, the data points or identifiers that can be used to identify a person have evolved far beyond the categories above to include the following:

  • Online Identifiers – e.g. IP addresses, location data, cookie data, data derived from trackers placed on individuals, etc.
  • Device Identifiers – e.g. mobile device IDs, device configurations, browser configurations, apps downloaded on the device, etc.
  • Social Media Markers – e.g. posts on social media, ‘likes’ posted on third-party sites, logins across devices, etc.
  • Metadata generated from various sources – e.g. metadata contained in documents, texts, calls, messages, images, videos, etc.
  • Data generated as a result of analytics – e.g. combining data sets that, by themselves, may not identify an individual

The list above is indicative, not exhaustive, and it is expected to keep evolving.
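
To make these categories a little more concrete, here is a minimal Python sketch of how a first-pass data inventory might tag field names with candidate PI categories. The keyword hints, the `classify_field` helper and the sample field names are all hypothetical and purely illustrative. A field name alone says little about what a field actually holds, so anything like this would only be a starting point for human review.

```python
from __future__ import annotations

import re
from enum import Enum


class PICategory(Enum):
    DEMOGRAPHIC_IDENTITY = "Demographic & Identity Information"
    FINANCIAL = "Financial Information"
    HEALTH_BIOMETRIC = "Health & Biometric Information"
    ONLINE_IDENTIFIER = "Online Identifiers"
    DEVICE_IDENTIFIER = "Device Identifiers"
    SOCIAL_MEDIA = "Social Media Markers"
    METADATA = "Metadata"
    ANALYTICS = "Data generated from analytics"


# Hypothetical keyword hints per category (an assumption for illustration,
# not an official or complete list).
KEYWORD_HINTS = {
    PICategory.DEMOGRAPHIC_IDENTITY: ("name", "address", "phone", "email", "education"),
    PICategory.FINANCIAL: ("bank", "account", "transaction", "investment"),
    PICategory.HEALTH_BIOMETRIC: ("medical", "health", "biometric", "fingerprint"),
    PICategory.ONLINE_IDENTIFIER: ("ip", "cookie", "location", "tracker"),
    PICategory.DEVICE_IDENTIFIER: ("device", "browser", "app"),
    PICategory.SOCIAL_MEDIA: ("post", "like", "login"),
    PICategory.METADATA: ("metadata",),
    PICategory.ANALYTICS: ("score", "segment", "profile"),
}


def classify_field(field_name: str) -> set[PICategory]:
    """Return every category whose hints match a token of the field name."""
    # Split on anything that is not a letter or digit, e.g. "ip_address" -> {"ip", "address"}.
    tokens = {t for t in re.split(r"[^a-z0-9]+", field_name.lower()) if t}
    return {cat for cat, hints in KEYWORD_HINTS.items() if tokens & set(hints)}


if __name__ == "__main__":
    for field in ("bank_account_number", "cookie_id", "ip_address", "order_total"):
        labels = sorted(cat.value for cat in classify_field(field))
        print(field, "->", labels or ["no match, review manually"])
```

Note that a field such as `ip_address` matches two categories in this sketch, and a field with no match is not necessarily free of PI. Both cases illustrate why the identification exercise cannot be reduced to a keyword scan.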

PI collection: Direct vs Indirect

Another factor that is often overlooked is that an organization collects PI in two ways:

  • Direct: What an individual knowingly provides to the organization. For example, while filling out a form.
  • Indirect: What an individual may not know is being collected by the organization. For example:
    • Observed Data – Data that gets recorded automatically. E.g. by online cookies, sensors, or CCTV linked to facial recognition.
    • Derived Data – Produced from other data in a relatively simple and straightforward fashion. E.g. profiling a customer from the number of visits to a store and the items bought.
    • Inferred Data – Produced by using a more complex method of analytics to find correlations between datasets and using these to categorise or profile people. E.g. calculating credit scores or predicting future health outcomes.
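
The "Derived Data" case can be illustrated with a small Python sketch, assuming a hypothetical `Visit` record and an arbitrary threshold for what counts as a frequent customer. None of these names or values come from a real system. The point is only that the resulting profile is new PI that the customer never provided directly.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One observed store visit (hypothetical record shape)."""
    customer_id: str
    items_bought: int


def derive_profiles(visits, frequent_threshold=3):
    """Derive a simple profile per customer from observed visits.

    frequent_threshold is an assumed cut-off, chosen only for illustration.
    """
    totals = defaultdict(lambda: {"visits": 0, "items": 0})
    for visit in visits:
        totals[visit.customer_id]["visits"] += 1
        totals[visit.customer_id]["items"] += visit.items_bought

    profiles = {}
    for customer_id, total in totals.items():
        is_frequent = total["visits"] >= frequent_threshold
        profiles[customer_id] = {
            "visits": total["visits"],
            "items": total["items"],
            # The segment label is derived data: it exists only in the
            # organization's systems, not in anything the customer submitted.
            "segment": "frequent" if is_frequent else "occasional",
        }
    return profiles


if __name__ == "__main__":
    observed = [Visit("c1", 2), Visit("c1", 1), Visit("c1", 4), Visit("c2", 5)]
    for customer_id, profile in derive_profiles(observed).items():
        print(customer_id, profile)
```

Inferred data would go a step further, for example by correlating such profiles with other datasets, which is why it is typically harder to spot in an inventory.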

Understanding Organizational PI:

Given the reality above, an organization needs to spend time making sure all possible bases are covered when it identifies the PI it deals with – not just its demographic, financial and health-related data.

What role does Arrka play here:

Arrka helps organizations do a deep dive and identify all the PI they may be dealing with. The Arrka team partners with the organization's privacy team to identify the PI and the teams behind that data, and to map these to the laws & regulations that the organization needs to comply with. This gives the organization a clear picture of what lies ahead.

For further details, contact privacy@arrka.com