These terms all sound similar and cause confusion. Let me start by saying that, depending on their context, they all refer to the same problem: “identifying a person based on the information”, which is the core concern of privacy. For some people, “privacy vs. confidentiality” may also be confusing, but it’s not our topic today.
Let’s start by defining terms:
PII: Stands for “Personally Identifiable Information”. According to NIST SP 800-122, PII is any information about an individual maintained by an agency, including:
1. Any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records;
2. Any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.
PHI: Stands for “Protected Health Information”. The HIPAA Privacy Rule defines it as: “Individually identifiable health information, held or maintained by a covered entity or its business associates acting for the covered entity, that is transmitted or maintained in any form or medium (including the individually identifiable health information of non-U.S. citizens)”.
The HIPAA Privacy Rule also explicitly treats genetic information as health information.
ePHI: Stands for “electronic Protected Health Information”. As you can guess, if any PHI is digitized (i.e. created, stored, transmitted, or received electronically), then it’s ePHI.
As a result, we can consider ePHI as a subset of PHI.
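The subset relationship can be sketched in a few lines of Python. This is only an illustration of the definitions above, not a legal test: the `Record` class and its four boolean fields are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical flags, loosely mirroring the definitions quoted above.
    identifies_individual: bool
    health_related: bool
    held_by_covered_entity: bool
    electronic: bool


def is_phi(record: Record) -> bool:
    """PHI: identifiable health information held by a covered entity."""
    return (
        record.identifies_individual
        and record.health_related
        and record.held_by_covered_entity
    )


def is_ephi(record: Record) -> bool:
    """ePHI: PHI that exists in electronic form, so always a subset of PHI."""
    return is_phi(record) and record.electronic


paper_chart = Record(True, True, True, electronic=False)
scanned_chart = Record(True, True, True, electronic=True)

print(is_phi(paper_chart), is_ephi(paper_chart))
print(is_phi(scanned_chart), is_ephi(scanned_chart))
```

Because `is_ephi` can only return true when `is_phi` does, every ePHI record in this sketch is also a PHI record, while a paper chart is PHI without being ePHI.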
That leaves ePHI out of the discussion. We can now concentrate on PII vs. PHI.
Let’s enumerate what types of information are considered PII, summarizing the list from the NIST document:
- Name
- Personal identification number
- Address
- Unique asset information (IP, MAC, etc.)
- Phone numbers
- Personal characteristics like photographs, x-rays, biometric data, etc. (this item covers a lot and is open to interpretation)
- Owned property (e.g. vehicle registration number)
- Information about an individual that is linked or linkable to one of the above.
The last item sounds vague, but it’s necessary. It widens the scope of PII considerably and forces data owners to reconsider what information they hold and how they can use it while still guaranteeing privacy. We don’t know what types of information we will be recording in the next few years; with such a catch-all statement, the definition covers future types and, of course, types that are not documented today.
Now let’s see what types of information PHI covers, summarized from the HIPAA regulation:
- Names
- All geographic subdivisions smaller than a State (there are some exceptions for ZIP codes; please refer to the HIPAA regulation for more information)
- All elements of dates (except year)
- Telephone & facsimile numbers
- Electronic mail addresses
- Personal identification numbers (e.g. Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers)
- Other asset identification numbers (e.g. certificate/license numbers, vehicle identifiers and serial numbers, device identifiers and serial numbers)
- URLs
- Internet protocol (IP) address numbers
- Personal characteristics (biometric identifiers, full-face photographic images and any comparable images)
- Any other unique identifying number, characteristic, or code, unless otherwise permitted by the Privacy Rule for re-identification
The last item is again open to interpretation and acts as a safety mechanism for future types that we cannot foresee and/or that are not covered explicitly in the list.
When we compare the definitions and lists, we see that they are almost the same. If you look at the definition of PII again, you will see that item 2 explicitly includes “medical” information.
The only distinction to be aware of is that PHI (and thus ePHI) is limited to the health information context. All of them are defined to protect and enforce our privacy.
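To make the list more concrete, here is a minimal Python sketch that drops or coarsens fields matching some of the identifier types above. The field names (`name`, `ip_address`, `birth_date`, and so on) are assumptions made up for this example, and the sketch deliberately ignores the ZIP code exceptions and the catch-all last item, so it should not be read as a compliant implementation.

```python
from datetime import date

# Hypothetical field names, loosely grouped after the identifier list above.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "phone", "fax", "email",
    "ssn", "medical_record_number", "license_number",
    "vehicle_id", "device_serial", "url", "ip_address", "photo",
}

# Dates are not dropped entirely: only the year is kept.
DATE_FIELDS = {"birth_date", "admission_date"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of record without direct identifiers, dates cut to year."""
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the field completely
        if key in DATE_FIELDS and isinstance(value, date):
            # e.g. birth_date -> birth_year
            cleaned[key.replace("_date", "_year")] = value.year
            continue
        cleaned[key] = value
    return cleaned


patient = {
    "name": "Jane Doe",
    "birth_date": date(1980, 5, 17),
    "ip_address": "203.0.113.7",
    "diagnosis": "asthma",
}
print(strip_identifiers(patient))
```

A fixed field list like this only catches the identifier types someone thought of in advance, which is exactly the gap the open-ended last item is meant to close.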