Under the HIPAA Privacy Rule, de-identification of protected health information (PHI) is the removal of specific information about a patient that can be used, alone or in combination with other information, to identify that patient. Covered entities often use de-identified PHI to conduct research and perform comparative studies. Once PHI has been properly de-identified, it may be used without patient authorization. A recent study published in the Journal of the American Medical Association, however, concluded that de-identification may not be permanent: de-identified PHI can be re-identified, which puts patient privacy at risk.
How is PHI Reidentification Possible?
The HIPAA Privacy Rule authorizes two methods for deidentification of PHI: the Safe Harbor Method, and the Expert Determination Method.
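As a rough illustration of the Safe Harbor Method, which works by removing a fixed list of identifiers and coarsening a few others, the sketch below drops the patient name, reduces the date of birth to a year, truncates the zip code to three digits, and groups ages over 89 together. It is a partial sketch, not a compliance tool: it handles only a handful of fields, the record layout and the function name `generalize_record` are hypothetical, and the `SPARSE_ZIP3` set is a placeholder for the list of low-population zip prefixes that would have to come from Census data.

```python
from datetime import date

# Placeholder set of 3-digit zip prefixes covering 20,000 or fewer people.
# These values are illustrative only; a real list must come from Census data.
SPARSE_ZIP3 = {"036", "059"}


def generalize_record(record: dict, today: date) -> dict:
    """Return a coarsened copy of `record` (hypothetical field names)."""
    birth = record["date_of_birth"]
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    age = today.year - birth.year - (0 if had_birthday else 1)
    zip3 = record["zip"][:3]
    return {
        # The name is simply not copied into the output.
        "birth_year": None if age > 89 else birth.year,
        "age": "90+" if age > 89 else age,
        "zip3": "000" if zip3 in SPARSE_ZIP3 else zip3,
        "gender": record["gender"],
        "diagnosis": record["diagnosis"],
    }


patient = {
    "name": "Jane Example",
    "date_of_birth": date(1980, 5, 17),
    "zip": "02139",
    "gender": "F",
    "diagnosis": "asthma",
}
print(generalize_record(patient, date(2024, 1, 1)))
```

The Expert Determination Method has no comparable recipe, since it relies on a qualified expert's judgment about the risk of re-identification rather than on a fixed list of fields.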
Growth in publicly available personal data online, together with advances in data linking and data analytics, makes PHI reidentification possible. Deidentified data can be combined with other readily available information, such as demographic details (e.g., date of birth, gender, and zip code), to reveal the identity of an individual.
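The kind of combination described above can be sketched as a simple lookup. The example assumes a health data set that has had names removed but still carries date of birth, gender, and zip code, plus a separate public list that carries the same three fields alongside names. All records, field names, and the function `link_records` are made up for illustration.

```python
from collections import defaultdict

# Fields shared by both data sets (assumed for this example).
QUASI_IDENTIFIERS = ("date_of_birth", "gender", "zip")


def link_records(deidentified, public):
    """Map record IDs to names where the shared fields match exactly one person."""
    index = defaultdict(list)
    for person in public:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        index[key].append(person["name"])

    matches = {}
    for row in deidentified:
        key = tuple(row[field] for field in QUASI_IDENTIFIERS)
        candidates = index.get(key, [])
        # Only a unique match reveals an identity; ties stay ambiguous.
        if len(candidates) == 1:
            matches[row["record_id"]] = candidates[0]
    return matches


deidentified = [
    {"record_id": "r1", "date_of_birth": "1975-03-02", "gender": "F",
     "zip": "02139", "diagnosis": "diabetes"},
    {"record_id": "r2", "date_of_birth": "1990-11-20", "gender": "M",
     "zip": "60614", "diagnosis": "asthma"},
]
public = [
    {"name": "Alice Example", "date_of_birth": "1975-03-02", "gender": "F", "zip": "02139"},
    {"name": "Bob Example", "date_of_birth": "1990-11-20", "gender": "M", "zip": "60614"},
    {"name": "Carl Example", "date_of_birth": "1990-11-20", "gender": "M", "zip": "60614"},
]

matches = link_records(deidentified, public)
print(matches)  # {'r1': 'Alice Example'}
```

Record `r1` is re-identified because only one person in the public list shares its date of birth, gender, and zip code. Record `r2` is not, because two people share that combination. The more specific the retained fields, the fewer such ties remain.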
The Journal of the American Medical Association study demonstrated that an artificial intelligence algorithm could re-identify data that had been de-identified; that is, stripped of identifying demographic and health information.
An artificial intelligence algorithm, like any computer algorithm, is a set of unambiguous instructions that a computer can execute. Algorithms range from simple (e.g., one dictating the computer’s strategy in a tic-tac-toe game against a human) to complex, and complex algorithms can be built on top of simpler ones.
AI algorithms are complex algorithms capable of “learning” from data, including “learning” new strategies such as how to detect patterns in data. By detecting such patterns, an algorithm may be able to effect PHI reidentification.
In the JAMA research study, an algorithm was used to identify individuals by pairing two sets of data: physical mobility data and corresponding demographic data (date of birth, gender, and zip code). The study authors concluded that this pairing allowed for PHI reidentification.
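To give a feel for how mobility data and demographic data can work together, here is a deliberately simple sketch. It is not the algorithm used in the JAMA study, which is not described here. It assumes a hypothetical scenario in which an unnamed activity record is compared against named activity records from another source: candidates are first narrowed by demographics, and the closest remaining activity pattern is chosen. The field names, step counts, and the function `closest_match` are all invented for illustration.

```python
import math


def closest_match(target, candidates):
    """Return the name of the same-demographic candidate with the most similar steps."""
    same_demo = [c for c in candidates
                 if c["demographics"] == target["demographics"]]
    if not same_demo:
        return None
    # Euclidean distance between daily step counts; smaller means more similar.
    best = min(same_demo,
               key=lambda c: math.dist(c["daily_steps"], target["daily_steps"]))
    return best["name"]


# Unnamed record from a hypothetical "de-identified" release.
target = {"demographics": ("1985", "F", "941"),
          "daily_steps": [8200, 7900, 10400]}

# Named records from a hypothetical second source.
candidates = [
    {"name": "Person A", "demographics": ("1985", "F", "941"),
     "daily_steps": [8100, 8000, 10500]},
    {"name": "Person B", "demographics": ("1985", "F", "941"),
     "daily_steps": [3000, 2500, 4000]},
    {"name": "Person C", "demographics": ("1972", "M", "100"),
     "daily_steps": [8200, 7900, 10400]},
]

print(closest_match(target, candidates))  # Person A
```

Person C has identical step counts but is ruled out by demographics, while Person B shares the demographics but moves very differently. Neither data set alone singles out Person A; the pairing does.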
In light of this study’s conclusion, covered entities should continue to apply “good data governance” principles – principles for the overall management of the availability, usability, security, and integrity of data.