Data Broker Program
The Office of HIPAA Privacy and Security, in collaboration with UHealth IT, the IRB and the CTSI, has established a “data broker” program to ensure compliance with the HIPAA Privacy and Security Rules and with requirements pertaining to the use and disclosure of protected health information (PHI). The Data Broker acts as an independent intermediary between the Clinical Enterprise and other components of UHealth, including Research and Business Operations.
The Data Broker program will implement a standard clinical data request process for sharing clinical data for research and healthcare operations in compliance with HIPAA and IRB requirements. This request process will use the Service Now platform, so that each request is documented from origination to fulfillment. The Data Broker function also includes promulgating safe data handling practices and providing data de-identification services. These measures will help enhance our overall data security, which is increasingly important because the growth in the use of electronic medical records, electronic insurance claims processing and other healthcare information systems has led to massive increases in the collection and storage of PHI.
Health information remains one of the most sensitive types of personally identifiable information (PII). Individuals want to be assured that their healthcare information, including their diagnoses, lab results and medications, remains accessible only to those with an absolute need to know.
The process of de-identification, by which identifying information is removed from a data set so that the data cannot be linked to a specific person, mitigates privacy risks to individuals. Using de-identified data whenever possible reduces risk to the organization by decreasing the dissemination of regulated data such as PHI, which in turn reduces the potential for data breaches. The HIPAA Privacy Rule provides two methods for de-identification of health information: expert determination and safe harbor. De-identification leads to information loss, which may limit the usefulness of the resulting health information in certain circumstances. This is particularly true for the safe harbor method, which uses a strict, inflexible but tried and tested approach. The expert determination method takes a risk-based approach that applies current standards and best practices from de-identification research.
The data broker, in conjunction with UHealth IT, provides services that can de-identify clinical data sets using software solutions from Privacy Analytics, a market leader in the expert de-identification method. As with the safe harbor method, all direct identifiers such as name, social security number, medical record number and email address must still be removed or masked. However, depending on the data set, the expert method may allow greater flexibility in the handling of indirect identifiers such as event dates. This can result in richer data sets that better preserve the analytical qualities of the data while still seeking an appropriately low probability of re-identification.
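As a rough illustration of the distinction above, the following Python sketch removes direct identifiers from a record and then generalizes dates to the year, which is the stricter, safe-harbor-style treatment of dates. It is not the Privacy Analytics software or its interface; the field names (record_id, mrn, admit_date and so on) and both helper functions are hypothetical and exist only for this example.

```python
from datetime import date

# Hypothetical field names treated as direct identifiers in this sketch.
DIRECT_IDENTIFIERS = {"name", "ssn", "mrn", "email"}


def strip_direct_identifiers(record):
    """Return a copy of the record without any direct identifier fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def generalize_dates_to_year(record):
    """Replace every date value with its year only (safe-harbor-style)."""
    return {k: (v.year if isinstance(v, date) else v) for k, v in record.items()}


record = {
    "record_id": 1001,            # surrogate key, not a medical record number
    "name": "Bob Smith",
    "ssn": "000-00-0000",
    "mrn": "M123456",
    "email": "bob.smith@example.org",
    "admit_date": date(2021, 3, 14),
    "diagnosis": "E11.9",
}

deidentified = generalize_dates_to_year(strip_direct_identifiers(record))
print(deidentified)
```

Under the expert method, the date step might be handled less coarsely, but only to the extent that the risk assessment for the specific data set supports it.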
Privacy Analytics provides one tool for de-identification of structured data. Please note that the software requires data sets with unique record identifiers, so the data extraction needs to be structured for this purpose. UHealth IT will perform the initial extract of the data and can provide guidance on which data fields are available, criteria for data selection, etc. This structured data set will then be passed to the data broker for de-identification.
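The unique record identifier requirement can be checked before an extract is handed over. The sketch below is only an assumed pre-check, not part of the Privacy Analytics tool; the function name duplicate_record_ids and the record_id field are hypothetical.

```python
from collections import Counter


def duplicate_record_ids(rows, id_field="record_id"):
    """Return the sorted identifiers that appear on more than one row."""
    counts = Counter(row[id_field] for row in rows)
    return sorted(rid for rid, n in counts.items() if n > 1)


# Hypothetical extract in which identifier 2 appears twice.
extract = [
    {"record_id": 1, "diagnosis": "E11.9"},
    {"record_id": 2, "diagnosis": "I10"},
    {"record_id": 2, "diagnosis": "J45.909"},
    {"record_id": 3, "diagnosis": "I10"},
]

dupes = duplicate_record_ids(extract)
if dupes:
    print("Extract has repeated record identifiers:", dupes)
```

An empty result means every row carries a distinct identifier.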
Uniquely, Privacy Analytics also offers another tool for de-identification of unstructured data such as progress notes. Note that the unstructured data set (one single field containing free-form text, for example) must be a separate and distinct data set from any structured data. The two tools are distinct and are run separately: one is used only for structured data and the other only for unstructured data. The unstructured-data tool uses natural language processing techniques to detect personally identifiable information (PII) in unstructured text, such as names, locations, IDs, CPT codes, ages, dates, phone numbers and email addresses. Once this information is detected, there are multiple options for handling it, including masking (replacing the characters with *****) or replacing (a similar value is chosen at random to realistically mask the personal information, e.g. Bob Smith is replaced with John Jefferies).
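The difference between masking and replacing can be sketched in a few lines of Python. This is a deliberately simplified stand-in: it detects email addresses and phone numbers with regular expressions and relies on a supplied list of known names, whereas the actual tool uses natural language processing. The patterns, the surrogate name pool and both function names are assumptions made for this example.

```python
import random
import re

# Simplified patterns standing in for NLP-based detection (assumption).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")

# Hypothetical pool of surrogate names for the "replacing" option.
SURROGATE_NAMES = ["John Jefferies", "Maria Lopez", "Alan Chen"]


def mask(text, known_names=()):
    """Masking: overwrite each detected identifier with asterisks."""
    for pattern in (EMAIL_RE, PHONE_RE):
        text = pattern.sub("*****", text)
    for name in known_names:
        text = text.replace(name, "*****")
    return text


def replace_names(text, known_names, rng):
    """Replacing: swap each known name for a randomly chosen surrogate."""
    for name in known_names:
        candidates = [n for n in SURROGATE_NAMES if n != name]
        text = text.replace(name, rng.choice(candidates))
    return text


note = "Bob Smith called from 305-555-0100; email bob.smith@example.org."
masked = mask(note, known_names=["Bob Smith"])
replaced = replace_names(note, ["Bob Smith"], random.Random(0))
print(masked)
print(replaced)
```

Masking makes it obvious where information was removed, while replacing keeps the text readable at the cost of hiding where the substitutions occurred.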