Overall Concept

Schematic of the OSSE principle.

The OSSE Registry Toolkit

The backbone of OSSE is a registry toolkit that enables scientists with a basic IT background to build a registry for a specific rare disease. A form editor is used to define the forms for basic and longitudinal medical data together with the corresponding data schema. Each field, including, inter alia, its data type, ranges, measurement units and value sets, must first be defined in the metadata repository (MDR).
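To illustrate what such a field definition carries, the following is a minimal sketch of a data element with a data type, a range, a unit and a value set. The class name, attribute names and example elements are assumptions made for illustration; they are not the actual OSSE or MDR schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataElement:
    """Hypothetical metadata entry; names are illustrative, not the OSSE schema."""
    key: str
    data_type: type
    unit: Optional[str] = None
    value_range: Optional[Tuple[float, float]] = None  # inclusive bounds
    value_set: Optional[frozenset] = None

    def validate(self, value) -> list:
        """Return a list of problems; an empty list means the value is acceptable."""
        if not isinstance(value, self.data_type):
            return [f"{self.key}: expected {self.data_type.__name__}"]
        problems = []
        if self.value_range is not None:
            low, high = self.value_range
            if not low <= value <= high:
                problems.append(f"{self.key}: {value} outside [{low}, {high}]")
        if self.value_set is not None and value not in self.value_set:
            problems.append(f"{self.key}: {value!r} not in value set")
        return problems


# Two assumed example elements: a numeric field with unit and range,
# and a coded field with a self-defined value set.
BODY_WEIGHT = DataElement("body_weight", float, unit="kg", value_range=(0.5, 400.0))
SEX = DataElement("sex", str, value_set=frozenset({"male", "female", "unknown"}))
```

A form built on these definitions could call `BODY_WEIGHT.validate(72.5)` for each entered value and show any returned problems to the user.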

The Metadata Repository

Integrating a metadata repository (MDR) for rare diseases into the OSSE architecture makes it easier to combine data from different OSSE registries, because all data set specifications for rare diseases used in the respective registries are available through the MDR. If necessary, further items can be added while building national or regional disease-specific registries; these then become available to all OSSE registries as well. The metadata repository component also allows metadata items to be retrieved from other MDRs. Apart from self-defined value sets, the MDR also provides access to standardized codes such as ICD-10-GM or Orphacode (added later).

The OSSE Registry

The OSSE registry provides role-based access control, plausibility checks on the validity of entered values, data versioning, and basic workflow support through different statuses and the status transitions allowed for each role. A configurable interface for data import and export is available as well. For better discoverability, each registry should register with a registry of registries (RoR).
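The idea of statuses with role-specific allowed transitions can be sketched as a small lookup table. The role names (`data_entry`, `reviewer`) and statuses (`open`, `reported`, `validated`) are assumptions chosen for illustration; an actual registry would configure its own.

```python
# Hypothetical configuration: for each role, the (current, target) status
# pairs that this role is allowed to perform.
ALLOWED_TRANSITIONS = {
    "data_entry": {("open", "reported")},
    "reviewer": {("reported", "validated"), ("reported", "open")},
}


def can_transition(role: str, current: str, target: str) -> bool:
    """Check whether the given role may move a form from current to target."""
    return (current, target) in ALLOWED_TRANSITIONS.get(role, set())


def change_status(form: dict, role: str, target: str) -> dict:
    """Return a new form with the updated status, or raise if not permitted."""
    current = form["status"]
    if not can_transition(role, current, target):
        raise PermissionError(f"{role} may not change {current} to {target}")
    # Record the previous status so that the change stays traceable;
    # the input form is left unmodified.
    history = form.get("history", []) + [current]
    return {**form, "status": target, "history": history}
```

With this table, a data entry user can report an open form, but only a reviewer can validate it or send it back.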

The “Distributed Search” Principle

It is our strong belief that raw patient data should not leave the local registry, even where patient consent has been given, because both patients and data-owning scientists show low acceptance of large data collections and doubt their controllability, particularly outside the reach of national data protection rules. Instead, we developed a concept for a distributed search which makes data from (national and regional) rare disease registries available while respecting data ownership and privacy. The search broker allows search queries to be specified in terms of the existing MDR items. An exposé describing the research question and supplying contact information for the inquiring partners completes the request. The local request interface of each OSSE registry, called the “Collaboration Client”, runs the received inquiries. If the result set is non-empty, it is presented to the person in charge of each matching registry together with the exposé and the inquirer’s contact information. Finally, the data owner decides whether and what to reply.
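The flow described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the OSSE implementation: the classes `Inquiry`, `Notification` and `CollaborationClient`, the function `distribute`, and the representation of search criteria as predicates per MDR key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Inquiry:
    criteria: Dict[str, Callable[[object], bool]]  # assumed: MDR key -> predicate
    expose: str    # description of the research question
    contact: str   # contact information of the inquiring partner


@dataclass
class Notification:
    inquiry: Inquiry
    match_count: int


@dataclass
class CollaborationClient:
    registry_name: str
    records: List[dict]
    inbox: List[Notification] = field(default_factory=list)

    def run(self, inquiry: Inquiry) -> None:
        """Evaluate the inquiry locally; nothing is returned to the caller."""
        matches = [
            record for record in self.records
            if all(key in record and test(record[key])
                   for key, test in inquiry.criteria.items())
        ]
        if matches:
            # Only the local person in charge sees the result and the exposé.
            self.inbox.append(Notification(inquiry, len(matches)))


def distribute(inquiry: Inquiry, clients: List[CollaborationClient]) -> None:
    """Search broker role: forward the inquiry; no patient data flows back."""
    for client in clients:
        client.run(inquiry)
```

The design point is that `run` returns nothing: the decision whether and what to reply stays with the data owner who reads the local inbox.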

Integration of Registries with the OSSE Bridgehead

To enable the integration of registries built with other software, a so-called “OSSE bridgehead” can be installed. The OSSE bridgehead consists of the OSSE core components, including the query interface to the Collaboration Client. Data has to be loaded from the existing registry into the bridgehead by a periodically running ETL (extract, transform, load) process, which can optionally access the local ID management if patient pseudonyms need to be created or modified. During the transformation step, each attribute has to be mapped to a data element defined in the MDR.
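The transformation step can be pictured as a mapping table from source attributes to MDR data elements. The sketch below is an assumption-laden illustration: the source attribute names, the `urn:example:...` identifiers and the conversion functions are invented and do not correspond to real MDR entries.

```python
# Hypothetical mapping: source attribute -> (MDR element identifier, converter).
MAPPING = {
    "sex": ("urn:example:dataelement:1", {"m": "male", "f": "female"}.get),
    "weight_g": ("urn:example:dataelement:2", lambda grams: grams / 1000.0),
}


def transform(source_row: dict):
    """Map one source record to MDR-keyed values.

    Returns the transformed record and the list of source attributes that
    have no mapping yet, so that gaps in the mapping become visible.
    """
    target, unmapped = {}, []
    for attribute, value in source_row.items():
        if attribute not in MAPPING:
            unmapped.append(attribute)
            continue
        mdr_key, convert = MAPPING[attribute]
        target[mdr_key] = convert(value)
    return target, unmapped
```

A periodic job would extract rows from the source registry, pass each through `transform`, and load the result into the bridgehead; reporting the unmapped attributes helps keep the mapping complete.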

Schematic of different scenarios.

Pseudonymization

Data protection regulations in Germany (and also in Europe) require pseudonymization as a basic prerequisite for the registration and storage of patient data in research networks. The part of the data that allows patients to be identified (e.g. family name, first name, date of birth, name at birth) has to be replaced by a pseudonym and stored together with the pseudonym in a separate patient list deployed on a different server, which is controlled by a trusted third party. Every time data for a patient is entered, the identifying data has to be sent to the ID management, where special record linkage algorithms check whether an identical or very similar data set already exists. The corresponding ID/pseudonym is then returned to the requesting system. Research networks with different data stores, e.g. a registry, a research database and a biomaterial database, have to assign a different pseudonym for each system (according to German data protection laws). In Germany, TMF e.V., a technology and methods platform and umbrella organization for networked medical research, developed a comprehensive data protection handbook for medical research networks, which describes, among other things, requirements for ID management and pseudonymization in medical research networks. In compliance with these requirements, the OSSE architecture includes a local ID management/pseudonymization service for each national or regional OSSE registry. The ID management software that we use in OSSE, called “Mainzelliste”, has been released as open source by the Medical Center of Johannes Gutenberg University Mainz (Division of Medical Informatics) and is already used in various other projects.
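The principle of a patient list that returns one pseudonym per patient and system can be sketched as follows. This is not the Mainzelliste API or its matching algorithm: the class `PatientList`, the field names, the similarity threshold and the use of a plain string similarity from Python's `difflib` are assumptions standing in for a real record linkage procedure.

```python
import secrets
from difflib import SequenceMatcher


class PatientList:
    """In-memory stand-in for a patient list run by a trusted third party."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold  # assumed similarity cut-off
        self._entries = []          # (normalized identifying data, {system: pseudonym})

    @staticmethod
    def _normalize(idat: dict) -> str:
        fields = ("family_name", "first_name", "date_of_birth")
        return "|".join(str(idat[name]).strip().lower() for name in fields)

    def get_pseudonym(self, idat: dict, system: str) -> str:
        """Return the pseudonym of this patient for one system, creating it if needed."""
        key = self._normalize(idat)
        for known, pseudonyms in self._entries:
            # Simplified linkage: treat very similar strings as the same patient.
            if SequenceMatcher(None, key, known).ratio() >= self.threshold:
                if system not in pseudonyms:
                    pseudonyms[system] = secrets.token_hex(8)
                return pseudonyms[system]
        pseudonyms = {system: secrets.token_hex(8)}
        self._entries.append((key, pseudonyms))
        return pseudonyms[system]
```

Requesting a pseudonym for the same patient twice, even with a small spelling variation, yields the same value for a given system, while the registry and the biomaterial database each receive their own pseudonym.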