We present CDN (Collaborative Data Network), a new software tool for sharing and querying clinical documents modeled using the HL7 v3 standard (e.g., Clinical Document Architecture (CDA) and Continuity of Care Document (CCD)). Similar to the caBIG initiative, CDN aims to foster innovations in cancer treatment and diagnosis through large-scale sharing of clinical data. We focus on cancer because it is the second leading cause of death in the US.
CDN is based on a combination of peer-to-peer technology, the Extensible Markup Language (XML), and XQuery. Using CDN, a user can pose both structured queries and keyword queries on the HL7 v3 documents hosted by data providers.
CDN is unique in its design: it supports location-oblivious queries in a large-scale network, wherein a user does not explicitly provide the location of the data for a query. A location service in CDN discovers data of interest in the network at query time.
CDN uses standard cryptographic techniques to provide security to data providers and protect the privacy of patients. Using CDN, a user can pose cancer-related clinical queries that contain aggregations and joins across data hosted by multiple data providers.
CDN is implemented with open-source software for web application development and XML query processing. We ran CDN in a distributed environment using Amazon EC2 as a testbed. We report its performance on real and synthetic datasets of discharge summaries. We show that CDN can achieve good performance in a setup with a large number of data providers and documents.