Challenge

Canada's fragmented healthcare infrastructure leaves patients without access to their own medical histories across providers. EHR vendors operate in siloed ecosystems with little incentive for interoperability, making cross-institutional care slow, expensive, and incomplete. Meanwhile, rare disease clinical trials struggle to recruit eligible participants, stalling treatments that could save lives.

Solution

Building on prior publications in clinical NLP, we researched and evaluated pipeline architectures for cross-vendor EHR integration. We investigated how NLP + NER can extract and standardize clinical terminology, then experimented with LLM + RAG approaches to perform semantic inference across inconsistently formatted records, mapping variant expressions such as "SOB," "interstitial fibrosis," and "lung scarring" to unified clinical concepts.
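The terminology mapping step can be illustrated with a minimal sketch. It uses a plain synonym table and regex matching as a stand-in for the NER and LLM + RAG components, which are not reproduced here. The table contents, the concept identifiers (placeholders, not real SNOMED CT or UMLS codes), and the function names are assumptions made for illustration only.

```python
import re

# Hypothetical synonym table. Concept IDs are illustrative placeholders,
# not real SNOMED CT or UMLS codes.
SYNONYMS = {
    "sob": "CONCEPT:DYSPNEA",
    "shortness of breath": "CONCEPT:DYSPNEA",
    "interstitial fibrosis": "CONCEPT:PULMONARY_FIBROSIS",
    "lung scarring": "CONCEPT:PULMONARY_FIBROSIS",
}

# Longest phrases first so multi-word terms are tried before shorter ones.
_TERMS = sorted(SYNONYMS, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in _TERMS) + r")\b",
    re.IGNORECASE,
)


def normalize_terms(note: str) -> list[tuple[str, str]]:
    """Return (surface text, concept ID) pairs in order of appearance."""
    return [
        (match.group(1), SYNONYMS[match.group(1).lower()])
        for match in _PATTERN.finditer(note)
    ]


def concept_set(note: str) -> set[str]:
    """Collapse a note to the set of concept IDs it mentions."""
    return {concept_id for _, concept_id in normalize_terms(note)}


if __name__ == "__main__":
    vendor_a = "Pt reports SOB; CT shows interstitial fibrosis."
    vendor_b = "Shortness of breath noted. Imaging: lung scarring."
    # Differently worded notes reduce to the same concept set.
    print(concept_set(vendor_a) == concept_set(vendor_b))
```

A lookup table only covers spellings listed in advance. The semantic inference described above is meant to handle phrasings a table like this would miss.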

We further explored federated node architectures as a mechanism for satisfying multi-jurisdictional privacy regulations (PIPEDA, GDPR) without centralizing PHI. This research informed a full system design with an opt-in, consent-driven trial matching engine, showing how anonymized semantic match scores can connect patients to global clinical trials while preserving data sovereignty.
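The federated, consent-driven matching idea can be sketched roughly as follows. Each node keeps its records local, scores only opted-in patients against a trial's criteria, and returns an opaque token plus a score. All class names, fields, and the scoring rule (fraction of criteria concepts found in the record) are assumptions for illustration, not the submitted design. A salted hash is also only a pseudonym, so a real deployment would need stronger anonymization than shown here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str      # never leaves the node
    concepts: frozenset  # standardized concept IDs for this patient
    opted_in: bool       # explicit consent to trial matching


class FederatedNode:
    """Hypothetical node holding records for one jurisdiction."""

    def __init__(self, jurisdiction: str, salt: str, records: list):
        self.jurisdiction = jurisdiction
        self._salt = salt
        self._records = records

    def _token(self, patient_id: str) -> str:
        # Salted hash as an opaque handle; a pseudonym, not true anonymization.
        digest = hashlib.sha256((self._salt + patient_id).encode("utf-8"))
        return digest.hexdigest()[:12]

    def match(self, criteria: frozenset, threshold: float = 0.5) -> list:
        """Score consenting patients locally; return tokens and scores only."""
        hits = []
        for record in self._records:
            if not record.opted_in or not criteria:
                continue
            score = len(record.concepts & criteria) / len(criteria)
            if score >= threshold:
                hits.append({
                    "token": self._token(record.patient_id),
                    "jurisdiction": self.jurisdiction,
                    "score": round(score, 2),
                })
        return hits


def match_across_nodes(nodes: list, criteria: frozenset, threshold: float = 0.5) -> list:
    """Send the same criteria to every node and merge the returned scores."""
    hits = [hit for node in nodes for hit in node.match(criteria, threshold)]
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)
```

In this sketch the coordinator only ever sees tokens, jurisdictions, and scores. Re-contacting a matched patient would go back through the originating node, which is the only party able to resolve a token.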

Deliverables & Results

  • Full system design submission for a cross-vendor EHR integration platform with NLP + NER + RAG pipeline architecture for clinical terminology standardization

  • Privacy-compliant federated data node model supporting multi-jurisdictional deployment

  • Stakeholder analysis covering hospitals, patients, researchers, and legislative considerations

NLP · NER · RAG · LLMs · FHIR · Federated Architecture · Semantic Search