Neon Lights

Research & Project

Data-Driven Decisions: Smarter Financial Planning via Data Visualization and AI-Driven Insights

This project equips DMRF with a robust data architecture to enhance transparency and strategic planning. By analyzing, visualizing, and forecasting financial data (2018–2024), we deliver actionable insights through advanced analytics and predictive modeling.

Integrating MedCLIP and Cross-Modal Fusion for Automatic Radiology Report Generation

We propose a novel cross-modal framework that uses MedCLIP as both a vision extractor and a retrieval mechanism to improve medical report generation. An attention-based extract module pulls features from the retrieved reports and from the image, and a fusion module integrates them, improving the coherence and clinical relevance of the generated reports.

Confidence Bounded Replica Currency Estimation

Replicas of the same data often show varying consistency levels during read and write operations due to network and system limitations. Estimating the currency (staleness) of data from responding replicas without accessing others is crucial for applications needing timely updates. Depending on the confidence in this estimation, queries can decide to use the retrieved replicas or wait for additional responses. Our approach provides theoretical bounds on the confidence of such estimations, ensuring accuracy with minimal overhead. We implement a confidence-bounded replica currency estimation system in Cassandra, introducing a novel DYNAMIC read consistency level.
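
As a rough illustration of the wait-or-return decision described above, the sketch below assumes (this is not taken from the project) that each responding replica independently holds the current value with a known probability `p_fresh`. Under that simplified model, it computes how many responses a query would need before the chance that at least one is current reaches a target confidence. The function names and the independence model are hypothetical; the project's actual bounds may be derived quite differently.

```python
def prob_some_current(responded: int, p_fresh: float) -> float:
    """Probability that at least one of `responded` replicas is current,
    ASSUMING each replica is current independently with probability p_fresh."""
    return 1.0 - (1.0 - p_fresh) ** responded


def replicas_needed(total: int, p_fresh: float, confidence: float) -> int:
    """Smallest number of responses that reaches the target confidence.
    Falls back to `total` (all replicas) if the target is never reached."""
    for responded in range(1, total + 1):
        if prob_some_current(responded, p_fresh) >= confidence:
            return responded
    return total


if __name__ == "__main__":
    # Hypothetical numbers: 5 replicas, each current with probability 0.7.
    print(replicas_needed(total=5, p_fresh=0.7, confidence=0.95))
```

A query could use such a count to decide, per request, whether the replicas that have already answered are enough or whether it should wait for more.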

Contextual Data Cleaning and Ontological Dependencies

Functional Dependencies (FDs) rely on syntactic equality and often mislabel semantically equivalent values as errors in data cleaning. To address this, we introduce Ontology Functional Dependencies (OFDs), which capture semantic relationships, like synonyms, using ontologies. We establish OFD foundations, including axioms, a linear-time inference procedure, and an algorithm for discovering OFDs, including those with exceptions. We develop FastOFD, a contextual data cleaning framework that finds minimal repairs to data and ontologies under OFDs.
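
To make the contrast between FDs and synonym-based OFDs concrete, here is a minimal sketch. It assumes a toy ontology in which each sense lists the values that name it, and it treats a dependency as satisfied when all right-hand-side values that share a left-hand-side value have at least one sense in common. The ontology, column names, and helper functions are hypothetical and only illustrate the idea; they are not the project's implementation.

```python
from collections import defaultdict

# Hypothetical toy ontology: each sense maps to the set of values that name it.
SYNONYMS = {
    "united_states": {"United States", "USA", "America"},
    "canada": {"Canada", "CA"},
}


def senses(value):
    """Return the ontology senses under which `value` can be interpreted."""
    found = {sense for sense, names in SYNONYMS.items() if value in names}
    # A value unknown to the ontology can only match itself.
    return found or {("literal", value)}


def group_by_lhs(rows, lhs, rhs):
    """Collect the distinct rhs values seen for each lhs value."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[lhs]].add(row[rhs])
    return groups


def fd_violations(rows, lhs, rhs):
    """lhs values whose rhs values differ syntactically (classic FD check)."""
    groups = group_by_lhs(rows, lhs, rhs)
    return sorted(key for key, vals in groups.items() if len(vals) > 1)


def ofd_violations(rows, lhs, rhs):
    """lhs values whose rhs values share no common sense in the ontology."""
    bad = []
    for key, vals in group_by_lhs(rows, lhs, rhs).items():
        if not set.intersection(*(senses(v) for v in vals)):
            bad.append(key)
    return sorted(bad)


ROWS = [
    {"code": "US", "country": "United States"},
    {"code": "US", "country": "USA"},
    {"code": "CA", "country": "Canada"},
    {"code": "CA", "country": "Mexico"},
]

if __name__ == "__main__":
    print(fd_violations(ROWS, "code", "country"))
    print(ofd_violations(ROWS, "code", "country"))
```

In this toy data the plain FD check flags both country codes, while the ontology-aware check flags only the code whose values have no shared sense, so the synonym pair is no longer reported as an error.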

CurrentClean: Spatio-Temporal Cleaning of Stale Data

Data currency is essential for up-to-date and accurate data analysis. Identifying and repairing stale data takes more than timestamps, because individual entities each have their own update patterns in both space and time. We develop CurrentClean, a probabilistic system for identifying and cleaning stale values. We propose a spatio-temporal probabilistic model with inference rules to capture database update patterns and identify stale values, recommending repairs based on past update trends.
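
The idea of judging staleness from an entity's own update history can be sketched with a deliberately simple temporal model. The example below assumes that the gaps between updates are exponentially distributed with the entity's observed mean gap, and it ignores the spatial dimension entirely. This model and the function name are illustrative assumptions, not the CurrentClean model itself.

```python
import math


def stale_probability(update_times, now):
    """Estimate the probability that a value has changed since its last
    recorded update, ASSUMING exponentially distributed update gaps.

    update_times: ascending timestamps of past updates (at least two).
    now: the current time, in the same units, not before the last update.
    """
    gaps = [later - earlier
            for earlier, later in zip(update_times, update_times[1:])]
    if not gaps:
        raise ValueError("need at least two past updates to estimate a rate")
    mean_gap = sum(gaps) / len(gaps)
    elapsed = now - update_times[-1]
    return 1.0 - math.exp(-elapsed / mean_gap)


if __name__ == "__main__":
    # Hypothetical entity that was updated every 10 time units.
    history = [0, 10, 20, 30]
    print(round(stale_probability(history, now=60), 3))
```

An entity that usually updates often but has been silent for several typical gaps receives a high staleness probability, while one that rarely changes does not, even when both carry the same last timestamp.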

Collaborators

Data Science Lab
University of Waterloo
Ontario Tech
Tsinghua University
DMRF

Northeastern University—Toronto Campus

First Canadian Place

100 King St. West, Suite 4620

Toronto, ON, M5X 1E2

www.northeastern.edu/toronto
