HPC and AI Power Synthetic Data to Advance Cancer Imaging

SECTOR: Medicine
TECHNOLOGY USED: HPC, LLM, GenAI, Diffusion Models
COUNTRY: Estonia

Organisations involved

Main Participant: Better Medicine is an Estonian SME developing AI tools for cancer diagnostics and imaging.

Domain Expert: Pärnu Hospital contributed clinical expertise, data and feedback from daily radiology practice to the project.

Technology Expert: The University of Tartu provided research support in AI and high-performance computing.

Together, the partners combined medical, academic and technical knowledge to create synthetic CT data that improves cancer detection and supports safer, more accurate AI-driven clinical insights.

 

The challenge

Medical imaging is central to modern healthcare, yet creating accurate AI models for CT scans remains difficult. These models depend on large, varied and well-annotated datasets, but most hospitals work with limited data, inconsistent labelling and uneven representation of patient groups. Differences in scanners, imaging protocols and disease patterns further reduce the ability of AI systems to perform reliably across sites and diverse medical applications.

Better Medicine faced this issue when developing a model to detect kidney tumours. Although public datasets such as the Kidney Tumor Segmentation (KiTS) dataset offered a useful starting point, models trained only on these data performed poorly when tested on scans from an Estonian partner hospital. The local dataset contained very few annotated tumour cases, creating a clear “domain gap” that led to low accuracy and frequent false alerts.

The team aimed to close this gap by using generative diffusion models to produce realistic synthetic CT scans that could enrich the training data. However, creating and testing these models requires large-scale computation. Training 2D and 2.5D diffusion models, generating thousands of synthetic scans and evaluating adaptation techniques all demand far more processing power than the company’s own systems could provide.

Support from FFplus and access to the EuroHPC JU High-Performance Computer were therefore essential. This allowed Better Medicine to run large experiments, validate the use of synthetic data for clinical AI and reduce development time from months to weeks – all while adopting scalable workflows needed for future medical AI solutions.

 

The solution

Better Medicine built a new 2.5D generative diffusion model that creates realistic CT scans using information from neighbouring slices to produce smooth, coherent images. The model was trained on healthy scans from public and internal sources. Synthetic tumours were then added to these scans to increase the number and variety of positive cases.
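
To make the "neighbouring slices" idea concrete, the sketch below shows one common way a 2.5D input can be assembled: the slice of interest is stacked with its neighbours so that a 2D model sees some through-plane context. This is an illustration only, not Better Medicine's implementation; the function names, the context width of one slice on each side and the edge-clamping behaviour are all assumptions.

```python
from typing import List, Sequence

# A 2D slice is represented as rows of pixel intensities (hypothetical toy format).
Slice2D = Sequence[Sequence[float]]


def neighbour_indices(index: int, depth: int, context: int = 1) -> List[int]:
    """Indices of the centre slice plus `context` neighbours on each side.

    Indices that fall outside the volume are clamped to the nearest edge
    slice (an assumed convention; padding with empty slices is another option).
    """
    if not 0 <= index < depth:
        raise IndexError("slice index outside volume")
    return [
        min(max(i, 0), depth - 1)
        for i in range(index - context, index + context + 1)
    ]


def stack_25d(volume: Sequence[Slice2D], index: int, context: int = 1) -> List[Slice2D]:
    """Return the slices a 2.5D model would receive as channels for `index`."""
    return [volume[i] for i in neighbour_indices(index, len(volume), context)]


# Toy volume: 4 slices of 2x2 pixels, each filled with its own slice number.
toy_volume = [[[float(z)] * 2 for _ in range(2)] for z in range(4)]
middle = stack_25d(toy_volume, index=2)  # slices 1, 2 and 3
edge = stack_25d(toy_volume, index=0)    # slice 0 repeated, then slice 1
```

In a real pipeline the stacked slices would typically be held as array channels rather than nested lists; plain Python is used here only to keep the sketch dependency-free.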

This hybrid method addressed the lack of annotated data and produced more consistent synthetic volumes than simple 2D approaches. The synthetic and real scans were then combined to retrain the tumour-segmentation model. When tested on data from the target hospital, segmentation accuracy improved markedly: the Dice Similarity Coefficient (DSC) rose from 0.43 to 0.63 and false detections fell by a factor of three. The experiment showed that synthetic data can directly improve clinical AI performance.
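
For readers unfamiliar with the metric, the Dice Similarity Coefficient measures the overlap between a predicted mask and a reference annotation as twice the shared area divided by the sum of both areas, giving a value between 0 (no overlap) and 1 (perfect match). The sketch below computes it for two flattened binary masks. It is a generic illustration, not the project's evaluation code; the function name and the choice to return 1.0 when both masks are empty are assumptions.

```python
from typing import Sequence


def dice_coefficient(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Dice Similarity Coefficient for two flattened binary masks.

    DSC = 2 * |P and R| / (|P| + |R|). When both masks are empty the
    score is defined here as 1.0 (an assumed convention).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 in common.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
```

On this scale, a change from 0.43 to 0.63 is a gain of 0.20 in absolute DSC.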

 

Impact 

The FFplus Innovation Study has enabled Better Medicine to move from early research to a scalable commercial solution. With HPC, the company can now train and validate complex generative models far more efficiently. This has reduced reliance on manual processes, lowered development costs and cut project timelines by up to 40%.

Clinically, the project has increased confidence in AI-assisted imaging by showing clear, measurable improvements on existing hospital imaging data. Hospitals with limited datasets can now benefit from higher-quality models tailored to their patient demographics. This supports more equitable diagnostics, especially for smaller hospitals and under-represented patient groups.

Commercially, the ability to generate GDPR-compliant synthetic datasets opens a new business line for imaging providers. It also reduces the need for cross-border data sharing, enabling faster, more secure and more ethical AI innovation. The foundation laid in this project strengthens Better Medicine’s position in the fast-growing market for medical AI solutions.

 

Benefits

  • Improvement of about 20 percentage points in tumour-segmentation accuracy on hospital data (DSC from 0.43 to 0.63).
  • Threefold reduction in false tumour detections.
  • Up to 40% shorter development cycles through reduced annotation needs.
  • Scalable HPC-ready workflows for faster experimentation and model updates.
  • New revenue opportunities through synthetic-data services.
  • Stronger data privacy and support for smaller hospitals and rare-disease research.