GenerateCT: text-conditional generation of 3D chest CT volumes

dc.contributor.author: Hamamcı, İbrahim Ethem
dc.contributor.author: Er, Sezgin
dc.contributor.author: Tezcan, Alperen
dc.contributor.author: Şimşek, Ayşe Gülnihan
dc.contributor.author: Esirgün, Şevval Nil
dc.contributor.author: Almas, Furkan
dc.contributor.author: Doğan, İrem
dc.contributor.author: Daşdelen, Muhammed Furkan
dc.contributor.author: Menze, Bjoern
dc.date.accessioned: 2025-11-11T08:32:11Z
dc.date.available: 2025-11-11T08:32:11Z
dc.date.issued: 2025
dc.department: İstanbul Medipol Üniversitesi, Rektörlük, Sağlık Bilim ve Teknolojileri Araştırma Enstitüsü
dc.description.abstract: Text-conditional medical image generation is vital for radiology, augmenting small datasets, preserving data privacy, and enabling patient-specific modeling. However, its applications in 3D medical imaging, such as CT and MRI, which are crucial for critical care, remain unexplored. In this paper, we introduce GenerateCT, the first approach to generating 3D medical imaging conditioned on free-form medical text prompts. GenerateCT incorporates a text encoder and three key components: a novel causal vision transformer for encoding 3D CT volumes, a text-image transformer for aligning CT and text tokens, and a text-conditional super-resolution diffusion model. Without directly comparable methods in 3D medical imaging, we benchmarked GenerateCT against cutting-edge methods, demonstrating its superiority across all key metrics. Importantly, we explored GenerateCT’s clinical applications by evaluating its utility in a multi-abnormality classification task. First, we established a baseline by training a multi-abnormality classifier on our real dataset. To further assess the model’s generalization to external datasets and its performance with unseen prompts in a zero-shot scenario, we employed an external dataset to train the classifier, setting an additional benchmark. We conducted two experiments in which we doubled the training datasets by synthesizing an equal number of volumes for each set using GenerateCT. The first experiment demonstrated an 11% improvement in the AP score when training the classifier jointly on real and generated volumes. The second experiment showed a 7% improvement when training on both real and generated volumes based on unseen prompts. Moreover, GenerateCT enables the scaling of synthetic training datasets to arbitrary sizes. As an example, we generated 100,000 3D CT volumes, fivefold the number in our real dataset, and trained the classifier exclusively on these synthetic volumes. Impressively, this classifier surpassed the performance of the one trained on all available real data by a margin of 8%. Lastly, domain experts evaluated the generated volumes, confirming a high degree of alignment with the text prompts. Access our code, model weights, training data, and generated data at https://github.com/ibrahimethemhamamci/GenerateCT.
dc.description.sponsorship: Helmut Horten Foundation
dc.identifier.citation: Hamamcı, İ. E., Er, S., Tezcan, A., Şimşek, A. G., Esirgün, Ş. N., Almas, F. ... Menze, B. (2025). GenerateCT: text-conditional generation of 3D chest CT volumes. 18th European Conference on Computer Vision (ECCV), 15137, 126-143. Milan, Italy, September 29 - October 04, 2024. http://dx.doi.org/10.1007/978-3-031-72986-7_8
dc.identifier.doi: 10.1007/978-3-031-72986-7_8
dc.identifier.endpage: 143
dc.identifier.isbn: 9783031729850
dc.identifier.isbn: 9783031729867
dc.identifier.issn: 0302-9743
dc.identifier.issn: 1611-3349
dc.identifier.scopus: 2-s2.0-85208591777
dc.identifier.scopusquality: Q2
dc.identifier.startpage: 126
dc.identifier.uri: http://dx.doi.org/10.1007/978-3-031-72986-7_8
dc.identifier.uri: https://hdl.handle.net/20.500.12511/13187
dc.identifier.volume: 15137
dc.identifier.wos: WOS:001353712400008
dc.identifier.wosquality: Q4
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.institutionauthor: Er, Sezgin
dc.institutionauthor: Tezcan, Alperen
dc.institutionauthor: Şimşek, Ayşe Gülnihan
dc.institutionauthor: Esirgün, Şevval Nil
dc.institutionauthor: Almas, Furkan
dc.institutionauthor: Doğan, İrem
dc.institutionauthor: Daşdelen, Muhammed Furkan
dc.institutionauthorid: 0000-0001-7266-9844
dc.institutionauthorid: 0009-0003-5360-5505
dc.institutionauthorid: 0000-0003-2251-2093
dc.language.iso: en
dc.relation.ispartof: 18th European Conference on Computer Vision (ECCV)
dc.relation.publicationcategory: Conference Item - International - Institutional Academic Staff
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: 3D Medical Imaging
dc.subject: Text-Conditional Generation
dc.title: GenerateCT: text-conditional generation of 3D chest CT volumes
dc.type: Conference Object
