GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Hamamcı, İbrahim Ethem; Er, Sezgin; Tezcan, Alperen; Şimşek, Ayşe Gülnihan; Esirgün, Şevval Nil; Almas, Furkan; Doğan, İrem; Daşdelen, Muhammed Furkan; Menze, Bjoern

dc.contributor.author	Hamamcı, İbrahim Ethem
dc.contributor.author	Er, Sezgin
dc.contributor.author	Tezcan, Alperen
dc.contributor.author	Şimşek, Ayşe Gülnihan
dc.contributor.author	Esirgün, Şevval Nil
dc.contributor.author	Almas, Furkan
dc.contributor.author	Doğan, İrem
dc.contributor.author	Daşdelen, Muhammed Furkan
dc.contributor.author	Menze, Bjoern
dc.date.accessioned	2025-11-11T08:32:11Z
dc.date.available	2025-11-11T08:32:11Z
dc.date.issued	2025
dc.department	İstanbul Medipol Üniversitesi, Rektörlük, Sağlık Bilim ve Teknolojileri Araştırma Enstitüsü
dc.description.abstract	Text-conditional medical image generation is vital for radiology, augmenting small datasets, preserving data privacy, and enabling patient-specific modeling. However, its applications in 3D medical imaging, such as CT and MRI, which are crucial for critical care, remain unexplored. In this paper, we introduce GenerateCT, the first approach to generating 3D medical imaging conditioned on free-form medical text prompts. GenerateCT incorporates a text encoder and three key components: a novel causal vision transformer for encoding 3D CT volumes, a text-image transformer for aligning CT and text tokens, and a text-conditional super-resolution diffusion model. Without directly comparable methods in 3D medical imaging, we benchmarked GenerateCT against cutting-edge methods, demonstrating its superiority across all key metrics. Importantly, we explored GenerateCT’s clinical applications by evaluating its utility in a multi-abnormality classification task. First, we established a baseline by training a multi-abnormality classifier on our real dataset. To further assess the model’s generalization to external datasets and its performance with unseen prompts in a zero-shot scenario, we employed an external dataset to train the classifier, setting an additional benchmark. We conducted two experiments in which we doubled the training datasets by synthesizing an equal number of volumes for each set using GenerateCT. The first experiment demonstrated an 11% improvement in the AP score when training the classifier jointly on real and generated volumes. The second experiment showed a 7% improvement when training on both real and generated volumes based on unseen prompts. Moreover, GenerateCT enables the scaling of synthetic training datasets to arbitrary sizes. As an example, we generated 100,000 3D CT volumes, fivefold the number in our real dataset, and trained the classifier exclusively on these synthetic volumes. Impressively, this classifier surpassed the performance of the one trained on all available real data by a margin of 8%. Lastly, domain experts evaluated the generated volumes, confirming a high degree of alignment with the text prompts. Access our code, model weights, training data, and generated data at https://github.com/ibrahimethemhamamci/GenerateCT.
dc.description.sponsorship	Helmut Horten Foundation
dc.identifier.citation	Hamamcı, İ. E., Er, S., Tezcan, A., Şimşek, A. G., Esirgün, Ş. N., Almas, F. ... Menze, B. (2025). GenerateCT: Text-conditional generation of 3D chest CT volumes. 18th European Conference on Computer Vision (ECCV), 15137, 126-143. Milan, Italy, September 29 - October 04, 2024. http://dx.doi.org/10.1007/978-3-031-72986-7_8
dc.identifier.doi	10.1007/978-3-031-72986-7_8
dc.identifier.endpage	143
dc.identifier.isbn	9783031729850
dc.identifier.isbn	9783031729867
dc.identifier.issn	0302-9743
dc.identifier.issn	1611-3349
dc.identifier.scopus	2-s2.0-85208591777
dc.identifier.scopusquality	Q2
dc.identifier.startpage	126
dc.identifier.uri	http://dx.doi.org/10.1007/978-3-031-72986-7_8
dc.identifier.uri	https://hdl.handle.net/20.500.12511/13187
dc.identifier.volume	15137
dc.identifier.wos	WOS:001353712400008
dc.identifier.wosquality	Q4
dc.indekslendigikaynak	Web of Science
dc.indekslendigikaynak	Scopus
dc.institutionauthor	Er, Sezgin
dc.institutionauthor	Tezcan, Alperen
dc.institutionauthor	Şimşek, Ayşe Gülnihan
dc.institutionauthor	Esirgün, Şevval Nil
dc.institutionauthor	Almas, Furkan
dc.institutionauthor	Doğan, İrem
dc.institutionauthor	Daşdelen, Muhammed Furkan
dc.institutionauthorid	0000-0001-7266-9844
dc.institutionauthorid	0009-0003-5360-5505
dc.institutionauthorid	0000-0003-2251-2093
dc.language.iso	en
dc.relation.ispartof	18th European Conference on Computer Vision (ECCV)
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	3D Medical Imaging
dc.subject	Text-Conditional Generation
dc.title	GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
dc.type	Conference Object

Collections

Proceedings Collection
Scopus-Indexed Publications Collection
WoS-Indexed Publications Collection