14-07-2024 | Lymphoma | Original Article

A position-enhanced sequential feature encoding model for lung infections and lymphoma classification on CT images

Authors: Rui Zhao, Wenhao Li, Xilai Chen, Yuchong Li, Baochun He, Yucong Zhang, Yu Deng, Chunyan Wang, Fucang Jia

Published in: International Journal of Computer Assisted Radiology and Surgery | Issue 10/2024

Abstract

Purpose

Differentiating pulmonary lymphoma from lung infections on CT images is challenging. Existing deep neural network-based lung CT classification models rely on 2D slices, which lack comprehensive information and must be selected manually. 3D models that involve chunking compromise image information and struggle to reduce their parameter count, which limits performance. These limitations must be addressed to improve accuracy and practicality.

Methods

We propose a transformer-based sequential feature encoding structure that integrates multi-level information from complete CT images, inspired by the clinical practice of using a sequence of cross-sectional slices for diagnosis. We incorporate position encoding and cross-level long-range information fusion modules into the CNN that extracts features from cross-sectional slices, ensuring high-precision feature extraction.
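As a rough illustration of encoding a sequence of slices, the sketch below adds a position encoding to per-slice feature vectors and fuses them with self-attention. It is not the authors' implementation: the sinusoidal encoding, the single-head attention with identity projections, the mean pooling, and all function names are assumptions chosen for brevity, and the per-slice features are toy values standing in for CNN outputs.

```python
import math


def sinusoidal_position_encoding(num_slices, dim):
    """Return a num_slices x dim table of fixed sinusoidal encodings (assumed scheme)."""
    table = []
    for pos in range(num_slices):
        row = []
        for i in range(dim):
            # Paired dimensions (2k, 2k+1) share one frequency.
            angle = pos / (10000 ** ((2 * (i // 2)) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(dim)])
    return out


def encode_volume(slice_features):
    """Add position encodings to slice features, fuse them, and mean-pool to one vector."""
    n, dim = len(slice_features), len(slice_features[0])
    pe = sinusoidal_position_encoding(n, dim)
    tokens = [[f + p for f, p in zip(feat, enc)]
              for feat, enc in zip(slice_features, pe)]
    fused = self_attention(tokens)
    return [sum(tok[j] for tok in fused) / n for j in range(dim)]


# Hypothetical features for three slices, four dimensions each.
toy_features = [[0.1, 0.2, 0.3, 0.4],
                [0.2, 0.1, 0.4, 0.3],
                [0.3, 0.4, 0.1, 0.2]]
volume_vector = encode_volume(toy_features)
```

Because the encoding depends on the slice index, the same slice features presented in a different order generally yield different tokens, which is what lets an attention layer take slice position into account.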

Results

We conducted comprehensive experiments on a dataset of 124 patients, split into 64, 20 and 40 patients for training, validation and testing, respectively. Ablation and comparative experiments demonstrated the effectiveness of our approach. Our method outperforms existing state-of-the-art methods in the 3D CT image classification problem of distinguishing between lung infections and pulmonary lymphoma, achieving an accuracy of 0.875, AUC of 0.953 and F1 score of 0.889.
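For readers unfamiliar with the three reported metrics, the following minimal sketch shows how accuracy, F1 score and AUC are commonly computed for a binary classifier. The labels, scores, 0.5 threshold and function names are hypothetical toy choices, not the study's data or evaluation code.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 score for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six cases with predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc, f1 = accuracy_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

As a point of arithmetic, if each of the 40 test patients counts as one case, an accuracy of 0.875 corresponds to 35 correct classifications.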

Conclusion

The experiments verified that our proposed position-enhanced transformer-based sequential feature encoding model effectively performs high-precision feature extraction and contextual feature fusion in the lungs. It enhances the feature extraction ability of a standalone CNN or transformer, thereby improving classification performance. The source code is accessible at https://github.com/imchuyu/PTSFE.
Metadata
Title
A position-enhanced sequential feature encoding model for lung infections and lymphoma classification on CT images
Authors
Rui Zhao
Wenhao Li
Xilai Chen
Yuchong Li
Baochun He
Yucong Zhang
Yu Deng
Chunyan Wang
Fucang Jia
Publication date
14-07-2024
Publisher
Springer International Publishing
Keyword
Lymphoma
Published in
International Journal of Computer Assisted Radiology and Surgery / Issue 10/2024
Print ISSN: 1861-6410
Electronic ISSN: 1861-6429
DOI
https://doi.org/10.1007/s11548-024-03230-y