Introduction

Gastric cancer is a common malignant tumor with a poor prognosis. It ranks fifth in incidence and second in mortality worldwide and poses a serious threat to human health [1]. More than 70% of new cases occur in developing countries, and in China its incidence is second only to that of lung cancer among all tumors [2]. Considerable progress has recently been made in the diagnosis and multidisciplinary treatment of patients with resectable gastric cancer, but their survival remains unsatisfactory because of the high recurrence rate [3, 4]. Surgical resection is considered the main treatment for locally advanced gastric cancer, but more aggressive options are available, such as preoperative and postoperative chemotherapy and radiotherapy. Some studies have shown that a preoperative treatment strategy (neoadjuvant chemotherapy/radiotherapy + surgery + postoperative adjuvant chemotherapy/radiotherapy) is superior to surgery alone [5, 6]. Neoadjuvant chemotherapy has also led to significant increases in the tumor remission rate and the surgical resection rate [7, 8]. Lymph node metastasis plays a key role in selecting suitable patients for neoadjuvant chemotherapy, but the low accuracy of its preoperative diagnosis complicates the selection of candidates [9,10,11,12].

According to the Japanese classification of gastric cancer, gastric adenocarcinoma is further classified into a differentiated type (including papillary adenocarcinoma and well-differentiated and moderately differentiated tubular adenocarcinoma) and an undifferentiated type (including poorly differentiated adenocarcinoma, mucinous adenocarcinoma and signet ring cell carcinoma), because the two types differ significantly in clinicopathological features and prognosis [13]. In general, patients with undifferentiated gastric cancer have a higher risk of lymph node metastasis and a poorer survival rate than patients with differentiated gastric cancer [14]. Some studies have also shown that the diagnostic accuracy for lymph node metastasis is related to the histological type and CT stage of gastric cancer [15]. Therefore, preoperative histological diagnosis of gastric cancer helps in judging lymph node metastasis and might be useful for clinical treatment planning.

At present, the histological type of gastric cancer is mainly determined by postoperative pathology. Although gastroscopic biopsy can provide some histological information for tumor classification, it samples only local tumor tissue and causes additional injury to patients. CT is the most commonly used imaging modality for the preoperative assessment of lymph node status, but its reported accuracy is only about 60% [16, 17], and it is difficult to distinguish different histological types of gastric cancer on CT images. Therefore, accurate information on the histological type of gastric cancer is difficult to obtain preoperatively.

Unlike traditional image interpretation, radiomics can reveal latent image features that are not apparent on visual inspection [18]. Radiomics converts medical images into high-dimensional, minable data through specific extraction procedures and supports clinical decision-making through subsequent data analysis [18, 19]. It has been widely used in tumor detection, tumor subtype classification, prognosis prediction, evaluation of treatment efficacy and other fields [18,19,20]. For gastric cancer, radiomics has been used to predict metastasis, predict early recurrence and evaluate efficacy. However, few studies have combined radiomics features with traditional clinical features to predict the histological type of gastric cancer.

The purpose of this retrospective study was to develop and validate a CT-based radiomics nomogram for preoperative prediction of the histological type of gastric cancer in patients with indications for resection, which might help to identify high-risk patients for more active treatment.

Materials and methods

Patients

A total of 171 patients with gastric cancer treated by resection were collected from two independent institutions. The inclusion criteria were as follows: (1) the patient underwent gastrectomy and gastric cancer was confirmed pathologically; (2) gastric lesions were found on abdominal contrast-enhanced CT before surgery; (3) complete CT data were available in DICOM format. The exclusion criteria were as follows: (1) image quality was inadequate for the study because of motion artifacts, poor gastric filling or other factors that affect diagnosis; (2) the pathological diagnosis was unclear; (3) the patient had a history of malignant tumors outside the stomach.

A total of 143 patients recruited from Institution I (Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, from January 2019 to December 2020) were randomly divided into training (n = 99) and internal validation (n = 44) cohorts. A further 28 patients from Institution II (Nanxun District People’s Hospital, from January 2016 to October 2019) formed the external validation cohort.
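The random division of the Institution I patients can be illustrated with a minimal sketch. The patient identifiers, the function name `split_cohort` and the fixed seed are hypothetical and serve only to make the example reproducible; the randomization procedure actually used is not described in the text.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training and internal validation lists."""
    rng = random.Random(seed)      # fixed seed (assumption) so the split is reproducible
    shuffled = list(patient_ids)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 143 placeholder IDs standing in for the Institution I patients
institution_one = [f"P{i:03d}" for i in range(1, 144)]
train_ids, internal_ids = split_cohort(institution_one, n_train=99)
print(len(train_ids), len(internal_ids))
```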

CT image acquisition

CT data acquisition and imaging parameters are listed in Table S1. All CT images were obtained with Siemens SOMATOM Definition Flash, GE LightSpeed VCT and Siemens syngo CT scanners, covering the region from the diaphragm to the level of the iliac spine. For contrast-enhanced scanning, a nonionic contrast medium was injected intravenously with a high-pressure syringe at a rate of 3–4 ml/s. The scan delays were 15 s for the arterial phase, 70 s for the portal venous phase and 180 s for the balanced phase.

Traditional imaging features and clinical data

The traditional imaging features comprised the average CT value on the largest axial plane of the lesion in the plain scan, arterial phase, portal phase and balanced phase, and the difference in CT value between each enhanced phase (arterial, portal, balanced) and the plain scan. These quantitative features were measured by two radiologists (A and B) with 15 and 10 years of working experience, respectively. The radiologists were aware of the clinical information but blinded to the pathological results. Each radiologist recorded the location of every tumor on a separate workstation and measured the average CT value on the largest axial plane of the tumor. Any disagreement was resolved by consensus. Clinical data, including age, sex, tumor location, and carbohydrate antigen 199 (CA199), carbohydrate antigen 125 (CA125), alpha-fetoprotein (AFP) and carcinoembryonic antigen (CEA) levels, were collected from the digital medical records.

Image segmentation and feature extraction

Image segmentation and feature extraction were performed in 3D Slicer software (version 3.6.0). Because most gastric cancer lesions enhance clearly differently from adjacent normal tissue in the portal venous phase, the portal venous phase CT images were segmented manually in this study. Lesions were detected and located by thickening and enhancement of the gastric wall. The entire lesion was segmented manually and integrated into a 3D volume of interest (VOI). During segmentation, necrotic areas, enlarged lymph nodes and perigastric adipose tissue were excluded from the VOI with the aid of multi-planar reformation (MPR), which reconstructs two-dimensional coronal, sagittal and oblique images from the original transverse images. The reconstruction thickness used in this study was 5–7 mm. Images were preprocessed by interpolation resampling to a voxel spacing of 1 × 1 × 1 to reduce the impact of heterogeneity among CT scanners and protocols on the radiomics analysis. Finally, a total of 850 features, including texture features, density features, shape features and filter features, were extracted from the CT images.

Feature consistency check and data standardization

The VOIs of all patients were first segmented manually by a doctor with 5 years of experience in radiological diagnosis. One month later, 40 of the 171 patients were randomly selected for a second segmentation by the same doctor. The intraclass correlation coefficient (ICC) was used to evaluate the stability of the radiomics features extracted from the two VOIs of these 40 cases. Features with ICC > 0.85 were considered stable enough for further analysis.
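The stability filter can be sketched as follows. The text does not state which ICC form was used, so the two-way, consistency, single-measurement form, ICC(3,1), is an assumption here, as are the function name `icc_3_1` and the toy feature values.

```python
def icc_3_1(first, second):
    """ICC(3,1) for one feature measured twice on the same subjects (assumed ICC form)."""
    n, k = len(first), 2
    rows = list(zip(first, second))
    grand = (sum(first) + sum(second)) / (n * k)
    row_means = [(a + b) / k for a, b in rows]
    col_means = [sum(first) / n, sum(second) / n]

    # Two-way ANOVA sums of squares: subjects (rows), sessions (columns), residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical values of one feature from the first and second segmentation
stable = icc_3_1([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.0, 2.9, 4.2, 5.1])
unstable = icc_3_1([1.0, 2.0, 3.0, 4.0, 5.0], [5.0, 1.0, 4.0, 2.0, 3.0])
print(stable > 0.85, unstable > 0.85)
```

A feature would be retained only when its ICC exceeds the 0.85 threshold stated above.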

Z-score standardization was applied to remove the influence of the differing units and orders of magnitude of the features, so that all feature values fall on a common scale.
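A minimal sketch of the standardization step is given below. It assumes that the mean and standard deviation are estimated on the training cohort and then reused for the validation cohorts, and that the sample standard deviation is used; neither detail is stated in the text, and the feature values are hypothetical.

```python
from statistics import mean, stdev


def fit_zscore(values):
    """Estimate the mean and sample standard deviation of one feature."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        raise ValueError("constant feature cannot be standardized")
    return mu, sigma


def apply_zscore(values, mu, sigma):
    """Standardize values with previously estimated parameters."""
    return [(v - mu) / sigma for v in values]


# Hypothetical raw values of one radiomics feature
train_values = [10.0, 12.0, 14.0, 16.0, 18.0]
validation_values = [11.0, 20.0]

mu, sigma = fit_zscore(train_values)               # parameters from training data only
z_train = apply_zscore(train_values, mu, sigma)
z_validation = apply_zscore(validation_values, mu, sigma)
print(z_train, z_validation)
```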

Feature selection and radiomics model building

In the training cohort, the least absolute shrinkage and selection operator (LASSO) method was used to select features associated with the histological type of gastric cancer. The regularization parameter λ was chosen by tenfold cross-validation. The LASSO analysis was performed with the “glmnet” package in R. A radiomics model for histological prediction of gastric cancer was then developed from the selected radiomics features by multivariate logistic regression in the training cohort and validated in the internal and external validation cohorts.
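For reference, the LASSO-penalized logistic regression solved in this step can be written in the standard form below. The notation is introduced here for illustration and is not taken from the original text: $y_i \in \{0, 1\}$ denotes the histological type of patient $i$, $x_i$ the vector of standardized radiomics features, $n$ the number of training patients, and $\lambda$ the regularization parameter chosen by cross-validation.

```latex
(\hat{\beta}_0, \hat{\beta}) =
\arg\min_{\beta_0,\, \beta}
\left\{
  -\frac{1}{n} \sum_{i=1}^{n}
  \left[
    y_i \left( \beta_0 + x_i^{\top} \beta \right)
    - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
  \right]
  + \lambda \lVert \beta \rVert_1
\right\}
```

The $\ell_1$ penalty shrinks some coefficients exactly to zero as $\lambda$ increases, so the features with nonzero coefficients at the selected $\lambda$ are the ones retained.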

Construction of radiomics nomogram

A combined model incorporating the radiomics features and clinical risk predictors was built by multivariate logistic regression and validated in the internal and external validation cohorts. To improve the clinical usability of the combined model, it was visualized as a radiomics nomogram in the training cohort.
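How a logistic regression model of this kind converts its predictors into a probability can be sketched as follows. The two predictors (a radiomics score and one clinical marker), the function name and all coefficient values are hypothetical placeholders chosen for illustration; they are not the fitted coefficients of this study.

```python
import math


def predicted_probability(rad_score, clinical_value, intercept, beta_rad, beta_clinical):
    """Logistic model: probability from a linear combination of two predictors."""
    linear = intercept + beta_rad * rad_score + beta_clinical * clinical_value
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, NOT the values fitted in the study
coefficients = {"intercept": -2.0, "beta_rad": 1.5, "beta_clinical": 0.1}

low_risk = predicted_probability(0.2, 5.0, **coefficients)
high_risk = predicted_probability(2.0, 10.0, **coefficients)
print(round(low_risk, 3), round(high_risk, 3))
```

A nomogram is a graphical representation of the same calculation: each predictor is mapped to points in proportion to its contribution to the linear term, and the total points are mapped to the predicted probability.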

Statistical analysis

IBM SPSS Statistics (version 20.0) was used to analyze the clinical and traditional imaging feature data. Univariate analysis was used to evaluate the relationship between clinical and traditional imaging features and histological type: continuous variables were compared with the independent t-test or the Mann–Whitney U test, and categorical variables with the chi-square test. P < 0.05 was considered statistically significant.

The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to evaluate the performance of the radiomics model and the radiomics nomogram. Calibration curves were used to assess the calibration of the radiomics nomogram. Decision curve analysis (DCA) was used to evaluate the clinical usefulness of the radiomics nomogram.
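The AUC can be computed directly from predicted scores as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below uses this pairwise definition; the labels and scores are hypothetical, and the coding of 1 as the undifferentiated type is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = undifferentiated, assumed coding) and model scores
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ranked correctly -> 0.75
```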

Results

Clinical features

Table 1 shows the distribution of clinical and traditional imaging features of all patients in the training, internal validation and external validation cohorts. In the training cohort, there was no significant difference in age, sex, tumor location, CA199, AFP, CEA, CT value or enhancement amplitude of each phase between patients with differentiated gastric cancer and patients with undifferentiated gastric cancer. CA125 was significantly higher in the undifferentiated subtype than in the differentiated subtype in the training (P = 0.03), internal validation (P = 0.008) and external validation (P = 0.026) cohorts.

Table 1 Clinical and traditional imaging characteristics of patients in the training, internal validation and external validation cohorts

Radiomics feature selection and radiomics model development

First, 824 of the 850 radiomics features showed good stability (ICC > 0.85) and were retained for further analysis. After Z-score standardization, the best feature combination was selected with the LASSO method; the selection process is shown in Fig. 1. Four radiomics features were finally selected (detailed in Table 2) to construct the radiomics model. The radiomics model showed moderate ability to discriminate between the differentiated and undifferentiated subtypes in the training (AUC: 0.755, 95% CI 0.650–0.859), internal validation (AUC: 0.71, 95% CI 0.543–0.875) and external validation (AUC: 0.712, 95% CI 0.500–0.923) cohorts (Fig. 2).

Fig. 1

The selection process of the LASSO method. a The best feature combination was selected by the LASSO method. Each colored line represents the coefficient of one feature; LASSO adjusts the tuning parameter (lambda) to achieve dimensionality reduction. b Selection of the tuning parameter (lambda) in the LASSO model via tenfold cross-validation based on the minimum criteria. The AUC curve was plotted against log(lambda). Dotted vertical lines were drawn at the optimal values given by the minimum criteria and by 1 standard error of the minimum criteria (the 1-standard error criteria)

Table 2 Details of the four radiomics features

Fig. 2

ROC curves of the radiomics model in the training, the internal and external verification cohorts. ROC curves of radiomics model in the training (AUC: 0.755, 95% CI 0.650–0.859) (a), the internal (AUC: 0.71, 95% CI 0.543–0.875) (b) and external (AUC: 0.712, 95% CI 0.500–0.923) (c) validation cohorts

Predictive performance of radiomics nomogram

Multivariate logistic regression analysis was conducted to build a CT-based radiomics nomogram combining the radiomics features and CA125, as shown in Fig. 3. The radiomics nomogram showed good discriminative performance in predicting the histological type of gastric cancer in the training (AUC: 0.777; 95% CI 0.679–0.875), internal validation (AUC: 0.726; 95% CI 0.559–0.893) and external validation (AUC: 0.720; 95% CI 0.504–0.936) cohorts, as detailed in Fig. 4. The calibration curve of the radiomics nomogram showed good agreement between the observed and predicted outcomes in the training cohort (Fig. 5a). In the DCA of the nomogram (Fig. 5b), the nomogram provided a greater net benefit than either treating all patients or treating none when the threshold probability was between 20% and 80%.
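The net benefit plotted in a decision curve follows the standard definition: the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The sketch below applies this definition to hypothetical labels and predicted probabilities and compares the model with the treat-all strategy; the function names and data are illustrative only.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating patients whose predicted probability reaches the threshold."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = event) and predicted probabilities
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

model_nb = net_benefit(labels, probabilities, threshold=0.5)
all_nb = net_benefit_treat_all(labels, threshold=0.5)
print(model_nb, all_nb)  # treating no one has a net benefit of 0 by definition
```

A decision curve repeats this calculation over a range of threshold probabilities and plots the three strategies (model, treat all, treat none) together.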

Fig. 3

Two clinical examples of the application of the radiomics nomogram. a, b A thickened lesion was observed in the gastric antrum. The radiomics score of the patient was 0.5, corresponding to 1.2 points on the nomogram. The CA125 value was 4.9, corresponding to 0.25 points. The total points on the nomogram (d) were 1.45, which means the probability of the undifferentiated type was less than 30%. The microscopic pathological image of the surgical specimen (c) confirmed the differentiated type. e, f A thickened lesion was observed in the gastric fundus. The radiomics score of the patient was 2.1, corresponding to 2.5 points on the nomogram. The CA125 value was 9.0, corresponding to 0.5 points. The total points on the nomogram (h) were 3.0, which means the probability of the undifferentiated type was about 90%. The microscopic pathological image of the surgical specimen (g) confirmed the undifferentiated type

Fig. 4

ROC curves of the radiomics nomogram in the training, internal validation and external validation cohorts. ROC curves of the radiomics nomogram in the training (AUC: 0.777, 95% CI 0.679–0.875) (a), internal validation (AUC: 0.726, 95% CI 0.559–0.893) (b) and external validation (AUC: 0.720, 95% CI 0.504–0.936) (c) cohorts

Fig. 5

Calibration curves and decision curve analysis of the radiomics nomogram. a Calibration curves of the radiomics nomogram in the training cohort. The y-axis represents the actual rate of the undifferentiated type, and the x-axis represents the predicted probability of the undifferentiated type. b Decision curve analysis of the radiomics nomogram. The y-axis shows the net benefit, and the x-axis shows the risk threshold

Discussion

Preoperative histological subtype diagnosis of gastric cancer is instructive for clinical treatment. In this study, a novel CT-based radiomics nomogram that combined selected radiomics features and CA125 was established and tested to predict the histological types of gastric cancer before operation.

The AUCs of the CT-based radiomics nomogram in the training, internal validation and external validation cohorts were 0.777, 0.726 and 0.720, respectively, which indicates its potential value in the preoperative histological discrimination of gastric cancer. In addition, compared with the binary output of common predictive models, the nomogram visualizes to some extent the contribution of each feature to the model and provides a quantitative probability for each patient, which might improve the credibility of radiomics models in clinical practice.

A total of 850 features, including texture features, density features, shape features and filter features, were extracted from the CT images. The 13 shape features describe the shape and size of the tumor. The 18 density features describe the distribution of voxel-based CT intensity within the tumor. The 75 texture features describe the relative positions of the various gray levels in the image. The filter features are density and texture features recomputed after filtering transformation of the images.

Four radiomics features were selected to construct the radiomics nomogram, and three of them are texture features derived from the gray-level size zone matrix (GLSZM) and the gray-level dependence matrix (GLDM).

First, both GLSZM and GLDM are used to evaluate the similarity of image gray levels in the plane or row direction. They therefore reflect local grayscale correlation: the higher the value, the greater the grayscale correlation of the tumor image.

Second, the medians of the three texture features were higher in differentiated gastric cancer than in undifferentiated gastric cancer, indicating that the grayscale correlation of the CT images was greater in patients with differentiated gastric cancer.

Finally, we speculate that this may be related to the spacing between cancer cells. Differentiated types include papillary and tubular adenocarcinomas, whereas undifferentiated types include poorly differentiated adenocarcinoma, signet ring cell carcinoma and mucinous adenocarcinoma. In poorly differentiated adenocarcinomas, the solid or medullary types are characterized by tight accumulation of tumor cells [14]. Therefore, in the CT images of patients with undifferentiated gastric cancer, the tight accumulation of cancer cells of different forms may lead to a lower grayscale correlation, which makes the values of the GLSZM and GLDM features smaller.

This study has several limitations. First, the sample size was relatively small. Prospective, multicenter external validation with a larger sample is necessary to further verify the performance of the nomogram. Second, the radiomics features used in this study were extracted only from portal venous phase CT images, which might have excluded potentially valuable radiomics features; other phases should therefore be studied further. Finally, this study only examined the relationship between the radiomics model and histological type. In future studies, we will investigate the correlation of radiomics models with local recurrence, survival and lymph node metastasis to make the work more clinically practical.

Conclusion

In this study, we performed a CT-based radiomics analysis to predict the histological type of gastric cancer. The radiomics nomogram established here could roughly predict the histological type of gastric cancer and may contribute to the formulation of a better clinical treatment plan.