Sample size and predictive performance of machine learning methods with survival data: A simulation study

Academic Article
Publication Date:
2023
Citation:
Sample size and predictive performance of machine learning methods with survival data: A simulation study / G. Infante, R. Miceli, F. Ambrogi. - In: STATISTICS IN MEDICINE. - ISSN 0277-6715. - 42:30(2023 Dec 30), pp. 5657-5675. [10.1002/sim.9931]
Abstract:
Prediction models are increasingly developed and used in diagnostic and prognostic studies, where the use of machine learning (ML) methods is becoming more and more popular over traditional regression techniques. For survival outcomes the Cox proportional hazards model is generally used and it has been proven to achieve good prediction performances with few strong covariates. The possibility to improve the model performance by including nonlinearities, covariate interactions and time-varying effects while controlling for overfitting must be carefully considered during the model building phase. On the other hand, ML techniques are able to learn complexities from data at the cost of hyper-parameter tuning and interpretability. One aspect of special interest is the sample size needed for developing a survival prediction model. While there is guidance when using traditional statistical models, the same does not apply when using ML techniques. This work develops a time-to-event simulation framework to evaluate performances of Cox regression compared, among others, to tuned random survival forest, gradient boosting, and neural networks at varying sample sizes. Simulations were based on replications of subjects from publicly available databases, where event times were simulated according to a Cox model with nonlinearities on continuous variables and time-varying effects and on the SEER registry data.
IRIS type:
01 - Journal article
Keywords:
machine learning; prediction; sample size; simulation; time-to-event
List of contributors:
G. Infante, R. Miceli, F. Ambrogi
Authors of the University:
AMBROGI FEDERICO (author)
Link to information sheet:
https://air.unimi.it/handle/2434/1023543
Full Text:
https://air.unimi.it/retrieve/handle/2434/1023543/2344761/Statistics%20in%20Medicine%20-%202023%20-%20Infante%20-%20Sample%20size%20and%20predictive%20performance%20of%20machine%20learning%20methods%20with%20survival.pdf
Project:
Innovative statistical methods in biomedical research on biomarkers: from their identification to their use in clinical practice
Research Areas:
Sector MED/01 - Medical Statistics