Objective To explore factors that potentially impact external validation performance while developing and validating a prognostic model for hospital admissions (HAs) in complex older general practice patients. Study design and setting Using individual participant data from four cluster-randomised trials conducted in the Netherlands and Germany, we used logistic regression to develop a prognostic model to predict all-cause HAs within a 6-month follow-up period. A stratified intercept was used to account for heterogeneity in baseline risk between the studies. The model was validated both internally and by using internal-external cross-validation (IECV). Results Prior HAs, physical components of the health-related quality of life comorbidity index, and medication-related variables were used in the final model. While achieving moderate discriminatory performance, internal bootstrap validation revealed a pronounced risk of overfitting. The results of the IECV, in which calibration was highly variable even after accounting for between-study heterogeneity, agreed with this finding. Heterogeneity was equally reflected in differing baseline risk, predictor effects and absolute risk predictions. Conclusions Predictor effect heterogeneity and differing baseline risk can explain the limited external performance of HA prediction models. With such drivers known, model adjustments in external validation settings (eg, intercept recalibration, complete updating) can be applied more purposefully. Trial registration number PROSPERO id: CRD42018088129.