Machine Learning Algorithms Predict Achievement of Clinically Significant Outcomes After Orthopaedic Surgery: A Systematic Review

Published: December 26, 2021


      Purpose

      To determine which subspecialties within orthopaedic surgery have applied machine learning (ML) to predict clinically significant outcomes (CSOs) and whether the performance of these models was acceptable, as assessed by discrimination and other ML metrics where reported.


      Methods

      The PubMed, EMBASE, and Cochrane Central Register of Controlled Trials databases were queried for articles that used ML to predict achievement of the minimal clinically important difference (MCID), patient acceptable symptomatic state (PASS), or substantial clinical benefit (SCB) after orthopaedic surgical procedures. Data on demographic characteristics, subspecialty, specific ML algorithms, and algorithm performance were extracted and analyzed.
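
      As an illustration of the prediction target described above, the sketch below turns pre- and postoperative scores into a binary "achieved MCID" label. It is a minimal example under stated assumptions, not the method of any included study: the function names and data are hypothetical, higher scores are assumed to mean better function, and the threshold uses the half-standard-deviation rule, which is only one of several distribution-based ways to estimate an MCID.

```python
from statistics import stdev


def half_sd_mcid(baseline_scores):
    """Estimate an MCID threshold as 0.5 * sample SD of baseline scores.

    Assumption: a distribution-based estimate; anchor-based methods
    would use patient-reported anchors instead.
    """
    return 0.5 * stdev(baseline_scores)


def label_mcid_achievement(pre_scores, post_scores, threshold):
    """Return 1 where the improvement (post - pre) meets the threshold, else 0.

    Assumption: higher scores indicate a better outcome.
    """
    return [
        int((post - pre) >= threshold)
        for pre, post in zip(pre_scores, post_scores)
    ]


# Hypothetical patient-reported outcome scores for four patients.
pre_scores = [40, 50, 60, 70]
post_scores = [55, 52, 75, 71]

threshold = half_sd_mcid(pre_scores)
labels = label_mcid_achievement(pre_scores, post_scores, threshold)
print(f"MCID threshold: {threshold:.2f}, labels: {labels}")
```

      The resulting binary labels are the kind of outcome a classification model would be trained to predict from preoperative features.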


      Results

      Eighteen articles met the inclusion criteria. Seventeen studies developed novel algorithms, whereas one study externally validated an established algorithm. All studies used ML to predict MCID achievement; 3 (16.7%) also predicted SCB achievement, and none predicted PASS achievement. Of the studies, 7 (38.9%) concerned outcomes after spine surgery; 6 (33.3%), after sports medicine surgery; 3 (16.7%), after total joint arthroplasty (TJA); and 2 (11.1%), after shoulder arthroplasty. No studies were found regarding trauma, hand, elbow, pediatric, or foot and ankle surgery. In spine surgery, concordance statistics (C-statistics) ranged from 0.65 to 0.92; in hip arthroscopy, 0.51 to 0.94; in TJA, 0.63 to 0.89; and in shoulder arthroplasty, 0.70 to 0.95. Most studies reported C-statistics at the upper end of these ranges, although populations were heterogeneous.
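
      For readers less familiar with the C-statistic reported above, the following sketch computes it directly from its definition for a binary outcome: the proportion of (achiever, non-achiever) patient pairs in which the achiever received the higher predicted probability, with ties counted as one half. The function name and the data are hypothetical and are not taken from any included study.

```python
from itertools import product


def c_statistic(outcomes, predicted_probs):
    """Concordance statistic for a binary outcome (equals the ROC AUC).

    outcomes: 1 if the patient achieved the outcome (e.g., MCID), else 0.
    predicted_probs: model-predicted probability of achieving the outcome.
    """
    achievers = [p for y, p in zip(outcomes, predicted_probs) if y == 1]
    non_achievers = [p for y, p in zip(outcomes, predicted_probs) if y == 0]
    if not achievers or not non_achievers:
        raise ValueError("need at least one patient in each outcome class")

    concordant = 0.0
    for p_yes, p_no in product(achievers, non_achievers):
        if p_yes > p_no:
            concordant += 1.0   # pair ranked correctly
        elif p_yes == p_no:
            concordant += 0.5   # tie counts as half
    return concordant / (len(achievers) * len(non_achievers))


# Hypothetical outcomes and predictions for eight patients.
achieved_mcid = [1, 1, 1, 0, 0, 1, 0, 0]
predicted_prob = [0.9, 0.8, 0.4, 0.5, 0.3, 0.7, 0.2, 0.6]

auc = c_statistic(achieved_mcid, predicted_prob)
print(f"C-statistic: {auc:.3f}")
```

      A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why a reported lower bound such as 0.51 indicates almost no discriminative ability.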


      Conclusions

      Currently available ML algorithms can predict the propensity to achieve CSOs, as defined by the MCID, after spine, TJA, sports medicine, and shoulder surgery with fair to good discrimination, as evidenced by C-statistics ranging from 0.6 to 0.95 in most analyses. Less evidence is available on the ability of ML to predict achievement of SCB, and no evidence is available for achievement of the PASS. Such algorithms may augment shared decision-making and help clinicians set more appropriate patient expectations using individualized risk assessments. However, these studies remain limited by variable reporting of performance metrics, variable CSO quantification methods, inconsistent adherence to predictive modeling guidelines, and limited external validation.

      Level of Evidence

      Level III, systematic review of Level III studies.

