A Guide for the Application of Statistics in Biomedical Studies Concerning Machine Learning and Artificial Intelligence


      With the plethora of machine learning (ML) analyses published in the orthopaedic literature within the last 5 years, several attempts have been made to enhance our understanding of what exactly ML means and how it is used. At its most fundamental level, ML is a branch of artificial intelligence that uses algorithms to analyze and learn from patterns in data without explicit programming or human intervention. Traditional statistics, by contrast, requires the user to choose the variables of interest when building a model to predict an outcome; the output of such a model (1) may be falsely influenced by the variables the user chose to include and (2) does not allow for optimization of performance. Early publications have served as succinct editorials or reviews intended to ease audiences unfamiliar with ML into the complexities that accompany the subject. Most commonly, these publications focus on the terminology and concepts surrounding ML, because it is important to understand the rationale behind performing such studies. Unfortunately, these publications only touch on the most basic aspects of ML and are too frequently repetitive. Indeed, the conclusions of these articles reiterate that the potential clinical utility of these algorithms remains tangential at best in their current form and caution against premature adoption without external validation. As a result, our perspective and ability to draw our own conclusions from these studies have not advanced, and with each subsequent study we are left concluding that a new algorithm has been published for an outcome of interest but cannot be used until further validation. What readers now need is to return to the principles of the scientific method that they have used to critically assess vast numbers of publications before this wave of newly applied statistical methodology: a guide to interpreting results such that they can draw their own conclusions.

      Level of Evidence

      Level V, expert opinion.

