From: Machine learning in general practice: scoping review of administrative task support and automation
No | Author | General practice: What is the problem? | General practice: What data is used? | General practice: How are GPs involved? | Administrative task: What is the task? | Administrative task: What needs improving? | Administrative task: How automated? | Machine learning: What is the problem? | Machine learning: What methods are used? | Machine learning: What evaluation measures? |
---|---|---|---|---|---|---|---|---|---|---|
1 | [15] Abu Lekham et al. (2021) | Appointment scheduling | Data on patient appointments from an outpatient primary care center containing 26 features collected from 2016 to 2019 | GPs not stated as involved in the research, but one author is affiliated with the healthcare center in question | Prediction of missed appointments (no-shows and early cancellations) | Patient scheduling, enhance capacity use, maximize revenues, minimize costs, and ultimately achieve financial stability | Fully | Supervised – binary, multi-class, multi-stage chain | Logistic Regression, Decision Tree, and some ensemble methods, including Random Forest, AdaBoost, Gradient Boosting, and Bagging | Precision, recall, F-value, and accuracy |
2 | [16] Ahmad et al. (2021) | Appointment scheduling | Patient-visit information (patient ID, month, day, age, gender, race, ethnicity, insurance type, visit type, and previous no-shows) from the EHR database, eClinicalWorks, between 2014 and 2016 | GPs not stated as involved in the research, but all authors have medical affiliations | Reduce the rate of clinical no-shows or missed appointments | Decreasing clinical no-show rates | Fully | Supervised regression | Probit regression | Sensitivity, specificity, ROC curve and AUC |
3 | [17] Cubillas et al. (2014) | Appointment scheduling | Historical appointment data, plus weather and environmental data, for patients requiring administrative assistance from 2007 to 2011 | GPs not stated as involved in the research and no authors have medical affiliations | Patient scheduling for differentiating between administrative and healthcare matters | Schedule in accordance with demand predicted for each day | Fully | Supervised regression | Generalized Linear Models and Support Vector Machines (with Linear and Gaussian kernel) | Average percentage error |
4 | [18] López Seguí et al. (2020) | Teleconsultation | Teleconsultations recorded by the teleconsulting system received between 2016 and 2018 | GPs are involved in labelling the teleconsultations and some of the authors have medical affiliations | Text classification of teleconsultation messages between GPs and patients | Teleconsultation with decision support avoiding the need for a face-to-face visit | Fully | Supervised classification | Random Forest, Gradient Boosting (LightGBM), fastText, Multinomial Naive Bayes, and Naive Bayes Complement | Precision, sensitivity, F-value and the ROC curve |
5 | [19] Michalowski et al. (2017) | Care management | Canadian Cardiovascular Society’s atrial fibrillation clinical practice guidelines and Cochrane Database of Systematic Reviews | GPs not stated as involved in the research and no authors have medical affiliations | Disease management | Personalized management of atrial fibrillation | Fully | Supervised classification | Preference learning | No evaluation reported |
6 | [20] Mohammadi et al. (2022) | Appointment scheduling | EHR data (including patient, visit and provider characteristics) from encounters at an urban community health clinic in 2014 with an emphasis on the “schedulers’ notes” field | GPs not stated as involved in the research, but authors are affiliated with health colleges and companies | Patient-centered re-design of appointment scheduling | Appointment scheduling based on patient needs | Partially | Unsupervised clustering | Agglomerative clustering | Clustering comparison with human judgements, scheduling assessments concerning average appointment duration, average time spent in clinic, number of patients seen by clinic |
7 | [21] Mohammadi et al. (2018) | Appointment scheduling | Semi-structured EHR data representing unique patients visiting a large urban multi-site community health center from 2014 to 2016 | GPs not stated as involved in the research, but authors are affiliated with health colleges and companies | Predict patients’ adherence to appointments | Appointment compliance and access to care | Fully | Supervised classification | Logistic regression, artificial neural network, and naïve Bayes classifier | AUC, sensitivity, positive (no-show) predictive value, overall accuracy |
8 | [22] Park et al. (2019) | Communication | Transcripts of audio recordings from primary care office visits at 26 ambulatory care clinics between 2007 and 2009 | GPs not stated as involved in the research, but some of the authors have medical affiliations | Patient-provider communication | Patient satisfaction, payments, and quality of care | Fully | Supervised classification | Logistic classifiers, Support vector machines, Gated recurrent units, (Conditional random fields, Hidden Markov models, and hierarchical gated recurrent units) | Classification accuracy for talk-turns, precision, recall, F1 score |
9 | [23] Peito and Han (2021) | Healthcare recommender systems | Patients’ historical health records (with ICD-9 codes) from a European private health network | GPs not stated as involved in the research and no authors have medical affiliations | Patient-doctor matchmaking | Suggestions for patients concerning the best suited doctor for their next primary care visit | Partially | Representation learning (hyperbolic embeddings), transfer learning (pretrained embeddings and domain knowledge) | Domain knowledge filtering | Hit rate and precision |
10 | [24] Schwartz et al. (2022) | Care management | Prediabetes patients with an internal medicine primary care visit within an academic center with multiple ambulatory locations in Maryland and Washington, DC | GPs not stated as involved in the research, but all authors have medical affiliations | Physician–patient communication in pre-diabetes management | Guideline-concordant care | Fully | Supervised classification | Logistic regression, Linear support vector machines, Stochastic gradient descent, Random Forest, Decision tree, Gaussian naïve Bayes, Convolutional neural networks | Accuracy, sensitivity/recall, specificity, PPV/precision, F-measures |
11 | [25] Spenceley et al. (1996) | Electronic medical record (EMR) user interaction | SOAP ((S)ubjective complaint, (O)bjective findings, diagnosis or (A)nalysis, and therapy/treatment (P)lan) notes for patients and visits from Adelaide General Practice | GPs not stated as involved in the research, but one author has a medical affiliation | Adaptive interface for data entry in EMR | Usability and support for data entry in EMRs | Fully | Supervised classification | Probabilistic | Hit rate |
12 | [26] Williams et al. (2019) | Resource management through scheduling | Private taxi contractor records of taxi journeys from November 2016–February 2017 and February 2017–June 2017 | Interviews with primary care providers and all authors have medical backgrounds | Laboratory test scheduling | Time and cost reduction | Fully | Supervised regression | Linear regression | Time-to-result delay, cost reduction |
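
Several of the evaluation measures named in the table (precision/PPV, recall/sensitivity, specificity, accuracy, and the F-value) can be derived from a binary confusion matrix. The following is a minimal sketch of that relationship, not a reproduction of any included study's code. The class `ConfusionCounts`, the function `evaluation_measures`, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier, e.g. no-show (positive) vs. attended."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c: ConfusionCounts) -> dict:
    """Compute common binary evaluation measures from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)      # also called PPV
    recall = _ratio(c.tp, c.tp + c.fn)         # also called sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, not taken from any study in the table.
counts = ConfusionCounts(tp=30, fp=10, tn=50, fn=10)
measures = evaluation_measures(counts)
```

Measures such as AUC, hit rate, and average percentage error need ranked scores or continuous predictions rather than these four counts, so they are left out of this sketch.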