Table 3 Summary of studies included in the present scoping review along nine research-based attributes concerning general practice, administrative tasks, and machine learning

From: Machine learning in general practice: scoping review of administrative task support and automation

 

No

Author

General practice: What is the problem?

General practice: What data is used?

General practice: How are GPs involved?

Administrative task: What is the task?

Administrative task: What needs improving?

Administrative task: How automated?

Machine learning: What is the problem?

Machine learning: What methods are used?

Machine learning: What evaluation measures?

1

[15] Abu Lekham et al. (2021)

Appointment scheduling

Data on patient appointments from an outpatient primary care center containing 26 features collected from 2016 to 2019

GPs not stated as involved in the research, but one author is affiliated with the healthcare center in question

Prediction of missed appointments (no-shows and early cancellations)

Patient scheduling, enhance capacity use, maximize revenues, minimize costs, and ultimately achieve financial stability

Fully

Supervised (binary, multi-class, multi-stage chain)

Logistic Regression, Decision Tree, and ensemble methods, including Random Forest, AdaBoost, Gradient Boosting, and Bagging

Precision, recall, F-value, and accuracy

2

[16] Ahmad et al. (2021)

Appointment scheduling

Patient-visit information (patient ID, month, day, age, gender, race, ethnicity, insurance type, visit type, and previous no-shows) from the EHR database, eClinicalWorks, between 2014 and 2016

GPs not stated as involved in the research, but all authors have medical affiliations

Reduce the rate of clinical no-shows or missed appointments

Decreasing clinical no-show rates

Fully

Supervised regression

Probit regression

Sensitivity, specificity, ROC curve and AUC

3

[17] Cubillas et al. (2014)

Appointment scheduling

Historical appointment data, together with weather and environmental data, for patients requiring administrative assistance from 2007 to 2011

GPs not stated as involved in the research and no authors have medical affiliations

Patient scheduling for differentiating between administrative and healthcare matters

Schedule in accordance with demand predicted for each day

Fully

Supervised regression

Generalized Linear Models and Support Vector Machines (with Linear and Gaussian kernel)

Average percentage error

4

[18] López Seguí et al. (2020)

Teleconsultation

Teleconsultations recorded by the teleconsulting system received between 2016 and 2018

GPs are involved in labelling the teleconsultations and some of the authors have medical affiliations

Text classification of teleconsultation messages between GPs and patients

Teleconsultation with decision support avoiding the need for a face-to-face visit

Fully

Supervised classification

Random Forest, Gradient Boosting (LightGBM), fastText, Multinomial Naive Bayes, and Complement Naive Bayes

Precision, sensitivity, F-value and the ROC curve

5

[19] Michalowski et al. (2017)

Care management

Canadian Cardiovascular Society’s atrial fibrillation clinical practice guidelines and Cochrane Database of Systematic Reviews

GPs not stated as involved in the research and no authors have medical affiliations

Disease management

Personalized management of atrial fibrillation

Fully

Supervised classification

Preference learning

No evaluation reported

6

[20] Mohammadi et al. (2022)

Appointment scheduling

EHR data (including patient, visit and provider characteristics) from encounters at an urban community health clinic in 2014 with an emphasis on the “schedulers’ notes” field

GPs not stated as involved in the research, but authors are affiliated with health colleges and companies

Patient-centered re-design of appointment scheduling

Appointment scheduling based on patient needs

Partially

Unsupervised clustering

Agglomerative clustering

Clustering comparison with human judgements, scheduling assessments concerning average appointment duration, average time spent in clinic, number of patients seen by clinic

7

[21] Mohammadi et al. (2018)

Appointment scheduling

Semi-structured EHR data representing unique patients visiting a large urban multi-site community health center from 2014 to 2016

GPs not stated as involved in the research, but authors are affiliated with health colleges and companies

Predict patients’ adherence to appointments

Appointment compliance and access to care

Fully

Supervised classification

Logistic regression, artificial neural network, and naïve Bayes classifier

AUC, sensitivity, positive (no-show) predictive value, overall accuracy

8

[22] Park et al. (2019)

Communication

Transcripts of audio recordings from primary care office visits at 26 ambulatory care clinics between 2007 and 2009

GPs not stated as involved in the research, but some of the authors have medical affiliations

Patient-provider communication

Patient satisfaction, payments, and quality of care

Fully

Supervised classification

Logistic classifiers, Support vector machines, and Gated recurrent units (also Conditional random fields, Hidden Markov models, and hierarchical gated recurrent units)

Classification accuracy for talk-turns, precision, recall, F1 score

9

[23] Peito and Han (2021)

Healthcare recommender systems

Patients’ historical health records (with ICD-9 codes) from a European private health network

GPs not stated as involved in the research and no authors have medical affiliations

Patient-doctor matchmaking

Suggestions for patients concerning the best suited doctor for their next primary care visit

Partially

Representation learning (hyperbolic embeddings), transfer learning (pretrained embeddings and domain knowledge)

Domain knowledge filtering

Hit rate and precision

10

[24] Schwartz et al. (2022)

Care management

Prediabetes patients with an internal medicine primary care visit within an academic center with multiple ambulatory locations in Maryland and Washington, DC

GPs not stated as involved in the research, but all authors have medical affiliations

Physician–patient communication in pre-diabetes management

Guideline-concordant care

Fully

Supervised classification

Logistic regression, Linear support vector machines, Stochastic gradient descent, Random Forest, Decision tree, Gaussian naïve Bayes, Convolutional neural networks

Accuracy, sensitivity/recall, specificity, PPV/precision, F-measures

11

[25] Spenceley et al. (1996)

Electronic medical record (EMR) user interaction

SOAP ((s)ubjective complaint, (o)bjective findings, diagnosis or (a)nalysis, and therapy/treatment (p)lan) notes for patients and visits from Adelaide General Practice

GPs not stated as involved in the research, but one author has a medical affiliation

Adaptive interface for data entry in EMR

Usability and support for data entry in EMRs

Fully

Supervised classification

Probabilistic

Hit rate

12

[26] Williams et al. (2019)

Resource management through scheduling

Private taxi contractor records of taxi journeys from November 2016–February 2017 and February 2017–June 2017

Primary care providers were interviewed, and all authors have medical backgrounds

Laboratory test scheduling

Time and cost reduction

Fully

Supervised regression

Linear regression

Time-to-result delay, cost reduction
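
Several of the classification studies in the table report precision, recall (sensitivity), specificity, accuracy, and F-value. As a reading aid, the sketch below shows how these measures relate to the four cells of a binary confusion matrix. It is only an illustration: the names `ConfusionCounts` and `binary_metrics` and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # e.g. no-shows correctly predicted as no-shows
    fp: int  # attended appointments wrongly predicted as no-shows
    fn: int  # no-shows wrongly predicted as attended
    tn: int  # attended appointments correctly predicted as attended


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute common evaluation measures from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)      # also called PPV
    recall = _ratio(c.tp, c.tp + c.fn)         # also called sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Made-up counts, for illustration only.
example = binary_metrics(ConfusionCounts(tp=30, fp=10, fn=20, tn=140))
```

With imbalanced outcomes such as missed appointments, accuracy can look high even when recall on the minority class is low, which may be why most of the studies report it alongside precision, recall, or AUC.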