Despite significant progress in reducing the burden of HIV, it remains one of the major challenges to public health worldwide. In 2023, there were 40 million adults living with HIV, and more than 1 million new HIV diagnoses were reported worldwide.1 In the United States, there were over 30,000 new HIV infections in 2022, a 12% decrease from 2018; however, more than half of the cases were reported in the southern region, primarily affecting transgender people, gay men and other men who have sex with men (MSM), individuals who inject drugs, sex workers and people in the correctional system.1,2 In the context of the Ending the HIV Epidemic (EHE) initiative, current progress falls far short of the target of reducing new HIV infections by at least 90% by 2030, highlighting a need for more efficient, effective and tailored interventions to reduce disparities in HIV prevention and treatment.3,4
Machine learning (ML) is a branch of artificial intelligence (AI) that uses computational tools, such as complex algorithms, to solve tasks without explicit programming.5 In recent years, ML has been used to innovate and transform traditional approaches in the healthcare field. Recent applications include predicting the 3D structure of proteins, which could speed up research and drug discovery; detecting and diagnosing neurodegenerative diseases early by analysing subtle changes in speech patterns before symptoms appear; and enhancing the accuracy of medical imaging by detecting patterns and anomalies that may be missed by the human eye.6–8 Given ongoing challenges in controlling the HIV epidemic, ML approaches have demonstrated promise in areas such as risk prediction, enhancing HIV testing and pre-exposure prophylaxis (PrEP), improving retention in care, optimizing antiretroviral therapy (ART), early diagnosis of comorbidities, guiding outbreak response and allocating resources more effectively in HIV programmes locally and globally.9–54
Given the growing role of AI and ML in the field, this article aims to explore how these methods are being applied across the HIV care continuum in prevention, diagnosis, treatment and public health efforts aimed at controlling the epidemic. We also discuss important considerations around implementation and future directions to guide the responsible integration of these technologies into research and practice.
Methods
Search strategy
We developed a search strategy using a combination of terms such as ‘HIV’, ‘HIV prevention’, ‘HIV care’, ‘HIV testing’, ‘artificial intelligence’, ‘machine learning’, ‘Digital Health’, ‘Public Health’ and other related terms in Medline (PubMed) and Embase. The inclusion criteria were articles published in English within the last 5 years that discussed the use of AI models (e.g. ML, natural language processing [NLP] and deep learning [DL]) in HIV prevention, HIV care, and public and global health. We also included recent conference abstracts (e.g. Conference on Retroviruses and Opportunistic Infections, Council of State and Territorial Epidemiologists).
Study selection
After the initial search, a total of 1,100 records were identified through database searches, with an additional seven records identified through other sources such as preprints and conference abstracts.55 After removing duplicates, 950 unique records remained for title and abstract screening. Of these, 827 were excluded for not meeting the inclusion criteria. The remaining 123 full-text articles were assessed for eligibility, resulting in 78 exclusions. Ultimately, 45 studies were included in this review (Figure 1).
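The selection counts above can be checked with simple arithmetic. The short sketch below uses only the numbers reported in the paragraph; the number of duplicates removed is not stated in the text and is derived here as an implied value.

```python
# Counts as reported in the study selection paragraph.
database_records = 1100
other_source_records = 7
after_deduplication = 950
excluded_at_screening = 827
excluded_at_full_text = 78

# Derived values (the duplicate count is implied, not reported in the text).
identified = database_records + other_source_records
duplicates_removed = identified - after_deduplication
full_text_assessed = after_deduplication - excluded_at_screening
included = full_text_assessed - excluded_at_full_text

print(identified, duplicates_removed, full_text_assessed, included)
```

The derived totals agree with the 123 full-text articles and 45 included studies reported above.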
Figure 1: Flow diagram of study selection

AI = artificial intelligence; HIV = human immunodeficiency virus; ML = machine learning
Analysis and reporting
The review was conducted using the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist items applicable to a narrative review. For each included article, we collected information on the authors, study aims, research region, population, data collection methods, type of AI/ML methodology used, outcomes and key findings. Table 1 provides a summary of the most relevant data.11–21,23,24,26–44,46–54
Table 1: Summary of most relevant findings11–21,23,24,26–44,46–54
| Primary focus | Authors | Region | Data source | AI/ML tools | Key findings |
| Prevention | Bao et al.11 | Australia | EMR | XGBoost, DL, RF and GB | ML models (AUCs=0.711–0.763) outperformed multivariable logistic regression (AUC=0.698) in predicting incident HIV in MSM |
|  | He et al.12 | China | Public health dataset | DT, SVM and RF | ML models (AUCs=0.853–0.942) outperformed logistic regression (AUC=0.778) in predicting incident HIV in MSM |
|  | Marcus et al.13 | USA | EMR | LASSO | Predicted incident HIV among patients at Kaiser Permanente in Northern California with an AUC of 0.86 |
|  | Krakower et al.14 | USA | EMR | LASSO | Predicted incident HIV at Atrius Health in Massachusetts with an AUC up to 0.9 |
|  | Burns et al.15 | USA | EMR | XGBoost and LASSO | Predicted incident HIV in the general population, women and Black individuals with AUCs of 0.89, 0.86 and 0.89, respectively |
|  | Saldana et al.16 | USA | Public health dataset | GB | Predicted incident HIV with an 80% accuracy rate in both sexes using notifiable STI data in a single county |
|  | Mutai et al.17 | Sub-Saharan Africa | Population-based | XGBoost | Predicted HIV-positive cases with F1 scores of 90% and 92% for males and females, respectively |
|  | Chingombe et al.18 | Zimbabwe | Population-based | SVM, BC, GNB, GB and RNN | Predicted HIV-positive results with AUCs ranging from 0.81 to 0.94 |
|  | Feller et al.19 | USA | EMR | NLP + predictive models | Data extracted from clinical notes with NLP improved the performance of predictive models from an AUC of 0.75 to 0.82 |
|  | Morales-Sánchez et al.20 | Spain | EMR | Maximum entropy, RoBERTABio and RoBERTABioLONG | Identified suspected HIV-positive cases with AUCs over 0.948 based on clinical notes |
|  | Ajmal et al.21 | USA | Public health dataset | NLP | Reduced the time needed to review records by 80% and underscored key details contributing to HIV transmission |
|  | Turbé et al.23 | South Africa | Population-based | CNN | Analysed and interpreted HIV tests in lower-income settings with a 98.9% accuracy rate |
|  | Roche et al.24 | Kenya | Population-based | CNN | AI tool achieved 100% sensitivity in interpreting HIV self-tests and outperformed user and provider interpretations |
|  | Cheah et al.26 | Malaysia | Population-based | AI chatbot | Provided educational support for using and interpreting HIVST, addressed questions and concerns about HIV testing and PrEP and offered mental health support |
|  | Ntinga et al.27 | South Africa | Social media | Nolwazi_bot | Supported HIVST and PrEP; 17.5% of participants guided by the chatbot tested positive |
|  | Liu et al.28 | USA | Population-based | Computer vision and neural networks | The DOT diary application was shown to be feasible and acceptable, with 91% adherence to oral PrEP |
|  | Buchbinder et al.29 | USA | Population-based | Computer vision and neural networks | The DOT diary measurement of PrEP use showed concordance with blood concentrations of tenofovir diphosphate and emtricitabine triphosphate at 91.0% and 85.3%, respectively |
|  | Liu et al.30 | USA | Population-based | Computer vision and neural networks | An adaptation of DOT diary for Spanish-speaking MSM and both English- and Spanish-speaking TGW was feasible, perceived as useful, and received high levels of user satisfaction |
|  | Zheng et al.31 | USA | Social media | DL approach | ML model discovered 23 new HIV-related influencers on Twitter with a 90% accuracy rate |
|  | Rice et al.32 | USA | Population-based | Heuristic-based algorithm | AI-selected peer leaders among youth experiencing homelessness reduced condomless anal sex by 31% |
| HIV care | Olatosi et al.33 | USA | EMR | Bayesian Network | Predicted HIV care status in PLWH with an AUC of 0.94 |
|  | Cai et al.34 | USA |  | SVM | Predicted whether the patient would remain in care or drop out over time with AUCs over 0.920 |
|  | Mirudwe et al.35 | Uganda |  | BERT (NLP) | Identified PWH at the highest risk of dropping out of care with an AUC of 0.96 |
|  | Maskew et al.36 | South Africa |  | Logistic regression, RF and AdaBoost | Predicted whether a PWH will attend the next scheduled appointment with an AUC of 0.68 |
|  | Seboka et al.37 | Ethiopia |  | XGBoost and GB | Predicted virological failure and low CD4 counts in PWH with AUCs of 0.99 and 0.83, respectively |
|  | Revell et al.38 | Global information | Multi-source dataset | RF | Performed accurate predictions of virological failure in different scenarios (e.g. missing data) |
|  | Goyal et al.39 | USA | Public health dataset | RF | Predicted unsuppressed viral load in PWH with an AUC of 0.822 |
|  | Ma et al.40 | Canada | Population-based | MARVIN (AI chatbot) | AI chatbot was useful, reliable and easy to access for supporting self-management and HIV medication adherence |
|  | Villanueva et al.41 | Canada |  | Hybrid model (ChatGPT + MARVIN) | AI chatbot classified messages as self-harm or insults with a 95.5% accuracy rate |
|  | Rajpurkar et al.42 | South Africa | EMR | CheXaid (CNN) | Diagnosed active pulmonary tuberculosis based on clinical features and X-ray findings with an AUC of 0.83 |
|  | Paul et al.43 | USA | EMR/cognitive tests | GB | Identified frailty in PLWH by combining demographic and health information with brain imaging features |
|  | Lui et al.44 | Asia | EMR | CNN + SVM | Predicted coronary atherosclerosis and obstructive coronary artery disease incorporating traditional cardiovascular risk factors and retinal image analysis with AUCs of 0.987 and 0.99, respectively |
| Public and global health | Mazrouee et al.46 | USA | Public health dataset | RF and DT | Determined whether an individual with HIV is part of a transmission cluster or a singleton with an AUC of 0.94 |
|  | Matta et al.47 | USA |  | NBR-Clus | Identified and characterized two clusters in HIV-positive individuals: African American and Hispanic people |
|  | Mutai et al.48 | Sub-Saharan Africa |  | Agglomerative hierarchical | Identified two clusters per sex based on similarities in sociodemographic and behavioural features across 13 countries in sub-Saharan Africa |
|  | Kupperman et al.49 | Europe |  | CNN | Detected two outbreaks that occurred among injecting drug users in Finland and Sweden at an early stage |
|  | France et al.50 | USA |  | ECNA (neural network approach) | Identified that the full clusters were 3–9 times larger than the detected clusters from HIV molecular surveillance |
|  | Sharma et al.51 | USA | Public health dataset | DRL (based on neural networks) | Simulated resource allocation based on each jurisdiction's decisions within the state and reduced HIV incidence by 19% in California and 23% in Florida |
|  | Onovo et al.52 | Sub-Saharan Africa | Public health dataset | XGBoost | Identified two key predictors of VLS improvement: viral load coverage and new enrolments in PrEP |
|  | Endawkie et al.53 | Sub-Saharan Africa | Demography and health survey | RF | Identified adults at the highest risk of HIV infection; using spatial interpolation, georeferenced the predictions to identify hotspots |
|  | Onovo et al.54 | Kenya | Demography and health survey | Tuned LASSO regression | Predicted paediatric HIV cases. Georeferencing of predictions identified HIV hotspots in 12 counties |
AUC = area under curve; BC = bagging classifier; BERT = bidirectional encoder representations from transformers; CNN = convolutional neural networks; DL = deep learning; DOT = directly observed therapy; DRL = deep reinforcement learning; DT = decision tree; ECNA = evolving contact network algorithm; EMR = electronic medical records; GB = gradient boosting; GNB = Gaussian Naïve Bayes; HIVST = human immunodeficiency virus self-testing; LASSO = least absolute shrinkage and selection operator; MSM = men who have sex with men; NLP = natural language processing; PLWH = people living with human immunodeficiency virus; PrEP = pre-exposure prophylaxis; RF = random forest; RNN = recurrent neural networks; STI = sexually transmitted infection; SVM = support vector machines; TGW = transgender women; VLS = viral load suppression; XGBoost = extreme gradient boosting.
Results
Artificial intelligence/machine learning for HIV prevention
AI and ML approaches have been applied across the HIV prevention continuum, from risk prediction and testing to adherence support and community outreach. The following section summarizes key applications, including the use of structured and unstructured data for HIV risk prediction, innovations in testing in low-resource settings, AI-powered chatbots for HIV self-testing and PrEP uptake, digital tools for adherence monitoring and AI-driven strategies for identifying peer leaders in prevention efforts.
HIV risk prediction
By analysing complex interactions among features (e.g. demographic information, sexual behaviour, clinical history and social determinants), ML algorithms have shown an advantage over conventional risk models in classifying individuals at higher risk of contracting HIV.9,10 The area under the receiver operating characteristic curve (AUC) is a standard metric for prediction models, where a value of 1.0 indicates perfect class discrimination, while a value of 0.5 is equivalent to random guessing. Using data extracted from electronic medical records (EMRs) and self-reported sexual history, Bao et al. evaluated the HIV incidence prediction performance of four ML models: gradient boosting (GB), DL, random forest (RF) and extreme gradient boosting (XGBoost).11 Briefly, GB is an ensemble classifier in which decision trees (DTs) are added sequentially, each correcting the errors of the previous ones. DL is an approach based on multi-layered neural networks loosely inspired by the brain. RF is an ensemble of DTs that combines their outputs to enhance accuracy and reduce overfitting. XGBoost is an optimized implementation of the GB approach.56 These models outperformed a multivariable logistic regression, with AUCs of up to 0.76 versus 0.70 for the regression model.11 In a similar study, He et al. explored supervised learning algorithms such as DT, support vector machines (SVMs) and RF, using data extracted from MSM sentinel surveillance. These models demonstrated AUCs ranging from 0.853 to 0.942, compared with an AUC of 0.778 for logistic regression.12
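To make the AUC values reported in this section easier to interpret, the following minimal sketch computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `auc` and the example scores are illustrative assumptions and are not taken from any of the cited studies.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative labels: 1 = outcome occurred, 0 = outcome did not occur.
labels = [0, 0, 1, 1]
perfect = auc([0.1, 0.2, 0.8, 0.9], labels)        # every positive ranked above every negative
uninformative = auc([0.5, 0.5, 0.5, 0.5], labels)  # a constant score carries no information
partial = auc([0.1, 0.8, 0.2, 0.9], labels)        # one of four pairs is ranked incorrectly
print(perfect, uninformative, partial)
```

Because the AUC depends only on how cases are ranked, it does not by itself indicate how well a model performs at any particular risk threshold.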
Using large EMR datasets, Marcus et al. and Krakower et al. evaluated multiple ML algorithms to identify potential PrEP candidates. They observed that the least absolute shrinkage and selection operator (LASSO), a penalized regression method that shrinks some coefficients to exactly zero, achieved an AUC of 0.86 for predicting incident HIV cases among patients at Kaiser Permanente in Northern California, while at Atrius Health in Massachusetts, it achieved an AUC up to 0.91 but was less accurate when tested at Fenway (AUC=0.77).13,14 However, their models underperformed among female individuals. Burns et al. compared XGBoost and LASSO for predicting HIV incidence using EMR data.15 In contrast to previous studies, both models performed well in the female cohort, with LASSO outperforming XGBoost with AUCs of 0.86 and 0.78, respectively.15 In the overall cohort, XGBoost achieved an AUC of 0.89 versus 0.84 for LASSO.15 In a parallel study, Saldana et al. trained five ML classifiers (RF, nearest neighbours, logistic regression, naive Bayes and gradient boosted trees) using a US public health dataset with reportable sexually transmitted infections (STIs).16 Gradient boosted trees were the best-performing model with an 80% accuracy rate in predicting HIV incidence for both sexes.16 In other studies, these models also accurately predicted HIV incidence among populations in developing regions with high HIV prevalence (e.g. sub-Saharan Africa), as well as in key groups, such as MSM and Black individuals.11,12,15,17,18
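The variable-selection behaviour of LASSO can be illustrated with the soft-thresholding operator, which is the building block used by common LASSO solvers. The sketch below is a simplified illustration only: applying the operator once to unpenalized coefficients gives the exact LASSO solution only for linear regression with orthonormal predictors, and the feature names, coefficient values and penalty are hypothetical rather than drawn from the cited models.

```python
def soft_threshold(coefficient, penalty):
    """Shrink a coefficient toward zero by `penalty`; return exactly 0.0 if it is smaller."""
    if coefficient > penalty:
        return coefficient - penalty
    if coefficient < -penalty:
        return coefficient + penalty
    return 0.0


# Hypothetical unpenalized coefficients for four placeholder features.
unpenalized = {"feature_a": 1.20, "feature_b": -0.45, "feature_c": 0.08, "feature_d": -0.03}
penalty = 0.10  # hypothetical penalty strength

shrunk = {name: soft_threshold(value, penalty) for name, value in unpenalized.items()}
selected = [name for name, value in shrunk.items() if value != 0.0]
print(shrunk)
print(selected)
```

Weak predictors are removed from the model entirely, which is one reason LASSO models tend to be compact and easier to inspect than models such as XGBoost.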
Natural language processing for detecting risk patterns in clinical notes
Most AI models have used structured data from EMRs to predict HIV risk. In contrast, Feller et al. observed that adding data extracted from clinical notes using NLP improved the performance of a predictive model that used only structured EMR data, from an AUC of 0.75 to an AUC of 0.82.19 Similar findings were observed in a study of Spanish clinical notes by Morales-Sánchez et al., in which a logistic regression (maximum entropy) model and two NLP models (RoBERTABio and RoBERTABioLONG) achieved AUC values above 0.948 in identifying suspected HIV-positive cases.20 Additionally, in an HIV cluster detection setting, Ajmal et al. observed that using an NLP algorithm to analyse information from interviews with 86 individuals in 11 HIV clusters in Georgia, USA, not only reduced the time required to review the data by 80% but also extracted detailed information on risk behaviours and other key variables related to HIV transmission.21
Deep learning for enhancing testing in lower income settings
DL models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), use layered computational systems to automatically learn complex patterns from raw data.22 In HIV prevention, these models have the potential to enhance the interpretation and reporting of HIV tests. In a study by Turbé et al., a CNN was trained to classify images of HIV rapid diagnostic tests (RDTs) as positive or negative based on 11,374 photographs taken by over 60 fieldworkers using Samsung tablets in South Africa.23 The algorithm was integrated into a mobile application, and its results were compared with those of trained providers.23 The model achieved 98.9% accuracy in classifying RDT images, 6% higher than the visual interpretation by health workers.23 Additionally, this mHealth system offered several advantages, such as preventing data loss, connecting to laboratory information systems and enabling real-time monitoring. Similarly, Roche et al. developed an AI algorithm using multiple computer vision models, including two different CNN approaches, to interpret images of HIV self-testing (HIVST) in Kenya. The model achieved a sensitivity of 100%, outperforming both user and provider interpretations, which were 93.2% and 97.7%, respectively.24
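Accuracy and sensitivity, the two metrics reported above, answer different questions. The following sketch shows how each is derived from a confusion matrix. The counts are hypothetical and chosen only for illustration; they are not the data from Turbé et al. or Roche et al.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that were detected
        "specificity": tn / (tn + fp),  # share of true negatives correctly ruled out
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # share of all results read correctly
    }


# Hypothetical counts: no positive test is missed, but three negatives are misread.
metrics = diagnostic_metrics(tp=44, fp=3, tn=150, fn=0)
print(metrics)
```

In this example, sensitivity is 100% while accuracy is lower, which is why both metrics are useful when comparing an algorithm with human interpretation.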
Artificial intelligence-powered chatbots and digital outreach for increasing HIV self-testing and pre-exposure prophylaxis
AI-based chatbots, or conversational agents, leverage DL techniques, such as natural language understanding and natural language generation, to conduct fluid, coherent and human-like conversations.25 Regarding HIV prevention interventions, AI chatbots provide educational support for using and interpreting HIVST, assist users in finding clinics for testing, address questions and concerns about HIV testing and PrEP and offer mental health support.26 In contrast to human counsellors, AI chatbots offer judgement-free and always-available support, which can be especially important for populations facing stigma around HIV prevention. Ntinga et al. found that 79.2% of users of an AI chatbot developed in South Africa reported a better experience with the tool than with a human provider.27 Additionally, they observed that 17.5% of participants guided by the chatbot tested positive, while 82.8% of those who tested negative expressed interest in learning more about PrEP.27
Using computer vision and neural networks, Liu et al. created a directly observed therapy (DOT) diary application for monitoring and supporting PrEP use.28 Once downloaded onto a mobile phone, the app enabled the collection of visual and audio data of pill-taking through the front-facing camera. It also sent daily dosing reminder alarms to participants and tracked and described sexual encounters. In an 8-week pilot with African American and Latinx young MSM in San Francisco and Atlanta, the app proved to be feasible and acceptable, with 91% adherence to oral PrEP.28 Similarly, a randomized controlled trial conducted in the same cities observed concordance between DOT diary measurements of PrEP use and concentrations of tenofovir diphosphate and emtricitabine triphosphate in blood. The concordance rates were 91.0% and 85.3%, respectively.29 However, there was no significant difference in PrEP adherence using the mobile app, and its usage declined over time.29 Finally, an adaptation of DOT diary for Spanish-speaking MSM and both English- and Spanish-speaking transgender women was feasible, perceived as useful, and received high levels of user satisfaction.30
Peer leader selection
Controlling the HIV epidemic also requires identifying qualified peer leaders who are willing and able to share HIV-related information. Zheng et al. designed an iterative influencer detection model, a DL approach that identified the most influential peer leaders within their community based on a dataset of one million tweets from the USA. This model discovered 23 new HIV-related influencers on Twitter, including health organizations, research institutions and local HIV advocacy groups, with a 90% accuracy rate.31 In another study, conducted by Rice et al. among youth experiencing homelessness, peer leaders selected by AI using a heuristic-based algorithm reduced condomless anal sex by 31% compared with peer leaders selected by traditional methods.32 Additionally, this improvement was observed early, within the first month.32
Artificial intelligence/machine learning for HIV care
AI and ML are increasingly being used to enhance HIV care by predicting clinical outcomes, optimizing treatment and improving patient engagement. From identifying individuals at risk of falling out of care to supporting ART decision-making and adherence, these tools analyse diverse data sources, including medical records, public health surveillance and imaging to enable more personalized, timely and effective interventions. The following section highlights emerging applications of AI across the HIV care continuum.
Predictive modelling for identifying individuals at risk of falling out of HIV care
Olatosi et al. and Cai et al. evaluated multiple ML models using CD4 cell counts and viral load tests to analyse retention in care transitions and predict whether the patient would remain in care or drop out over time.33,34 The best-performing models were the Bayesian network (AUC=0.94) in the study by Olatosi et al. and SVM (AUC=0.920) in the study by Cai et al.33,34 Mirudwe et al. observed that bidirectional encoder representations from transformers, a DL model based on NLP, achieved an AUC of 0.96 in identifying people living with HIV (PWH) at the highest risk of dropping out of care using EMR data from Uganda.35 In South Africa, Maskew et al. found that ML algorithms such as logistic regression, RF and the AdaBoost classifier predicted whether a PWH would attend the next scheduled appointment, with an AUC of 0.68.36
Artificial intelligence optimizing antiretroviral therapy regimens
Seboka et al. trained several ML models on EMRs and found that XGBoost achieved an AUC of 0.99 for predicting virological failure, while GB was the best model for predicting low CD4 counts, with an AUC of 0.83.37 Even without important data, such as genotype, time on therapy and CD4 count, Revell et al. observed that the performance of RF models remained reasonable, with an AUC of 0.78, compared with 0.89 when all information was available.38 In that study, AI-based simulations using RF-trained models identified alternative regimens with a higher likelihood of virological response than those prescribed in the clinic.38 In contrast to previous studies using EMRs, a recent study by Goyal et al. in San Diego observed that, by analysing mandatorily reported public health information, an RF model predicted PWH with an unsuppressed viral load, with an AUC of 0.822.39
Artificial intelligence-assisted adherence monitoring
MARVIN is an AI-based chatbot developed to support self-management and HIV medication adherence.40 The chatbot addressed concerns about ART administration, recommendations for taking ART during travel, general HIV-related topics and medication reminders.40 In a 4-week study, users reported that MARVIN was useful, reliable and easy to access.40 Similarly, Villanueva et al. developed a hybrid model with ChatGPT and the MARVIN chatbot to support mental health in PWH.41 The model was trained to manage high-risk messages about stigma and mental health and achieved an accuracy of 95.5% in classifying messages as self-harm or insults. The model also generated appropriate responses tailored to each category.41
The role of artificial intelligence in imaging
AI models are being evaluated to support the diagnosis and early detection of some comorbidities in people living with HIV. Rajpurkar et al. developed CheXaid, a DL model that leverages a CNN, to diagnose active pulmonary tuberculosis based on clinical features and X-ray findings.42 This model achieved an AUC of 0.83, and its stand-alone performance was significantly better than that of providers assisted by the model, with an accuracy of 79% versus 65%, respectively.42 AI is also leading advances in neuroimaging to understand cognitive impairment. Paul et al. observed that a GB multivariate regression model was effective in identifying whether a PWH is frail by combining demographic and health information with brain imaging features.43 In another study, Lui et al. incorporated traditional cardiovascular risk factors and retinal image analysis into a model that combined a DL algorithm based on CNN and ML models, such as SVM. The model predicted coronary atherosclerosis and obstructive coronary artery disease among PWH with AUCs of 0.987 and 0.99, respectively.44
Artificial intelligence in public and global health
Beyond individual-level predictions, AI and ML are being applied to enhance public health efforts by improving HIV surveillance, transmission network analysis and resource allocation. These tools enable deeper insights into transmission dynamics, support real-time outbreak detection and help identify geographic areas and populations most in need of intervention. The following section explores how AI is reshaping HIV epidemiology and public health planning through novel applications in cluster analysis, geospatial modelling and strategic decision-making.
Machine learning for epidemiological link analysis in HIV transmission networks
HIV molecular epidemiology analyses HIV gene sequence data to assess genetic similarity and identify networks of people experiencing HIV transmission.45 However, in this approach, PWH with missing data are excluded from analysis, as it is based solely on genetic information. AI techniques may offer a solution by analysing relationships between different variables beyond just genetic data. Mazrouee et al. trained multiple ML models, leveraging contextual metadata from viral genetic data to learn patterns of HIV transmission.46 Then, using metadata from PWH whose genetic sequences were unknown, they found that RF and DT models could determine whether an individual was part of a transmission cluster or a singleton. The models achieved AUCs above 0.94.46
ML approaches can also be used to identify and characterize clusters. In a study by Matta et al., inference methods (i.e. CT graph) were initially applied to convert data into graphical formats and identify groups of related variables. They used a dataset from the Sexual Acquisition and Transmission of HIV Cooperative Agreement Program (SATHCAP), which included surveys taken in Chicago, Los Angeles and Raleigh, to explore the relationship between drug use and the sexual transmission of HIV.47 Then, leveraging the graphical data, they applied unsupervised ML models for clustering, such as NBR-Clus, and identified two clusters. The first cluster was composed mostly of African American individuals, while the second was predominantly Hispanic. Additionally, through descriptive analysis, they found that the first cluster, compared with the second, showed higher behavioural risk (68% had used crack and 73% had been in prison or jail), but also a support network (78% had attended a self-help group).47 Similarly, Mutai et al. demonstrated that AI-based models can also identify clusters of countries, which enables policymakers to design tailored interventions. In their study, using agglomerative hierarchical clustering, an unsupervised ML approach, they identified two clusters per sex based on similarities in sociodemographic and behavioural features across 13 countries in sub-Saharan Africa.48
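As a rough illustration of how agglomerative hierarchical clustering works, the sketch below starts with every observation in its own cluster and repeatedly merges the two closest clusters until the requested number remains. The use of average linkage with Euclidean distance, the function names and the two-feature data points are all assumptions made for illustration; they do not reproduce the configuration or data used by Mutai et al.

```python
from itertools import combinations
from math import dist


def average_linkage(points, a, b):
    """Mean Euclidean distance between all members of cluster a and cluster b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical standardized indicators for six units (e.g. countries).
points = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.90, 0.80), (0.85, 0.95), (0.80, 0.90)]
result = agglomerative(points, n_clusters=2)
print(result)
```

Because no outcome labels are used, the resulting groups must be interpreted afterwards, for example through descriptive analysis of the features within each cluster.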
AI also has the potential to support real-time outbreak detection. Kupperman et al. developed and validated a CNN model based on genomic data to detect two outbreaks that occurred among injecting drug users in Finland and Sweden.49 They progressively added European HIV-1 sequences from the outbreak period to create a dynamic model that classified the inputs as ‘active epidemic’ or ‘inactive epidemic’. The model detected the start and the end points of the Finnish outbreak in 1999.49 Additionally, the model identified a pre-outbreak phase in Sweden, 3 years before the Swedish outbreak in 2004.49 In another study, France et al. conducted AI-powered simulations using an evolving contact network algorithm, based on neural networks, to replicate HIV molecular cluster networks in the USA from 2015 to 2017.50 They observed that the full clusters were 3–9 times larger than the clusters detected by HIV molecular surveillance and showed a disproportionate number of undiagnosed infections.50
Machine-learning models for resource allocation
Sharma et al. evaluated deep reinforcement learning (DRL), a method based on neural networks in which an agent learns to make decisions through interaction with an environment. They used DRL to develop a policy-based strategy for optimizing resource allocation across prioritized jurisdictions in California and Florida, within the context of the EHE initiative.51 They used budget allocation data from the Health Resources and Services Administration (HRSA) for these two states.51 In this framework, each jurisdiction was an agent that interacted with others but made independent decisions regarding prevention and treatment strategies. The AI-driven simulations resulted in a 19% reduction in HIV incidence in California and a 23% reduction in Florida.51 Additionally, under a simulated scenario in which the budget allocation was increased tenfold, the model revealed the potential to reduce HIV incidence by up to 75% between 2019 and 2030.51 In contrast, a model where a single agent made interventions for the entire state led to a 4.4% increase in HIV incidence in California and a 0.6% increase in Florida.51
Onovo et al. trained an XGBoost model using historical data from 21 countries in sub-Saharan Africa supported by the US President’s Emergency Plan for AIDS Relief (PEPFAR) between 2017 and 2024. They also included socio-demographic, socioeconomic and health indicators from the World Bank’s World Development Indicators.52 They observed a significant increase in viral load suppression (VLS) from 82.93% in 2017 to 95.59% in 2024 across all the countries.52 By analysing the interactions between the factors associated with this improvement, the XGBoost model identified two key predictors: viral load coverage and new enrolments in PrEP.52 Consequently, the model suggests that interventions targeting these factors could help sustain the upward trend in VLS from 2025 to 2030.
Finally, predictions generated by AI tools can be enhanced with spatial analysis to identify hotspots and improve resource allocation. For example, using an RF model with demographic and health information across sub-Saharan Africa, Endawkie et al. identified adults at the highest risk of HIV infection.53 Through spatial interpolation, they observed that these high-risk individuals were concentrated in the southern and specific eastern regions of sub-Saharan Africa.53 In a parallel study, Onovo et al. trained six supervised ML algorithms to predict paediatric HIV cases using demographic and health information of children under 15 years living with HIV in Kenya.54 These included Ridge regression, tuned Ridge regression, LASSO regression, tuned LASSO regression, Elastic Net regression and tuned Elastic Net regression. Tuned LASSO regression was the best-performing model, and the predictions were georeferenced, identifying HIV hotspots in 12 counties in the southwestern region of Kenya.54
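The Ridge, LASSO and Elastic Net models compared above differ only in the penalty added to the model's loss function. A standard textbook formulation is given below, where $L(\beta)$ is the loss for coefficients $\beta_1, \dots, \beta_p$, $\lambda \ge 0$ sets the penalty strength and $\alpha \in [0, 1]$ balances the two penalty types. This notation is an assumption for illustration, and the exact parameterization used by Onovo et al. may differ. The 'tuned' variants presumably refer to versions in which $\lambda$ (and $\alpha$ for Elastic Net) were selected by a search procedure such as cross-validation.

```latex
\begin{aligned}
\text{Ridge:} \quad & \hat{\beta} = \arg\min_{\beta} \; L(\beta) + \lambda \sum_{j=1}^{p} \beta_j^{2} \\
\text{LASSO:} \quad & \hat{\beta} = \arg\min_{\beta} \; L(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{Elastic Net:} \quad & \hat{\beta} = \arg\min_{\beta} \; L(\beta) + \lambda \Big( \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert + (1 - \alpha) \sum_{j=1}^{p} \beta_j^{2} \Big)
\end{aligned}
```

Setting $\alpha = 1$ in the Elastic Net recovers LASSO and setting $\alpha = 0$ recovers Ridge, so the three models can be viewed as points on a single continuum.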
Federated learning and privacy preservation
AI models are trained on datasets that may include particularly sensitive data such as HIV status or STI history. Federated learning (FL) and homomorphic encryption (HE) are approaches that help protect the privacy of data. FL allows models to learn from data without that data having to leave its primary source. Meanwhile, HE ensures that private data remains encrypted even while being used for analysis. Tang et al. demonstrated that a multilayer perceptron neural network model incorporating FL and HE outperformed traditional approaches for predicting HIV/STIs, with an AUC of 0.94 and an accuracy of 90.7%.57 The model incorporating FL and HE was therefore more accurate while also preserving the privacy of sensitive patient data, which is crucial for the ethical use of AI.
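The core idea of FL, that sites exchange model parameters rather than patient records, can be sketched with federated averaging. This is an illustrative sketch under stated assumptions, not the method of Tang et al.: it swaps in a logistic regression for the multilayer perceptron, omits HE entirely (the weights are sent unencrypted here), and uses hypothetical site data and function names.

```python
"""Federated averaging sketch: sites share model weights, never patient rows."""
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, features, labels, lr=0.5, epochs=20):
    """Gradient descent for logistic regression on one site's private data."""
    w = list(weights)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(features, labels):
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += error * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w


def federated_average(site_weights, site_sizes):
    """Server step: average site weights, weighted by each site's sample count."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[j] * size for w, size in zip(site_weights, site_sizes)) / total
        for j in range(n_params)
    ]


# Hypothetical private datasets; each row is [bias term, risk indicator].
site_a = ([[1, 0], [1, 0], [1, 1], [1, 1]], [0, 0, 1, 1])
site_b = ([[1, 0], [1, 1]], [0, 1])

global_weights = [0.0, 0.0]
for _ in range(10):  # communication rounds
    # Each site trains locally; only the resulting weights leave the site.
    updates = [local_update(global_weights, X, y) for X, y in (site_a, site_b)]
    sizes = [len(site_a[0]), len(site_b[0])]
    global_weights = federated_average(updates, sizes)


def predict(x):
    """Predicted probability from the shared global model."""
    return sigmoid(sum(w * xi for w, xi in zip(global_weights, x)))
```

In a deployment that also used HE, the server would aggregate encrypted weight updates instead of plaintext ones; that step is deliberately left out of this sketch.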
Discussion
AI is enhancing the HIV response by enabling more tailored interventions. AI-based models can identify individuals at higher risk, improve PrEP engagement and identify the most effective peer leaders for outreach. ML models also provide tools for more efficient public health strategies by directing resources to the areas that need them most, detecting outbreaks earlier and expanding HIV prevention efforts in underserved communities. Table 2 provides a summary of barriers influencing the HIV epidemic contrasted with potential AI opportunities for addressing them.
Table 2: Summary of select barriers influencing the HIV epidemic contrasted with potential artificial intelligence solutions
| Challenges and barriers in the HIV epidemic | AI solutions |
| Tailor HIV testing efforts; stigma and social isolation; mental and emotional health; difficulties with adherence to PrEP and ART | AI chatbots are perceived as judgement-free and offer ‘always-available’ support for finding clinics or facilities for HIV testing, addressing questions and concerns about HIV testing and PrEP. They also assist with HIV medication adherence and mental health support. Additionally, AI can help identify more effective community influencers to scale HIV prevention and education outreach |
| Limited testing in lower-income settings | AI algorithms can interpret results of HIVST using available resources, enabling remote analysis and monitoring. They can also optimize and prioritize resource allocation based on risk prediction |
| Identification of individuals most in need of testing and HIV PrEP | Models to identify predictors of individuals at higher risk of HIV acquisition and risk patterns in clinical notes, public health datasets, registries, etc., to guide conversations and prioritize testing and PrEP delivery |
| Difficulties in retention of care for people living with HIV | AI-based predictions of dropout risk to enable timely and effective retention strategies |
| Lack of experienced HIV care providers | AI can optimize interpretation of HIV tests, support treatment prescribing and develop interventions based on risk level |
| Disparities in public health funding and spending | AI-powered simulations for allocation of resources based on epidemiological context; including geospatial analysis in AI prediction can identify HIV hotspots and underserved areas to develop tailored strategies |
| Negative impact of syndemics | AI models analyse complex, multimodal data from structured and unstructured fields to tailor responses in different settings |
AI = artificial intelligence; ART = antiretroviral therapy; HIV = human immunodeficiency virus; HIVST = human immunodeficiency virus self-testing; PrEP = pre-exposure prophylaxis.
In 2018, the US Food and Drug Administration authorized the marketing of IDx-DR, the first autonomous AI software designed to detect diabetic retinopathy.58,59 Although AI is still in the early stages of its application in healthcare settings, ML algorithms could be integrated into EMRs as automated screening tools to help providers prioritize individuals for HIV testing and identify potential PrEP candidates. Additionally, these models could identify PWH at higher risk of falling out of care, enabling timely and more effective retention strategies. These models have also demonstrated potential in reducing provider error in the interpretation of HIV tests, supporting treatment decisions, improving ART prescription and optimizing manual chart reviews. They can also highlight details in clinical notes that help assess HIV risk and identify missed dynamics in HIV transmission.
In focus group discussions, providers stated that HIV risk prediction tools could be useful to engage patients in counselling about HIV prevention interventions.60 However, prediction models are imperfect, raising concerns, particularly when patients might be inappropriately categorized as high-risk.61 Receiving a ‘high-risk’ score could be ambiguous and confusing, and may also trigger fear, anxiety or mistrust.60 That is one of the reasons why AI should be viewed as a tool that supports, rather than substitutes, human expertise. Additionally, patients might react negatively if they become aware that computer models have used their medical records to generate these predictions. The success of these models depends on how they are implemented in practice, underscoring the need for pre-implementation strategies that incorporate the perceptions of both providers and patients.62
One of the strengths of these models lies in their ability to identify patterns and address implicit bias in large datasets from various sources, such as EMRs, public health data collected by federal and state agencies, and sociodemographic and health surveys. However, concerns arise when models are trained on incomplete, biased or non-representative data, which can limit their generalizability.
Studies conducted in priority populations and underserved areas have trained models using population-based tools or self-reported information, which may introduce recall and non-response bias. Similarly, most models are trained using EMRs, which may rely on incomplete data from populations disproportionately affected by HIV inequities, such as Black and Latino people. Barriers to accessing and maintaining health care, such as high uninsured rates, stigma and discrimination, and social determinants of health (e.g. transportation issues and demanding employment schedules) can result in insufficient representation of these populations in EMRs.63,64 This leads to less accurate predictions. For instance, Burns et al. found that HIV incidence predictions for White individuals were more accurate than those for Black populations.15 Additionally, ML models have been developed in urban settings, university hospitals or single institutions, often excluding transgender communities or enrolling highly educated participants in the intervention (e.g. a chatbot), which leads to biased results.26,65 In other studies, AI chatbots provided different responses for adjuvant chemotherapy regimens for endometrial cancer based on the geographic location of the patient.66 Similarly, AI tools for resume screening, writing recommendation letters or offering advice have shown biased outputs based on race, gender and people’s names.67–69 To address these issues, toolkits like AI Fairness 360 have gained importance. AI Fairness 360 helps researchers and developers detect bias in algorithms and datasets and provides education and guidance for mitigating bias before AI is used in real-world scenarios.70
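A basic form of the bias detection described above is to compare a model's performance across subgroups. The sketch below is a hand-rolled illustration and does not use the AI Fairness 360 API. The group labels, outcomes and predictions are hypothetical toy values, and the choice of accuracy and true positive rate as audit metrics is an assumption made for illustration.

```python
"""Subgroup performance audit for a binary risk classifier (toy data)."""
from collections import defaultdict


def subgroup_report(groups, y_true, y_pred):
    """Return accuracy and true positive rate (sensitivity) for each group."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "pos": 0, "true_pos": 0})
    for group, truth, pred in zip(groups, y_true, y_pred):
        c = counts[group]
        c["n"] += 1
        c["correct"] += int(truth == pred)
        if truth == 1:
            c["pos"] += 1
            c["true_pos"] += int(pred == 1)
    return {
        group: {
            "accuracy": c["correct"] / c["n"],
            # TPR is undefined for a group with no true positives.
            "tpr": c["true_pos"] / c["pos"] if c["pos"] else None,
        }
        for group, c in counts.items()
    }


def largest_gap(report, metric):
    """Difference between the best and worst group on one metric."""
    values = [r[metric] for r in report.values() if r[metric] is not None]
    return max(values) - min(values)


# Hypothetical labels and predictions for two groups, A and B.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]

report = subgroup_report(groups, y_true, y_pred)
tpr_gap = largest_gap(report, "tpr")
```

In this toy example the model is perfect for group A but misses half of the true cases in group B, so the true positive rate gap is 0.5. A gap of that kind is the signal that would prompt further investigation of the training data before deployment.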
One of the challenges to the acceptance and clinical application of these models is the lack of explainability. While DL models can make accurate predictions regarding HIV risk, support testing and treatment, and interpret clinical images, their complex architecture with layers of interconnected nodes poses a challenge in understanding the relationship between features, the factors influencing the weighting and the fragility of their performance over time.71 Explainable AI frameworks provide insights into why a model makes specific decisions, how results are achieved and the implications of those decisions.72 Additionally, providing clinicians with a fact sheet containing relevant information such as the model’s name, intended setting, mechanism of risk, score calculation, validation process, performance metrics, intended uses and how to navigate these conversations could be an important step in presenting this information in a more understandable manner.73
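One simple, model-agnostic way to see which inputs drive a black-box model is permutation importance: shuffle one feature at a time and measure how much performance drops. The sketch below is only one of many explainability techniques and is not drawn from the cited frameworks. The toy model, the data and the function names are hypothetical.

```python
"""Permutation importance: a simple, model-agnostic explanation (toy example)."""
import random


def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the dataset with only column j shuffled.
            X_shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical black-box model: flags risk from feature 0 and ignores feature 1.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0


X = [[0.9, 3], [0.8, 1], [0.7, 2], [0.6, 4], [0.4, 3], [0.3, 1], [0.2, 2], [0.1, 4]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

scores = permutation_importance(toy_model, X, y)
```

Shuffling the feature the model ignores leaves accuracy unchanged, so its importance is zero, while shuffling the feature the model relies on reduces accuracy. Summaries of this kind could feed into the clinician-facing fact sheets described above.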
Within public health frameworks, AI models can identify factors that influence HIV transmission networks. This highlights their potential to enhance strategies for detecting and responding to outbreaks, such as those in HIV molecular epidemiology. By leveraging information beyond just incidence rates and considering multiple factors such as HIV knowledge, diagnostic rates, treatment adherence, sociodemographic features and budget constraints, these models can identify disproportionately impacted areas. This is an important step in improving resource allocation and supporting tailored prevention and treatment efforts by public health officials and community organizations.
Unlike prior reviews that focus exclusively on prevention or clinical care, this study offers a comprehensive synthesis of AI and ML applications across the full HIV continuum, including prevention, care and public health strategies. Additionally, we include a practical summary table of potential AI and ML applications, which may serve as a useful reference for future studies. This study also surfaces concerns about bias in AI performance, underscoring the need to address equity and inclusivity in implementation efforts. Finally, the discussion of explainability, user perceptions and implementation challenges provides valuable insights for translating AI innovations into clinical and public health practice.
Limitations
Our study has some limitations that should be considered when interpreting the findings. First, the search was limited to two databases (PubMed and Embase) and to articles published in English within the past 5 years, which may have excluded relevant studies. Second, we did not perform a formal quality assessment of included articles, which limits the ability to interpret the evidence. Finally, the inclusion of conference abstracts and preprints, while helpful for capturing recent developments, may include findings that have not undergone peer review.
Conclusion
AI promises to improve how HIV prevention, care and public health responses are tailored and delivered, but its impact depends on effective implementation. As models become more accurate, more explainable, more inclusive in design and more community-centred, they can help us close gaps in care, reach more individuals and bring us closer to EHE. Future efforts should focus on developing models with prospective data, trained on both structured and unstructured information from diverse sources, to address disparities and ensure that predictive models are effective for all populations.
