The DataFindR package provides tools that use Large Language Models (LLMs) to semi-automate the screening and data extraction steps of systematic reviews. It is designed to populate a project created and managed by the metawoRld package.
Key features include:
- Fetching metadata (Title/Abstract) for DOIs/PMIDs.
- Assessing study relevance using LLMs guided by project criteria stored in metawoRld.
- Extracting specific data points from full text using LLMs guided by a data schema stored in metawoRld.
- Caching results to avoid redundant API calls and costs.
- An interactive Shiny app for screening assistance.
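The caching idea in the list above can be illustrated with a minimal sketch. It is not DataFindR's actual implementation: `cached_call()`, `fake_api_call()`, and the cache directory are hypothetical names used only to show how a result keyed by a study identifier can be stored on disk and reused instead of repeating a paid request.

```r
# Hypothetical file-based cache; an illustration, not DataFindR's internals.
cache_dir <- tempfile("df_cache_sketch_")

cached_call <- function(key, compute, dir = cache_dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  # Sanitise the key so identifiers such as DOIs are safe file names.
  safe_key <- gsub("[^A-Za-z0-9._-]", "_", key)
  cache_file <- file.path(dir, paste0(safe_key, ".rds"))
  if (file.exists(cache_file)) {
    # Cache hit: return the stored result without calling compute().
    return(readRDS(cache_file))
  }
  result <- compute()
  saveRDS(result, cache_file)
  result
}

# Stand-in for an expensive LLM or metadata request (assumed, for the demo).
n_calls <- 0
fake_api_call <- function() {
  n_calls <<- n_calls + 1
  list(decision = "Include", score = 0.9)
}

first  <- cached_call("31772108", fake_api_call)
second <- cached_call("31772108", fake_api_call) # served from disk
```

After both calls, `fake_api_call()` has run only once. A real cache would also need a rule for invalidating entries when the project criteria or the prompt change; this sketch leaves that out.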
Please see the Conceptual Overview vignette hosted by the metawoRld package for a detailed explanation of the entire living review pipeline and how DataFindR fits in. (Replace link with actual URL).
# Assuming DataFindR is on GitHub (replace 'andjar/DataFindR')
# install.packages("remotes")
remotes::install_github("andjar/DataFindR")
# DataFindR requires metawoRld
# remotes::install_github("andjar/metawoRld")
library(DataFindR)
# --- Prerequisites ---
# 1. API key set (e.g., OPENAI_API_KEY in .Renviron, then restart R)
# 2. A metawoRld project exists at the specified path
# Example metawoRld setup (see metawoRld documentation):
proj_path <- file.path(tempdir(), "assess_example_proj")
if (!dir.exists(proj_path)) {
  try(metawoRld::create_metawoRld(
    proj_path, "Assess Example", "Desc",
    inclusion_criteria = c("Human", "Pregnancy", "Cytokine"),
    exclusion_criteria = c("Animal", "Review")
  ))
}
# --- Assess a study ---
pmid <- "31772108" # Example relevant PMID
if (dir.exists(proj_path)) {
  assessment_result <- tryCatch(
    df_assess_relevance(
      identifier = pmid,
      metawoRld_path = proj_path,
      email = "your.name@example.com" # Placeholder; replace with your own address
    ),
    error = function(e) {
      # Report the failure and return NULL so the message is not printed twice
      message("Assessment failed: ", conditionMessage(e))
      NULL
    }
  )
  print(assessment_result)
} else {
  print("Skipping assessment example: metawoRld project setup failed.")
}
# Or use the Shiny App
# df_shiny_assess()
# --- Clean up ---
# unlink(proj_path, recursive = TRUE)
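The two prerequisites listed in the example (an API key in the environment and an existing metawoRld project directory) can be checked before any request is sent. The helper below is a sketch under assumptions: `check_prerequisites()` is a hypothetical function, not part of DataFindR or metawoRld, and it only tests that the directory exists, not that it is a valid metawoRld project.

```r
# Hypothetical helper; not provided by DataFindR or metawoRld.
check_prerequisites <- function(proj_path, api_key_var = "OPENAI_API_KEY") {
  problems <- character(0)
  # An unset variable comes back as "", which nzchar() treats as missing.
  if (!nzchar(Sys.getenv(api_key_var, unset = ""))) {
    problems <- c(problems,
                  sprintf("Environment variable %s is not set.", api_key_var))
  }
  if (!dir.exists(proj_path)) {
    problems <- c(problems,
                  sprintf("No project directory at %s.", proj_path))
  }
  problems
}

problems <- check_prerequisites(file.path(tempdir(), "assess_example_proj"))
if (length(problems) > 0) {
  message(paste(problems, collapse = "\n"))
}
```

Returning the problems as a character vector, rather than stopping at the first one, lets a script report everything that needs fixing in a single pass.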