New function save_to_gh() #5

Open
wants to merge 13 commits into
base: DEV_v2
150 changes: 98 additions & 52 deletions R/save_to_gh.R
@@ -1,47 +1,91 @@
#' Save to GitHub
#'
#' @param df A dataframe object
#' @param metadata a list with all the information of a file, usually from
#' [get_pip_releases]
#' @inheritParams load_from_gh
#' @return invisible NULL
#' @export
#' This function uploads or updates a file in a GitHub repository. If the file
#' does not already exist, a new file will be created. If the
#' file already exists, it will be updated with the new data.
#'
#' @param df A dataframe containing the data to be uploaded or used to
#' update an existing file. The dataframe will be converted into a base64-encoded
#' string before being uploaded
#' @param repo A character string specifying the name of the GitHub repo
#' where the file will be uploaded or updated
#' @param owner A character string specifying the GitHub username or organization
#' that owns the repository. Defaults to `pipfun.ghowner` option
#' @param branch A character string specifying the branch of the repository where
#' the file should be uploaded or updated. The default is `DEV` branch
#' @param filename A character string specifying the name of the file to be created
#' or updated in the GitHub repository. If not provided, it defaults to repo name
#' @param ext A character string representing the file extension (e.g., `.csv`). Default is `csv`
#' @param metadata A list containing metadata for an existing file in the repository. Usually from [get_pip_releases]
#' It should contain `sha` (the SHA hash of the file) and `path` (the file
#' path in the repository). If `NULL`, the function will check whether the file exists
#' and retrieve the metadata
#' @param verbose A logical: whether to print detailed messages
#' about the process. The default is `TRUE`
#' @param message A character string specifying the commit message for the GitHub upload
#' or update. The default is a message with the current timestamp
#'
#' @return
#' Returns `invisible(NULL)`. The function primarily performs an upload or update
#' operation and does not return any value other than invisibly indicating the completion
#' of the task.
#'
#' @examples
#' \dontrun{
#' df <- data.frame(a = 1:10, b = letters[1:10])
#' save_to_gh(df, repo = "pip_info",
#' filename = "to_delete.csv",
#' branch = "testing")
#' # Create a new file on GitHub
#' df <- data.frame(a = 1:5, b = letters[1:5])
#' save_to_gh(df = df, repo = "aux_test", filename = "data_example", ext = "csv")
#'
#' # Update an existing file on GitHub
#' df <- data.frame(a = 6:10, b = letters[6:10])
#' save_to_gh(df = df, repo = "aux_test", filename = "data_example", ext = "csv")
#' }
#' @export
#'
save_to_gh <- function(df,
repo,
owner = getOption("pipfun.ghowner"),
branch = "DEV",
filename = repo,
ext = NULL,
metadata = NULL,
message = paste("Updating data via R script on",
Sys.time()),
verbose = TRUE,
...) {

repo,
owner = getOption("pipfun.ghowner"),
branch = "DEV",
filename = repo,
ext = "csv",
metadata = NULL,
verbose = TRUE,
message = paste("Updating data via R script on", Sys.time())) {

# Ensure the required packages are installed
if (!requireNamespace("gh", quietly = TRUE)) {
stop("Package 'gh' is required. Please install it using install.packages('gh').")
}

if (!requireNamespace("cli", quietly = TRUE)) {
install.packages("cli")
library(cli)
}

Member

This is unnecessary because both gh and cli should be part of the namespace of this package. As long as these packages are in the Imports section of the DESCRIPTION file, you don't need to add these lines of code. Thanks.

Author

Hi @randrescastaneda, thank you for reviewing. That's right, those lines are redundant because both cli and gh are in the Imports section. I removed them and pushed again.
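
For reference, a minimal sketch of the pattern discussed in this thread, assuming a standard DESCRIPTION layout. The excerpt and the helper name `notify_saved()` are illustrative assumptions, not taken from this repository.

```r
# DESCRIPTION (excerpt; assumed layout, shown here as comments)
# Imports:
#     cli,
#     gh

# With both packages listed under Imports, they are installed together with
# the package, so functions can call them with namespace qualification.
# No requireNamespace(), install.packages(), or library() calls are needed.
notify_saved <- function(path) {  # hypothetical helper, for illustration only
  cli::cli_alert_success("Saved {.file {path}}")
  invisible(NULL)
}
```

Calling `install.packages()` or `library()` inside a package function also changes the user's library and search path as a side effect, which is another reason to rely on Imports instead.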

creds <- get_github_creds() # Use the passed function to get GitHub credentials
# Get GitHub credentials
creds <- get_github_creds()

# Convert the data frame to base64-encoded content based on the file extension
content <- convert_df_to_base64(df, ext)

# Prepare params for GitHub request
params <- list(
branch = branch,
message = message,
content = content
)

# Check if metadata is provided and is valid
if (!is.null(metadata) && (!"sha" %in% names(metadata) || !"path" %in% names(metadata))) {
cli::cli_abort("Invalid metadata provided. It must contain 'sha' and 'path'.")
}

# Try to get existing SHA of the file (if it exists)
# Version control: check if the file already exists in the repo
if (is.null(metadata)) {
# Construct the file path
file_path <- check_filename_ext(filename, ext)

# Attempt to retrieve metadata (file info) from GitHub
metadata <- tryCatch({
gh::gh(
"GET /repos/{owner}/{repo}/contents/{file_path}",
@@ -58,24 +102,16 @@ save_to_gh <- function(df,
cli::cli_abort(e)
}
})
} else {
file_path <- metadata$path

}

# Convert data frame to base64-encoded content based on the file extension
content <- convert_df_to_base64(df, ext)

# Prepare parameters for the GitHub API request
params <- list(
branch = branch,
message = message,
content = content
)

# Include 'sha' parameter if the file already exists (for updating)
if (!is.null(metadata)) {
params$sha <- metadata$sha
# If metadata exists, get the file path and SHA
file_path <- metadata$path
params$sha <- metadata$sha # Include SHA for updating an existing file
} else {
# If no metadata, this is a new file, so set the file path for creation
file_path <- check_filename_ext(filename, ext)
params$sha <- NULL
}

# Upload the file to GitHub
@@ -84,35 +120,45 @@ save_to_gh <- function(df,
owner = owner,
repo = repo,
path = file_path,
.params = params,
message = message, # Commit message
content = content,
.params = params, # Base64-encoded file content
sha = params$sha, # Include SHA directly in the body of the request if updating
.token = creds$password
)

if (verbose) {
cli::cli_alert_success("File {.file {filename}.{ext}} saved successfully to
branch {.field {branch}} of {owner}/{repo} in GitHub!")
}

# Update metadata: store initial metadata and URL info
mt <- output |>
append(list(init = metadata)) |>
append(list(init = metadata)) |> # 'init' will be NULL if file didn't exist before PUT request
append(info_from_url(output$content$url))

mt$data_change <- mt$content$sha != mt$init$sha
# Track if data has changed
if (!is.null(mt$init$sha)) {
# If SHA exists in 'init', compare the current and previous SHAs
mt$data_change <- mt$content$sha != mt$init$sha
} else {
# If the file was newly created (no initial SHA), set data_change to TRUE
mt$data_change <- TRUE
}

# If verbose, print success and data change status
if (verbose) {
cli::cli_alert_success(
"File {.file {filename}.{ext}} saved successfully to branch {.field {branch}} of {owner}/{repo} in GitHub!"
)
}

if (verbose) {
if (mt$data_change) {
cli::cli_alert("Data has been updated")
} else {
cli::cli_alert("Data did not change")
}
cli::cli_alert(
if (mt$data_change) "Data has been updated" else "Data did not change"
)
}

return(invisible(mt))
}




# Helper function to convert data frame to base64-encoded content based on file extension
convert_df_to_base64 <- function(df, ext = "csv") {
if (is.null(ext))
58 changes: 36 additions & 22 deletions man/save_to_gh.Rd