diff --git a/CHANGELOG.md b/CHANGELOG.md index 75d8141..89f3c30 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,7 @@ ## [Unreleased] -- No notable changes. +### Features +- **clone:** Add native batch cloning from GitHub users or organizations with parallel support. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 81d35f9..eb8d90a 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -153,6 +153,7 @@ lib/ # Shared libraries (sourced, not executed) src/ # Executable scripts ├── main.sh # Entry point & dispatcher └── commands/ # Subcommand scripts + ├── clone.sh # Batch-clone from GitHub ├── pull.sh # Bulk-pull reconciliation └── status.sh # Repository status overview diff --git a/README.md b/README.md index 1bcc49f..218f78a 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ **GRR** is a CLI tool that finds all git repositories under a directory and reconciles them — fetching, pulling, stashing, resetting, and cleaning — in a single command. Built for developers and platform engineers managing many repos locally. **Features:** +- Batch-clone all repositories from a GitHub User or Organization - Discover and pull all git repos in a directory tree - Parallel execution for speed (`-p N`) - Safe pipeline: fetch → status-check → stash → checkout → reset → clean → pull @@ -84,7 +85,20 @@ grr --version ## Commands -### `grr pull` — Reconcile repositories +### \`grr clone\` — Batch clone repositories + +\`\`\`bash +# Clone all repositories from an account (User or Org) +grr clone PlatformStackPulse ./my-repos + +# Fast parallel clone with 4 jobs +grr clone PlatformStackPulse -p 4 + +# Preview what would be cloned +grr clone PlatformStackPulse --dry-run +\`\`\` + +### \`grr pull\` — Reconcile repositories ```bash # Recommended: safe pull with full features @@ -205,6 +219,19 @@ make dev-setup # Install dev tools + git hooks --- +## Releasing New Versions + +Releases are automated via GitHub Actions. To publish a new version: + +1. **Bump Version:** Go to the **Actions** tab in GitHub and select the **Update Version** workflow. +2. **Run Workflow:** Click **Run workflow**, choose the version bump type (patch, minor, or major), and run it on the `main` branch. +3. **Automated Release:** This will automatically create a new git tag. The **Release** workflow will then trigger to: + * Build the single portable binary. + * Generate a GitHub Release with the binary attached as an artifact. + * Publish a new Docker image to GitHub Container Registry (GHCR). + +--- + ## Docker ```bash diff --git a/SKILL.md b/SKILL.md new file mode 100644 index 0000000..5a2707f --- /dev/null +++ b/SKILL.md @@ -0,0 +1,53 @@ +# AI Skill: Git Repo Reconciler (GRR) Development + +This skill file guides AI agents on how to maintain, extend, and verify the **Git Repo Reconciler (GRR)** codebase. + +## 🧠 Core Architecture & Standards + +GRR is a high-performance, modular Bash CLI tool (Bash 3.2+ compatible). + +- **Library Pattern:** All reusable logic lives in `lib/*.sh`. They MUST be sourced, never executed directly, and include a double-source guard. +- **Command Dispatcher:** Commands live in `src/commands/*.sh`. The entry point `src/main.sh` dispatches to these scripts. +- **Safety First:** Every script MUST start with `set -euo pipefail`. +- **Portability:** Use `_git_timeout` (from `lib/git.sh`) instead of `timeout` to ensure compatibility between Linux and macOS. +- **Bundling:** The production binary is a single self-contained script generated by `scripts/build.sh`, which inlines all libraries and commands. + +## 🛠️ Key Workflows + +### Adding a New Command +1. Create `src/commands/mycommand.sh`. +2. Implement `mycommand_usage()` and `mycommand_run()`. +3. Update `src/main.sh` to route the command and add it to help/examples. +4. Add unit tests in `test/unit/mycommand_test.bats`. +5. Update documentation (`README.md`, `CONTRIBUTING.md`, `TEMPLATE_GUIDE.md`). + +### Parallel Processing Pattern +When implementing batch operations, follow the pattern in `pull.sh` or `clone.sh`: +- Use a `tmpdir` created via `mktemp -d` for IPC. +- Spawn background jobs `(...) &`. +- Limit concurrency using a `while [[ $running -ge $parallel ]]` loop with `wait -n`. +- Tally results from the `tmpdir` after `wait`. + +### GitHub API Interaction +- Use `curl` and `jq`. +- Always check if the account is a `User` or `Organization` using `https://api.github.com/users/`. +- Handle pagination (GitHub defaults to 30, but we prefer `per_page=100`). +- Support `GITHUB_TOKEN` via headers for private repos and rate-limit bypassing. + +## 🧪 Testing & Verification + +- **BATS:** Use the Bash Automated Testing System. +- **Mocking:** Since pure BATS doesn't have a mocking library, use `mktemp` to create fake git repositories for integration testing. +- **Standard Checks:** Always run `make lint` (ShellCheck) and `make fmt-check` (shfmt) before committing. + +## 🚀 Release & Maintenance + +- **Versioning:** Follow Semantic Versioning. +- **Releases:** Triggered via GitHub Actions `version-bump.yml`. Do not manually create tags unless necessary. +- **Artifacts:** The `release.yml` workflow generates the single portable binary for distribution. + +## 📖 Style Guide Summary +- Indent: 4 spaces. +- Variables: Always quoted `"$VAR"`. +- Conditionals: Prefer `[[ ... ]]` over `[ ... ]`. +- Separation of Concerns: Logic in `lib/`, CLI glue in `src/`. diff --git a/TEMPLATE_GUIDE.md b/TEMPLATE_GUIDE.md index 278743f..05a1493 100644 --- a/TEMPLATE_GUIDE.md +++ b/TEMPLATE_GUIDE.md @@ -4,8 +4,8 @@ An overview of the design patterns, project structure, and conventions used in G ## Project Stats -- **Shell Files**: 10 (7 lib + 2 src commands + 1 entry point) -- **Test Files**: 9 (7 unit + 1 integration + 1 helper) +- **Shell Files**: 11 (7 lib + 3 src commands + 1 entry point) +- **Test Files**: 10 (8 unit + 1 integration + 1 helper) - **Test Framework**: BATS (Bash Automated Testing System) - **Workflows**: 6 GitHub Actions workflows - **Dependencies**: ShellCheck, shfmt, BATS (dev only) diff --git a/src/commands/clone.sh b/src/commands/clone.sh new file mode 100644 index 0000000..a1ea7f1 --- /dev/null +++ b/src/commands/clone.sh @@ -0,0 +1,278 @@ +#!/usr/bin/env bash +# src/commands/clone.sh — Batch clone all repositories from a GitHub account +# +# Usage: +# grr clone [OPTIONS] [DIRECTORY] +# grr clone PlatformStackPulse /repos -p 4 + +clone_usage() { + cat < [DIRECTORY] + +Batch clone all public repositories from a GitHub User or Organization. +Default directory is the current working directory. + +OPTIONS: + -d, --dry-run Show what would be cloned without making changes + -p, --parallel N Run N parallel clone operations (default: 1) + -t, --token TOKEN GitHub Personal Access Token (or set GITHUB_TOKEN env) + -h, --help Show this help + +EXAMPLES: + # Clone all repos from a user/org + $(basename "$0") clone PlatformStackPulse ./my-repos + + # Fast parallel clone with 4 jobs + $(basename "$0") clone PlatformStackPulse -p 4 +EOF +} + +# ── Per-repo clone logic ────────────────────────────────────────────────────── + +_clone_process_repo() { + local name="$1" + local url="$2" + local target_dir="$3" + local repo_path="$target_dir/$name" + + if [[ -d "$repo_path" ]]; then + log_warning "Skipping (already exists): $name" + return 2 # special code: skipped + fi + + if [[ "${_CLONE_DRY_RUN}" == "true" ]]; then + log_info "[DRY-RUN] Would clone: $name ($url)" + return 0 + fi + + log_info "Cloning: $name..." + if ! git clone "$url" "$repo_path" >/dev/null 2>&1; then + log_error "Failed to clone: $name" + return 1 + fi + + log_success "Cloned: $name" + return 0 +} + +# ── Summary ────────────────────────────────────────────────────────────────── + +_clone_print_summary() { + local total="$1" success="$2" failed="$3" skipped="$4" + + echo "" + print_separator "=" + echo " Total repositories found: $total" + echo " Successfully cloned: $success" + echo " Failed: $failed" + echo " Skipped (exists): $skipped" + print_separator "=" + + if [[ $failed -eq 0 && $total -gt 0 ]]; then + log_success "Batch clone completed successfully!" + return 0 + elif [[ $total -eq 0 ]]; then + log_warning "No repositories found for account" + return 1 + else + log_error "Some clones failed (see errors above)" + return 1 + fi +} + +# ── Entry point ────────────────────────────────────────────────────────────── + +clone_run() { + local account="" + local target_dir="${PWD}" + _CLONE_DRY_RUN="false" + local parallel=1 + local token="${GITHUB_TOKEN:-}" + + # Parse command options + while [[ $# -gt 0 ]]; do + case "$1" in + -h | --help) + clone_usage + return 0 + ;; + -d | --dry-run) + _CLONE_DRY_RUN="true" + shift + ;; + -p | --parallel) + if [[ $# -lt 2 ]]; then + log_error "--parallel requires a number" + return "$ERR_INVALID_INPUT" + fi + parallel="$2" + shift 2 + ;; + -t | --token) + if [[ $# -lt 2 ]]; then + log_error "--token requires a value" + return "$ERR_INVALID_INPUT" + fi + token="$2" + shift 2 + ;; + -*) + log_error "Unknown option: $1" + clone_usage + return "$ERR_INVALID_INPUT" + ;; + *) + if [[ -z "$account" ]]; then + account="$1" + else + target_dir="$1" + fi + shift + ;; + esac + done + + # Validate input + if [[ -z "$account" ]]; then + log_error "Account name (User or Org) is required" + clone_usage + return "$ERR_INVALID_INPUT" + fi + + require_command "curl" "sudo apt install curl" || return "$ERR_DEPENDENCY" + require_command "jq" "sudo apt install jq" || return "$ERR_DEPENDENCY" + + if ! [[ "$parallel" =~ ^[1-9][0-9]*$ ]]; then + log_error "Parallel count must be a positive integer (got: '$parallel')" + return "$ERR_INVALID_INPUT" + fi + + mkdir -p "$target_dir" || { + log_error "Cannot create/access directory: $target_dir" + return "$ERR_PERMISSION" + } + + # ── Fetch repository list from GitHub API ──────────────────────────────── + + log_info "Fetching repository list for: $account" + + local headers=() + headers+=("-H" "Accept: application/vnd.github.v3+json") + if [[ -n "$token" ]]; then + headers+=("-H" "Authorization: token $token") + fi + + # 1. Determine account type (User or Org) + local account_data + account_data=$(curl -s "${headers[@]}" "https://api.github.com/users/$account") + local account_type + account_type=$(echo "$account_data" | jq -r '.type // empty') + + if [[ -z "$account_type" ]]; then + log_error "Could not find account: $account" + return "$ERR_NOT_FOUND" + fi + + local type_path="users" + if [[ "$account_type" == "Organization" ]]; then + type_path="orgs" + log_debug "Account is an Organization" + else + log_debug "Account is a User" + fi + + # 2. Fetch all repositories (paginated) + local page=1 + local repo_names=() + local repo_urls=() + + while true; do + log_debug "Fetching page $page..." + local res + res=$(curl -s "${headers[@]}" "https://api.github.com/$type_path/$account/repos?per_page=100&page=$page") + + # Check for errors in API response + if echo "$res" | jq -e 'type == "object" and .message != null' >/dev/null; then + local message + message=$(echo "$res" | jq -r '.message') + if [[ "$message" != "Not Found" ]]; then + log_error "GitHub API error: $message" + return "$ERR_INTEGRATION" + fi + fi + + local count + count=$(echo "$res" | jq '. | length') + if [[ -z "$count" || "$count" == "0" || "$res" == "[]" ]]; then + break + fi + + # Extract names and clone URLs + while IFS= read -r line; do repo_names+=("$line"); done < <(echo "$res" | jq -r '.[].name') + while IFS= read -r line; do repo_urls+=("$line"); done < <(echo "$res" | jq -r '.[].clone_url') + + if [[ "$count" -lt 100 ]]; then + break + fi + page=$((page + 1)) + done + + local total=${#repo_names[@]} + if [[ $total -eq 0 ]]; then + log_warning "No public repositories found for $account" + _clone_print_summary 0 0 0 0 + return 0 + fi + + log_info "Found $total repository/repositories" + + # ── Execute Cloning ────────────────────────────────────────────────────── + local success=0 failed=0 skipped=0 + + if [[ $parallel -gt 1 ]]; then + log_info "Cloning with $parallel parallel jobs" + local tmpdir + tmpdir=$(mktemp -d) + trap 'rm -rf "$tmpdir"' EXIT + + local running=0 i=0 + for ((i = 0; i < total; i++)); do + while [[ $running -ge $parallel ]]; do + wait -n 2>/dev/null || true + running=$(jobs -rp | wc -l) + done + + ( + local result + if _clone_process_repo "${repo_names[$i]}" "${repo_urls[$i]}" "$target_dir"; then + result="success" + else + [[ $? -eq 2 ]] && result="skipped" || result="failed" + fi + echo "$result" >"$tmpdir/job_$i" + ) & + running=$(jobs -rp | wc -l) + done + wait + + for f in "$tmpdir"/job_*; do + [[ -f "$f" ]] || continue + case "$(cat "$f")" in + success) success=$((success + 1)) ;; + skipped) skipped=$((skipped + 1)) ;; + *) failed=$((failed + 1)) ;; + esac + done + else + local i=0 + for ((i = 0; i < total; i++)); do + if _clone_process_repo "${repo_names[$i]}" "${repo_urls[$i]}" "$target_dir"; then + success=$((success + 1)) + else + [[ $? -eq 2 ]] && skipped=$((skipped + 1)) || failed=$((failed + 1)) + fi + done + fi + + _clone_print_summary "$total" "$success" "$failed" "$skipped" +} diff --git a/src/main.sh b/src/main.sh index 28bd440..a24183b 100755 --- a/src/main.sh +++ b/src/main.sh @@ -39,6 +39,7 @@ GRR — Git Repo Reconciler Bulk-update, inspect, and reconcile git repositories at scale. COMMANDS: + clone Batch clone repositories from a GitHub account pull Reconcile (pull) all repos in a directory status Show status of all repos in a directory @@ -51,6 +52,7 @@ GLOBAL OPTIONS: Run '$(basename "$0") COMMAND --help' for command-specific help. EXAMPLES: + $(basename "$0") clone PlatformStackPulse ./my-repos -p 4 $(basename "$0") pull --fetch --stash -V /repos $(basename "$0") pull -d --fetch /repos $(basename "$0") status /repos @@ -100,6 +102,12 @@ main() { # Command dispatch local command="${1:-}" case "$command" in + clone) + shift + # shellcheck source=commands/clone.sh + source "$SRC_DIR/commands/clone.sh" + clone_run "$@" + ;; pull) shift # shellcheck source=commands/pull.sh diff --git a/test/unit/clone_test.bats b/test/unit/clone_test.bats new file mode 100644 index 0000000..34163c2 --- /dev/null +++ b/test/unit/clone_test.bats @@ -0,0 +1,41 @@ +#!/usr/bin/env bats +# test/unit/clone_test.bats — Tests for grr clone command + +setup() { + PROJECT_ROOT="$(cd "$BATS_TEST_DIRNAME/../.." && pwd)" + export NO_COLOR=1 + export VERBOSE="false" + export LOG_FILE="" +} + +@test "clone --help shows usage" { + run bash "$PROJECT_ROOT/src/main.sh" clone --help + [ "$status" -eq 0 ] + [[ "$output" == *"Usage:"* ]] + [[ "$output" == *"--parallel"* ]] +} + +@test "clone with no account shows error" { + run bash "$PROJECT_ROOT/src/main.sh" clone + [ "$status" -eq 10 ] # ERR_INVALID_INPUT + [[ "$output" == *"Account name"* ]] +} + +@test "clone --dry-run handles missing account gracefully" { + # Mocks for curl/jq are harder in pure bats without a mocking library, + # but we can test the pre-API logic. + run bash "$PROJECT_ROOT/src/main.sh" clone -d + [ "$status" -eq 10 ] +} + +@test "clone rejects invalid parallel count" { + run bash "$PROJECT_ROOT/src/main.sh" clone myaccount -p 0 + [ "$status" -eq 10 ] + [[ "$output" == *"positive integer"* ]] +} + +@test "clone handles invalid directory permission" { + # This might fail if run as root, but usually /root is protected + run bash "$PROJECT_ROOT/src/main.sh" clone myaccount /root/no-access + [[ "$status" -eq 12 ]] # ERR_PERMISSION +}