Implement Sparse Checkout for GitRepository #1774



@dipti-pai dipti-pai commented Apr 10, 2025

Sparse Checkout Directories in GitRepositories.

- Add `.spec.sparseCheckout` and `.status.observedSparseCheckout` fields to `GitRepository`.
- Add controller support to send the sparse checkout directories to go-git via pkg methods.
- Use `.status.observedSparseCheckout` to detect drift in configuration.
- Trim leading "./" in directory paths.
- Validate the spec configuration by checking that the directories specified in the spec exist in the cloned repository after a successful checkout.
- Add tests for testing the observed sparse checkout behavior.
- Add docs describing the new fields.

Fixes: #1707
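
The drift detection mentioned in the list above can be pictured roughly as follows. This is only a minimal sketch, not the controller's actual code: the function name `sparseCheckoutChanged` is hypothetical, and the order-insensitive comparison is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sparseCheckoutChanged is a hypothetical helper (not taken from the PR).
// It reports whether the directories requested in .spec.sparseCheckout
// differ from the ones recorded in .status.observedSparseCheckout.
// Assumption: the order of entries is not significant.
func sparseCheckoutChanged(spec, observed []string) bool {
	if len(spec) != len(observed) {
		return true
	}
	// Sort copies so the caller's slices are left untouched.
	a := append([]string(nil), spec...)
	b := append([]string(nil), observed...)
	sort.Strings(a)
	sort.Strings(b)
	for i := range a {
		if a[i] != b[i] {
			return true
		}
	}
	return false
}

func main() {
	observed := []string{"charts", "./kustomize"}
	fmt.Println(sparseCheckoutChanged([]string{"charts", "./kustomizeabc/"}, observed)) // true
	fmt.Println(sparseCheckoutChanged([]string{"./kustomize", "charts"}, observed))     // false
}
```

When the helper reports a change, a controller would presumably treat the stored artifact as stale and fetch again, even if the Git revision itself has not moved.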

Current behavior worth calling out.

  1. If the `.spec.sparseCheckout` directory list includes a directory that does not exist, the underlying go-git checkout method does not return an error and silently continues. See go-git/go-git#1500 ("SparseCheckoutDirectories behavior when directory path does not exist"). This is being handled in the controller.
  2. If the `.spec.sparseCheckout` directory list includes a directory with more than one level of nesting, sparse checkout is not honored and all top-level directories are checked out. This will be fixed by go-git/go-git#1455 ("SparseCheckoutDirectories works only for 1st level directory.").
  3. If `.spec.sparseCheckout` uses a relative path beginning with `./`, the path is ignored: nothing is checked out and no error is returned. See go-git/go-git#1506 ("SparseCheckoutDirectories ignores relative paths beginning with (./) while checking out git repository"). This is being handled in source-controller.
  4. If `.spec.sparseCheckout` includes an empty directory, the entire repository is checked out (expected).

@dipti-pai dipti-pai marked this pull request as draft April 10, 2025 22:25
@dipti-pai dipti-pai force-pushed the git-sparse-checkout branch 3 times, most recently from 72d3852 to 7035a58 Compare April 10, 2025 23:27
@stefanprodan stefanprodan changed the title Sparse Checkout Directories in GitRepositories. Implement Sparse Checkout for GitRepository Apr 11, 2025
@stefanprodan stefanprodan added the area/git (Git related issues and pull requests) and area/api (API related issues and pull requests) labels Apr 11, 2025
@stefanprodan

> If .spec.sparseCheckout uses relative path beginning with a ./, the path is ignored and nothing is checked out and no errors are thrown.

@dipti-pai I think we should handle this on our own. We could trim ./ by prefix, before passing the array to the Git client.
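
As an illustration of the suggestion above, here is a minimal sketch of trimming the `./` prefix before the directory list is handed to the Git client. The helper name `normalizeSparseDirs` is hypothetical and not taken from the PR; it relies only on the standard library's `strings.TrimPrefix`.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSparseDirs is a hypothetical helper (not taken from the PR).
// It returns a copy of dirs with a single leading "./" removed from each
// entry, so that go-git never sees a path in the form it ignores.
func normalizeSparseDirs(dirs []string) []string {
	out := make([]string, 0, len(dirs))
	for _, d := range dirs {
		out = append(out, strings.TrimPrefix(d, "./"))
	}
	return out
}

func main() {
	fmt.Println(normalizeSparseDirs([]string{"charts", "./kustomize/"})) // [charts kustomize/]
}
```

Returning a new slice rather than rewriting the input keeps the user's original spec values intact, which matters if they are later recorded in status as written.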

@dipti-pai dipti-pai force-pushed the git-sparse-checkout branch from 7035a58 to 7565fa8 Compare April 11, 2025 17:23
@dipti-pai dipti-pai marked this pull request as ready for review April 11, 2025 17:41
@pjbgf

pjbgf commented Apr 13, 2025

Of the abnormal behaviours called out, #2 was the only one that could not be worked around directly in the controller. That is now fixed and merged into main.

I'd agree with @stefanprodan on #3, and the fix LGTM.

For #1, we could work around it by adding an `os.Lstat` call after the clone operation to confirm the dir exists - only when sparse checkout is used. @stefanprodan WDYT?

@stefanprodan

> we could workaround by adding a os.Lstat after the clone operation to confirm the dir exists - only when sparse checkout is used

Yes we could verify the dirs and error out in the controller 👍

@dipti-pai

Updated the code to handle an incorrect user configuration where `.spec.sparseCheckout` points to a directory that does not exist. Tested two scenarios: one where the configuration has an error during the initial setup of the GitRepository, and one where the repository is first cloned successfully and the configuration is then updated to one that has an error.

Spec:
  Interval:  5m
  Ref:
    Branch:  master
  Sparse Checkout:
    charts
    ./kustomizeabc/
  Timeout:  60s
  URL:      https://github.com/stefanprodan/podinfo
Status:
  Artifact:
    Digest:            sha256:b78631f0a004c1b31891d1a481e9962312bcb7cef306c3cebbdfc3fca484a551
    Last Update Time:  2025-04-15T19:35:36Z
    Path:              gitrepository/default/podinfo/b3396adb98a6a0f5eeedd1a600beaf5e954a1f28.tar.gz
    Revision:          master@sha1:b3396adb98a6a0f5eeedd1a600beaf5e954a1f28
    Size:              14671
    URL:               http://source-controller.kustomize-system.svc.cluster.local./gitrepository/default/podinfo/b3396adb98a6a0f5eeedd1a600beaf5e954a1f28.tar.gz
  Conditions:
    Last Transition Time:  2025-04-15T19:40:08Z
    Message:               failed to sparse checkout directories : sparse checkout dir './kustomizeabc/' does not exist in repository: lstat /tmp/gitrepository-default-podinfo-1053626913/kustomizeabc: no such file or directory
    Observed Generation:   7
    Reason:                GitOperationFailed
    Status:                True
    Type:                  Stalled
    Last Transition Time:  2025-04-15T19:40:08Z
    Message:               failed to sparse checkout directories : sparse checkout dir './kustomizeabc/' does not exist in repository: lstat /tmp/gitrepository-default-podinfo-1053626913/kustomizeabc: no such file or directory
    Observed Generation:   7
    Reason:                GitOperationFailed
    Status:                False
    Type:                  Ready
    Last Transition Time:  2025-04-15T19:40:08Z
    Message:               failed to sparse checkout directories : sparse checkout dir './kustomizeabc/' does not exist in repository: lstat /tmp/gitrepository-default-podinfo-1053626913/kustomizeabc: no such file or directory
    Observed Generation:   7
    Reason:                GitOperationFailed
    Status:                True
    Type:                  FetchFailed
    Last Transition Time:  2025-04-15T19:35:36Z
    Message:               stored artifact for revision 'master@sha1:b3396adb98a6a0f5eeedd1a600beaf5e954a1f28'
    Observed Generation:   6
    Reason:                Succeeded
    Status:                True
    Type:                  ArtifactInStorage
  Observed Generation:     7
  Observed Sparse Checkout:
    charts
    ./kustomize

Once the error in the configuration is fixed, the reconciliation succeeds:

Spec:
  Interval:  5m
  Ref:
    Branch:  master
  Sparse Checkout:
    charts
    ./kustomize/
  Timeout:  60s
  URL:      https://github.com/stefanprodan/podinfo
Status:
  Artifact:
    Digest:            sha256:b78631f0a004c1b31891d1a481e9962312bcb7cef306c3cebbdfc3fca484a551
    Last Update Time:  2025-04-15T19:43:01Z
    Path:              gitrepository/default/podinfo/b3396adb98a6a0f5eeedd1a600beaf5e954a1f28.tar.gz
    Revision:          master@sha1:b3396adb98a6a0f5eeedd1a600beaf5e954a1f28
    Size:              14671
    URL:               http://source-controller.kustomize-system.svc.cluster.local./gitrepository/default/podinfo/b3396adb98a6a0f5eeedd1a600beaf5e954a1f28.tar.gz
  Conditions:
    Last Transition Time:  2025-04-15T19:43:01Z
    Message:               stored artifact for revision 'master@sha1:b3396adb98a6a0f5eeedd1a600beaf5e954a1f28'
    Observed Generation:   8
    Reason:                Succeeded
    Status:                True
    Type:                  Ready
    Last Transition Time:  2025-04-15T19:35:36Z
    Message:               stored artifact for revision 'master@sha1:b3396adb98a6a0f5eeedd1a600beaf5e954a1f28'
    Observed Generation:   8
    Reason:                Succeeded
    Status:                True
    Type:                  ArtifactInStorage
  Observed Generation:     8
  Observed Sparse Checkout:
    charts
    ./kustomize/

@dipti-pai dipti-pai force-pushed the git-sparse-checkout branch from 7565fa8 to d1d2461 Compare April 15, 2025 19:58

@stefanprodan stefanprodan left a comment


LGTM

Thanks @dipti-pai 🏅

    - Add `.spec.sparseCheckout` and `.status.observedSparseCheckout` fields to `GitRepository`.
    - Add controller support to send the sparse checkout directories to go-git via pkg methods.
    - Use `.status.observedSparseCheckout` to detect drift in configuration.
    - Trim leading "./" in directory paths.
    - Validate spec configuration by checking directories specified in spec exist in the cloned repository after successful checkout
    - Add tests for testing the observed sparse checkout behavior.
    - Add docs describing the new fields.

Signed-off-by: Dipti Pai <[email protected]>
@dipti-pai dipti-pai force-pushed the git-sparse-checkout branch from d1d2461 to 32e40e0 Compare April 22, 2025 22:09