-
Notifications
You must be signed in to change notification settings - Fork 4
add cloudflare cache purge #638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
aryan-p03
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments about the client setup, logging and keeping dataset-api logic separate from the cloudflare package.
Nice to see testcontainers being used for unit tests
| return err | ||
| } | ||
|
|
||
| func (svc *Service) initCloudflareClient(ctx context.Context) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some of the checks in this function are repeated in initialise.go
| if err := svc.initCloudflareClient(ctx); err != nil { | ||
| log.Error(ctx, "failed to initialise cloudflare client, continuing without cache purging", err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should return an error, if the client fails to setup then the service shouldn't start up
| func (e *Init) DoGetCloudflareClient(ctx context.Context, cfg *config.Configuration) (cloudflare.Clienter, error) { | ||
| if cfg.CloudflareAPIToken == "" || cfg.CloudflareZoneID == "" { | ||
| log.Info(ctx, "cloudflare integration disabled: missing API token or zone ID") | ||
| return nil, nil | ||
| } | ||
|
|
||
| var client cloudflare.Clienter | ||
| var err error | ||
|
|
||
| // for local mock server when SDK is disabled | ||
| if !cfg.EnableCloudflareSDK && cfg.CloudflareAPIURL != "" { | ||
| client, err = cloudflare.New( | ||
| cfg.CloudflareAPIToken, | ||
| cfg.CloudflareZoneID, | ||
| cfg.EnableCloudflareSDK, | ||
| cfg.CloudflareAPIURL, | ||
| ) | ||
| } else { | ||
| client, err = cloudflare.New( | ||
| cfg.CloudflareAPIToken, | ||
| cfg.CloudflareZoneID, | ||
| cfg.EnableCloudflareSDK, | ||
| ) | ||
| } | ||
|
|
||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| return client, nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the setup within the cloudflare package will bring up the errors with missing configs so we don't need to check that here. We just need to check if cloudflare is enabled here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the comment about the mocks on component test setup, we would be able to have a simple step saying something like:
the following URL's should have been purged: followed by a list of the URL's we expect to be purged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the component test setup, can the cloudflare package have mocks generated that can be used here instead of making the mocks here?
| func buildPrefixes(datasetID, editionID string) []string { | ||
| return []string{ | ||
| fmt.Sprintf("www.ons.gov.uk/datasets/%s", datasetID), | ||
| fmt.Sprintf("www.ons.gov.uk/datasets/%s/editions", datasetID), | ||
| fmt.Sprintf("www.ons.gov.uk/datasets/%s/editions/%s/versions", datasetID, editionID), | ||
| fmt.Sprintf("api.beta.ons.gov.uk/v1/datasets/%s", datasetID), | ||
| fmt.Sprintf("api.beta.ons.gov.uk/v1/datasets/%s/editions", datasetID), | ||
| fmt.Sprintf("api.beta.ons.gov.uk/v1/datasets/%s/editions/%s/versions", datasetID, editionID), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous comment, this function should be in the dataset-api and not the SDK package. I would also avoid hard-coded domain names and instead build off config values
| // PurgeCacheByPrefix purges the Cloudflare cache for dataset-related URLs | ||
| func (c *Client) PurgeCacheByPrefix(ctx context.Context, datasetID, editionID string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function shouldn't be in the SDK as it is specific to the dataset-api. The SDK should just accept a list of prefixes which it then purges (so any service can use the function).
The dataset-api should be responsible for building the prefixes and then this list will be passed to the SDK
| if api.cloudflareClient != nil { | ||
| if err := api.cloudflareClient.PurgeCacheByPrefix(ctx, datasetID, edition); err != nil { | ||
| log.Error(ctx, "failed to purge cloudflare cache", err, logData) | ||
| } else { | ||
| log.Info(ctx, "successfully triggered cloudflare cache purge", logData) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the client is nil here then there is no trace and cache will not be cleared which is a big problem for support. Ideally the client should be confirmed as not nil at setup so we don't have the nil check here.
If possible, it would be nice to have the list of URL's purged in the log.Info so we check them in environment testing.
I see that we only have the error log when purge fails as we don't want to block publishing. Might be worth asking Fran if this is ok since the purge can fail but the dataset will be published
| _, err = api.smDatasetAPI.AmendVersion(r.Context(), vars, versionUpdate) | ||
| if err != nil { | ||
| handleVersionAPIErr(ctx, err, w, logData) | ||
| return | ||
| } | ||
|
|
||
| if stateUpdate.State == models.PublishedState && updatedVersion.Distributions != nil && len(*updatedVersion.Distributions) > 0 { | ||
| err = api.publishDistributionFiles(ctx, updatedVersion, logData) | ||
| if stateUpdate.State == models.PublishedState && currentVersion.Distributions != nil && len(*currentVersion.Distributions) > 0 { | ||
| err = api.publishDistributionFiles(ctx, currentVersion, logData) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming the change to _ instead of updatedVersion is due to the linter which looks ok.
However currentVersion is being used on lines 839 and 840 where it was previously updatedVersion. I think it should remain as updatedVersion unless there's an issue I'm not seeing here?
What
Add functionality to purge cloudflare cache on publication of files (via PUT /state endpoint)
Jira - https://officefornationalstatistics.atlassian.net/browse/DIS-4109
How to review
Run the
dataset-cataloguestack and make sure you have the latest version which includes thecloud-flare-stubMake sure you have these env vars set locally -
Create a new dataset and version, and make sure when creating the version that there is a file in the local db that matches the file in the
distributionsarrayUse the PUT
/stateendpoint for the version to transition through the states until thepublishedstateAfter transitioning to the
publishedstate, check the docker logs for dp-dataset-api and cloud-flare-stub and confirm that the cache purge has run successfully. The dataset logs should look something like -You can also test that the real Cloudlfare API is being hit by setting
ENABLE_CLOUDFLARE_SDK=true, following the same steps as above, and then you should get an error in the logs. Something like this -Which is just a result of the api token being incorrect (this will be tested in sandbox with a real api token)
Confirm changes make sense, tests pass
Who can review
Anyone