Add AwsDownloadCountService #1451
A version of this change is currently running on staging.open-vsx.org. It updates the counts every minute to make testing easier. Also, processed log files are not yet deleted, but are marked in the database as processed.
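The "mark as processed instead of deleting" approach described above can be sketched as follows. This is only an illustration: the class `ProcessedLogTracker` and its methods are hypothetical names, and an in-memory set stands in for the database table that the actual change uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: remembers which log files were already analysed,
// so that files left in the bucket are not counted twice.
final class ProcessedLogTracker {
    // In-memory stand-in for the database table of processed items (assumption).
    private final Set<String> processed = new HashSet<>();

    // Returns the log files that have not been marked as processed yet,
    // preserving the order of the input list.
    public List<String> selectUnprocessed(List<String> logFileNames) {
        List<String> pending = new ArrayList<>();
        for (String name : logFileNames) {
            if (!processed.contains(name)) {
                pending.add(name);
            }
        }
        return pending;
    }

    // Marks a log file as processed; the file itself is left untouched.
    public void markProcessed(String logFileName) {
        processed.add(logFileName);
    }
}
```

With this shape, every job run can list all files in the bucket and skip the ones already recorded, which keeps frequent runs (such as the one-minute interval on staging) from double counting.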
@autumnfound can you take a look at this PR? @amvanbaren it is quite important to get this reviewed and integrated so that we can switch open-vsx.org over to an official openvsx release asap. We currently run a custom version due to the storage migration. I will also add another PR to utilize a CDN in front of the actual cloud storage provider.
Something that I could not figure out yet: the existing AzureDownloadCountService has a recurring job defined whose name I wanted to update to better reflect that there are now multiple jobs, but changing the job name in the annotation does not update the job in the database. Adding that to the migration script makes tests fail, so for now I have been updating the respective db table manually. Do you have any ideas how this could be made more robust? Is something like that supposed to be in the migration script?
autumnfound left a comment
Overall LGTM, but the lack of comments made this code a bit harder to follow/digest.
server/src/main/java/org/eclipse/openvsx/storage/log/AwsDownloadCountService.java
This PR also contains an optimization for evicting the extension json cache for an extension, see #1392. The current eviction needs to get all version / platform combinations for an extension and evict the key for each combination in the cache, which is pretty slow when you have lots of extensions. The optimization uses a pattern to clean all keys for the extension if a redis cache is used, since redis supports that kind of operation.
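The difference between per-combination eviction and pattern-based eviction can be sketched roughly as below. This is not the code from the PR: the class `ExtensionCacheEviction`, the `namespace/extension/version/platform` key layout, and the in-memory map standing in for the cache are all assumptions made for illustration. In Redis, the equivalent would be matching keys against a pattern (for example with `SCAN ... MATCH`) and deleting the matches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of pattern-based cache eviction for one extension.
final class ExtensionCacheEviction {
    // In-memory stand-in for the extension json cache (assumption).
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Assumed key layout: namespace/extension/version/platform
    public static String key(String namespace, String extension, String version, String platform) {
        return String.join("/", namespace, extension, version, platform);
    }

    public void put(String namespace, String extension, String version, String platform, String json) {
        cache.put(key(namespace, extension, version, platform), json);
    }

    // Evicts every entry of the extension in one pass over the keys, without
    // first looking up all of its version / platform combinations.
    // Returns the number of evicted entries.
    public int evictAllVersions(String namespace, String extension) {
        // The trailing separator keeps "redhat/java" from matching "redhat/javascript".
        String prefix = namespace + "/" + extension + "/";
        int before = cache.size();
        cache.keySet().removeIf(k -> k.startsWith(prefix));
        return before - cache.size();
    }

    public int size() {
        return cache.size();
    }
}
```

The point of the design is that the caller only needs the namespace and extension name; no database query for the list of versions and target platforms is required before evicting.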
Needs some more updates from the version that is currently running on production to speed up analysis of download counts (evicting the cache is so slow that we need to do it in bulk).
Remove unused Shedlock classes
Rename AzureDownloadCountProcessedItem entity to DownloadCountProcessedItem and include a storageType field
…he next job run time
…or all versions of an extension
eac2ba0 to 333124d
This PR adds support for downloading and analysing access logs that Amazon CloudFront delivers to an AWS S3 bucket.
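As a rough illustration of the analysis step, the sketch below counts downloads from the lines of a CloudFront standard access log. CloudFront standard logs start with `#Version` and `#Fields` header lines, and the data lines are tab-separated. Everything else here is an assumption and not taken from the PR: the class name `CloudFrontLogCounter`, the rule that only `GET` requests with status `200` for paths ending in `.vsix` count as downloads, and the omission of fetching and decompressing the log files from S3.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counts successful .vsix downloads per request path.
final class CloudFrontLogCounter {

    public static Map<String, Integer> countDownloads(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> fields = List.of();
        for (String line : lines) {
            if (line.startsWith("#Fields:")) {
                // The header names the columns, separated by spaces.
                fields = Arrays.asList(line.substring("#Fields:".length()).trim().split("\\s+"));
                continue;
            }
            if (line.startsWith("#") || line.isBlank()) {
                continue; // other header lines and empty lines
            }
            int method = fields.indexOf("cs-method");
            int uri = fields.indexOf("cs-uri-stem");
            int status = fields.indexOf("sc-status");
            if (method < 0 || uri < 0 || status < 0) {
                continue; // no usable header seen yet
            }
            String[] values = line.split("\t");
            if (values.length <= Math.max(method, Math.max(uri, status))) {
                continue; // malformed or truncated line
            }
            // Assumed counting rule: successful GET requests for .vsix files.
            if ("GET".equals(values[method])
                    && "200".equals(values[status])
                    && values[uri].endsWith(".vsix")) {
                counts.merge(values[uri], 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Resolving the columns through the `#Fields` header rather than fixed positions keeps the sketch independent of how many fields the log format contains.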
Additionally, the following changes are included:
I decided to remove the AzureDownloadCountProcessItem table instead of renaming and altering it, as this seems simpler and we do not lose any data that is important to keep around, imho.