@redblackcoder redblackcoder commented Oct 24, 2025

Summary

Spotless is currently configured to run on generated code in the repo. This seems unnecessary and makes spotlessCheck and spotlessApply slow to run.

This PR stops the spotlessCheck task from triggering code generation, which cuts its execution time by 93.5% (from 1m 33s to 6s).

Performance Metrics

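| Metric         | Before    | After     | Improvement                  |
|----------------|-----------|-----------|------------------------------|
| Build Time     | 1m 33s    | 6s        | 15.5x faster                 |
| Total Tasks    | 521 tasks | 282 tasks | -239 tasks (46% reduction)   |
| Tasks Executed | 516       | 280       | -236 executions              |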

Key Changes Eliminated

Code Generation Tasks No Longer Triggered

The following expensive code generation tasks no longer run during spotlessCheck:

  • :metadata-models:generateDataTemplate - Processing 610 schema files
  • :metadata-models:generateAvroSchema - Processing 610 schema files
  • :metadata-models:generateJsonSchema - JSON schema generation
  • :metadata-models:openApiGenerate - OpenAPI code generation
  • :metadata-models:compileMainGeneratedDataTemplateJava - Compiling generated code
  • :datahub-graphql-core:graphqlCodegen - Generating 1079 GraphQL classes
  • :metadata-service:openapi-analytics-servlet:openApiGenerate
  • :metadata-service:openapi-entity-servlet:openApiGenerate
  • Multiple compilation and resource processing tasks for generated sources

Changes Made

  1. Excluded metadata-models from spotless - This project only contains generated code
  2. Removed backwards dependencies - Deleted spotlessJava.dependsOn on generation tasks in:
    • metadata-service/restli-servlet-impl/build.gradle
    • test-models/build.gradle
    • entity-registry/build.gradle
    • entity-registry/custom-test-model/build.gradle
  3. Updated spotless exclusions - Removed exclusion patterns that are no longer needed
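
As a rough illustration of the changes listed above, a module's Spotless configuration could look something like the sketch below. This is not the actual diff from this PR: the target and exclusion globs, the choice of googleJavaFormat(), and the task name in the commented-out line are assumptions made for the example.

```groovy
// build.gradle of a module with hand-written Java sources (illustrative sketch).
// Assumes the 'java' and 'com.diffplug.spotless' plugins are already applied
// to this project, for example from the root build script.

spotless {
  java {
    // Check only sources that live under src/, never build output.
    target 'src/*/java/**/*.java'

    // Hypothetical exclusion patterns for generated code; the real
    // patterns in the repo may differ.
    targetExclude 'build/**', 'src/*/generated/**'

    // Assumed formatter step; substitute whatever the project actually uses.
    googleJavaFormat()
  }
}

// The kind of "backwards dependency" that was deleted: it forced code
// generation to run before formatting. The task name here is illustrative.
// spotlessJava.dependsOn generateDataTemplate
```

One way to confirm that no generation tasks remain in the task graph is to run `./gradlew spotlessCheck --dry-run`, which lists the tasks Gradle would execute without running them.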

Result

spotlessCheck now runs only on actual source files and does not trigger any code generation pipelines, making the development workflow significantly faster for code quality checks.

@github-actions github-actions bot added devops PR or Issue related to DataHub backend & deployment community-contribution PR or Issue raised by member(s) of DataHub Community labels Oct 24, 2025

codecov bot commented Oct 24, 2025

Bundle Report

Changes will increase total bundle size by 27.3kB (0.1%) ⬆️. This is within the configured threshold ✅

Detailed changes
| Bundle name           | Size   | Change            |
|-----------------------|--------|-------------------|
| datahub-react-web-esm | 28.6MB | 27.3kB (0.1%) ⬆️  |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

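| Asset Name                | Size Change | Total Size | Change (%) |
|---------------------------|-------------|------------|------------|
| assets/index-*.js         | -2.53kB     | 18.94MB    | -0.01%     |
| assets/sample-*.png (New) | 29.82kB     | 29.82kB    | 100.0% 🚀  |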

@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Oct 24, 2025