Implement issue #276 exclude certain URLs from checking, also impleme… #368

Open
wants to merge 7 commits into
base: develop

elebeida

Implements issue #276 (exclude certain URLs from checking) and also adds an option to exclude hosts.

@ascheman
Member

Thanks for your contribution, @elebeida. Could you please check the test coverage (cf. the SonarQube report)? I see some tests, but they are mostly integration tests; the exclusion logic should be covered by unit tests if possible.

Additionally, it would be nice if you could add some documentation for the new parameters (README of the Gradle plugin).

@@ -11,6 +11,9 @@ htmlSanityCheck {

failOnErrors = true

urlsToExclude = [ "https://www.aim42.org/"]
hostsToExclude = [ "www.aim42.org" ]

logger.quiet "HSC version: ${htmlSanityCheckVersion}"
Member

The purpose of the self check is to check the HSC documentation itself.
And it should not exclude essential links in the documentation.

Please drop this.

Author

done! :)

<a href="http://included.com/page">Included Host</a>
</body>
</html>
"""
Member

As far as I can see, these HTML snippets are only used in HtmlSanityCheckTaskFunctionalSpec (unlike the other final statics, which are used in both derived classes). If the code is only used in one derived class, please move it there.

Author

done!

@ascheman
Member

Please rethink the approach, @elebeida, and move towards a regular-expression-based implementation.

Now that I have looked through your implementation and the original requirement of @rdmueller (who comes with a strong docToolchain background), I think that a regular expression is more general, as it can cover URLs as well as hosts.

@elebeida
Author

I have now implemented support for regular expressions, as you asked.

@@ -189,6 +189,28 @@ include::../htmlSanityCheck-core/src/main/java/org/aim42/htmlsanitycheck/tools/W
The lists shown above are the default HTTP response codes handled by HSC.
The mentioned configurations effectively move the configured codes around, i.e., if you add `308` to `httpErrorCodes` it is automatically removed from its default list (`httpWarningCodes`).
****
`urlsToExclude` (optional):: A list of URLs that should be excluded from the sanity check. These URLs can be written as regular expressions.
Member

Thanks for the update, but did you see my general comment about URLs to exclude vs. Hosts to exclude, @elebeida?

I think we could cover both with just one regex-based property, e.g., exclude.
By using regular expressions, it could contain full (or even partial) URLs (with scheme, port, etc.) as well as simple host names.

For example,

  • exclude=^.*internal\.example-to-exclude\.com.+ would exclude everything from this host, while
  • exclude=^https?://internal\.example-to-exclude\.com.* would exclude only http- and https-based URLs, while still allowing other schemes like ftp (if that makes sense at all).
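
A minimal Java sketch of how such a single regex-based property could be evaluated, assuming every configured pattern has to match the whole URL (`Matcher.matches()`). The class `ExcludeSketch` and its method `isExcluded` are hypothetical names used for illustration only; they are not part of HSC.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper (not HSC code): holds the compiled exclude patterns
// and decides whether a given URL should be skipped by the link check.
final class ExcludeSketch {
    private final List<Pattern> patterns;

    ExcludeSketch(List<String> regexes) {
        // Compile once up front; an invalid regex fails here with PatternSyntaxException.
        this.patterns = regexes.stream()
                .map(Pattern::compile)
                .collect(Collectors.toList());
    }

    // Assumption: a URL is excluded if at least one pattern matches the entire URL.
    boolean isExcluded(String url) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ExcludeSketch httpOnly = new ExcludeSketch(
                Arrays.asList("^https?://internal\\.example-to-exclude\\.com.*"));
        System.out.println(httpOnly.isExcluded("https://internal.example-to-exclude.com/page"));
        System.out.println(httpOnly.isExcluded("ftp://internal.example-to-exclude.com/file"));
    }
}
```

Note that under this full-match assumption, the trailing `.+` in the first pattern requires at least one character after the host name, so a URL that ends directly after `.com` would not be excluded by it.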

Author

Yes, but I might have misunderstood it. I will try again. Thanks for the explanation :)

Author

I have updated the code now.
