
✨ Support more SAST tools #1487

Closed
wants to merge 11 commits

Conversation

@laurentsimon commented Jan 18, 2022

This PR addresses #1420, where we came to the conclusion that we want more granularity in the SAST check.
This PR does not contain unit tests yet, because there's a lot to discuss and I want to see if we're all OK with these changes.

WARNING: please ignore the comments and the calls to Println(): I will clean them up once we have agreed everything else is fine.

This PR looks for:

  • Linter tools run on each PR; 3 points are awarded if that's the case
  • Code analysis tools run on each PR; 4 points are awarded if that's the case
  • Supply-chain analysis tools run on each PR; 3 points are awarded if that's the case. Since scorecard is the only supply-chain tool at the moment and we don't advertise PR support, we give 3 points if scorecard is enabled, even if it is not run on PRs.

This PR gives no additional points if a tool is declared in a workflow but not run on PRs - with the exception of scorecard above. A rough sketch of this scoring is shown below.

closes #1380
closes #1420
closes #1580
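
To make the proposed scoring concrete, here is a minimal sketch in Go (not the check's actual implementation; the boolean flags are hypothetical placeholders for the detection logic):

// Sketch of the scoring described above (not the PR's implementation).
// Each boolean stands in for the corresponding detection logic.
func proposedSASTScore(linterOnAllPRs, codeAnalysisOnAllPRs, scorecardEnabled bool) int {
	score := 0
	if linterOnAllPRs {
		score += 3 // linter tools run on each PR
	}
	if codeAnalysisOnAllPRs {
		score += 4 // code analysis tools run on each PR
	}
	if scorecardEnabled {
		score += 3 // scorecard counts even when not run on PRs
	}
	return score // one linter + one code analysis tool on PRs + scorecard = 10
}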

@@ -17,7 +17,6 @@ package checks
import (
Contributor Author

ignore this file, it's mostly typo changes.

return nil, err
}

req, err := handler.client.NewRequest("GET", u, nil)

Contributor

Could you file an issue and add a TODO comment here to point to it?


// Workflow represents a workflow.
type Workflow struct {
ID int64
Contributor Author

need discussion: pointer or not?

Contributor

Yes, let's use pointer (and also below for *string)

@naveensrinivasan left a comment

Great, But we need tests!

@laurentsimon commented Jan 18, 2022

Great, But we need tests!

There's so much to discuss before I add the tests. I'll add them at the end.

@laurentsimon

cc @evverx

@evverx commented Jan 19, 2022

Linter tools run on each PR; 3 points are awarded if that's the case
Supply-chain analysis tools run on each PR; 3 points are awarded if that's the case

I don't think that linters and supply-chain tools (combined) should outweigh SAST tools like LGTM, CodeQL or Coverity so it seems to me that they should be downgraded to 1 or something like that. I mean, systemd for example uses superlinter and it finds issues from time to time but I'm not sure it would be fair to say that in terms of security it's almost as important as LGTM or Coverity. It wouldn't be fair if projects using only LGTM received 4 while projects using only superlinter received 3 either.

I'll try to take a closer look later. Thanks!

@evverx commented Jan 19, 2022

This PR gives no additional points if a tool is declared in a workflow but not run on PRs

It's not always feasible to run SAST tools on every PR, but if they are run daily or weekly it's still useful and should be rewarded, I think. For example, it takes CodeQL about 40 minutes to analyze systemd, so it doesn't make much sense to clog its GHActions pool on every PR, but it's still run daily.

@evverx commented Jan 19, 2022

systemd runs LGTM and superlinter on every PR, and CodeQL and Coverity Scan daily, and with this PR applied its score went from 10 to 6. I'm not sure that's what it should get :-)

@evverx commented Jan 19, 2022

FWIW, the curl score went from 7 to 0 with this PR applied.

Anyway, on the whole I like the idea of putting various tools into separate groups and assigning different "scores" to them, though the way those "scores" are assigned depending on the group, frequency, and so on needs tweaking, I think. Thanks!

@azeemshaikh38

Will take a closer look at this tomorrow. Overall LGTM.

@laurentsimon

Thanks for the feedback @evverx. I totally agree we need to tweak the score, and also that tools that run on commits/schedules should be rewarded. Maybe give 70% of the points if run on commits vs PRs. Feel free to suggest.

@laurentsimon commented Jan 19, 2022

systemd runs LGTM and superlinter on every PR, and CodeQL and Coverity Scan daily, and with this PR applied its score went from 10 to 6. I'm not sure that's what it should get :-)

You'd actually get a 10 if you install scorecard as an action. It suffices for one linter and one static code analysis tool (LGTM in your case) to be run on all PRs, plus scorecard installed, to get 10 - we don't advertise scorecard on PRs because it requires a (read) PAT to be made available on PRs atm. That said, I don't disagree that we need to reward tools run on commits or on schedules.

@evverx commented Jan 19, 2022

Maybe give 70% of the points if run on commits vs PRs

I think it depends on how frequent those runs are. If projects run SAST tools daily (or a few times a day) I think they should be scored higher than projects with weekly or monthly runs.

you'd actually get a 10 if you install scorecard as an action.

I get that :-), but it's complicated: on the one hand I think scorecard is helpful, but on the other hand I don't want to promote OSSF. Other than that, looking at a couple of checks that have already been introduced and issues where new checks are outlined, it seems to me that scorecard is moving towards "consumers" instead of "producers" (which is understandable), but I'm not sure it will make sense for "producers" to keep it upstream at some point (time will tell, I think). Just to clarify, I have absolutely no problem with this direction, but tools catered to "consumers" somehow shift things that should be done by "consumers" to "producers" (the latest example would be google/oss-fuzz#7146), and generally I wouldn't want to annoy maintainers.

Conclusion string
URL string
App CheckRunApp
Name string
Contributor

Make Name a *string?

Member

Can you please add the reason as to why you would like to change to a pointer?

Every change requested should have a reason as to why it is requested. It helps others understand the thought process and the reasoning. This is applicable across all these comments.

Pointer semantics have drawbacks, so we have to be clear on why we need them. Go doesn't encourage pointer semantics; most of the standard library uses value semantics.

Contributor

Sure. As part of #1242, we are in the process of migrating the current data structure to a more consistent format. The consistent formatting lets us use a convention that implies nil == no data.
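
For clarity, here is a minimal sketch of what the nil == no data convention looks like with pointer fields; the type and field names below are illustrative stand-ins, not the PR's actual definitions:

// Illustrative only: pointer fields let nil mean "the API returned no data",
// which a plain string cannot distinguish from an empty value.
type checkRunExample struct {
	Conclusion *string // nil => no conclusion reported
	Name       *string // nil => field absent, "" => present but empty
}

func conclusionOrDefault(cr checkRunExample) string {
	if cr.Conclusion == nil {
		return "unknown" // no data
	}
	return *cr.Conclusion
}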

Contributor Author

If I make Name a pointer, shall I make everything else one too then?

App CheckRunApp
Name string
CheckSuiteID *int64
// PullRequests []PullRequest
Contributor

Remove comment?



}, nil
}

func addOptions(s string, opts interface{}) (string, error) {
Contributor

Does opts need to be an interface{}? Can it simply be *ListWorkflowRunOptions?

Member

Why pointer semantics (*ListWorkflowRunOptions)? I would recommend value semantics.

Contributor Author

I made it a pointer. Pointers are better for large structures, right? Otherwise it's copying data around for calls, etc., I think.
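
To make the trade-off being discussed concrete, here is a small sketch of the two signatures; the shape of ListWorkflowRunOptions is assumed for illustration and the bodies are stubs, not the PR's code:

// Assumed shape, for illustration only.
type listWorkflowRunOptionsExample struct {
	Branch string
	Status string
}

// Value semantics: the callee gets a copy, so callers can't be surprised
// by mutation; the zero value naturally means "no options".
func addOptionsValue(s string, opts listWorkflowRunOptionsExample) (string, error) {
	// ... encode opts as query parameters onto s ...
	return s, nil
}

// Pointer semantics: avoids copying a large struct and lets nil mean
// "no options", at the cost of shared mutable state.
func addOptionsPointer(s string, opts *listWorkflowRunOptionsExample) (string, error) {
	if opts == nil {
		return s, nil
	}
	// ... encode *opts as query parameters onto s ...
	return s, nil
}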


@azeemshaikh38

LGTM. Two open issues would be unit tests and addressing @evverx's comment.

@naveensrinivasan left a comment

Thanks. Few comments.

@@ -298,6 +298,10 @@ func IsWorkflowFile(pathfn string) bool {
}
}

func IsWorkflowFileCb(filename string) (bool, error) {
Member

Why have an error as part of the return for a func that never returns one?

Member

Could we remove the error?

One-line Go funcs are discouraged. Is there a specific reason to have this in a separate func?
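
A minimal sketch of what the suggestion would look like, assuming IsWorkflowFile keeps the signature shown in the hunk above (the body here is a simplified stand-in, not the real check):

// Simplified stand-in for the real IsWorkflowFile from the hunk above.
func isWorkflowFileExample(pathfn string) bool {
	return pathfn != "" // placeholder logic only
}

// The suggested shape: no error in the return, since none is ever produced.
func isWorkflowFileCbExample(filename string) bool {
	return isWorkflowFileExample(filename)
}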

checkRun.CheckSuite.ID != nil {
cr.CheckSuiteID = checkRun.CheckSuite.ID
}
/*
Member

Could we please remove this commented code?

WorkflowID: workflowRun.WorkflowID,
}

/*prs := workflowRun.PullRequests
Member

Could we please remove this commented code?

@@ -35,6 +35,8 @@ type RepoClient interface {
ListReleases() ([]Release, error)
ListContributors() ([]Contributor, error)
ListSuccessfulWorkflowRuns(filename string) ([]WorkflowRun, error)
ListWorkflowRuns(opts *ListWorkflowRunOptions) ([]WorkflowRun, error)
Member

Could this not be a pointer (*ListWorkflowRunOptions)? Makes the API cleaner.

@github-actions

Stale pull request message

@laurentsimon

Update: I've now tweaked the logic for score computation as follows (rough sketch below):

  • Linters are run on all PRs. Linters are cheap, so this seems acceptable. 1 point is awarded if that's the case.
  • Supply-chain tools are used. We don't check if they're run on all PRs. So long as one is defined in a workflow and/or run on at least one commit, we award 2 points.
  • Static analysis. Same as for supply-chain, and we award 5 points. If one tool is run on all PRs, we give an additional 2 points.

Wdut?
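
Rough sketch of the revised scoring above; as before, the boolean flags are hypothetical placeholders for the detection logic, not the PR's code:

// Sketch of the revised scoring (not the PR's implementation).
func revisedSASTScore(linterOnAllPRs, supplyChainToolUsed, staticAnalysisUsed, staticAnalysisOnAllPRs bool) int {
	score := 0
	if linterOnAllPRs {
		score += 1 // linters run on all PRs
	}
	if supplyChainToolUsed {
		score += 2 // defined in a workflow and/or run on at least one commit
	}
	if staticAnalysisUsed {
		score += 5 // static analysis tool in use
		if staticAnalysisOnAllPRs {
			score += 2 // bonus when it runs on all PRs
		}
	}
	return score // maximum 1 + 2 + 5 + 2 = 10
}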

@evverx commented Feb 3, 2022

Supply-chain tools are used. We don't check if they're run on all PRs. So long as one is defined in a workflow and/or run on at least one commit, we award 2 points.

I think supply chain tools should follow the same rules as static analyzers in the sense that if they can't be run on PRs they should be downgraded to 1. It should help to prompt the developers of those tools to make them compatible with the PR workflow (which is supposed to catch issues as early as possible).

@laurentsimon

Supply-chain tools are used. We don't check if they're run on all PRs. So long as one is defined in a workflow and/or run on at least one commit, we award 2 points.

I think supply chain tools should follow the same rules as static analyzers in the sense that if they can't be run on PRs they should be downgraded to 1. It should help to prompt the developers of those tools to make them compatible with the PR workflow (which is supposed to catch issues as early as possible).

good point. I'll do that then. Thank you for your feedback!

@laurentsimon

Thoughts on rate limiting:

https://docs.github.com/en/rest/overview/resources-in-the-rest-api#requests-from-github-actions

When using GITHUB_TOKEN, the rate limit is 1,000 requests per hour per repository. For requests to resources that belong to an enterprise account on GitHub.com, GitHub Enterprise Cloud's rate limit applies, and the limit is 15,000 requests per hour per repository.

That's fine since each commit should have a different token.

For PAT, https://docs.github.com/en/developers/apps/building-github-apps/rate-limits-for-github-apps

GitHub Apps making server-to-server requests use the installation's minimum rate limit of 5,000 requests per hour. If an application is installed on an organization with more than 20 users, the application receives another 50 requests per hour for each user. Installations that have more than 20 repositories receive another 50 requests per hour for each repository. The maximum rate limit for an installation is 12,500 requests per hour.

For a repository like systemd/systemd, we'd need 30*6 = ~200 API requests, so let's say 300 for each push. 10-15 commits per hour should be fine.

FYI, we hope to be able to support GITHUB_TOKEN in the future. For the cron job, it's highly unlikely we'll ever be able to run the SAST check if we apply the changes proposed here.

@azeemshaikh38 any further thoughts?
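
As a quick sanity check on the arithmetic above (a sketch only: the 300-requests-per-push figure is the rough estimate from this comment, and the 5,000/hour figure is the GitHub App installation minimum quoted earlier):

// Back-of-the-envelope check of the request budget discussed above.
func pushesPerHourWithinBudget() int {
	const requestsPerPush = 300 // rough estimate for a repo like systemd/systemd
	const hourlyLimit = 5000    // GitHub App installation minimum, per hour
	return hourlyLimit / requestsPerPush // 16, consistent with "10-15 commits per hour"
}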


@laurentsimon

@azeemshaikh38 take a final look before I merge. Ignore the comments, I'm going to remove them before the merge.

@evverx commented Apr 15, 2022

FWIW, looking at #1816, I think by that logic the SAST check should be inconclusive as well when scorecard can't find static analyzers it's aware of. Coverity is often hidden in bash scripts sending data to the scanner, and it's unlikely that scorecard can ever detect that reliably.

@azeemshaikh38

@laurentsimon are we blocked on anything to get this merged?

@laurentsimon

No real blocker, mostly delayed to fix the e2e tests.

naveensrinivasan added a commit that referenced this pull request May 17, 2022
- removed the old json format from cron
fix #1487

Signed-off-by: naveensrinivasan <[email protected]>
@laurentsimon reopened this May 18, 2022
@laurentsimon

Please don't close, I'm going to get to this PR sometime after I've migrated other checks to raw results.

@naveensrinivasan

Please don't close, I'm going to get to this PR sometime after I've migrated other checks to raw results.

I apologize, it was a mistake.

@laurentsimon commented Jun 2, 2022

Please don't close, I'm going to get to this PR sometime after I've migrated other checks to raw results.

I apologize, it was a mistake.

np, happens to me all the time :)

@naveensrinivasan

@laurentsimon What do we want to do about this PR?

@laurentsimon

Let's keep it open if you don't mind. The code uses too many API calls, but I think if we relax some of the checks, a large part of the code can be re-used. I'm not working on it atm, though.

@naveensrinivasan

Let's keep it open if you don't mind. The code uses too many API calls, but I think if we relax some of the checks, a large part of the code can be re-used. I'm not working on it atm, though.

OK, sounds good! Thanks

@naveensrinivasan

Hi @laurentsimon,

I wondered if keeping this Pull Request open would be beneficial, considering it is over a year old.

@naveensrinivasan

Hi @laurentsimon,

I wondered if keeping this Pull Request open would be beneficial, considering it is over a year old.

A friendly ping @laurentsimon

@laurentsimon

Hi @laurentsimon,
I wondered if keeping this Pull Request open would be beneficial, considering it is over a year old.

A friendly ping @laurentsimon

Up to you. There's a lot of useful code someone may be able to re-use to implement the SAST check. But we can close it and search for it if need be.

@naveensrinivasan

Hi @laurentsimon,
I wondered if keeping this Pull Request open would be beneficial, considering it is over a year old.

A friendly ping @laurentsimon

Up to you. There's a lot of useful code someone may be able to re-use to implement the SAST check. But we can close it and search for it if need be.

Thanks for keeping it clean. I will close it.
