Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Trace taint when reporting. #281

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

PurelyApplied
Copy link
Collaborator


Themes of this PR:

  • Introduce taint tracing by converting tainted map[Node]bool to taintedBy map[Node]Node.
    • Plumb necessary arguments through taint signatures
  • Refactor taintCall so that removing taint from Call does not break taint trace.
    • Removing taint from the Call can break the taint chain entirely. Taint now propagates directly between arguments rather than requiring propagation through the Call.
  • Intentionally failing tests demonstrate trace output.
  • Required to graduate from draft:

@PurelyApplied
Copy link
Collaborator Author

PurelyApplied commented Feb 6, 2021

While still a Draft, my key questions are these:

  • The refactor to taintCall to allow for taint propagating "directly" between args is nontrivial. I could split that out to a separate PR, but it would require the plumbing of the taintedBy Node arguments through the signatures that more obviously belongs to this PR. Let me know if this one seems too heavy and I'll split it up.
  • What is a good first step for showing the taint trace? I went with just the Position of each instruction (for those that have them) and filtered out multiple positions reporting the same line in a row. This fits with my workflow in my IDE, but I'd love to hear what would work better / be more informative to a general user. See here for the current failing run demonstrating the new trace.
  • There's obviously a lot of clean-up left to do with all the TODOs and ignored return values. Don't worry about reviewing that sort of thing - really looking for more high-level conversation for the time being.
  • Since this is partially a work-flow concern, you might want to run tests locally and see how the trace feels in your workflow. Get it here: git fetch origin pull/281/head:pr/281 && git checkout pr/281

Copy link
Contributor

@mlevesquedion mlevesquedion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The refactor to taintCall ...

Having this refactor in this PR is fine with me.

What is a good first step for showing the taint trace?

I think showing the positions is a good start. I don't especially like the repetition, however:

analysistest.go:333: levee_analysistest/example/tests/callorder/colocation.go:35:12: unexpected diagnostic: a source has reached a sink
         source: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:31:2
         taints: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:33:17
         taints: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:32:19
         taints: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:34:16
         taints: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:35:13
         sinks at: /home/runner/work/go-flow-levee/go-flow-levee/internal/pkg/levee/testdata/src/levee_analysistest/example/tests/callorder/colocation.go:35:12

Since taint traces are currently confined to a single function, I think we should show the filepath a single time, and we may want to print the function's name. Source locations can then be indicated via just the line and column. (This will break your workflow, though.)

There's obviously a lot of clean-up left to do with all the TODOs and ignored return values ...

Oops. I read this before starting my review and promptly forgot about it while reviewing. Please ignore my comments if they are impertinent.

return recvOk && tbOk && recvVal == tbVal
}

// getCallComponents returns the ssa.Value(s) for the call's receiver and arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: decomposeCall maybe? Feels similar to the utils functions.

// The returned slice is tainted if either the slice argument or the values
// are tainted, so we need to visit the referrers.
prop.taintCallArg(c.Call.Args[0], taintedBy, maxInstrReached, lastBlockVisited)
// The returned slice is taintedBy if either the slice argument or the values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should still say tainted (see also next line).

prop.taint(s.Chan.(ssa.Node), maxInstrReached, lastBlockVisited, false)
// If the sent value (Send) is taintedBy, propagate taint to the channel
case s.Dir == types.SendOnly && taintedSend:
// TODO Does the channel get tainted by the select? Does that make sense?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be preferable to use the value being sent, i.e. s.Send.

return nil, call.Call.Args
}

func taintedByInArgs(taintedBy ssa.Node, args []ssa.Value) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be dead.

@@ -340,7 +391,8 @@ func (prop *Propagation) taintSelect(sel *ssa.Select, maxInstrReached map[*ssa.B
if !ok || e.Index != extractIndex[s] {
continue
}
prop.taint(e, maxInstrReached, lastBlockVisited, false)
// TODO Does the extract get tainted by the select? Does that make sense?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I would use the channel itself, s.Chan.

var taintedSend, taintedChan bool
if s.Send != nil {
_, taintedSend = prop.taintedBy[s.Send.(ssa.Node)]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extraneous blank line here.

// If the sent value (Send) is tainted, propagate taint to the channel
case s.Dir == types.SendOnly && prop.tainted[s.Send.(ssa.Node)]:
prop.taint(s.Chan.(ssa.Node), maxInstrReached, lastBlockVisited, false)
// If the sent value (Send) is taintedBy, propagate taint to the channel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taintedBy should be justed tainted.


func (prop *Propagation) TaintPathTo(n ssa.Node) ([]ssa.Node, bool) {
var path []ssa.Node
var hadJump bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would there be a "jump" in the path?

// Here, we only display the position of nodes with a Pos().
// Additionally, we filter out positions that have the same file and linenumber as the previous.
prevPos := pass.Fset.Position(source.Node.Pos())
for i := len(path) - 2; i > 0; i-- {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: consider not including the root in the path if we're just going to skip it here anyway.

// Additionally, we filter out positions that have the same file and linenumber as the previous.
prevPos := pass.Fset.Position(source.Node.Pos())
for i := len(path) - 2; i > 0; i-- {
if path[i].Pos() == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: prefer token.NoPos instead of 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants