Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-10247 Guard FTDC net dereference #4849

Merged
merged 4 commits into from
Mar 13, 2025

Conversation

benjirewis
Copy link
Member

@benjirewis benjirewis commented Mar 12, 2025

RSDK-10247

I think netStatser.fs.NetUDPSummary() can return nil and cause a npe in some circumstances (after robot close maybe?) Guarded the one actual dereference (the calls above don't cause issue.)

I saw this stack trace from a new test I was writing:

goroutine 236 [running]:
runtime/debug.Stack()
	/usr/local/go/src/runtime/debug/stack.go:26 +0x67
runtime/debug.PrintStack()
	/usr/local/go/src/runtime/debug/stack.go:18 +0x1d
go.viam.com/utils.PanicCapturingGoWithCallback.func1.1()
	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:146 +0x65
panic({0x35e85a0?, 0x53a1280?})
	/usr/local/go/src/runtime/panic.go:785 +0x132
go.viam.com/rdk/ftdc/sys.(*netStatser).Stats(0xc000cb8678)
	/host/ftdc/sys/net.go:70 +0x23c
go.viam.com/rdk/ftdc.(*FTDC).constructDatum(0xc000396600)
	/host/ftdc/ftdc.go:300 +0x1f8
go.viam.com/rdk/ftdc.(*FTDC).statsReader(0xc000396600, {0x3f65e80, 0xc001272230})
	/host/ftdc/ftdc.go:215 +0x4c
go.viam.com/utils.NewStoppableWorkerWithTicker.func1()
	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/stoppable_workers.go:66 +0x192
go.viam.com/utils.PanicCapturingGoWithCallback.func1()
	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:156 +0x82
created by go.viam.com/utils.PanicCapturingGoWithCallback in goroutine 46
	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:143 +0xd1
2025-03-12T18:46:12.623Z	ERROR	[email protected]/runtime.go:147	panic while running function	{"error": "runtime error: invalid memory address or nil pointer dereference"}

@benjirewis benjirewis requested a review from dgottlieb March 12, 2025 18:54
@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Mar 12, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 12, 2025
@benjirewis
Copy link
Member Author

@dgottlieb lint started complaining about NPEs for all the other fields once I started checking that Drops one 😆 . Added a broader check (and included the tcp section, too.)

ftdc/sys/net.go Outdated
@@ -57,13 +57,13 @@ func (netStatser *netStatser) Stats() any {
}
}

if netTCPSummary, err := netStatser.fs.NetTCPSummary(); err == nil {
if netTCPSummary, err := netStatser.fs.NetTCPSummary(); err == nil && netTCPSummary != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check the source code to see if it's possible for NetTCPSummary() to return a nil value and* a nil error? I'm not keen on accepting this change -- I prefer to be more explicit with a nolint if the linter is actually wrong. To avoid confusing the reader.

ftdc/sys/net.go Outdated
ret.TCP.TxQueueLength = netTCPSummary.TxQueueLength
ret.TCP.RxQueueLength = netTCPSummary.RxQueueLength
ret.TCP.UsedSockets = netTCPSummary.UsedSockets
}

if netUDPSummary, err := netStatser.fs.NetUDPSummary(); err == nil {
if netUDPSummary, err := netStatser.fs.NetUDPSummary(); err == nil && netUDPSummary != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question/concern

@benjirewis
Copy link
Member Author

Yeah I'll take a closer look.

@benjirewis benjirewis changed the title Guard FTDC net dereference RSDK-10247 Guard FTDC net dereference Mar 12, 2025
@benjirewis
Copy link
Member Author

I took a closer look: neither NetTCPSummary nor NetUDPSummary can return nil as their summary value if err == nil. The panic actually occurred because netUDPSummary.Drops == nil. The line *netUDPSummary.Drops gets the pointer to Drops and dereferences that (i.e. the . operator has higher precedence than the * operator in Golang.) I misunderstood what Golang was doing. I now only guard the setting of Drops, and I added some parentheses to the object of dereference more obvious.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 13, 2025
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 13, 2025
@benjirewis benjirewis merged commit 98fb4bb into viamrobotics:main Mar 13, 2025
16 checks passed
@benjirewis benjirewis deleted the npe-ftdc-net branch March 13, 2025 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants