feat(ptv3): add a lidar segmentation model with onnx support #45
…ve more and awml-fy it (can train/test) Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
…neralize yet. no idea how many errors will appear in tensorrt yet Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
- limited range on eval - used max spatial shape throughout the network for tensorrt generalization. inference may have changed somewhat so may need to retrain Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
…code Signed-off-by: Kenzo Lobos-Tsunekawa <[email protected]>
Memo |
@amadeuszsz The one you provided is 5cm per voxel, but for "real time" I recommend the 10cm one |
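To make the 5 cm versus 10 cm trade-off concrete, here is a minimal sketch that counts the cells of the dense voxel grid for each resolution. The point cloud range `PC_RANGE` is a hypothetical value chosen for illustration, not the range used in this PR, and the dense cell count is only an upper bound, since a sparse network processes occupied voxels only.

```python
import math


def grid_shape(pc_range, voxel_size):
    """Number of voxels along x, y, z for an axis-aligned range in metres."""
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    extents = (x_max - x_min, y_max - y_min, z_max - z_min)
    # round() before ceil() guards against floating-point noise in the division.
    return tuple(math.ceil(round(e / voxel_size, 6)) for e in extents)


def voxel_count(pc_range, voxel_size):
    """Total number of cells in the dense grid (upper bound on occupied voxels)."""
    nx, ny, nz = grid_shape(pc_range, voxel_size)
    return nx * ny * nz


# Hypothetical range: (x_min, y_min, z_min, x_max, y_max, z_max).
PC_RANGE = (-50.0, -50.0, -2.0, 50.0, 50.0, 6.0)

fine = voxel_count(PC_RANGE, 0.05)    # 5 cm voxels
coarse = voxel_count(PC_RANGE, 0.10)  # 10 cm voxels
print(f"5 cm grid has {fine // coarse}x the cells of the 10 cm grid")
```

Halving the voxel edge multiplies the dense grid by eight, which is one reason the coarser model is the safer choice for real-time use.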
@knzo25 |
Great PR overall, but we couldn't test it because some files are missing.
Unfortunately, @knzo25 is not available to look into it, so we may have to dig deeper into this issue ourselves (if time can be allocated).
projects/PTv3/models/point_transformer_v3/point_transformer_v3m1_base.py
Hi @amadeuszsz @knzo25 |
The code changes attached in the review solve most of the issues and I can push them. However, the issue regarding EDIT: |
Signed-off-by: Amadeusz Szymko <[email protected]>
Thanks @amadeuszsz |
@KSeangTan |
For now, we also have another issue: we can export to ONNX, but when our ROS node builds the engine, the TRT backend somehow assigns a static shape to the input tensors, even though I can see correctly defined dynamic axes in the ONNX file. The static size coincides with one of the GEMM block constants (~160k). I believe the ONNX backend takes the concrete value from the sample input data and bakes it into the graph as a Constant node. Then in TRT:
which in turn results in Nx161089 input tensor shapes. For now we can't deploy the ONNX model properly, so the ROS node cannot be merged either... I will try to find the root cause. Edit: Fixed. I had accidentally used the wrong spconv implementation. Now I just need to make this project able to train and export without code modifications. I am also testing a fix for the crash during training. |
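A quick way to spot this kind of baked-in dimension before building the engine is to list which input axes of the exported model are fixed integers. The sketch below is only a diagnostic under assumptions: the helper names `onnx_input_shapes` and `static_dims` are hypothetical, and it is not the fix applied in this PR (the fix was switching to the correct spconv implementation).

```python
def onnx_input_shapes(path):
    """Map each graph input name to its dims: str for dynamic axes, int for static ones."""
    import onnx  # imported lazily so static_dims() works without onnx installed

    model = onnx.load(path)
    shapes = {}
    for inp in model.graph.input:
        dims = []
        for d in inp.type.tensor_type.shape.dim:
            if d.HasField("dim_param"):
                dims.append(d.dim_param)   # symbolic name, i.e. a dynamic axis
            elif d.HasField("dim_value"):
                dims.append(d.dim_value)   # fixed size baked into the model
            else:
                dims.append(None)          # unknown dimension
        shapes[inp.name] = dims
    return shapes


def static_dims(shape):
    """Indices of the axes whose size is a fixed integer."""
    return [i for i, d in enumerate(shape) if isinstance(d, int)]


# Example with a shape like the one reported above: dynamic N, static 161089.
print(static_dims(["N", 161089]))
```

Running `onnx_input_shapes` on the exported file and passing each shape to `static_dims` shows at a glance whether an axis that should follow the number of points has been frozen to a constant.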
LGTM!
Note:
- We can deploy the model.
- We still have an issue with NaNs during training, which later causes the training loop to crash. This issue is under investigation and I hope we can address it soon.
- Code cleanup will follow once the NaN issue is fixed.
- Need to add dataset description.
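While the NaN issue is being investigated, a common stopgap is to skip the optimizer update whenever the loss is not finite and to abort only after several consecutive bad steps. The sketch below illustrates that idea under assumptions: `NonFiniteLossGuard` is a hypothetical helper, not code from this PR, and it does not address the root cause, which is not stated here.

```python
import math


class NonFiniteLossGuard:
    """Tells the training loop whether to apply an update for a given loss value."""

    def __init__(self, max_consecutive=10):
        self.max_consecutive = max_consecutive
        self.consecutive = 0  # number of non-finite losses seen in a row

    def should_update(self, loss_value):
        if math.isfinite(loss_value):
            self.consecutive = 0
            return True
        self.consecutive += 1
        if self.consecutive > self.max_consecutive:
            raise RuntimeError(
                f"{self.consecutive} consecutive non-finite losses, aborting"
            )
        return False  # skip this step instead of propagating NaN/Inf


# Usage: call should_update(float(loss)) before the backward pass and optimizer step.
guard = NonFiniteLossGuard(max_consecutive=2)
for loss in (0.7, float("nan"), 0.6):
    print(loss, guard.should_update(loss))
```

Skipping a step keeps the weights finite, but the skipped batches are worth logging, since they often point at the samples or layers that produce the NaNs.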
LGTM overall, let's approve and merge first
Summary
This PR ports Pointcept's PTv3 with the following features:
Change point
Same as the summary
Note
Since the ONNX-compatible spconv had to be modified, BEVFusion and other spconv-dependent modules should from now on be trained with spconv instead of mmcv's implementation.
Test performed
Before NaN fix
Logs [TIER IV INTERNAL LINK]
After NaN fix
Logs [TIER IV INTERNAL LINK]