
Fitting the RAFT model 16GB model on 2080Ti? #3

Open
dhruvmetha opened this issue Apr 5, 2022 · 15 comments

@dhruvmetha

Hey, a question about the optical flow training: was the RAFT model shrunk down in some way to fit on the 11GB GPU that you mention in the paper?

If so, will the code for training the RAFT optical flow model also be released?

@wenbin-lin
Owner

Hi, thanks for your interest!

We only made a few changes to the open-source RAFT implementation to adapt it to RGB-D input; no additional modifications were made for GPU memory.

@dhruvmetha
Author

dhruvmetha commented Apr 7, 2022

Thanks for the response!

Will the code for this RGB-D adaptation be released?
If not, could you give a high-level overview of how I could go about it? I'm trying to replicate the paper for RGB-D inputs using the raw RAFT code. Is it just adding the inverse of the depth as an extra channel to the RGB image, with everything else remaining the same?

This information would be of great help!

@wenbin-lin
Owner

We do not have plans to release the code for RGB-D-based RAFT training for now; it's actually quite simple to implement.
As you mentioned, we just add the inverse of the depth as an extra channel and keep the rest the same.
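
For reference, a minimal sketch of what that adaptation could look like (the helper name and the widened first convolution below are assumptions, not the authors' actual code):

```python
import torch
import torch.nn as nn

def make_rgbd_input(rgb, depth, eps=1e-6):
    """Append inverse depth as a 4th input channel (hypothetical helper).

    rgb:   (B, 3, H, W) float tensor
    depth: (B, 1, H, W) float tensor of metric depth
    """
    inv_depth = 1.0 / depth.clamp(min=eps)  # clamp avoids division by zero
    return torch.cat([rgb, inv_depth], dim=1)  # (B, 4, H, W)

# RAFT's feature and context encoders would then need their first
# convolution widened from 3 to 4 input channels, e.g.:
conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3)
```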

@dhruvmetha
Author

dhruvmetha commented Apr 11, 2022

Thank you! It is mentioned that you retrain on three datasets: Sintel, FlyingThings3D, and Monkaa. Do you train on them successively, in that order, for 100k iterations each? Sorry to be asking so many questions!

@wenbin-lin
Owner

We train the model successively in the order of FlyingThings3D -> Monkaa -> Sintel for 100k iterations each.
If there is any confusion about it, please feel free to let me know.
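
A sketch of that schedule in terms of RAFT's own train.py flags (--stage, --num_steps, and --restore_ckpt exist in stock RAFT; the 'monkaa' stage name and the checkpoint paths are assumptions, since stock RAFT ships no Monkaa loader):

```python
# Print the training commands for the three successive stages.
stages = [
    ("things", 100_000),  # FlyingThings3D
    ("monkaa", 100_000),  # Monkaa (hypothetical stage, needs a loader)
    ("sintel", 100_000),  # Sintel
]

checkpoint = None
for stage, num_steps in stages:
    cmd = f"python -u train.py --name raft-{stage} --stage {stage} --num_steps {num_steps}"
    if checkpoint:
        cmd += f" --restore_ckpt {checkpoint}"
    print(cmd)
    checkpoint = f"checkpoints/raft-{stage}.pth"  # assumed output path
```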

@dhruvmetha
Author

dhruvmetha commented Apr 12, 2022

Do y'all freeze the backbone after training on FlyingThings3D, or just freeze the batchnorm inside the backbone, as is done in the original RAFT paper? Also, do you use the smaller FlyingThings3D dataset (the subset used for DispNet/FlowNet2.0)? Thanks in advance, appreciate the help!

@wenbin-lin
Owner

We follow the RAFT implementation and just freeze the batchnorm after training on FlyingThings3D.
And we use the full FlyingThings3D dataset instead of the smaller subset.
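
Freezing the batchnorm looks roughly like this in PyTorch (a sketch of what RAFT's freeze_bn method does; the stand-in model is only for illustration):

```python
import torch.nn as nn

def freeze_bn(model):
    """Put every BatchNorm2d layer into eval mode so its running
    statistics stop updating during fine-tuning."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.eval()

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))  # stand-in
model.train()     # training mode for everything else
freeze_bn(model)  # re-apply after every call to model.train()
```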

@dhruvmetha
Author

Thanks, this has been really helpful, @wenbin-lin!

@dhruvmetha
Author

Is the equation from disparity to depth depth = (focal_length * baseline) / (image_width * disparity), which is equivalent to (1050 * 1.0) / (960 * disparity) for FlyingThings3D? And is the inverse depth just 1 - depth, where the depth values range from 0 to 1?

@wenbin-lin
Owner

Your equation is right.
The inverse depth is 1 / depth; there can be large depth values in the background, and using the inverse depth stabilizes the values.
In addition, we use a min-max scaler for the inverse depth: x = (x - x_min) / (x_max - x_min). For convenience, you can just use the disparity as the inverse depth, because the values are the same after min-max scaling.
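
A quick numerical check of that equivalence: depth is proportional to 1/disparity, so the inverse depth is a positive scalar multiple of the disparity, and min-max scaling cancels that constant. (The sketch below assumes per-image scaling.)

```python
import numpy as np

def minmax(x):
    return (x - x.min()) / (x.max() - x.min())

rng = np.random.default_rng(0)
disparity = rng.uniform(1.0, 100.0, size=(4, 4))

c = 1050.0 * 1.0        # any positive constant works here
depth = c / disparity   # depth from disparity, per the equation above
inv_depth = 1.0 / depth  # = disparity / c

# min-max scaling removes the constant, so the values are identical
assert np.allclose(minmax(inv_depth), minmax(disparity))
```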

@dhruvmetha
Author

Thank you @wenbin-lin

@dhruvmetha
Author

Do y'all have any rough evaluation results for the optical flow model through each phase of training? This would really help me verify that I'm training the model correctly!

@wenbin-lin
Owner

We are sorry that we lost the training log, but we are retraining the RGB-D-based optical flow model. When the training is done, we will share the evaluation results with you.

A rough conclusion is that the evaluation errors of the RGB-D-based method can be significantly lower than those of the RGB-based method. Perhaps you can compare your results with those of the original RGB-based RAFT; your error should be much lower.

@phamtrongthang123

phamtrongthang123 commented Jul 9, 2022

> Is the equation from disparity to depth depth = (focal_length * baseline) / (image_width * disparity), which is equivalent to (1050 * 1.0) / (960 * disparity) for FlyingThings3D? And is the inverse depth just 1 - depth, where the depth values range from 0 to 1?

> Your equation is right. The inverse depth is 1 / depth, as there can be large depth values in the background and using the inverse depth stabilizes the values. In addition, we use a min-max scaler for the inverse depth: x = (x - x_min) / (x_max - x_min). For convenience, you can just use the disparity as the inverse depth, because the values are the same after min-max scaling.

Wait, it should be focal_length * baseline / disparity, right? Why do we need to multiply by image_width there?
Also, is the scaler applied to each depth map individually, or are x_min and x_max taken over the whole dataset?

@Guptajakala

@wenbin-lin Hi, is there any update on retraining the RGB-D optical flow model? I'm working on a research project and am eager to try your method out!
