The latency I am getting here and the actual time I measure when running inference are not the same; there is a huge difference between the two. What could be the problem?
I ran the tool from within a running Slurm job. The actual time, by contrast, was measured by loading the model and timing it with Python's time module, from when a prompt is given until decoding finishes. All of this was done on a CPU, not a GPU.
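For reference, here is a minimal sketch of the kind of manual timing described above. It keeps model loading separate from the prompt-to-decode time, since a number that includes loading may not be comparable to a pure inference latency. `load_model` and `generate` are hypothetical stand-ins, not the real model or the tool's API; the warmup and repeat counts are assumptions as well.

```python
import time


def time_call(fn, *args, warmup=1, repeats=5):
    """Return (mean_seconds, last_result) for fn(*args).

    Runs `warmup` untimed calls first, then averages `repeats` timed calls.
    """
    for _ in range(warmup):
        fn(*args)
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations), result


# Hypothetical stand-in for loading a model on CPU.
def load_model():
    time.sleep(0.05)  # simulated load cost
    return lambda prompt: prompt.upper()


# Hypothetical stand-in for prompt -> decoded output.
def generate(model, prompt):
    time.sleep(0.01)  # simulated inference cost
    return model(prompt)


# Time loading on its own so it is not mixed into the inference latency.
start = time.perf_counter()
model = load_model()
load_seconds = time.perf_counter() - start

mean_seconds, output = time_call(generate, model, "hello")

print(f"load: {load_seconds:.3f} s")
print(f"mean inference: {mean_seconds:.3f} s")
```

Reporting the two numbers separately makes it easier to see which part of the manual measurement differs from what the tool reports.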