Due to hardware limitations, I used Llama-3.1-1B as the draft model and Llama-3.1-13B as the target model; why is the resulting generation speed slower than the baseline?