Can't improve the scaler of batch size with ZeRO technique #1884
              
Unanswered
Lyn-Lucy asked this question in Community | Q&A
            Replies: 1 comment
Hi, ZeRO has its own AMP. Do NOT use autocast or a grad scaler.
Here is my training code, based on the ZeRO example in [ColossalAI-Examples/train_v2.py at main · hpcaitech/ColossalAI-Examples (github.com)](https://github.com/hpcaitech/ColossalAI-Examples/blob/main/features/zero/train_v2.py)
However, with ZeRO I can only fit a batch size of 1 per GPU, whereas without ZeRO I can fit a batch size of 4 per GPU.
Here is the result:
At the same time, I would also like to ask why the parameters don't change after the step.
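
One way to narrow down the "parameters don't change after the step" symptom is to snapshot the parameters before a step and compare afterwards. The sketch below is plain PyTorch, not ColossalAI API: `snapshot` and `max_param_change` are hypothetical helper names, and the `nn.Linear` model and SGD optimizer are stand-ins for the real training setup. Two caveats, both assumptions about this case rather than confirmed causes: under ZeRO the parameters may be sharded, so the local tensors may not show the full picture, and with dynamic loss scaling a step can be skipped when gradients contain inf/NaN (PyTorch's `GradScaler` behaves this way), which would also leave the parameters unchanged.

```python
import torch
from torch import nn


def snapshot(model: nn.Module) -> dict:
    """Copy every parameter so it can be compared after the optimizer step."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}


def max_param_change(model: nn.Module, before: dict) -> dict:
    """Largest absolute change of each parameter relative to the snapshot."""
    return {
        name: (p.detach() - before[name]).abs().max().item()
        for name, p in model.named_parameters()
    }


torch.manual_seed(0)
model = nn.Linear(4, 2)  # hypothetical stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs, targets = torch.randn(8, 4), torch.randn(8, 2)

before = snapshot(model)

# One ordinary training step, with no autocast and no GradScaler.
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()
optimizer.step()

changes = max_param_change(model, before)
for name, delta in changes.items():
    print(f"{name}: max |delta| = {delta:.3e}")
```

If every reported change is exactly zero, it is worth checking whether the step was actually applied (for example, skipped because of non-finite gradients) before suspecting the model itself.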