• See also Megatron-DeepSpeed
  • GPT2ModelPipe: a PipelineModule that handles the sequential forward passes through the layers

  • Debugging
    • fused_rope

        ![Untitled](<https://prod-files-secure.s3.us-west-2.amazonaws.com/fd0e9e06-47ef-46a5-bfc2-18344b58b466/f37b9cc3-42ef-4597-91de-4d85a7042306/Untitled.png>)
        
        ![Untitled](<https://prod-files-secure.s3.us-west-2.amazonaws.com/fd0e9e06-47ef-46a5-bfc2-18344b58b466/8e91c105-d0ac-4bd8-a868-f0f6ff9a9a9b/Untitled.png>)
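
As a rough illustration of the GPT2ModelPipe note above (a PipelineModule running the layers as a sequential forward), here is a minimal pure-Python sketch of the idea: a flat list of layers is split into contiguous stages, and the forward pass chains through the stages in order. This does not use DeepSpeed's API. The names `partition_layers`, `run_stage`, and `pipeline_forward` are hypothetical, and the sketch assumes a simple near-equal split; in a real pipeline each stage would sit on a different device and micro-batches would be interleaved.

```python
from typing import Callable, List, Sequence

# A "layer" here is any callable that maps an activation to an activation.
Layer = Callable[[float], float]


def partition_layers(layers: Sequence[Layer], num_stages: int) -> List[List[Layer]]:
    """Split layers into num_stages contiguous chunks of near-equal size.

    Hypothetical helper: earlier stages absorb the remainder layers.
    """
    if not 1 <= num_stages <= len(layers):
        raise ValueError("num_stages must be between 1 and len(layers)")
    base, extra = divmod(len(layers), num_stages)
    stages: List[List[Layer]] = []
    start = 0
    for stage_id in range(num_stages):
        size = base + (1 if stage_id < extra else 0)
        stages.append(list(layers[start:start + size]))
        start += size
    return stages


def run_stage(stage: Sequence[Layer], x: float) -> float:
    """Run one stage: apply its layers one after another."""
    for layer in stage:
        x = layer(x)
    return x


def pipeline_forward(stages: Sequence[Sequence[Layer]], x: float) -> float:
    """Chain the stages; each stage's output is the next stage's input."""
    for stage in stages:
        x = run_stage(stage, x)
    return x


# Toy "model": five scalar layers standing in for transformer blocks.
layers = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x * x,
    lambda x: x + 10,
]
stages = partition_layers(layers, num_stages=2)
print([len(s) for s in stages])       # [3, 2]
print(pipeline_forward(stages, 1.0))  # 11.0
```

The staged forward gives the same result as applying all layers in order, so when debugging a pipelined model one can compare a single stage's output against the unpartitioned model on the same input.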