Neo_Chen (BU4AK) says to OKTW Network
If pipeline parallelism isn't tuned well, it produces a lot of pipeline bubbles, and in practice it ends up performing worse than tensor parallelism
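
For a rough sense of how big those bubbles can get, here is a minimal sketch. It assumes a GPipe-style synchronous schedule in which every stage takes the same time per micro-batch, so the idle share is (p - 1) / (m + p - 1) for p stages and m micro-batches. The function name and parameters are illustrative and not from the message above; real schedules (1F1B, interleaved) and uneven stages will behave differently.

```python
def bubble_fraction(num_stages: int, num_microbatches: int) -> float:
    """Idle share of a GPipe-style pipeline (assumed model, equal stage times).

    Each stage is busy for num_microbatches time slots out of
    (num_microbatches + num_stages - 1) total slots; the rest is bubble.
    """
    if num_stages < 1 or num_microbatches < 1:
        raise ValueError("num_stages and num_microbatches must be >= 1")
    total_slots = num_microbatches + num_stages - 1
    return (num_stages - 1) / total_slots


if __name__ == "__main__":
    # Hypothetical setup: 4 pipeline stages, varying micro-batch counts.
    for m in (1, 4, 32):
        print(f"stages=4 microbatches={m} bubble={bubble_fraction(4, m):.3f}")
```

With too few micro-batches per stage, most of the time slots are idle, which is the situation where pipeline parallelism can lose to tensor parallelism.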