Aug 25, 2022 We Need to Route Gradients for Distributed Training with In-network Aggregation Carefully