Gradient descent aligns the layers of deep linear networks

Ziwei Ji, Matus Telgarsky. ICLR 2019.