Volume 17, Issue 4
Transient Dynamics of Block Coordinate Descent in a Valley

Martin Mohlenkamp, Todd R. Young & Balázs Bárány

DOI:

Int. J. Numer. Anal. Mod., 17 (2020), pp. 557-591.

Published online: 2020-08

Preview Purchase PDF 0 812
Export citation
  • Abstract

We investigate the transient dynamics of Block Coordinate Descent algorithms in valleys of the optimization landscape. Iterates converge linearly to a vicinity of the valley floor and then progress in a zig-zag fashion along the direction of the valley floor. When the valley sides are symmetric, the contraction factor to a vicinity of the valley floor appears to be no worse than 1/8, but without symmetry the contraction factor can approach 1. Progress along the direction of the valley floor is proportional to the gradient on the valley floor and inversely proportional to the "narrowness" of the valley. We quantify narrowness using the eigenvalues of the Hessian on the valley floor and give explicit formulas for certain cases. Progress also depends on the direction of the valley with respect to the blocks of coordinates. When the valley sides are symmetric, we give an explicit formula for this dependence and use it to show that in higher dimensions nearly all directions give progress similar to the worst case direction. Finally, we observe that when starting the algorithm, the ordering of blocks in the first few steps can be important, but show that a greedy strategy with respect to objective function improvement can be a bad choice.

  • Keywords

Block Coordinate Descent, alternating Least Squares, tensor Approximation, swamp, diagonal Valley.

  • AMS Subject Headings

  • Copyright

COPYRIGHT: © Global Science Press

  • Email address
  • BibTex
  • RIS
  • TXT
@Article{IJNAM-17-557, author = {Martin Mohlenkamp , and Todd R. Young , and Balázs Bárány , }, title = {Transient Dynamics of Block Coordinate Descent in a Valley}, journal = {International Journal of Numerical Analysis and Modeling}, year = {2020}, volume = {17}, number = {4}, pages = {557--591}, abstract = {

We investigate the transient dynamics of Block Coordinate Descent algorithms in valleys of the optimization landscape. Iterates converge linearly to a vicinity of the valley floor and then progress in a zig-zag fashion along the direction of the valley floor. When the valley sides are symmetric, the contraction factor to a vicinity of the valley floor appears to be no worse than 1/8, but without symmetry the contraction factor can approach 1. Progress along the direction of the valley floor is proportional to the gradient on the valley floor and inversely proportional to the "narrowness" of the valley. We quantify narrowness using the eigenvalues of the Hessian on the valley floor and give explicit formulas for certain cases. Progress also depends on the direction of the valley with respect to the blocks of coordinates. When the valley sides are symmetric, we give an explicit formula for this dependence and use it to show that in higher dimensions nearly all directions give progress similar to the worst case direction. Finally, we observe that when starting the algorithm, the ordering of blocks in the first few steps can be important, but show that a greedy strategy with respect to objective function improvement can be a bad choice.

}, issn = {2617-8710}, doi = {https://doi.org/}, url = {http://global-sci.org/intro/article_detail/ijnam/17870.html} }
TY - JOUR T1 - Transient Dynamics of Block Coordinate Descent in a Valley AU - Martin Mohlenkamp , AU - Todd R. Young , AU - Balázs Bárány , JO - International Journal of Numerical Analysis and Modeling VL - 4 SP - 557 EP - 591 PY - 2020 DA - 2020/08 SN - 17 DO - http://doi.org/ UR - https://global-sci.org/intro/article_detail/ijnam/17870.html KW - Block Coordinate Descent, alternating Least Squares, tensor Approximation, swamp, diagonal Valley. AB -

We investigate the transient dynamics of Block Coordinate Descent algorithms in valleys of the optimization landscape. Iterates converge linearly to a vicinity of the valley floor and then progress in a zig-zag fashion along the direction of the valley floor. When the valley sides are symmetric, the contraction factor to a vicinity of the valley floor appears to be no worse than 1/8, but without symmetry the contraction factor can approach 1. Progress along the direction of the valley floor is proportional to the gradient on the valley floor and inversely proportional to the "narrowness" of the valley. We quantify narrowness using the eigenvalues of the Hessian on the valley floor and give explicit formulas for certain cases. Progress also depends on the direction of the valley with respect to the blocks of coordinates. When the valley sides are symmetric, we give an explicit formula for this dependence and use it to show that in higher dimensions nearly all directions give progress similar to the worst case direction. Finally, we observe that when starting the algorithm, the ordering of blocks in the first few steps can be important, but show that a greedy strategy with respect to objective function improvement can be a bad choice.

Martin Mohlenkamp, Todd R. Young & Balázs Bárány. (2020). Transient Dynamics of Block Coordinate Descent in a Valley. International Journal of Numerical Analysis and Modeling. 17 (4). 557-591. doi:
Copy to clipboard
The citation has been copied to your clipboard