Function approximation has been an indispensable component in modern reinforcement learning algorithms designed to tackle problems with large state spaces in high dimensions. This paper reviews recent results on error analysis for these reinforcement learning algorithms in linear or nonlinear approximation settings, emphasizing approximation error and estimation error/sample complexity. We discuss various properties related to approximation error and present concrete conditions on the transition probability and reward function under which these properties hold. Sample complexity analysis in reinforcement learning is more complicated than in supervised learning, primarily due to the distribution mismatch phenomenon. With assumptions on the linear structure of the problem, numerous algorithms in the literature achieve polynomial sample complexity with respect to the number of features, episode length, and accuracy, although the minimax rate has not been achieved yet. These results rely on $L^∞$ and UCB estimates of the estimation error, which can handle the distribution mismatch phenomenon. The problem and analysis become substantially more challenging in the setting of nonlinear function approximation, as both $L^∞$ and UCB estimation are inadequate for bounding the error with a favorable rate in high dimensions. We discuss additional assumptions necessary to address the distribution mismatch and to derive meaningful results for nonlinear RL problems.
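To make the UCB estimation mentioned above concrete, the following is a minimal sketch (not taken from the paper) of the optimistic Q-value computation used in LSVI-UCB-style algorithms with linear function approximation: a ridge-regression estimate plus an elliptical-potential bonus. The function name and parameters are illustrative choices.

```python
import numpy as np

def ucb_q_estimate(phi, Phi, targets, beta=1.0, lam=1.0):
    """Ridge-regression Q estimate plus a UCB bonus, in the style of
    LSVI-UCB with linear function approximation (illustrative sketch).

    phi     : (d,)   feature vector of the query state-action pair
    Phi     : (n, d) features of previously visited state-action pairs
    targets : (n,)   regression targets (reward + next-step value)
    """
    d = Phi.shape[1]
    Lambda = Phi.T @ Phi + lam * np.eye(d)        # regularized Gram matrix
    w = np.linalg.solve(Lambda, Phi.T @ targets)  # ridge-regression weights
    # Bonus beta * sqrt(phi^T Lambda^{-1} phi) inflates the estimate in
    # directions of feature space that have been visited rarely.
    bonus = beta * np.sqrt(phi @ np.linalg.solve(Lambda, phi))
    return phi @ w + bonus                        # optimistic Q value
```

The bonus term shrinks as data accumulates along the direction of `phi`, which is how such algorithms cope with the distribution mismatch between the exploration policy and the evaluated policy.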
In this paper, we show that structures similar to self-attention are natural for learning many sequence-to-sequence problems from the perspective of symmetry. Inspired by language processing applications,
we study the orthogonal equivariance of seq2seq functions with knowledge, which are functions taking two
inputs — an input sequence and a knowledge set — and outputting another sequence. The knowledge set consists
of vectors in the same embedding space as the input sequence and contains the information of the
language used to process the input sequence. We show that orthogonal equivariance in the embedding space
is natural for seq2seq functions with knowledge, and under such equivariance, the function must take a form
close to self-attention. This shows that network structures similar to self-attention are the right structures
for representing the target function of many seq2seq problems. The representation can be further refined if
a finite information principle is considered, or a permutation equivariance holds for the elements of the input sequence.
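The orthogonal equivariance in question can be checked numerically on a bare dot-product attention layer. The sketch below is an illustration under simplifying assumptions (no learned projection matrices), not the construction from the paper: because the attention scores depend only on inner products, applying one orthogonal map to both the input sequence and the knowledge set rotates the output by the same map.

```python
import numpy as np

def attention(X, K):
    """Dot-product attention of an input sequence X (n, d) against a
    knowledge set K (m, d); returns an output sequence of shape (n, d).

    With no learned projections, the scores X @ K.T are pure inner
    products, so the layer is equivariant under a common orthogonal
    transformation Q of the embedding space:
        attention(X @ Q, K @ Q) == attention(X, K) @ Q.
    """
    scores = X @ K.T                             # (n, m) inner products
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)            # row-wise softmax
    return A @ K                                 # attend over knowledge
```

This is the sense in which self-attention-like structures arise naturally: inner products and linear combinations of the same set of vectors are exactly the operations that survive the orthogonal-equivariance constraint.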
Explicit antisymmetrization of a neural network is a potential candidate for a universal function
approximator for generic antisymmetric functions, which are ubiquitous in quantum physics. However, this
procedure is a priori factorially costly to implement, making it impractical for large numbers of particles. The
strategy also suffers from a sign problem. Namely, due to near-exact cancellation of positive and negative
contributions, the magnitude of the antisymmetrized function may be significantly smaller than before antisymmetrization. We show that the antisymmetric projection of a two-layer neural network can be evaluated
efficiently, opening the door to using a generic antisymmetric layer as a building block in antisymmetric
neural network Ansatzes. This approximation is effective when the sign problem is controlled, and we show
that this property depends crucially on the choice of activation function under standard Xavier/He initialization
methods. As a consequence, using a smooth activation function requires rescaling of the neural network
weights compared to standard initializations.
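For reference, the naive explicit antisymmetrization discussed above has the following brute-force form: sum the network output over all $n!$ permutations of the particle rows, weighted by the permutation sign. The two-layer network below is a generic illustrative choice, not the paper's efficient evaluation scheme; the factorial loop is precisely the cost the paper's method avoids.

```python
import itertools
import numpy as np

def two_layer_net(x, W, b, c):
    """A generic two-layer network R^{n x d} -> R (illustrative choice)."""
    return c @ np.tanh(W @ x.ravel() + b)

def antisymmetrize(f, x):
    """Brute-force antisymmetric projection: sum f over all permutations
    of the n particle rows of x, weighted by the permutation sign.
    The n! cost is what makes this naive approach impractical for
    large particle numbers.
    """
    n = x.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(n)):
        # The determinant of a permutation matrix equals the sign (+/-1).
        sign = np.linalg.det(np.eye(n)[list(perm)])
        total += sign * f(x[list(perm)])
    return total
```

The near-cancellation of the signed terms in this sum is the "sign problem" of the abstract: the projected value can be orders of magnitude smaller than a single term, which is why the scale of the output, and hence the activation and weight initialization, matters.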