This research introduces Trajectory Entropy-Constrained Reinforcement Learning (TECRL) to address flaws caused by non-stationary Q-value estimation in maximum entropy RL, presenting the D...
Level: advanced
By Guojian Zhan and 6 other authors
Category: research