Pinned · Cheng Xi Tsou in Nerd For Tech · Genetic Algorithm: 8 Queens Problem · In my recent lecture on AI (CS4100), I came across an interesting concept: a genetic algorithm. As described in “Artificial Intelligence: A… · 7 min read · May 18, 2021
Pinned · Cheng Xi Tsou in Geek Culture · Policy Optimizations: TRPO/PPO · In this post, I will be talking about policy optimization methods from the papers Trust Region Policy Optimization (Schulman et al. 2015)… · 10 min read · Sep 17, 2021
Cheng Xi Tsou in Geek Culture · Introduction to Deterministic Policy Gradient (DPG) · In this post, I will be exploring the concepts following the paper Deterministic Policy Gradient Algorithms (Silver et al.), implementing… · 12 min read · Aug 26, 2021
Cheng Xi Tsou in Geek Culture · Actor-Critic: Off-Policy Actor-Critic Algorithm · In this post, I will be exploring the ideas behind the paper Off-Policy Actor-Critic (Degris et al.) submitted to ICML 2012. The paper… · 10 min read · Aug 18, 2021
Cheng Xi Tsou in Geek Culture · Policy Parameterization for a Continuous Action Space · In the past few Policy Gradient and Actor-Critic algorithms I’ve implemented, I’ve been using the classical control environment, CartPole… · 8 min read · Aug 9, 2021
Cheng Xi Tsou in Geek Culture · Actor-Critic: Implementing Actor-Critic Methods · In this post, I’ll be implementing some Actor-Critic methods using the policy gradient methods and value function approximations from my… · 10 min read · Aug 3, 2021
Cheng Xi Tsou in Geek Culture · Actor-Critic: Value Function Approximations · In my previous post, I discussed a way to reduce variance by using the generalized policy update equation, which is derived from the policy… · 11 min read · Jul 23, 2021
Cheng Xi Tsou in Nerd For Tech · Policy Gradients: REINFORCE with Baseline · After an introduction to the REINFORCE algorithm, I wanted to explore a little bit further this simple algorithm derived from the policy… · 7 min read · Jul 17, 2021
Cheng Xi Tsou in Nerd For Tech · Reinforcement Learning: Introduction to Policy Gradients · In the previous posts, I have been working on a form of reinforcement learning, Q-learning, where the agent finds an optimal policy that… · 9 min read · Jul 14, 2021
Cheng Xi Tsou in Nerd For Tech · Reinforcement Learning: Deep Q-Learning with Atari games · In my previous post, A First Look at Reinforcement Learning, I attempted to use Deep Q-learning to solve the CartPole problem. In this post… · 11 min read · Jul 8, 2021