Cheng Xi Tsou – Medium

Cheng Xi Tsou in Nerd For Tech

Genetic Algorithm: 8 Queens Problem

In my recent lecture on AI (CS4100), I came across an interesting concept: a genetic algorithm. As described in “Artificial Intelligence: A…

7 min read · May 18, 2021

Cheng Xi Tsou in Geek Culture

Policy Optimizations: TRPO/PPO

In this post, I will be talking about policy optimization methods from the papers Trust Region Policy Optimization (Schulman et al. 2015)…

10 min read · Sep 17, 2021

Cheng Xi Tsou in Geek Culture

Introduction to Deterministic Policy Gradient (DPG)

In this post, I will be exploring the concepts following the paper Deterministic Policy Gradient Algorithms (Silver et al.), implementing…

12 min read · Aug 26, 2021

Cheng Xi Tsou in Geek Culture

Actor-Critic: Off-Policy Actor-Critic Algorithm

In this post, I will be exploring the ideas behind the paper Off-Policy Actor-Critic (Degris et al.), submitted to ICML 2012. The paper…

10 min read · Aug 18, 2021

Cheng Xi Tsou in Geek Culture

Policy Parameterization for a Continuous Action Space

In the past few Policy Gradient and Actor-Critic algorithms I’ve implemented, I’ve been using the classical control environment, CartPole…

8 min read · Aug 9, 2021

Cheng Xi Tsou in Geek Culture

Actor-Critic: Implementing Actor-Critic Methods

In this post, I’ll be implementing some Actor-Critic methods using the policy gradients methods and value function approximations from my…

10 min read · Aug 3, 2021

Cheng Xi Tsou in Geek Culture

Actor-Critic: Value Function Approximations

In my previous post, I discussed a way to reduce variance by using the generalized policy update equation, which is derived from the policy…

11 min read · Jul 23, 2021

Cheng Xi Tsou in Nerd For Tech

Policy Gradients: REINFORCE with Baseline

After an introduction to the REINFORCE algorithm, I wanted to explore a little bit further this simple algorithm derived from the policy…

7 min read · Jul 17, 2021

Cheng Xi Tsou in Nerd For Tech

Reinforcement Learning: Introduction to Policy Gradients

In the previous posts, I have been working on a form of Reinforcement learning, Q learning, where the agent finds an optimal policy that…

9 min read · Jul 14, 2021

Cheng Xi Tsou in Nerd For Tech

Reinforcement Learning: Deep Q-Learning with Atari games

In my previous post A First Look at Reinforcement Learning, I attempted to use Deep Q learning to solve the CartPole problem. In this post…

11 min read · Jul 8, 2021

Cheng Xi Tsou

Interested in Web Dev and AI/ML, specifically RL. GitHub: github.com/chengxi600
