Deep RL & Sequential Decision Making: Master Q-Learning, Policy Gradients, DQN, and PPO Implementation for Certification