Learn the importance of approximation methods in complex decision-making and apply them in a practical Stock Trading Project. FAQs in the appendix ensure a thorough grasp of the subject. Ideal for Python enthusiasts and AI learners, this course equips you to navigate the world of reinforcement learning. Join us to unleash the potential of AI in Python!

Course Price:
Original price was: £194.00. Current price is: £19.99.
Course Duration:
55 years, 6 months
Embark on an exhilarating journey into the world of Artificial Intelligence Reinforcement Learning in Python. This comprehensive course is designed to provide you with a deep understanding of reinforcement learning techniques, from fundamental concepts to real-world applications. Begin with a warm welcome to the course, setting the stage for your exploration of this exciting field. Dive into the practical aspects of reinforcement learning with "Return of the Multi-Armed Bandit" and gain a high-level overview of the subject.

You'll get hands-on experience by building an intelligent Tic-Tac-Toe agent, then explore Markov Decision Processes, dynamic programming, Monte Carlo methods, and Temporal Difference Learning to master the core principles of reinforcement learning. Discover approximation methods and their significance in complex decision-making scenarios. To solidify your knowledge, take on a practical Stock Trading Project using reinforcement learning techniques.

Throughout your journey, find answers to frequently asked questions in the appendix, ensuring a comprehensive grasp of the subject matter. Whether you're a Python enthusiast or an AI enthusiast, this course will equip you with the skills to navigate the exciting world of reinforcement learning and its real-world applications. Join us to unlock the potential of AI in Python!

What Will You Learn?

  • Understand the fundamentals of reinforcement learning in Python.
  • Develop intelligent agents for games and decision-making processes.
  • Master key concepts like Markov Decision Processes, dynamic programming, and Monte Carlo methods.
  • Implement reinforcement learning in real-world applications, such as stock trading.
  • Gain practical experience and access an FAQ for a comprehensive understanding.

Who Should Take The Course?

  • Python developers and AI enthusiasts interested in reinforcement learning.
  • Game developers seeking to enhance game AI.
  • Data scientists and machine learning practitioners looking to expand their skill set.
  • Anyone curious about the application of AI in decision-making processes.

Requirements


  • Basic knowledge of Python programming.
  • A computer with Python installed.
  • Familiarity with fundamental machine learning concepts is helpful but not mandatory.
  • Enthusiasm to explore the exciting world of reinforcement learning in Python.

Course Curriculum

    • Introduction 00:03:00
    • Where to get the Code 00:02:00
    • Strategy for Passing the Course 00:05:00
    • Course Outline 00:04:00
    • Problem Setup and The Explore-Exploit Dilemma 00:03:00
    • Applications of the Explore-Exploit Dilemma 00:07:00
    • Epsilon-Greedy 00:01:00
    • Updating a Sample Mean 00:01:00
    • Designing Your Bandit Program 00:04:00
    • Comparing Different Epsilons 00:04:00
    • Optimistic Initial Values 00:02:00
    • UCB1 Unlimited
    • Bayesian Thompson Sampling 00:09:00
    • Thompson Sampling vs. Epsilon-Greedy vs. Optimistic Initial Values vs. UCB1 00:05:00
    • Nonstationary Bandits 00:04:00
    • Bandit Summary, Real Data, and Online Learning 00:06:00
    • What is Reinforcement Learning 00:08:00
    • On Unusual or Unexpected Strategies of RL 00:06:00
    • Defining Some Terms 00:07:00
    • Naive Solution to Tic-Tac-Toe 00:03:00
    • Components of a Reinforcement Learning System 00:08:00
    • Notes on Assigning Rewards 00:02:00
    • The Value Function and Your First Reinforcement Learning Algorithm 00:16:00
    • Tic Tac Toe Code: Outline 00:03:00
    • Tic Tac Toe Code: Representing States 00:02:00
    • Tic Tac Toe Code: Enumerating States Recursively 00:06:00
    • Tic Tac Toe Code: The Environment 00:06:00
    • Tic Tac Toe Code: The Agent 00:05:00
    • Tic Tac Toe Summary 00:06:00
    • Tic Tac Toe Exercise 00:05:00
    • Gridworld 00:03:00
    • The Markov Property 00:02:00
    • Defining and Formalizing the MDP 00:04:00
    • Future Rewards 00:04:00
    • Value Function Introduction 00:03:00
    • Value Functions 00:12:00
    • Value Functions Unlimited
    • Bellman Examples 00:22:00
    • Optimal Policy and Optimal Value Function 00:04:00
    • MDP Summary 00:01:00
    • Intro to Dynamic Programming and Iterative Policy Evaluation 00:03:00
    • Gridworld in Code 00:05:00
    • Designing Your RL Program 00:05:00
    • Iterative Policy Evaluation in Code 00:06:00
    • Policy Improvement 00:02:00
    • Policy Iteration 00:02:00
    • Policy Iteration in Code 00:03:00
    • Policy Iteration in Windy Gridworld 00:04:00
    • Value Iteration 00:03:00
    • Value Iteration in Code 00:02:00
    • Dynamic Programming Summary 00:05:00
    • Monte Carlo Intro 00:03:00
    • Monte Carlo Policy Evaluation 00:05:00
    • Monte Carlo Policy Evaluation in Code 00:03:00
    • Policy Evaluation in Windy Gridworld 00:03:00
    • Monte Carlo Control 00:05:00
    • Monte Carlo Control in Code 00:04:00
    • Monte Carlo Control without Exploring Starts 00:02:00
    • Monte Carlo Control without Exploring Starts in Code 00:04:00
    • Monte Carlo Summary 00:03:00
    • Temporal Difference Intro 00:02:00
    • TD(0) Prediction 00:01:00
    • TD(0) Prediction in Code 00:02:00
    • SARSA 00:05:00
    • SARSA in Code 00:03:00
    • Q Learning 00:03:00
    • Q Learning in Code 00:02:00
    • TD Summary 00:02:00
    • Approximation Intro 00:04:00
    • Linear Models for Reinforcement Learning 00:04:00
    • Features 00:04:00
    • Monte Carlo Prediction with Approximation 00:01:00
    • Monte Carlo Prediction with Approximation in Code 00:02:00
    • TD(0) Semi-Gradient Prediction 00:04:00
    • Semi-Gradient SARSA 00:03:00
    • Semi-Gradient SARSA in Code 00:04:00
    • Course Summary and Next Steps 00:08:00
    • Stock Trading Project Section Introduction 00:05:00
    • Data and Environment 00:12:00
    • How to Model Q for Q-Learning 00:09:00
    • Design of the Program 00:06:00
    • Code pt 1 00:07:00
    • Code pt 2 00:09:00
    • Code pt 3 00:04:00
    • Code pt 4 00:07:00
    • Stock Trading Project Discussion 00:03:00
    • What is the Appendix 00:02:00
    • Windows-Focused Environment Setup 2018 00:20:00
    • How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow 00:17:00
    • How to Code by Yourself (part 1) 00:15:00
    • How to Code by Yourself (part 2) 00:09:00
    • How to Succeed in this Course (Long Version) 00:10:00
    • Is this for Beginners or Experts? Academic or Practical? Fast or Slow-Paced? 00:22:00
    • Proof that using Jupyter Notebook is the same as not using it 00:12:00
    • Python 2 vs Python 3 00:04:00
    • What order should I take your courses in? (part 1) 00:11:00
    • What order should I take your courses in? (part 2) 00:16:00
    • BONUS: Where to get discount coupons and FREE deep learning material 00:05:00
    • Order Certificate 00:05:00

    © Course Line. All rights reserved.