Job on Value Iteration and Policy Iteration

Full-time

Posted on: 9 days ago

I am looking for an experienced Python developer with knowledge of reinforcement learning and Markov Decision Processes (MDPs) to implement decision-making algorithms for an agent operating in a stochastic grid-world environment.
The task involves implementing and visualizing Value Iteration and Policy Iteration algorithms in a probabilistic maze environment.
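
For orientation, here is a minimal sketch of Value Iteration on a small stochastic maze. Everything specific in it is an assumption made for illustration and not the project specification: the 3x4 layout in GRID, the terminal rewards, the step cost, the discount factor GAMMA, and the 0.8/0.1/0.1 slip model. The actual environment will be described in the project details.

```python
# Illustrative sketch only: layout, rewards, discount and slip model are assumed.
GRID = ["...+",
        ".#.-",
        "...."]            # '.' free cell, '#' wall, '+' / '-' terminal cells
TERMINAL_REWARD = {"+": 1.0, "-": -1.0}
STEP_REWARD = -0.04        # assumed cost of each non-terminal step
GAMMA = 0.99               # assumed discount factor
P_INTENDED = 0.8           # remaining 0.2 is split between the two sideways moves
MOVES = {"U": (-1, 0), "R": (0, 1), "D": (1, 0), "L": (0, -1)}
ORDER = "URDL"             # clockwise, so cyclic neighbours are perpendicular moves

STATES = [(r, c) for r, row in enumerate(GRID)
          for c, ch in enumerate(row) if ch != "#"]


def is_terminal(s):
    return GRID[s[0]][s[1]] in TERMINAL_REWARD


def step(s, a):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = s[0] + MOVES[a][0], s[1] + MOVES[a][1]
    inside = 0 <= r < len(GRID) and 0 <= c < len(GRID[0])
    return (r, c) if inside and GRID[r][c] != "#" else s


def expected_utility(U, s, a):
    """Expected utility of the next state when action a is attempted in s."""
    i = ORDER.index(a)
    side = (1.0 - P_INTENDED) / 2
    return (P_INTENDED * U[step(s, a)]
            + side * U[step(s, ORDER[(i - 1) % 4])]
            + side * U[step(s, ORDER[(i + 1) % 4])])


def value_iteration(tol=1e-6):
    """Repeat Bellman backups until the largest utility change drops below tol."""
    U = {s: 0.0 for s in STATES}
    deltas = []            # largest change per sweep, useful for a convergence plot
    while True:
        new_U = {}
        for s in STATES:
            if is_terminal(s):
                new_U[s] = TERMINAL_REWARD[GRID[s[0]][s[1]]]
            else:
                best = max(expected_utility(U, s, a) for a in ORDER)
                new_U[s] = STEP_REWARD + GAMMA * best
        deltas.append(max(abs(new_U[s] - U[s]) for s in STATES))
        U = new_U
        if deltas[-1] < tol:
            break
    policy = {s: max(ORDER, key=lambda a: expected_utility(U, s, a))
              for s in STATES if not is_terminal(s)}
    return U, policy, deltas


if __name__ == "__main__":
    U, policy, deltas = value_iteration()
    print(f"converged after {len(deltas)} sweeps")
    print(policy)
```

The returned deltas list holds the largest utility change in each sweep, so it can be plotted directly to show convergence.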

The goal is to compute:
  • Optimal policies
  • State utilities
  • Convergence behaviour of the algorithms
Clear visualizations and well-structured code are required.
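
As a rough picture of those three deliverables, the sketch below runs Policy Iteration on a small assumed maze, records how many evaluation sweeps and policy changes each round needed, and renders the resulting policy as text. The 3x4 layout, rewards, discount factor, slip probabilities, tolerances, and the helper names (policy_iteration, render) are all hypothetical choices for illustration. Graphical plots would replace the text rendering in the real project.

```python
# Illustrative sketch only: layout, rewards, discount and slip model are assumed.
GRID = ["...+",
        ".#.-",
        "...."]            # '.' free cell, '#' wall, '+' / '-' terminal cells
TERMINAL_REWARD = {"+": 1.0, "-": -1.0}
STEP_REWARD, GAMMA, P_INTENDED = -0.04, 0.99, 0.8
MOVES = {"U": (-1, 0), "R": (0, 1), "D": (1, 0), "L": (0, -1)}
ORDER = "URDL"             # clockwise, so cyclic neighbours are perpendicular moves

STATES = [(r, c) for r, row in enumerate(GRID)
          for c, ch in enumerate(row) if ch != "#"]
NON_TERMINAL = [s for s in STATES if GRID[s[0]][s[1]] not in TERMINAL_REWARD]


def step(s, a):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = s[0] + MOVES[a][0], s[1] + MOVES[a][1]
    inside = 0 <= r < len(GRID) and 0 <= c < len(GRID[0])
    return (r, c) if inside and GRID[r][c] != "#" else s


def expected_utility(U, s, a):
    """Expected utility of the next state when action a is attempted in s."""
    i = ORDER.index(a)
    side = (1.0 - P_INTENDED) / 2
    return (P_INTENDED * U[step(s, a)]
            + side * U[step(s, ORDER[(i - 1) % 4])]
            + side * U[step(s, ORDER[(i + 1) % 4])])


def policy_iteration(eval_tol=1e-10, improve_tol=1e-6):
    """Alternate iterative policy evaluation and greedy policy improvement."""
    policy = {s: "U" for s in NON_TERMINAL}          # arbitrary starting policy
    U = {s: TERMINAL_REWARD.get(GRID[s[0]][s[1]], 0.0) for s in STATES}
    history = []           # (evaluation sweeps, states whose action changed)
    while True:
        sweeps = 0
        while True:        # policy evaluation with in-place updates
            sweeps += 1
            delta = 0.0
            for s in NON_TERMINAL:
                v = STEP_REWARD + GAMMA * expected_utility(U, s, policy[s])
                delta = max(delta, abs(v - U[s]))
                U[s] = v
            if delta < eval_tol:
                break
        changed = 0
        for s in NON_TERMINAL:                       # policy improvement
            best = max(ORDER, key=lambda a: expected_utility(U, s, a))
            gain = expected_utility(U, s, best) - expected_utility(U, s, policy[s])
            if gain > improve_tol:                   # ignore numerical ties
                policy[s] = best
                changed += 1
        history.append((sweeps, changed))
        if changed == 0:
            return U, policy, history


def render(policy):
    """Text view of the maze with each free cell replaced by its chosen action."""
    return "\n".join("".join(policy.get((r, c), ch) for c, ch in enumerate(row))
                     for r, row in enumerate(GRID))


if __name__ == "__main__":
    U, policy, history = policy_iteration()
    print(render(policy))
    for i, (sweeps, changed) in enumerate(history, 1):
        print(f"round {i}: {sweeps} evaluation sweeps, {changed} actions changed")
```

Policy evaluation here uses repeated sweeps instead of solving the linear system directly. That choice keeps the sketch dependency-free, and a linear solve would be an equally valid option.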

You will be expected to make revisions if required.

More details will be provided.

Contract duration: less than 1 month.

Mandatory skills:
Python, Writing, English