Implementation of the MDP Order Dispatch Policy

This repository contains a Python implementation of the paper "Large-Scale Order Dispatch in On-Demand Ride-Hailing Platforms: A Learning and Planning Approach". It builds a synthetic environment that simulates the ridesharing marketplace described in Section 6.1 of the paper, then applies the MDP order dispatch policy developed in the paper to that environment. See Demonstration.ipynb for the detailed implementation.

Summary of the Algorithm

The algorithm consists of two steps:

  • Policy Evaluation: Apply temporal-difference learning to historical data to learn the value function
  • Order Dispatch: Assign orders to drivers so as to maximize the learned value function

Illustration of the policy evaluation step:

[Figure: illustration of the policy evaluation step]

Pseudocode:

[Figure: pseudocode of the policy evaluation step]
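As a rough companion to the policy evaluation step, the sketch below runs plain tabular TD(0) over a logged set of transitions. It is not the code from Demonstration.ipynb: the function name `td0_evaluate`, the `(time_step, zone)` state encoding, the toy `history` log, and the values of `gamma` and `alpha` are all assumptions made for illustration.

```python
from collections import defaultdict


def td0_evaluate(transitions, gamma=0.9, alpha=0.1, sweeps=50):
    """Tabular TD(0) over logged (state, reward, next_state) transitions.

    next_state is None when the logged episode ends at that step.
    All names and defaults here are illustrative assumptions.
    """
    values = defaultdict(float)  # unseen states start at 0.0
    for _ in range(sweeps):
        for state, reward, next_state in transitions:
            # Bootstrap from the current estimate of the next state's value.
            bootstrap = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * bootstrap - values[state]
            values[state] += alpha * td_error
    return dict(values)


# Hypothetical historical data: a state is (time_step, zone).
history = [
    ((0, "A"), 5.0, (1, "B")),  # driver in zone A serves a trip ending in B
    ((1, "B"), 3.0, None),
    ((0, "A"), 0.0, (1, "A")),  # driver in zone A stays idle
    ((1, "A"), 1.0, None),
]
V = td0_evaluate(history)
```

Each pass moves `V[state]` a fraction `alpha` of the way toward the one-step target `reward + gamma * V[next_state]`, so repeated sweeps over the same log let the estimates settle.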

The order dispatch step:

[Figure: the order dispatch step]
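One way to picture "maximizing the value function" is as an assignment problem over driver-order pairs. The sketch below is a brute-force stand-in suited only to tiny examples, not the repository's solver. The scoring rule (trip reward plus discounted destination value minus the driver's current value), the `pickup_steps` matrix, and every name in the code are assumptions chosen for illustration.

```python
from itertools import permutations


def dispatch(drivers, orders, pickup_steps, values, gamma=0.9):
    """Exhaustively search driver-order matchings for the highest total score.

    drivers: list of driver states; orders: list of (reward, dest_state).
    pickup_steps[d][o]: assumed extra time steps for driver d to reach order o.
    Every member of the smaller side is matched, even at a negative score.
    """
    def score(d, o):
        reward, dest_state = orders[o]
        # Assumed score: reward + discounted future value - current value.
        discount = gamma ** (1 + pickup_steps[d][o])
        return (reward + discount * values.get(dest_state, 0.0)
                - values.get(drivers[d], 0.0))

    n_d, n_o = len(drivers), len(orders)
    if n_d <= n_o:
        candidates = (list(zip(range(n_d), perm))
                      for perm in permutations(range(n_o), n_d))
    else:
        candidates = (list(zip(perm, range(n_o)))
                      for perm in permutations(range(n_d), n_o))

    best_total, best_pairs = float("-inf"), []
    for pairs in candidates:
        total = sum(score(d, o) for d, o in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


# Hypothetical inputs: states are (time_step, zone).
values = {(0, "A"): 1.0, (0, "B"): 2.0, (1, "A"): 1.0, (1, "B"): 4.0}
drivers = [(0, "A"), (0, "B")]
orders = [(5.0, (1, "B")), (2.0, (1, "A"))]
pickup_steps = [[0, 2], [2, 0]]  # each driver is close to one order
total, pairs = dispatch(drivers, orders, pickup_steps, values)
```

Exhaustive search grows factorially with the number of drivers and orders, so a realistic batch would need a polynomial-time assignment solver such as the Hungarian algorithm in place of this loop.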

Simulation results and comparison against other baseline policies:

[Figure: simulation results and comparison against baseline policies]