
DRSP-Sim: A Simulator for Ride-sharing with Pooling: Joint Matching, Pricing, Route Planning, and Dispatching


By Marina Haliem, Vaneet Aggarwal, Bharat Bhargava

Purdue University

Significant development of ride-sharing services presents a plethora of opportunities to transform urban mobility by providing personalized and convenient transportation while ensuring the efficiency of large-scale ride pooling.


Version 1.0 - published on 31 Aug 2021 - doi:10.4231/GDFY-5A56 - archived on 02 Oct 2021

Licensed under CC0 1.0 Universal


These are the NYC taxi trip records used as the real-world dataset for the comprehensive ride-sharing simulator available at: https://github.itap.purdue.edu/Clan-labs/Dynamic_Matching_RS. Trips from 2016-05 are used for training and trips from 2016-06 for evaluation.

To run the simulator, either follow the pre-processing steps under "Data Generation" in the README at the above GitHub link (also available under "Supporting Docs" here) to generate the data, or fetch the pre-processed files directly from here, place them in a directory, and set the DATA_DIR variable in config/settings.py to that directory.
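As a quick sanity check before pointing DATA_DIR at a directory, the following minimal sketch reports which of the pre-processed files are missing. The file names are the ones listed in this description; the assumption that they all sit directly inside the data directory, and the helper name missing_data_files, are illustrative and not taken from the simulator's code.

```python
from pathlib import Path

# File names as listed in this dataset description. That they live directly
# inside the data directory (no subfolders) is an assumption of this sketch.
EXPECTED_FILES = ("db.sqlite3", "routes.pkl", "tt_map.npy", "reachable_map.npy")


def missing_data_files(data_dir):
    """Return the expected file names that are not present in data_dir.

    Hypothetical helper for illustration; not part of the simulator.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    import sys

    # Usage: python check_data.py /path/to/data
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_data_files(data_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files found in", data_dir)
```

If the list comes back empty, the same path can be used as the value of DATA_DIR in config/settings.py.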

Description of the provided files:

1. The "osrm" folder (available in the GitHub repository) contains the OSM data downloaded from https://download.bbbike.org/osm/bbbike/NewYork/NewYork.osm.pbf and preprocessed with the OSRM backend engine to extract the road network and geometry (Data Generation steps #1 and #2). It is used if the user chooses the OSRM engine for real-time routing.

2. db.sqlite3 is the trip database used for simulation, produced by pre-processing the trip records downloaded from https://s3.amazonaws.com/nyc-tlc/trip+data/ (Data Generation steps #3 through #7). Its columns include ['request_datetime', 'trip_time', 'origin_lon', 'origin_lat', 'destination_lon', 'destination_lat', 'fare']. In addition, Data Generation step #8 prepares a statistical demand profile from the training dataset and stores it in this database as well.

3. routes.pkl and tt_map.npy contain the pre-computed trajectories and trip times, respectively. reachable_map.npy contains the discretized map of New York City according to the maps obtained from OSRM. These files are pre-computed with the OSRM engine and used at simulation time for the Fast Routing option provided to the user (Data Generation step #9).
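To get a first look at the downloaded files, the sketch below lists the tables and columns in db.sqlite3 and loads the three pre-computed files. It is only an illustration under stated assumptions: the table names inside the database are not given in this description, so the code discovers them instead of assuming any; the function names describe_tables and load_precomputed are hypothetical; and the structure of the objects in routes.pkl and the .npy files is not documented here, so the loader simply returns whatever they contain.

```python
import pickle
import sqlite3
from pathlib import Path


def describe_tables(db_path):
    """Map each table name in a SQLite file to its list of column names."""
    conn = sqlite3.connect(str(db_path))
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is its name.
            info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in info]
        return schema
    finally:
        conn.close()


def load_precomputed(data_dir):
    """Load routes.pkl, tt_map.npy and reachable_map.npy from data_dir.

    Assumes the files sit directly in data_dir. NumPy is a third-party
    dependency and is imported here so describe_tables works without it.
    """
    import numpy as np

    root = Path(data_dir)
    with open(root / "routes.pkl", "rb") as handle:
        # Only unpickle files obtained from a source you trust.
        routes = pickle.load(handle)
    # If an array was saved as an object array, np.load would additionally
    # need allow_pickle=True; whether that applies here is not known.
    tt_map = np.load(root / "tt_map.npy")
    reachable_map = np.load(root / "reachable_map.npy")
    return routes, tt_map, reachable_map
```

Calling describe_tables on the downloaded db.sqlite3 should show where the trip columns listed in item 2 and the demand profile from step #8 are stored, which is a reasonable starting point before writing queries against the database.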
