February 12, 2025

Exploring – Nvidia CuOpt

By Paola Massonnier, Ignacio Aristimuño and Santiago Ferreiros

Efficiency plays a critical role in many industries, from logistics and manufacturing to resource management and beyond. Problems such as optimizing delivery routes, scheduling tasks, or allocating resources have gained great relevance in our daily lives, mainly driven by the massive rise of e-commerce retail over the last couple of years.

Despite not being as fashionable as AI, and as mentioned on our last entry Intro to Mixed-Integer Linear Programming (MILP), optimization solutions can not only save companies millions but they can also be tightly related to customer satisfaction since they allow goods to move efficiently from producers to consumers. The B-side of running large scale optimization is that it is highly demanding on computer resources, especially if executed with many variables and constraints. As an example, deliveries are believed to increase an 80% by 2027, defying the capability of CPU based solutions to respond to this demand.

In this blog we explored NVIDIA cuOpt, an innovative AI optimization microservice that specializes on optimal decision taking on large scale problems with millions of variables and constraints by providing GPU-accelerated solutions, almost in real time.

NVIDIA cuOpt | Optimización de rutas | NVIDIA

About CuOpt

We’ll introduce cuOpt in the way you would consult your go-to LLM.

What is it?

NVIDIA cuOpt is a GPU-accelerated combinatorial and linear optimization engine for solving complex route optimization problems such as Vehicle Routing and large Linear Programming. NVIDIA cuOpt is available as an AI microservice.

What do we mean by cuOpt as an AI microservice?

NVIDIA refers to cuOpt as an AI microservice generically, but the easiest (and quickest) way to understand how to access it is by exploring the API based demo and executing requests to the Managed API using an NGC token — at the moment, 1000 credits free of charge with your free NGC user account. Through an NVIDIA NGC account it is also possible to deploy a cuOpt-specific container, either by hosting it on the hardware of your choice, or by deploying it for example on AWS Marketplace or Azure Marketplace. The cuopt-thin-client library for Python is also available. All of this is implied in the “AI microservice” conception of cuOpt.

What does it offer?

cuOpt offers GPU accelerated logistics solutions based on heuristics, metaheuristics and optimizations to calculate complex problems with a wide range of variables and constraints.

Its greatest value? Its world record precision: by march 2024 NVIDIA announced that cuOpt held the top spot for 100% of the largest routing benchmarks through the previous three years.

What can I solve with cuOpt?

Typical problem spaces include:

The Traveling Salesperson Problem (TSP)
Vehicle Routing Problem (VRP)
Pickup and Delivery Problem (PDP)
Linear Programming (LP)
Mixed Integer Linear Programming (MILP)

Disclaimer: LP and MILP at the moment are Early Access features and are currently open to select customers only. Therefore our focus for the time being is on testing a routing optimization example, specifically the Last Mile Delivery use case of the Vehicle Routing Problem.

Nonetheless, it is important to highlight that the LP and MILP features show the greatest potential as they are some of the most commonly found optimization problems in the industry. Hopefully our insights on VRP usage are transferrable to LP and MILP once it becomes fully available.

Last Mile Delivery optimization problem

Last Mile delivery (LMD) is the final phase of the delivery process. A few practical examples of possible scenarios where Last Mile Delivery optimization would be required are autonomous delivery vehicles, delivery from restaurant to customer via independent drivers, and online order shopping and delivery. As an optimization problem it is a particular case of the Vehicle Routing Problem. In fact, as it was also mentioned on our previous blog entry Intro to Mixed-Integer Linear Programming (MILP), routing optimization belongs to the MILP family of optimization problems.

As any optimization problem LMD presents itself with an objective function, which is the following:

c_ij: Cost of travelling from i to j

x_vij: Binary decision variable indicating whether vehicle v travels from i to j

The optimized solution to this objective function provides the optimal task route for each vehicle that minimizes the total delivery costs, while taking into account all of the following specifications/constraints:

#V vehicles must complete #T tasks at different locations,
#L is the universe of locations,
{{c_ij}} is the (squared) cost matrix with the known “distance” cost between all the locations i, j.
- If measured through the physical distance in kilometers, there is one cost matrix for all the vehicles
- If measured through the gas price between locations, then the cost matrix would be vehicle-specific.
{{t_ij}} is the (also squared) time cost matrix, specifying how long it takes to reach every location i, j,
Each vehicle has a specific capacity C restriction (such as the weight or volume dimensions of a truck),
Every task implies specific capacity demands (for example, the weight and size of the package),
Each vehicle must operate within a specific time window (for example, the driver’s working hours),
Every vehicle is allowed to take breaks of a given duration, taken within a specific time window (for example, the worker’s 30 minute lunch break),
And given a specific time window in which the task should be achieved

If solved manually, these constraints would be mathematically modeled together with other flow conservation constraints.

However, cuOpt doesn’t demand the mathematical model of the problem. It only requires loading the specifications in the payload’s schema.

Last Mile Delivery using cuOpt

How to interpret LMD specifications to cuOpt’s payload?

Easy. Just fill in the fields on the payload. Most of them are self-explanatory, like “vehicle_ids”, “task_locations” or “task_duration”. In this section we will analyze the default payload provided on cuOpts API demo site, also brought to you at the bottom of this blog post under the Appendix: cuOpt example payload section.

In these guidelines we summarized our insights on how the LMD is modeled and all of its constraints fit into the cuOpt API payload:

The first level of the payload has 3 keys: action, data and client_version.
1. "action": "cuOpt_OptimizedRouting" informs cuOpt that the addressed problem is for routing optimization. The value cuOpt_LP with a different data field would be used if the Linear Programming feature was available. Note: The current response of the API with cuOpt_LP is 422 – ‘NVIDIA cuOpt – Linear Programming is disabled’.
2. "data": {...} this field carries all the relevant information of the optimization problem. In the case of routing optimization this includes cost and time matrices, vehicle/fleet constraints, tasks constraints, and solver settings.
3. “client_version” points by default to a valid API version and is mostly unused.
“cost_matrix_data”: {“data”: {“0”: [<the matrix>]}} would be used if it can be assumed that the cost_matrix data is the same for all the vehicles.
“cost_matrix_data”: {“data”: {“1”: [<one matrix>], “2”: [<another matrix>]}} is the syntax used in the example payload below, and means that there are two vehicles, each one with different cost matrices.
The analogy from points 2) and 3) also apply to the time_matrix_data.
Time and cost matrices must be squared. Their dimensions are given by the number of locations in the problem. The locations are given by joining the depot locations and task locations.
1. In the example, there are 3 locations: 0, 1, and 2 after the rows/columns in the time and cost matrices.
2. The fleet_data field of vehicle_locations refers to the starting point/depot of each vehicle. For both vehicles the starting point is location 0. For dimensionality purposes, the locations for each vehicle are presented as [0,0].
3. The task_data field of task_locations refers to the two tasks in the problem being located at 1 and 2. We say that if a vehicle reaches location 1 then that task is covered.
vehicle_max_costs/vehicle_max_time follow the same vehicle specific logic. They are specific upper bounds for each vehicle’s total cost/total travel time.
There are fields referring to time windows:
1. The fleet_data field of vehicle_time_windows provides the time range/time window of each vehicle. In the example there are 2 ranges associating each time window to as many vehicles.
2. The task_data field of task_time_windows provides the interval in which each task should be attended. In the example there are 2 ranges associating each time window to as many task locations.
The fleet_data fields of vehicle_break_time_windows, vehicle_break_durations and vehicle_break_locations refer to the period, duration and location in which the vehicle is allowed to take breaks. As an example, a vehicle driver could take their 30 minute lunch break between 11 am and 2 pm, leaving the truck at a specific parking center.
vehicle_order_match and order_vehicle_match associate the vehicles that must attend specific tasks, and the tasks that should be attended by specific vehicles. If unmatched, this constraint will immediately turn the optimization problem unfeasible.
capacities and demands are constraints that refer to the capacity of the vehicle and the capacity demand of the task. As an example, if a bike messenger is able to carry up to 3 kg of packages (capacity of the vehicle), while each package to be delivered weights 1 kg, then the biker wouldn’t be able to attend more than 3 tasks in their route. In the same scenario, if a task weighed 4 kg, then the biker wouldn’t be able to attend it and the task would be left out.
1. At the provided example there are as many capacity constraints as vehicles – 2 vehicles, 2 capacity specifications.
2. In the example there are as many demand specifications as there are tasks – 2 tasks, 2 demand specifications.
3. It is posible to have several dimensions for capacity/demand, such as weight and volume, as it is also the case in the payload example. We say there is a weight specification for each vehicle, a volume specification for each vehicle, a weight specification for each task, and a volume specification for each task. This explains the capacities and demands dimensionality in the payload.
skip_first_trips (and drop_return_trips) simply determine that the cost from the depot to the first task (and from the last task back to the depot) shouldn’t be accounted for in the objective function. Such is the case if, for example, though the route should ideally end back at the depot, the driver is allowed to run personal errands on the company’s vehicle after work hours.
The field solver_config has the relevant configuration for the solver of the optimization model.
1. time_limit: 1 means that the algorithm should iterate optimizing for no longer than 1 second. A deeper analysis varying this parameter is covered in upcoming following sections.
2. The objectives field defines what the objective function should focus on. The current LMD definition focuses on , translated through the cost: 1 field. This means that the objective function will minimize costs. If there was travel_time: 1 instead, then the objective function would be defined as optimized to minimize the travel time.
3. The same can be interpreted with the variance_route_size and variance_route_service_time parameters, where if set to values different than zero, would be taken into consideration in the objective function.
4. The prize field is a different read of the objective function. If set to a value different from zero, would aim to maximize a prize variable that could stand, for example, for a commission through a reward system, net profits, etc.

Bonus insight: The payload for Linear Programming can take in the data field a maximize: True/False flag and a tolerance field in the solver_config: {“tolerances”: {“optimality”: 0.0001}} field.

What about the necessary token?

Let’s point out that the previous guide for mapping of the problem variables into cuOpt’s payload is useful however you are using cuOpt, whether you’re using the API demo or the python client dependency.

However, we found that the easiest way to test it outside the UI demo is by:

Reusing the provided snippet in your own Python environment.
Clicking on the “Get API Key” button in the UI and following the steps to get a token.
Placing the token in an environment variable that can be referred to in the header “Authorization” field – or the developer’s choice of environment variable usage.

API Response interpretation

The response is much more straightforward than the payload: it provides the number of vehicles used in the found optimal solution, the total cost of the objective function, the delivery route for each vehicle and whether any tasks were dropped.

The most relevant note on the response body is that the “response”[“solver_response”][“status”] field that can be:

0, meaning that a feasible solution was found.
1, meaning that there is no feasible solution to the provided problem.

As an example, the response to the default payload is available on cuOpts API demo site, and is also provided at the bottom of this blog post under the Appendix: cuOpt response payload section.

If unfeasible, the response will include the information of the constraint violation, which can be very helpful.

Bonus note: One request can only handle so much

The small print in the nvidia / cuOpt API reference documentation for the Submit to solver POST endpoint states the following:

“If the size of the data exceeds 250KB, please utilize the large assets API to upload it to s3. In such cases, set the data as null and include the header NVCF-INPUT-ASSET-REFERENCES: $ASSET_ID in the POST request.”

As unlikely as it seems to exceed 250KB in the body of a POST request, should you find yourself in this situation, refer to the reference documentation of NVCF Asset APIs where the usage of List asset, Create asset, Get asset details and Delete asset endpoints are explained.

Dummy LMD problem: cuOpt and Pyomo

In this section we ran a simplification of the 2 vehicles/2 task example provided in the API demo, and modeled it using Pyomo. The full code used to run with cuOpt and pyomo is available at the end of this blog post under the Dummy example run with cuOpt and with Pyomo section of the Appendix. Here, we will cover only the most relevant aspects of the LMD problem model for the research operations enthusiasts.

The simplifications made on the constraints of the example are that now both vehicles have skip_first_trips and skip_first_trips flags as False, capacities/demands constraints were downgraded to only one dimension per vehicle, and breaks-related variables were excluded.

The following sections assume that the pyomo_data variable has the problem parameters loaded in their respective fields. Even with the mentioned simplifications to the payload example problem, the model resulted in the following definitions.

Parameters

Nodes enumerate the possible locations (0, 1 and 2), vehicles are the 2 vehicle tags (“veh-i”) and task_locations are the 2 task locations (1 and 2). Location 0 is the depot.

model.nodes = Set(initialize=pyomo_data["nodes"])
model.vehicles = Set(initialize=pyomo_data["vehicles"])
model.task_locations = Set(initialize=pyomo_data["task_locations"])

def initialize_cost_matrix(model, v, i, j):
    return pyomo_data["cost_matrix"][v][i][j]

model.cost = Param(
model.vehicles, 
model.nodes, 
model.nodes, 
initialize=initialize_cost_matrix
)

model.vehicle_max_cost = Param(
model.vehicles, 
initialize=lambda model, v: pyomo_data['vehicle_max_costs'][list(model.vehicles).index(v)]
)

def initialize_time_cost_matrix(model, v, i, j):
    return pyomo_data["time_cost_matrix"][v][i][j]

model.time_cost = Param(
model.vehicles, 
model.nodes, 
model.nodes, 
initialize=initialize_time_cost_matrix
)

model.vehicle_capacity = Param(
model.vehicles,
initialize=lambda model, v: pyomo_data["capacities"][list(model.vehicles).index(v)]
)

model.demand = Param(
model.task_locations, 
initialize=lambda model, i: pyomo_data["demand"][list(model.task_locations).index(i)]
) 

model.vehicle_time_window_start = Param(
    model.vehicles,
    initialize=lambda model, v: pyomo_data["vehicle_time_windows"][list(model.vehicles).index(v)][0]
)

model.vehicle_time_window_end = Param(
    model.vehicles,
    initialize=lambda model, v: pyomo_data["vehicle_time_windows"][list(model.vehicles).index(v)][1]
)

model.vehicle_max_time = Param(
model.vehicles, 
initialize=lambda model, v: pyomo_data['vehicle_max_times'][list(model.vehicles).index(v)]
)

model.task_time_window_start = Param(
    model.nodes,
    initialize=lambda model, i: pyomo_data['task_time_windows'][list(model.task_locations).index(i)][0] if i != 0 else None  
)

model.task_time_window_end = Param(
    model.nodes,
    initialize=lambda model, i: pyomo_data['task_time_windows'][list(model.task_locations).index(i)][1] if i != 0 else None 
)

Decision variables

model.x = Var(
model.vehicles, 
model.nodes, 
model.nodes, 
domain=Binary
)

model.t = Var(
model.vehicles, 
model.nodes, 
domain=NonNegativeReals
)

Constraints

# 1. Each task must be attended once
def task_served_rule(model, i):
    if i != 0:
        return sum(model.x[v, j, i] for v in model.vehicles for j in model.nodes if j != i) == 1
    return Constraint.Skip

model.task_served = Constraint(model.nodes, rule=task_served_rule)

# 2. Avoid trivial cicles (x[v, i, i] = 0 for all v, i)
def no_self_loops_rule(model, v, i):
    return model.x[v, i, i] == 0

model.no_self_loops = Constraint(model.vehicles, model.nodes, rule=no_self_loops_rule)

# 3. Each vehicle must leave and return to the depot
def leave_depot_leave_rule(model, v):
    # Ensure the vehicle leaves the depot at least once
    return sum(model.x[v, 0, j] for j in model.nodes if j != 0) == 1

model.leave_depot_leave = Constraint(model.vehicles, rule=leave_depot_leave_rule)

def leave_depot_return_rule(model, v):
    # Ensure the vehicle returns to the depot after visiting tasks
    return sum(model.x[v, i, 0] for i in model.nodes if i != 0) == 1

model.leave_depot_return = Constraint(model.vehicles, rule=leave_depot_return_rule)

# 4. There can only be a trainsition from a location to another if the exit location was previously visited
def valid_flow_rule(model, v, i, j):
    if i != j and i != 0 and j != 0:
        return model.x[v, i, j] <= sum(model.x[v, k, i] for k in model.nodes if k != i)
    return Constraint.Skip

model.valid_flow = Constraint(model.vehicles, model.nodes, model.nodes, rule=valid_flow_rule)

# 5. Maximum cost and maximum time constraints per vehicle
def vehicle_max_cost_rule(model, v):
    return sum(model.cost[v, i, j] * model.x[v, i, j] for i in model.nodes for j in model.nodes) <= model.vehicle_max_cost[v]

model.vehicle_max_cost_constraint = Constraint(model.vehicles, rule=vehicle_max_cost_rule)

def vehicle_max_time_rule(model, v):
    return sum(model.time_cost[v, i, j] * model.x[v, i, j] for i in model.nodes for j in model.nodes) <= model.vehicle_max_time[v]

model.vehicle_max_time_constraint = Constraint(model.vehicles, rule=vehicle_max_time_rule)

# 6. Capacity constraint: Total demand for a vehicle cannot exceed its capacity
def capacity_constraint_rule(model, v):
    return sum(model.demand[i] * sum(model.x[v, j, i] for j in model.nodes if j != i) for i in model.task_locations) <= model.vehicle_capacity[v]

model.capacity_constraint = Constraint(model.vehicles, rule=capacity_constraint_rule)

# 7. Ensure the start time of each vehicle at the depot is within its time window
def vehicle_time_window_lower_rule(model, v):
    return model.vehicle_time_window_start[v] <= model.t[v, 0]

def vehicle_time_window_upper_rule(model, v):
    return model.t[v, 0] <= model.vehicle_time_window_end[v]

## Add the constraints for the start and end time windows
model.vehicle_time_window_lower_constraint = Constraint(
    model.vehicles, rule=vehicle_time_window_lower_rule
)
model.vehicle_time_window_upper_constraint = Constraint(
    model.vehicles, rule=vehicle_time_window_upper_rule
)

# 8.  Time window constraint for tasks
# Task time window lower bound constraint
def task_time_window_lower_rule(model, v, i):
    if i != 0:  # Skip depot (node 0)
        return model.task_time_window_start[i] <= model.t[v, i]
    return Constraint.Skip

# Task time window upper bound constraint
def task_time_window_upper_rule(model, v, i):
    if i != 0:  # Skip depot (node 0)
        return model.t[v, i] <= model.task_time_window_end[i]
    return Constraint.Skip

model.task_time_window_lower_constraint = Constraint(
    model.vehicles, model.nodes, rule=task_time_window_lower_rule
)

model.task_time_window_upper_constraint = Constraint(
    model.vehicles, model.nodes, rule=task_time_window_upper_rule
)

Objective Function

# Objective function: minimize total cost of routes
def objective_rule(model):
    return sum(
        model.cost[v, i, j] * model.x[v, i, j]
        for v in model.vehicles
        for i in model.nodes
        for j in model.nodes
    )

model.objective = Objective(rule=objective_rule, sense=minimize)

Solving the model

solver = SolverFactory('glpk') 
result = solver.solve(model, tee=True)

Both the solution with cuOpt and Pyomo generated the same vehicles routing solution. Since this is a dummy example that involves 3 locations, 2 vehicles and 2 tasks, it is simple to visualize as a feasible solution route one where each vehicle attends the smallest-costing task, and then return back to the depot. Also, in such a low dimensional dataset, cuOpts higher performance to Pyomo’s isn’t fairly represented.

This example is still very helpful to get further understanding on cuOpt’s usage, as well as to gaining insight into what happens behind the scenes, and how (fun? painful?) it would be to model the VRP problem from scratch.

cuOpt time-cost analysis on a large dataset

As the cuOpt API provides a solution relative to the time limit parameter it will run the optimization during that time and provide the optimal solution that was found. Therefore, different time limits could correspond to different solutions. The cost function values throughout the iterative process are not provided within the response.

Therefore, in order to analyze the stability of the total cost throughout iterations, we ran a simple scenario of consecutive requests with the same LMD parameters setting the time limit at 5 seconds at first, then at 15 seconds, 30 seconds, 60 seconds, and higher 30 second intervals until 6 minutes.

To raise the stakes, the dataset used to model the LMD problem for this analysis has much higher dimensionality than the dummy problem: it has 100 locations, 97 tasks and up to 10 vehicles with different capacities.

The resulting cost of these consecutive runs are shown in the following plot:

The initial runs can find a feasible solution with total cost of 115,000, while the stable value appears on runs of 150 second-timeout setting and higher, at an approximate cost of 108,000. In this example, the instant solution cost is only 6.5% higher than the stable cost found after a 2 minute run.

The tradeoff between instant optimization and a 2 minute delay with a cost variation of 6.5% really depends on the impact such costs represent to the industry in question. However, there is no doubt that achieving such a low margin in an almost real time response during an iterative process for an optimization problem with such a high dimensionality is certainly impressive, in terms of computerized optimization.

Bonus: And what about Pyomo?

In an attempt to compare cuOpt’s performance to pyomo’s, we modeled the LMD problem for this large dataset by following the same logic we used when implementing the dummy example.

Taking aside the sunk time costs of modeling such a high dimensionality problem theoretically with pyomo, it took at least 30 seconds on an 8GB CPU computer to gather one feasible solution as a starting point. Even when running with scip solver (superior to the standard glpk), not even after 3 hours of running the optimization was the algorithm able to find a remotely close solution cost-wise.

This allowed us to conclude that such a high dimensional problem is too much to ask of a PC, while cuOpt can find a reasonable solution instantly, and an even more precise one within 5 minutes.

Conclusion

When dealing with a vehicle routing problem (VRP) cuOpt is a highly efficient and easy to use microservice. For said problems it admits millions of variables and constraints, which can be a huge advantage against working locally (being computationally unrealistic) or modeling the problem from scratch on a different framework. By using cuOpt, a developer has no need to be an operations research specialist in modeling a routing problem with several constraints and they can save the time of the mathematical problem modelling, which is usually the hardest part in an optimization problem resolution.

When to use cuOpt?

We would recommend adopting cuOpt on scenarios where the VRP optimization model applies, with high requirements of frequent re-calculation and real time optimization needs.

This being said, VRP is only a fraction of the MILP family of optimization problems. Regarding the rest of operational research custom problems that might come up, we would still resort to more model-flexible frameworks like Pyomo. We might need to reassess our options, though, if Linear Programming and Mixed Integer Linear Programming features became available…

Who could benefit from this?

A few examples are logistics companies that handle producer to consumer deliveries, agencies that provide courier services from last mile deliveries to container shipping, restaurant delivery apps, any company leveraging from door-to-door sales, among others.

Kawasaki Heavy Industries and SyncTwin are among the companies that have already benefited from using cuOpt. According to Kawasaki, it’s estimated that such an AI-driven system can save $218 million a year for seven companies from automating their track inspections.

Data concerns

Finally, just like in any other API consumption microservice, sensitive data policies and requirements must be factored in. It is hard to imagine that a VRP problem could imply sensitive information, especially because optimization is frequently a line of work explored in situations when there is not enough data available for a machine learning solution. Still, in order to keep being ethical providers, the client ‘s data treatment policies should meet the providers’ treatment policy of the processed data at all times.

If said scenario was the case, there are encoding techniques that could be used. As obvious as it might seem, as a general rule, avoid using personal e-mails to tag as vehicle identifiers in the cuOpt data payload.

What ‘s next?

cuOpt has taken us this far in Vehicle Routing Problems. We’re excited to learn what cuOpt has to offer with their upcoming Linear Programming and Mixed Integer Linear Programming features.

It is likely that further optimization lines of work are being developed. If NVIDIA continues to follow this path, they could become key players in the “optimization as a service” business.

References

Intro to Mixed-Integer Linear Programming (MILP)
cuOpt Demo video
cuOpt API demo
NVIDIA cuOpt user guide
Submit to solver POST endpoint reference
cuOpt Resources repository: contains a collection of resources demonstrating use of NVIDIA cuOpt via service APIs. Source for the large dataset (100 locations, 97 tasks, 10 vehicles) used on time-cost analysis.
All Aboard: NVIDIA Scores 23 World Records for Route Optimization: Blog entry from NVIDIA’s website

Appendix

cuOpt example request Payload

This is the payload explained in the “How to translate LMD to cuOpt’s payload?” section. It is the same one that is provided in the cuOpts API demo site at the current date.

payload = {
  "action": "cuOpt_OptimizedRouting", # or cuOpt_LP if it was available
  "data": {
    "cost_waypoint_graph_data": None,
    "travel_time_waypoint_graph_data": None,
    "cost_matrix_data": {
      "data": {
        "1": [[0,1,1],
         	[1, 0,1 ],
          	[1,1, 0 ]],
        "2": [[0, 1,1],
          	[1,0,1],
          	[1,2, 0 ]]
      }
    },
    "travel_time_matrix_data": {
      "data": {
        "1": [[0,1,1],
          	[1, 0, 1],
         	 [ 1,1,0]],
        "2": [[0,1, 1],
          	[1,0,1 ],
          	[ 1,2, 0]]
      }
    },
    "fleet_data": {
      "vehicle_locations": [ [0,0 ],
       			 [0, 0 ] ],
      "vehicle_ids": ["veh-1", "veh-2"],
      "capacities": [[ 2,2 ], [ 4,1 ] ],
      "vehicle_time_windows": [ [ 0, 10 ],
       				[  0, 10 ]],
      "vehicle_break_time_windows": [
        [ [ 1, 2],
          [ 2,3] ]
      ],
      "vehicle_break_durations": [ [ 1, 1 ] ],
      "vehicle_break_locations": [ 0, 1 ],
      "vehicle_types": [ 1,  2 ],
      "vehicle_order_match": [
        {
          "order_ids": [ 0 ],
          "vehicle_id": 0
        },
        {
          "order_ids": [ 1 ],
          "vehicle_id": 1
        }
      ],
      "skip_first_trips": [ True, False ],
      "drop_return_trips": [ True, False],
      "min_vehicles": 2,
      "vehicle_max_costs": [ 7, 10 ],
      "vehicle_max_times": [  7,  10 ]
    },
    "task_data": {
      "task_locations": [ 1, 2 ],
      "task_ids": [ "Task-A", "Task-B"],
      "demand": [[ 1, 1 ],
       	 [3, 1]],
      "task_time_windows": [ [  0, 5  ],
        			[3,9 ]],
      "service_times": [ 0,  0],
      "order_vehicle_match": [
        {
          "order_id": 0,
          "vehicle_ids": [ 0]
        },
        {
          "order_id": 1,
          "vehicle_ids": [ 1 ]
        }
      ]
    },
    "solver_config": {
      "time_limit": 1,
      "objectives": {
        "cost": 1,
        "travel_time": 0,
        "variance_route_size": 0,
        "variance_route_service_time": 0,
        "prize": 0
      },
      "verbose_mode": False,
      "error_logging": True
    }
  },
  "client_version": ""
}

cuOpt example response Payload

{
  "response": {
    "solver_response": {
      "status": 0,
      "num_vehicles": 2,
      "solution_cost": 2,
      "objective_values": {
        "cost": 2
      },
      "vehicle_data": {
        "veh-1": {
          "task_id": [
            "Break",
            "Task-A"
          ],
          "arrival_stamp": [
            1,
            2
          ],
          "type": [
            "Break",
            "Delivery"
          ],
          "route": [
            1,
            1
          ]
        },
        "veh-2": {
          "task_id": [
            "Depot",
            "Break",
            "Task-B",
            "Depot"
          ],
          "arrival_stamp": [
            2,
            2,
            4,
            5
          ],
          "type": [
            "Depot",
            "Break",
            "Delivery",
            "Depot"
          ],
          "route": [
            0,
            0,
            2,
            0
          ]
        }
      },
      "dropped_tasks": {
        "task_id": [],
        "task_index": []
      }
    },
    "perf_times": {
      "etl_time": 0.0244138240814209,
      "solver_run_time": 0.42867231369018555
    }
  }
}

Dummy example run with cuOpt and with Pyomo

import sys
import os
import requests
from dotenv import load_dotenv
from pyomo.environ import *

sys.path.append("../")
load_dotenv()

# LMD data
cost_matrix_veh1 = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0]
]
cost_matrix_veh2 = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 2, 0]
]

travel_time_matrix_veh1 = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0]
]

travel_time_matrix_veh2 = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 2, 0]
]

vehicles_initial_locations = [
    [0, 0],
    [0, 0]
]

vehicle_ids = ["veh-1", "veh-2"]

capacities = [
    [2, 4]
]

vehicle_TW = [
    [0, 10],
    [0, 10]
]

vehicle_max_costs = [7, 10]
vehicle_max_times = [7, 10]

task_locations = [1, 2]
task_ids = ["Task-A", "Task-B"]
task_demand = [
      [1, 3]
      ]

task_TW = [
    [0, 5],
    [3, 9]
]

# cuOpt request specifications
invoke_url = "https://optimize.api.nvidia.com/v1/nvidia/cuopt"
fetch_url_format = "https://optimize.api.nvidia.com/v1/status/"

## Add your NGC token to a variable API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC= in your env file
headers = {
    "Authorization": f"Bearer {os.getenv('API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC')}",
    "Accept": "application/json",
}

## Here goes the data from the LMD problem
payload = {
  "action": "cuOpt_OptimizedRouting",
  "data": {
    "cost_waypoint_graph_data": None,
    "travel_time_waypoint_graph_data": None,
    "cost_matrix_data": {
      "data": {
        "1": cost_matrix_veh1,
        "2": cost_matrix_veh2
      }
    },
    "travel_time_matrix_data": {
      "data": {
        "1": travel_time_matrix_veh1,
        "2": travel_time_matrix_veh2
      }
    },
    "fleet_data": {
      "vehicle_locations": vehicles_initial_locations,
      "vehicle_ids": vehicle_ids,
      "capacities": capacities,
      "vehicle_time_windows": vehicle_TW,
      "vehicle_types": [
        1,
        2
      ],
      "min_vehicles": 2,
      "vehicle_max_costs": vehicle_max_costs,
      "vehicle_max_times": vehicle_max_times
    },
    "task_data": {
      "task_locations": task_locations,
      "task_ids": task_ids,
      "demand": task_demand,
      "task_time_windows": task_TW,
    },
    "solver_config": {
      "time_limit": 1,
      "objectives": {
        "cost": 1,
        "travel_time": 0,
        "variance_route_size": 0,
        "variance_route_service_time": 0,
        "prize": 0
      },
      "verbose_mode": False,
      "error_logging": True
    }
  },
  "client_version": ""
}
# re-use connections
session = requests.Session()

response = session.post(invoke_url, headers=headers, json=payload)

while response.status_code == 202:
    request_id = response.headers.get("NVCF-REQID")
    fetch_url = fetch_url_format + request_id
    response = session.get(fetch_url, headers=headers)

response.raise_for_status()
response_body = response.json()

print("==================================================================================")
print(f"---- NVIDIA CuOpt response:")

print(response_body)

print("==================================================================================")

# Load data into a dictionary
pyomo_data = {
    "nodes": list(range(len(cost_matrix_veh1))),
    "vehicles": ["veh-1", "veh-2"], 
    "cost_matrix": {
        "veh-1": cost_matrix_veh1,
        "veh-2": cost_matrix_veh2,
    },
    "time_cost_matrix": {
        "veh-1": travel_time_matrix_veh1,
        "veh-2": travel_time_matrix_veh2
    },
    "task_locations": [1, 2],
    "vehicle_max_costs": vehicle_max_costs,
    "vehicle_max_times": vehicle_max_times,
    'capacities': capacities[0],
    'vehicle_time_windows': vehicle_TW,
    'demand': task_demand[0],
    'task_time_windows': task_TW
}

# Create model
model = ConcreteModel()

# Nodes, vehicles and tasks
model.nodes = Set(initialize=pyomo_data["nodes"])
model.vehicles = Set(initialize=pyomo_data["vehicles"])
model.task_locations = Set(initialize=pyomo_data["task_locations"])

# Parameters
def initialize_cost_matrix(model, v, i, j):
    return pyomo_data["cost_matrix"][v][i][j]

model.cost = Param(model.vehicles, model.nodes, model.nodes, initialize=initialize_cost_matrix)
model.vehicle_max_cost = Param(model.vehicles, initialize=lambda model, v: pyomo_data['vehicle_max_costs'][list(model.vehicles).index(v)])

def initialize_time_cost_matrix(model, v, i, j):
    return pyomo_data["time_cost_matrix"][v][i][j]

model.time_cost = Param(model.vehicles, model.nodes, model.nodes, initialize=initialize_time_cost_matrix)
model.vehicle_capacity = Param(model.vehicles, initialize=lambda model, v: pyomo_data["capacities"][list(model.vehicles).index(v)])
model.demand = Param(model.task_locations, initialize=lambda model, i: pyomo_data["demand"][list(model.task_locations).index(i)]) 

# Add vehicle time window parameters
model.vehicle_time_window_start = Param(
    model.vehicles,
    initialize=lambda model, v: pyomo_data["vehicle_time_windows"][list(model.vehicles).index(v)][0]
)
model.vehicle_time_window_end = Param(
    model.vehicles,
    initialize=lambda model, v: pyomo_data["vehicle_time_windows"][list(model.vehicles).index(v)][1]
)
# Maximum travel time for each vehicle
model.vehicle_max_time = Param(model.vehicles, initialize=lambda model, v: pyomo_data['vehicle_max_times'][list(model.vehicles).index(v)])

# Task time window start and end
model.task_time_window_start = Param(
    model.nodes,
    initialize=lambda model, i: pyomo_data['task_time_windows'][list(model.task_locations).index(i)][0] if i != 0 else None  # Skip depot (0)
)
model.task_time_window_end = Param(
    model.nodes,
    initialize=lambda model, i: pyomo_data['task_time_windows'][list(model.task_locations).index(i)][1] if i != 0 else None  # Skip depot (0)
)

# Variables: cost and time
model.x = Var(model.vehicles, model.nodes, model.nodes, domain=Binary)

# Time variable: Time at which vehicle `v` arrives at node `i`
model.t = Var(model.vehicles, model.nodes, domain=NonNegativeReals)

# Objective function: minimize total cost of routes
def objective_rule(model):
    return sum(
        model.cost[v, i, j] * model.x[v, i, j]
        for v in model.vehicles
        for i in model.nodes
        for j in model.nodes
    )

model.objective = Objective(rule=objective_rule, sense=minimize)

# Constraints

# 1. Each task must be attended once
def task_served_rule(model, i):
    if i != 0:  # Excluir el depósito
        return sum(model.x[v, j, i] for v in model.vehicles for j in model.nodes if j != i) == 1
    return Constraint.Skip

model.task_served = Constraint(model.nodes, rule=task_served_rule)

# 2. Avoid trivial cicles (x[v, i, i] = 0 for all v, i)
def no_self_loops_rule(model, v, i):
    return model.x[v, i, i] == 0

model.no_self_loops = Constraint(model.vehicles, model.nodes, rule=no_self_loops_rule)

# 3. Each vehicle must leave and return to the depot
def leave_depot_leave_rule(model, v):
    # Ensure the vehicle leaves the depot at least once
    return sum(model.x[v, 0, j] for j in model.nodes if j != 0) == 1

model.leave_depot_leave = Constraint(model.vehicles, rule=leave_depot_leave_rule)

def leave_depot_return_rule(model, v):
    # Ensure the vehicle returns to the depot after visiting tasks
    return sum(model.x[v, i, 0] for i in model.nodes if i != 0) == 1

model.leave_depot_return = Constraint(model.vehicles, rule=leave_depot_return_rule)

# 4. There can only be a trainsition from a location to another if the exit location was previously visited
def valid_flow_rule(model, v, i, j):
    if i != j and i != 0 and j != 0:
        return model.x[v, i, j] <= sum(model.x[v, k, i] for k in model.nodes if k != i)
    return Constraint.Skip

model.valid_flow = Constraint(model.vehicles, model.nodes, model.nodes, rule=valid_flow_rule)

# 5. Maximum cost and maximum time constraints per vehicle
def vehicle_max_cost_rule(model, v):
    return sum(model.cost[v, i, j] * model.x[v, i, j] for i in model.nodes for j in model.nodes) <= model.vehicle_max_cost[v]

model.vehicle_max_cost_constraint = Constraint(model.vehicles, rule=vehicle_max_cost_rule)

def vehicle_max_time_rule(model, v):
    return sum(model.time_cost[v, i, j] * model.x[v, i, j] for i in model.nodes for j in model.nodes) <= model.vehicle_max_time[v]

model.vehicle_max_time_constraint = Constraint(model.vehicles, rule=vehicle_max_time_rule)

# 6. Capacity constraint: Total demand for a vehicle cannot exceed its capacity
def capacity_constraint_rule(model, v):
    return sum(model.demand[i] * sum(model.x[v, j, i] for j in model.nodes if j != i) for i in model.task_locations) <= model.vehicle_capacity[v]

model.capacity_constraint = Constraint(model.vehicles, rule=capacity_constraint_rule)

# 7. Ensure the start time of each vehicle at the depot is within its time window
def vehicle_time_window_lower_rule(model, v):
    return model.vehicle_time_window_start[v] <= model.t[v, 0]

def vehicle_time_window_upper_rule(model, v):
    return model.t[v, 0] <= model.vehicle_time_window_end[v]

# ###### Add the constraints for the start and end time windows
model.vehicle_time_window_lower_constraint = Constraint(
    model.vehicles, rule=vehicle_time_window_lower_rule
)
model.vehicle_time_window_upper_constraint = Constraint(
    model.vehicles, rule=vehicle_time_window_upper_rule
)

# 8.  Time window constraint for tasks
# Task time window lower bound constraint
def task_time_window_lower_rule(model, v, i):
    if i != 0:  # Skip depot (node 0)
        return model.task_time_window_start[i] <= model.t[v, i]
    return Constraint.Skip

# Task time window upper bound constraint
def task_time_window_upper_rule(model, v, i):
    if i != 0:  # Skip depot (node 0)
        return model.t[v, i] <= model.task_time_window_end[i]
    return Constraint.Skip

model.task_time_window_lower_constraint = Constraint(
    model.vehicles, model.nodes, rule=task_time_window_lower_rule
)

model.task_time_window_upper_constraint = Constraint(
    model.vehicles, model.nodes, rule=task_time_window_upper_rule
)

print("==================================================================================")
print("---- Pyomo output:")
# Solve model
solver = SolverFactory('glpk') 
result = solver.solve(model, tee=True)

# Add return cost to the total cost in the objective
total_cost = sum(
    model.cost[v, i, j] * model.x[v, i, j].value
    for v in model.vehicles
    for i in model.nodes
    for j in model.nodes
)

# Total Time
total_time = sum(
    model.time_cost[v, i, j] * model.x[v, i, j].value
    for v in model.vehicles
    for i in model.nodes
    for j in model.nodes
)

print("Total Cost:", total_cost)
print("Total Time:", total_time)

print("Solver status:", result.solver.status)
print("Solution state:", result.solver.termination_condition)

for v in model.vehicles:
    for i in model.nodes:
        for j in model.nodes:
            if model.x[v, i, j].value is not None:
                print(f"x[{v},{i},{j}] = {model.x[v, i, j].value}")