Post

Scalable Kernel Inverse Optimization

2024 · NeurIPS · Youyuan Long, Tolga Ok, Pedro Zattoni Scroccaro, and Peyman Mohajerin Esfahani

Inverse Optimization (IO) is a framework for learning the unknown objective function of an expert decision-maker from a past dataset. In this paper, we extend the hypothesis class of IO objective functions to a reproducing kernel Hilbert space (RKHS), thereby enhancing feature representation to an infinite-dimensional space. We validate the generalization capabilities of the proposed models through learning-from-demonstration tasks on the MuJoCo benchmark.

Offline Reinforcement Learning via Inverse Optimization

2024 · Preprint · Ioannis Dimanidis, Tolga Ok, and Peyman Mohajerin Esfahani

We propose a novel offline Reinforcement Learning (ORL) algorithm that leverages the “sub-optimality loss” from Inverse Optimization together with a robust, tractable Model Predictive Control (MPC) expert. It achieves competitive performance on MuJoCo benchmarks with minimal data and computational resources, and is supported by an open-source implementation.

Long-horizon Value Gradient Methods On Stiefel Manifold

2022 · M.S. Thesis - Computer Engineering · Istanbul Technical University

This thesis investigates a geometric approach to improving Value Gradient methods on long trajectories. It focuses on the vanishing and exploding gradient problem, which limits the variance reduction these methods can achieve.

Core Skill Decomposition of Complex Wargames with Reinforcement Learning

2022 · AIAA SCITECH · Kubilay K. Kömürcü, Batuhan Ince, Tolga Ok, Emircan Kilickaya, and Nazim Kemal Üre

This paper applies hierarchical reinforcement learning to StarCraft II and demonstrates competitive performance in challenging settings.