Post
Scalable Kernel Inverse Optimization
2024 · NeurIPS · Youyuan Long, Tolga Ok, Pedro Zattoni Scroccaro, and Peyman Mohajerin Esfahani
Inverse Optimization (IO) is a framework for learning the unknown objective function of an expert decision-maker from a past dataset. In this paper, we extend the hypothesis class of IO objective functions to a reproducing kernel Hilbert space (RKHS), thereby enhancing feature representation to an infinite-dimensional space. We validate the generalization capabilities of the proposed models through learning-from-demonstration tasks on the MuJoCo benchmark.
Offline Reinforcement Learning via Inverse Optimization
2024 · Preprint · Ioannis Dimanidis, Tolga Ok, and Peyman Mohajerin Esfahani
We propose a novel offline Reinforcement Learning (ORL) algorithm leveraging “sub-optimality loss” from Inverse Optimization, with a robust, tractable Model Predictive Control (MPC) expert. It achieves competitive performance on MuJoCo benchmarks with minimal data and computational resources, supported by an open-source implementation.
Long-horizon Value Gradient Methods On Stiefel Manifold
2022 · M.S. Thesis - Computer Engineering · Istanbul Technical University
This thesis investigates a geometric approach to improving Value Gradient methods on long trajectories. It focuses on the vanishing and exploding gradient problem, which limits the variance reduction capabilities of these methods.
Core Skill Decomposition of Complex Wargames with Reinforcement Learning
2022 · AIAA SCITECH · Kubilay K. Kömürcü, Batuhan Ince, Tolga Ok, Emircan Kilickaya, and Nazim Kemal Üre
This paper applies hierarchical reinforcement learning to StarCraft II and demonstrates competitive performance in challenging settings.