Please use this identifier to cite or link to this item: doi:10.22028/D291-48015
Title: Continuous time reinforcement learning: A random measure approach
Author(s): Bender, Christian
Thuan, Nguyen Tran
Language: English
Journal: Stochastic Processes and their Applications
Volume: 194 (2026)
Publisher/Platform: Elsevier
Year of Publication: 2025
Free key words: Exploratory control
Orthogonal martingale measures
Poisson random measures
Reinforcement learning
Weak convergence
DDC notations: 510 Mathematics
Publication type: Journal Article
Abstract: We present a random measure approach for modeling exploration, i.e., the execution of measure-valued controls, in continuous-time reinforcement learning with controlled diffusion and jumps. We begin with the case in which the randomized control is sampled on a discrete-time grid and reformulate the resulting SDE as an equation driven by suitable random measures. Our main result is a limit theorem for these random measures as the mesh size of the sampling grid goes to zero. The resulting limit SDE can be used for the theoretical analysis of exploratory control problems and for the derivation of learning algorithms.
DOI of the first publication: 10.1016/j.spa.2025.104848
URL of the first publication: https://doi.org/10.1016/j.spa.2025.104848
Link to this record: urn:nbn:de:bsz:291--ds-480157
hdl:20.500.11880/42000
http://dx.doi.org/10.22028/D291-48015
ISSN: 0304-4149
Date of registration: 11-Jun-2026
Description of the related object: Supplementary materials
Related object: https://ars.els-cdn.com/content/image/1-s2.0-S0304414925002923-mmc1.pdf
Faculty: MI - Fakultät für Mathematik und Informatik
Department: MI - Mathematik
Professorship: MI - Prof. Dr. Christian Bender
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
File: 1-s2.0-S0304414925002923-main.pdf
Size: 4.69 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.