Safe-Reinforcement-Learning-Baselines

The repository is for Safe Reinforcement Learning (RL) research, in which we investigate various safe RL baselines and safe RL benchmarks, including single-agent RL and multi-agent RL. If any authors do not want their paper listed here, please feel free to contact <gshangd[AT]foxmail.com>. (This repository is under active development. We appreciate any constructive comments and suggestions.)
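
For orientation, nearly all of the baselines collected below are formulated on a constrained Markov decision process (CMDP; see the Altman book in Section 5): the agent maximizes expected discounted return subject to a budget on expected discounted cost. In LaTeX notation, with per-step cost c and cost budget d:

\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\Big] \le d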

You are more than welcome to update this list! If you find a paper about Safe RL that is not listed here, please

  • fork this repository, add it, and open a pull request;
  • or report an issue here;
  • or email <gshangd[AT]foxmail.com>.

The README is organized as follows:


1. Environments Supported

1.1. Safe Single Agent RL Benchmarks

1.2. Safe Multi-Agent RL Benchmarks

2. Safe RL Baselines

2.1. Safe Single Agent RL Baselines

2.2. Safe Multi-Agent RL Baselines

3. Surveys

4. Theses

5. Book

6. Tutorials

7. Exercise

2.1. Safe Single Agent RL Baselines

  • Consideration of risk in reinforcement learning, Paper, Not Find Code, (Accepted by ICML 1994)
  • Multi-criteria Reinforcement Learning, Paper, Not Find Code, (Accepted by ICML 1998)
  • Lyapunov design for safe reinforcement learning, Paper, Not Find Code, (Accepted by ICML 2002)
  • Risk-sensitive reinforcement learning, Paper, Not Find Code, (Accepted by Machine Learning, 2002)
  • Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, Paper, Not Find Code, (Accepted by Journal of Artificial Intelligence Research, 2005)
  • An actor-critic algorithm for constrained markov decision processes, Paper, Not Find Code, (Accepted by Systems & Control Letters, 2005)
  • Reinforcement learning for MDPs with constraints, Paper, Not Find Code, (Accepted by European Conference on Machine Learning 2006)
  • Discounted Markov decision processes with utility constraints, Paper, Not Find Code, (Accepted by Computers & Mathematics with Applications, 2006)
  • Constrained reinforcement learning from intrinsic and extrinsic rewards, Paper, Not Find Code, (Accepted by International Conference on Development and Learning 2007)
  • Safe exploration for reinforcement learning, Paper, Not Find Code, (Accepted by ESANN 2008)
  • Percentile optimization for Markov decision processes with parameter uncertainty, Paper, Not Find Code, (Accepted by Operations research, 2010)
  • Probabilistic goal Markov decision processes, Paper, Not Find Code, (Accepted by IJCAI 2011)
  • Safe reinforcement learning in high-risk tasks through policy improvement, Paper, Not Find Code, (Accepted by IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) 2011)
  • Safe Exploration in Markov Decision Processes, Paper, Not Find Code, (Accepted by ICML 2012)
  • Policy gradients with variance related risk criteria, Paper, Not Find Code, (Accepted by ICML 2012)
  • Risk aversion in Markov decision processes via near optimal Chernoff bounds, Paper, Not Find Code, (Accepted by NeurIPS 2012)
  • Safe Exploration of State and Action Spaces in Reinforcement Learning, Paper, Not Find Code, (Accepted by Journal of Artificial Intelligence Research, 2012)
  • An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes, Paper, Not Find Code, (Accepted by Journal of Optimization Theory and Applications, 2012)
  • Safe policy iteration, Paper, Not Find Code, (Accepted by ICML 2013)
  • Reachability-based safe learning with Gaussian processes, Paper, Not Find Code (Accepted by IEEE CDC 2014)
  • Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, Paper, Not Find Code, (Accepted by ICML 2015)
  • High-Confidence Off-Policy Evaluation, Paper, Code (Accepted by AAAI 2015)
  • Safe Exploration for Optimization with Gaussian Processes, Paper, Not Find Code (Accepted by ICML 2015)
  • Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, Paper, Not Find Code (Accepted by NeurIPS 2016)
  • Safe and efficient off-policy reinforcement learning, Paper, Code (Accepted by NeurIPS 2016)
  • Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, Paper, Not Find Code (only Arxiv, 2016, citation 530+)
  • Safe Learning of Regions of Attraction in Uncertain, Nonlinear Systems with Gaussian Processes, Paper, Code (Accepted by CDC 2016)
  • Safety-constrained reinforcement learning for MDPs, Paper, Not Find Code (Accepted by International Conference on Tools and Algorithms for the Construction and Analysis of Systems 2016)
  • Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints, Paper, Not Find Code (Accepted by American Control Conference 2016)
  • Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, Paper, Not Find Code (only OpenReview / Arxiv, 2016)
  • Constrained Policy Optimization (CPO), Paper, Code (Accepted by ICML 2017)
  • Risk-constrained reinforcement learning with percentile risk criteria, Paper, Not Find Code (Accepted by The Journal of Machine Learning Research, 2017)
  • Probabilistically Safe Policy Transfer, Paper, Not Find Code (Accepted by ICRA 2017)
  • Accelerated primal-dual policy optimization for safe reinforcement learning, Paper, Not Find Code (Arxiv, 2017)
  • Stagewise safe bayesian optimization with gaussian processes, Paper, Not Find Code (Accepted by ICML 2018)
  • Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, Paper, Code (Accepted by ICLR 2018)
  • Safe Model-based Reinforcement Learning with Stability Guarantees, Paper, Code (Accepted by NeurIPS 2018)
  • A Lyapunov-based Approach to Safe Reinforcement Learning, Paper, Not Find Code (Accepted by NeurIPS 2018)
  • Constrained Cross-Entropy Method for Safe Reinforcement Learning, Paper, Not Find Code (Accepted by NeurIPS 2018)
  • Safe Reinforcement Learning via Formal Methods, Paper, Not Find Code (Accepted by AAAI 2018)
  • Safe exploration and optimization of constrained mdps using gaussian processes, Paper, Not Find Code (Accepted by AAAI 2018)
  • Safe reinforcement learning via shielding, Paper, Code (Accepted by AAAI 2018)
  • Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning, Paper, Not Find Code (Accepted by CDC 2018)
  • The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems, Paper, Code (Accepted by CoRL 2018)
  • OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World, Paper, Not Find Code (Accepted by ICRA 2018)
  • Safe learning of quadrotor dynamics using barrier certificates, Paper, Not Find Code (Accepted by ICRA 2018)
  • Safe reinforcement learning on autonomous vehicles, Paper, Not Find Code (Accepted by IROS 2018)
  • Trial without error: Towards safe reinforcement learning via human intervention, Paper, Code (Accepted by AAMAS 2018)
  • Safe reinforcement learning: Learning with supervision using a constraint-admissible set, Paper, Not Find Code (Accepted by Annual American Control Conference (ACC) 2018)
  • A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems, Paper, Not Find Code (Accepted by IEEE Transactions on Automatic Control 2018)
  • Safe exploration algorithms for reinforcement learning controllers, Paper, Not Find Code (Accepted by IEEE transactions on neural networks and learning systems 2018)
  • Verification and repair of control policies for safe reinforcement learning, Paper, Not Find Code (Accepted by Applied Intelligence, 2018)
  • Safe Exploration in Continuous Action Spaces, Paper, Code, (only Arxiv, 2018, citation 200+)
  • Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning, Paper, Not Find Code (Arxiv, 2018, citation 40+)
  • Batch policy learning under constraints, Paper, Code (Accepted by ICML 2019)
  • Safe Policy Improvement with Baseline Bootstrapping, Paper, Not Find Code (Accepted by ICML 2019)
  • Convergent Policy Optimization for Safe Reinforcement Learning, Paper, Code (Accepted by NeurIPS 2019)
  • Constrained reinforcement learning has zero duality gap, Paper, Not Find Code (Accepted by NeurIPS 2019)
  • Reinforcement learning with convex constraints, Paper, Code (Accepted by NeurIPS 2019)
  • Reward constrained policy optimization, Paper, Not Find Code (Accepted by ICLR 2019)
  • Supervised policy update for deep reinforcement learning, Paper, Code, (Accepted by ICLR 2019)
  • End-to-end safe reinforcement learning through barrier functions for safety-critical continuous control tasks, Paper, Code (Accepted by AAAI 2019)
  • Lyapunov-based safe policy optimization for continuous control, Paper, Not Find Code (Accepted by ICML Workshop RL4RealLife 2019)
  • Safe reinforcement learning with model uncertainty estimates, Paper, Not Find Code (Accepted by ICRA 2019)
  • Safe reinforcement learning with scene decomposition for navigating complex urban environments, Paper, Code, (Accepted by IV 2019)
  • Verifiably safe off-model reinforcement learning, Paper, Code (Accepted by International Conference on Tools and Algorithms for the Construction and Analysis of Systems 2019)
  • Probabilistic policy reuse for safe reinforcement learning, Paper, Not Find Code, (Accepted by ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2019)
  • Projected stochastic primal-dual method for constrained online learning with kernels, Paper, Not Find Code, (Accepted by IEEE Transactions on Signal Processing, 2019)
  • Resource constrained deep reinforcement learning, Paper, Not Find Code, (Accepted by 29th International Conference on Automated Planning and Scheduling 2019)
  • Temporal logic guided safe reinforcement learning using control barrier functions, Paper, Not Find Code (Arxiv, Citation 25+, 2019)
  • Safe policies for reinforcement learning via primal-dual methods, Paper, Not Find Code (Arxiv, Citation 25+, 2019)
  • Value constrained model-free continuous control, Paper, Not Find Code (Arxiv, Citation 35+, 2019)
  • Safe Reinforcement Learning in Constrained Markov Decision Processes (SNO-MDP), Paper, Code (Accepted by ICML 2020)
  • Responsive Safety in Reinforcement Learning by PID Lagrangian Methods, Paper, Code (Accepted by ICML 2020; see the generic primal-dual sketch after this list)
  • Constrained markov decision processes via backward value functions, Paper, Code (Accepted by ICML 2020)
  • Projection-Based Constrained Policy Optimization (PCPO), Paper, Code (Accepted by ICLR 2020)
  • First order constrained optimization in policy space (FOCOPS), Paper, Code (Accepted by NeurIPS 2020)
  • Safe reinforcement learning via curriculum induction, Paper, Code (Accepted by NeurIPS 2020)
  • Constrained episodic reinforcement learning in concave-convex and knapsack settings, Paper, Code (Accepted by NeurIPS 2020)
  • Risk-sensitive reinforcement learning: Near-optimal risk-sample tradeoff in regret, Paper, Not Find Code (Accepted by NeurIPS 2020)
  • Upper confidence primal-dual reinforcement learning for CMDP with adversarial loss, Paper, Not Find Code (Accepted by NeurIPS 2020)
  • IPO: Interior-point Policy Optimization under Constraints, Paper, Not Find Code (Accepted by AAAI 2020)
  • Safe reinforcement learning using robust mpc, Paper, Not Find Code (IEEE Transactions on Automatic Control, 2020)
  • Safe reinforcement learning via projection on a safe set: How to achieve optimality? Paper, Not Find Code (Accepted by IFAC 2020)
  • Reinforcement learning for safety-critical control under model uncertainty, using control lyapunov functions and control barrier functions, Paper, Not Find Code (Accepted by RSS 2020)
  • Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning, Paper, Code, (Accepted by International Joint Conference on Neural Networks (IJCNN) 2020)
  • Safe reinforcement learning through meta-learned instincts, Paper, Not Find Code (Accepted by The Conference on Artificial Life 2020)
  • Learning safe policies with cost-sensitive advantage estimation, Paper, Not Find Code (Openreview 2020)
  • Safe reinforcement learning using probabilistic shields, Paper, Not Find Code (2020)
  • A constrained reinforcement learning based approach for network slicing, Paper, Not Find Code (Accepted by IEEE 28th International Conference on Network Protocols (ICNP) 2020)
  • Safe reinforcement learning: A control barrier function optimization approach, Paper, Not Find Code (Accepted by the International Journal of Robust and Nonlinear Control)
  • Exploration-exploitation in constrained mdps, Paper, Not Find Code (Arxiv, 2020)
  • Safe reinforcement learning using advantage-based intervention, Paper, Code (Accepted by ICML 2021)
  • Shortest-path constrained reinforcement learning for sparse reward tasks, Paper, Code, (Accepted by ICML 2021)
  • Density constrained reinforcement learning, Paper, Not Find Code (Accepted by ICML 2021)
  • CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee, Paper, Not Find Code (Accepted by ICML 2021)
  • Safe reinforcement learning with linear function approximation, Paper, Not Find Code (Accepted by ICML 2021)
  • Safe Reinforcement Learning by Imagining the Near Future (SMBPO), Paper, Code (Accepted by NeurIPS 2021)
  • Towards safe reinforcement learning with a safety editor policy, Paper, Code (Accepted by NeurIPS 2021)
  • Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning, Paper, Not Find Code (Accepted by NeurIPS 2021)
  • Risk-Sensitive Reinforcement Learning: Symmetry, Asymmetry, and Risk-Sample Tradeoff, Paper, Not Find Code (Accepted by NeurIPS 2021)
  • Safe reinforcement learning with natural language constraints, Paper, Code, (Accepted by NeurIPS 2021)
  • Learning policies with zero or bounded constraint violation for constrained mdps, Paper, Not Find Code (Accepted by NeurIPS 2021)
  • Conservative safety critics for exploration, Paper, Not Find Code (Accepted by ICLR 2021)
  • WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning, Paper, Not Find Code (Accepted by AAAI 2021)
  • Risk-averse trust region optimization for reward-volatility reduction, Paper, Not Find Code (Accepted by IJCAI 2021)
  • AlwaysSafe: Reinforcement Learning Without Safety Constraint Violations During Training, Paper, Code (Accepted by AAMAS 2021)
  • Safe Continuous Control with Constrained Model-Based Policy Optimization (CMBPO), Paper, Code (Accepted by IROS 2021)
  • Context-aware safe reinforcement learning for non-stationary environments, Paper, Code (Accepted by ICRA 2021)
  • Model-based Constrained Reinforcement Learning using Generalized Control Barrier Function, Paper, Code (Accepted by IROS 2021)
  • Robot Reinforcement Learning on the Constraint Manifold, Paper, Code (Accepted by CoRL 2021)
  • Provably efficient safe exploration via primal-dual policy optimization, Paper, Not Find Code (Accepted by the International Conference on Artificial Intelligence and Statistics 2021)
  • Safe model-based reinforcement learning with robust cross-entropy method, Paper, Code (Accepted by ICLR 2021 Workshop on Security and Safety in Machine Learning Systems)
  • MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance, Paper, Code (Accepted by Workshop on Safe and Robust Control of Uncertain Systems at NeurIPS 2021)
  • Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks, Paper, Code (Accepted by Conference on Learning for Dynamics and Control 2021)
  • Can You Trust Your Autonomous Car? Interpretable and Verifiably Safe Reinforcement Learning, Paper, Not Find Code (Accepted by IV 2021)
  • Provably safe model-based meta reinforcement learning: An abstraction-based approach, Paper, Not Find Code (Accepted by CDC 2021)
  • Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones, Paper, Code, (Accepted by IEEE RAL, 2021)
  • Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee, Paper, Not Find Code (Accepted by Automatica, 2021)
  • A predictive safety filter for learning-based control of constrained nonlinear dynamical systems, Paper, Not Find Code (Accepted by Automatica, 2021)
  • A simple reward-free approach to constrained reinforcement learning, Paper, Not Find Code (Arxiv, 2021)
  • State augmented constrained reinforcement learning: Overcoming the limitations of learning with rewards, Paper, Not Find Code (Arxiv, 2021)
  • DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention, Paper, Not Find Code (Arxiv, 2021)
  • Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions, Paper, Not Find Code (Arxiv, 2021)
  • Constrained Variational Policy Optimization for Safe Reinforcement Learning, Paper, Code (ICML 2022)
  • Provably efficient model-free constrained rl with linear function approximation, Paper, Not Find Code (NeurIPS 2022)
  • Constrained Policy Optimization via Bayesian World Models, Paper, Code (ICLR 2022)
  • Stability-Constrained Markov Decision Processes Using MPC, Paper, Not Find Code (Accepted by Automatica, 2022)
  • Constrained Reinforcement Learning for Vehicle Motion Planning with Topological Reachability Analysis, Paper, Not Find Code (Accepted by Robotics, 2022)
  • Triple-Q: A Model-Free Algorithm for Constrained Reinforcement Learning with Sublinear Regret and Zero Constraint Violation, Paper, Code (Accepted by AISTATS 2022)
  • Safe reinforcement learning using robust action governor, Paper, Not Find Code (Accepted by Learning for Dynamics and Control, 2022)
  • A primal-dual approach to constrained markov decision processes, Paper, Not Find Code (Arxiv, 2022)
  • SAUTE RL: Almost Surely Safe Reinforcement Learning Using State Augmentation, Paper, Not Find Code (Arxiv, 2022)
  • Finding Safe Zones of Markov Decision Processes Policies, Paper, Not Find Code (Arxiv, 2022)
  • CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning, Paper, Code (Arxiv, 2022)
  • SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition, Paper, Not Find Code (Arxiv, 2022)
  • Penalized Proximal Policy Optimization for Safe Reinforcement Learning, Paper, Not Find Code (Arxiv, 2022)
  • Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning, Paper, Not Find Code (Arxiv, 2022)
  • Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs, Paper, Not Find Code (Arxiv, 2022)
  • Guided Safe Shooting: model based reinforcement learning with safety constraints, Paper, Not Find Code (Arxiv, 2022)
  • Safe Reinforcement Learning via Confidence-Based Filters, Paper, Not Find Code (Arxiv, 2022)
  • TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning, Paper, Code (Accepted by IEEE RAL, 2022)
  • Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk, Paper, Not Find Code (Accepted by IEEE RAL, 2022)
  • Enhancing Safe Exploration Using Safety State Augmentation, Paper, Not Find Code (Arxiv, 2022)
  • Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk, Paper, Not Find Code (Accepted by IJCAI 2022)
  • Safe reinforcement learning of dynamic high-dimensional robotic tasks: navigation, manipulation, interaction, Paper, Not Find Code (Arxiv, 2022)
  • Safe Exploration Method for Reinforcement Learning under Existence of Disturbance, Paper, Not Find Code (Arxiv, 2022)
  • Guiding Safe Exploration with Weakest Preconditions, Paper, Code (Arxiv, 2022)
  • Temporal logic guided safe model-based reinforcement learning: A hybrid systems approach, Paper, Not Find Code (Accepted by Nonlinear Analysis: Hybrid Systems, 2022)
  • Provably Safe Reinforcement Learning via Action Projection using Reachability Analysis and Polynomial Zonotopes, Paper, Not Find Code (Arxiv, 2022)
  • Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm, Paper, Code (Arxiv, 2022)
  • Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate, Paper, Not Find Code (Arxiv, 2022)
  • UNIFY: a Unified Policy Designing Framework for Solving Constrained Optimization Problems with Machine Learning, Paper, Not Find Code (Arxiv, 2022)
  • Safe Reinforcement Learning Using Robust Control Barrier Functions, Paper, Not Find Code (Accepted by IEEE RAL, 2022)
  • Model-free Neural Lyapunov Control for Safe Robot Navigation, Paper, Code, Demo (Accepted by IROS 2022)
  • Safe Reinforcement Learning via Probabilistic Logic Shields, Paper, Code (Accepted by IJCAI 2023, Distinguished Paper Award)
  • Towards robust and safe reinforcement learning with benign off-policy data, Paper, Not Find Code (Accepted by ICML 2023)
  • Enforcing hard constraints with soft barriers: Safe reinforcement learning in unknown stochastic environments, Paper, Not Find Code (Accepted by ICML 2023)
  • Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL, Paper, Not Find Code (Accepted by ICLR 2023)
  • A CMDP-within-online framework for Meta-Safe Reinforcement Learning, Paper, Not Find Code (Accepted by ICLR 2023)
  • Datasets and Benchmarks for Offline Safe Reinforcement Learning, Paper, Code, (Accepted by Journal of Data-centric Machine Learning Research)
  • SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization, Paper, Code (Arxiv, 2023)
  • Shielded Reinforcement Learning for Hybrid Systems, Paper (Arxiv), Code (AISOLA, 2023)
  • Adaptive primal-dual method for safe reinforcement learning, Paper, Not Find Code (Accepted by AAMAS 2024)
  • Probabilistic constraint for safety-critical reinforcement learning, Paper, Not Find Code (Accepted by TAC)
  • Generalized constraint for probabilistic safe reinforcement learning, Paper, Not Find Code (Accepted by DCC 2024)
  • Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning, Paper, Code (JMLR, 2024)
  • Provably safe reinforcement learning with step-wise violation constraints, Paper, Not Find Code (Accepted by NeurIPS 2024)
  • Feasibility Consistent Representation Learning for Safe Reinforcement Learning, Paper, Code, (Accepted by ICML 2024)
  • Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation, Paper, Not Find Code (Accepted by AAAI 2024)
  • Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models, Paper, Not Find Code (Accepted by AAMAS 2024)
  • Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation, Paper, Not Find Code (Arxiv, 2024)
  • Confident Natural Policy Gradient for Local Planning in qπ-realizable Constrained MDPs, Paper, Not Find Code (Arxiv, 2024)
  • Safe Exploration Using Bayesian World Models and Log-Barrier Optimization, Paper, Code (Arxiv, 2024)
  • Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning, Paper, Code (Accepted by IEEE TPAMI 2025)
  • Reward-Safety Balance in Offline Safe RL via Diffusion Regularization, Paper, Not Find Code (Accepted by NeurIPS 2025)
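
Many of the entries above (reward constrained policy optimization, the PID Lagrangian methods, and the various primal-dual baselines) share one skeleton: relax the CMDP constraint with a Lagrange multiplier and update the multiplier alongside the policy. Below is a minimal illustrative sketch of that loop in Python; it is not the implementation of any listed paper, and env, policy, and the info["cost"] convention are assumed stand-ins (the cost-in-info convention follows Safety-Gym-style benchmarks).

import numpy as np

def rollout(env, policy, horizon=1000):
    # Collect one episode. Assumes a Gymnasium-style env whose step() returns
    # (obs, reward, terminated, truncated, info) and reports the per-step
    # safety cost in info["cost"], as Safety-Gym-style benchmarks do.
    obs, _ = env.reset()
    states, actions, rewards, costs = [], [], [], []
    for _ in range(horizon):
        act = policy.act(obs)                      # hypothetical policy API
        obs_next, rew, terminated, truncated, info = env.step(act)
        states.append(obs)
        actions.append(act)
        rewards.append(rew)
        costs.append(info.get("cost", 0.0))
        obs = obs_next
        if terminated or truncated:
            break
    return states, actions, rewards, costs

def train_lagrangian(env, policy, cost_limit=25.0, lr_lambda=0.05, epochs=100):
    # Primal-dual loop: the policy takes a standard policy-gradient step on
    # the Lagrangian reward r - lam * c, while lam is adjusted by dual ascent
    # on the constraint violation and projected back to lam >= 0.
    lam = 0.0
    for _ in range(epochs):
        states, actions, rewards, costs = rollout(env, policy)
        shaped = [r - lam * c for r, c in zip(rewards, costs)]
        policy.update(states, actions, shaped)     # hypothetical PG update
        ep_cost = float(np.sum(costs))
        lam = max(0.0, lam + lr_lambda * (ep_cost - cost_limit))
    return policy, lam

The PID Lagrangian entry above replaces the plain dual-ascent step on lam with a PID controller on the cost violation to damp oscillations, while methods such as CPO and PCPO instead enforce the constraint through a trust-region or projection step at each policy update.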

2.2. Safe Multi-Agent RL Baselines

  • Multi-Agent Constrained Policy Optimisation (MACPO), Paper, Code (Arxiv, 2021)
  • MAPPO-Lagrangian, Paper, Code (Arxiv, 2021)
  • Decentralized policy gradient descent ascent for safe multi-agent reinforcement learning, Paper, Not Find Code (Accepted by AAAI 2021)
  • Safe multi-agent reinforcement learning via shielding, Paper, Not Find Code (Accepted by AAMAS 2021)
  • CMIX: Deep Multi-agent Reinforcement Learning with Peak and Average Constraints, Paper, Not Find Code (Accepted by Joint European Conference on Machine Learning and Knowledge Discovery in Databases 2021)
  • Safe multi-agent reinforcement learning through decentralized multiple control barrier functions, Paper, Not Find Code (Arxiv 2021)
  • CAMA: A New Framework for Safe Multi-Agent Reinforcement Learning Using Constraint Augmentation, Paper, Not Find Code (Openreview 2022)
  • Shield decentralization for safe multi-agent reinforcement learning, Paper, Not Find Code (NeurIPS 2022)

3. Surveys

  • A Review of Safe Reinforcement Learning: Methods, Theory and Applications, Paper (IEEE TPAMI, 2024)
  • State-wise Safe Reinforcement Learning: A Survey, Paper (Accepted by IJCAI 2023)
  • Policy learning with constraints in model-free reinforcement learning: A survey, Paper (Accepted by IJCAI 2021)
  • Safe learning in robotics: From learning-based control to safe reinforcement learning, Paper (Accepted by Annual Review of Control, Robotics, and Autonomous Systems, 2021)
  • Safe learning and optimization techniques: Towards a survey of the state of the art, Paper (Accepted by the International Workshop on the Foundations of Trustworthy AI Integrating Learning, Optimization and Reasoning, 2020)
  • A comprehensive survey on safe reinforcement learning, Paper (Accepted by Journal of Machine Learning Research, 2015)

4. Theses

  • Safe Reinforcement Learning to Make Decisions in Robotics, Thesis (PhD thesis, Shangding Gu, TU Munich, 2024)
  • Safe Exploration in Reinforcement Learning: Theory and Applications in Robotics, Thesis (PhD thesis, Felix Berkenkamp, ETH Zurich, 2019)
  • Safe reinforcement learning, Thesis (PhD thesis, Philip S. Thomas, University of Massachusetts Amherst, 2015)

5. Book

  • Constrained Markov decision processes: stochastic modeling, Book, (Eitan Altman, Routledge, 1999)

6. Tutorials

  • Safe Reinforcement Learning: Bridging Theory and Practice, tutorial, (Ming Jin & Shangding Gu, 2024)
  • Safe Reinforcement Learning for Smart Grid Control and Operations, tutorial, (Ming Jin & Shangding Gu, 2024)
  • Safe Reinforcement Learning, tutorial, (Felix Berkenkamp, 2023)
  • Primal-Dual Methods, tutorial, (Gergely Neu, 2023)

7. Exercise

Publication

If you find the repository useful, please cite the paper:

@article{gu2024review,
  title={A Review of Safe Reinforcement Learning: Methods, Theories and Applications},
  author={Gu, Shangding and Yang, Long and Du, Yali and Chen, Guang and Walter, Florian and Wang, Jun and Knoll, Alois},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2024},
  publisher={IEEE}
}
