OLAM
- class amlgym.algorithms.OLAM.OLAM(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)
Bases: ActiveAlgorithmAdapter
Adapter class for running the OLAM algorithm: “Online Learning of Action Models for PDDL Planning”, L. Lamanna, A. Saetti, L. Serafini, A. Gerevini, and P. Traverso, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021. https://doi.org/10.24963/ijcai.2021/566
Example
    from unified_planning.io import PDDLReader
    from unified_planning.shortcuts import SequentialSimulator

    from amlgym.algorithms import get_algorithm
    from amlgym.benchmarks import get_domain_path, get_problems_path
    from amlgym.util.util import empty_domain

    # Reference domain and the empty input model derived from it
    domain = 'blocksworld'
    domain_ref_path = get_domain_path(domain)
    input_domain_path = empty_domain(domain_ref_path)

    # Build a simulator for the first learning problem of the domain
    problem_path = get_problems_path(domain, kind='learning')[0]
    problem = PDDLReader().parse_problem(domain_ref_path, problem_path)
    env = SequentialSimulator(problem=problem)

    # Learn an action model by interacting with the simulator
    olam = get_algorithm('OLAM', input_domain_path=input_domain_path)
    model, trajectory = olam.learn(env)

    print("##################### Learned model #####################")
    print(model)
    print("################# Generated trajectory ##################")
    print(trajectory)
- Parameters:
input_domain_path (str) – Path to the (possibly empty) input PDDL domain that specifies the predicates and operator signatures
planning_timeout (int) – Time limit in seconds for each planning call (default: 30)
max_length (int) – Maximum number of uncertain preconditions/effects considered in goal conjunctions (default: 8)
max_subproblems (int) – Maximum number of subproblems when handling object type ambiguity (default: 5)
max_goals (int) – Maximum number of disjunctions in a goal formula used during planning for learning preconditions and effects. When the number of generated goals exceeds this limit, some goals are discarded. (default: 10000)
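The keyword arguments above can be collected in one place before they are passed to the constructor. The following is a minimal sketch and not part of amlgym: `OLAMSettings` is a hypothetical helper that mirrors the documented defaults, and its rejection of non-positive values is an assumed sanity check rather than documented library behaviour.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class OLAMSettings:
    """Hypothetical container mirroring the documented OLAM defaults."""
    planning_timeout: int = 30   # seconds allowed for each planning call
    max_length: int = 8          # uncertain preconditions/effects per goal conjunction
    max_subproblems: int = 5     # subproblems when handling object type ambiguity
    max_goals: int = 10000       # disjunctions kept in a goal formula

    def __post_init__(self):
        # Assumed check: the library may or may not enforce this itself.
        for name, value in asdict(self).items():
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")

    def as_kwargs(self):
        """Return the settings as keyword arguments for the OLAM constructor."""
        return asdict(self)


# A shorter planning budget and a smaller goal limit; other values stay at default.
settings = OLAMSettings(planning_timeout=10, max_goals=1000)
kwargs = settings.as_kwargs()
print(kwargs)
```

With amlgym installed, these could be forwarded as `get_algorithm('OLAM', input_domain_path=input_domain_path, **kwargs)`, assuming `get_algorithm` passes keyword arguments through to the constructor in the same way it does for `input_domain_path` in the example above.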
- __init__(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)
- input_domain_path: str
- learn(simulator, max_steps=10000, seed=123)
- Learns a PDDL action model from:
a simulator of the environment to learn from;
a (possibly empty) input model, which is required to specify the predicate and operator signatures (set via the input_domain_path attribute at instantiation time).
- Parameters:
simulator (SequentialSimulator) – environment simulator
max_steps (int) – maximum number of interaction steps with the simulator
seed (int) – random seed for reproducibility
- Return type:
Tuple[str, Trajectory]
- Returns:
a string representing the learned PDDL model, and a JSON specification of the trajectory
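Because learn returns the model as a PDDL string, the result can be written straight to disk. The sketch below relies only on the documented `learn(simulator, max_steps, seed)` signature; `learn_and_save` and the output path are hypothetical names introduced for illustration, and the trajectory is handed back untouched since its API is not described here.

```python
from pathlib import Path


def learn_and_save(algorithm, simulator, out_path, max_steps=10000, seed=123):
    """Run learn() and write the learned PDDL model to out_path.

    `algorithm` is any object exposing the documented
    learn(simulator, max_steps=..., seed=...) method, e.g. an OLAM instance.
    """
    model, trajectory = algorithm.learn(simulator, max_steps=max_steps, seed=seed)

    # Create missing parent directories before writing the model string.
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(model, encoding="utf-8")

    return out_path, trajectory
```

For instance, `learn_and_save(olam, env, 'learned/blocksworld.pddl', max_steps=2000)` would cap the interaction at 2000 steps and keep the default seed of 123.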
- max_goals: int = 10000
- max_length: int = 8
- max_subproblems: int = 5
- planning_timeout: int = 30