OLAM

class amlgym.algorithms.OLAM.OLAM(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)

Bases: ActiveAlgorithmAdapter

Adapter class for running the OLAM algorithm, introduced in “Online Learning of Action Models for PDDL Planning” by L. Lamanna, A. Saetti, L. Serafini, A. Gerevini, and P. Traverso, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021. https://doi.org/10.24963/ijcai.2021/566

Example

from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

# Reference domain, used only to build the simulated environment
domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)

# Empty input model: gives OLAM the predicate and operator signatures
input_domain_path = empty_domain(domain_ref_path)

# First learning problem of the benchmark, parsed against the reference domain
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)

# Learn an action model by interacting with the simulator
env = SequentialSimulator(problem=problem)
olam = get_algorithm('OLAM', input_domain_path=input_domain_path)
model, trajectory = olam.learn(env)

print("##################### Learned model #####################")
print(model)

print("################# Generated trajectory ##################")
print(trajectory)

Parameters:
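  • input_domain_path (str) – Path to the (possibly empty) input PDDL domain that specifies the predicate and operator signatures

  • planning_timeout (int) – Time limit in seconds for each planning call (default: 30)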

  • max_length (int) – Maximum number of uncertain preconditions/effects considered in goal conjunctions (default: 8)

  • max_subproblems (int) – Maximum number of subproblems when handling object type ambiguity (default: 5)

  • max_goals (int) – Maximum number of disjunctions in a goal formula used during planning for learning preconditions and effects. When the number of generated goals exceeds this limit, some goals are discarded. (default: 10000)
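
The defaults listed above can be gathered in one place when several configurations are compared. The following is a minimal sketch, not part of amlgym: `OLAM_DEFAULTS` and `olam_options` are hypothetical names introduced here, and the commented usage assumes that `get_algorithm` forwards extra keyword arguments to the OLAM constructor, as it does for `input_domain_path` in the example above.

```python
# Defaults copied from the constructor signature documented on this page.
OLAM_DEFAULTS = {
    "planning_timeout": 30,   # seconds per planning call
    "max_length": 8,          # uncertain preconditions/effects per goal conjunction
    "max_subproblems": 5,     # subproblems for object type ambiguity
    "max_goals": 10000,       # disjunctions per goal formula
}


def olam_options(**overrides):
    """Merge overrides into the documented defaults (hypothetical helper).

    Unknown option names are rejected early, so a typo does not silently
    fall back to a default value.
    """
    unknown = set(overrides) - set(OLAM_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown OLAM option(s): {sorted(unknown)}")
    return {**OLAM_DEFAULTS, **overrides}


# Assumed usage (requires amlgym; keyword forwarding is an assumption):
# olam = get_algorithm('OLAM', input_domain_path=input_domain_path,
#                      **olam_options(planning_timeout=60))
```

Keeping the overrides in a plain dictionary also makes it easy to log the exact configuration next to each learned model.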

__init__(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)
input_domain_path: str
learn(simulator, max_steps=10000, seed=123)
Learns a PDDL action model from:
  1. a simulator of the environment to learn from;

  2. a (possibly empty) input model, required to specify the predicate and operator signatures (set via the input_domain_path attribute at instantiation time).

Parameters:
  • simulator (SequentialSimulator) – environment simulator

  • max_steps (int) – maximum number of interaction steps with the simulator (default: 10000)

  • seed (int) – random seed for reproducibility (default: 123)

Return type:

Tuple[str, Trajectory]

Returns:

a string representing the learned PDDL model, and a JSON specification of the trajectory

max_goals: int = 10000
max_length: int = 8
max_subproblems: int = 5
planning_timeout: int = 30