amlgym.algorithms package

Submodules

amlgym.algorithms.ActiveAlgorithmAdapter module

class amlgym.algorithms.ActiveAlgorithmAdapter.ActiveAlgorithmAdapter(input_domain_path)[source]

Bases: ABC

An abstract class for an active action model learning algorithm, which defines the abstract interface that must be implemented by every (subclass) algorithm adapter.

input_domain_path: str
abstract learn(simulator, max_steps=100, seed=123)[source]

Learns a PDDL action model by acting within a simulated environment.

Parameters:
  • simulator (SequentialSimulator) – environment simulator

  • max_steps (int) – maximum number of interaction steps with the simulator

  • seed (int) – random seed for reproducibility

Return type:

Tuple[str, Trajectory]

Returns:

a string representing the learned PDDL model, and a JSON specification of the trajectory

amlgym.algorithms.InformationGainAgent module

class amlgym.algorithms.InformationGainAgent.InformationGainAgent(input_domain_path, use_object_subset=True, spare_objects_per_type=2, model_mode='safe', learn_negative_preconditions=True, selection_strategy='greedy', epsilon=0.1, temperature=1.0, lookahead_depth=2, lookahead_top_k=5, lookahead_discount=0.9, mcts_iterations=50, mcts_rollout_depth=5)[source]

Bases: ActiveAlgorithmAdapter

Online action model learning via information gain.

Uses a CNF/SAT-based, information-theoretic approach to select the actions that maximize the expected information gain about the action model.

Parameters:
  • use_object_subset (bool) – Enable object subset selection for reduced grounding

  • spare_objects_per_type (int) – Extra objects per type beyond minimum requirement (for subset selection)

  • model_mode (str) – “safe” (all possible preconditions, confirmed effects only) or “complete” (certain preconditions only, all possible effects)

  • learn_negative_preconditions (bool) – Whether to learn negative preconditions

  • selection_strategy (str) – Action selection strategy. One of:

      - “greedy”: always select the action with the highest information gain (default)
      - “epsilon_greedy”: explore with probability epsilon
      - “boltzmann”: softmax probabilistic selection
      - “lookahead”: depth-limited lookahead with discounted future gain
      - “mcts”: full UCT-based Monte Carlo Tree Search

  • lookahead_depth (int) – Lookahead depth for ‘lookahead’ strategy (default: 2)

  • lookahead_top_k (int) – Number of top actions to evaluate in lookahead (default: 5)

  • lookahead_discount (float) – Discount factor for future gain in lookahead (default: 0.9)

  • epsilon (float) – Exploration probability for ‘epsilon_greedy’ strategy (default: 0.1)

  • temperature (float) – Temperature for ‘boltzmann’ softmax selection (default: 1.0)

  • mcts_iterations (int) – Number of MCTS iterations per action selection (default: 50)

  • mcts_rollout_depth (int) – Simulation depth during MCTS rollout phase (default: 5)
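
The greedy, epsilon_greedy, and boltzmann strategies listed above can be illustrated with a minimal sketch. It assumes every candidate action already has an information gain score. The select_action helper and the gains dictionary are hypothetical names introduced for illustration; they are not part of amlgym, and the agent's internal implementation may differ.

```python
import math
import random
from typing import Dict, Optional


def select_action(
    gains: Dict[str, float],
    strategy: str = "greedy",
    epsilon: float = 0.1,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick an action name from hypothetical information gain scores."""
    rng = rng or random.Random()
    actions = sorted(gains)  # fixed order keeps seeded runs reproducible
    best = max(actions, key=lambda a: gains[a])

    if strategy == "greedy":
        return best
    if strategy == "epsilon_greedy":
        # Explore uniformly with probability epsilon, otherwise exploit.
        return rng.choice(actions) if rng.random() < epsilon else best
    if strategy == "boltzmann":
        # Softmax over gains; subtracting the max avoids overflow in exp().
        top = gains[best]
        weights = [math.exp((gains[a] - top) / temperature) for a in actions]
        return rng.choices(actions, weights=weights, k=1)[0]
    raise ValueError(f"unknown strategy: {strategy!r}")


# Made-up scores for three grounded blocksworld actions.
gains = {"pick-up a": 0.2, "stack a b": 0.9, "put-down a": 0.4}
print(select_action(gains, "greedy"))  # prints: stack a b
```

In this sketch a lower temperature concentrates the softmax on the highest-gain action, while a higher temperature moves it toward uniform sampling.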

Example

from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)

env = SequentialSimulator(problem=problem)
info_gain = get_algorithm('InformationGainAgent', input_domain_path=input_domain_path)
model, trajectory = info_gain.learn(env, max_steps=100)

# With lookahead strategy
info_gain = get_algorithm(
    'InformationGainAgent',
    input_domain_path=input_domain_path,
    selection_strategy='lookahead',
    lookahead_depth=3,
)
model, trajectory = info_gain.learn(env, max_steps=100)

print("##################### Learned model #####################")
print(model)

print("################# Generated trajectory ##################")
print(trajectory)

epsilon: float = 0.1
input_domain_path: str
learn(simulator, max_steps=500, seed=123)[source]

Learn a PDDL action model by interacting with the environment.

Parameters:
  • simulator (SequentialSimulator) – environment simulator

  • max_steps (int) – maximum number of interaction steps with the simulator

  • seed (int) – random seed for reproducibility

Return type:

Tuple[str, Trajectory]

Returns:

(learned PDDL model string, trajectory)

learn_negative_preconditions: bool = True
lookahead_depth: int = 2
lookahead_discount: float = 0.9
lookahead_top_k: int = 5
mcts_iterations: int = 50
mcts_rollout_depth: int = 5
model_mode: str = 'safe'
selection_strategy: str = 'greedy'
spare_objects_per_type: int = 2
temperature: float = 1.0
use_object_subset: bool = True
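
The difference between the two model_mode values can be pictured with plain sets. This is only an illustration of the parameter description: the extract_model helper, its arguments, and the example literals are hypothetical, and the agent's actual CNF/SAT-based representation is not shown.

```python
from typing import Dict, Set


def extract_model(
    possible_pre: Set[str],
    certain_pre: Set[str],
    confirmed_eff: Set[str],
    possible_eff: Set[str],
    model_mode: str = "safe",
) -> Dict[str, Set[str]]:
    """Return preconditions and effects for one operator (hypothetical helper)."""
    if model_mode == "safe":
        # Keep every precondition not yet ruled out, and only observed effects.
        return {"preconditions": set(possible_pre), "effects": set(confirmed_eff)}
    if model_mode == "complete":
        # Keep only preconditions known to be required, and every effect not ruled out.
        return {"preconditions": set(certain_pre), "effects": set(possible_eff)}
    raise ValueError(f"unknown model_mode: {model_mode!r}")


# Made-up knowledge about a 'pick-up' operator after a few observations.
knowledge = dict(
    possible_pre={"clear ?x", "handempty", "ontable ?x"},
    certain_pre={"clear ?x"},
    confirmed_eff={"holding ?x"},
    possible_eff={"holding ?x", "not handempty"},
)
for mode in ("safe", "complete"):
    model = extract_model(**knowledge, model_mode=mode)
    print(mode, sorted(model["preconditions"]), sorted(model["effects"]))
```

Under these assumptions the "safe" model is the more restrictive one (more preconditions, fewer effects), and the "complete" model is the more permissive one.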

amlgym.algorithms.NOLAM module

class amlgym.algorithms.NOLAM.NOLAM(noise=0.0)[source]

Bases: PassiveAlgorithmAdapter

Adapter class for running the NOLAM algorithm: “Action Model Learning from Noisy Traces: a Probabilistic Approach”, L. Lamanna and L. Serafini, Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling, 2024. https://ojs.aaai.org/index.php/ICAPS/article/view/31493

Parameters:

noise (float) – The observation noise.

Example

from amlgym.algorithms import get_algorithm
nolam = get_algorithm('NOLAM')
model = nolam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)

learn(domain_path, trajectory_paths)[source]
Learns a PDDL action model from:
  1. a (possibly empty) input model which is required to specify the predicates and operators signature;

  2. a list of trajectory file paths.

Parameters:
  • domain_path (str) – input PDDL domain file path

  • trajectory_paths (List[str]) – list of trajectory file paths

Return type:

str

Returns:

a string representing the learned PDDL model

noise: float = 0.0

amlgym.algorithms.OLAM module

class amlgym.algorithms.OLAM.OLAM(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)[source]

Bases: ActiveAlgorithmAdapter

Adapter class for running the OLAM algorithm: “Online Learning of Action Models for PDDL Planning”, L. Lamanna, A. Saetti, L. Serafini, A. Gerevini, and P. Traverso, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021. https://doi.org/10.24963/ijcai.2021/566

Example

from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)

env = SequentialSimulator(problem=problem)
olam = get_algorithm('OLAM', input_domain_path=input_domain_path)
model, trajectory = olam.learn(env)

print("##################### Learned model #####################")
print(model)

print("################# Generated trajectory ##################")
print(trajectory)

Parameters:
  • planning_timeout (int) – Time limit in seconds for each planning call (default: 30)

  • max_length (int) – Maximum number of uncertain preconditions/effects considered in goal conjunctions (default: 8)

  • max_subproblems (int) – Maximum number of subproblems when handling object type ambiguity (default: 5)

  • max_goals (int) – Maximum number of disjunctions in a goal formula used during planning for learning preconditions and effects. When the number of generated goals exceeds this limit, some goals are discarded. (default: 10000)

input_domain_path: str
learn(simulator, max_steps=10000, seed=123)[source]
Learns a PDDL action model from:
  1. a simulator of the environment to learn from;

  2. a (possibly empty) input model, which is required to specify the predicates and operators signature (set via the input_domain_path attribute at instantiation time).

Parameters:
  • simulator (SequentialSimulator) – environment simulator

  • max_steps (int) – maximum number of interaction steps with the simulator

  • seed (int) – random seed for reproducibility

Return type:

Tuple[str, Trajectory]

Returns:

a string representing the learned PDDL model, and a JSON specification of the trajectory

max_goals: int = 10000
max_length: int = 8
max_subproblems: int = 5
planning_timeout: int = 30

amlgym.algorithms.OffLAM module

class amlgym.algorithms.OffLAM.OffLAM(**kwargs)[source]

Bases: PassiveAlgorithmAdapter

Adapter class for running the OffLAM algorithm: “Lifted Action Models Learning from Partial Traces”, L. Lamanna, L. Serafini, A. Saetti, A. Gerevini, and P. Traverso, Artificial Intelligence Journal, 2025. https://www.sciencedirect.com/science/article/abs/pii/S0004370224001929

Example

from amlgym.algorithms import get_algorithm
offlam = get_algorithm('OffLAM')
model = offlam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)

learn(domain_path, trajectory_paths)[source]
Learns a PDDL action model from:
  1. a (possibly empty) input model which is required to specify the predicates and operators signature;

  2. a list of trajectory file paths.

Parameters:
  • domain_path (str) – input PDDL domain file path

  • trajectory_paths (List[str]) – list of trajectory file paths

Return type:

str

Returns:

a string representing the learned PDDL model

amlgym.algorithms.PassiveAlgorithmAdapter module

class amlgym.algorithms.PassiveAlgorithmAdapter.PassiveAlgorithmAdapter[source]

Bases: ABC

An abstract class for a passive action model learning algorithm, which defines the abstract interface that must be implemented by every (subclass) algorithm adapter to enable automated evaluation.

abstract learn(domain_path, trajectory_paths)[source]
Learns a PDDL action model from:
  1. a (possibly empty) input model which is required to specify the predicates and operators signature;

  2. a list of trajectory file paths.

Parameters:
  • domain_path (str) – input PDDL domain file path

  • trajectory_paths (List[str]) – list of trajectory file paths

Return type:

str

Returns:

a string representing the learned PDDL model
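
The shape of a concrete adapter can be sketched as follows. To keep the example runnable without amlgym installed, it uses PassiveAdapterSketch, a local stand-in that only mirrors the learn signature documented above; both that class and EchoLearner are hypothetical. A real adapter would subclass PassiveAlgorithmAdapter instead.

```python
from abc import ABC, abstractmethod
from typing import List


class PassiveAdapterSketch(ABC):
    """Local stand-in mirroring the documented learn() signature."""

    @abstractmethod
    def learn(self, domain_path: str, trajectory_paths: List[str]) -> str:
        ...


class EchoLearner(PassiveAdapterSketch):
    """Hypothetical adapter that returns the input domain unchanged."""

    def learn(self, domain_path: str, trajectory_paths: List[str]) -> str:
        # A real adapter would refine the model from the traces here;
        # this sketch ignores trajectory_paths and echoes the input model.
        with open(domain_path, encoding="utf-8") as f:
            return f.read()
```

Because learn is declared abstract, instantiating the base class directly raises a TypeError, and only subclasses that implement learn can be created.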

amlgym.algorithms.ROSAME module

class amlgym.algorithms.ROSAME.ROSAME(**kwargs)[source]

Bases: PassiveAlgorithmAdapter

Adapter class for running an unofficial implementation of the ROSAME algorithm: “Neuro-Symbolic Learning of Lifted Action Models from Visual Traces”, Kai Xi, Stephen Gould, Sylvie Thiebaux, Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling, 2024. https://ojs.aaai.org/index.php/ICAPS/article/download/31528/33688

Example

from amlgym.algorithms import get_algorithm
rosame = get_algorithm('ROSAME')
model = rosame.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)

learn(domain_path, trajectory_paths, use_problems=True)[source]
Learns a PDDL action model from:
  1. a (possibly empty) input model which is required to specify the predicates and operators signature;

  2. a list of trajectory file paths.

Parameters:
  • domain_path (str) – input PDDL domain file path

  • trajectory_paths (List[str]) – list of trajectory file paths

  • use_problems (bool) – boolean flag indicating whether to provide the set of objects specified in the problem from which the trajectories have been generated

Return type:

str

Returns:

a string representing the learned PDDL model

amlgym.algorithms.RandomAgent module

class amlgym.algorithms.RandomAgent.RandomAgent(input_domain_path)[source]

Bases: ActiveAlgorithmAdapter

A simple baseline for online learning in a fully observable and deterministic environment by randomly executing actions. The baseline first generates a trajectory and then applies the SAM algorithm to learn a model offline from the generated trace.

Example

from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)

env = SequentialSimulator(problem=problem)
baseline = get_algorithm('RandomAgent', input_domain_path=input_domain_path)
model, trajectory = baseline.learn(env, max_steps=100)

print("##################### Learned model #####################")
print(model)

print("################# Generated trajectory ##################")
print(trajectory)

input_domain_path: str
learn(simulator, max_steps=100, seed=123)[source]
Learns a PDDL action model from:
  1. a simulator of the environment to learn from;

  2. a (possibly empty) input model, which is required to specify the predicates and operators signature (set via the input_domain_path attribute at instantiation time).

Parameters:
  • simulator (SequentialSimulator) – environment simulator

  • max_steps (int) – maximum number of interaction steps with the simulator

  • seed (int) – random seed for reproducibility

Return type:

Tuple[str, Trajectory]

Returns:

a string representing the learned PDDL model, and a JSON specification of the trajectory
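
The trajectory-generation phase of such a baseline, and the role of max_steps and seed, can be sketched on a toy environment. Everything below is hypothetical: the counter environment, the applicable function, and random_walk stand in for the real simulator and agent, and the subsequent SAM learning step is not shown.

```python
import random
from typing import Callable, Dict, List, Tuple

# Toy deterministic environment (hypothetical): a counter that can go up or down.
State = int
ACTIONS: Dict[str, Callable[[State], State]] = {
    "inc": lambda s: s + 1,
    "dec": lambda s: s - 1,
}


def applicable(state: State) -> List[str]:
    """'dec' is not applicable at zero, mimicking an unmet precondition."""
    return ["inc"] if state == 0 else ["inc", "dec"]


def random_walk(max_steps: int = 100, seed: int = 123) -> List[Tuple[State, str, State]]:
    """Collect (state, action, next_state) triples by acting at random."""
    rng = random.Random(seed)  # local generator, so the seed fully determines the run
    state, trace = 0, []
    for _ in range(max_steps):
        action = rng.choice(applicable(state))
        next_state = ACTIONS[action](state)
        trace.append((state, action, next_state))
        state = next_state
    return trace


trace = random_walk(max_steps=5, seed=123)
print(len(trace))  # prints: 5
```

Calling random_walk twice with the same seed yields the same trace, which is the property the seed parameter is meant to provide.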

amlgym.algorithms.SAM module

class amlgym.algorithms.SAM.SAM[source]

Bases: PassiveAlgorithmAdapter

Adapter class for running the SAM algorithm: “Safe Learning of Lifted Action Models”, B. Juba, H. S. Le, and R. Stern, Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning, 2021. https://proceedings.kr.org/2021/36/

Example

from amlgym.algorithms import get_algorithm
sam = get_algorithm('SAM')
model = sam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)

learn(domain_path, trajectory_paths)[source]
Learns a PDDL action model from:
  1. a (possibly empty) input model which is required to specify the predicates and operators signature;

  2. a list of trajectory file paths.

Parameters:
  • domain_path (str) – input PDDL domain file path

  • trajectory_paths (List[str]) – list of trajectory file paths

Return type:

str

Returns:

a string representing the learned PDDL model

Module contents

amlgym.algorithms.get_algorithm(name, **kwargs)[source]

Retrieve an algorithm by name from the registry.

If the name is not found, raises a ValueError with suggestions for close matches.
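
A name-based registry with close-match suggestions can be sketched with the standard library's difflib. This is an assumption about how such a lookup might work, not amlgym's actual code: the _REGISTRY dictionary, the lookup function, and the error message wording are all hypothetical, and dict is used as a placeholder for the adapter classes.

```python
import difflib
from typing import Callable, Dict

# Hypothetical registry; a real one would map names to adapter classes.
_REGISTRY: Dict[str, Callable[..., object]] = {
    "SAM": dict,
    "OLAM": dict,
    "NOLAM": dict,
}


def lookup(name: str, **kwargs) -> object:
    """Instantiate a registered entry, or raise ValueError with suggestions."""
    if name not in _REGISTRY:
        close = difflib.get_close_matches(name, list(_REGISTRY), n=3, cutoff=0.6)
        hint = f" Did you mean: {', '.join(close)}?" if close else ""
        raise ValueError(f"Unknown algorithm {name!r}.{hint}")
    # Keyword arguments are forwarded to the registered constructor.
    return _REGISTRY[name](**kwargs)


try:
    lookup("OLAN")
except ValueError as err:
    print(err)
```

With the real package, print_algorithms() lists the registered names, which avoids guessing in the first place.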

amlgym.algorithms.print_algorithms()[source]

Print available algorithms and their constructor parameters.

Return type:

None