amlgym.algorithms package
Submodules
amlgym.algorithms.ActiveAlgorithmAdapter module
- class amlgym.algorithms.ActiveAlgorithmAdapter.ActiveAlgorithmAdapter(input_domain_path)[source]
Bases:
ABC
An abstract class for an active action model learning algorithm, which defines the abstract interface that must be implemented by every (subclass) algorithm adapter.
-
input_domain_path:
str
- abstract learn(simulator, max_steps=100, seed=123)[source]
Learns a PDDL action model by acting within a simulated environment.
- Parameters:
simulator (SequentialSimulator) – environment simulator
max_steps (int) – maximum number of interaction steps with the simulator
seed (int) – random seed for reproducibility
- Return type:
Tuple[str, Trajectory]
- Returns:
a string representing the learned PDDL model, and a JSON specification of the trajectory
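The abstract interface above can be illustrated with a minimal stand-in that does not import amlgym. `ActiveAdapterSketch` and `EchoAdapter` are hypothetical names, the simulator is left as an arbitrary object, and the trajectory is modelled as a plain list rather than the library's Trajectory type. The sketch only shows how an abstract `learn` forces every subclass to provide its own implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class ActiveAdapterSketch(ABC):
    """Hypothetical stand-in mirroring the documented interface (not amlgym code)."""

    def __init__(self, input_domain_path: str) -> None:
        self.input_domain_path = input_domain_path

    @abstractmethod
    def learn(self, simulator: Any, max_steps: int = 100,
              seed: int = 123) -> Tuple[str, List[Any]]:
        """Return a (PDDL model string, trajectory) pair."""


class EchoAdapter(ActiveAdapterSketch):
    """Toy subclass: performs no learning and returns an empty domain."""

    def learn(self, simulator: Any, max_steps: int = 100,
              seed: int = 123) -> Tuple[str, List[Any]]:
        model = "(define (domain empty))"
        trajectory: List[Any] = []  # a real adapter would record states and actions
        return model, trajectory


adapter = EchoAdapter("path/to/domain.pddl")
model, trajectory = adapter.learn(simulator=None, max_steps=10)
```

Instantiating `ActiveAdapterSketch` directly raises `TypeError`, which is the standard behaviour of Python abstract base classes.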
amlgym.algorithms.InformationGainAgent module
- class amlgym.algorithms.InformationGainAgent.InformationGainAgent(input_domain_path, use_object_subset=True, spare_objects_per_type=2, model_mode='safe', learn_negative_preconditions=True, selection_strategy='greedy', epsilon=0.1, temperature=1.0, lookahead_depth=2, lookahead_top_k=5, lookahead_discount=0.9, mcts_iterations=50, mcts_rollout_depth=5)[source]
Bases:
ActiveAlgorithmAdapter
Online action model learning via information gain.
Uses a CNF/SAT-based information-theoretic approach to select actions that maximize the expected information gain about the action model.
- Parameters:
use_object_subset (bool) – Enable object subset selection for reduced grounding
spare_objects_per_type (int) – Extra objects per type beyond minimum requirement (for subset selection)
model_mode (str) – “safe” (all possible preconditions, confirmed effects only) or “complete” (certain preconditions only, all possible effects)
learn_negative_preconditions (bool) – Whether to learn negative preconditions
selection_strategy (str) – Action selection strategy. One of:
- “greedy”: always select the highest information gain (default)
- “epsilon_greedy”: explore with probability epsilon
- “boltzmann”: softmax probabilistic selection
- “lookahead”: depth-limited lookahead with discounted future gain
- “mcts”: full UCT-based Monte Carlo Tree Search
lookahead_depth (int) – Lookahead depth for ‘lookahead’ strategy (default: 2)
lookahead_top_k (int) – Number of top actions to evaluate in lookahead (default: 5)
lookahead_discount (float) – Discount factor for future gain in lookahead (default: 0.9)
epsilon (float) – Exploration probability for ‘epsilon_greedy’ strategy (default: 0.1)
temperature (float) – Temperature for ‘boltzmann’ softmax selection (default: 1.0)
mcts_iterations (int) – Number of MCTS iterations per action selection (default: 50)
mcts_rollout_depth (int) – Simulation depth during MCTS rollout phase (default: 5)
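To make the first three selection strategies concrete, here is a minimal sketch under stated assumptions: `select_action` is a hypothetical helper, the information gains are given as a plain dictionary, and the code is not taken from the InformationGainAgent implementation. It shows the textbook form of greedy, epsilon-greedy, and Boltzmann (softmax) selection using the `epsilon` and `temperature` parameters named above.

```python
import math
import random
from typing import Dict, Optional


def select_action(gains: Dict[str, float], strategy: str = "greedy",
                  epsilon: float = 0.1, temperature: float = 1.0,
                  rng: Optional[random.Random] = None) -> str:
    """Pick an action name from a mapping action -> estimated information gain."""
    rng = rng or random.Random(123)
    actions = sorted(gains)  # fixed order so results are reproducible
    best = max(actions, key=lambda a: gains[a])
    if strategy == "greedy":
        return best
    if strategy == "epsilon_greedy":
        # With probability epsilon explore uniformly, otherwise exploit.
        return rng.choice(actions) if rng.random() < epsilon else best
    if strategy == "boltzmann":
        # Softmax over gains; subtracting the max keeps exp() numerically stable.
        top = gains[best]
        weights = [math.exp((gains[a] - top) / temperature) for a in actions]
        return rng.choices(actions, weights=weights, k=1)[0]
    raise ValueError(f"unknown strategy: {strategy}")


gains = {"pick-up": 0.2, "stack": 1.5, "unstack": 0.7}
greedy_choice = select_action(gains, "greedy")
```

A low temperature makes the Boltzmann choice approach the greedy one, while a high temperature approaches uniform random selection.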
Example
from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)
env = SequentialSimulator(problem=problem)

info_gain = get_algorithm('InformationGainAgent', input_domain_path=input_domain_path)
model, trajectory = info_gain.learn(env, max_steps=100)

# With lookahead strategy
info_gain = get_algorithm(
    'InformationGainAgent',
    input_domain_path=input_domain_path,
    selection_strategy='lookahead',
    lookahead_depth=3,
)
model, trajectory = info_gain.learn(env, max_steps=100)

print("##################### Learned model #####################")
print(model)
print("################# Generated trajectory ##################")
print(trajectory)
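The idea behind a depth-limited lookahead with discounted future gain can be sketched on a toy problem. Everything below is an assumption made for illustration: `lookahead_value`, the `GAIN` and `NEXT` tables, and the integer states are hypothetical and are not the agent's actual code. The sketch scores an action as its immediate gain plus the discounted best score among the top-k actions in the successor state.

```python
from typing import Callable, Hashable, List


def lookahead_value(state: Hashable, action: str, depth: int,
                    actions: Callable[[Hashable], List[str]],
                    gain: Callable[[Hashable, str], float],
                    step: Callable[[Hashable, str], Hashable],
                    top_k: int = 5, discount: float = 0.9) -> float:
    """Immediate gain plus the discounted best value reachable within `depth` steps."""
    value = gain(state, action)
    if depth <= 1:
        return value
    nxt = step(state, action)
    # Expand only the top_k successors by immediate gain to bound the branching.
    candidates = sorted(actions(nxt), key=lambda a: gain(nxt, a), reverse=True)[:top_k]
    if not candidates:
        return value
    future = max(
        lookahead_value(nxt, a, depth - 1, actions, gain, step, top_k, discount)
        for a in candidates
    )
    return value + discount * future


# Toy problem: action "a" looks best now, but "b" leads to a more informative state.
GAIN = {(0, "a"): 1.0, (0, "b"): 0.8, (2, "a"): 1.0}
NEXT = {(0, "a"): 1, (0, "b"): 2}


def toy_actions(state):
    return ["a", "b"]


def toy_gain(state, action):
    return GAIN.get((state, action), 0.0)


def toy_step(state, action):
    return NEXT.get((state, action), state)


def best(depth):
    return max(toy_actions(0),
               key=lambda a: lookahead_value(0, a, depth, toy_actions, toy_gain, toy_step))


greedy_pick = best(depth=1)
lookahead_pick = best(depth=2)
```

With depth 1 the score reduces to the immediate gain, so the choice coincides with greedy selection; with depth 2 the toy problem favours the action whose successor state is more informative.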
-
epsilon:
float= 0.1
-
input_domain_path:
str
- learn(simulator, max_steps=500, seed=123)[source]
Learn a PDDL action model by interacting with the environment.
- Parameters:
simulator (SequentialSimulator) – environment simulator
max_steps (int) – maximum number of interaction steps with the simulator
seed (int) – random seed for reproducibility
- Return type:
Tuple[str, Trajectory]
- Returns:
(learned PDDL model string, trajectory)
-
learn_negative_preconditions:
bool= True
-
lookahead_depth:
int= 2
-
lookahead_discount:
float= 0.9
-
lookahead_top_k:
int= 5
-
mcts_iterations:
int= 50
-
mcts_rollout_depth:
int= 5
-
model_mode:
str= 'safe'
-
selection_strategy:
str= 'greedy'
-
spare_objects_per_type:
int= 2
-
temperature:
float= 1.0
-
use_object_subset:
bool= True
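The difference between the two `model_mode` values described in the parameter list can be sketched as follows. This is a hedged illustration only: `extract_model` and the four literal sets are hypothetical, and the real agent derives its model from its own internal representation, which is not shown in this reference.

```python
from typing import Dict, FrozenSet, Set


def extract_model(possible_pre: Set[str], certain_pre: Set[str],
                  confirmed_eff: Set[str], possible_eff: Set[str],
                  model_mode: str = "safe") -> Dict[str, FrozenSet[str]]:
    """Build one operator's preconditions and effects for the given mode."""
    if model_mode == "safe":
        # All possible preconditions, confirmed effects only.
        return {"preconditions": frozenset(possible_pre),
                "effects": frozenset(confirmed_eff)}
    if model_mode == "complete":
        # Certain preconditions only, all possible effects.
        return {"preconditions": frozenset(certain_pre),
                "effects": frozenset(possible_eff)}
    raise ValueError(f"unknown model_mode: {model_mode}")


# Hypothetical knowledge about a pick-up operator after a few observations.
certain_pre = {"(clear ?x)", "(handempty)"}
possible_pre = certain_pre | {"(ontable ?x)"}
confirmed_eff = {"(holding ?x)"}
possible_eff = confirmed_eff | {"(not (clear ?x))"}

safe = extract_model(possible_pre, certain_pre, confirmed_eff, possible_eff, "safe")
complete = extract_model(possible_pre, certain_pre, confirmed_eff, possible_eff, "complete")
```

In this sketch the safe model is the more restrictive one: it has at least as many preconditions and at most as many effects as the complete model.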
amlgym.algorithms.NOLAM module
- class amlgym.algorithms.NOLAM.NOLAM(noise=0.0)[source]
Bases:
PassiveAlgorithmAdapter
Adapter class for running the NOLAM algorithm: “Action Model Learning from Noisy Traces: a Probabilistic Approach”, L. Lamanna and L. Serafini, Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling, 2024. https://ojs.aaai.org/index.php/ICAPS/article/view/31493
- Parameters:
noise (float) – The observation noise.
Example
from amlgym.algorithms import get_algorithm

nolam = get_algorithm('NOLAM')
model = nolam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)
- learn(domain_path, trajectory_paths)[source]
- Learns a PDDL action model from:
a (possibly empty) input model which is required to specify the predicates and operators signature;
a list of trajectory file paths.
- Parameters:
domain_path (str) – input PDDL domain file path
trajectory_paths (List[str]) – list of trajectory file paths
- Return type:
str
- Returns:
a string representing the learned PDDL model
-
noise:
float= 0.0
amlgym.algorithms.OLAM module
- class amlgym.algorithms.OLAM.OLAM(input_domain_path, planning_timeout=30, max_length=8, max_subproblems=5, max_goals=10000)[source]
Bases:
ActiveAlgorithmAdapter
Adapter class for running the OLAM algorithm: “Online Learning of Action Models for PDDL Planning”, L. Lamanna, A. Saetti, L. Serafini, A. Gerevini, and P. Traverso, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021. https://doi.org/10.24963/ijcai.2021/566
Example
from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)
env = SequentialSimulator(problem=problem)

olam = get_algorithm('OLAM', input_domain_path=input_domain_path)
model, trajectory = olam.learn(env)

print("##################### Learned model #####################")
print(model)
print("################# Generated trajectory ##################")
print(trajectory)
- Parameters:
planning_timeout (int) – Time limit in seconds for each planning call (default: 30)
max_length (int) – Maximum number of uncertain preconditions/effects considered in goal conjunctions (default: 8)
max_subproblems (int) – Maximum number of subproblems when handling object type ambiguity (default: 5)
max_goals (int) – Maximum number of disjunctions in a goal formula used during planning for learning preconditions and effects. When the number of generated goals exceeds this limit, some goals are discarded. (default: 10000)
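The effect of `max_goals` can be pictured with a small sketch. It is written under assumptions: `build_goal_formula` is a hypothetical helper, and keeping the first `max_goals` entries is only one possible discarding policy, since this reference does not say which goals OLAM drops. The sketch caps a list of goal expressions and joins the survivors into a PDDL disjunction.

```python
from typing import List


def build_goal_formula(goals: List[str], max_goals: int = 10000) -> str:
    """Join at most `max_goals` goal expressions into one PDDL disjunction."""
    if max_goals < 1:
        raise ValueError("max_goals must be at least 1")
    if not goals:
        raise ValueError("at least one goal is required")
    # Assumption: keep the first max_goals goals and discard the rest.
    kept = goals[:max_goals]
    if len(kept) == 1:
        return kept[0]
    return "(or " + " ".join(kept) + ")"


goals = ["(clear a)", "(ontable b)", "(holding c)"]
capped = build_goal_formula(goals, max_goals=2)
```

A smaller limit yields shorter goal formulas at the price of ignoring some candidate goals.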
-
input_domain_path:
str
- learn(simulator, max_steps=10000, seed=123)[source]
- Learns a PDDL action model from:
a simulator of the environment to learn from;
a (possibly empty) input model which is required to specify the predicates and operators signature (set via the input_domain_path attribute at instantiation time).
- Parameters:
simulator (SequentialSimulator) – environment simulator
max_steps (int) – maximum number of interaction steps with the simulator
seed (int) – random seed for reproducibility
- Return type:
Tuple[str, Trajectory]
- Returns:
a string representing the learned PDDL model, and a JSON specification of the trajectory
-
max_goals:
int= 10000
-
max_length:
int= 8
-
max_subproblems:
int= 5
-
planning_timeout:
int= 30
amlgym.algorithms.OffLAM module
- class amlgym.algorithms.OffLAM.OffLAM(**kwargs)[source]
Bases:
PassiveAlgorithmAdapter
Adapter class for running the OffLAM algorithm: “Lifted Action Models Learning from Partial Traces”, L. Lamanna, L. Serafini, A. Saetti, A. Gerevini, and P. Traverso, Artificial Intelligence Journal, 2025. https://www.sciencedirect.com/science/article/abs/pii/S0004370224001929
Example
from amlgym.algorithms import get_algorithm

offlam = get_algorithm('OffLAM')
model = offlam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)
- learn(domain_path, trajectory_paths)[source]
- Learns a PDDL action model from:
a (possibly empty) input model which is required to specify the predicates and operators signature;
a list of trajectory file paths.
- Parameters:
domain_path (str) – input PDDL domain file path
trajectory_paths (List[str]) – list of trajectory file paths
- Return type:
str
- Returns:
a string representing the learned PDDL model
amlgym.algorithms.PassiveAlgorithmAdapter module
- class amlgym.algorithms.PassiveAlgorithmAdapter.PassiveAlgorithmAdapter[source]
Bases:
ABC
An abstract class for an action model learning algorithm, which defines the abstract interface that must be implemented by every (subclass) algorithm adapter to enable automated evaluation.
- abstract learn(domain_path, trajectory_paths)[source]
- Learns a PDDL action model from:
a (possibly empty) input model which is required to specify the predicates and operators signature;
a list of trajectory file paths.
- Parameters:
domain_path (str) – input PDDL domain file path
trajectory_paths (List[str]) – list of trajectory file paths
- Return type:
str
- Returns:
a string representing the learned PDDL model
amlgym.algorithms.ROSAME module
- class amlgym.algorithms.ROSAME.ROSAME(**kwargs)[source]
Bases:
PassiveAlgorithmAdapter
Adapter class for running an unofficial implementation of the ROSAME algorithm: “Neuro-Symbolic Learning of Lifted Action Models from Visual Traces”, Kai Xi, Stephen Gould, Sylvie Thiebaux, Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling, 2024. https://ojs.aaai.org/index.php/ICAPS/article/download/31528/33688
Example
from amlgym.algorithms import get_algorithm

rosame = get_algorithm('ROSAME')
model = rosame.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)
- learn(domain_path, trajectory_paths, use_problems=True)[source]
- Learns a PDDL action model from:
a (possibly empty) input model which is required to specify the predicates and operators signature;
a list of trajectory file paths.
- Parameters:
domain_path (str) – input PDDL domain file path
trajectory_paths (List[str]) – list of trajectory file paths
use_problems (bool) – boolean flag indicating whether to provide the set of objects specified in the problem from which the trajectories have been generated
- Return type:
str
- Returns:
a string representing the learned PDDL model
amlgym.algorithms.RandomAgent module
- class amlgym.algorithms.RandomAgent.RandomAgent(input_domain_path)[source]
Bases:
ActiveAlgorithmAdapter
A simple baseline for online learning in a fully observable and deterministic environment by randomly executing actions. The baseline first generates a trajectory and then applies the SAM algorithm to learn a model offline from the generated trace.
Example
from unified_planning.io import PDDLReader
from unified_planning.shortcuts import SequentialSimulator
from amlgym.algorithms import get_algorithm
from amlgym.benchmarks import get_domain_path, get_problems_path
from amlgym.util.util import empty_domain

domain = 'blocksworld'
domain_ref_path = get_domain_path(domain)
input_domain_path = empty_domain(domain_ref_path)
problem_path = get_problems_path(domain, kind='learning')[0]
problem = PDDLReader().parse_problem(domain_ref_path, problem_path)
env = SequentialSimulator(problem=problem)

baseline = get_algorithm('RandomAgent', input_domain_path=input_domain_path)
model, trajectory = baseline.learn(env, max_steps=100)

print("##################### Learned model #####################")
print(model)
print("################# Generated trajectory ##################")
print(trajectory)
-
input_domain_path:
str
- learn(simulator, max_steps=100, seed=123)[source]
- Learns a PDDL action model from:
a simulator of the environment to learn from;
a (possibly empty) input model which is required to specify the predicates and operators signature (set via the input_domain_path attribute at instantiation time).
- Parameters:
simulator (SequentialSimulator) – environment simulator
max_steps (int) – maximum number of interaction steps with the simulator
seed (int) – random seed for reproducibility
- Return type:
Tuple[str, Trajectory]
- Returns:
a string representing the learned PDDL model, and a JSON specification of the trajectory
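The trajectory-generation phase of such a random baseline can be sketched on a toy environment. The `ACTIONS` table, the set-based states, and `random_walk` are hypothetical and do not use the unified_planning simulator; they only illustrate how `max_steps` bounds the walk and how `seed` makes it reproducible.

```python
import random
from typing import Dict, FrozenSet, List, Tuple

State = FrozenSet[str]
Transition = Tuple[State, str, State]

# Toy ground actions: name -> (preconditions, add effects, delete effects).
ACTIONS: Dict[str, Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]] = {
    "switch-on": (frozenset({"off"}), frozenset({"on"}), frozenset({"off"})),
    "switch-off": (frozenset({"on"}), frozenset({"off"}), frozenset({"on"})),
    "dim": (frozenset({"on"}), frozenset({"dim"}), frozenset()),
}


def random_walk(initial: State, max_steps: int = 100, seed: int = 123) -> List[Transition]:
    """Execute up to max_steps randomly chosen applicable actions."""
    rng = random.Random(seed)  # a local generator keeps runs reproducible
    state = initial
    trajectory: List[Transition] = []
    for _ in range(max_steps):
        applicable = sorted(name for name, (pre, _, _) in ACTIONS.items() if pre <= state)
        if not applicable:
            break  # dead end: no action can be executed
        action = rng.choice(applicable)
        _, add, delete = ACTIONS[action]
        next_state = (state - delete) | add
        trajectory.append((state, action, next_state))
        state = next_state
    return trajectory


walk = random_walk(frozenset({"off"}), max_steps=5, seed=123)
```

The resulting list of (state, action, next state) triples is the kind of trace an offline learner could then consume.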
amlgym.algorithms.SAM module
- class amlgym.algorithms.SAM.SAM[source]
Bases:
PassiveAlgorithmAdapter
Adapter class for running the SAM algorithm: “Safe Learning of Lifted Action Models”, B. Juba, H. S. Le, and R. Stern, Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning, 2021. https://proceedings.kr.org/2021/36/
Example
from amlgym.algorithms import get_algorithm

sam = get_algorithm('SAM')
model = sam.learn('path/to/domain.pddl', ['path/to/trace0', 'path/to/trace1'])
print(model)
- learn(domain_path, trajectory_paths)[source]
- Learns a PDDL action model from:
a (possibly empty) input model which is required to specify the predicates and operators signature;
a list of trajectory file paths.
- Parameters:
domain_path (str) – input PDDL domain file path
trajectory_paths (List[str]) – list of trajectory file paths
- Return type:
str
- Returns:
a string representing the learned PDDL model