Predefined Tools
The following are predefined tools that can be used directly with agents.
Code Interpreter
code_interpreter(code: str, context: Context) async
Runs the code in a Docker container and returns the output from stdout or stderr.
Uses one Python sandbox per rollout (acquired via Context); the sandbox is released when the rollout ends. If needed, warm the pool at training start with ResourceEngine.start(python_sandbox_spec(), size=32, backend="local").
Parameters:
- code (str) – The code to run.
- context (Context) – Injected rollout context; used to acquire the sandbox resource.
Returns:
- str – The output from stdout or stderr.
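The per-rollout sandbox lifecycle described above can be pictured with a small pool. This is a minimal sketch under stated assumptions, not AgentFly's implementation: `SandboxPool` and `RolloutContext` are hypothetical stand-ins for the resource engine and the injected Context, and a local subprocess replaces the Docker container.

```python
import asyncio
import sys


class SandboxPool:
    """Hypothetical warm pool; each 'sandbox' is just an integer id here."""

    def __init__(self, size):
        self._free = asyncio.Queue()
        for sandbox_id in range(size):
            self._free.put_nowait(sandbox_id)

    async def acquire(self):
        return await self._free.get()

    def release(self, sandbox_id):
        self._free.put_nowait(sandbox_id)

    def available(self):
        return self._free.qsize()


class RolloutContext:
    """Hypothetical stand-in for Context: holds at most one sandbox per rollout."""

    def __init__(self, pool):
        self._pool = pool
        self._sandbox_id = None

    async def sandbox(self):
        # Acquire lazily on first use, then reuse for the rest of the rollout.
        if self._sandbox_id is None:
            self._sandbox_id = await self._pool.acquire()
        return self._sandbox_id

    def close(self):
        # Called when the rollout ends; returns the sandbox to the pool.
        if self._sandbox_id is not None:
            self._pool.release(self._sandbox_id)
            self._sandbox_id = None


async def code_interpreter_sketch(code, context):
    await context.sandbox()
    # Assumption: a local Python subprocess stands in for the container.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", code,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    # Return stdout when there is any, otherwise stderr.
    return (stdout or stderr).decode()


async def rollout_demo():
    pool = SandboxPool(size=2)  # created inside the running event loop
    context = RolloutContext(pool)
    try:
        first = await code_interpreter_sketch("print(1 + 1)", context)
        second = await code_interpreter_sketch(
            "import sys; sys.stderr.write('oops')", context
        )
        free_during_rollout = pool.available()
    finally:
        context.close()
    return first, second, free_during_rollout, pool.available()
```

Both calls in `rollout_demo` share a single sandbox, and the pool regains its full size only after `close()` runs, which mirrors the "released when the rollout ends" behavior.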
Calculator
calculator(expression: str)
Calculate the result of a mathematical expression.
Parameters:
- expression (str) – The mathematical expression to calculate.
Returns:
- str – The result of the expression.
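One way a string-in, string-out calculator like this could evaluate input without calling `eval` is to walk the parsed syntax tree. The sketch below is illustrative only: `calculator_sketch`, its supported operator set, and its `Error: ...` return format are assumptions, and the real tool may rely on a different evaluator.

```python
import ast
import operator

# Assumption: only basic arithmetic is supported in this sketch.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    # Numeric literals (booleans are rejected even though they are ints).
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, and everything else are refused.
    raise ValueError("unsupported expression")


def calculator_sketch(expression: str) -> str:
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"
```

Returning the error as a string rather than raising keeps the result usable as a tool observation for the agent.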
ALFWorld
alfworld_step(action: str, context: Context) async
Takes an action in the ALFWorld environment and returns the observation.
Parameters:
- action (str) – The action to take in the environment.
- context (Context) – Injected rollout context; used to acquire the ALFWorld resource.
Returns:
- dict – A dictionary containing the observation, reward, done, and info.
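A caller typically loops over actions until the returned dictionary reports that the episode is done. The sketch below shows that loop against a fake step function. `fake_alfworld_step` is hypothetical, and the exact key names (`observation`, `reward`, `done`, `info`) are assumed from the return description above rather than confirmed.

```python
import asyncio


async def fake_alfworld_step(action: str, state: dict) -> dict:
    """Hypothetical stand-in that mimics the documented return shape."""
    state["steps"] += 1
    done = action == "take mug 1 from desk 1"
    return {
        "observation": "You pick up the mug 1." if done else "Nothing happens.",
        "reward": 1.0 if done else 0.0,
        "done": done,
        "info": {"steps": state["steps"]},
    }


async def run_episode(actions):
    state = {"steps": 0}
    total_reward = 0.0
    transcript = []
    for action in actions:
        result = await fake_alfworld_step(action, state)
        transcript.append(result["observation"])
        total_reward += result["reward"]
        if result["done"]:
            break  # stop issuing actions once the episode has ended
    return total_reward, transcript
```

Checking `done` after every step avoids sending further actions to an environment whose episode has already finished.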
ScienceWorld
scienceworld_explorer(action: str, context: Context) async
Executes an action in the ScienceWorld environment and returns the resulting observation.
Parameters:
- action (str) – The action to perform in the environment. Valid actions include commands like 'look around', 'inventory', 'open', etc.
- context (Context) – Injected rollout context; used to acquire the ScienceWorld resource.
Returns:
- str – The observation returned by the environment after performing the action, or an error message if the action is invalid or an exception occurs.
SWE (container workspace)
Tools for SWE-bench style rollouts. They are stateful and expect a rollout Context with context.metadata["image_id"] set to the task container image. File tools mount file_manager.py into the container and delegate reads and edits to that helper. run_shell_command runs in the container with working directory /testbed. The first time the shell is acquired to run a command, if context.metadata includes git_commit_hash, the shell tool runs git fetch and git checkout for that commit under /testbed (for example, for swe-smith style tasks).
File workspace
Defined in agentfly.tools.src.file.tools (also re-exported from agentfly.tools).
read_file(path: str, start_line: Optional[int] = None, end_line: Optional[int] = None, context: Context = None) async
Reads a file from the workspace with line numbers.
Parameters:
- path (str) – The relative path to the file (under the container workspace).
- start_line (Optional[int]) – Start line number to read from.
- end_line (Optional[int]) – End line number to read to.
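The line-numbered read can be sketched as a pure function over file text. The source does not specify the numbering details, so `read_with_line_numbers` assumes 1-based line numbers, an inclusive `end_line`, and a `N: text` output format, none of which are confirmed for the real tool.

```python
from typing import Optional


def read_with_line_numbers(
    text: str,
    start_line: Optional[int] = None,
    end_line: Optional[int] = None,
) -> str:
    lines = text.splitlines()
    # Assumption: lines are numbered from 1 and end_line is inclusive.
    first = 1 if start_line is None else max(start_line, 1)
    last = len(lines) if end_line is None else min(end_line, len(lines))
    return "\n".join(
        f"{number}: {lines[number - 1]}" for number in range(first, last + 1)
    )
```

Keeping the original line numbers in a partial read lets the agent refer to the same positions in later edits.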
grep_search(pattern: str, path: str = '.', include: str = '', context: Context = None) async
Searches for a regex pattern across all files in a directory.
Parameters:
- pattern (str) – The regex/string to search for.
- path (str) – Directory to search in.
- include (str) – Glob filter, e.g. "*.py" for Python files only.
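To make the interaction between `pattern`, `path`, and `include` concrete, here is a local sketch built on `os.walk`, `fnmatch`, and `re`. `grep_sketch` and its `file:line:text` result format are assumptions for illustration. The real tool delegates to a helper inside the container and may format results differently.

```python
import fnmatch
import os
import re
import tempfile


def grep_sketch(pattern: str, path: str = ".", include: str = ""):
    regex = re.compile(pattern)
    matches = []
    for root, _dirs, files in os.walk(path):
        for name in sorted(files):
            # An empty include means "no filter"; otherwise match the file name.
            if include and not fnmatch.fnmatch(name, include):
                continue
            full_path = os.path.join(root, name)
            try:
                with open(full_path, encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            rel = os.path.relpath(full_path, path)
                            matches.append(f"{rel}:{number}:{line.rstrip()}")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files


    return matches


def demo():
    with tempfile.TemporaryDirectory() as workspace:
        with open(os.path.join(workspace, "app.py"), "w", encoding="utf-8") as f:
            f.write("import os\n\ndef main():\n    pass\n")
        with open(os.path.join(workspace, "notes.txt"), "w", encoding="utf-8") as f:
            f.write("def is also mentioned here\n")
        # notes.txt contains a match too, but the include glob filters it out.
        return grep_sketch(r"^def \w+", workspace, include="*.py")
```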
create_file(path: str, content: str = '', context: Context = None) async
Creates a new file under the workspace. Fails if the path already exists.
Parameters:
- path (str) – Relative path for the new file.
- content (str) – Initial file body (default empty).
edit_file(path: str, search_block: str, replace_block: str, context: Context) async
Surgically replaces a block of text in a file. Only the first occurrence of search_block is replaced.
Parameters:
- path (str) – Path to the file.
- search_block (str) – Exact text to find.
- replace_block (str) – Text to insert.
undo_edit(path: str, context: Context) async
Reverts the last modification made to a specific file.
Parameters:
- path (str) – The path of the file to revert.
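The first-occurrence replace of edit_file and the revert of undo_edit fit together as a snapshot stack per file. The following in-memory model is a sketch only: `EditableWorkspace`, its return strings, and the choice to keep a full history (rather than just one snapshot) are assumptions, since the real tools delegate to file_manager.py inside the container.

```python
class EditableWorkspace:
    """Hypothetical in-memory model of first-occurrence replace plus undo."""

    def __init__(self):
        self.files = {}      # path -> current content
        self._history = {}   # path -> list of earlier contents

    def edit_file(self, path, search_block, replace_block):
        content = self.files[path]
        if search_block not in content:
            return f"Error: search_block not found in {path}"
        # Snapshot before changing anything so the edit can be reverted.
        self._history.setdefault(path, []).append(content)
        # count=1 limits the replacement to the first occurrence.
        self.files[path] = content.replace(search_block, replace_block, 1)
        return "ok"

    def undo_edit(self, path):
        snapshots = self._history.get(path)
        if not snapshots:
            return f"Error: nothing to undo for {path}"
        self.files[path] = snapshots.pop()
        return "ok"
```

Because only the first match is replaced, a `search_block` that appears more than once should include enough surrounding context to pin down the intended location.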
run_python(path: str, timeout: int = 60, context: Context = None) async
Runs Python only: executes a script file under the workspace (python3 path; the working directory is the workspace root).
Parameters:
- path (str) – Relative path to a .py file inside the workspace.
- timeout (int) – Maximum number of seconds for the subprocess (default 60).
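The combination of a workspace-root working directory and a subprocess timeout can be sketched locally with `subprocess.run`. `run_python_sketch`, its explicit `workspace` argument, and the timeout message are assumptions for illustration. `sys.executable` stands in for `python3`, and the real tool runs inside the task container.

```python
import os
import subprocess
import sys
import tempfile


def run_python_sketch(path: str, workspace: str, timeout: int = 60) -> str:
    try:
        completed = subprocess.run(
            [sys.executable, path],
            cwd=workspace,          # relative path resolves against the workspace root
            capture_output=True,
            text=True,
            timeout=timeout,        # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return f"Error: timed out after {timeout} seconds"
    return completed.stdout + completed.stderr


def demo():
    with tempfile.TemporaryDirectory() as workspace:
        with open(os.path.join(workspace, "hello.py"), "w", encoding="utf-8") as f:
            f.write("print('hello')\n")
        with open(os.path.join(workspace, "slow.py"), "w", encoding="utf-8") as f:
            f.write("import time\ntime.sleep(30)\n")
        return (
            run_python_sketch("hello.py", workspace),
            run_python_sketch("slow.py", workspace, timeout=1),
        )
```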
Shell
run_shell_command(cmd: str, context: Context) async
Runs a shell command in the container workspace.
Commands are checked by command_filter.CommandFilter before execution; blocked commands return the filter reason (e.g. Blocked: ...) without running.
Parameters:
- cmd (str) – The shell command to run (e.g. "ls -la", "cat file.txt", "pwd"). For multi-line python -c code, use bash ANSI-C quoting so newlines work: python3 -c $'line1\nline2\nline3' (each \n becomes a real newline inside the container).
- context (Context) – Injected rollout context; used to acquire the container resource.
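When the multi-line Python code is generated programmatically, the ANSI-C quoted form of cmd can be built with a small helper. This is a sketch: `ansi_c_quote` and `python_c_command` are hypothetical names and not part of AgentFly. They only escape backslashes, single quotes, newlines, and tabs, which bash decodes inside `$'...'`.

```python
def ansi_c_quote(text: str) -> str:
    # Escape backslashes first so the escapes added afterwards are not doubled.
    escaped = (
        text.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )
    return "$'" + escaped + "'"


def python_c_command(code: str) -> str:
    # Produces a single-line command string suitable for passing as cmd.
    return "python3 -c " + ansi_c_quote(code)
```

For example, `python_c_command("x = 1\nprint(x)")` yields the single-line string `python3 -c $'x = 1\nprint(x)'`, where `\n` is a literal backslash followed by `n` that bash turns back into a newline.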