multigrid.envs.blockedunlockpickup module#

class multigrid.envs.blockedunlockpickup.BlockedUnlockPickupEnv#

Bases: RoomGrid

https://i.imgur.com/uSFi059.gif

Description#

The objective is to pick up a box which is placed in another room, behind a locked door. The door is also blocked by a ball which must be moved before the door can be unlocked. Hence, agents must learn to move the ball, pick up the key, open the door and pick up the object in the other room.

The standard setting is cooperative: all agents receive the reward when the task is completed (see joint_reward).

Mission Space#

“pick up the {color} box”

{color} is the color of the box and can be any Color.

Observation Space#

The multi-agent observation space is a Dict mapping from agent index to corresponding agent observation space.

Each agent observation is a dictionary with the following entries:

  • image : ndarray[int] of shape (view_size, view_size, WorldObj.dim)

    Encoding of the agent’s partially observable view of the environment, where the object at each grid cell is encoded as a vector: (Type, Color, State)

  • direction : int

    Agent’s direction (0: right, 1: down, 2: left, 3: up)

  • mission : Mission

    Task string corresponding to the current environment configuration
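The layout described above can be sketched in plain Python, without depending on the library itself. This is only an illustration under stated assumptions: nested lists stand in for the ndarray, a plain string stands in for the Mission object, the cell dimension is taken to be 3 because each cell is described as (Type, Color, State), and the zero values are placeholders rather than real encodings. The helper names are hypothetical.

```python
# Direction codes as listed in the observation description.
DIRECTION_NAMES = ("right", "down", "left", "up")


def describe_direction(direction: int) -> str:
    """Return the name for a direction code (0: right, 1: down, 2: left, 3: up)."""
    if direction not in range(4):
        raise ValueError(f"direction must be 0..3, got {direction}")
    return DIRECTION_NAMES[direction]


def make_placeholder_observation(view_size: int, cell_dim: int = 3) -> dict:
    """Build one agent observation with the documented keys.

    Assumptions: nested lists replace the ndarray, a str replaces Mission,
    and every cell is filled with zeros as a placeholder encoding.
    """
    image = [[[0] * cell_dim for _ in range(view_size)] for _ in range(view_size)]
    return {"image": image, "direction": 0, "mission": "pick up the blue box"}


# The multi-agent observation maps agent index -> that agent's observation.
observations = {agent: make_placeholder_observation(view_size=3) for agent in range(2)}

for agent, obs in observations.items():
    rows = len(obs["image"])
    cols = len(obs["image"][0])
    depth = len(obs["image"][0][0])
    print(agent, (rows, cols, depth), describe_direction(obs["direction"]), obs["mission"])
```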

Action Space#

The multi-agent action space is a Dict mapping from agent index to corresponding agent action space.

Agent actions are discrete integer values, given by:

Num   Name      Action
0     left      Turn left
1     right     Turn right
2     forward   Move forward
3     pickup    Pick up an object
4     drop      Drop an object
5     toggle    Toggle / activate an object
6     done      Done completing task
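As a rough illustration of the table above, the action codes can be mirrored in a small enum, and a joint action can be written as a dict from agent index to action. The class name ActionSketch is hypothetical and defined here only for the example; it is not presented as the library's own action type.

```python
from enum import IntEnum


class ActionSketch(IntEnum):
    """Hypothetical stand-in mirroring the documented action numbers."""

    left = 0      # Turn left
    right = 1     # Turn right
    forward = 2   # Move forward
    pickup = 3    # Pick up an object
    drop = 4      # Drop an object
    toggle = 5    # Toggle / activate an object
    done = 6      # Done completing task


# A joint action maps each agent index to that agent's action.
joint_action = {0: ActionSketch.forward, 1: ActionSketch.pickup}

# The same joint action expressed as the raw discrete integers.
encoded = {agent: int(action) for agent, action in joint_action.items()}
print(encoded)
```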

Rewards#

A reward of 1 - 0.9 * (step_count / max_steps) is given for success, and 0 for failure.
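The reward formula can be worked through with a short sketch. The function names are hypothetical. The handling of joint_reward=False (only the agent that picked up the box is rewarded) is an assumption inferred from the parameter description, not something this page states explicitly.

```python
def episode_reward(success: bool, step_count: int, max_steps: int) -> float:
    """Return 1 - 0.9 * (step_count / max_steps) on success, 0 on failure."""
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    if not success:
        return 0.0
    return 1.0 - 0.9 * (step_count / max_steps)


def agent_rewards(
    success: bool,
    step_count: int,
    max_steps: int,
    num_agents: int,
    acting_agent: int,
    joint_reward: bool = True,
) -> dict:
    """Distribute the episode reward across agents.

    With joint_reward=True every agent receives the reward.
    Assumption: with joint_reward=False only acting_agent receives it.
    """
    reward = episode_reward(success, step_count, max_steps)
    if joint_reward:
        return {agent: reward for agent in range(num_agents)}
    return {agent: (reward if agent == acting_agent else 0.0) for agent in range(num_agents)}


# Finishing earlier yields a larger reward; the minimum on success is about 0.1.
for steps in (0, 50, 100):
    print(steps, round(episode_reward(True, steps, 100), 3))
```

Under this formula a success on the very last step still earns a small positive reward, so success is always distinguishable from a timeout.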

Termination#

The episode ends if any one of the following conditions is met:

  • Any agent picks up the correct box

  • Timeout (see max_steps)

Registered Configurations#

  • MultiGrid-BlockedUnlockPickup-v0

__init__(room_size: int = 6, max_steps: int | None = None, joint_reward: bool = True, **kwargs)#
Parameters:
room_size : int, default=6

Width and height for each of the two rooms

max_steps : int, optional

Maximum number of steps per episode

joint_reward : bool, default=True

Whether all agents receive the reward when the task is completed

**kwargs

See multigrid.base.MultiGridEnv.__init__

grid: Grid#
agents: list[Agent]#