multigrid.envs.locked_hallway module#

class multigrid.envs.locked_hallway.LockedHallwayEnv

Bases: RoomGrid

https://i.imgur.com/VylPtnn.gif

Description#

This environment consists of a hallway with multiple locked rooms on either side. To unlock each door, agents must first find the corresponding key, which may be in another locked room. Agents are rewarded for each door they unlock.

The default setting is cooperative (joint_reward=True): every agent receives the reward whenever any door is unlocked.

Mission Space#

“unlock all the doors”

Observation Space#

The multi-agent observation space is a Dict mapping from agent index to corresponding agent observation space.

Each agent observation is a dictionary with the following entries:

  • image : ndarray[int] of shape (view_size, view_size, WorldObj.dim)

    Encoding of the agent’s partially observable view of the environment, where each grid cell is encoded as a 3-dimensional tuple: (Type, Color, State)

  • direction : int

    Agent’s direction (0: right, 1: down, 2: left, 3: up)

  • mission : Mission

    Task string corresponding to the current environment configuration
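As a rough illustration of the observation layout described above, the sketch below summarizes one agent's observation dictionary. The helper name summarize_observation and the stand-in observation are hypothetical: the image is built from nested lists of placeholder zeros rather than a real ndarray, and nothing here is taken from the library itself.

```python
# Direction codes as documented above: 0 = right, 1 = down, 2 = left, 3 = up.
DIRECTION_NAMES = {0: "right", 1: "down", 2: "left", 3: "up"}


def summarize_observation(obs):
    """Return a readable summary of a single agent's observation dict.

    Hypothetical helper, not part of multigrid. It only relies on the three
    documented keys: "image", "direction", and "mission".
    """
    image = obs["image"]
    return {
        "view_size": len(image),          # first axis of the image
        "cell_dim": len(image[0][0]),     # length of each cell encoding
        "facing": DIRECTION_NAMES[int(obs["direction"])],
        "mission": str(obs["mission"]),
    }


# Stand-in observation (assumption: placeholder zeros, not real encodings).
view_size = 3
fake_obs = {
    "image": [[(0, 0, 0) for _ in range(view_size)] for _ in range(view_size)],
    "direction": 3,
    "mission": "unlock all the doors",
}

# The multi-agent observation maps agent index -> agent observation.
observations = {0: fake_obs}
summaries = {index: summarize_observation(o) for index, o in observations.items()}
```

Because the helper only uses len() and indexing, it should also accept an ndarray image, though that has not been checked against a live environment here.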

Action Space#

The multi-agent action space is a Dict mapping from agent index to corresponding agent action space.

Agent actions are discrete integer values, given by:

Num  Name     Action
---  -------  ---------------------------
0    left     Turn left
1    right    Turn right
2    forward  Move forward
3    pickup   Pick up an object
4    drop     Drop an object
5    toggle   Toggle / activate an object
6    done     Done completing task
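For readability in user code, the integer codes in the table above can be mirrored in a small enum. The sketch below is a standalone illustration: the Action class is defined locally for this example, and whether the library exposes an equivalent enum is not confirmed by this page.

```python
from enum import IntEnum


class Action(IntEnum):
    """Local mirror of the action table above (illustrative, not the library's class)."""
    left = 0      # Turn left
    right = 1     # Turn right
    forward = 2   # Move forward
    pickup = 3    # Pick up an object
    drop = 4      # Drop an object
    toggle = 5    # Toggle / activate an object
    done = 6      # Done completing task


# A joint action maps agent index -> that agent's discrete action.
joint_action = {0: Action.forward, 1: Action.toggle}

# IntEnum members convert cleanly to the plain integers the action space expects.
raw_joint_action = {index: int(action) for index, action in joint_action.items()}
```

An IntEnum is used so that members compare equal to the raw integers, which keeps the enum interchangeable with the numeric codes.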

Rewards#

A reward of 1 - 0.9 * (step_count / max_steps) is given when a door is unlocked.
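The formula above can be sketched in a few lines. The function names below are hypothetical, and the behaviour when joint_reward is False (only the unlocking agent is rewarded, everyone else gets 0) is an assumption made for illustration; this page only states that joint_reward controls whether all agents receive the same reward.

```python
def unlock_reward(step_count, max_steps):
    """Reward for unlocking a door, following the documented formula."""
    return 1 - 0.9 * (step_count / max_steps)


def distribute_reward(agent_ids, unlocking_agent, step_count, max_steps, joint_reward=True):
    """Return a dict mapping agent index -> reward for one unlock event.

    Assumption: with joint_reward=False, only the agent that unlocked the
    door is rewarded. The actual library behaviour may differ.
    """
    reward = unlock_reward(step_count, max_steps)
    if joint_reward:
        return {agent_id: reward for agent_id in agent_ids}
    return {agent_id: (reward if agent_id == unlocking_agent else 0.0) for agent_id in agent_ids}


# Unlocking at step 50 of 100 gives roughly 0.55 to every agent in the joint setting.
joint = distribute_reward([0, 1], unlocking_agent=1, step_count=50, max_steps=100)
solo = distribute_reward([0, 1], unlocking_agent=1, step_count=50, max_steps=100, joint_reward=False)
```

The reward decays linearly with the step count, from 1 at step 0 down to about 0.1 at step max_steps, so earlier unlocks are worth more.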

Termination#

The episode ends if any one of the following conditions is met:

  • All doors are unlocked

  • Timeout (see max_steps)
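The two conditions above amount to a simple disjunction. The sketch below uses a hypothetical helper, episode_over, and assumes the timeout triggers once step_count reaches max_steps; the exact comparison used by the library is not stated on this page.

```python
def episode_over(num_locked_doors, step_count, max_steps):
    """Return True if either documented end condition holds.

    Assumption: timeout means step_count >= max_steps.
    """
    all_unlocked = num_locked_doors == 0
    timed_out = step_count >= max_steps
    return all_unlocked or timed_out
```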

Registered Configurations#

  • MultiGrid-LockedHallway-2Rooms-v0

  • MultiGrid-LockedHallway-4Rooms-v0

  • MultiGrid-LockedHallway-6Rooms-v0
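A registered configuration would typically be created through Gymnasium. The sketch below is an untested outline under several assumptions: that importing multigrid.envs registers the IDs listed above, that gym.make accepts an agents keyword for the number of agents, and that step returns per-agent dicts for rewards, terminations, and truncations. The function run_random_episode and its step_limit argument are hypothetical names introduced for this example.

```python
ENV_IDS = [
    "MultiGrid-LockedHallway-2Rooms-v0",
    "MultiGrid-LockedHallway-4Rooms-v0",
    "MultiGrid-LockedHallway-6Rooms-v0",
]


def run_random_episode(env_id, num_agents=2, step_limit=10_000):
    """Run one episode with random actions and return total reward per agent."""
    # Imports are local so this sketch can be loaded without the packages installed.
    import gymnasium as gym
    import multigrid.envs  # noqa: F401  (assumed to register the MultiGrid-* IDs)

    env = gym.make(env_id, agents=num_agents)  # "agents" kwarg is an assumption
    observations, infos = env.reset()
    totals = {}
    for _ in range(step_limit):  # hard cap so the loop always ends
        # The action space is a Dict keyed by agent index, so sampling it
        # yields one random action per agent.
        actions = env.action_space.sample()
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent_index, reward in rewards.items():
            totals[agent_index] = totals.get(agent_index, 0.0) + reward
        if all(terminations.values()) or all(truncations.values()):
            break
    env.close()
    return totals
```

Calling run_random_episode(ENV_IDS[0]) would then play a single episode in the two-room configuration, provided the assumptions above hold for the installed versions.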

__init__(num_rooms: int = 6, room_size: int = 5, max_hallway_keys: int = 1, max_keys_per_room: int = 2, max_steps: int | None = None, joint_reward: bool = True, **kwargs)
Parameters:
num_rooms : int, default=6

Number of rooms in the environment

room_size : int, default=5

Width and height for each of the rooms

max_hallway_keys : int, default=1

Maximum number of keys in the hallway

max_keys_per_room : int, default=2

Maximum number of keys in each room

max_steps : int, optional

Maximum number of steps per episode

joint_reward : bool, default=True

Whether all agents receive the same reward

**kwargs

See multigrid.base.MultiGridEnv.__init__