# A deep dive into robomimic datasets

This notebook shows how to work with robomimic datasets through Python code examples. It assumes that you have installed `robomimic` and `robosuite` (which should be on the `offline_study` branch).

## Download dataset

First, let's download a small dataset: the first 100 demonstrations of the `stack_d0` task. Note that utility scripts such as `scripts/download_datasets.py` can do this for us, but for the purposes of this example, we'll use our class's pre-segmented datasets.

In [2]:
import os
# First, we need to decide where to host the runtime storage
USE_GDRIVE_STORAGE = True

if not USE_GDRIVE_STORAGE:
    # Option 1: use the colab runtime storage. All trained models and downloaded
    # data will disappear after you disconnect from the runtime.
    WS_DIR = "/content/"
else:
    # Option 2: use your google drive as the runtime storage. You need to grant
    # permission for the colab runtime to access your google drive. You also
    # need to decide on a workspace for robomimic. In this case, we've created a
    # folder called "colab_ws" in Google Drive.
    from google.colab import drive
    drive.mount('/content/drive')
    WS_DIR = "/content/drive/MyDrive/colab_ws/" # this should be the absolute path, e.g., "/content/drive/MyDrive/my-ws/"
    assert os.path.exists(WS_DIR)

%cd $WS_DIR

Mounted at /content/drive
/content/drive/MyDrive/colab_ws/suite


In [3]:
# Install the basic requirements
%cd $WS_DIR
!pip install -e robosuite/
!pip install -e robomimic/
!pip install -e mimicgen_environments/
!pip install mujoco

import sys
import os
sys.path.append('./robosuite/')
sys.path.append('./robomimic/')
sys.path.append('./mimicgen_environments/')

/content/drive/MyDrive/colab_ws/suite
Obtaining file:///content/drive/MyDrive/colab_ws/suite/robosuite
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting mujoco>=2.3.0 (from robosuite==1.4.1)
  Downloading mujoco-3.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.4 MB)
     5.4/5.4 MB 15.0 MB/s eta 0:00:00
Collecting pynput (from robosuite==1.4.1)
  Downloading pynput-1.7.6-py2.py3-none-any.whl (89 kB)
     89.2/89.2 kB 10.6 MB/s eta 0:00:00
Collecting glfw (from mujoco>=2.3.0->robosuite==1.4.1)
  Downloading glfw-2.7.0-py2.py27.py3.py30.py31.py32.py33.py34.py35.py36.py37.py38-none-manylinux2014_x8

Now it's time to download some datasets. You can download the full dataset for stack_d0 using this command:


```
python download_datasets.py --dataset_type core --tasks stack_d0 --download_dir $DATA_DIR
```

However, this is a large file (~3GB) containing 1000 demonstrations. To avoid exceeding our Google Drive quotas, let's instead use some [pre-segmented datasets](https://drive.google.com/drive/folders/121aouiwC5U-pcrqTVt7QWnQgA1GE4Fzc?usp=drive_link), where each segment contains just 100 demonstrations. That way we can download as many (or as few) demonstrations as we need. The link below provides the first 100 demonstrations of the stack_d0 task:

In [4]:
DATA_DIR = WS_DIR + "/mimicgen_data/"
import mimicgen_envs.utils.file_utils as FileUtils
FileUtils.download_url_from_gdrive(
    url="https://drive.google.com/file/d/1o2LbBglyY7AtsAs4GqTQ15LH4_cqJFmP",
    download_dir=DATA_DIR,
    check_overwrite=True,
)



    No private macro file found!
    It is recommended to use a private macro file
    To setup, run: python /content/drive/MyDrive/colab_ws/suite/./robomimic/robomimic/scripts/setup_macros.py
)


Downloading...
From (original): https://drive.google.com/uc?id=1o2LbBglyY7AtsAs4GqTQ15LH4_cqJFmP
From (redirected): https://drive.google.com/uc?id=1o2LbBglyY7AtsAs4GqTQ15LH4_cqJFmP&confirm=t&uuid=e19caa39-2216-487d-a217-74174b305daa
To: /tmp/tmpvfry5rb9/stack_d0_100.hdf5
100%|██████████| 115M/115M [00:00<00:00, 151MB/s]


In [5]:
import json
import h5py
import numpy as np

# enforce that the dataset exists
dataset_path = os.path.join(DATA_DIR, "stack_d0_100.hdf5")
assert os.path.exists(dataset_path)
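
The dataset path printed later in this notebook contains a doubled slash (`suite//mimicgen_data`) because `WS_DIR` already ends in `/` and `DATA_DIR` is built by string concatenation. That is harmless on Linux, but `os.path.join` avoids it. A minimal sketch, where `make_data_dir` is a hypothetical helper and a temporary directory stands in for the workspace:

```python
import os
import tempfile

def make_data_dir(ws_dir, name="mimicgen_data"):
    """Join paths without doubled separators and create the directory if needed."""
    data_dir = os.path.join(ws_dir, name)
    os.makedirs(data_dir, exist_ok=True)
    return data_dir

# A temporary directory stands in for WS_DIR (note the trailing slash).
with tempfile.TemporaryDirectory() as tmp_ws:
    ws_dir = tmp_ws + "/"
    data_dir = make_data_dir(ws_dir)
    has_double_slash = "//" in data_dir
    created = os.path.isdir(data_dir)

print("data dir:", data_dir)
print("doubled slash:", has_double_slash)
```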

## Read quantities from dataset

Next, let's demonstrate how to read different quantities from the dataset. There are scripts such as `scripts/get_dataset_info.py` that can help you easily understand the contents of a dataset, but in this example, we'll break down how to do this directly.

First, let's take a look at the number of demonstrations in the file.

In [6]:
# open file
f = h5py.File(dataset_path, "r")

# each demonstration is a group under "data"
demos = list(f["data"].keys())
num_demos = len(demos)

print("hdf5 file {} has {} demonstrations".format(dataset_path, num_demos))

hdf5 file /content/drive/MyDrive/colab_ws/suite//mimicgen_data/stack_d0_100.hdf5 has 100 demonstrations
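
To make the layout concrete, here is a toy file with the same `data/demo_<n>/actions` structure, built in memory so nothing touches disk. The demo indices, trajectory lengths, and the `count_samples` helper are made up for illustration. Only the group layout mirrors the dataset above.

```python
import h5py
import numpy as np

def count_samples(h5_file):
    """Map each demo key under 'data' to its number of actions."""
    return {key: grp["actions"].shape[0] for key, grp in h5_file["data"].items()}

# driver="core" with backing_store=False keeps the toy file in memory only
with h5py.File("toy_demos.hdf5", "w", driver="core", backing_store=False) as toy:
    data = toy.create_group("data")
    for idx, length in [(13, 4), (21, 6), (27, 3)]:
        demo = data.create_group("demo_{}".format(idx))
        # 7-dimensional actions, as in the real dataset
        demo.create_dataset("actions", data=np.zeros((length, 7)))
    samples_per_demo = count_samples(toy)

total_samples = sum(samples_per_demo.values())
print(samples_per_demo)
print("total samples:", total_samples)
```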


Next, let's list all of the demonstrations, along with the number of state-action pairs in each demonstration.

In [7]:
# each demonstration is named "demo_#" where # is a number.
# Let's put the demonstration list in increasing episode order
inds = np.argsort([int(elem[5:]) for elem in demos])
demos = [demos[i] for i in inds]

for ep in demos:
    num_actions = f["data/{}/actions".format(ep)].shape[0]
    print("{} has {} samples".format(ep, num_actions))

demo_13 has 124 samples
demo_21 has 110 samples
demo_27 has 104 samples
demo_33 has 104 samples
demo_44 has 110 samples
demo_45 has 126 samples
demo_54 has 104 samples
demo_63 has 114 samples
demo_72 has 118 samples
demo_74 has 88 samples
demo_103 has 103 samples
demo_129 has 121 samples
demo_140 has 123 samples
demo_156 has 111 samples
demo_169 has 111 samples
demo_197 has 105 samples
demo_200 has 102 samples
demo_202 has 102 samples
demo_212 has 132 samples
demo_221 has 110 samples
demo_234 has 124 samples
demo_242 has 97 samples
demo_244 has 112 samples
demo_259 has 126 samples
demo_260 has 122 samples
demo_267 has 108 samples
demo_269 has 115 samples
demo_271 has 111 samples
demo_280 has 99 samples
demo_284 has 110 samples
demo_286 has 111 samples
demo_287 has 101 samples
demo_300 has 121 samples
demo_313 has 111 samples
demo_315 has 130 samples
demo_322 has 104 samples
demo_324 has 112 samples
demo_328 has 101 samples
demo_331 has 109 samples
demo_343 has 112 samples
demo_345 has 
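
The numeric sort above matters because plain string sorting compares keys character by character, so `demo_103` would land before `demo_13`. A minimal sketch using a few of the keys from this file (`numeric_demo_order` is a hypothetical helper, not part of robomimic):

```python
def numeric_demo_order(keys):
    """Sort keys of the form 'demo_<n>' by the integer <n>."""
    return sorted(keys, key=lambda k: int(k.split("_")[-1]))

keys = ["demo_74", "demo_103", "demo_13", "demo_21"]
lexicographic = sorted(keys)        # string order
numeric = numeric_demo_order(keys)  # episode order

print("lexicographic:", lexicographic)
print("numeric:      ", numeric)
```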

Now, let's dig into a single trajectory to take a look at some of the quantities in each demonstration.

In [8]:
# look at first demonstration
demo_key = demos[0]
demo_grp = f["data/{}".format(demo_key)]

# Each observation is a dictionary that maps modalities to numpy arrays, and
# each action is a numpy array. Let's print the observation modalities and look at
# the action taken in the first 5 timesteps of this trajectory.

print("observation modalities:")
print(demo_grp["obs"].keys())
for t in range(5):
    print("timestep" + str(t) + ":")
    print(demo_grp["obs"]["robot0_eef_pos"][t])
    print(demo_grp["obs"]["robot0_joint_pos"][t])
    print(demo_grp["actions"][t])

observation modalities:
<KeysViewHDF5 ['agentview_image', 'object', 'robot0_eef_pos', 'robot0_eef_quat', 'robot0_eef_vel_ang', 'robot0_eef_vel_lin', 'robot0_eye_in_hand_image', 'robot0_gripper_qpos', 'robot0_gripper_qvel', 'robot0_joint_pos', 'robot0_joint_pos_cos', 'robot0_joint_pos_sin', 'robot0_joint_vel']>
timestep0:
[-0.12866293 -0.00976382  0.98762931]
[-0.00742377  0.22644302 -0.01306292 -2.63885243  0.00832412  2.90485036
  0.80037847]
[-0.07348718 -0.03079287  0.05985955  0.00159009  0.0408305   0.04626626
 -1.        ]
timestep1:
[-0.12959865 -0.009665    0.98774438]
[-0.00746988  0.22451373 -0.01288523 -2.64048272  0.0081662   2.90213795
  0.79809848]
[ 2.18826773e-01  1.65892522e-01  4.85591237e-02 -8.53143749e-04
 -1.78531948e-02 -1.06042288e-01 -1.00000000e+00]
timestep2:
[-0.12861059 -0.00941697  0.98834481]
[-0.00719291  0.22420417 -0.01253234 -2.63859384  0.00880861  2.90149102
  0.80147624]
[ 0.40283534  0.32910132  0.08742945 -0.00228626 -0.0285195  -0.20149955
 -1. 

In [9]:
# we can also grab multiple timesteps at once directly, or even the full trajectory at once
first_ten_actions = demo_grp["actions"][:10]
print("shape of first ten actions {}".format(first_ten_actions.shape))
all_actions = demo_grp["actions"][:]
print("shape of all actions {}".format(all_actions.shape))

shape of first ten actions (10, 7)
shape of all actions (124, 7)
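
Each action has 7 entries. The environment metadata printed later in this notebook lists an `OSC_POSE` controller with six `output_min`/`output_max` limits, which suggests the first six entries are arm commands (three position, three rotation) and the last is the gripper command. The sketch below assumes the controller rescales inputs linearly from `[input_min, input_max]` to `[output_min, output_max]`. That mapping is our reading of the config, not something this notebook verifies, and `scale_arm_action` is a hypothetical helper.

```python
import numpy as np

# Controller limits copied from the env metadata stored in this dataset.
INPUT_MIN, INPUT_MAX = -1.0, 1.0
OUTPUT_MIN = np.array([-0.05, -0.05, -0.05, -0.5, -0.5, -0.5])
OUTPUT_MAX = np.array([0.05, 0.05, 0.05, 0.5, 0.5, 0.5])

def scale_arm_action(action):
    """Linearly map the 6 arm entries of a 7-dim action to controller deltas (assumed mapping)."""
    arm = np.clip(np.asarray(action[:6], dtype=float), INPUT_MIN, INPUT_MAX)
    scale = (OUTPUT_MAX - OUTPUT_MIN) / (INPUT_MAX - INPUT_MIN)
    return (arm - INPUT_MIN) * scale + OUTPUT_MIN

# First action of the first demonstration, as printed above.
first_action = [-0.07348718, -0.03079287, 0.05985955,
                0.00159009, 0.0408305, 0.04626626, -1.0]
delta = scale_arm_action(first_action)

print("position delta:", delta[:3])
print("rotation delta:", delta[3:])
print("gripper command:", first_action[6])
```

Under this assumption, a normalized position entry of 1.0 corresponds to a 0.05 step and a rotation entry of 1.0 to a 0.5 step, in whatever units the controller uses.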


In [10]:
# we also have "done" and "reward" information stored in each trajectory.
# In this case, we have sparse rewards that indicate task completion at
# that timestep.
dones = demo_grp["dones"][:]
rewards = demo_grp["rewards"][:]
print("dones")
print(dones)
print("")
print("rewards")
print(rewards)

dones
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 1 1]

rewards
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 1. 1.]
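
With sparse rewards like these, a common first check is when the task was first completed. A minimal sketch on a toy reward list shaped like the output above (122 zeros followed by two ones). `first_success_step` is a hypothetical helper, and treating any positive reward as success is an assumption.

```python
def first_success_step(rewards):
    """Index of the first timestep with a positive reward, or None if there is none."""
    for t, r in enumerate(rewards):
        if r > 0:
            return t
    return None

# Toy trajectory matching the shape of the printed rewards: 124 steps in total.
toy_rewards = [0.0] * 122 + [1.0, 1.0]

t_success = first_success_step(toy_rewards)
num_success_steps = sum(1 for r in toy_rewards if r > 0)

print("trajectory length:", len(toy_rewards))
print("first success at timestep:", t_success)
print("timesteps with reward:", num_success_steps)
```

On the real data, the same helper could be called on `demo_grp["rewards"][:]`.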


In [11]:
# each demonstration also contains metadata
num_samples = demo_grp.attrs["num_samples"] # number of samples in this trajectory
mujoco_xml_file = demo_grp.attrs["model_file"] # mujoco XML file for this demonstration
print(mujoco_xml_file)

<mujoco model="base">
  <compiler angle="radian" meshdir="meshes/" autolimits="true"/>
  <option impratio="20" density="1.2" viscosity="2e-05" cone="elliptic"/>
  <size njmax="5000" nconmax="5000"/>
  <visual>
    <map znear="0.001"/>
  </visual>
  <asset>
    <texture type="skybox" builtin="gradient" rgb1="0.9 0.9 1" rgb2="0.2 0.3 0.4" width="256" height="1536"/>
    <texture type="2d" name="texplane" file="/home/amandlekar/installed_libraries/robosuite-public/robosuite/models/assets/arenas/../textures/light-gray-floor-tile.png"/>
    <texture type="cube" name="tex-ceramic" file="/home/amandlekar/installed_libraries/robosuite-public/robosuite/models/assets/arenas/../textures/ceramic.png"/>
    <texture type="cube" name="tex-steel-brushed" file="/home/amandlekar/installed_libraries/robosuite-public/robosuite/models/assets/arenas/../textures/steel-brushed.png"/>
    <texture type="2d" name="tex-light-gray-plaster" file="/home/amandlekar/installed_libraries/robosuite-public/robosuite/mod

Next, let's take a look at some global metadata present in the file. The hdf5 file stores environment metadata, which is a convenient way to understand which simulation environment (task) the dataset was collected on.

In [12]:
env_meta = json.loads(f["data"].attrs["env_args"])
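# note: robomimic provides a helper for this in robomimic.utils.file_utils (FileUtils
# in this notebook is mimicgen_envs.utils.file_utils, so import the robomimic module first):
# env_meta = robomimic.utils.file_utils.get_env_metadata_from_dataset(dataset_path=dataset_path)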
print("==== Env Meta ====")
print(json.dumps(env_meta, indent=4))
print("")

==== Env Meta ====
{
    "env_name": "Stack_D0",
    "env_version": "1.4.1",
    "type": 1,
    "env_kwargs": {
        "has_renderer": false,
        "has_offscreen_renderer": true,
        "ignore_done": true,
        "use_object_obs": true,
        "use_camera_obs": true,
        "control_freq": 20,
        "controller_configs": {
            "type": "OSC_POSE",
            "input_max": 1,
            "input_min": -1,
            "output_max": [
                0.05,
                0.05,
                0.05,
                0.5,
                0.5,
                0.5
            ],
            "output_min": [
                -0.05,
                -0.05,
                -0.05,
                -0.5,
                -0.5,
                -0.5
            ],
            "kp": 150,
            "damping": 1,
            "impedance_mode": "fixed",
            "kp_limits": [
                0,
                300
            ],
            "damping_limits": [
                0,
       

## Visualizing demonstration trajectories

Finally, let's play some of these demonstrations back in the simulation environment to easily visualize the data that was collected.

It turns out that the environment metadata stored in the hdf5 allows us to easily create a simulation environment that is consistent with the way the dataset was collected!

In [13]:
import robomimic.utils.env_utils as EnvUtils

# create simulation environment from environment metadata
env = EnvUtils.create_env_from_metadata(
    env_meta=env_meta,
    render=False,            # no on-screen rendering
    render_offscreen=True,   # off-screen rendering to support rendering video frames
)

Created environment with name Stack_D0
Action size is 7


In [14]:
import robomimic.utils.obs_utils as ObsUtils

# We normally need to make sure robomimic knows which observations are images (for the
# data processing pipeline). This is usually inferred from your training config, but
# since we are just playing back demonstrations, we just need to initialize robomimic
# with a dummy spec.
dummy_spec = dict(
    obs=dict(
            low_dim=["robot0_eef_pos"],
            rgb=[],
        ),
)
ObsUtils.initialize_obs_utils_with_obs_specs(obs_modality_specs=dummy_spec)



using obs modality: low_dim with keys: ['robot0_eef_pos']
using obs modality: rgb with keys: []
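
For playback a dummy spec is enough, but a spec covering every observation key can be built from the key list printed earlier. The sketch below assumes that image observations can be recognized by an `_image` suffix, which holds for the keys in this file but is only a naming heuristic. `build_obs_spec` is a hypothetical helper, not a robomimic function.

```python
def build_obs_spec(obs_keys, image_suffix="_image"):
    """Split observation keys into low_dim and rgb lists using a name suffix (heuristic)."""
    rgb = sorted(k for k in obs_keys if k.endswith(image_suffix))
    low_dim = sorted(k for k in obs_keys if not k.endswith(image_suffix))
    return dict(obs=dict(low_dim=low_dim, rgb=rgb))

# Observation keys copied from the output of demo_grp["obs"].keys() above.
obs_keys = [
    "agentview_image", "object", "robot0_eef_pos", "robot0_eef_quat",
    "robot0_eef_vel_ang", "robot0_eef_vel_lin", "robot0_eye_in_hand_image",
    "robot0_gripper_qpos", "robot0_gripper_qvel", "robot0_joint_pos",
    "robot0_joint_pos_cos", "robot0_joint_pos_sin", "robot0_joint_vel",
]

spec = build_obs_spec(obs_keys)
print("rgb keys:", spec["obs"]["rgb"])
print("number of low_dim keys:", len(spec["obs"]["low_dim"]))
```

The resulting dictionary has the same shape as `dummy_spec`, so it could be passed to `ObsUtils.initialize_obs_utils_with_obs_specs` in its place.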


In [15]:
import imageio

# prepare to write playback trajectories to video
video_path = os.path.join(DATA_DIR, "playback.mp4")
video_writer = imageio.get_writer(video_path, fps=20)

In [16]:
def playback_trajectory(demo_key):
    """
    Simple helper function to play back the trajectory stored under the hdf5 group @demo_key and
    write frames rendered from the simulation to the active @video_writer.
    """

    # robosuite datasets store the ground-truth simulator states under the "states" key.
    # We will use the first one, along with the model xml, to reset the environment to
    # the initial configuration before playing back actions.
    init_state = f["data/{}/states".format(demo_key)][0]
    model_xml = f["data/{}".format(demo_key)].attrs["model_file"]
    initial_state_dict = dict(states=init_state, model=model_xml)

    # reset to initial state
    env.reset_to(initial_state_dict)

    # play back actions one by one, and render frames
    actions = f["data/{}/actions".format(demo_key)][:]
    for t in range(actions.shape[0]):
        env.step(actions[t])
        video_img = env.render(mode="rgb_array", height=512, width=512, camera_name="agentview")
        video_writer.append_data(video_img)

In [18]:
# play back the first 3 demos and record them to a video file
for ep in demos[:3]:
    print("Playing back demo key: {}".format(ep))
    playback_trajectory(ep)

# done writing video
video_writer.close()

Playing back demo key: demo_13
Playing back demo key: demo_21
Playing back demo key: demo_27
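
Since `playback_trajectory` writes one frame per environment step, the length of the resulting video follows directly from the sample counts printed earlier and the writer's `fps`. A small worked example (`video_duration_seconds` is a hypothetical helper):

```python
def video_duration_seconds(num_frames, fps):
    """Duration of a video with num_frames frames written at fps frames per second."""
    return num_frames / fps

# Sample counts of the three demos played back, taken from the listing above.
demo_lengths = {"demo_13": 124, "demo_21": 110, "demo_27": 104}

total_frames = sum(demo_lengths.values())
duration = video_duration_seconds(total_frames, fps=20)

print("total frames:", total_frames)
print("video duration (s):", duration)
```

Because the writer's `fps=20` matches the `control_freq` of 20 in the environment metadata, the video should play at roughly the speed of the original rollout.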


In [19]:
# view the trajectories!
from IPython.display import Video
Video(video_path, embed=True)