LILocBench: Benchmark for Evaluating Long-Term Localization in Indoor Environments Under Substantial Static and Dynamic Scene Changes

Data modalities provided in the dataset: 2D LiDAR scans, RGB images, depth images, and odometry.

We introduce LILocBench, a Long-term Indoor Localization Benchmark. Our benchmark comes with a dataset recorded with a mobile robot platform in an indoor office environment. We provide data from 2D LiDARs, RGB-D cameras, and the robot's wheel odometry. To facilitate the evaluation of localization algorithms on our data, we also provide accurate ground truth in a single global frame covering the complete environment.

Overview

Colored point cloud of the office environment that is captured in our dataset.

We recorded various sequences in an office environment consisting of multiple office rooms of different sizes and a kitchen, which is depicted in the colored point cloud above.

We recorded the sequences under different conditions, which can be broadly classified into sequences in a static environment, sequences with short-term dynamics, and sequences with long-term changes. Typical examples of the non-static characteristics are shown below:

Example images showing dynamics and long-term changes.

The following table provides an overview of our dataset's statistics, which we group by the different characteristics of the sequences:

type | number of sequences | duration [s] | size [GB] | distance [m]
mapping | 1 | 812.0 | 56.4 | 217.8
static | 7 | 2857.8 | 198.5 | 1025.1
dynamics | 5 | 1261.2 | 87.6 | 375.8
long-term changes | 6 | 2688.0 | 186.7 | 979.0
long-term changes + dynamics | 2 | 1206.0 | 83.8 | 437.2
overall | 21 | 8824.9 | 612.9 | 3034.8

Hardware Platform

For the data collection, we use a Clearpath Robotics Dingo omnidirectional wheeled robot. We gather data from the following sensors:

  • 2x SICK TIM781S 2D LiDARs (@15 Hz)
  • 3x Intel RealSense D455 RGB-D cameras (640 × 480 pixels @15 Hz)
  • the robot's wheel odometry (@15 Hz)
  • an upward-facing FLIR BlackFly S PGE 31S4C-C fisheye camera (2048 × 1036 pixels @20 Hz), solely used to detect ceiling-mounted AprilTag markers to obtain pose ground truth

Data Format and Folder Structure

Overall, we provide all data recorded by our hardware platform except the upward-facing camera's images, which are used only for the ground truth. We provide the raw data in two different ROS bags: one version with all data and, for convenience, a reduced version that does not contain any camera-related topics. In addition, we provide the data as individual human-readable files. For each sequence, these files are organized in the following folder structure:

  • sequence_XXX
    • camera_XXX
      • color
        • images
          • 1733375656.630170663.png
          • 1733375656.696981258.png
          •          
        • intrinsics.yaml
      • depth
        • images
          • 1733375656.638256597.png
          • 1733375656.704976817.png
          •          
        • intrinsics.yaml
      • extrinsics_depth_to_color.yaml
    • dingo_velocity_controller
      • cmd_vel.txt
      • odom.txt
    • joint_states
      • front_left_wheel.txt
      • front_right_wheel.txt
      • rear_left_wheel.txt
      • rear_right_wheel.txt
    • laser_scan_XXX
      • intensities.txt
      • ranges.txt
      • laser_specs.yaml
    • transformations.yaml
    • robot_params.yaml
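
As a minimal illustration of this layout, the Python sketch below pairs the color and depth images of one camera by their file-name time stamps. The folder names sequence_000 and camera_front are placeholders for the actual sequence and camera names in the dataset.

    # Sketch: pair color and depth images of one camera by nearest time stamp.
    # "sequence_000" and "camera_front" are placeholder folder names.
    from pathlib import Path

    def load_stamps(image_dir: Path):
        """Return a sorted list of (time stamp [s], file path) taken from the image names."""
        return [(float(f.stem), f) for f in sorted(image_dir.glob("*.png"))]

    def pair_color_depth(camera_dir: Path, max_dt: float = 0.05):
        """Associate each color image with the closest depth image in time."""
        color = load_stamps(camera_dir / "color" / "images")
        depth = load_stamps(camera_dir / "depth" / "images")
        pairs = []
        for t_c, f_c in color:
            t_d, f_d = min(depth, key=lambda x: abs(x[0] - t_c))
            if abs(t_d - t_c) <= max_dt:
                pairs.append((f_c, f_d))
        return pairs

    pairs = pair_color_depth(Path("sequence_000/camera_front"))
    print(f"{len(pairs)} color/depth pairs")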

In more detail, the data is stored as follows:

  • camera_XXX: For each of the three cameras, we store all RGB and depth images as .png files in the corresponding subfolders. The names of the images correspond to their time stamps. In the subfolders, we also provide the camera intrinsics in intrinsics.yaml. The transformation between the depth and the RGB frame is provided in the extrinsics_depth_to_color.yaml file.
  • dingo_velocity_controller: Contains poses from wheel odometry and input velocity commands to the robot.
    The poses calculated from the wheel odometry are stored in odom.txt with the following columns (see the parsing sketch after this list):
    time_stamp pos[3] quat[4] pose_cov[36] vel_t[3] vel_r[3] vel_cov[36]
    where pos[3] = pos_x pos_y pos_z is the position, quat[4] = quat_x quat_y quat_z quat_w a quaternion representing the rotational part, vel_t[3] = vel_tx vel_ty vel_tz is the translational velocity, and vel_r[3] = vel_rx vel_ry vel_rz the rotational velocity. pose_cov[36] and vel_cov[36] are the pose's and the velocity's covariance matrices.
    The input command velocities in cmd_vel.txt follow a format with the columns:
    time_stamp vel_t[3] vel_r[3]
  • joint_states: We also provide the raw output of the four wheel encoders in the four corresponding .txt files. The columns in the files correspond to:
    time_stamp position velocity current
  • laser_scan_XXX: For the two 2D LiDARs, the ranges and intensities are provided in separate files with the following columns (see the parsing sketch after this list):
    ranges.txt: time_stamp ranges[]
    intensities.txt: time_stamp intensities[]
    The parameters of the laser scanner, e.g., those required to calculate points from the ranges, are provided in the file laser_specs.yaml.
  • transformations.yaml: Contains all transformations between the sensor frames and the robot base.
  • robot_params.yaml: Contains additional parameters related to the robot, specifically the position of the wheels relative to the robot base.
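
The sketch below (referenced in the items above) illustrates how odom.txt could be parsed and how a single laser scan could be converted to 2D points in the laser frame. It is a minimal sketch: the folder names sequence_000 and laser_scan_front are placeholders, and the key names angle_min and angle_increment in laser_specs.yaml are assumptions about that file's content.

    # Sketch: read odometry poses from odom.txt and convert one laser scan to
    # 2D points. "sequence_000" and "laser_scan_front" are placeholder names;
    # the keys "angle_min" and "angle_increment" in laser_specs.yaml are assumed.
    import numpy as np
    import yaml

    def load_odometry(path):
        """Columns: time_stamp pos[3] quat[4] pose_cov[36] vel_t[3] vel_r[3] vel_cov[36]."""
        data = np.loadtxt(path)
        return data[:, 0], data[:, 1:4], data[:, 4:8]   # stamps, positions, quaternions (qx qy qz qw)

    def scan_to_points(ranges, specs):
        """Convert one row of ranges.txt (without its time stamp) to 2D points in the laser frame."""
        angles = specs["angle_min"] + np.arange(len(ranges)) * specs["angle_increment"]
        valid = np.isfinite(ranges) & (ranges > 0.0)
        return np.stack([ranges[valid] * np.cos(angles[valid]),
                         ranges[valid] * np.sin(angles[valid])], axis=1)

    stamps, positions, quaternions = load_odometry("sequence_000/dingo_velocity_controller/odom.txt")

    with open("sequence_000/laser_scan_front/laser_specs.yaml") as f:
        specs = yaml.safe_load(f)
    scan = np.loadtxt("sequence_000/laser_scan_front/ranges.txt", max_rows=1)
    points = scan_to_points(scan[1:], specs)            # drop the leading time stamp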

Coordinate Frames

The relation between the coordinate frames in the transformations.yaml file is depicted in the image below. The /tf and /tf_static topics in the ROS bags contain additional robot-related frames, which, however, are not related to the provided sensor information. For the cameras, the frames follow the convention used by RealSense cameras, with camera_XXX_link aligned with camera_XXX_depth_frame. For all three of camera_XXX_link, camera_XXX_depth_frame, and camera_XXX_color_frame, the x-axis points out of the image plane and the z-axis upward. The optical frames of the cameras, i.e., camera_XXX_color_optical_frame and camera_XXX_depth_optical_frame, follow the convention with the z-axis pointing out of the image plane and the y-axis pointing downward.

Tree of the frames provided in the transformations.yaml file.
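
As an illustration of the frame conventions described above, the sketch below maps a point from a camera's optical frame into the corresponding camera_XXX_color_frame / camera_XXX_depth_frame convention. It covers only the axis permutation implied by the text; the actual transformations between the sensor frames are those provided in transformations.yaml.

    # Sketch: optical frame (z forward, x right, y down) to the link-style
    # frame convention (x forward, y left, z up). Axis permutation only; the
    # full sensor transformations are given in transformations.yaml.
    import numpy as np

    # x_link = z_opt (forward), y_link = -x_opt (left), z_link = -y_opt (up)
    R_LINK_FROM_OPTICAL = np.array([[ 0.0,  0.0, 1.0],
                                    [-1.0,  0.0, 0.0],
                                    [ 0.0, -1.0, 0.0]])

    p_optical = np.array([0.1, -0.2, 1.5])      # example point, 1.5 m in front of the camera
    p_link = R_LINK_FROM_OPTICAL @ p_optical    # -> [ 1.5, -0.1, 0.2]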

Ground Truth Poses

We generate ground truth poses by using the upward-facing fisheye camera to detect AprilTag markers installed on the ceilings of all rooms. This allows us to provide pose information over the complete environment in a single global frame that is consistent across all sequences. When the ground truth poses are used together with the data from the mapping sequence to build an arbitrary map representation, the poses estimated on the localization sequences with that map can be compared directly against the ground truth poses without requiring trajectory alignment. The ground truth poses are given with respect to the robot's base_link.
The ground truth pose files follow the TUM format, i.e., the files have the following columns: time_stamp pos_x pos_y pos_z quat_x quat_y quat_z quat_w
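
Because the ground truth and any estimated trajectory share the same global frame, the error can be computed directly without trajectory alignment. The minimal sketch below loads two TUM-format files, associates poses by nearest time stamp, and reports the translational RMSE; the file names and the association threshold max_dt are placeholders.

    # Sketch: compare an estimated TUM-format trajectory against the ground
    # truth, both expressed in the same global frame (no alignment needed).
    # File names and max_dt are placeholders.
    import numpy as np

    def load_tum(path):
        """Columns: time_stamp pos_x pos_y pos_z quat_x quat_y quat_z quat_w."""
        data = np.loadtxt(path)
        return data[:, 0], data[:, 1:4]          # time stamps, positions

    def translational_rmse(gt_file, est_file, max_dt=0.02):
        t_gt, p_gt = load_tum(gt_file)
        t_est, p_est = load_tum(est_file)
        errors = []
        for t, p in zip(t_est, p_est):
            i = np.argmin(np.abs(t_gt - t))      # nearest ground truth pose in time
            if abs(t_gt[i] - t) <= max_dt:
                errors.append(np.linalg.norm(p - p_gt[i]))
        return np.sqrt(np.mean(np.square(errors)))

    print(f"translational RMSE: {translational_rmse('gt_static_0.txt', 'my_estimate.txt'):.3f} m")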

Download

For convenience, we provide data both as ROS bags and as individual data files. Below are links for downloading the complete dataset as well as individual sequences. We provide ground truth only for a subset of the sequences; the sequences with non-public ground truth are part of the benchmark, as described further below.

Full dataset

Individual Files | ROS bags (full / no camera topics) | Ground Truth
(203.3 GB) | (222.0 GB) / (1.1 GB) | (3.6 MB)

With ground truth

Sequence | Description | Individual Files | ROS bags (full / no camera topics) | Ground Truth
mapping | mapping sequence | (18.3 GB) | (56.4 GB) / (192.2 MB) | (2.6 MB)
static_0 | static environment | (13.9 GB) | (41.6 GB) / (141.8 MB) | (1.9 MB)
dynamics_0 | ~10 people moving in the vicinity of the robot | (3.8 GB) | (11.1 GB) / (38.1 MB) | (518.9 KB)
lt_changes_0 | rearranged objects in multiple rooms + hallway (rearranged furniture, cardboard boxes placed in the environment) | (10.0 GB) | (30.3 GB) / (103.8 MB) | (1.4 MB)
lt_changes_dynamics_0 | combination of rearranged objects in multiple rooms + hallway and moving people | (12.9 GB) | (38.8 GB) / (132.9 MB) | (1.8 MB)

Without ground truth

Sequence | Description | Individual Files | ROS bags (full / no camera topics)
static_1 | static environment | (16.6 GB) | (49.7 GB) / (169.5 MB)
static_2 | static environment, robot moves omnidirectionally | (4.1 GB) | (12.2 GB) / (41.6 MB)
static_3 | static environment, robot moves omnidirectionally | (4.8 GB) | (14.2 GB) / (48.7 MB)
static_4 | static environment, robot moves omnidirectionally | (7.1 GB) | (20.8 GB) / (71.3 MB)
static_5 | "natural" conditions typically found in office environments, e.g. small scene changes due to moved chairs and few dynamics due to humans | (2.4 GB) | (6.9 GB) / (23.7 MB)
static_6 | "natural" conditions typically found in office environments, e.g. small scene changes due to moved chairs and few dynamics due to humans | (17.6 GB) | (53.0 GB) / (180.9 MB)
dynamics_1 | ~10 people moving in the vicinity of the robot | (3.2 GB) | (9.3 GB) / (32.0 MB)
dynamics_2 | ~10 people moving in the vicinity of the robot | (3.8 GB) | (11.3 GB) / (39.0 MB)
dynamics_3 | the robot follows a person carrying a large board | (13.1 GB) | (39.6 GB) / (136.4 MB)
dynamics_4 | ~10 people moving in the vicinity of the robot, also shifting cardboard boxes around | (5.6 GB) | (16.2 GB) / (56.3 MB)
lt_changes_1 | rearranged objects in multiple rooms + hallway (rearranged furniture, cardboard boxes placed in the environment) | (5.8 GB) | (17.2 GB) / (59.0 MB)
lt_changes_2 | a wall in the hallway is "shifted" by placing cardboard boxes in front of it | (7.5 GB) | (22.6 GB) / (77.3 MB)
lt_changes_3 | robot moves only in the hallway, doors to all offices, which were previously open, are closed | (10.1 GB) | (31.9 GB) / (109.1 MB)
lt_changes_4 | objects are placed in the hallway in such a way that the robot traverses an area that is entirely inconsistent with the mapping sequence | (22.8 GB) | (70.1 GB) / (240.1 MB)
lt_changes_5 | objects are placed in the hallway in such a way that the robot traverses an area that is entirely inconsistent with the mapping sequence | (4.7 GB) | (14.4 GB) / (49.4 MB)
lt_changes_dynamics_1 | combination of rearranged objects in multiple rooms + hallway and moving people | (15.1 GB) | (44.9 GB) / (153.9 MB)

Occupancy Map

While the mapping sequence can be used to build an arbitrary map representation for localization, we also provide an occupancy grid map created from the scans of the front 2D LiDAR. The map is generated from the ground truth poses, with the laser scan poses additionally refined by scan matching.
The download below comes with a .yaml metadata file following the ROS1 map_server format.
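
As a minimal sketch of how the provided map can be used, the code below reads the map_server metadata (image, resolution, origin) and converts a grid cell to metric coordinates in the map frame. The file name map.yaml is a placeholder, and a zero yaw angle in the origin is assumed.

    # Sketch: load the occupancy grid via its ROS1 map_server metadata and
    # convert a grid cell to map-frame coordinates. "map.yaml" is a
    # placeholder file name; the origin's yaw is assumed to be zero.
    from pathlib import Path
    import numpy as np
    import yaml
    from PIL import Image

    map_yaml = Path("map.yaml")
    with open(map_yaml) as f:
        meta = yaml.safe_load(f)                       # image, resolution, origin, ...

    grid = np.array(Image.open(map_yaml.parent / meta["image"]))   # occupancy image (row 0 = top)
    resolution = meta["resolution"]                    # meters per cell
    origin_x, origin_y, _ = meta["origin"]             # pose of the lower-left cell

    def cell_to_world(row, col):
        """Center of cell (row, col) in the map frame."""
        x = origin_x + (col + 0.5) * resolution
        y = origin_y + (grid.shape[0] - row - 0.5) * resolution
        return x, y

    print(cell_to_world(0, 0))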

LILocBench CodaBench Challenge

We created a CodaBench challenge for LILocBench, a public competition to evaluate localization algorithms on the part of our dataset without public ground truth. The challenge and information on how to participate can be found at:

In addition, we provide the CodaBench evaluation code for the challenge, which can be used to evaluate localization algorithms on the sequences with public ground truth. The evaluation code is available on GitHub.

Help & Issues

For any questions or issues with the data or the benchmark challenge, please open an issue on GitHub.

Citation

If you use our dataset, we would appreciate it if you cite our paper (PDF):

@inproceedings{trekel2025iros,
  author    = {N. Trekel and T. Guadagnino and T. L\"abe and L. Wiesmann and P. Aguiar and J. Behley and C. Stachniss},
  title     = {{Benchmark for Evaluating Long-Term Localization in Indoor Environments Under Substantial Static and Dynamic Scene Changes}},
  booktitle = {Proc.~of the IEEE/RSJ Intl.~Conf.~on Intelligent Robots and Systems (IROS)},
  year      = {2025},
}