53 changes: 26 additions & 27 deletions docs/closed_set.md
@@ -16,7 +16,7 @@ In some cases, a more minimal installation is desirable (e.g., containers). The

1. Add the CUDA repositories [here](https://developer.nvidia.com/cuda-downloads) by installing the `deb (network)` package or

```bash
```shell
# make sure you pick the correct ubuntu version!
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
@@ -34,21 +34,21 @@ libnvinfer-dev/unknown,now 10.4.0.26-1+cuda12.6 amd64

3. Install TensorRT and CUDA if necessary:

```bash
```shell
# use the corresponding version number from the previous step or omit nvcc if already installed
sudo apt install libnvinfer-dev libnvonnxparsers-dev libnvinfer-plugin-dev cuda-nvcc-12-6
```
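
To sanity-check the result (a sketch; assumes the CUDA 12.6 packages from the previous step):

```shell
# confirm the TensorRT development headers are present
apt list --installed 2>/dev/null | grep libnvinfer-dev
# confirm the CUDA compiler is reachable (may require /usr/local/cuda/bin on PATH)
/usr/local/cuda/bin/nvcc --version
```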

### Building

Once the necessary dependencies are installed and this repository has been placed in a workspace, run the following:
```
catkin build
```shell
colcon build
```

You can run the following to make sure everything is working:
```
catkin test semantic_inference
You can run the following to validate that `semantic_inference` built correctly:
```shell
colcon test --packages-select semantic_inference
```
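
If the tests fail, `colcon` can summarize the results (standard `colcon` verbs):

```shell
# list test results, including failures, with their output
colcon test-result --all --verbose
```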

### Models
@@ -61,7 +61,7 @@ By default, the code uses [this](https://drive.google.com/file/d/1XRcsyLSvqqhqNI
> We recommend using models within the [dense2d](https://drive.google.com/drive/folders/17p_ZZIxI9jI_3GjjtbMijC2WFnc9Bz-a?usp=sharing) folder, which are named corresponding to the labelspace they output to.
> The top-level models are deprecated as they do not follow this naming scheme (they all output to the ade20k label space).

By default, the closed set node looks under the directory `$HOME/.semantic_inference$` for models (this works on Linux or as long as `HOME` is set).
By default, the closed set node looks under the directory `$HOME/.semantic_inference` for models (this works on Linux or as long as `HOME` is set).
It is possible to change this directory by specifying the `SEMANTIC_INFERENCE_MODEL_DIR` environment variable.
To use a specific downloaded model, use the argument `model_file:=MODEL_FILE` when running the appropriate launch file (where `MODEL_FILE` is the filename of the model relative to the configured model directory).
Specifying an absolute filepath will override the default model directory.
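
For example (a hypothetical setup; `my_model.onnx` stands in for whichever model you downloaded):

```shell
# point the node at a custom model directory...
export SEMANTIC_INFERENCE_MODEL_DIR=$HOME/models
# ...and select a model relative to that directory
ros2 launch semantic_inference_ros closed_set.launch.yaml model_file:=my_model.onnx
```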
@@ -73,37 +73,36 @@ See [here](exporting.md) for details on exporting a new model.
### Python utilities

You may find it useful to set up some of the included model utilities. From the top-level of this repository, run:
```
python3 -m virtualenv /path/to/new/environment
source /path/to/new/environment/bin/activate
```shell
python3 -m virtualenv <DESIRED_PATH_TO_ENVIRONMENT>
source <DESIRED_PATH_TO_ENVIRONMENT>/bin/activate
pip install ./semantic_inference
```

## Using closed-set segmentation online

To use the open-set segmentation as part of a larger system, include [semantic_inference.launch](../semantic_inference_ros/launch/semantic_inference.launch) in your launch file. Often this will look like this:
```xml
<launch>

<!-- ... rest of launch file ... -->

<remap from="semantic_inference/color/image_raw" to="YOUR_INPUT_TOPIC_HERE"/>
<include file="$(find semantic_inference_ros)/launch/semantic_inference.launch"/>

</launch>
To use the closed-set segmentation as part of a larger system, include [closed_set.launch.yaml](../semantic_inference_ros/launch/closed_set.launch.yaml) in your launch file. Often this will look like the following:
```yaml
launch:
# ... rest of launch file ...
- set_remap: {from: "color/image_raw", to: "YOUR_INPUT_TOPIC_HERE"}
- include: {file: "$(find-pkg-share semantic_inference_ros)/launch/closed_set.launch.yaml"}
```

> **Note** <br>
> You'll probably also want to namespace the included launch file and corresponding remap via a `group` action and `push_ros_namespace`, with the camera name as the namespace.

## Adding New Datasets

To adapt to a new dataset (or to make a new grouping of labels), you will have to:

- Make a new grouping config (see [this](config/label_groupings/ade150_outdoor.yaml) or [this](config/label_groupings/ade150_indoor.yaml) for examples)
- Make a new grouping config (see [this](../semantic_inference_ros/config/label_groupings/ade20k_outdoor.yaml) or [this](../semantic_inference_ros/config/label_groupings/ade20k_indoor.yaml) for examples)
- Pass in the appropriate arguments to the launch file (`labelspace_name`)

You can view the groupings for a particular labelspace by running `semantic_inference labelspace compare`.
You can view the groupings for a particular labelspace by running `semantic-inference labelspace compare`.
For example, to compare a grouping of the ade20k labelspace against the original labels:
```bash
source ~/path/to/environment/bin/activate
cd path/to/semantic_inference
semantic_inference labelspace compare resources/ade20k_categories.csv config/label_groupings/ade150_indoor.yaml
```shell
source <DESIRED_PATH_TO_ENVIRONMENT>/bin/activate
cd <PATH_TO_REPO>
semantic-inference labelspace compare semantic_inference/resources/ade20k_categories.csv semantic_inference_ros/config/label_groupings/ade20k_indoor.yaml
```
23 changes: 2 additions & 21 deletions docs/exporting.md
Original file line number Diff line number Diff line change
@@ -1,34 +1,15 @@
# Exporting Instructions

When exporting a new model, you will also need to write a config similar to [this](../semantic_inference/config/models/ade20k-efficientvit_seg_l2.yaml).
When exporting a new model, you will also need to write a config similar to [this](../semantic_inference_ros/config/models/ade20k-efficientvit_seg_l2.yaml).

## ADE20k baseline models

You can export all compatible models from the MIT scene parsing challenge via [this script](../exporting/export_mit_semseg.py).
At the moment, only `hrnetv2-c1` and `mobilenetv2dilated-c1_deepsup` are compatible.
This may change as the newer ONNX export method in torch becomes stable (or not; it is unclear whether the custom batch-norm operator will ever work with the export).
To run the script, you will want to create a separate virtual environment and install dependencies lists in the [pyproject file](../semantic_inference/setup.cfg).
To run the script, you will want to create a separate virtual environment and install the dependencies listed in the [pyproject file](../semantic_inference/pyproject.toml).
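
One possible workflow (a sketch; the exact flags the script accepts are not shown here, so start from its help output):

```shell
# separate environment for the export dependencies
python3 -m virtualenv <EXPORT_ENV>
source <EXPORT_ENV>/bin/activate
pip install ./semantic_inference
# inspect the available options (assumes the script exposes a CLI)
python3 exporting/export_mit_semseg.py --help
```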

## EfficientViT

Make a separate virtual environment (with `python3.10`) and install just `efficientvit` into it (pip installing from the git repository URL works well).
Run [this script](../exporting/export_efficientvit.py). You may need other dependencies (`click`, `matplotlib`, etc.).
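
A sketch of that setup (the git URL and extra packages are assumptions; adjust as needed):

```shell
python3.10 -m virtualenv <EFFICIENTVIT_ENV>
source <EFFICIENTVIT_ENV>/bin/activate
# install efficientvit from its repository, plus the extras mentioned above
pip install "git+https://github.com/mit-han-lab/efficientvit.git" click matplotlib
python3 exporting/export_efficientvit.py --help
```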

## OneFormer

Requires pytorch 1.13.1 (i.e., the last release before 2.0) and CUDA 11.7. Do NOT use conda.

For CUDA:
```
sudo apt install cuda-libraries-11-7 cuda-libraries-dev-11-7 cuda-nvrtc-dev-11-7 cuda-nvcc-11-7
sudo update-alternatives --config cuda
```

Other tweaks:
- Install detectron2 directly from the repo
- Remove pinned versions from the requirements file (as well as natten)
- Install the corresponding version of natten from the SHI-labs website

In general, the export does not seem to work.
Newer versions of pytorch lead to significantly decreased model performance, and there are too many problems with how inputs and outputs are passed around.
See [this script](../exporting/export_oneformer.py) for the closest attempt.
67 changes: 30 additions & 37 deletions docs/open_set.md
@@ -2,33 +2,30 @@

## Setting Up

The open-set segmentation interface works with and without ROS. For working with ROS, we assume you have already built your catkin workspace with this repository in it beforehand (i.e., by running `catkin build`).
The open-set segmentation interface works with and without ROS. For working with ROS, we assume you have already built your workspace with this repository in it beforehand (i.e., by running `colcon build`).

> **Note** <br>
> If you intend only to use the open-set segmentation interface, you may want to turn off building against TensorRT, which you can do by the following:
> ```
> catkin config -a -DSEMANTIC_INFERENCE_USE_TRT=OFF
> ```shell
> colcon build --cmake-args --no-warn-unused-cli -DSEMANTIC_INFERENCE_USE_TRT=OFF
> ```

### Installing

We assume you are using a virtual environment. You may want to install `virtualenv` (usually `sudo apt install python3-virtualenv`) if you haven't already.
To set up a virtual environment for use with ROS:
```
python3 -m virtualenv -p /usr/bin/python3 --system-site-packages /desired/path/to/environment
```shell
python3 -m virtualenv -p /usr/bin/python3 --system-site-packages <DESIRED_PATH_TO_ENVIRONMENT>
```
Otherwise, omit the ``--system-site-packages`` option:
```
python3 -m virtualenv -p /usr/bin/python3 --download /desired/path/to/environment
```shell
python3 -m virtualenv -p /usr/bin/python3 --download <DESIRED_PATH_TO_ENVIRONMENT>
```

> :warning: **Warning** <br>
> Note that newer versions of `setuptools` are incompatible with `--system-site-packages` on Ubuntu 20.04. Do not combine `--download` with `--system-site-packages` and expect the installation to work (specifically with external packages specified by git URL).

Then, install `semantic_inference`:
```bash
cd /path/to/repo
source /path/to/environment
```shell
cd <PATH_TO_REPO>
source <PATH_TO_ENVIRONMENT>/bin/activate
pip install ./semantic_inference[openset] # note that the openset extra is required for open-set semantic segmentation
```
You may see dependency version errors from pip if installing into an environment created with `--system-site-packages`. This is expected.
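
A quick smoke test that the install succeeded (this just confirms the package imports):

```shell
python3 -c "import semantic_inference; print('ok')"
```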
@@ -40,33 +37,29 @@ It is also possible to install via an editable install (i.e., by using `-e` when
Note that both CLIP and FastSAM automatically download the relevant model weights when they are first run.
Running with the original SAM may require downloading the model weights. See the official SAM repository [here](https://github.com/facebookresearch/segment-anything) for more details.
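
For reference, a minimal sketch of fetching the original SAM checkpoint (the URL below is the published ViT-H checkpoint at the time of writing; see the SAM repository for current links):

```shell
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
```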

## Using open-set segmentation online

To use the open-set segmentation as part of a larger system, include [openset_segmentation.launch](../semantic_inference_ros/launch/openset_segmentation.launch) in your launch file. Often this will look like this:
```xml
<launch>
## Trying out open-set segmentation nodes

<!-- ... rest of launch file ... -->
Similar to the example [here](../README.md#usage), you can run any of the open-set launch files:

<remap from="semantic_inference/color/image_raw" to="YOUR_INPUT_TOPIC_HERE"/>
<include file="$(find semantic_inference_ros)/launch/openset_segmentation.launch"/>

</launch>
```shell
source <PATH_TO_ENVIRONMENT>/bin/activate
## this example just produces an embedding vector per image
# ros2 launch semantic_inference_ros image_embedding_node.launch.yaml
ros2 launch semantic_inference_ros open_set.launch.yaml
```
and then run
```shell
ros2 bag play PATH_TO_BAG --remap INPUT_TOPIC:=/color/image_raw
```
Note there are some arguments you may want to specify when including `openset_segmentation.launch` that are not shown here, specifically the configuration for the model to use.

## Pre-generating semantics
You should see a single embedding vector published under `/semantic/feature` and, if running the full open-set segmenter, the segmentation results under `/semantic/image_raw`, with visualizations under `/semantic_color/image_raw` and `/semantic_overlay/image_raw`.
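
To confirm, you can inspect the topics directly (standard `ros2` CLI; topic names assume the default namespace):

```shell
ros2 topic list | grep semantic
# print one feature message and exit
ros2 topic echo /semantic/feature --once
```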

It is also possible to pre-generate semantics when working with recorded data.
To create a rosbag containing the original bag contents *plus* the resulting open-set segmentation, run the following
```
rosrun semantic_inference_ros make_rosbag --copy /path/to/input_bag \
/color_topic:/output_topic \
-o /path/to/desired/output_bag
```
replacing `/color_topic` and `/output_topic` with appropriate topic names (usually `/camera_name/color/image_raw` and `/camera_name/semantic/image_raw`).
## Using open-set segmentation online

Additional options exist.
Running without `--copy` will output just the open-set segmentation at the path specified by `-o`.
If no output path is specified, the semantics will be added in-place to the bag after a confirmation prompt (you can disable the prompt with `-y`).
Additional information and documentation is available via `--help`.
To use the open-set segmentation as part of a larger system, include [open_set.launch.yaml](../semantic_inference_ros/launch/open_set.launch.yaml) in your launch file. Often this will look like the following:
```yaml
launch:
# ... rest of launch file ...
- set_remap: {from: "color/image_raw", to: "YOUR_INPUT_TOPIC_HERE"}
- include: {file: "$(find-pkg-share semantic_inference_ros)/launch/open_set.launch.yaml"}
```
2 changes: 2 additions & 0 deletions semantic_inference/pyproject.toml
@@ -22,12 +22,14 @@ dependencies = [
"imageio",
"onnx",
"pandas",
"rich",
"rosbags",
"ruamel.yaml",
"seaborn",
"torch",
"torchvision",
"spark_config@git+https://github.com/MIT-SPARK/Spark-Config.git",
"numpy<2",
]

[tool.setuptools.packages.find]
75 changes: 26 additions & 49 deletions semantic_inference/python/semantic_inference/commands/labelspace.py
@@ -33,6 +33,8 @@
import pathlib

import click
import rich.console
import rich.table
from ruamel.yaml import YAML

yaml = YAML(typ="safe", pure=True)
@@ -58,44 +60,34 @@ def _load_catmap(filepath, cat_index=0, name_index=-1):

def _load_groups(filename):
with filename.expanduser().absolute().open("r") as fin:
contents = yaml.load(fin.read(), Loader=yaml.SafeLoader)
contents = yaml.load(fin.read())

names = {index + 1: group["name"] for index, group in enumerate(contents["groups"])}
return names
info = {}
for index, group in enumerate(contents["groups"]):
for label in group["labels"]:
info[label] = (index, group["name"])

return info

def _show_labels(groups, catmap):
print("{:<39}||{:<39}".format("orig: label → name", "grouping: label → name"))
print("-" * 80)
N_max = max(max(groups), max(catmap))
for i in range(N_max):
orig_str = f"{i} → {catmap[i]}" if i in catmap else "n/a"
new_str = f"{i} → {groups[i]}" if i in groups else "n/a"
print(f"{orig_str:<39}||{new_str:<39}")

def _make_label_table(groups, catmap):
table = rich.table.Table(title="Label Mapping")
table.add_column("Original Label")
table.add_column("Original Name(s)", max_width=30)
table.add_column("New Label")
table.add_column("New Name")

def _get_match_str(name, index, matches):
if index in matches:
match_str = f" - {name}: {index} → {matches[index]}"
return f"{match_str:<39}"
for i in range(max(catmap) + 1):  # include the highest label index
label = f"{i}"
orig_names = ", ".join(catmap.get(i, "--").split(";"))

match_str = f" - {name}: {index} → ?"
match_str = f"{match_str:<39}"
return click.style(match_str, fg="red")
group_info = groups.get(i)
if group_info is None:
table.add_row(label, orig_names, "--", "--")
else:
table.add_row(label, orig_names, str(group_info[0]), group_info[1])


def _show_matches(groups, catmap, group_matches, cat_matches):
print(
"{:<39}||{:<39}".format(
"orig name: label → match", "grouping name: label → match"
)
)
print("-" * 80)
N_max = max(max(groups), max(catmap))
for i in range(N_max):
orig_str = _get_match_str(catmap[i], i, cat_matches) if i in catmap else "n/a"
new_str = _get_match_str(groups[i], i, group_matches) if i in groups else "n/a"
print(f"{orig_str}||{new_str}")
return table


@click.group(name="labelspace")
@@ -104,7 +96,7 @@ def cli():
pass


@click.command()
@cli.command()
@click.argument("labelspace", type=click.Path(exists=True))
@click.argument("grouping", type=click.Path(exists=True))
@click.option("-n", "--name-index", default=-1, type=int, help="index for name column")
@@ -117,20 +109,5 @@ def compare(labelspace, grouping, name_index, label_index):
groups = _load_groups(grouping_config_path)
catmap = _load_catmap(labelspace_path, cat_index=label_index, name_index=name_index)

group_matches = {}
for index, group in groups.items():
for label, name in catmap.items():
if group == name:
group_matches[index] = label
break

cat_matches = {}
for index, group in catmap.items():
for label, name in groups.items():
if group == name:
cat_matches[index] = label
break

_show_labels(groups, catmap)
print("")
_show_matches(groups, catmap, group_matches, cat_matches)
console = rich.console.Console()
console.print(_make_label_table(groups, catmap))
@@ -29,6 +29,7 @@
#
import torch

from semantic_inference.models.feature_visualizers import *
from semantic_inference.models.mask_functions import *
from semantic_inference.models.openset_segmenter import *
from semantic_inference.models.patch_extractor import *