HalfVulpes/IGEV-plusplus: My IGEV-plusplus

My IGEV-plusplus

Python 99.3%
Dockerfile 0.7%

Find a file

Rex 4ee63d8273 Minor error fix		2024-10-18 18:11:00 -04:00
checkpoints	update model	2024-10-03 18:01:19 -04:00
core	modified	2024-10-01 02:40:31 -04:00
.gitignore	update gitignore	2024-09-23 00:29:59 -04:00
Dockerfile	Add dockerfile	2024-09-23 17:58:16 -04:00
evaluate_stereo.py	update training	2024-09-28 16:34:10 -04:00
extract_disparity.py	update model	2024-10-01 22:30:26 -04:00
LICENSE	Initial commit	2024-09-22 22:57:03 -04:00
params.yaml	Add ros noetic wrapper	2024-10-14 22:41:15 -04:00
README.md	update gitignore	2024-09-23 00:29:59 -04:00
ros_noetic_wrapper.py	Minor error fix	2024-10-18 18:11:00 -04:00
to_pointcloud.py	Support translate point cloud with specified indexes	2024-10-01 15:49:35 -04:00
train_stereo.py	remove random seed	2024-10-01 01:42:23 -04:00

README.md

My IGEV-plusplus

Credit goes to:

IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching

Gangwei Xu, Xianqi Wang, Zhaoxing Zhang, Junda Cheng, Chunyuan Liao, Xin Yang

Citation

If you find our works useful in your research, please consider citing our papers:


@article{xu2024igev++,
  title={IGEV++: Iterative Multi-range Geometry Encoding Volumes for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Zhang, Zhaoxing and Cheng, Junda and Liao, Chunyuan and Yang, Xin},
  journal={arXiv preprint arXiv:2409.00638},
  year={2024}
}

@inproceedings{xu2023iterative,
  title={Iterative Geometry Encoding Volume for Stereo Matching},
  author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={21919--21928},
  year={2023}
}

Environment

python 3.8
torch 1.12.1+cu113

Dependencies

pip install jinja2
pip install jupyter-client pyzmq
pip install notebook jupyterlab
pip install pyyaml
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install tqdm
pip install scipy
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib 
pip install timm==0.5.4

Usage

extract_disparity.py:

Command-Line Arguments

The script accepts the following command-line arguments:

General Arguments

--restore_ckpt: Path to the model checkpoint file.
- Default: checkpoints/sceneflow.pth
--save_numpy: If set, the script will save the disparity maps as .npy NumPy array files.
- Default: Not set (disabled)
-l, --left_imgs: Path to the directory containing left images.
- Default: left/
-r, --right_imgs: Path to the directory containing right images.
- Default: right/
--output_directory: Directory where output files will be saved.
- Default: output
--mixed_precision: Use mixed precision for faster inference.
- Default: True
--valid_iters: Number of iterations during the forward pass.
- Default: 16

Model Architecture Arguments

--hidden_dims: Hidden state and context dimensions.
- Default: [128, 128, 128]
--corr_levels: Number of levels in the correlation pyramid.
- Default: 2
--corr_radius: Width of the correlation pyramid.
- Default: 4
--n_downsample: Resolution of the disparity field (1/2^K).
- Default: 2
--n_gru_layers: Number of hidden GRU levels.
- Default: 3
--max_disp: Maximum disparity range.
- Default: 768
--s_disp_range: Max disparity of small disparity-range geometry encoding volume.
- Default: 48
--m_disp_range: Max disparity of medium disparity-range geometry encoding volume.
- Default: 96
--l_disp_range: Max disparity of large disparity-range geometry encoding volume.
- Default: 192
--s_disp_interval: Disparity interval of small disparity-range geometry encoding volume.
- Default: 1
--m_disp_interval: Disparity interval of medium disparity-range geometry encoding volume.
- Default: 2
--l_disp_interval: Disparity interval of large disparity-range geometry encoding volume.
- Default: 4

Output Files

For each image pair, the script generates the following files in the output/ directory:

Disparity Map as PNG
- Filename: <imagename>.png
- Description: A colorized disparity map saved as a PNG image using the 'jet' colormap.
Disparity Map as DSPM
- Filename: <imagename>.dspm
- Description: The disparity map saved in ASCII format, which can be used for further processing.
Disparity Map as NumPy Array (Optional)
- Filename: <imagename>.npy
- Description: If --save_numpy is specified, the disparity map is saved as a NumPy array for direct use in Python.

Examples

Example 1: Basic Usage with Default Settings

python extract_disparity.py --restore_ckpt checkpoints/sceneflow.pth

Processes image pairs from left/ and right/ directories.
Saves outputs in output/.
Does not save NumPy arrays.

Example 2: Saving NumPy Arrays

python extract_disparity.py --restore_ckpt checkpoints/sceneflow.pth --save_numpy

Saves disparity maps as both PNG images and NumPy arrays.

Example 3: Custom Image Directories and Output Directory

python extract_disparity.py \
  --left_imgs /path/to/your/left_images/ \
  --right_imgs /path/to/your/right_images/ \
  --output_directory /path/to/save/outputs/ \
  --restore_ckpt checkpoints/sceneflow.pth

Processes images from specified directories.
Saves outputs in the specified output directory.

to_pointcloud.py

Command-Line Arguments

The script accepts several command-line arguments to customize its behavior:

--in, --input_folder: Input folder containing disparity maps (default: output).
--out, --output_folder: Output folder to save point clouds (default: output_pointcloud).
--fx: Focal length in the x-direction (default: 425.99684953503163).
--fy: Focal length in the y-direction (default: 426.0108446650122).
--cx: Optical center x-coordinate (default: 426.5960073761994).
--cy: Optical center y-coordinate (default: 240.4590369784203).
--baseline: Baseline between the stereo cameras in meters (default: 0.05000244116935238).
--extrinsic_yaml_file: Path to the YAML file containing the extrinsic matrices.
--camera_id: Camera ID to use from the YAML file (default: body_T_cam0).
--denoise: A flag to enable denoising.
--nb_neighbors: Number of neighbors to analyze for each point.
--std_ratio: The threshold based on the standard deviation of distances.

Note: All parameters have default values. You can override them by specifying the corresponding arguments.

Usage Examples

Basic Usage

To process disparity maps using default parameters:

python to_pointcloud.py

This command:

Processes all .dspm files in the output directory.
Saves the generated point clouds to the output_pointcloud directory.
Uses the default intrinsic and extrinsic parameters specified in the script.

Specifying Intrinsic Parameters

If you want to specify custom intrinsic parameters:

python to_pointcloud.py --fx 500 --fy 500 --cx 320 --cy 240 --baseline 0.05

This command sets:

Focal lengths fx and fy to 500.
Optical centers cx and cy to 320 and 240, respectively.
Baseline to 0.05 meters.

Using Extrinsic Matrices from a YAML File

To use extrinsic matrices defined in a YAML configuration file:

Prepare your YAML file (e.g., config.yaml) with the extrinsic matrices in the OpenCV matrix format:

body_T_cam0: !!opencv-matrix
  rows: 4
  cols: 4
  dt: d
  data: [0.00137718, -0.02366516, 0.99971899, 0.13932612,
         -0.99998467, 0.00532906, 0.00150369, 0.015785,
         -0.00536315, -0.99970574, -0.02365746, 0.00489279,
         0., 0., 0., 1.]

Run the script with the YAML file and specify the camera ID:

python to_pointcloud.py --extrinsic_yaml_file config.yaml --camera_id body_T_cam0

This command:

Loads the extrinsic matrix body_T_cam0 from config.yaml.
Processes disparity maps using this extrinsic matrix.

Processing Multiple Files

The script automatically processes all .dspm files in the specified input directory.

Input Folder: Use --in to specify the input directory.
Output Folder: Use --out to specify the output directory.

Example:

python to_pointcloud.py --in disparity_maps --out pointclouds

This command:

Processes all .dspm files in the disparity_maps directory.
Saves the point clouds to the pointclouds directory.

Input and Output Formats

Input: Disparity Map Files (.dspm)

The disparity map files should be in ASCII format.
Each file contains rows of disparity values separated by spaces.

Example content of a .dspm file:

0.0 0.0 0.0 1.5 1.6 0.0
0.0 2.1 2.2 0.0 0.0 0.0

Output: Point Cloud Files (`.ply`)

The point clouds are saved in a custom ASCII PLY format.
Each file starts with a PLY header followed by the point data.

Example content of a .ply file:

ply
format ascii 1.0
element vertex 12345
property float x
property float y
property float z
end_header
x0 y0 z0
x1 y1 z1
...

Complete Example

Suppose you have the following setup:

Disparity maps are stored in ./data/disparity_maps.
You want to save point clouds to ./data/pointclouds.
You have intrinsic parameters different from the defaults.
You have a YAML file camera_config.yaml with extrinsic matrices.

Run the script as follows:

python to_pointcloud.py \
  --in ./data/disparity_maps \
  --out ./data/pointclouds \
  --fx 600.0 \
  --fy 600.0 \
  --cx 400.0 \
  --cy 300.0 \
  --baseline 0.06 \
  --extrinsic_yaml_file camera_config.yaml \
  --camera_id body_T_cam1

This command:

Processes all .dspm files in ./data/disparity_maps.
Saves point clouds to ./data/pointclouds.
Uses the specified intrinsic parameters.
Loads the extrinsic matrix body_T_cam1 from camera_config.yaml.

Additional Notes

Image Formats: The script supports image files that can be opened by the Pillow library (e.g., .png, .jpg).
Grayscale Images: If the images are grayscale, the script converts them to RGB by replicating the single channel.
Image Size: Images are padded internally to be divisible by 32 for model compatibility.

README.md

My IGEV-plusplus

Citation

Environment

Dependencies

Usage

extract_disparity.py:

Command-Line Arguments

General Arguments

Model Architecture Arguments

Output Files

Examples

Example 1: Basic Usage with Default Settings

Example 2: Saving NumPy Arrays

Example 3: Custom Image Directories and Output Directory

to_pointcloud.py

Command-Line Arguments

Usage Examples

Basic Usage

Specifying Intrinsic Parameters

Using Extrinsic Matrices from a YAML File

Run the script with the YAML file and specify the camera ID:

Processing Multiple Files

Input and Output Formats

Output: Point Cloud Files (.ply)

Complete Example

Additional Notes

Output: Point Cloud Files (`.ply`)