Address

Room 101, Institute of Cyber-Systems and Control, Yuquan Campus, Zhejiang University, Hangzhou, Zhejiang, China

Contact Information

Wenlong Ye

M.S. Student

Institute of Cyber-Systems and Control, Zhejiang University, China

Biography

I am pursuing my M.S. degree in the College of Control Science and Engineering, Zhejiang University, Hangzhou, China. My main research interests include SLAM and robotics.

Research Interests

  • SLAM

Publications

  • Xingxing Zuo, Wenlong Ye, Yulin Yang, Renjie Zheng, Teresa Vidal-Calleja, Guoquan Huang, and Yong Liu. Multimodal localization: Stereo over LiDAR map. Journal of Field Robotics, 37:1003–1026, 2020.
    In this paper, we present a real‐time high‐precision visual localization system for an autonomous vehicle which employs only low‐cost stereo cameras to localize the vehicle with a priori map built using a more expensive 3D LiDAR sensor. To this end, we construct two different visual maps: a sparse feature visual map for visual odometry (VO) based motion tracking, and a semidense visual map for registration with the prior LiDAR map. To register two point clouds sourced from different modalities (i.e., cameras and LiDAR), we leverage probabilistic weighted normal distributions transformation (ProW‐NDT), by particularly taking into account the uncertainty of source point clouds. The registration results are then fused via pose graph optimization to correct the VO drift. Moreover, surfels extracted from the prior LiDAR map are used to refine the sparse 3D visual features that will further improve VO‐based motion estimation. The proposed system has been tested extensively in both simulated and real‐world experiments, showing that robust, high‐precision, real‐time localization can be achieved.
    @article{zuo2020multimodalls,
    title = {Multimodal localization: Stereo over LiDAR map},
    author = {Xingxing Zuo and Wenlong Ye and Yulin Yang and Renjie Zheng and Teresa Vidal-Calleja and Guoquan Huang and Yong Liu},
    year = 2020,
    journal = {Journal of Field Robotics},
    volume = 37,
    pages = {1003--1026},
    doi = {https://doi.org/10.1002/rob.21936},
    abstract = {In this paper, we present a real‐time high‐precision visual localization system for an autonomous vehicle which employs only low‐cost stereo cameras to localize the vehicle with a priori map built using a more expensive 3D LiDAR sensor. To this end, we construct two different visual maps: a sparse feature visual map for visual odometry (VO) based motion tracking, and a semidense visual map for registration with the prior LiDAR map. To register two point clouds sourced from different modalities (i.e., cameras and LiDAR), we leverage probabilistic weighted normal distributions transformation (ProW‐NDT), by particularly taking into account the uncertainty of source point clouds. The registration results are then fused via pose graph optimization to correct the VO drift. Moreover, surfels extracted from the prior LiDAR map are used to refine the sparse 3D visual features that will further improve VO‐based motion estimation. The proposed system has been tested extensively in both simulated and real‐world experiments, showing that robust, high‐precision, real‐time localization can be achieved.}
    }
  • Xiangrui Zhao, Lina Liu, Renjie Zheng, Wenlong Ye, and Yong Liu. A Robust Stereo Feature-aided Semi-direct SLAM System. Robotics and Autonomous Systems, 132:103597, 2020.
    In autonomous driving, many intelligent perception technologies have been put in use. However, visual SLAM still has problems with robustness, which limits its application, although it has been developed for a long time. We propose a feature-aided semi-direct approach to combine the direct and indirect methods in visual SLAM to allow robust localization under various situations, including large-baseline motion, textureless environment, and great illumination changes. In our approach, we first calculate inter-frame pose estimation by feature matching. Then we use the direct alignment and a multi-scale pyramid, which employs the previous coarse estimation as a priori, to obtain a more precise result. To get more accurate photometric parameters, we combine the online photometric calibration method with visual odometry. Furthermore, we replace the Shi–Tomasi corner with the ORB feature, which is more robust to illumination. For extreme brightness change, we employ the dark channel prior to weaken the halation and maintain the consistency of the image. To evaluate our approach, we build a full stereo visual SLAM system. Experiments on the publicly available dataset and our mobile robot dataset indicate that our approach improves the accuracy and robustness of the SLAM system.
    @article{zhao2020ars,
    title = {A Robust Stereo Feature-aided Semi-direct SLAM System},
    author = {Xiangrui Zhao and Lina Liu and Renjie Zheng and Wenlong Ye and Yong Liu},
    year = 2020,
    journal = {Robotics and Autonomous Systems},
    volume = 132,
    pages = 103597,
    doi = {https://doi.org/10.1016/j.robot.2020.103597},
    abstract = {In autonomous driving, many intelligent perception technologies have been put in use. However, visual SLAM still has problems with robustness, which limits its application, although it has been developed for a long time. We propose a feature-aided semi-direct approach to combine the direct and indirect methods in visual SLAM to allow robust localization under various situations, including large-baseline motion, textureless environment, and great illumination changes. In our approach, we first calculate inter-frame pose estimation by feature matching. Then we use the direct alignment and a multi-scale pyramid, which employs the previous coarse estimation as a priori, to obtain a more precise result. To get more accurate photometric parameters, we combine the online photometric calibration method with visual odometry. Furthermore, we replace the Shi–Tomasi corner with the ORB feature, which is more robust to illumination. For extreme brightness change, we employ the dark channel prior to weaken the halation and maintain the consistency of the image. To evaluate our approach, we build a full stereo visual SLAM system. Experiments on the publicly available dataset and our mobile robot dataset indicate that our approach improves the accuracy and robustness of the SLAM system.}
    }
  • Liang Liu, Guangyao Zhai, Wenlong Ye, and Yong Liu. Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity. In 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019.
    Scene flow estimation in the dynamic scene remains a challenging task. Computing scene flow by a combination of 2D optical flow and depth has shown to be considerably faster with acceptable performance. In this work, we present a unified framework for joint unsupervised learning of stereo depth and optical flow with explicit local rigidity to estimate scene flow. We estimate camera motion directly by a Perspective-n-Point method from the optical flow and depth predictions, with RANSAC outlier rejection scheme. In order to disambiguate the object motion and the camera motion in the scene, we distinguish the rigid region by the re-project error and the photometric similarity. By joint learning with the local rigidity, both depth and optical networks can be refined. This framework boosts all four tasks: depth, optical flow, camera motion estimation, and object motion segmentation. Through the evaluation on the KITTI benchmark, we show that the proposed framework achieves state-of-the-art results amongst unsupervised methods. Our models and code are available at https://github.com/lliuz/unrigidflow.
    @inproceedings{liu2019unsupervisedlo,
    title = {Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity},
    author = {Liang Liu and Guangyao Zhai and Wenlong Ye and Yong Liu},
    year = 2019,
    booktitle = {28th International Joint Conference on Artificial Intelligence (IJCAI)},
    doi = {https://doi.org/10.24963/ijcai.2019/123},
    abstract = {Scene flow estimation in the dynamic scene remains a challenging task. Computing scene flow by a combination of 2D optical flow and depth has shown to be considerably faster with acceptable performance. In this work, we present a unified framework for joint unsupervised learning of stereo depth and optical flow with explicit local rigidity to estimate scene flow. We estimate camera motion directly by a Perspective-n-Point method from the optical flow and depth predictions, with RANSAC outlier rejection scheme. In order to disambiguate the object motion and the camera motion in the scene, we distinguish the rigid region by the re-project error and the photometric similarity. By joint learning with the local rigidity, both depth and optical networks can be refined. This framework boosts all four tasks: depth, optical flow, camera motion estimation, and object motion segmentation. Through the evaluation on the KITTI benchmark, we show that the proposed framework achieves state-of-the-art results amongst unsupervised methods. Our models and code are available at https://github.com/lliuz/unrigidflow.}
    }
  • Wenlong Ye, Renjie Zheng, Fangqiang Zhang, Zizhou Ouyang, and Yong Liu. Robust and Efficient Vehicles Motion Estimation with Low-Cost Multi-Camera and Odometer-Gyroscope. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4490–4496, 2019.
    In this paper, we present a robust and efficient estimation approach with multi-camera, odometer and gyroscope. Robust initialization, tightly-coupled optimization estimator and multi-camera loop-closure detection are utilized in the proposed approach. In initialization, the measurements of odometer and gyroscope are used to compute scale, and then estimate the bias of sensors. In estimator, the pre-integration of odometer and gyroscope is derived and combined with the measurements of multi-camera to estimate the motion in a tightly-coupled optimization framework. In loop-closure detection, a connection between different cameras of the vehicle can be built, which significantly improve the success rate of loop-closure detection. The proposed algorithm is validated in multiple real-world datasets collected in different places, time, weather and illumination. Experimental results show that the proposed approach can estimate the motion of vehicles robustly and efficiently.
    @inproceedings{ye2019robustae,
    title = {Robust and Efficient Vehicles Motion Estimation with Low-Cost Multi-Camera and Odometer-Gyroscope},
    author = {Wenlong Ye and Renjie Zheng and Fangqiang Zhang and Zizhou Ouyang and Yong Liu},
    year = 2019,
    booktitle = {2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    pages = {4490--4496},
    doi = {https://doi.org/10.1109/IROS40897.2019.8968048},
    abstract = {In this paper, we present a robust and efficient estimation approach with multi-camera, odometer and gyroscope. Robust initialization, tightly-coupled optimization estimator and multi-camera loop-closure detection are utilized in the proposed approach. In initialization, the measurements of odometer and gyroscope are used to compute scale, and then estimate the bias of sensors. In estimator, the pre-integration of odometer and gyroscope is derived and combined with the measurements of multi-camera to estimate the motion in a tightly-coupled optimization framework. In loop-closure detection, a connection between different cameras of the vehicle can be built, which significantly improve the success rate of loop-closure detection. The proposed algorithm is validated in multiple real-world datasets collected in different places, time, weather and illumination. Experimental results show that the proposed approach can estimate the motion of vehicles robustly and efficiently.}
    }
  • Xiangrui Zhao, Renjie Zheng, Wenlong Ye, Yong Liu, and Mingyang Li. A Robust Stereo Semi-direct SLAM System Based on Hybrid Pyramid. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5376–5382, 2019.
    We propose a hybrid pyramid based approach to fuse the direct and indirect methods in visual SLAM, to allow robust localization under various situations including large-baseline motion, low-texture environment, and various illumination changes. In our approach, we first calculate coarse inter-frame pose estimation by matching the feature points. Subsequently, we use both direct image alignment and a multiscale pyramid method, for refining the previous estimation to attain better precision. Furthermore, we perform online photometric calibration along with pose estimation, to reduce un-modelled errors. To evaluate our approach, we conducted various real-world experiments on both public datasets and self-collected ones, by implementing a full SLAM system with the proposed methods. The results show that our system improves both localization accuracy and robustness by a wide margin.
    @inproceedings{zhao2019ars,
    title = {A Robust Stereo Semi-direct SLAM System Based on Hybrid Pyramid},
    author = {Xiangrui Zhao and Renjie Zheng and Wenlong Ye and Yong Liu and Mingyang Li},
    year = 2019,
    booktitle = {2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
    pages = {5376--5382},
    doi = {https://doi.org/10.1109/IROS40897.2019.8968008},
    abstract = {We propose a hybrid pyramid based approach to fuse the direct and indirect methods in visual SLAM, to allow robust localization under various situations including large-baseline motion, low-texture environment, and various illumination changes. In our approach, we first calculate coarse inter-frame pose estimation by matching the feature points. Subsequently, we use both direct image alignment and a multiscale pyramid method, for refining the previous estimation to attain better precision. Furthermore, we perform online photometric calibration along with pose estimation, to reduce un-modelled errors. To evaluate our approach, we conducted various real-world experiments on both public datasets and self-collected ones, by implementing a full SLAM system with the proposed methods. The results show that our system improves both localization accuracy and robustness by a wide margin.}
    }
  • Xingxing Zuo, Patrick Geneva, Yulin Yang, Wenlong Ye, Yong Liu, and Guoquan Huang. Visual-Inertial Localization With Prior LiDAR Map Constraints. IEEE Robotics and Automation Letters, 4:3394–3401, 2019.
    In this letter, we develop a low-cost stereo visual-inertial localization system, which leverages efficient multi-state constraint Kalman filter (MSCKF)-based visual-inertial odometry (VIO) while utilizing an a priori LiDAR map to provide bounded-error three-dimensional navigation. Besides the standard sparse visual feature measurements used in VIO, the global registrations of visual semi-dense clouds to the prior LiDAR map are also exploited in a tightly-coupled MSCKF update, thus correcting accumulated drift. This cross-modality constraint between visual and LiDAR pointclouds is particularly addressed. The proposed approach is validated on both Monte Carlo simulations and real-world experiments, showing that LiDAR map constraints between clouds created through different sensing modalities greatly improve the standard VIO and provide bounded-error performance.
    @article{zuo2019visualinertiallw,
    title = {Visual-Inertial Localization With Prior LiDAR Map Constraints},
    author = {Xingxing Zuo and Patrick Geneva and Yulin Yang and Wenlong Ye and Yong Liu and Guoquan Huang},
    year = 2019,
    journal = {IEEE Robotics and Automation Letters},
    volume = 4,
    pages = {3394--3401},
    doi = {https://doi.org/10.1109/LRA.2019.2927123},
    abstract = {In this letter, we develop a low-cost stereo visual-inertial localization system, which leverages efficient multi-state constraint Kalman filter (MSCKF)-based visual-inertial odometry (VIO) while utilizing an a priori LiDAR map to provide bounded-error three-dimensional navigation. Besides the standard sparse visual feature measurements used in VIO, the global registrations of visual semi-dense clouds to the prior LiDAR map are also exploited in a tightly-coupled MSCKF update, thus correcting accumulated drift. This cross-modality constraint between visual and LiDAR pointclouds is particularly addressed. The proposed approach is validated on both Monte Carlo simulations and real-world experiments, showing that LiDAR map constraints between clouds created through different sensing modalities greatly improve the standard VIO and provide bounded-error performance.}
    }
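
The first and last publications above both register a semi-dense point cloud reconstructed from stereo images against a prior LiDAR map. As a rough, illustrative stand-in only: the papers use a probabilistic weighted NDT (ProW-NDT), which Open3D does not provide, so the Python sketch below aligns the two clouds with point-to-plane ICP instead; the file paths, voxel sizes, and distance threshold are placeholder assumptions, not values from the papers.

import numpy as np
import open3d as o3d

# Hypothetical inputs: a semi-dense cloud reconstructed from stereo images and a
# prior map built from LiDAR scans (paths are placeholders).
visual_cloud = o3d.io.read_point_cloud("semidense_visual_cloud.pcd")
lidar_map = o3d.io.read_point_cloud("prior_lidar_map.pcd")

# Downsample both clouds and estimate normals on the target so that
# point-to-plane ICP can be applied.
visual_cloud = visual_cloud.voxel_down_sample(voxel_size=0.2)
lidar_map = lidar_map.voxel_down_sample(voxel_size=0.2)
lidar_map.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=1.0, max_nn=30))

# Initial guess, e.g. the current visual-odometry pose estimate (identity here).
init = np.eye(4)

# Point-to-plane ICP as a simple stand-in for ProW-NDT cross-modal registration.
result = o3d.pipelines.registration.registration_icp(
    visual_cloud, lidar_map, max_correspondence_distance=1.0, init=init,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())

print("fitness:", result.fitness)
print("map-frame correction:\n", result.transformation)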
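
The Robotics and Autonomous Systems 2020 system above applies the dark channel prior to suppress halation under extreme brightness changes. The dark channel itself is simple to compute; the sketch below is a minimal, generic implementation (the patch size and the [0, 1] value range are assumptions), not the authors' code.

import cv2
import numpy as np

def dark_channel(image, patch_size=15):
    """Dark channel of an RGB image (per-pixel channel minimum, then a local
    minimum filter), as in He et al.'s dark channel prior.

    image: (H, W, 3) array with values in [0, 1]
    Returns an (H, W) dark-channel map; bright haloed regions have high values.
    """
    # Minimum over the three color channels at every pixel.
    min_channel = np.min(image, axis=2)
    # Minimum over a patch_size x patch_size neighborhood (grayscale erosion).
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch_size, patch_size))
    return cv2.erode(min_channel, kernel)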
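
The IJCAI 2019 scene-flow paper above recovers camera motion from predicted depth and optical flow with a Perspective-n-Point solver and RANSAC outlier rejection. The sketch below illustrates that step using OpenCV's solvePnPRansac; the function name, pinhole back-projection, and the subsampling and threshold values are illustrative assumptions, not the authors' implementation.

import numpy as np
import cv2

def camera_motion_from_depth_and_flow(depth, flow, K, max_points=5000):
    """Estimate inter-frame camera motion with PnP + RANSAC.

    depth: (H, W) depth map of frame t (e.g. predicted by a depth network)
    flow:  (H, W, 2) optical flow from frame t to frame t+1
    K:     (3, 3) camera intrinsic matrix
    Returns rvec, tvec (pose of frame t+1 w.r.t. frame t) and the RANSAC inlier indices.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))

    # Keep pixels with valid (finite, positive) depth.
    valid = np.isfinite(depth) & (depth > 0)
    u, v, z = u[valid], v[valid], depth[valid]

    # Back-project frame-t pixels into 3D points with the pinhole model.
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    pts3d = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1).astype(np.float32)

    # Corresponding 2D points in frame t+1: each pixel shifted by its optical flow.
    pts2d = np.stack([u + flow[..., 0][valid], v + flow[..., 1][valid]], axis=1).astype(np.float32)

    # Subsample for speed; running RANSAC over every pixel is unnecessary.
    idx = np.random.choice(len(pts3d), size=min(len(pts3d), max_points), replace=False)
    pts3d, pts2d = pts3d[idx], pts2d[idx]

    # RANSAC rejects correspondences that belong to independently moving objects.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d, pts2d, K.astype(np.float32), None,
        iterationsCount=100, reprojectionError=1.0, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PnP failed")
    return rvec, tvec, inliers.ravel()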
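
The IROS 2019 multi-camera paper above pre-integrates odometer and gyroscope measurements into relative-motion constraints for a tightly-coupled optimizer. The full formulation handles sensor biases and covariances; the sketch below shows only the underlying planar dead-reckoning idea and is a simplified assumption, not the paper's pre-integration.

import numpy as np

def integrate_odom_gyro(speeds, yaw_rates, dts):
    """Planar dead reckoning from wheel-odometer speed and gyroscope yaw rate.

    speeds:    forward speeds [m/s], one per step
    yaw_rates: gyroscope yaw rates [rad/s], one per step
    dts:       time increments [s], one per step
    Returns an (N+1, 3) array of (x, y, theta) poses.
    """
    x, y, theta = 0.0, 0.0, 0.0
    poses = [(x, y, theta)]
    for v, w, dt in zip(speeds, yaw_rates, dts):
        # Heading comes from the gyroscope, translation from the odometer.
        theta += w * dt
        x += v * dt * np.cos(theta)
        y += v * dt * np.sin(theta)
        poses.append((x, y, theta))
    return np.array(poses)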