Yuanyuan Ding
MS Student
Institute of Cyber-Systems and Control, Zhejiang University, China
Biography
I am pursuing my M.S. degree in the College of Control Science and Engineering, Zhejiang University, after receiving my B.S. degree in Automation from Shandong University in 2022. My major research interests include Neural Radiance Fields and 3D Reconstruction.
Research and Interests
- Computer vision
- Neural Radiance Fields (NeRF)
Publications
- Rongyao Cai, Kexin Zhang, Hanchen Tai, Yang Zhou, Yuanyuan Ding, Chunlin Zhou, and Yong Liu. Adversarial Multimodal Contrastive Learning for Robust Industrial Fault Diagnosis. IEEE Transactions on Instrumentation and Measurement, 74:3559412, 2025.
Fault diagnosis (FD) techniques leveraging self-supervised contrastive learning (SSCL) have demonstrated significant potential in industrial scenarios due to their reduced dependence on manually annotated data. However, the existing SSCL algorithms primarily focus on establishing complex similarity relationships among unimodal augmented views. These unimodal SSCL approaches are particularly vulnerable to learning shallow, domain-dependent spurious features in the training data rather than more intrinsic and essential features. Consequently, such spurious features may cause the algorithm failure when encountering distribution shift issues resulting from environmental perturbations or changes in working conditions. To address this challenge, we propose adversarial multimodal contrastive learning (AMMCL), a novel approach designed to extract robust and generalizable multimodal representations from time series and their corresponding spectrograms. AMMCL utilizes intermodal contrastive learning and adversarial training strategy to align modal-invariant features from both elementwise and setwise perspectives. These essential features are beneficial for intradomain and cross-domain FD tasks. Furthermore, a slice segmentation processing (SSP) method based on dominant frequency is employed to enhance model’s ability to recognize varying patterns within time series. AMMCL is first evaluated on intradomain and cross-domain FD tasks using the Gearbox and XJTU-SY datasets, where it outperforms nine existing FD algorithms in terms of performance. Additionally, AMMCL is compared with ten other valve stiction detection algorithms on International Stiction Database (ISDB) dataset, successfully identifying the most loop states (23 out of 26). Finally, the trained AMMCL model on the ISDB dataset is implemented in actual industrial valve detection, demonstrating the feasibility and practicality of AMMCL in real industrial scenarios.
@article{cai2025amc, title = {Adversarial Multimodal Contrastive Learning for Robust Industrial Fault Diagnosis}, author = {Rongyao Cai and Kexin Zhang and Hanchen Tai and Yang Zhou and Yuanyuan Ding and Chunlin Zhou and Yong Liu}, year = 2025, journal = {IEEE Transactions on Instrumentation and Measurement}, volume = 74, pages = {3559412}, doi = {10.1109/TIM.2025.3608323}, abstract = {Fault diagnosis (FD) techniques leveraging self-supervised contrastive learning (SSCL) have demonstrated significant potential in industrial scenarios due to their reduced dependence on manually annotated data. However, the existing SSCL algorithms primarily focus on establishing complex similarity relationships among unimodal augmented views. These unimodal SSCL approaches are particularly vulnerable to learning shallow, domain-dependent spurious features in the training data rather than more intrinsic and essential features. Consequently, such spurious features may cause the algorithm failure when encountering distribution shift issues resulting from environmental perturbations or changes in working conditions. To address this challenge, we propose adversarial multimodal contrastive learning (AMMCL), a novel approach designed to extract robust and generalizable multimodal representations from time series and their corresponding spectrograms. AMMCL utilizes intermodal contrastive learning and adversarial training strategy to align modal-invariant features from both elementwise and setwise perspectives. These essential features are beneficial for intradomain and cross-domain FD tasks. Furthermore, a slice segmentation processing (SSP) method based on dominant frequency is employed to enhance model’s ability to recognize varying patterns within time series. AMMCL is first evaluated on intradomain and cross-domain FD tasks using the Gearbox and XJTU-SY datasets, where it outperforms nine existing FD algorithms in terms of performance. 
Additionally, AMMCL is compared with ten other valve stiction detection algorithms on International Stiction Database (ISDB) dataset, successfully identifying the most loop states (23 out of 26). Finally, the trained AMMCL model on the ISDB dataset is implemented in actual industrial valve detection, demonstrating the feasibility and practicality of AMMCL in real industrial scenarios.} }
- Yuanyuan Ding, Yiming Fei, Jiandang Yang, Xiaobin Wei, Jiajun Lv, and Yong Liu. OARecon: Object-Aware Viewpoint Augmentation for Indoor Compositional Reconstruction. In 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025.
Real-world scenes likely involve repetitive objects indicating that the reconstruction of the target object can be supplemented by the views of other identical objects. However, traditional 3D reconstruction methods do not take this a priori knowledge into account and fail to make full use of the available information. In this paper, we propose an object-aware viewpoint augmentation scheme for indoor compositional reconstruction. Within this scheme, a viewpoint supplementation strategy based on signed distance function and neural radiance fields is proposed to fully leverage the information from repetitive objects such that the occlusion problem is suppressed. Moreover, this scheme introduces monocular uncertainty priors and regional smoothness constraints to enhance the reconstruction accuracy of slender and thin structures and the smoothness of occluded background, respectively. Experimental results considering both synthetic and real-world scenes demonstrate that our method effectively improves the reconstruction quality of repetitive objects and background.
@inproceedings{ding2025oar, title = {OARecon: Object-Aware Viewpoint Augmentation for Indoor Compositional Reconstruction}, author = {Yuanyuan Ding and Yiming Fei and Jiandang Yang and Xiaobin Wei and Jiajun Lv and Yong Liu}, year = 2025, booktitle = {2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, doi = {10.1109/ICASSP49660.2025.10888042}, abstract = {Real-world scenes likely involve repetitive objects indicating that the reconstruction of the target object can be supplemented by the views of other identical objects. However, traditional 3D reconstruction methods do not take this a priori knowledge into account and fail to make full use of the available information. In this paper, we propose an object-aware viewpoint augmentation scheme for indoor compositional reconstruction. Within this scheme, a viewpoint supplementation strategy based on signed distance function and neural radiance fields is proposed to fully leverage the information from repetitive objects such that the occlusion problem is suppressed. Moreover, this scheme introduces monocular uncertainty priors and regional smoothness constraints to enhance the reconstruction accuracy of slender and thin structures and the smoothness of occluded background, respectively. Experimental results considering both synthetic and real-world scenes demonstrate that our method effectively improves the reconstruction quality of repetitive objects and background.} }
- Zizhang Li, Xiaoyang Lyu, Yuanyuan Ding, Mengmeng Wang, Yiyi Liao, and Yong Liu. RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction. In 19th IEEE/CVF International Conference on Computer Vision (ICCV), pages 17715-17725, 2023.
Recently, neural implicit surfaces have become popular for multi-view reconstruction. To facilitate practical applications like scene editing and manipulation, some works extend the framework with semantic masks input for the object-compositional reconstruction rather than the holistic perspective. Though achieving plausible disentanglement, the performance drops significantly when processing the indoor scenes where objects are usually partially observed. We propose RICO to address this by regularizing the unobservable regions for indoor compositional reconstruction. Our key idea is to first regularize the smoothness of the occluded background, which then in turn guides the foreground object reconstruction in unobservable regions based on the object-background relationship. Particularly, we regularize the geometry smoothness of occluded background patches. With the improved background surface, the signed distance function and the reversedly rendered depth of objects can be optimized to bound them within the background range. Extensive experiments show our method outperforms other methods on synthetic and real-world indoor scenes and prove the effectiveness of proposed regularizations. The code is available at https://github.com/kyleleey/RICO
@inproceedings{li2023rico, title = {RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction}, author = {Zizhang Li and Xiaoyang Lyu and Yuanyuan Ding and Mengmeng Wang and Yiyi Liao and Yong Liu}, year = 2023, booktitle = {19th IEEE/CVF International Conference on Computer Vision (ICCV)}, pages = {17715-17725}, doi = {10.1109/ICCV51070.2023.01628}, abstract = {Recently, neural implicit surfaces have become popular for multi-view reconstruction. To facilitate practical applications like scene editing and manipulation, some works extend the framework with semantic masks input for the object-compositional reconstruction rather than the holistic perspective. Though achieving plausible disentanglement, the performance drops significantly when processing the indoor scenes where objects are usually partially observed. We propose RICO to address this by regularizing the unobservable regions for indoor compositional reconstruction. Our key idea is to first regularize the smoothness of the occluded background, which then in turn guides the foreground object reconstruction in unobservable regions based on the object-background relationship. Particularly, we regularize the geometry smoothness of occluded background patches. With the improved background surface, the signed distance function and the reversedly rendered depth of objects can be optimized to bound them within the background range. Extensive experiments show our method outperforms other methods on synthetic and real-world indoor scenes and prove the effectiveness of proposed regularizations. The code is available at https://github.com/kyleleey/RICO} }
