Professional Summary

I am currently a Ph.D. candidate at the Ai4City-Lab, Urban Governance and Design Thrust, Society Hub, The Hong Kong University of Science and Technology (Guangzhou), under the supervision of Prof. Wufan Zhao and Prof. Yuan Liu. Prior to this, I obtained my Master’s degree from the School of Geospatial Engineering and Science, Sun Yat-sen University, where I was advised by Prof. Wuming Zhang and Prof. Yiping Chen.

My research focuses on 3D visual perception, intelligent interpretation and processing of point cloud data, and multimodal urban foundation models. I am particularly interested in bridging geometric understanding with semantic reasoning in large-scale urban environments, with an emphasis on open-vocabulary learning, training-free paradigms, and cross-modal fusion between 2D and 3D data.

My goal is to develop scalable, interpretable, and generalizable AI systems for urban analysis, enabling applications such as digital twin construction, urban scene understanding, and intelligent infrastructure management.

Education

Ph.D. in Urban Governance and Design (UGOD)

2024-09 to present

The Hong Kong University of Science and Technology (Guangzhou)

M.S. in Remote Sensing and Geoinformation Engineering

2021-09 to 2024-06

Sun Yat-sen University

B.E. in Surveying and Mapping Engineering

2017-09 to 2021-06

Shandong University of Science and Technology

Interests

3D Vision and Geometric Representation

City-Scale 3D Scene Modeling

Point Cloud Learning and Feature Encoding

3D Scene Parsing and Structural Analysis

Open-Vocabulary 3D Segmentation

Open-World Semantic Inference in 3D

Multimodal Urban Data Fusion

Urban Foundation Models and Spatial AI

Training-Free and Data-Efficient Learning Paradigms

Geometry-Guided Spatial Reasoning

Forestry Point Cloud Analysis and Tree Modeling

📚 My Research

My research explores how cities can be understood through multimodal intelligence. I focus on integrating heterogeneous data—particularly 3D point clouds, imagery, and text—to build unified representations of complex urban environments. My current research interests include:

3D Urban Perception: Combining geometric cues with visual and linguistic data for scalable, city-level scene understanding.

Urban Foundation Models: Leveraging vision-language models to develop training-efficient, open-vocabulary paradigms for spatial reasoning.

Intelligent City Systems: Moving beyond isolated tasks to build unified frameworks for large-scale infrastructure analysis.

Feel free to reach out for collaboration! 😃

Featured Publications

Point and voxel cross perception with lightweight cosformer for large-scale point cloud semantic segmentation

This paper proposes PVCFormer, a cross-attention architecture combining point and voxel representations with CosFormer for efficient large-scale outdoor point cloud semantic …

Dr. Shuai Zhang

Chat3D: Interactive understanding 3D scene-level point clouds by chatting with foundation model for urban ecological construction

This paper presents Chat3D, a method for applying large language models to urban ecological construction by combining the results of 3D point cloud semantic segmentation. The …

Dr. Shuai Zhang
Recent Publications