Xiaoshui Huang

Research fellow
Shanghai Artificial Intelligence Laboratory

Contact:
Email: huangxiaoshui@pjlab.org.cn

Publications

Please see my Google Scholar profile for recent publications.

2019

  • Fast Registration for cross-source point clouds by using weak regional affinity and pixel-wise refinement.
  • Xiaoshui Huang, Lixin Fan, Qiang Wu, Jian Zhang, Chun Yuan
    IEEE International Conference on Multimedia and Expo (ICME), 2019.
    PDF

    Many types of 3D acquisition sensors have emerged in recent years, and point clouds are widely used in many areas. Accurate and fast registration of cross-source 3D point clouds from different sensors is an emerging research problem in computer vision. The problem is extremely challenging because cross-source point clouds contain a mixture of variations, such as density differences, partial overlap, large noise and outliers, and viewpoint changes. In this paper, an algorithm is proposed to align cross-source point clouds with both high accuracy and high efficiency. There are two main contributions: first, two components, weak region affinity and pixel-wise refinement, are proposed to maintain the global and local information of 3D point clouds; these two components are then integrated into an iterative tensor-based registration algorithm to solve the cross-source point cloud registration problem. We conduct experiments on a synthetic cross-source benchmark dataset and real cross-source datasets. Compared with six state-of-the-art methods, the proposed method obtains both higher efficiency and higher accuracy.
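
    The sketch below is a generic, hypothetical illustration of the kind of iterative weighted rigid registration loop described above; the Gaussian correspondence weights merely stand in for the paper's weak region affinity and pixel-wise refinement terms, and all function names are my own.

        # Hypothetical sketch: iterative rigid registration with per-correspondence
        # weights standing in for the paper's affinity/refinement terms.
        import numpy as np
        from scipy.spatial import cKDTree

        def weighted_rigid_fit(src, dst, w):
            """Closed-form (R, t) minimizing sum_i w_i ||R src_i + t - dst_i||^2."""
            w = w / w.sum()
            mu_s, mu_d = (w[:, None] * src).sum(0), (w[:, None] * dst).sum(0)
            H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))
            U, _, Vt = np.linalg.svd(H)
            if np.linalg.det(Vt.T @ U.T) < 0:   # avoid reflections
                Vt[-1] *= -1
            R = Vt.T @ U.T
            return R, mu_d - R @ mu_s

        def register(src, dst, iters=30, sigma=0.5):
            tree, R, t = cKDTree(dst), np.eye(3), np.zeros(3)
            for _ in range(iters):
                d, idx = tree.query(src @ R.T + t)     # nearest-neighbour correspondences
                w = np.exp(-(d / sigma) ** 2)          # soft weights (stand-in for region affinity)
                R, t = weighted_rigid_fit(src, dst[idx], w)
            return R, t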

  • KPSNET: Keypoint detection and feature extraction for point cloud registration.
  • Anan Du, Xiaoshui Huang, Jian Zhang, Lingxiang Yao, Qiang Wu.
    IEEE International Conference on Image Processing (ICIP), 2019.
    PDF      Code (coming soon)

    This paper presents KPSNet, a KeyPoint Siamese Network that simultaneously learns a task-desirable keypoint detector and feature extractor. The keypoint detector is optimized to predict a score vector that signifies the probability of each candidate being a keypoint. The feature extractor is optimized to learn robust keypoint features by exploiting the correspondences between the keypoints generated from the two inputs. For training, KPSNet does not require manually annotated keypoint and local patch pairs. Instead, we design an alignment module to establish the correspondence between the two inputs and generate positive and negative samples on the fly. Therefore, our method can be easily extended to new scenes. We test the proposed method on an open-source benchmark, and the experiments show the validity of our method.
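
    As a rough illustration only, the following PyTorch snippet sketches a shared ("Siamese") per-point network with a keypoint-score head and a descriptor head, trained with a simple contrastive loss on correspondences; the layer sizes, loss form and names are my assumptions, not the paper's architecture.

        # Hypothetical sketch of a shared per-point scorer/descriptor network.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class PointScorerDescriptor(nn.Module):
            def __init__(self, feat_dim=32):
                super().__init__()
                self.backbone = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                              nn.Linear(64, 128), nn.ReLU())
                self.score_head = nn.Linear(128, 1)        # keypoint probability per point
                self.desc_head = nn.Linear(128, feat_dim)  # descriptor per point

            def forward(self, xyz):                        # xyz: (N, 3)
                h = self.backbone(xyz)
                score = torch.sigmoid(self.score_head(h)).squeeze(-1)
                desc = F.normalize(self.desc_head(h), dim=-1)
                return score, desc

        def contrastive_loss(desc_a, desc_b, pos_pairs, margin=0.5):
            """pos_pairs: (M, 2) long tensor of corresponding indices from an alignment step."""
            da, db = desc_a[pos_pairs[:, 0]], desc_b[pos_pairs[:, 1]]
            pos = (da - db).pow(2).sum(-1)                               # pull matches together
            neg = (da - db[torch.randperm(db.size(0))]).pow(2).sum(-1)   # push shuffled pairs apart
            return (pos + F.relu(margin - neg)).mean()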

2018

  • Attention-based Transactional Context Embedding for Next-Item Recommendation.
  • Shoujin Wang, Liang Hu, Longbing Cao, Xiaoshui Huang, Defu Lian, Wei Liu
    The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018.
    PDF

    Recommending the next item to a user in a transactional context is practical yet challenging in applications such as marketing campaigns. Transactional context refers to the items that are observable in a transaction. Most existing transaction-based recommender systems (TBRSs) make recommendations by mainly considering recently occurring items rather than all the items observed in the current context. Moreover, they often assume a rigid order between items within a transaction, which is not always practical. More importantly, a long transaction often contains many items irrelevant to the next choice, which tend to overwhelm the influence of a few truly relevant ones. Therefore, we posit that a good TBRS should not only consider all the observed items in the current transaction but also weight them with different relevance to build an attentive context that outputs the proper next item with a high probability. To this end, we design an effective attention-based transaction embedding model (ATEM) for context embedding that weights each observed item in a transaction without assuming an order. The empirical study on real-world transaction datasets proves that ATEM significantly outperforms the state-of-the-art methods in terms of both accuracy and novelty.
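
    A minimal PyTorch sketch of the core idea, order-free attention over the observed items followed by a softmax over candidate next items, is given below; the dimensions, the single linear attention layer and the absence of padding/masking are simplifying assumptions rather than the ATEM model itself.

        # Hypothetical sketch: attention-weighted, order-free transaction context embedding.
        import torch
        import torch.nn as nn

        class AttentiveContext(nn.Module):
            def __init__(self, n_items, dim=64):
                super().__init__()
                self.emb = nn.Embedding(n_items, dim)
                self.attn = nn.Linear(dim, 1)        # relevance score per observed item
                self.out = nn.Linear(dim, n_items)   # scores over candidate next items

            def forward(self, context_items):        # (B, T) item ids; no order assumed
                e = self.emb(context_items)                             # (B, T, dim)
                a = torch.softmax(self.attn(e).squeeze(-1), dim=-1)     # (B, T) attention weights
                ctx = (a.unsqueeze(-1) * e).sum(1)                      # weighted sum -> context embedding
                return torch.log_softmax(self.out(ctx), dim=-1)         # next-item log-probabilities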

2017

  • A coarse-to-fine algorithm for matching and registration in 3D cross-sourced point clouds.
  • Xiaoshui Huang, Lixin Fan, Qiang Wu, Jian Zhang, Chun Yuan.
    IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2017.
    PDF

    We propose an efficient method to deal with the matching and registration problem found in cross-source point clouds captured by different types of sensors. This task is especially challenging due to the presence of density variation, scale difference, a large proportion of noise and outliers, missing data, and viewpoint variation. The proposed method has two stages: in the coarse matching stage, we use the ensemble of shape functions descriptor to select K potential regions from the candidate point clouds for the target. In the fine stage, we propose a scale-embedded generative Gaussian mixture model registration method to refine the results from the coarse matching stage. Following the fine stage, both the best region and accurate camera pose relationships between the candidates and the target are found. We conduct experiments in which we apply the method to two applications: one is 3D object detection and localization in street-view outdoor (LiDAR/VSFM) cross-source point clouds and the other is 3D scene matching and registration in indoor (KinectFusion/VSFM) cross-source point clouds. The experimental results show that the proposed method performs well when compared with the existing methods. They also show that the proposed method is robust under various sensing techniques, such as LiDAR, Kinect, and RGB cameras.
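
    To make the coarse stage concrete, here is a hypothetical sketch that ranks candidate regions by descriptor similarity to the target; the pairwise-distance histogram is only a crude stand-in for the ensemble-of-shape-functions (ESF) descriptor, and the fine GMM stage is omitted entirely.

        # Hypothetical sketch of the coarse stage: rank candidate regions by a simple shape descriptor.
        import numpy as np

        def shape_histogram(points, bins=32):
            """Histogram of pairwise point distances, normalized to be roughly scale-comparable."""
            idx = np.random.choice(len(points), size=min(len(points), 500), replace=False)
            p = points[idx]
            d = np.linalg.norm(p[:, None] - p[None, :], axis=-1)
            d = d[np.triu_indices(len(p), k=1)]
            h, _ = np.histogram(d / (d.max() + 1e-9), bins=bins, range=(0, 1))
            return h / h.sum()

        def top_k_regions(target, candidate_regions, k=3):
            """Return indices of the k candidate regions whose descriptor best matches the target."""
            t = shape_histogram(target)
            dists = [np.abs(shape_histogram(r) - t).sum() for r in candidate_regions]
            return np.argsort(dists)[:k]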

  • A Systematic Approach for Cross-source Point Cloud Registration by Preserving Macro and Micro Structures.
  • Xiaoshui Huang, Jian Zhang, Lixin Fan, Qiang Wu, Chun Yuan.
    IEEE Transactions on Image Processing (T-IP), 2017.
    PDF

    We propose a systematic approach for registering cross-source point clouds. The compelling need for cross-source point cloud registration is motivated by the rapid development of a variety of 3D sensing techniques, yet many existing registration methods face critical challenges as a result of the large variations in cross-source point clouds. This paper therefore presents a novel registration method that successfully aligns two cross-source point clouds in the presence of significant missing data, large variations in point density, scale difference, and so on. The robustness of the method is attributed to the extraction of macro and micro structures. Our work has three main contributions: (1) a systematic pipeline to deal with cross-source point cloud registration; (2) a graph construction method to maintain macro and micro structures; (3) a new graph matching method that considers the global geometric constraint to robustly register these variable graphs. The experiments show that, compared to most related methods, the proposed method successfully registers cross-source datasets, while other methods have difficulty achieving satisfactory results. The proposed method also performs well on same-source datasets.
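
    The snippet below is a hypothetical illustration of building a graph with coarse ("macro") structure from cluster centroids and a simple local ("micro") attribute per node; KMeans clustering and k-NN edges are my illustrative choices, not the paper's construction.

        # Hypothetical sketch: graph nodes from cluster centroids, plus a per-node local attribute.
        import numpy as np
        from scipy.spatial import cKDTree
        from sklearn.cluster import KMeans

        def build_structure_graph(points, n_nodes=50, knn=5):
            # assumes len(points) >= n_nodes
            km = KMeans(n_clusters=n_nodes, n_init=4).fit(points)
            centroids = km.cluster_centers_                      # macro structure: node positions
            # micro structure: mean distance of each cluster's points to its centroid
            spread = np.array([
                np.linalg.norm(points[km.labels_ == i] - centroids[i], axis=1).mean()
                for i in range(n_nodes)
            ])
            _, nbrs = cKDTree(centroids).query(centroids, k=knn + 1)
            edges = [(i, j) for i in range(n_nodes) for j in nbrs[i, 1:]]   # k-NN edges between nodes
            return centroids, spread, edges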

2016

  • A coarse-to-fine algorithm for registration in 3D street-view cross-source point clouds.
  • Xiaoshui Huang, Jian Zhang, Qiang Wu, Lixin Fan, Chun Yuan.
    International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2016.
    PDF

    With the development of numerous 3D sensing technologies, object registration on cross-source point clouds has attracted researchers' interest. When the point clouds are captured by different kinds of sensors, they exhibit large and diverse variations. In this study, we address an even more challenging case in which the cross-source point clouds are acquired from a real street view: one is produced directly by a LiDAR system and the other is generated by running VSFM software on an image sequence captured by RGB cameras. When confronted with large-scale point clouds, previous methods mostly focus on point-to-point registration and have many limitations, because the least-mean-error strategy copes poorly with the large variations in cross-source point clouds. In this paper, unlike previous ICP-based methods, we take a statistical view and propose an effective coarse-to-fine algorithm to detect and register a small-scale SFM point cloud within a large-scale LiDAR point cloud. The experimental results show that the model runs successfully on LiDAR and SFM point clouds, and hence it can contribute to many applications, such as robotics and smart city development.
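
    As a small illustrative pre-processing step only (not part of the paper's algorithm), one might roughly normalize the unknown SFM scale before registration; mean nearest-neighbour spacing is a crude proxy here, since SFM and LiDAR densities differ considerably in practice.

        # Hypothetical sketch: crude scale normalization between an SFM and a LiDAR point cloud.
        import numpy as np
        from scipy.spatial import cKDTree

        def mean_nn_spacing(points):
            d, _ = cKDTree(points).query(points, k=2)   # k=2: first neighbour is the point itself
            return d[:, 1].mean()

        def normalize_scale(sfm_points, lidar_points):
            """Rescale the SFM cloud so its point spacing roughly matches the LiDAR cloud."""
            s = mean_nn_spacing(lidar_points) / mean_nn_spacing(sfm_points)
            return sfm_points * s, s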

  • Real Time Complete Dense Depth Reconstruction for a Monocular Camera.
  • Xiaoshui Huang, Lixin Fan, Jian Zhang, Qiang Wu, and Chun Yuan.
    IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016.
    PDF

    In this paper, we aim to solve the problem of estimating complete dense depth maps from a monocular moving camera. By 'complete', we mean that depth information is estimated for every pixel and detailed reconstruction is achieved. Although this problem has previously been attempted, the accuracy of complete dense depth reconstruction remains an open problem. We propose a novel system that produces accurate, complete dense depth maps. The new system consists of two subsystems running in separate threads, namely dense mapping and sparse patch-based tracking. For dense mapping, a new projection-error computation method is proposed to enhance the gradient component in the estimated depth maps. For tracking, a new sparse patch-based tracking method estimates the camera pose by minimizing a normalized error term. The experiments demonstrate that the proposed method obtains improved performance in terms of completeness and accuracy compared to three state-of-the-art dense reconstruction methods: VSFM+CMVS, LSD-SLAM and REMODE.
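
    The sketch below illustrates a simple per-pixel projection (photometric) error for a candidate camera pose, the kind of term a tracking thread might minimize; the pinhole model, nearest-pixel sampling and the normalization are assumptions, not the paper's exact formulation.

        # Hypothetical sketch: mean normalized photometric error of projected reference points.
        import numpy as np

        def projection_error(points_3d, intensities_ref, image_cur, R, t, K):
            """points_3d: (N, 3) points with reference intensities; image_cur: (H, W) grayscale."""
            p_cam = points_3d @ R.T + t                      # transform into current camera frame
            z = p_cam[:, 2]
            uv = (p_cam @ K.T)[:, :2] / z[:, None]           # pinhole projection with intrinsics K
            u, v = np.round(uv[:, 0]).astype(int), np.round(uv[:, 1]).astype(int)
            h, w = image_cur.shape
            valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
            err = image_cur[v[valid], u[valid]] - intensities_ref[valid]
            return np.abs(err).mean() / (intensities_ref[valid].std() + 1e-6)   # crude normalization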

2015

  • Graph Cuts Stereo Matching Based on Patch-Match and Ground Control Points Constraint.
  • Xiaoshui Huang, Chun Yuan, and Jian Zhang.
    Pacific Rim Conference on Multimedia (PCM), 2015.
    PDF

    Stereo matching methods based on Patch-Match obtain good results in complex-texture regions but perform poorly in low-texture regions. In this paper, a new method that integrates Patch-Match and graph cuts (GC) is proposed in order to achieve good results in both complex- and low-texture regions. A label is randomly assigned to each pixel and optimized through a propagation process; all these labels constitute the label space for each GC iteration. In addition, a Ground Control Points (GCPs) constraint term is added to the GC energy to overcome the disadvantages of Patch-Match stereo in low-texture regions. The proposed method combines the spatial propagation of Patch-Match with the global property of GC. Experiments on the Middlebury evaluation system show that the proposed method outperforms all other Patch-Match-based methods.
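
    For illustration, the following sketch evaluates the kind of energy such a method minimizes: a data term, a smoothness term, and an extra penalty pulling Ground Control Points toward known disparities. The actual minimization would be done with a graph-cuts solver, and the weights and the form of the terms here are my assumptions.

        # Hypothetical sketch: evaluate a data + smoothness + GCP energy for a disparity map.
        import numpy as np

        def stereo_energy(disp, data_cost, gcp_mask, gcp_disp, lam=0.1, gamma=1.0):
            """disp: (H, W) integer disparities; data_cost: (H, W, D) per-pixel matching cost;
            gcp_mask: (H, W) bool; gcp_disp: (H, W) known disparities where gcp_mask is True."""
            h, w = disp.shape
            data = data_cost[np.arange(h)[:, None], np.arange(w)[None, :], disp].sum()
            smooth = np.abs(np.diff(disp, axis=0)).sum() + np.abs(np.diff(disp, axis=1)).sum()
            gcp = np.abs(disp[gcp_mask] - gcp_disp[gcp_mask]).sum()   # pull GCP pixels to known values
            return data + lam * smooth + gamma * gcp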

  • Dense Correspondence Using Non-local DAISY Forest.
  • Xiaoshui Huang, Jian Zhang, Qiang Wu, Chun Yuan, and Lixin Fan.
    International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2015.
    PDF

    Dense correspondence computation is a critical computer vision task with many applications. Most existing dense correspondence methods consider all the neighbors connected to the center pixel and use a local support region. However, such an approach may only achieve a locally optimal solution. In this paper, we propose a non-local dense correspondence computation method that calculates the matching cost on a tree structure. It is non-local because all other nodes on the tree contribute to the matching cost of the current node. The proposed method consists of three steps: 1) DAISY descriptor computation, 2) edge-preserving segmentation and forest construction, and 3) PatchMatch fast search. We test our algorithm on the Middlebury and Moseg datasets. The results show that the proposed method outperforms the state-of-the-art methods in dense correspondence computation and has low computational complexity.
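
    A minimal sketch of non-local cost aggregation on a tree is given below: every node's aggregated cost receives a contribution from all other nodes, attenuated by path length. This is a generic two-pass aggregation in the spirit of tree-based aggregation methods, not the paper's exact forest construction, and the node-ordering convention is an assumption.

        # Hypothetical sketch: two-pass non-local cost aggregation on a tree.
        import numpy as np

        def aggregate_on_tree(cost, parent, edge_w, sigma=0.1):
            """
            cost:   (N,) raw matching cost per node
            parent: (N,) parent index per node; node 0 is the root and parents precede children
            edge_w: (N,) weight of the edge to the parent (ignored for the root)
            """
            n = len(cost)
            s = np.exp(-edge_w / sigma)           # similarity along each node->parent edge
            up = cost.astype(float).copy()
            for v in range(n - 1, 0, -1):         # leaves -> root: accumulate subtree costs
                up[parent[v]] += s[v] * up[v]
            agg = up.copy()                       # root's final cost equals its up-pass cost
            for v in range(1, n):                 # root -> leaves: add contribution from the rest of the tree
                agg[v] = up[v] + s[v] * (agg[parent[v]] - s[v] * up[v])
            return agg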