Deep Hough Transform for Semantic Line Detection
online demo
Abstract
In this paper, we put forward a simple yet effective method to detect meaningful straight lines, a.k.a. semantic lines, in given scenes. Prior methods treat line detection as a special case of object detection, while neglecting the inherent characteristics of lines, leading to less efficient and suboptimal results. We propose a one-shot end-to-end framework that incorporates the classical Hough transform into deeply learned representations. Parameterizing lines by slope and bias, we perform the Hough transform to translate deep representations into the parametric space and then directly detect lines there. More concretely, we aggregate features along candidate lines on the feature-map plane and then assign the aggregated features to the corresponding locations in the parametric domain. Consequently, the problem of detecting semantic lines in the spatial domain is transformed into spotting individual points in the parametric domain, making the post-processing steps, i.e. non-maximal suppression, more efficient. Furthermore, our method makes it easy to extract contextual line features that are critical to accurate line detection. Experimental results on a public dataset demonstrate the advantages of our method over state-of-the-art methods.

Papers
- Deep Hough Transform for Semantic Line Detection, Kai Zhao#, Qi Han#, Chang-Bin Zhang, Jun Xu, Ming-Ming Cheng*, IEEE TPAMI, 2022. [pdf | code | bib | project | latex]
- Qi Han#, Kai Zhao#, Jun Xu, Ming-Ming Cheng*. “Deep Hough Transform for Semantic Line Detection.” 16th European Conference on Computer Vision (ECCV 2020), August 2020. (# denotes equal contribution) [Pdf | Code | bib | Video | Slides | Chinese version]
Background
Previous solutions for this task treat line detection as a special case of object detection and adopt existing CNN-based object detectors. In these methods, features are aggregated along straight lines using LOI pooling, and a classification network and a regression network are applied to the extracted feature vectors to identify positive lines and refine their positions. Because they extract line-wise feature vectors by aggregating deep features solely along each line, these methods capture inadequate context information and are less efficient in terms of running time.
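The line-wise aggregation used by these prior detectors can be sketched as follows. This is a simplified stand-in: the function name is hypothetical, and nearest-neighbor sampling replaces the bilinear interpolation used in LOI pooling.

```python
import math

def line_feature(feature_map, p0, p1, n_samples=8):
    """Aggregate a scalar feature along the segment p0 -> p1 by averaging
    nearest-neighbor samples. A simplified stand-in for LOI pooling,
    which samples with bilinear interpolation instead."""
    h, w = len(feature_map), len(feature_map[0])
    acc = 0.0
    for i in range(n_samples):
        t = i / (n_samples - 1)                 # uniform positions along the line
        x = round(p0[0] + t * (p1[0] - p0[0]))
        y = round(p0[1] + t * (p1[1] - p0[1]))
        x = min(max(x, 0), w - 1)               # clamp to the feature map
        y = min(max(y, 0), h - 1)
        acc += feature_map[y][x]
    return acc / n_samples
```

Aggregating only along the line itself is exactly why these detectors see little context beyond the line's own pixels.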
Pipeline
Our method comprises the following four major components:
- A CNN encoder that extracts pixel-wise deep representations.
- The deep Hough transform that converts the spatial representations to a parametric space.
- A line detector responsible for detecting lines in the parametric space.
- A reverse Hough transform that converts the detected lines back to image space.
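Composed together, the four stages might look like the following sketch. The function name and stage callables are hypothetical placeholders for illustration, not the released API.

```python
def detect_semantic_lines(image, encoder, hough, detector, reverse):
    """Hypothetical end-to-end composition of the four stages above."""
    features = encoder(image)            # pixel-wise deep representations
    param_map = hough(features)          # spatial features -> parametric space
    peaks = detector(param_map)          # individual points in parametric space
    return [reverse(p) for p in peaks]   # map each point back to a line in image space
```

The key design point is that detection happens entirely in the parametric space; only the final stage returns to image coordinates.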

Deep Hough Transform performs the Hough transform on deep representations, mapping the spatial features into a high-dimensional parametric space in parallel. Line structures are represented more compactly there, because lines near a given line are translated to points surrounding that line's point in the parametric space. A context-aware line detector aggregates the features of nearby lines, and predictions emerge directly as individual points in the parametric space. The time-consuming NMS is thus reduced to computing the geometric center of each connected component of points in the parametric space. Finally, the reverse Hough transform maps points in the parametric space back to the image space.
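A minimal scalar sketch of these two steps, assuming a standard (theta, rho) line parameterization and a single connected peak region. The bin counts and helper names are illustrative only; the actual method accumulates whole feature vectors, not scalars.

```python
import math

def deep_hough(feat, n_theta=6, n_rho=9):
    """Accumulate per-pixel feature responses into a (theta, rho) grid:
    each pixel votes its feature value for every candidate line through it.
    A simplified scalar version of the feature-aggregation step."""
    h, w = len(feat), len(feat[0])
    rho_max = math.hypot(h, w)
    acc = [[0.0] * n_rho for _ in range(n_theta)]
    for y in range(h):
        for x in range(w):
            v = feat[y][x]
            if v == 0:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = x * math.cos(theta) + y * math.sin(theta)
                r = int((rho + rho_max) / (2 * rho_max) * (n_rho - 1))
                acc[t][r] += v
    return acc

def centroid_of_peak(acc, thresh):
    """Replace NMS: take the geometric center (centroid) of all
    above-threshold cells, assuming they form one connected peak."""
    pts = [(t, r) for t, row in enumerate(acc)
                  for r, v in enumerate(row) if v >= thresh]
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
```

A horizontal line of responses in the feature map produces one dominant cell near theta = pi/2, and the centroid of the thresholded region recovers it without any pairwise suppression.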
Evaluation Metric
We design a principled metric named EA-score, which measures the similarity between two lines by considering both their Euclidean distance and their angular distance.
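One way such a score could combine the two distances is sketched below. The exact normalizations and the way the two terms are combined in the paper may differ, so treat the definitions here (angle scaled by pi/2, midpoint distance in a unit-normalized image, product squared) as assumptions for illustration.

```python
import math

def ea_score(l1, l2):
    """Hedged sketch of an EA-style line similarity: combines an angular
    similarity and a Euclidean (midpoint) similarity. Each line is
    ((x1, y1), (x2, y2)) with coordinates scaled to [0, 1]."""
    (a1, a2), (b1, b2) = l1, l2
    t1 = math.atan2(a2[1] - a1[1], a2[0] - a1[0])
    t2 = math.atan2(b2[1] - b1[1], b2[0] - b1[0])
    dt = abs(t1 - t2) % math.pi
    dt = min(dt, math.pi - dt)           # angle between undirected lines
    s_theta = 1.0 - dt / (math.pi / 2)   # angular similarity in [0, 1]
    m1 = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)
    m2 = ((b1[0] + b2[0]) / 2, (b1[1] + b2[1]) / 2)
    d = math.hypot(m1[0] - m2[0], m1[1] - m2[1])
    s_dist = max(0.0, 1.0 - d)           # Euclidean similarity in [0, 1]
    return (s_theta * s_dist) ** 2
```

Identical lines score 1, and perpendicular lines score 0 regardless of position, which matches the intent of penalizing both angular and positional deviation.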


Performance

ECCV2020 Video
All Detection Results on the SEL Dataset
