
Reflection Removal of 3D Point Clouds for Realistic 3D Scene Reconstruction



Attribution-NonCommercial-NoDerivatives 2.0 Korea. Users may freely copy, distribute, transmit, display, perform, and broadcast this work, provided that they follow the conditions below:

- For any reuse or distribution, you must clearly indicate the license terms applied to this work.

- These conditions do not apply if you obtain separate permission from the copyright holder.

Your rights under copyright law are not affected by the above. This is a human-readable summary of the Legal Code.

Disclaimer

Attribution. You must attribute the original author.

NonCommercial. You may not use this work for commercial purposes.

NoDerivatives. You may not alter, transform, or build upon this work.


Doctoral Dissertation

Reflection Removal of 3D Point Clouds for Realistic 3D Scene Reconstruction

Jae-Seong Yun

Department of Electrical Engineering

Ulsan National Institute of Science and Technology

2021


Reflection Removal of 3D Point Clouds for Realistic 3D Scene Reconstruction

Jae-Seong Yun

Department of Electrical Engineering

Ulsan National Institute of Science and Technology


Reflection Removal of 3D Point Clouds for Realistic 3D Scene Reconstruction

A dissertation submitted to

Ulsan National Institute of Science and Technology in partial fulfillment of the

requirements for the degree of Doctor of Philosophy

Jae-Seong Yun

Date of submission: 12/14/2020

Approved by

_________________________

Advisor

Jae-Young Sim


Reflection Removal of 3D Point Clouds for Realistic 3D Scene Reconstruction

Jae-Seong Yun

This certifies that the dissertation of Jae-Seong Yun is approved.

Date of submission: 12/14/2020

Signature

___________________________

Advisor: Jae-Young Sim Signature

___________________________

Seungjoon Yang: Thesis Committee Member #1 Signature

___________________________

Se Young Chun: Thesis Committee Member #2 Signature

___________________________

Seong-Jin Kim: Thesis Committee Member #3 Signature

___________________________

Won-Ki Jeong: Thesis Committee Member #4


Abstract

With the advent of LiDAR scanners, large-scale 3D point clouds (LS3DPCs) captured by terrestrial LiDAR scanners and dynamic 3D point clouds captured by real-time LiDAR scanners are widely used. However, the captured 3D point clouds often include virtual points generated by glass reflection. The virtual points may degrade the performance of various computer vision techniques when applied to 3D point clouds. In this dissertation, we propose virtual point removal algorithms for static and dynamic 3D point clouds captured by LiDAR scanners.

We first propose an efficient reflection removal algorithm for static LS3DPCs with a single pane of glass. We partition the unit sphere into local surface patches, which are then classified into ordinary patches and glass patches according to the number of echo pulses received for each emitted laser pulse. Then we estimate the glass region of dominant reflection artifacts by measuring the reliability.

We also detect and remove the virtual points using the conditions of reflection symmetry and geometric similarity. We test the performance of the proposed algorithm on LS3DPCs capturing real-world outdoor scenes and show that the proposed algorithm estimates valid glass regions faithfully and removes the virtual points caused by reflection artifacts successfully.

Then, we propose a virtual point removal algorithm for LS3DPCs with multiple glass planes. We first estimate multiple glass regions by modeling the reliability with respect to each glass plane, respectively, such that the regions are assigned high reliability when they have multiple echo pulses for each emitted laser pulse. Then we determine whether each point is a virtual point or not. For a given point, we recursively traverse all the possible trajectories of reflection, and select the optimal trajectory which provides a point with a similar geometric feature to the given point at the symmetric location. We evaluate the performance of the proposed algorithm on various LS3DPC models with diverse numbers of glass planes. Experimental results show that the proposed algorithm estimates multiple glass regions faithfully and detects the virtual points successfully. Moreover, we also show that the proposed algorithm yields a much better performance of reflection artifact removal compared with the existing method, both qualitatively and quantitatively.

Finally, we propose a novel virtual point removal algorithm for dynamic 3D point clouds. We first estimate the glass planes using the specular reflection characteristics of glass. We collect local maxima points in every frame and estimate the glass planes using the relationship between the intensity and the angle of incidence of the laser pulse. To remove virtual points more completely, we cluster the 3D point clouds, recursively estimate all possible trajectories of the laser pulses, and remove the clusters whose most reliable trajectory involves a reflection. Experimental results show that the proposed algorithm successfully estimates the glass planes, and detects and removes the virtual points.

The main contribution of this dissertation is the removal of reflection artifacts in LS3DPCs and dynamic 3D point clouds. In the first part, we analyze the characteristics of the echo pulses received by terrestrial LiDAR scanners, and estimate glass regions and glass planes. We also investigate the symmetry relation between corresponding real and virtual points. In the second part, we propose recursive trajectory estimation methods to detect and remove reflection artifacts, and we perform quantitative evaluation of the proposed methods. In the last part, we analyze the relationship between the intensity and the angle of incidence of the received echo pulse to estimate the glass planes of dynamic 3D point clouds captured by real-time LiDAR scanners. Then we apply cluster-wise removal of reflection artifacts to obtain clear, reflection-free 3D point clouds.


Contents

Ⅰ Introduction ... 1

1 Motivations ... 1

2 Proposed Approaches ... 5

2. 1 Reflection Removal for Large-Scale 3D Point Clouds with Single Glass Plane ... 5

2. 2 Reflection Removal for Large-Scale 3D Point Clouds with Multiple Glass Planes ... 5

2. 3 Reflection Removal for Dynamic 3D Point Clouds ... 6

3 Organization ... 7

Ⅱ Reflection Removal for Large-Scale 3D Point Clouds with Single Glass Plane ... 8

1 Introduction ... 8

2 Overview of Proposed Algorithm ... 10

3 Glass Region Estimation ... 11

3. 1 Patch Classification by Point Projection ... 11

3. 2 Reliability for Glass Patches ... 12

4 Virtual Point Detection ... 16

4. 1 Reflection Symmetry... 16

4. 2 Geometric Similarity ... 17

4. 3 Detection of Virtual Points ... 19

5 Experimental Results ... 22

Ⅲ Reflection Removal for Large-Scale 3D Point Clouds with Multiple Glass Planes ... 25

1 Introduction ... 25


2 Glass Region Estimation ... 27

2. 1 Reflection Characteristics of LiDAR ... 27

2. 2 Multiple Regions of Planar Glass ... 30

2. 3 Glass Region Refinement ... 35

3 Virtual Point Detection ... 36

3. 1 Multiple Trajectory Estimation ... 36

3. 2 Optimal Trajectory Selection ... 38

3. 3 Virtual Point Removal ... 43

4 Experimental Results ... 45

4. 1 Data Acquisition ... 45

4. 2 Results of Virtual Point Removal ... 45

4. 3 Comparison with Existing Method... 49

Ⅳ Reflection Removal for Dynamic 3D Point Clouds ... 58

1 Introduction ... 58

2 Glass Plane Estimation ... 59

3 Virtual Cluster Detection ... 64

4 Experimental Result ... 67

Ⅴ Conclusion ... 70


List of Figures

1.1 Reflection artifact in LS3DPC. (a) Reflection artifact in 2D image. (b) A LS3DPC model where the virtual points caused by glass reflection are detected in red and the associated glass plane is shown in cyan. ... 2

2.1 The principle of reflection in a LiDAR laser scanner. The black building and tree are real-world objects, while the gray building denotes a virtual object generated by reflection. ... 9

2.2 Partitioning of the unit sphere into local surface patches. ... 12

2.3 Glass region estimation. (a) The panorama image for a target scene. (b) The number of projected points 𝑛𝑖. (c) The posterior probability 𝑝1(𝑛𝑖) of belonging to glass regions. (d) The reliability map of 𝜌𝑖 where dominant glass regions are highlighted. ... 13

2.4 Histogram of 𝑛𝑖's for the patches with 𝑛𝑖 > 0 in ‘Engineering Building’ model. ... 14

2.5 Symmetry relation of reflection between a pair of real point and virtual point. ... 17

2.6 Fast Point Feature Histogram which describes the three angular variations 𝛼, ϕ, θ associated with the two points 𝐩𝑖 and 𝐩𝑗. ... 18

2.7 Scores for virtual points. (a) An input LS3DPC model. (b) The distribution of score 𝛾(𝐩) where the red and blue colors depict high and low scores, respectively. ... 19

2.8 Estimation of glass regions associated with dominant reflection artifacts. In each subfigure, a color panorama image and the resulting reliability distribution are shown at the top and bottom, respectively. ... 20

2.9 Results of reflection removal for LS3DPCs. In each subfigure, the top row shows an input LS3DPC model, the middle row visualizes the estimated glass regions (yellow) and the detected virtual points (red), and the bottom row shows the resulting LS3DPC model where the virtual points are removed. ... 24

3.1 Partitioning of the unit sphere into local surface patches. ... 28

3.2 Distribution of the number of projected points in ‘International Hall’ model. (a) A panoramic image for the target scene. (b) The distribution of the number of projected points 𝑔𝑖. (c) The histogram of 𝑔𝑖's. (d) The distribution of the posterior probability 𝑝1(𝑔𝑖). ... 29

3.3 Glass plane estimation. (a) The rendered 3D point cloud where the estimated three glass planes are depicted in purple, yellow, and cyan, respectively. (b) A color image for the target scene captured at a similar viewpoint to (a). ... 31

3.4 Glass region refinement. (a) Absence of sampled points within the minimum acquisition distance from the scanner. (b) An image of the target scene. (c) The initial reliability map. (d) The refined reliability map. ... 32

3.5 Estimation results of multiple glass regions. (a) An input panoramic image where the three glass planes Π0, Π1, and Π2, and the marble floor Π3 are depicted in purple, cyan, yellow and green, respectively. The resulting reliability maps of (b) Ψ0, (c) Ψ1, (d) Ψ2, and (e) Ψ3. ... 34

3.6 Virtual point generation with two parallel glass planes. ... 35

3.7 Trajectory estimation with multiple glass planes. ... 37

3.8 Symmetry relation of reflection between a pair of real point and virtual point. ... 41

3.9 Fast Point Feature Histogram which describes the three angular variations 𝛼, 𝜙, 𝜃 associated with the two points 𝐩𝑖 and 𝐩𝑗. ... 42

3.10 Validity scores of trajectories associated with virtual points. (a) An input LS3DPC model with virtual points by glass reflection. (b) The distribution of 𝛾(𝐩) where the red and blue colors depict high and low scores, respectively. ... 44

3.11 Results of the proposed algorithm. (a) Input LS3DPC models. (b) Panoramic images and the reliability maps of the glass regions. (c) The estimated multiple glass planes. (d) The detected virtual points. (e) The resulting LS3DPC models where the reflection artifacts are removed. From left to right, ‘Botanical Garden,’ ‘Terrace,’ and ‘Natural Science Building’ models. ... 46

3.11 Results of the proposed algorithm. (a) Input LS3DPC models. (b) Panoramic images and the reliability maps of the glass regions. (c) The estimated multiple glass planes. (d) The detected virtual points. (e) The resulting LS3DPC models where the reflection artifacts are removed. From left to right, ‘Engineering Building,’ ‘Corridor,’ ‘Lobby,’ and ‘International Hall’ models. ... 47

3.12 Comparison of the glass region estimation results on LS3DPCs with a single glass plane. (a) Panoramic images of input LS3DPCs. The reliability maps of the glass regions estimated by using (b) [42] and (c) the proposed algorithm. From top to bottom, ‘Natural Science Building,’ ‘Gymnasium,’ ‘Botanical Garden’ and ‘Architecture Building’. ... 48

3.13 Comparison of the glass region estimation results on LS3DPCs with multiple glass planes. (a) Panoramic images of input LS3DPCs. The reliability maps of the glass regions associated with the (b) first, (c) second, (d) third and (e) fourth glass planes, respectively. For each model, the upper and lower rows show the reliability maps estimated by the iteration of [42] and the proposed algorithm, respectively. From top to bottom, ‘Engineering Building,’ ‘Corridor,’ ‘Lobby,’ ‘International Hall,’ and ‘Two Buildings’ models. ... 49

3.14 Comparison of the virtual point removal results. (a) Input LS3DPC models and the virtual point removal results obtained by using (b) [42], (c) the iteration of [42], and (d) the proposed algorithm, respectively. From left to right, ‘Gymnasium,’ ‘Architecture Building,’ and ‘International Hall,’ respectively. ... 50

3.14 Comparison of the virtual point removal results. (a) Input LS3DPC models and the virtual point removal results obtained by using (b) [42], (c) the iteration of [42], and (d) the proposed algorithm, respectively. From left to right, ‘Lobby,’ ‘Corridor,’ and ‘Two Buildings,’ respectively. ... 51

3.15 Comparison of the virtual point removal results on ‘Dance Practice Room’ (top) and ‘Lobby’ (bottom) models. (a) Input LS3DPC models. The virtual point removal results obtained by using (b) the iteration of [42] and (c) the proposed algorithm, respectively. ... 53

3.16 Comparison of the virtual point removal performance on ‘Amazon Lumberyard Bistro’ model. The blue and red colors depict the real and virtual points, respectively. The two glass planes are shown in cyan and purple. ... 55

4.1 Specular reflection characteristics of glass. ... 59

4.2 The intensity of received echo pulse by [31] for (a) virtual and real points and (b) points on the glass plane. High and low values are depicted in red and blue, respectively. (c) Distribution of intensity of echo laser pulse along angle of incidence from [48]. ... 60

4.3 Intensities of received echo pulses from all frames. Red and blue colors depict high and low intensities, respectively. The points sampled on the glass planes are marked in a blue ellipse. ... 61

4.3 Intensities of received echo pulses from all frames. Red and blue colors depict high and low intensities, respectively. The points sampled on the glass planes are marked in a blue ellipse. ... 62

4.4 Estimated glass plane Π0 for ‘Parking Lot’ model. ... 62

4.5 Example of virtual points. ... 65

4.6 (a) Distribution of final scores 𝛾(𝐜𝑖), where red and blue colors depict high and low values, respectively, and (b) detected virtual clusters where virtual and real clusters are colored in red and blue, respectively. ... 66

4.7 Results of proposed algorithm. (a) Input dynamic 3D point clouds. (b) The estimated glass planes are colored in cyan, and virtual and real clusters are colored in red and blue, respectively. (c) The resulting 3D point clouds where the virtual clusters are removed. From top to bottom, ‘Parking Lot,’ ‘Main Building,’ ‘Engineering Building,’ and ‘Communication Center’ models. ... 69


List of Tables

2.1 Processing time of the proposed algorithm. (a) ‘Architecture Building,’ (b) ‘Engineering Building,’ (c) ‘Natural Science Building,’ (d) ‘Botanical Garden,’ (e) ‘Gymnasium,’ and (f) ‘Terrace.’ ... 22

3.1 Comparison of quantitative performance. ... 56

3.2 Comparison of processing times in seconds. (a) ‘Architecture Building,’ (b) ‘Botanical Garden,’ (c) ‘Gymnasium,’ (d) ‘Natural Science Building,’ (e) ‘Terrace,’ (f) ‘Corridor,’ (g) ‘Dance Practice Room,’ (h) ‘Engineering Building,’ (i) ‘Lobby,’ (j) ‘International Hall,’ and (k) ‘Two Buildings.’ ... 56

4.1 Processing time of the proposed algorithm for (a) ‘Parking Lot,’ (b) ‘Main Building,’ (c) ‘Engineering Building,’ and (d) ‘Communication Center’ models. ... 67


Chapter 1 Introduction

1 Motivations

We often take glass images, for example, photos of goods taken through show windows or photos of buildings with glass curtain walls. While light passes through the glass, it is also partially reflected by the pane of glass. Therefore, a glass image captures the target scene behind the glass together with an undesired scene in front of the glass. Figure 1.1(a) shows an example of a glass image where the sky, buildings, and trees appear as scenes reflected on the transparent window wall. The undesired reflected scenes usually degrade the performance of image processing and computer vision techniques when applied to glass images.

Reflection removal is a technique to automatically separate a glass image into a transmission image and a reflection image. Attempts have been made to remove reflection artifacts from a single glass image. Levin et al. [1] separated an input glass image into a transmission image and a reflection image by minimizing the number of edges and corners in the separated images. Levin and Weiss [2] solved the layer separation problem by minimizing the gradients of the transmission image based on the gradient sparsity prior. Li and Brown [3] regularized the gradient distribution of the transmission image to have a long tail by assuming that reflection images are smoother than transmission images. Shih et al. [4] considered double reflection caused by a thick pane of glass, and used the ghosting effect observed in the reflected scene to separate a reflection image from a transmission image. Arvanitopoulos et al. [5] decomposed a glass image into transmission and reflection layers by minimizing the Laplacian fidelity, which regularizes the difference between the Laplacian of an input image and the Laplacian of a transmission image. Fan et al. [6] proposed two deep neural networks, where the edge prediction network extracts a transmission edge map and the image reconstruction network reconstructs a transmission image from the predicted edge map. Wan et al. [7] used an encoder-decoder convolutional neural network architecture where the decoder of the image inference network is guided by the associated decoder of the gradient inference network. Zhang et al. [8] used a conditional generative adversarial network, which employs a feature loss to minimize the feature difference between the ground truth and reconstructed transmission images, and an exclusion loss to minimize the overlapping edges between the transmission and reflection images. Yang et al. [9] proposed a cascade deep neural network which estimates the transmission image from the reflection image and vice versa. Han and Sim [10] proposed a non-linear synthesis method for glass images and a neural network for reflection removal guided by semantic information of transmission images.

Figure 1.1: Reflection artifact in LS3DPC. (a) Reflection artifact in 2D image. (b) A LS3DPC model where the virtual points caused by glass reflection are detected in red and the associated glass plane is shown in cyan.

Reflection removal techniques using multiple glass images have also been proposed. Kong et al. [11] estimated the dense angle of incidence map by minimizing the mutual information between the transmission and reflection images. Agrawal et al. [12] used a pair of images taken with and without flashlight, respectively, to remove reflection images. Schechner et al. [13] estimated defocus blur kernels for differently focused glass images. Wieschollek et al. [14] synthesized a training set of differently polarized glass images based on reflection physics, and designed a reflection removal algorithm using a deep neural network. Multiple images taken from different camera positions have also been used for reflection removal. Gai et al. [15] estimated the parameterized motions, including translation, rotation, and scaling, to separate the layers from multiple glass images. Sinha et al. [16] estimated the depth map using multiple images, and decomposed the depth map into rear and front depth maps which are then used to restore the transmission and reflection images. Li and Brown [17] first aligned multiple glass images together and generated edge maps. The edge pixels are classified to the transmission and reflection images, respectively, according to their coincidence, which are then used to reconstruct the transmission and reflection images. Guo et al. [18] separated the image layers by applying a user-driven region of interest to glass images. Xue et al. [19] proposed an optimization algorithm for reflection removal by iteratively estimating the motion fields of the transmitted and reflected scenes. Han and Sim [20] applied low-rank matrix completion to the gradients of multiple glass images to extract a consistent transmission image, and they also improved the performance by aligning multiple glass images with respect to the transmitted scene using co-saliency detection [21]. Nandoriya et al. [22] developed a reflection removal algorithm for video sequences which reduces the flickering artifact using temporal information.

With the advent of high-performance Light Detection And Ranging (LiDAR) scanners, much effort has been made to develop efficient techniques for processing 3D point clouds. A real-time LiDAR scanner and stereo cameras were used to generate a test dataset [23] for autonomous vehicle applications, which was used to study object detection [24] [25] and road detection [26] [25]. Moreover, high-resolution terrestrial LiDAR scanners provide Large-Scale 3D Point Clouds (LS3DPCs) capturing 360° environmental real-world scenes, and LS3DPCs have recently been used to pioneer new research issues such as saliency detection [27] [28] and data compression [29].

A LiDAR scanner emits a laser pulse and receives its echo pulse to measure the distance from the scanner to a target object based on the traveling time of light. Therefore, LS3DPCs obtained by terrestrial LiDAR scanners also suffer from reflection artifacts. When an emitted laser pulse is reflected on a glass plane and the scanner receives its echo pulse, a virtual point is generated at the opposite side of the glass plane, since the scanner does not know of the existence of the glass and regards the received pulse as having been reflected at a real object only once. Figure 1.1(b) renders a LS3DPC capturing a real-world outdoor scene corresponding to Figure 1.1(a). We see that the virtual points of the building and trees are detected in red, and the associated glass plane is depicted in cyan. The virtual points in LS3DPCs distort the true geometry of the target real-world scene, and hence degrade the performance of the related geometry processing techniques for LS3DPCs.


2 Proposed Approaches

In this section, we describe the proposed approaches for removing reflection artifacts from large-scale 3D point clouds (LS3DPCs) when there are single or multiple glass planes. We also describe the proposed approach for removing virtual points from dynamic 3D point clouds captured by real-time LiDAR scanners.

2. 1 Reflection Removal for Large-Scale 3D Point Clouds with Single Glass Plane

LS3DPCs captured by terrestrial LiDAR scanners often exhibit reflection artifacts caused by glass, which degrade the performance of related computer vision techniques. In this work, we propose an efficient reflection removal algorithm for LS3DPCs. We first partition the unit sphere into local surface patches, which are then classified into ordinary patches and glass patches according to the number of echo pulses received for each emitted laser pulse. Then we estimate the glass region of dominant reflection artifacts by measuring the reliability. We also detect and remove the virtual points using the conditions of reflection symmetry and geometric similarity. We test the performance of the proposed algorithm on LS3DPCs capturing real-world outdoor scenes, and show that the proposed algorithm estimates valid glass regions faithfully and removes the virtual points caused by reflection artifacts successfully.

2. 2 Reflection Removal for Large-Scale 3D Point Clouds with Multiple Glass Planes

LS3DPCs captured by terrestrial LiDAR scanners often include virtual points which are generated by glass reflection. The virtual points may degrade the performance of various computer vision techniques when applied to LS3DPCs. In this work, we propose a virtual point removal algorithm for LS3DPCs with multiple glass planes. We first estimate multiple glass regions by modeling the reliability with respect to each glass plane, respectively, such that the regions are assigned high reliability when they have multiple echo pulses for each emitted laser pulse. Then we determine whether each point is a virtual point or not. For a given point, we recursively traverse all the possible trajectories of reflection, and select the optimal trajectory which provides a point with a similar geometric feature to the given point at the symmetric location. We evaluate the performance of the proposed algorithm on various LS3DPC models with diverse numbers of glass planes. Experimental results show that the proposed algorithm estimates multiple glass regions faithfully and detects the virtual points successfully. Moreover, we also show that the proposed algorithm yields a much better performance of reflection artifact removal compared with the existing method, both qualitatively and quantitatively.

2. 3 Reflection Removal for Dynamic 3D Point Clouds

A terrestrial LiDAR scanner captures high-resolution LS3DPCs, but it takes a couple of minutes to capture a single 3D point cloud and produces undesired distortion when there are moving objects, which makes it hard to use in many applications. Instead, real-time LiDAR scanners are widely used to capture 3D spatial information, and the captured dynamic 3D point clouds also suffer from reflection artifacts. In this work, we propose a virtual point removal method for dynamic 3D point clouds captured by real-time LiDAR scanners. Unlike high-resolution terrestrial LiDAR scanners, a real-time LiDAR scanner does not capture all echo pulses, and therefore the reflection removal methods for LS3DPCs cannot be used directly. To remove reflection artifacts from dynamic 3D point clouds, we analyze the relationship between the intensities of echo pulses and the angle of incidence of the emitted laser pulses. We estimate the glass plane using the characteristic that the intensity of the echo pulse is maximized when the emitted laser pulse is perpendicular to the glass plane. We also perform cluster-wise removal of reflection artifacts. We first cluster the 3D point clouds using the supervoxel algorithm [30] and estimate all possible trajectories of each cluster. We then remove the clusters whose most reliable trajectory involves a reflection of the laser pulse. The experimental results show that the proposed algorithm detects the glass regions and removes the virtual objects more completely than the existing method.


3 Organization

The rest of this dissertation is organized as follows. Chapter 2 presents a novel method for removing reflection artifacts from LS3DPCs with a single glass plane. Chapter 3 introduces a novel method for virtual point removal for LS3DPCs with multiple glass planes. Chapter 4 introduces a novel virtual point removal algorithm for dynamic 3D point clouds. Finally, we conclude this dissertation in Chapter 5.


Chapter 2

Reflection Removal for Large-Scale 3D Point Clouds with Single Glass Plane

1 Introduction

LS3DPCs captured by using terrestrial LiDAR scanners suffer from reflection artifacts since many outdoor real-world structures, e.g., vehicles and buildings, include glass. A LiDAR scanner measures the distances of a target scene by emitting laser pulses and receiving their echo pulses. Figure 2.1 shows the principle of reflection caused by LiDAR scanners. The scanner calculates the distance from the scanner to an object by measuring the time it takes to emit a laser pulse and receive the echo pulse. A single laser pulse emitted from the scanner first hits the glass, and its echo pulse comes back to the scanner, creating a 3D point 𝐩1 on the glass. Also, penetration and reflection of light occur simultaneously on the glass. The penetrated laser pulse hits the tree, a real-world object behind the glass, and its echo pulse is received at the scanner, creating another 3D point 𝐩2. On the other hand, the reflected laser pulse hits the building, a real-world object in front of the glass, and the scanner receives its echo pulse to create a virtual 3D point 𝐪virtual at the opposite side of the glass plane. Consequently, multiple echo pulses are generated from a single emitted pulse, which create three different 3D points. Among these points, 𝐩1 and 𝐩2 are valid points sampled on real-world objects, but 𝐪virtual is located at a wrong position in 3D space, i.e., on the gray building. Such reflection artifacts occur since the scanner regards a received pulse as having been reflected on a real-world object only once. Therefore, the resulting LS3DPC includes a virtual scene which may degrade the performance of the related processing techniques for LS3DPCs.

In this chapter, we propose a reflection removal algorithm for LS3DPCs. We investigate the capturing mechanism of the terrestrial LiDAR scanner and estimate the glass regions by modeling the distribution of the number of received echo pulses. Then we detect a point as a virtual point when it has a corresponding real point with a similar geometric feature at the opposite side of the glass plane. We perform experiments on LS3DPC models capturing real-world outdoor scenes and show that the proposed algorithm removes the reflection artifacts faithfully.


Figure 2.1: The principle of reflection in a LiDAR laser scanner. The black building and tree are real- world objects, while the gray building denotes a virtual object generated by reflection.


2 Overview of Proposed Algorithm

We use a 3D terrestrial laser scanner, RIEGL VZ-400 [31], to acquire LS3DPCs for real-world outdoor scenes including glass. In general, glass is highly specular, and therefore valid 3D points are sampled over a relatively small area on a glass plane where the directions of the emitted lasers are close to the normal direction of the glass plane. However, many buildings have coated glass curtain walls and windows which produce sampled points over relatively larger areas than uncoated ones. Also, when a typical real-world scene is captured by a LiDAR scanner, a single glass plane located close to the scanner is usually associated with the dominant reflection artifact, while the other glass regions yield small numbers of points with negligible reflection artifacts. Therefore, in this work, we first estimate the glass plane of dominant reflection artifact in a captured scene. Then we detect and remove the virtual points by comparing the features between pairs of symmetric points about the glass plane.


3 Glass Region Estimation

In research on reflection removal for 2D images, glass images are usually captured such that the reflection occurs over the entire image area. However, LiDAR scanners capture the 360° environment of a real-world scene, and therefore the virtual points associated with glass are distributed over local regions in a single LS3DPC model. We first estimate the glass plane where the dominant reflection occurs using the characteristics of LiDAR scanning.

3. 1 Patch Classification by Point Projection

In general, a single 3D point is created for each laser pulse, since the light is reflected on a real-world object only once in most cases. But, as shown in Figure 2.1(b), a laser pulse hitting the glass plane may produce more than two points: one from the penetrated laser pulse and the others from the reflected pulses on the glass plane. Each laser pulse is emitted periodically with predefined azimuthal and polar angular resolutions, 𝜔azimuthal and 𝜔polar , respectively. By using the measured distance and the associated azimuthal and polar angles of an emitted laser pulse, the coordinates of 3D points are computed in the spherical coordinate system. As shown in Figure 2.2, we consider the unit sphere with the origin at the scanner location, and partition the sphere surface into local patches where the blue rectangular area becomes a primitive surface patch which covers the angular range of 𝜔azimuthal× 𝜔polar. Then we count the number of points corresponding to each patch by projecting the 3D points onto the surface of the unit sphere, and we classify the patches into the ordinary patches including only a single 3D point and the glass patches where two or more 3D points are projected.

However, a laser scanner acquires the points based on whisk broom scanning which samples one point at a certain time instance by mechanically rotating the sensor. Hence, to reduce the sampling error of points, we employ a larger surface patch covering a wider angular range of 𝑚𝜔azimuthal× 𝑚𝜔polar where 𝑚 is a positive integer. We set 𝑚 = 3 empirically as depicted by the red rectangular area in Figure 2.2, since it is the smallest integer detecting the glass regions of dominant reflection reliably. We see that each patch is associated with approximately 𝑚2 laser pulses.
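
As a concrete illustration of this counting step, the sketch below (a minimal Python example, not the original implementation) bins the received echoes into enlarged angular patches and returns the per-patch counts; the input arrays of azimuth and polar angles and the variable names are assumptions made for the example.

```python
import numpy as np

def count_points_per_patch(az_deg, pol_deg, w_az=0.06, w_pol=0.06, m=3):
    """Count the echoes projected to each enlarged surface patch on the unit sphere.

    az_deg, pol_deg : azimuth/polar angles (degrees) of every received echo,
                      assumed to lie in [0, 360) and [0, 180).
    w_az, w_pol     : angular resolutions of the scanner (0.06 deg in the VZ-400 setup).
    m               : enlargement factor, so each patch spans (m*w_az) x (m*w_pol).
    """
    patch_az, patch_pol = m * w_az, m * w_pol

    # Patch indices of each echo on the partitioned unit sphere.
    i_az = np.floor((np.asarray(az_deg) % 360.0) / patch_az).astype(int)
    i_pol = np.floor((np.asarray(pol_deg) % 180.0) / patch_pol).astype(int)

    # Flatten the 2D patch grid and histogram the echoes: this is n_i in the text.
    n_az_patches = int(np.ceil(360.0 / patch_az))
    patch_id = i_pol * n_az_patches + i_az
    return np.bincount(patch_id)

# Ordinary patches cluster around m^2 = 9 echoes, glass patches around 2*m^2 = 18,
# matching the two histogram peaks discussed below.
```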

Let $n_i$ be the number of points projected to the $i$-th surface patch $\mathcal{S}_i$. Figure 2.3(b) visualizes the distribution of $n_i$'s associated with the target scene shown in Figure 2.3(a). We see that while the ordinary patches usually exhibit $n_i = m^2$, the glass patches tend to have $n_i > m^2$ due to multiple echo pulses caused by reflection. Also, Figure 2.4 shows the histogram of $n_i$'s counted over all the patches having valid projected points, i.e., $n_i > 0$, for the LS3DPC model in Figure 2.3. We observe two strong peaks at $n_i = 9$ and $n_i = 18$. The first peak at $n_i = 9$ implies that $m^2$ laser pulses produce the same number of points, which is highly probable for the ordinary patches. On the other hand, the second peak at $n_i = 18$ is associated with $2m^2$ projected points, yielded mainly by the glass regions, where each pulse generates two points on average, one from the penetrated laser pulse and the other from the reflected laser pulse. We also see a weak peak at $n_i = 27$ associated with $3m^2$ projected points on average.

Figure 2.2: Partitioning of the unit sphere into local surface patches.

3. 2 Reliability for Glass Patches

We classify the patches into the two categories of ordinary patches and glass patches by modeling the distribution of $n_i$ using a mixture of $K$ Gaussian distributions [32]. The density of the Gaussian mixture model is given by

$$f(n_i) = \sum_{k=0}^{K-1} \lambda_k \, \mathcal{N}(n_i \mid \mu_k, \sigma_k^2), \qquad (2.1)$$

where $\mathcal{N}(n_i \mid \mu_k, \sigma_k^2)$ is the $k$-th Gaussian density with mean $\mu_k$ and variance $\sigma_k^2$, and $\lambda_k$ is the mixing coefficient. We set the number of Gaussians to $K = 2$, one for the ordinary patches and the other for the glass patches. We introduce a two-dimensional binary random vector $\mathbf{z} = [z_0, z_1]^T$, where $z_k \in \{0, 1\}$ and $\sum_{k=0}^{K-1} z_k = 1$. Without loss of generality, we assume that $\mu_0 \le \mu_1$; then $\mathbf{z} = [1, 0]^T$ at the ordinary patches and $\mathbf{z} = [0, 1]^T$ at the glass patches, respectively.

Figure 2.3: Glass region estimation. (a) The panorama image for a target scene. (b) The number of projected points $n_i$. (c) The posterior probability $p_1(n_i)$ of belonging to glass regions. (d) The reliability map of $\rho_i$ where dominant glass regions are highlighted.

Figure 2.4: Histogram of $n_i$'s for the patches with $n_i > 0$ in ‘Engineering Building’ model.

To estimate the parameters $\mu_k$, $\sigma_k$, and $\lambda_k$, we use the Expectation Maximization (EM) algorithm [33]. For randomly initialized parameters $\mu_k$, $\sigma_k$, and $\lambda_k$, the EM algorithm evaluates the posterior probability $p_k(n_i)$ as

$$p_k(n_i) = \frac{\lambda_k \, \mathcal{N}(n_i \mid \mu_k, \sigma_k^2)}{\sum_{j=0}^{K-1} \lambda_j \, \mathcal{N}(n_i \mid \mu_j, \sigma_j^2)}. \qquad (2.2)$$

Then the parameters $\mu_k$, $\sigma_k$, and $\lambda_k$ are updated by using $p_k(n_i)$. This process is applied iteratively to yield the optimal parameters. Figure 2.3(c) shows the resulting probability distribution of $p_1(n_i)$, where we see that the patches corresponding to the glass regions are assigned relatively high probabilities.

However, the patches corresponding to complex scene structures, such as the trees in the far background, also have high probabilities, since a laser pulse may produce multiple echo pulses due to arbitrary diffuse reflection. In addition, some small glass regions located far from the scanner are also assigned high probabilities, but their reflection artifacts are negligible.


Therefore, in order to select only the region on the glass plane yielding the dominant reflection artifact, we compute a reliability for each patch. Let us define a set of points $\mathcal{C} = \{\mathbf{c}_i\}$, where $\mathbf{c}_i$ is the closest point to the scanner among all the points projected to $\mathcal{S}_i$. Then we define the set $\mathcal{C}_{\text{candidates}} \subseteq \mathcal{C}$ as

$$\mathcal{C}_{\text{candidates}} = \{\mathbf{c}_i \mid p_1(n_i) > p_0(n_i), \ \mathbf{c}_i \in \mathcal{C}\}. \qquad (2.3)$$

Since the points sampled on the dominant glass plane should have smaller distances from the scanner than the transmission and virtual points, we assume that $\mathcal{C}_{\text{candidates}}$ consists of the points belonging to the glass patches and some complex objects. By applying RANSAC [34] to fit a plane to $\mathcal{C}_{\text{candidates}}$, we estimate the glass plane $\Pi$. Then we define a reliability $\rho_i$ for each patch $\mathcal{S}_i$ by weighting the probability $p_1(n_i)$ as

$$\rho_i = e^{-d_i} \, p_1(n_i), \qquad (2.4)$$

where $d_i$ is the Euclidean distance between $\Pi$ and $\mathbf{c}_i$. If $\mathbf{c}_i$ is close to $\Pi$, we assign a high reliability to $\mathcal{S}_i$; on the contrary, if $\mathbf{c}_i$ deviates from $\Pi$ too much, we assign a low reliability to $\mathcal{S}_i$. Figure 2.3(d) shows the resulting reliability map, where only the dominant and closest glass plane is highlighted, while the high probabilities $p_1(n_i)$ associated with the far and small glass regions, the trees, and the ground are suppressed.
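
The plane fit and the reliability weighting of Eq. (2.4) can be sketched as follows; the RANSAC routine here is a simple hand-rolled version for illustration, with the candidate points c_i passed as an N x 3 array (the thresholds and iteration counts are assumed values, not those of the original implementation).

```python
import numpy as np

def ransac_plane(points, n_iter=500, inlier_thresh=0.05, seed=0):
    """Fit a plane a*x + b*y + c*z + d = 0 to the candidate points with RANSAC.
    Returns the unit normal n and offset d of the plane with the most inliers."""
    rng = np.random.default_rng(seed)
    best_inliers, best_plane = -1, None
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:                      # degenerate (nearly collinear) sample
            continue
        n = n / norm
        d = -np.dot(n, p0)
        inliers = int((np.abs(points @ n + d) < inlier_thresh).sum())
        if inliers > best_inliers:
            best_inliers, best_plane = inliers, (n, d)
    return best_plane

def patch_reliability(p1_glass, c_points, plane):
    """Reliability rho_i = exp(-d_i) * p_1(n_i), Eq. (2.4), for every patch."""
    n, d = plane
    d_i = np.abs(c_points @ n + d)           # distance of c_i to the glass plane Pi
    return np.exp(-d_i) * p1_glass
```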


4 Virtual Point Detection

We detect and remove the virtual points associated with the glass patches estimated in Section 3. Figure 2.5 illustrates the situation when reflection occurs on a glass plane, where the virtual points 𝐩 and 𝐫 correspond to the real points 𝐩̂ and 𝐫̂. Note that the virtual points are always created at the opposite side of the glass plane from the corresponding real points, i.e., located behind the glass as seen from the scanner. We divide the space of a target scene into Ωfront and Ωback by taking the glass plane Π as the boundary, where Ωfront contains the scanner location. Then we detect the virtual points only within the half space Ωback, since Ωfront contains real points only. A point 𝐩 ∈ Ωback is highly probable to be a virtual point when 1) it is projected to a patch with a high reliability, and 2) there is a corresponding real point 𝐩̂ ∈ Ωfront which has a symmetry relation to 𝐩 about the glass plane Π and yields a similar geometric feature to 𝐩.

4. 1 Reflection Symmetry

For a given point $\mathbf{p} \in \Omega_{\text{back}}$, we first evaluate a symmetry score $\gamma_{\text{symmetry}}(\mathbf{p})$ which measures how close an actually acquired point in $\Omega_{\text{front}}$ is to the symmetric position of $\mathbf{p}$. We find the symmetric position of $\mathbf{p}$ about the glass plane $\Pi$ using the Householder matrix [35], which describes a linear transformation of reflection about a plane. The Householder matrix $\mathbf{A}$ for a given plane is defined as $\mathbf{A} = \mathbf{I} - 2\mathbf{n}\mathbf{n}^T$, where $\mathbf{I}$ is the identity matrix and $\mathbf{n}$ is the unit normal vector of the plane. Hence, with the plane equation $ax + by + cz + d = 0$, it is given in homogeneous coordinates by

$$\mathbf{A} = \begin{bmatrix} 1 - 2a^2 & -2ab & -2ac & -2ad \\ -2ab & 1 - 2b^2 & -2bc & -2bd \\ -2ac & -2bc & 1 - 2c^2 & -2cd \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (2.5)$$

Note that the Householder matrix is orthogonal:

$$\mathbf{A}^T\mathbf{A} = (\mathbf{I} - 2\mathbf{n}\mathbf{n}^T)^T (\mathbf{I} - 2\mathbf{n}\mathbf{n}^T) = \mathbf{I} - 4\mathbf{n}\mathbf{n}^T + 4\mathbf{n}\mathbf{n}^T\mathbf{n}\mathbf{n}^T = \mathbf{I}, \qquad (2.6)$$

since $\mathbf{n}\mathbf{n}^T$ is a symmetric matrix and $\mathbf{n}^T\mathbf{n} = 1$. Similarly, $\mathbf{A}\mathbf{A}^T = \mathbf{I}$. Then we have the relation $\hat{\mathbf{p}} = \mathbf{A}_\Pi \mathbf{p}$, where $\mathbf{A}_\Pi$ is the Householder matrix of the glass plane $\Pi$. Homogeneous coordinates are used to represent the translation of the plane.
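
A small numerical sketch of this reflection, assuming the plane coefficients (a, b, c, d) are given with a unit normal; the example plane and point below are hypothetical.

```python
import numpy as np

def householder_reflection(a, b, c, d):
    """4x4 homogeneous reflection matrix about a*x + b*y + c*z + d = 0, as in Eq. (2.5).
    (a, b, c) is assumed to be a unit normal."""
    n = np.array([a, b, c])
    A = np.eye(4)
    A[:3, :3] -= 2.0 * np.outer(n, n)   # the 3x3 Householder part I - 2 n n^T
    A[:3, 3] = -2.0 * d * n             # translation induced by the plane offset d
    return A

A = householder_reflection(0.0, 1.0, 0.0, -2.0)   # hypothetical glass plane y = 2
p = np.array([1.0, 5.0, 0.0, 1.0])                # a point in homogeneous coordinates
p_hat = A @ p                                     # reflected point (1, -1, 0, 1)

# The 3x3 part is orthogonal, A^T A = I, as shown in Eq. (2.6).
assert np.allclose(A[:3, :3].T @ A[:3, :3], np.eye(3))
```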

We use a $k$-d tree to find the closest point $\mathbf{q} \in \Omega_{\text{front}}$ to $\hat{\mathbf{p}}$, and compute

$$\gamma_{\text{symmetry}}(\mathbf{p}) = e^{-\frac{\|\hat{\mathbf{p}} - \mathbf{q}\|}{\beta_1}}, \qquad (2.7)$$

where $\|\hat{\mathbf{p}} - \mathbf{q}\|$ is the Euclidean distance between $\hat{\mathbf{p}}$ and $\mathbf{q}$.

Figure 2.5: Symmetry relation of reflection between a pair of real point and virtual point.
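
The symmetry score can then be evaluated for all points behind the glass with a k-d tree query; below is a minimal sketch in Python, assuming the point sets are given as numpy arrays and using β1 = 1.5, the value used in the experiments for most models.

```python
import numpy as np
from scipy.spatial import cKDTree

def symmetry_scores(points_back, points_front, A_plane, beta1=1.5):
    """Symmetry score of Eq. (2.7) for every point p in Omega_back.

    points_back  : (N, 3) candidate points behind the glass plane
    points_front : (M, 3) acquired real points in front of the glass plane
    A_plane      : 4x4 homogeneous Householder matrix of the glass plane
    """
    # Reflect every back point about the glass plane: p_hat = A_Pi * p.
    homog = np.hstack([points_back, np.ones((len(points_back), 1))])
    p_hat = (homog @ A_plane.T)[:, :3]

    # Closest actually acquired front point q to each mirrored position p_hat.
    dist, _ = cKDTree(points_front).query(p_hat)

    # High score when a real point lies near the mirrored position.
    return np.exp(-dist / beta1)
```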

4. 2 Geometric Similarity

The symmetry score $\gamma_{\text{symmetry}}(\mathbf{p})$ may not be sufficient to detect the virtual points faithfully. For example, as shown in Figure 2.5, a real point $\mathbf{t} \in \Omega_{\text{back}}$ has a relatively high score $\gamma_{\text{symmetry}}(\mathbf{t})$, since we can find another real point $\mathbf{s} \in \Omega_{\text{front}}$ which yields a small distance $d(\hat{\mathbf{t}}, \mathbf{s})$, where $\hat{\mathbf{t}} = \mathbf{A}_\Pi \mathbf{t}$.

Therefore, we also evaluate a geometric similarity $\gamma_{\text{similarity}}(\mathbf{p})$ between $\mathbf{p}$ and $\mathbf{q}$ using a 3D feature descriptor. Note that the employed 3D feature descriptor should be reflection invariant, since the two points $\mathbf{p}$ and $\mathbf{q}$ are assumed to have the relation of reflection symmetry.

Among many 3D feature descriptors [36] [37] [38], we use the Fast Point Feature Histogram (FPFH) [39]. For a given query point $\mathbf{p}_i$, FPFH computes the three angular variations $(\alpha, \phi, \theta)$ for all of the 50 nearest neighboring points $\mathbf{p}_j$ of $\mathbf{p}_i$ in terms of the Euclidean distance, as shown in Figure 2.6. The FPFH vector $\Phi(\mathbf{p})$ at $\mathbf{p}$ is defined as the histogram of the three angular variations. We prove the reflection invariance of FPFH. Let $\hat{\mathbf{p}}_i = \mathbf{A}\mathbf{p}_i$ and $\hat{\mathbf{p}}_j = \mathbf{A}\mathbf{p}_j$ be the reflected points of $\mathbf{p}_i$ and $\mathbf{p}_j$, respectively. Then the axes satisfy $\hat{\mathbf{v}} = \mathbf{A}\mathbf{v}$, $\hat{\mathbf{w}} = \mathbf{A}\mathbf{w}$, and $\hat{\mathbf{u}} = \mathbf{A}\mathbf{u}$. Let us denote the normals of $\mathbf{p}_i$ and $\mathbf{p}_j$ as $\mathbf{n}_i$ and $\mathbf{n}_j$, respectively; then we also have $\hat{\mathbf{n}}_i = \mathbf{A}\mathbf{n}_i$ and $\hat{\mathbf{n}}_j = \mathbf{A}\mathbf{n}_j$. The angular variation $\alpha$ is given by $\alpha = \langle \mathbf{v}, \mathbf{n}_j \rangle$. The angular variation $\hat{\alpha}$ associated with the two reflected points $\hat{\mathbf{p}}_i$ and $\hat{\mathbf{p}}_j$ can be derived as

$$\hat{\alpha} = \langle \hat{\mathbf{v}}, \hat{\mathbf{n}}_j \rangle = \langle \mathbf{A}\mathbf{v}, \mathbf{A}\mathbf{n}_j \rangle = (\mathbf{A}\mathbf{v})^T \mathbf{A}\mathbf{n}_j = \mathbf{v}^T \mathbf{A}^T \mathbf{A} \mathbf{n}_j = \mathbf{v}^T \mathbf{n}_j = \langle \mathbf{v}, \mathbf{n}_j \rangle = \alpha, \qquad (2.8)$$

since $\mathbf{A}^T\mathbf{A} = \mathbf{I}$ by (2.6). Similarly, the other angular variations are given by $\phi = \langle \mathbf{u}, \frac{\mathbf{p}_i - \mathbf{p}_j}{\|\mathbf{p}_i - \mathbf{p}_j\|} \rangle$ and $\theta = \arctan(\langle \mathbf{w}, \mathbf{n}_j \rangle, \langle \mathbf{u}, \mathbf{n}_j \rangle)$, and we can prove $\hat{\phi} = \phi$ and $\hat{\theta} = \theta$ similarly. Consequently, FPFH is reflection invariant.

Figure 2.6: Fast Point Feature Histogram, which describes the three angular variations $(\alpha, \phi, \theta)$ associated with the two points $\mathbf{p}_i$ and $\mathbf{p}_j$.

We compute the similarity score $\gamma_{\text{similarity}}(\mathbf{p})$ as

$$\gamma_{\text{similarity}}(\mathbf{p}) = e^{-\frac{H(\Phi(\mathbf{p}), \Phi(\mathbf{q}))}{\beta_2}}, \qquad (2.9)$$

where $\Phi(\mathbf{p})$ is the FPFH vector at $\mathbf{p}$, and $H(\Phi(\mathbf{p}), \Phi(\mathbf{q}))$ is the Hellinger distance [40] defined by

$$H(\Phi(\mathbf{p}), \Phi(\mathbf{q})) = \sqrt{\frac{1}{2} \sum_i \left( \sqrt{\Phi_i(\mathbf{p})} - \sqrt{\Phi_i(\mathbf{q})} \right)^2}, \qquad (2.10)$$

where $\Phi_i(\mathbf{p})$ denotes the $i$-th element of $\Phi(\mathbf{p})$.
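
Given two FPFH vectors (computed with any point cloud library), the similarity score of Eqs. (2.9) and (2.10) reduces to a few lines; β2 = 1.5 is the value used for most models in the experiments, and the function names are illustrative.

```python
import numpy as np

def hellinger(phi_p, phi_q):
    """Hellinger distance between two FPFH histograms, Eq. (2.10)."""
    phi_p, phi_q = np.asarray(phi_p, float), np.asarray(phi_q, float)
    return np.sqrt(0.5 * np.sum((np.sqrt(phi_p) - np.sqrt(phi_q)) ** 2))

def similarity_score(phi_p, phi_q, beta2=1.5):
    """Geometric similarity score of Eq. (2.9)."""
    return np.exp(-hellinger(phi_p, phi_q) / beta2)
```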

Figure 2.7: Scores for virtual points. (a) An input LS3DPC model. (b) The distribution of the score $\gamma(\mathbf{p})$, where the red and blue colors depict high and low scores, respectively.

4. 3 Detection of Virtual Points

We combine the symmetry score and the similarity score to compute a final score $\gamma(\mathbf{p})$ given by

$$\gamma(\mathbf{p}) = \gamma_{\text{symmetry}}(\mathbf{p}) \, \gamma_{\text{similarity}}(\mathbf{p}). \qquad (2.11)$$

Figure 2.7 shows the resulting scores, where we set $\gamma(\mathbf{p}) = 0$ for $\mathbf{p} \in \Omega_{\text{front}}$. We see that a point $\mathbf{p} \in \Omega_{\text{back}}$ yields a high score $\gamma(\mathbf{p})$ when it has a corresponding point $\mathbf{q} \in \Omega_{\text{front}}$ which yields a similar geometric feature to $\mathbf{p}$ and is close to $\mathbf{A}_\Pi \mathbf{p}$. We basically separate the virtual points from a LS3DPC model using the resulting scores by assigning a binary label $l_i$ to each point $\mathbf{p}_i$ such that $l_i = 1$ when $\mathbf{p}_i$ is virtual and $l_i = 0$ when $\mathbf{p}_i$ is real.

Figure 2.8: Estimation of glass regions associated with dominant reflection artifacts. In each subfigure, a color panorama image and the resulting reliability distribution are shown at the top and bottom, respectively. (a) Architecture Building. (b) Engineering Building. (c) Natural Science Building. (d) Botanical Garden. (e) Gymnasium. (f) Terrace.

However, as shown in Figure 2.5, a virtual point $\mathbf{r}$ can be generated by reflection while its corresponding real point $\hat{\mathbf{r}}$ is not actually acquired by the scanner due to the occlusion by the cactus. In such a case, $\mathbf{r}$ may not be detected as a virtual point due to a low score $\gamma(\mathbf{r})$. To overcome this issue, we formulate an energy function given by

$$E(\mathcal{L}) = \sum_i D_i + \tau \sum_i \sum_{\mathbf{p}_j \in N_i} V_{ij}, \qquad (2.12)$$

where $\mathcal{L} = \{l_i\}$ is the set of all labels. The data cost $D_i$ is defined by

$$D_i = \begin{cases} -\rho(\mathbf{p}_i)\,\gamma(\mathbf{p}_i), & l_i = 1 \\ -\bigl(1 - \rho(\mathbf{p}_i)\,\gamma(\mathbf{p}_i)\bigr), & l_i = 0 \end{cases} \qquad (2.13)$$

where $\rho(\mathbf{p}_i)$ is the reliability of the patch to which $\mathbf{p}_i$ is projected. By multiplying the reliability, we effectively detect the points associated with the glass patches of dominant reflection. $V_{ij}$ is the smoothness cost that forces the neighboring points to have the same labels, and $N_i$ is the set of neighboring points of $\mathbf{p}_i$. We find the 48 nearest neighboring points using a $k$-d tree, but omit the points $\mathbf{p}_j$ having Euclidean distances $d(\mathbf{p}_i, \mathbf{p}_j)$ larger than 0.1% of the diagonal of the bounding box of a given LS3DPC model. We select these parameters empirically by testing the performance with the number of neighboring points varied from 16 to 96 and the threshold for $d(\mathbf{p}_i, \mathbf{p}_j)$ varied from 0.01% to 1%, respectively. $V_{ij}$ is computed by

$$V_{ij} = \begin{cases} e^{-\frac{d(\mathbf{p}_i, \mathbf{p}_j)}{\beta_1}} \, e^{-\frac{H(\Phi(\mathbf{p}_i), \Phi(\mathbf{p}_j))}{\beta_2}}, & l_i \neq l_j \\ 0, & \text{otherwise.} \end{cases} \qquad (2.14)$$

We use the Iterated Conditional Modes (ICM) [41] to obtain an optimal solution of (2.12). Finally, we remove the detected virtual points from an input LS3DPC model.
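
A minimal sketch of the ICM-style optimization of Eq. (2.12), assuming the combined per-point scores ρ(p_i)γ(p_i), the neighbor lists N_i, and the pairwise weights e^(-d/β1) e^(-H/β2) have been precomputed; the data layout and names are assumptions made for the example.

```python
import numpy as np

def icm_labels(score, neighbors, weights, tau=1.0, n_iter=10):
    """Iterated Conditional Modes for the binary labeling of Eq. (2.12).

    score     : (N,) array of rho(p_i) * gamma(p_i)
    neighbors : list where neighbors[i] is an index array of N_i
    weights   : list where weights[i][k] = exp(-d/beta1) * exp(-H/beta2)
                for the k-th neighbor of point i
    """
    labels = (score > 0.5).astype(int)        # initialization from the data term alone
    for _ in range(n_iter):
        changed = False
        for i in range(len(score)):
            # Data costs for l_i = 1 (virtual) and l_i = 0 (real), Eq. (2.13).
            d1, d0 = -score[i], -(1.0 - score[i])
            # Smoothness cost is paid for every neighbor carrying a different label.
            nb_labels = labels[neighbors[i]]
            cost1 = d1 + tau * weights[i][nb_labels != 1].sum()
            cost0 = d0 + tau * weights[i][nb_labels != 0].sum()
            new_label = int(cost1 < cost0)
            if new_label != labels[i]:
                labels[i], changed = new_label, True
        if not changed:                       # reached a local minimum of E(L)
            break
    return labels                             # l_i = 1 marks a detected virtual point
```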


5 Experimental Results

We acquire LS3DPCs by capturing real-world outdoor scenes including glass using a 3D terrestrial LiDAR scanner, RIEGL VZ-400 [31], with the angular resolutions $\omega_{\text{azimuthal}} = 0.06°$ and $\omega_{\text{polar}} = 0.06°$. We evaluate the performance of the proposed algorithm on the six LS3DPC models shown in Figure 2.8: ‘Architecture Building,’ ‘Engineering Building,’ ‘Natural Science Building,’ ‘Botanical Garden,’ ‘Gymnasium,’ and ‘Terrace,’ where a single LS3DPC model has approximately 5–6 million points in general. $\beta_1$ and $\beta_2$ in (2.7) and (2.9) are empirically chosen as 0.5 and 0.5 for the ‘Botanical Garden’ model and 1.5 and 1.5 for the other models. $\tau$ in (2.12) is set to 1.

We first show the estimated glass regions in Figure 2.8, where we see that the glass patches on the dominant glass planes are detected successfully in most cases. Note that no points are sampled over a large area of the dominant glass plane in Figure 2.8(a), since the LiDAR scanner is located too close to the glass. When outdoor scenes are captured, the sky is often reflected on the glass. In such cases, however, no valid real points are sampled for the sky due to the limited capturing distance, and therefore no virtual points associated with the sky region are generated.

Figure 2.9 shows the results of reflection removal for LS3DPCs. In the ‘Architecture Building’ model, some trees are not removed since they are reflected on the glass region where no sampled points are obtained, as shown in Figure 2.8(a). However, most of the reflected building and trees, including a reflected person, are detected and removed successfully despite a massive absence of sampled points on the glass plane. As shown in Figure 2.8(b), most of the reflection artifacts are well removed, including the building, trees, and ground. However, some part of the reflected building shown in the second column still remains, since the corresponding glass patches are assigned relatively low reliability values due to the lack of sampled points. In the ‘Natural Science Building’ model, a linear region of real trees that intersects with the extended glass plane is also classified as virtual points, since the trees have multiple echo pulses due to complex scene structures and also yield short distances to the glass plane. However, the removal of these points is inconspicuous in the reconstructed 3D model. Also note that some virtual points of the reflected building are not removed due to the lack of corresponding real points caused by occlusion.

Table 2.1: Processing time of the proposed algorithm. (a) ‘Architecture Building,’ (b) ‘Engineering Building,’ (c) ‘Natural Science Building,’ (d) ‘Botanical Garden,’ (e) ‘Gymnasium,’ and (f) ‘Terrace.’

Model | Number of points | Glass region estimation (sec.) | Descriptor computation (sec.) | Virtual point detection (sec.) | Total (sec.)
(a) | 5,562,972 | 15.9 | 43.0 | 34.9 | 93.7
(b) | 9,720,671 | 10.7 | 72.5 | 51.6 | 134.8
(c) | 4,913,710 | 12.8 | 37.2 | 4.7 | 54.7
(d) | 6,140,383 | 10.7 | 49.8 | 11.8 | 72.3
(e) | 5,609,449 | 10.0 | 43.3 | 16.6 | 69.9
(f) | 5,000,902 | 9.5 | 38.2 | 24.7 | 72.4

The ‘Botanical Garden’ model exhibits a relatively complex scene in which similar trees appear on both sides of the glass plane. In this model, the building and trees reflected on the glass are removed, while the real trees in the garden survive successfully.

In addition, we measure the processing time of the proposed algorithm on an Intel i7-4790K processor (4.38 GHz), and provide the results in Table 2.1. Note that the processing times of glass region estimation and virtual point detection are not linearly proportional to the number of points. The descriptor computation to obtain the normals and FPFH consumes more than half of the total processing time in most cases. However, the pre-computed descriptors can also be used for further processing of point clouds in various applications.


Figure 2.9: Results of reflection removal for LS3DPCs. In each subfigure, the top row shows an input LS3DPC model, the middle row visualizes the estimated glass regions (yellow) and the detected virtual points (red), and the bottom row shows the resulting LS3DPC model where the virtual points are removed. (a) Architecture Building. (b) Engineering Building. (c) Natural Science Building. (d) Botanical Garden.


Chapter 3

Reflection Removal for Large-Scale 3D Point Clouds with Multiple Glass Planes

1 Introduction

Our preliminary research in Chapter 2 successfully removes virtual points in LS3DPCs. However, we assumed that only a single dominant glass plane exists in each LS3DPC model. When there are multiple glass planes, the method in Chapter 2 selects one of them as a glass plane and removes the virtual points associated with the selected plane only. Also, some glass regions are assigned low reliability values since the method in Chapter 2 does not consider the minimum acquisition distance of LiDAR scanner.

In this chapter, we generalize our previous work introduced in Chapter 2 and propose a novel virtual point removal algorithm for LS3DPCs captured with multiple glass planes. We investigate the capturing mechanism of terrestrial LiDAR scanner and analyze the number of received echo pulses for each emitted laser pulse to estimate multiple glass regions, respectively. Then, for each point, we traverse the trajectories of laser pulses reflected by multiple glass planes, and evaluate the validity of trajectories to select an optimal trajectory associated with a virtual point. We evaluate the performance of the proposed algorithm using various LS3DPC models for indoor and outdoor real-world scenes captured by a terrestrial LiDAR scanner. We show that the proposed algorithm estimates the multiple glass regions accurately, and detects the virtual points associated with multiple glass regions successfully.

The contributions of this chapter compared to our preliminary research [42] are as follows.

• The existing method [42] assumes a single glass plane in a LS3DPC model and thus removes the reflection artifacts associated with a single dominant glass plane only. In contrast, the proposed algorithm estimates the multiple glass regions distributed over multiple glass planes in a LS3DPC model, and detects the virtual points associated with each glass plane, respectively.

• We also improve the accuracy of the glass region estimation method in [42] by computing reliability values with respect to multiple glass planes, respectively, and by refining the estimated glass regions.

• We propose a novel virtual point detection algorithm by recursively traversing multiple trajectories of laser pulses derived from multiple glass planes.

• We perform intensive experiments on more diverse LS3DPC models with multiple glass planes. In addition, we also evaluate the quantitative performance of virtual point removal by simulating the glass reflection artifact using a synthetic 3D model.

The remainder of this chapter is organized as follows. Section 2 proposes the multiple glass region estimation algorithm, Section 3 explains the virtual point removal algorithm, and Section 4 presents the experimental results.


2 Glass Region Estimation

In a typical 2D glass image captured with a finite field of view, the glass region, where the reflection occurs, usually corresponds to the entire image area. However, LiDAR scanners capture the 360° scene of a real-world environment in which multiple panes of glass may be observed, and therefore multiple glass regions are located at different positions in a single LS3DPC model. In order to remove the reflection artifacts from a LS3DPC, we first find the multiple glass regions using the characteristics of LiDAR scanning.

2. 1 Reflection Characteristics of LiDAR

During LiDAR scanning, a laser pulse is emitted periodically with predefined azimuthal and polar angular resolutions. A single 3D point is created for each emitted laser pulse since, in most cases, the light is reflected from a real-world object only once. However, when a laser pulse hits the glass plane as shown in Figure 2.1(b), it may produce more than two points: one sampled on a real object in front of the glass associated with the laser pulse reflected by glass, and the others sampled on the glass plane and/or sampled on a real object behind the glass associated with the laser pulse penetrated through glass.

Therefore, we basically count the number of points created from each laser pulse to estimate the glass regions.

In practice, we consider the unit sphere with the center at the scanner location, and we partition the surface of the sphere into local patches as shown in Figure 3.1, where each rectangular area covers the angular range of 𝜔azimuthal× 𝜔polar. By using the measured distance and the associated azimuthal and polar angles of an emitted laser pulse, the coordinates of 3D point are computed in the spherical coordinate system. We project the 3D points onto the surface of the unit sphere and count the number of projected points to each patch. Then we classify the patches into two categories of the ordinary patch including only a single point and the glass patch having two or more points.

However, the points are obtained based on whisk broom scanning which samples one point at a certain time instance by mechanically rotating the scanner. To reduce the effect of sampling error, we employ a scaling factor 𝜏 defining a larger surface patch covering a wider angular range of 𝜏𝜔azimuthal× 𝜏𝜔polar. We empirically set 𝜏 = 3, the smallest integer yielding reliable performance of glass region estimation, as depicted by the red rectangular area in Figure 3.1.


Figure 3.1: Partitioning of the unit sphere into local surface patches.

Let $g_i$ be the number of points projected to the $i$-th surface patch $\mathcal{S}_i$. Figure 3.2(b) visualizes the distribution of $g_i$'s for the target scene shown in Figure 3.2(a). We see that the ordinary patches usually exhibit $g_i = \tau^2$ since each patch is associated with approximately $\tau^2$ laser pulses. However, the glass patches tend to have $g_i > \tau^2$ due to multiple echo pulses caused by reflection. Figure 3.2(c) shows the histogram of $g_i$'s counted over all the patches having valid projected points, i.e., $g_i > 0$. We observe two strong peaks at $g_i = 9$ and $g_i = 18$. The first peak at $g_i = 9$ implies that $\tau^2$ laser pulses produce the same number of points, mainly associated with the ordinary patches. On the other hand, the second peak at $g_i = 18$ represents the patches with $2\tau^2$ projected points, where each laser pulse generates two points on average by reflection. We also see a weak peak at $g_i = 27$, where a laser pulse creates three points on average.

We model the distribution of $g_i$ using a mixture of $B$ Gaussian distributions [32]. The density of the Gaussian mixture model is given by

$$f(g_i) = \sum_{b=0}^{B-1} \kappa_b \, \mathcal{N}(g_i \mid \mu_b, \sigma_b^2), \qquad (3.1)$$

Figure 3.2: Distribution of the number of projected points in ‘International Hall’ model. (a) A panoramic image for the target scene. (b) The distribution of the number of projected points $g_i$. (c) The histogram of $g_i$'s. (d) The distribution of the posterior probability $p_1(g_i)$.
