ABSTRACT:
Usually aerial imagery are acquired through perspective projection which results in relief displacement in the map. Therefore, or-thophotos from large scale aerial images over urban area often suffer from double mappings caused by sudden elevation changes in the scene. In the generation of true orthophotos, one aims at removing these perspective displacements and occlusions. Hence, knowledge of the surface shape is needed, as it has great impacts both on the orthorectification and occlusion identification. Mostly, the relief dis-placements are caused by buildings. In this paper, we present a method for the generation of true orthophotos from overlapping aerial images of known orientation, especially in built-up areas. Apart from the aerial images, the method does not depend on additional data sources. In particular, we focus on the derivation of building roof outline model using image segmentation and edge detection techniques under the guide of the classification result of point cloud from image matching . Experiment results on real data show the feasibility and the performance of the method.
1. INTRODUCTION
Aerial images are subject to perspective distortion and therefore cannot be used directly as a map with homogeneous scale. There-fore, orthorectification is carried out. For smooth terrain, the tra-ditional digital orthorectification works well as there is no rapid elevation change. However, for large scale imagery of urban areas with complex buildings, the distortion produces unavoidable ar-tifacts in the form of double mapped areas, therefore reduces the usefulness and reliability of the orthomap. Nowadays, the term true orthophoto is generally used for an orthophoto where surface elements that are not included in the digital terrain model are also rectified to the orthogonal projection (Amhar et al., 1998).
For generation of a true orthophoto two aspects need to be con-sidered. Firstly, the correct geometric surface description is nec-essary. Secondly, if the image ray has more than one intersection with the objects, including the terrain, resulting in hidden areas and double mapping.
For the visibility analysis, additional description of the objects on the terrain is required. The quality of the man-made objects model influences the result because it shows how well the occlu-sion can be detected, thus how well the quality of true orthophoto can be achieved. Hence, in many cases, in addition to the ter-rain model, the Digital Building Model (DBM) need to be intro-duced. The larger the image scale is, the more objects must be in-cluded in the surface model (Amhar et al., 1998). Consequently, the more preprocessing and more expensive price it requires for the true orthophoto production. There are different DBM levels. Concerning the building shape and the aim of visibility analysis, only the building roofs matter, as only the tops shall be mapped. Therefore, a simple building model including the roof informa-tion is sufficient.
Such artifacts can be removed by identification of occlusion caused by buildings. Generally existing methods for occlusion detection can be divided into the z-buffer based method (Catmull, 1974,
∗Corresponding author. [email protected]
Amhar et al., 1998, Rau et al., 2000, Rau et al., 2002), polygon based method (Kuzmin et al., 2004) and angle based ray tracing method (Bang et al., 2007). After that, the true orthophoto can be obtained by filling up the hidden area by overlapped image pixels.
The contribution of this article is that we
• exploit the capabilities of current image matching algorithms
to get dense point clouds describing the visible surfaces,
• classify these point clouds to obtain buildings, and finally
• refine these building models with color information from the
original perspective images.
In order to obtain a true ortho photo mosaick additionally occlu-sion detection based on a state of the art method is applied.
2. THE TRUE ORTHOPHOTO PROCESS
2.1 Overview
Figure 1: The entire processing of producing true orthophoto.
2.2 Coarse building outline acquisition
We take point clouds from dense matching of multi-view aerial images as input. The program package SCOP++ (Pfeifer et al., 2001) provides a robust hierarchic filtering algorithm to filter and classify point clouds. We extract points on buildings from the classification (see fig. 2).
The problems are that points on building facades might also be collected and that points have not been grouped according to the building roof they belong to. Therefore, we construct a Delaunay triangulation on the building points dataset and build constraints for the triangles to eliminate the facade points and group the roof points to each building roof. First, The normals of triangles con-sisting of points on the wall are not perpendicular to the ground. These points and triangles should be deleted. Secondly, the dis-tance between points from the same roof is usually smaller than from different roofs, these triangles are in a form of narrow trian-gles. Narrow triangles and points are removed.
These outlines (see fig. 3) cannot directly be used for the building models because the point clouds from matching are not necessar-ily distributed well on the roof and edges and so far the process-ing is quite rough so that these point sets reflect the true buildprocess-ing outlines only coarsely. However, they can still provide an approx-imate extent for each building.
Figure 2: Points on the building roof due to the classifcation of SCOP ++ in a 3D view.
Figure 3: Roof outline extent after eliminating the wrong points.
2.3 Simple building model refinement
As mentioned above, we aim to get more accurate building roof outlines. As the outline acquired in such a coarse way is merely approximate, they should be refined in the following processing. The color information contained in the aerial images shall not be disregarded as it provides potentials to improve the outlines. The building outlines from the last step can be used in the determina-tion of search range for the building edges in the aerial images.
In order to derive precise outlines, we project the building roof points back to image space using the known camera orientations. We aggregate the points within such an outline region for each building. Then, we apply a buffer distance to each region in or-der to make sure that it fully contains the respective building, re-sulting in a building mask. Applying the unsupervised K-means segmentation on each region, we determine areas of similar color. We hence pick up the largest segment of the result.
There are two cases. First, the largest cluster occupying a large area of the searching masked region (we set the ’large’ threshold, i.e. 75%) is treated as a correct roof detection. Second, the largest area of clusters is not large enough (less than 75% of the extent area), which reflects that probably the roof pixels have been seg-mented to more than one cluster (this mostly happens for com-plex roofs such as ridge roofs), we merge segments close to the center of each such area, resulting in a large segment that repre-sents the building roof. Finally, we eliminate small holes due to
Figure 5: a sample of points originally classified as on buildings (magenta) vs. higher quality building outlines (cyan) found by image segmentation of unmasked regions.
e.g. chimneys and roof-lights by morphological operations. In that way, the correct spatial extent of building roofs is detected. In the middle column of fig. 4, one can see that both simple and complex buildings can be detected.
The boundary (i.e. the last column of fig. 4) is hence obtained. For each pixel along the boundary of a building roof region, we find the corresponding point in the point cloud from image matching as the one with the smallest distance when projected into image space. Via these procedures, we obtain a simple building model with roof outlines.
Figure. 5 shows the roof points from dense matching and the building roof edges from image segmentation according to the method we propose.
2.4 Visiblity analysis and true orthophoto generation
Now that an improved 3D building outline estimation is available, we can also refine the building roof points from classification in the first step. Points within the outline are kept as correct building roof classification result, while those outside of the outlines are removed from the dataset. Now the surface model can be derived based on the new point cloud. The grid size is set according to the ground sample distance of the original aerial image. In the meantime, each image is rectified to get the corresponding ortho-photo.
The occlusion detection is achieved through the angle based ray tracing algorithm (Habib et al., 2007). The angle based algorithm identifies occlusions by checking the off-nadir angle to the line of sight connecting the perspective center of the imaging sen-sor and the DSM cells. This algorithm has several advantages over the traditional z-buffer method: it is not sensitive to the sam-pling interval of the surface model and it does not require a de-tailed DBM. Afterwards, we generate the true orthophoto by fill-ing up the occluded areas with information from the neighborfill-ing orthophotos according to the visibility maps.
Figure 6: The overlap view of shading DSM and aerial image is underneath.
3. EXPERIMENTS AND DISCUSSION
In order to verify the performance of the proposed method, ex-periments are carried out using real data. The data consist of four digital aerial images which were captured by a DMC camera over Vaihingen, Germany, in the course of a DGPF-project (Cramer, 2010). The ground sampling distance is 8 cm. The interior ori-entation parameters of the camera and the exterior oriori-entation pa-rameters of the imaging sensor are available. Fig. 6 shows the DSM derived from the image matching points and refined build-ing boundary points with the aerial image as background.
By picking up the roof manually over three sample areas, we tes-tify how complete and how well the outlines fit. Results are listed in table 1. It reveals that for buildings which are detected, the mean of the errors in the horizontal direction are less than 12 cm, which is less than 1.5 times of the ground sample distance of the images. The mean of the errors in the vertical direction are of less than 3 times of the ground sample distance. The standard devia-tion for both in the horizontal and vertical direcdevia-tions are less than 15 cm. This approach handles complex building roof structures, because image texture matters most. Due to the same reason, it performs poorly on roofs covered or shadowed by trees.
Figure 4: two examples of detecting edges within the masked region from coarse building outline acquisition.
Table 1: The statistic of detected building outline compared with manually reference data Test area no. manually
selected building roofs
no. detected building roofs
Correct detection [%]
delta. mean of building roof corners (horizontal) [m]
delta. mean of building roof corners (vertical) [m]
Std. dev. building roof (hor-izontal) [m]
Std. dev. building roof (vertical) [m]
1 17 15 88.24 0.09 0.20 0.13 0.08
2 23 18 78.26 0.12 0.23 0.21 0.11
3 14 12 85.71 0.11 0.22 0.14 0.08
4. CONCLUSION
The generation of orthophotos from multi-view images is a pro-cess of coordinate transformation and geometric correction (Balt-savias, 1996). True orthophotos aim to eliminate the double map-pings caused by orthorectification and gives a better fit in appli-cation. In this research, we present a new automated process for producing true orthophotos from multi-view aerial images with-out additional digital building model.
The possibility of using building points from image matching in guiding the edge detection and outline refinement is discussed and tested. Results show that to a large extent objects are cor-rected to the right location and double mapping effects are de-tected and compensated.
ACKNOWLEDGEMENTS
This research has been carried out with the financial support of The China Scholarship Council(CSC) and the Austrian Science Fund (FWF).
REFERENCES
Amhar, F., Jansa, J., Ries, C. et al., 1998. The generation of true orthophotos using a 3d building model in conjunction with a conventional dtm. International Archives of Photogrammetry and Remote Sensing 32, pp. 16–22.
Baltsavias, E. P., 1996. Digital ortho-imagesa powerful tool for the extraction of spatial-and geo-information. ISPRS Journal of Photogrammetry and Remote sensing 51(2), pp. 63–77.
Bang, K., Habib, A. F., Shin, S. and Kim, K., 2007. Comparative analysis of alternative methodologies for true ortho-photo gener-ation from high resolution satellite imagery. ASPRS ANNUAL.
Catmull, E., 1974. A subdivision algorithm for computer display of curved surfaces. Technical report, DTIC Document.
Cramer, M., 2010. The dgpf-test on digital airborne cam-era evaluation–overview and test design. Photogrammetrie-Fernerkundung-Geoinformation 2010(2), pp. 73–82.
Figure 7: Test area original aerial images (1stcolumn) vs. orthophoto (2ndcolumn) vs. true orthophoto (3rdcolumn) vs. true orthophoto from manually captured DBM (4thcolumn)
Habib, A. F., Kim, E.-M. and Kim, C.-J., 2007. New methodolo-gies for true orthophoto generation. Photogrammetric Engineer-ing & Remote SensEngineer-ing 73(1), pp. 25–36.
Kuzmin, Y. P., Korytnik, S. A. and Long, O., 2004. Polygon-based true orthophoto generation. International Archives of Pho-togrammetry, Remote Sensing and Spatial Information Sciences 35(Part 3), pp. 529–531.
Pfeifer, N., Stadler, P. and Briese, C., 2001. Derivation of digital terrain models in the scop++ environment. In: Proceedings of OEEPE Workshop on Airborne Laserscanning and Interferomet-ric SAR for Detailed Digital Terrain Models, Stockholm, Swe-den, Vol. 3612.
Rau, J., Chen, N. and Chen, L., 2000. Hidden compensation and shadow enhancement for true orthophoto generation. In: Pro-ceedings of Asian Conference on Remote Sensing 2000, pp. 4–8.