Automation in Construction
Interaction analysis for vision-based activity identification of earthmoving excavators and dump trucks
Jinwoo Kim a, Seokho Chi a,b,⁎, Jongwon Seo c
a Department of Civil and Environmental Engineering, Seoul National University, 1 Gwanak-Ro, Gwanak-Gu, Seoul, Republic of Korea
b The Institute of Construction and Environmental Engineering (ICEE), 1 Gwanak-Ro, Gwanak-Gu, Seoul, Republic of Korea
c Department of Civil and Environmental Engineering, Hanyang University, 222 Wangsimni-Ro, Seongdong-Gu, Seoul, Republic of Korea
⁎ Corresponding author at: Department of Civil and Environmental Engineering, Seoul National University, 1 Gwanak-Ro, Gwanak-Gu, Seoul, Republic of Korea. E-mail addresses: [email protected] (J. Kim), [email protected] (S. Chi), [email protected] (J. Seo).
ARTICLE INFO
Keywords: Vision-based, Activity identification, Interaction, Earthmoving operation, Excavator, Dump truck
ABSTRACT
Activity identification is an essential step to measure and monitor the performance of earthmoving operations.
Many vision-based methods that automatically capture and explain activity information from image data have been developed, offering economic advantages and analysis efficiency. However, the previous methods failed to consider the interactive operations among equipment and thus had limited applicability to operation time estimation for productivity analysis. To address this drawback, this research developed a vision-based activity identification framework that incorporates the interactive aspects of earthmoving equipment operation. The framework includes four main processes: equipment tracking, action recognition of individual equipment, interaction analysis, and post-processing. The interactions between excavators and dump trucks were examined because of their significant impact on earthmoving operations. TLD (Tracking-Learning-Detection) was adapted to track the heavy equipment. Spatio-temporal reasoning and image differencing techniques were then implemented to categorize individual actions. Third, interactions were interpreted based on a knowledge-based system that evaluates equipment actions and the proximity between operating equipment. Lastly, outliers and noisy results were filtered out considering work continuity. To validate the proposed framework, two experiments were performed:
one with the interaction analysis and the other without it. A total of 11,513 image frames from actual earthmoving sites were tested. The resulting average precision of the activity analysis was enhanced from 75.68% to 91.27% after the interaction analysis was applied. In conclusion, this research contributes to identifying critical elements that explain interactive operations, characterizing the vision-based activity identification framework, and improving the applicability of the vision-based method for automated equipment operation analysis.
1. Introduction
Activity identification classifies types of equipment operations such as working, traveling, and idling. It is vital to monitor the performance of on-site earthmoving operations, as sequential information on equipment actions can serve as an indicator to calculate direct work rates and cycle durations [1–4]. Based on the information gained from activity identification, site managers can make project-related decisions (e.g., resource allocation, path planning, scheduling, and site layout analysis) more efficiently and effectively [5–9]. Such insight can also be utilized to assess productivity and reduce the idling time of earthmoving equipment; a slight reduction of the operation cycle has a significant impact on productivity because the operations are normally repetitive by nature [10].
In the past, activity identification and analysis have commonly been
performed manually. However, such a human-dependent approach has several salient shortcomings: it is expensive and time-consuming, and thus prone to yield an inconsistent set of data [5]. To overcome such limitations, automated activity identification systems have been introduced [11–12]. One of the most popular systems is the radio-based method, which obtains on-site data using RFID (Radio Frequency Identification), GPS (Global Positioning System), UWB (Ultra-Wideband), and BLE (Bluetooth Low Energy) [13–15]. The radio-based method categorizes equipment activities based on equipment types, locations, and movements (i.e., acceleration, velocity, orientation). On the other hand, many researchers also pay attention to vision-based methods that capture and explain activity information from image datasets [16–22]. Image data contains detailed information about the equipment's actions as well as its types and locations [23]. The action information is a critical cue for classifying operation types. That is the
information that radio signals have difficulty providing; a radio signal often cannot identify activity types precisely when equipment is located at the same position but performs different operations, such as a dump truck waiting for soil loading or an excavator rotating without a change in its center of gravity. Furthermore, the increased accessibility of image data, in line with the installation of CCTVs on construction sites for monitoring purposes [9,23], makes vision-based approaches more practical and feasible. For instance, in Korea, the installation of cameras on construction sites for on-site monitoring is prescribed by law [24].
Due to this increased attention and these inherent advantages, a range of vision-based methods have been developed, and they have shown promising performance. However, the previous methods did not fully investigate interactive operations among equipment, which is crucial for real-world activity identification and productivity analysis. To address this drawback, this research developed a vision-based activity identification framework that considers the interactive operations of earthmoving equipment. Fig. 1 illustrates the concept of interaction. An excavator and a dump truck on the right-hand side of Fig. 1 work together, loading and unloading soil, even though the dump truck 'stops'. If another dump truck arrives at the working area, as shown on the left-hand side, its activity should be categorized as 'idling' according to the concurrent working status of the other equipment. This indicates that the activity states of equipment can influence and be influenced by other equipment.
Many researchers have also perceived the importance of interactive operations, and several studies have made efforts to develop vision-based activity identification methods [10,25]. However, previous studies focused on certain qualities of interactions (e.g., interactions based on the equipment's proximity) under controlled environments (e.g., a fixed number of equipment in the image frame). This limited their findings from providing a comprehensive explanation of the complex interactions between multiple pieces of equipment. The present study aims to fill these existing knowledge gaps. First, this study reviews and identifies critical elements for understanding the interactive operations of excavators and dump trucks. Second, it proposes a vision-based framework for automated activity analysis of the heavy equipment through further consideration of the identified elements. Third, this framework can provide continuous information on the types of equipment operations, which is fundamental data for calculating cycle times and measuring equipment productivity. Finally, these approaches are expected to improve the practicality of vision-based activity analysis on actual earthmoving sites.
2. Theoretical background and related works
2.1. Earthmoving operations
Earthmoving operations are fundamental construction processes that relocate soil from one site to another [26–27]. A single earthmoving event involves a series of specific activities such as excavating, loading, hauling, and dumping the soil. In order to successfully perform such earthmoving operations, excavators and dump trucks are essential input resources.
Excavators are assigned to the construction site to cut soil and hand it over to dump trucks. During the operational process, excavators can carry out one of three types of activities, as illustrated in Fig. 2: 'working', 'traveling', and 'idling' [28–31]. Excavators are 'working' under the cycle of swinging, loading, hauling, and unloading; within this cycle, excavators go through specific individual actions, including 'scooping', 'rotating', and 'dropping' [32]. These sets of individual actions mean that excavators can either work alone (e.g., soil preparation and excavation) or interact with dump trucks (i.e., soil loading). The 'traveling' activity indicates that excavators are 'moving' to other positions, such as working areas or parking stations [33]. When excavators are neither 'working' nor 'traveling', it can be interpreted that they are 'idling' with 'stopping' actions [34].
Fig. 1. Examples of interactions between earthmoving equipment.
Fig. 2. Earthmoving activities and actions of earthmoving equipment (activities: work, travel, and idle; actions: move, scoop/rotate/drop, and stop).
In the meantime, dump trucks mainly move to deliver soil from one site to the other [31]. Two types of activities are involved; one is
'working' and the other is 'idling'. One crucial characteristic of dump trucks' operation is that their 'working' status can be both stationary and mobile. Unlike excavators, traveling of dump trucks falls under the 'working' category since dump trucks travel to relocate loads of soil [35]. The working process of dump trucks is composed of specific actions such as filling, traveling, and dumping [5,30–31,36]. During filling and dumping, trucks are 'stopping'. In other words, when they are 'working', they can either be 'moving' (for traveling) or 'stopping' (for filling and dumping) [37]. However, dump trucks may also take 'stopping' actions when their activity status shifts to 'idling' [38]; this can happen when they wait for the next soil-filling order while other excavators and dump trucks are busy (Figs. 1, 2). This implies that a dump truck's 'stopping' motion may occasionally indicate an 'idling' status instead of 'working'.
The actions and activities of excavators and dump trucks also imply that there is a significant amount of interaction between these two types of equipment. When excavators are 'working', particularly interacting with dump trucks, they unload soil onto dump trucks. Likewise, dump trucks fill with soil from excavators when they are 'working'. Both pieces of equipment are able to fully perform their tasks through these one-to-one interactions [39–40]. When excavators and dump trucks co-exist, their individual actions are also consistent, in that the excavator is 'scooping/rotating/dropping' while the dump truck is 'stopping'; to put it differently, action consistency between equipment needs to be met in this case to explain filling operations. When action inconsistency happens (e.g., both the excavator and the dump truck are stopping), the activity is classified as 'idling'. In addition to the consistency factor, another key concept to illustrate the interaction is the proximity between equipment. This means that excavators and dump trucks should stand within the effective distance of the excavators to hand over soil (i.e., arm's length). This aspect makes it possible to filter out unrealistic cases; for example, 'stopping' dump trucks 30 m apart from 'scooping/rotating/dropping' excavators, even though their actions are consistent. In addition to these one-to-one interactions, group interactions can also occur among three or more pieces of equipment involved in the operation process. When two dump trucks keep 'stopping' near an excavator, only one of them is normally being filled with soil. In this case, the two dump trucks take the same action ('stopping'); however, their activities are different ('working' and 'idling'), as shown in Fig. 1.
To summarize, excavators perform three operation types ('working', 'idling', and 'traveling'), while dump trucks perform two operation types ('working' and 'idling'). These activities for each equipment type are performed through diverse individual actions and through one-to-one or group interactions within the working area [36,41–44]. Under such interactive operations, the types of equipment activities can be effectively inferred using both individual aspects (i.e., object type, location, and individual action) and interactive characteristics (i.e., co-existence, proximity, and action consistency).
2.2. Related works and limitations
Numerous computer vision techniques have been developed in order to accurately recognize target objects' actions and activities.
There are three main categories: space-time, shape-based, and rule-based approaches [45]. Space-time approaches recognize actions by using spatio-temporal features such as optical flows and trajectory changes within consecutive image frames [46–48]. Shape-based approaches determine actions by applying gesture/appearance-based features such as oriented histograms, bags-of-rectangles, and skeleton models [49–53]. Lastly, rule-based approaches utilize pre-defined knowledge of behaviors (i.e., a set of if-then rules) to recognize actions [54–55].
Many researchers have made efforts to apply such computer vision approaches to construction sites, especially for equipment action and activity recognition. Zou and Kim [33], for instance, proposed a method to calculate a hydraulic excavator's idle time using the hue, saturation, and value color spaces. Kinematic features were also used to identify an excavator's activities by representing the articulated shapes of the excavator [28]. That study determined activity types by analyzing the distance and elevation changes of the detected parts (e.g., bucket, body, and joint) of the excavator. Gong and Caldas [56] analyzed the cycle time of concrete placement operations by detecting a concrete bucket and tracking its movement. The working cycle of a tower crane was also investigated by [57] through tracking the three-dimensional locations of its jibs and body. On the other hand, machine learning based methods (e.g., Bags-of-Features) were implemented by [58,5] for the purpose of recognizing equipment's individual actions. These methods, however, required a large amount of training data in order to learn robust action recognizers for various operation types. For analyzing the interactions between heavy machinery, Azar et al. [10] developed a vision-based approach to estimate the dirt-loading cycle of an excavator and a dump truck by detecting one-to-one interactions. However, they faced difficulties not only in continuously identifying the types of equipment operations but also in recognizing group interactions between multiple equipment. Recently, Bugler et al. [25] proposed an automated productivity assessment method for earthmoving operations based on photogrammetry and video analysis. It determined activity types by considering the proximity between equipment and their actions; nonetheless, the excavator's 'scooping/rotating/dropping' actions, which are fundamental information for classifying 'working' excavators even when their center of gravity is in a fixed position, could not be explained.
The aforementioned research showed promising results for construction applications and built a strong foundation for automated activity analysis based on vision techniques. In the early stages, researchers paid attention to identifying the actions of single pieces of equipment such as excavators and tower cranes. More recently, many researchers have made efforts to analyze site productivity based on the identified information of equipment actions. However, challenging issues still remain in the classification of equipment operations in complex, real-world working environments. One major issue is that most previous studies mainly focused on the individual characteristics (e.g., equipment shapes, orientations, and locations) of single equipment, but did not fully consider the interactions among earthmoving equipment. This issue limits the applicability of the previous findings to the actual construction context, since interactive aspects (e.g., co-existence, proximity, and action consistency between equipment) as well as individual features (e.g., object types, locations, and actions) collectively function as significant cues for activity identification. For instance, without the interaction analysis, it is difficult to explain whether a 'stopped' dump truck is 'working' (filling soil from an excavator) or 'idling' (waiting for the next soil-filling order due to an excavator-to-dump truck interaction). Such limitations have led previous studies to make only partial technical advancements.
3. Research framework
The research framework in Fig. 3 highlights the interaction analysis to classify types of equipment activities. The framework includes four main modules: equipment tracking, action recognition of individual equipment, interaction analysis, and post-processing. First, the locations of excavators and dump trucks are tracked over time to collect two types of data: equipment types and their trajectories. Second, individual actions of the tracked objects are recognized using spatio-temporal reasoning and the image differencing technique. Third, interactions of excavators and dump trucks are then analyzed based on factors such as co-existence, proximity, and action consistency. Finally, post-processing is conducted to reduce recognition errors by considering the continuity of equipment activities. Details of each module are described in this section.
3.1. Construction equipment tracking
The purpose of this module is to obtain information on equipment types and trajectories. An extended version of TLD (Tracking-Learning-Detection), developed by [59,60], was adapted and customized in this research to track multiple pieces of construction equipment over the long term. TLD consists of two main processes: functional integration and online learning [61]. Fig. 4 illustrates the methodology of the adapted TLD.
Functional integration localizes target objects using both a pre-trained detector and a tracker that analyzes sequential images. This makes it possible to track construction equipment with dynamic movements and high interclass/intraclass variations. The detector, learned from well-developed training data, can manage sudden changes of objects and environments, since it independently detects and tracks target objects in each frame [62–64]. Therefore, long-term tracking of equipment becomes possible even though construction equipment involves a series of abrupt characteristic changes in terms of colors, shapes, orientations, and velocities. However, it is very difficult to pre-develop high-quality training datasets [65]. Once new or unexpected events that are not part of the training data arise, detection errors may occur. On the other hand, the tracker analyzing sequential image frames can compensate for such detector errors, since the sequential analysis localizes the most similar region among the consecutive motion-based and feature-representation-based images. In this way, the tracker is able to adapt to gradual changes of object characteristics; hence, possible detection failures can be prevented by the adaptation of the tracker. Conversely, the tracker also has shortcomings that can be mediated by the detector:
it is sensitive to sudden changes of an object's movements and shapes.
Since the sequential analysis is based on the similarity of an object's motion and representation, noise (e.g., background shadow) has negative
impacts on recognition; hence, the tracker can easily miss target objects and fail to re-track them. Such failures can be counterbalanced by the detector's strengths, which are robustness to abrupt changes and dynamic movements.
Along with functional integration, online learning is also a key process for the long-term tracking of construction equipment [61]. It is a machine learning technique for generating and reinforcing a detector with sequentially updated training data [66–67]. To track construction objects using a detector, a major challenge is to collect high-quality training datasets that cover high interclass/intraclass variations [61]. In this study, the training data was newly generated on site through the functional integration process, in an attempt to address this challenge. To be specific, false positive and false negative errors of the detector were added to the training data as negative samples to prevent the occurrence of similar errors. On the other hand, true positive results were added as positive samples with higher weights for recognition.
In sum, the functional integration enhances the detection and tracking performance by compensating for the weaknesses of a detector and a tracker. The online learning trains and customizes a detector, and develops and updates training data in real time without pre-works. Based on these two processes, TLD is able to recognize and track construction equipment that has dynamic motion changes and various characteristics (e.g., shapes, colors, and textures). Since it was originally developed for flat objects such as human eyes or license plates in less changeable environments, the authors customized it to improve its practical applicability to the construction site. The technical details of the customization can be found in the authors' other publication, Kim and Chi [61].
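The authors implemented this module in C++ (Section 4.3); the following Python-style sketch is only an illustration of the per-frame loop described above, in which `detector`, `tracker`, and `iou` are hypothetical components standing in for a pre-trained equipment detector, a frame-to-frame tracker, and a bounding-box overlap measure.

```python
# Illustrative sketch of a TLD-style loop (not the authors' code).
def track_equipment(frames, detector, tracker, iou, iou_thresh=0.5):
    training_set = {"positive": [], "negative": []}
    results, prev_box = [], None
    for frame in frames:
        det_box = detector.detect(frame)            # independent per-frame detection
        trk_box = tracker.update(frame, prev_box)   # sequential (motion/feature) tracking

        # Functional integration: prefer agreement; otherwise fall back to either source.
        if det_box and trk_box and iou(det_box, trk_box) >= iou_thresh:
            box = det_box                                      # detection confirmed by the tracker
            training_set["positive"].append((frame, box))      # online learning: reinforce positives
        elif det_box and trk_box:
            box = trk_box                                      # disagreement: trust gradual tracking
            training_set["negative"].append((frame, det_box))  # treat the detection as a likely error
        else:
            box = det_box or trk_box                           # one source missed the target

        detector.retrain(training_set)              # sequentially updated training data
        results.append(box)
        prev_box = box
    return results
```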
Fig. 3. Research framework with interactive operations.
Fig. 4. Equipment tracking methodology.
3.2. Action recognition of individual equipment
This module recognizes individual actions of construction equipment using spatio-temporal reasoning and image differencing (Fig. 5).
For spatio-temporal reasoning, the object's centroid is compared across sequential images to classify 'moving' or 'non-moving' actions. Image differencing is then performed to detect shape changes of excavators and to further classify 'non-moving' actions into sub-action categories: 'scooping/rotating/dropping' and 'stopping'.
The centroid change between image frames can be calculated based on the tracking results (locations over time). The centroid indicates the center point of the 2D bounding box of an object, and its change ratio is calculated by Eq. (1) to account for bi-directional movements.
Additionally, in order to reduce the scale dependency on the equipment-to-camera distance, the centroid's coordinate change is divided by the diagonal length of the tracking bounding box (Eq. (1)).
\text{Centroid change ratio}_i = \frac{\sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}}{L_i} \quad (1)

where x_i: the x-axis position of the centroid at the i-th tracking result; y_i: the y-axis position of the centroid at the i-th tracking result; L_i: the diagonal length of the bounding box at the i-th tracking result.
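As a concrete illustration of Eq. (1) (a sketch added here, not code from the paper), the snippet below computes the centroid change ratio for two consecutive bounding boxes given as (x, y, width, height) tuples:

```python
import math

def centroid_change_ratio(box_i, box_next):
    """Eq. (1): centroid displacement between consecutive tracking results,
    normalized by the diagonal length of the current bounding box.
    Boxes are (x, y, w, h) with (x, y) the top-left corner in pixels."""
    (x1, y1, w1, h1), (x2, y2, w2, h2) = box_i, box_next
    cx1, cy1 = x1 + w1 / 2.0, y1 + h1 / 2.0   # centroid of the i-th box
    cx2, cy2 = x2 + w2 / 2.0, y2 + h2 / 2.0   # centroid of the (i+1)-th box
    diagonal = math.hypot(w1, h1)             # L_i, diagonal of the i-th box
    return math.hypot(cx2 - cx1, cy2 - cy1) / diagonal

# Example: a 200 x 100 pixel box whose centroid shifts by 15 pixels gives a ratio of
# about 0.067 (6.7% per frame), below the 10% 'moving' threshold quoted for excavators.
print(centroid_change_ratio((100, 100, 200, 100), (115, 100, 200, 100)))
```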
A pre-defined threshold value then determines whether the tracked object is 'moving' or 'non-moving' by comparing it with the calculated centroid change ratio. The threshold value is defined based on the geometric relationship between the global (world) coordinates and the local (image) coordinates (Eq. (2.1)).
s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = T \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \quad (2.1)

where s: scale factor; T: transformation matrix; [x, y]: pixel coordinates; [X, Y, Z]: world coordinates.
The equation explains that a 3D global point P(X, Y, Z) can be transformed to the point p(x, y) in the image plane by the transformation matrix T. With the decomposition of T, Eq. (2.1) can be formulated as below (Eq. (2.2)).
s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= K \, T_{\mathrm{pers}}(1) \, [R \,|\, t] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
= \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \quad (2.2)

where [R|t]: rigid projection matrix; T_pers(1): normalization matrix; K: transformation matrix from normalized coordinates to pixel coordinates; f_x, f_y: focal length in each direction; c_x, c_y: principal point in each direction; r_ij: the rotated angle in the i–j direction; t_x, t_y, t_z: translation in each direction.
Fig. 5. Methodology for action recognition of individual equipment.
By integrating T_pers(1) and [R|t], Eq. (2.2) is expressed as Eq. (2.3).

s \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K \, [R \,|\, t] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \quad (2.3)

where K: intrinsic parameters; [R|t]: extrinsic parameters.
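To make the projection of Eqs. (2.1)–(2.3) concrete, the following Python sketch maps a world point into pixel coordinates. The calibration values (focal length, principal point, camera pose) are hypothetical choices made for illustration only, not values from the paper.

```python
import numpy as np

# Hypothetical calibration values for illustration only.
K = np.array([[800.0,   0.0, 640.0],    # f_x, 0,   c_x
              [  0.0, 800.0, 360.0],    # 0,   f_y, c_y
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                            # camera aligned with the world axes
t = np.array([0.0, 0.0, 20.0])           # camera 20 m from the scene along Z

def project(point_world):
    """Eq. (2.3): s * [x, y, 1]^T = K [R|t] [X, Y, Z, 1]^T."""
    p_cam = R @ point_world + t          # extrinsic: world -> camera coordinates
    p_img = K @ p_cam                    # intrinsic: camera -> homogeneous pixel coordinates
    return p_img[:2] / p_img[2]          # divide by the scale factor s

# A 0.42 m horizontal displacement (3 km/h over one frame at 2 fps), seen at 20 m,
# moves the image point by roughly 17 pixels in this hypothetical setup.
print(project(np.array([0.0, 0.0, 0.0])), project(np.array([0.42, 0.0, 0.0])))
```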
Finally, the 3D global coordinates are transformed to the local image coordinates using the camera's intrinsic and extrinsic parameters, which can be obtained during calibration. According to the official equipment specifications of the Hyundai, Volvo, and Doosan manufacturers, the excavator's minimum moving speed and average diagonal length are assumed to be 3 km/h and 4–7 m; the range of the threshold value can then be calculated using Eqs. (1) and (2.3). A threshold of 10% per frame was finally optimized after the experiment with 11,513 images, given two frames per second. This frame rate is suitable not only for tracking the actions of an excavator (e.g., moving and non-moving) operating at a velocity of 3 km/h but also for increasing computational efficiency for real-time processing. After the spatio-temporal reasoning, image differencing is applied to further classify 'non-moving' into 'scooping/rotating/dropping' or 'stopping'. The shapes of excavators change during 'scooping/rotating/dropping', whereas they do not change during 'stopping'. Image differencing can detect such shape changes between consecutive image frames by calculating the sum of the absolute pixel value differences (Eq. (3)). Fig. 6 also illustrates this calculation process. To eliminate the effects of different scales, each image is first resized to the size of the previous image.
\text{Sum ratio of absolute differences} = \frac{\sum_j \sum_k \left| p_{jk}^{\,i+1} - p_{jk}^{\,i} \right|}{A_i} \quad (3)

where p_{jk}^{i}: the pixel value at row j and column k in the bounding box of the i-th tracking image; A_i: the area of the bounding box at the i-th tracking result.
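The sketch below illustrates Eq. (3) with OpenCV. It is an assumed implementation (the grayscale conversion and the function names are choices made here, not taken from the paper).

```python
import cv2
import numpy as np

def sum_ratio_of_absolute_differences(patch_i, patch_next):
    """Eq. (3): summed absolute pixel change between consecutive bounding-box crops,
    normalized by the bounding-box area of the i-th tracking result."""
    # Resize the newer patch to the older patch's size to remove scale effects.
    h, w = patch_i.shape[:2]
    patch_next = cv2.resize(patch_next, (w, h))
    gray_i = cv2.cvtColor(patch_i, cv2.COLOR_BGR2GRAY)
    gray_next = cv2.cvtColor(patch_next, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray_next, gray_i)        # |p^{i+1}_{jk} - p^{i}_{jk}|
    return float(np.sum(diff)) / (h * w)         # divide by the area A_i

# A 'scooping/rotating/dropping' excavator yields a noticeably larger ratio than a
# 'stopping' one; the ratio is compared against a pre-defined threshold (cf. Fig. 5).
```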
Dump trucks have two individual action types: 'moving' and 'stopping'. In this study, it was assumed that there is no critical shape change of the dump truck during 'stopping', so the analysis focused on the spatio-temporal reasoning. The centroid change rate is calculated and compared to the pre-defined threshold value of 20% per frame at two frames per second (Eqs. (1) and (2.3)), under consideration of the dump truck's velocity (6 km/h).
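As a rough back-of-envelope check of these thresholds (an illustration added here, not a calculation reported in the paper): at 2 frames per second, an excavator moving at its minimum speed of 3 km/h (≈ 0.83 m/s) travels about 0.42 m per frame, i.e.,

\frac{0.42\ \text{m}}{4\text{–}7\ \text{m}} \approx 6\text{–}10\%\ \text{of the diagonal length per frame},

which is consistent with the 10% threshold. For a dump truck at 6 km/h (≈ 1.67 m/s), the per-frame displacement is about 0.83 m; assuming a comparable on-screen diagonal (the truck's dimensions are not stated in the paper), this is on the order of 20% per frame, consistent with the 20% threshold.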
3.3. Interaction analysis
Using the extracted information on object types, locations, and individual actions, this module analyzes interactions among equipment.
Knowledge (rule)-based decision making was used to analyze both one-to-one and group interactions. In recognition of this interactive nature, factors such as the co-existence, proximity, and action consistency between excavators and dump trucks were investigated and adopted during the classification of equipment activity types. As a result, more precise performance indicators (e.g., cycle time and direct work rates) can be obtained; for instance, interaction information can clearly distinguish whether excavators work 'alone' or 'together' with dump trucks.
Fig. 7 illustrates the process of activity recognition. The 'moving' and 'stopping' actions of an excavator are defined as 'traveling' and 'idling', respectively (Fig. 7), because the excavator's individual action varies according to each of these activities. However, for actions such as 'scooping/rotating/dropping', the interaction analysis enables an activity identification system to decide whether the excavators are 'working' alone (e.g., soil preparation and excavation) or together (i.e., loading soil onto dump trucks). When excavators and dump trucks are 'working' together, the dump trucks should be placed within the operating boundary of the excavators (i.e., arm's length). If there is at least one dump truck within the same image frame, the distances between the centroids of the excavator and all the dump trucks are computed (Eq. (4)). Based on the distances, the dump truck nearest to the excavator is selected.
\text{Distance} = \sqrt{(x_{\mathrm{exc.}} - x_{\mathrm{dump}})^2 + (y_{\mathrm{exc.}} - y_{\mathrm{dump}})^2} \quad (4)

where x_object type: the x-axis position of the centroid of each object type; y_object type: the y-axis position of the centroid of each object type.
The proximity thresholds were optimized as 150 and 200 pixels after the experiments with 11,513 images; as with the previous Eqs. (2.1)–(2.3), determining the thresholds required the camera's intrinsic and extrinsic parameters, and the average length of excavators' arms was taken as between 8 and 15 m. Action consistency should also be considered, since co-existence and proximity alone are insufficient evidence to guarantee the interactions. Even if excavators are 'scooping/rotating/dropping', dump trucks can 'move'; in this case, no interaction takes place between the two pieces of equipment. Excavators and dump trucks are 'working' together only when they show action consistency for filling soil within the effective distance; in other words, the excavators are 'scooping/rotating/dropping' while the dump trucks are 'stopping' for the interaction. Based on this principle, the proposed framework can confidently decide whether a 'scooping/rotating/dropping' excavator is interacting with a dump truck or working alone (e.g., soil preparation).
'Moving' dump trucks can be classified as 'working'. Yet, 'stopping' dump trucks can be classified as either 'working' or 'idling' according to their interaction with excavators. Similar to the interaction analysis of excavators, the co-existence, proximity, and action consistency are examined in order. When no excavator is captured in the same image, 'stopping' dump trucks should be 'idling'. Otherwise, the distances between the dump trucks and the existing excavators are calculated in the 2D images (Eq. (4)). The distances are used to select the nearest dump truck within the operating boundary of each excavator. Only after selecting the nearest 'stopping' dump truck can the excavator's individual action be accurately confirmed. If all the conditions are met, the 'stopping' dump trucks are then classified as 'working', and vice versa. These processes enable the research framework to identify 'working' dump trucks interacting with excavators (one-to-one interactions) and 'idling' dump trucks waiting for their filling order due to an already interacting pair (group interactions).
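A minimal sketch of the rule-based classification just described, assuming a simple per-frame record for each tracked object (the dictionary structure and the 150-pixel default are illustrative assumptions; the paper reports thresholds of 150 and 200 pixels):

```python
def classify_activity(obj, others, operating_boundary=150):
    """Knowledge-based interaction rules of Fig. 7 (illustrative sketch).
    Each object is a dict: {'type': 'excavator'|'dump_truck',
                            'action': 'move'|'stop'|'scoop_rotate_drop',
                            'centroid': (x, y)}."""
    def dist(a, b):
        ax, ay = a['centroid']
        bx, by = b['centroid']
        return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5        # Eq. (4)

    if obj['type'] == 'excavator':
        if obj['action'] == 'move':
            return 'traveling'
        if obj['action'] == 'stop':
            return 'idling'
        # 'scoop_rotate_drop': working alone or together, depending on nearby stopping trucks.
        trucks = [o for o in others if o['type'] == 'dump_truck']
        near = [t for t in trucks
                if dist(obj, t) < operating_boundary and t['action'] == 'stop']
        return 'working (interacting)' if near else 'working (alone)'

    # Dump truck: 'moving' is always 'working'; 'stopping' depends on the excavators.
    if obj['action'] == 'move':
        return 'working (travel)'
    for exc in (o for o in others if o['type'] == 'excavator'):
        trucks = [o for o in others if o['type'] == 'dump_truck'] + [obj]
        nearest = min(trucks, key=lambda t: dist(exc, t))
        if (nearest is obj and dist(exc, obj) < operating_boundary
                and exc['action'] == 'scoop_rotate_drop'):
            return 'working (filling)'     # one-to-one interaction
    return 'idling'                         # e.g., waiting in queue (group interaction)
```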
Fig. 6. Image differencing processes.
3.4. Post-processing
The post-processing is designed to filter out noise and other kinds of outliers from the analyzed results. The activities at each frame are classified independently by the previous modules. However, construction equipment obviously works continuously over time, which means that one activity takes place for a certain duration. Majority voting over the classified activity types (Eq. (5)) was therefore applied every 5 s to compensate for misclassification errors caused by the real-time discrete analysis and to optimize the classification results (Fig. 8).
\text{Post}\,S = \mathrm{mode}(S, t) \quad (5)

where S: the pre-classified activity states for all frames; t: the time interval for post-processing.
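An illustrative sketch of the majority-voting filter of Eq. (5), assuming classifications arrive at 2 frames per second so that a 5-s window spans 10 frames (the list-based representation is an assumption made here):

```python
from collections import Counter

def post_process(states, fps=2, window_s=5):
    """Eq. (5): replace each classification with the mode of its voting window."""
    window = int(fps * window_s)
    smoothed = []
    for start in range(0, len(states), window):
        chunk = states[start:start + window]
        majority = Counter(chunk).most_common(1)[0][0]   # most frequent activity
        smoothed.extend([majority] * len(chunk))
    return smoothed

# Example: a single spurious 'idle' inside a working period is voted out (cf. Fig. 8).
print(post_process(['work', 'work', 'idle', 'work', 'work', 'work'], fps=2, window_s=3))
```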
4. Experimental results
4.1. Data collection and description
The authors collected video stream data from five earthmoving sites using CCTVs and smartphones. The collected data included not only interactive operations among dump trucks and excavators but also operations in groups of multiple dump trucks working simultaneously (Fig. 9). To represent various appearances, actions, and activities of construction equipment, the stationary cameras were installed at distances of 10 to 30 m, heights of 0 to 3 m, and different viewpoints (i.e., front, back, side, and diagonal), at 11 different positions in total. A total of 11,513 image frames, approximately 150 min of operation, was collected at resolutions of 720 × 1280 and 1080 × 1920 pixels. The video data also included diverse equipment with different shapes, colors, and appearances manufactured by multiple corporations such as Caterpillar, Doosan, Hyundai, Scania, and Volvo.
4.2. Performance metrics
To quantify the performance of activity identification, two metrics were evaluated: precision and recall rate. Precision indicates the reliability of the predictions and is calculated by Eq. (6). However, even when most of an algorithm's predictions are correct (high precision), objects or activities can still be omitted.
Fig. 7. Interaction analysis methodology.
Fig. 8. Concept for post-processing.
Thus, this study also evaluated the recall rate (Eq. (7)) to check the stability of the proposed framework. Furthermore, the estimated working, traveling, and idling times were compared with the actual operation times.
\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \quad (6)

\text{Recall rate} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \quad (7)
4.3. Implementation and experimental setup
The C++ programming language with Visual Studio 2012 was used for the development of module 1, and MATLAB 2014b was used for the development of modules 2, 3, and 4. A desktop computer [Intel i7-4790 CPU @ 3.60 GHz, 32.0 GB RAM, Windows 7, 64 bit] was used for the series of experiments. Two main experiments, with and without the interaction analysis, were performed to investigate the advancement of the proposed solution. Additionally, the impacts of individual action recognition and post-processing were also analyzed.
4.4. Experimental results and analysis
4.4.1. Performance of the developed framework
Fig. 9. Examples of the collected data from actual earthmoving sites.
Fig. 10. Examples of experimental results. (a) Continuous classification over time. (b) One-to-one interactions. (c) Group interactions.
Fig. 10 shows examples of the classified activity types for all tracked
objects. The responses were displayed for each bounding box. The performance metrics showed average precision and recall rates of 91.27% and 92.42%, respectively (Tables 1, 2). The precision results indicated that 91 classifications out of 100 responses were correct. Besides, the recall rates meant that the implemented framework accurately classified 92 cases out of 100 actual occurrences for each activity type. The average processing time required for single-frame analysis was 0.1 to 0.3 s, satisfying real-time applications. Moreover, the average error rate of the estimated time for each activity was 5.4% compared to the original operation time; in other words, the model estimated 94.6 min as 'working' and 5.4 min as 'idling' when the equipment was actually 'working' for 100 min.
4.4.2. Performance without the interaction analysis
In addition to this validation, the significant impact of the interactive operations was also observed. To represent the classification without the interaction analysis, module 3 was opted out from the framework. In this case, 'stopping' dump trucks are regarded as 'idling' (Fig. 11). Without the interaction analysis, the average precision dropped by 15.59% to 75.68%, and the average recall rate decreased by 17.70% to 74.72% (Table 2). The performance loss was most salient for dump trucks; the interpretation errors for 'idling' increased remarkably for the dump trucks. The recall rates for dump trucks therefore dropped by 38.91%, and the 'idling' time was estimated with a 41% error rate. On the other hand, the precision and recall rates did not decrease for excavators.
4.4.3. Performance of the equipment tracking and individual action recognition
Since individual actions are one of the vital factors for activity identification, their accuracy is important in determining the classification performance of operation types. Table 3 shows the experimental results for module 2: 91.58% average precision and 93.39% average recall. The results implied that the module worked properly for recognizing individual actions. It was also observed that the precision for dump trucks was higher than that for excavators. This was consistent with the complexity of the individual actions, since dump trucks had only two action classes while excavators had one more. Regarding this issue, the framework occasionally experienced difficulty in further classifying 'scooping/rotating/dropping' and 'stopping'. This can be explained by the behavior of the image differencing technique. The technique detects change across two consecutive image frames by considering the changed pixel values. However, not all changing pixels could be considered if the excavator's bucket was not tracked; this occurred when the bounding box of the excavator object did not cover the location of the bucket. Conversely, noise (e.g., background clutter and non-target objects) also affected the recognition performance even when the target itself showed little change; for instance, when dump trucks moved within the bounding box of 'stopping' excavators. Despite such technical limitations, the method was still effective in recognizing individual actions. TLD successfully tracked equipment with average precision and recall rates of 90.96% and 92.23%. Thus, it was able to reduce the number of noisy bounding boxes and also to support the shape detection for the 'scooping/rotating/dropping' excavators.
4.4.4. Performance of the post-processing
Finally, the post-processing effectively filtered out noisy classification results. Module 4 increased the precision and recall rates by 3.88% and 4.15% compared to the classification without post-processing (Table 2). The filtering performance varied depending on the interval duration used for majority voting. The results revealed that 5 s was the optimum for the 11,513 tested images. Intervals longer than 5 s showed poor performance due to lingering effects when activity A had already switched to another activity B (e.g., 'idling' states kept being reported for several seconds although the equipment had already started to 'work'). Thus, the 5-s post-processing played a substantial role in the activity identification.
Table 1
Experimental results for performance metrics (%).

The proposed method (with interaction analysis):
              Excavator                Dump truck
              Precision   Recall       Precision   Recall
Idling        86.70       93.87        88.49       93.10
Traveling     93.12       89.97        –           –
Working       88.26       92.05        97.88       92.64
Average       89.36       91.96        93.19       92.87

Without interaction analysis:
              Excavator                Dump truck
              Precision   Recall       Precision   Recall
Idling        86.70       93.89        38.91       91.03
Traveling     93.12       89.97        –           –
Working       88.26       92.05        85.10       23.92
Average       89.36       91.96        62.01       57.48

Without post-processing:
              Excavator                Dump truck
              Precision   Recall       Precision   Recall
Idling        85.50       92.18        85.72       88.38
Traveling     83.82       83.36        –           –
Working       83.91       87.19        95.03       89.56
Average       84.41       87.58        90.38       88.97
Table 2
Performance evaluation of each module (%).

              The proposed        Without interaction analysis      Without post-processing
              framework (A)       Performance (B)      A − B        Performance (C)      A − C
Precision     91.27               75.68                15.59        87.39                3.88
Recall rate   92.42               74.72                17.70        88.27                4.15
Fig. 11. Predictions with (left) and without (right) the interaction analysis for the same image frame.
Table 3
Experimental results for individual action recognition (%).

                               Excavator                Dump truck
                               Precision   Recall       Precision   Recall
Stopping                       86.70       93.87        93.29       97.47
Moving                         93.12       89.97        94.30       92.18
Scooping/rotating/dropping     88.26       92.05        –           –
Average                        89.36       91.96        93.80       94.83
5. Results and discussion
The experimental results supported the feasibility and applicability of the developed method with acceptable performance criteria. It identified different types of equipment operations with 91.27% precision and 92.42% recall, and then estimated the 'working', 'traveling', and 'idling' durations with 5.4% error rates on average. The results also showed the significant impact of the interaction analysis on activity identification. Fig. 10 shows that the proposed framework (with the interaction analysis) is able to successfully classify the activity types of 'stopping' dump trucks, which is crucial for operation time estimation. As shown in Fig. 10(a), the activity types of equipment were continuously and correctly identified from image frame T = 248 to T = 308. Based on the co-existence, proximity, and action consistency conditions, the framework classified some of the 'stopping' dump trucks as 'working' (Fig. 10(b)). It was also possible to determine whether excavators 'work' alone or together. This insight can be used for match-factor (the ratio of truck arrivals to available excavator service rates) calculation and resource allocation. Moreover, Fig. 11 illustrates the advantage of considering the interactive operations between the two equipment types. While the 'stopping' dump truck being loaded was correctly classified as 'working' (filling soil from the excavator) with the interaction analysis, the same dump truck was incorrectly classified as 'idling' when module 3 was opted out. The results denoted that the developed framework was capable of interpreting the one-to-one interactions and enhancing the performance with the interaction analysis.
Group interactions could also be successfully analyzed with the developed approach. Fig. 10(c) displays that the approach correctly identified both one-to-one interactions and group interactions. At image frame T = 9, the excavator on the left-hand side handed soil over to the dump truck located within the operating proximity, and their activity types were determined as 'working'. The other excavator and its nearest dump truck co-existed within the pre-defined proximity; however, they were 'idling' since the individual action of the excavator was 'stopping'. Based on the identified interactions, although the dump truck arrived at the operating zone at image frame T = 103, the proposed method classified its operation type as 'idling'. The results showed that group interactions among multiple pieces of equipment can be effectively handled by module 3. Through these processes, the operation types were correctly determined with 91.27% precision and 92.42% recall. These results meant that the method was able to differentiate the nearest one from multiple dump trucks within the effective distance. As a result, 'working' and 'idling' dump trucks were correctly distinguished even when the criteria of co-existence, proximity, and action consistency were all met.
Despite the promising performance of the developed approach, three cases of errors were inevitably observed. Fig. 12(a) illustrates the first case: errors of interaction analysis. The dump truck took 'stopping' actions and the excavator was 'scooping' the soil, which meant that both the co-existence and action consistency conditions were met. Besides, their proximity was lower than the threshold value of the effective distance. In this case, the framework determined that they were 'working' with interaction. In the figure, however, the excavator was preparing to load soil onto another dump truck, not the tracked dump truck in the frame, which was already filled with soil. These possible side effects can be reduced by considering the maximum soil-filling durations; for instance, the duration of excavator-to-dump truck interactions cannot be longer than 5 min per arrival. Second, activity identification errors occasionally occurred when the tracking results were not sufficiently accurate. Outliers or noisy bounding boxes of the target objects resulted in abrupt changes of centroids; as a result, some 'moving' actions were classified as 'non-moving'. Image differencing was also affected, as previously discussed; bounding boxes were not able to fully cover all parts of excavators (e.g., bucket, arm, and main body) (Fig. 12(b)). In such cases, the shape changes of excavators were not detected during 'scooping/rotating/dropping', and the dump truck was also determined as 'idling' since the excavator was judged to be 'idling'. This limitation is expected to be addressed by using tracking algorithms based on CNNs (Convolutional Neural Networks) in further study. In recent research, CNN-based trackers have shown outstanding performance in localizing objects that have high intraclass/interclass variations. However, to ensure the powerful performance of CNNs, a large amount of training data needs to be collected and labeled, which is very human-intensive. Thus, more cost- and time-effective approaches such as crowdsourced labeling techniques [68] need to be investigated. Last, in the case of occlusions, the developed method inevitably missed target equipment and had difficulty continuously identifying operation types. When earthmoving equipment was occluded for a short time, the post-processing could compensate for the missed information by considering work continuity. For instance, even though 'working' excavators were missed for 2 s, the post-processing reasonably interpreted this short-term occlusion as 'working' time.
However, the post-processing was insufficient to handle long-term occlusions. When equipment disappeared for longer than the post-processing interval (5 s), it was difficult to classify the types of equipment operations. Although previous studies made efforts to effectively handle such occlusions [6,9,61], they primarily focused on detection and tracking of construction equipment rather than activity identification. Advanced reasoning processes (e.g., processes integrated with GPS) should be further studied to identify operation types continuously in case of occlusions.
Fig. 12. Error examples. (a) Errors of interaction analysis. (b) Effects of bounding box.
6. Conclusions
This research developed a vision-based activity identification framework that focuses on the interactive operations between excavators and dump trucks. This additional consideration had significant impacts on both the identification accuracy and the level of detail of the extracted information. The framework consisted of four main modules: equipment tracking, action recognition of individual equipment, interaction analysis, and post-processing. The experimental results (approximately 91.27% precision and 92.42% recall) supported not only the feasibility of the proposed method but also the significance of the interaction analysis. Based on the experimental results with and without the interaction analysis, the precision and recall rates increased by 15.59% and 17.70%, respectively, when interactive operations were integrated. The present study made contributions to the existing technology field and to construction management.
First, it identified the critical elements of interactive operations (i.e., co-existence, proximity, and action consistency). Second, it characterized the technical framework to detect, classify, and analyze the notable features from 2D video streams. Third, the activity types classified by the framework enable the measurement of performance indicators (e.g., direct work rates and cycle durations) of heavy equipment. Last, the developed framework can enhance the practicality of automated activity identification on actual earthmoving sites.
When considering on-site applications of these findings, however, some limitations remain. For instance, a CNN-based tracker is expected to markedly improve tracking performance, which would also improve the quality of action recognition of individual equipment. For module 2, using 3D information instead of 2D would make it possible to handle omni-directional movements of equipment. With 3D global locations, the proximity condition in the excavator-to-dump-truck interaction analysis (module 3) can become robust to viewpoint variations. To compensate for the intrinsic shortcomings of vision-based monitoring, integration with radio-based methods is also a necessary future research topic. Especially for analyzing dump trucks' operations, GPS can complement activity identification since dump trucks generally travel beyond earthmoving sites and outside the camera's field of view. Besides, brightness is a crucial variable for vision-based analysis; customizing vision-based systems to night-time environments (e.g., darkness) is required to extend their applicability, although construction operations are usually performed during the daytime.
Apart from these technical advancements, future research should also thoroughly consider and examine the practical applicability to actual construction sites. Using the work amounts and time-log data (e.g., 'working', 'idling', and 'traveling') provided by this research, equipment productivity and operational efficiency can be automatically analyzed with the further integration of operating conditions such as soil type, soil volume, driver's skill, and equipment specifications. Based on such information, site managers can make proper decisions on equipment allocation. With further achievements, it is expected that automated activity identification can be realized for the productivity analysis of earthmoving operations.
Acknowledgements
This research was supported by a grant (14SCIP-B079691-01) from the Smart Civil Infrastructure Research Program and a grant (16CTAP- C114956-01) from the Technology Advancement Research Program, funded by Ministry of Land, Infrastructure and Transport (MOLIT) of the Korean Government and the Korea Agency for Infrastructure Technology Advancement (KAIA), and Seoul National University Big Data Institute through the Data Science Research Project 2017.
References
[1] J.S. Bohn, J. Teizer, Benefits and Barriers of Construction Project Monitoring Using High-Resolution Automated Cameras, J. Constr. Eng. Manag. 136 (6) (2010) 632–640, http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000164.
[2] T. Cheng, J. Teizer, G.C. Migliaccio, U.C. Gatti, Automated task-level activity analysis through fusion of real time location sensors and worker's thoracic posture data, Autom. Constr. 29 (2013) 24–39, http://dx.doi.org/10.1016/j.autcon.2012.08.003.
[3] J. Seo, S. Han, S. Lee, H. Kim, Computer vision techniques for construction safety and health monitoring, Adv. Eng. Inform. 29 (2015) 239–251, http://dx.doi.org/10.1016/j.aei.2015.02.001.
[4] J. Teizer, P.A. Vela, Personnel tracking on construction sites using video cameras, Adv. Eng. Inform. 23 (2009) 452–462, http://dx.doi.org/10.1016/j.aei.2009.06.011.
[5] M. Golparvar-Fard, A. Heydarian, J.C. Niebles, Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers, Adv. Eng. Inform. 27 (2013) 652–663, http://dx.doi.org/10.1016/j.aei.2013.09.001.
[6] M.-W. Park, I. Brilakis, Continuous localization of construction workers via integration of detection and tracking, Autom. Constr. 72 (2016) 129–142, http://dx.doi.org/10.1016/j.autcon.2016.08.039.
[7] M.-W. Park, A. Makhmalbaf, I. Brilakis, Comparative study of vision tracking methods for tracking of construction site resources, Autom. Constr. 20 (2011) 905–915, http://dx.doi.org/10.1016/j.autcon.2011.03.007.
[8] F. Pena-Mora, S. Han, S. Lee, M. Park, Strategic-Operational Construction Management: Hybrid System Dynamics and Discrete Event Approach, J. Constr. Eng. Manag. 134 (9) (2008) 701–710, http://dx.doi.org/10.1061/(ASCE)0733-9364(2008)134:9(701).
[9] Z. Zhu, X. Ren, Z. Chen, Visual Tracking of Construction Jobsites Workforce and Equipment with Particle Filtering, J. Comput. Civ. Eng. 04016023 (2016), http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000573.
[10] E.R. Azar, S. Dickinson, B. McCabe, Server-Customer Interaction Tracker: Computer Vision-Based System to Estimate Dirt-Loading Cycles, J. Constr. Eng. Manag. 139 (7) (2013) 785–794, http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000652.
[11] I. Brilakis, M.-W. Park, G. Jog, Automated vision tracking of project related entities, Adv. Eng. Inform. 25 (2011) 713–724, http://dx.doi.org/10.1016/j.aei.2011.01.003.
[12] J. Gong, C.H. Caldas, An object recognition, tracking, and contextual reasoning-based video interpretation method for rapid productivity analysis of construction operations, Autom. Constr. 20 (2011) 1211–1226, http://dx.doi.org/10.1016/j.autcon.2011.05.005.
[13] J.-W. Park, K. Kim, Y.K. Cho, Framework of Automated Construction-Safety Monitoring Using Cloud-Enabled BIM and BLE Mobile Tracking Sensors, J. Constr. Eng. Manag. 05016019 (2016), http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0001223.
[14] T. Omar, M.L. Nehdi, Data acquisition technologies for construction progress tracking, Autom. Constr. 70 (2016) 143–155, http://dx.doi.org/10.1016/j.autcon.2016.06.016.
[15] C. Zhang, A. Hammad, S. Rodriguez, Crane Pose Estimation Using UWB Real-Time Location System, J. Comput. Civ. Eng. 26 (5) (2012) 625–637, http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000172.
[16] E.R. Azar, B. McCabe, Part based model and spatial-temporal reasoning to recognize hydraulic excavators in construction images and videos, Autom. Constr. 24 (2012) 194–202, http://dx.doi.org/10.1016/j.autcon.2012.03.003.
[17] S. Chi, C.H. Caldas, Automated Object Identification Using Optical Video Cameras on Construction Sites, Computer-Aided Civil and Infrastructure Engineering 26 (2011) 368–380, http://dx.doi.org/10.1111/j.1467-8667.2010.00690.x.
[18] S. Chi, C.H. Caldas, Image-Based Safety Assessment: Automated Spatial Safety Risk Identification of Earthmoving and Surface Mining Activities, J. Constr. Eng. Manag. 138 (3) (2012) 341–351, http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000438.
[19] S. Chi, C.H. Caldas, D.Y. Kim, A Methodology for Object Identification and Tracking in Construction Based on Spatial Modeling and Image Matching Techniques, Computer-Aided Civil and Infrastructure Engineering 24 (2009) 199–211, http://dx.doi.org/10.1111/j.1467-8667.2008.00580.x.
[20] H. Kim, K. Kim, H. Kim, Vision-Based Object-Centric Safety Assessment Using Fuzzy Inference: Monitoring Struck-By Accidents with Moving Objects, J. Comput. Civ. Eng. 30 (4) (2016) 04015075, http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000562.
[21] M.-W. Park, I. Brilakis, Construction worker detection in video frames for initializing vision trackers, Autom. Constr. 28 (2012) 15–25, http://dx.doi.org/10.1016/j.autcon.2012.06.001.
[22] C. Yuan, S. Li, H. Cai, Vision-Based Excavator Detection and Tracking Using Hybrid Kinematic Shapes and Key Nodes, J. Comput. Civ. Eng. 31 (1) (2017) 04016038, http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000602.
[23] M. Memarzadeh, M. Golparvar-Fard, J.C. Niebles, Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors, Autom. Constr. 32 (2013) 24–37, http://dx.doi.org/10.1016/j.autcon.2012.12.002.
[24] Korea Construction Technology Promotion Act, Enforcement Decree Articles 98 and 99, Statutes of the Republic of Korea, Republic of Korea, 2016. Available at http://www.law.go.kr/, Accessed date: 10 December 2017.
[25] M. Bugler, G. Ogunmakin, P.A. Vela, A. Borrmann, J. Teizer, Fusion of Photogrammetry and Video Analysis for Productivity Assessment of Earthwork Processes, Comput. Aided Civ. Inf. Eng. 32 (2017) 107–123, http://dx.doi.org/10.1111/mice.12235.
[26] J. Fu, Logistics of earthmoving operations—simulation and optimization, Department of Transport Science, KTH, Stockholm, Sweden, 978-91-87353-05-5, 2013.
[27] S. Han, Productivity analysis comparison of different types of earthmoving operations by means of various productivity measurements, J. Asian Archit. Build. Eng. 9 (1) (2010) 185–192, http://dx.doi.org/10.3130/jaabe.9.185.
[28] R. Bao, M.A. Sadeghi, M. Golparvar-Fard, Characterizing Construction Equipment Activities in Long Video Sequence of Earthmoving Operations via Kinematic