Real-time object detection in agricultural/remote environments using the multiple-expert colour feature extreme learning machine (MEC-ELM)

Edmund J. Sadgrove*, Greg Falzon, David Miron, David W. Lamb

Precision Agriculture Research Group, University of New England, Armidale, NSW 2351, Australia

ARTICLE INFO

Article history:

Received 31 August 2017

Received in revised form 6 March 2018

Accepted 15 March 2018

Available online 22 March 2018

Keywords:

Extreme learning machine, Object detection, Machine vision, Unmanned aerial vehicle, Agriculture, Robotics

ABSTRACT

It is necessary for autonomous robotics in agriculture to provide real-time feedback, but due to a diverse array of objects and lack of landscape uniformity this objective is inherently complex. The current study presents two implementations of the multiple-expert colour feature extreme learning machine (MEC-ELM). The MEC-ELM is a cascading algorithm that has been implemented alongside a summed area table (SAT) for fast feature extraction and object classification, for a fully functioning object detection algorithm. The MEC-ELM is an implementation of the colour feature extreme learning machine (CF-ELM), which is an extreme learning machine (ELM) with a partially connected hidden layer, taking three colour bands as inputs. The colour implementation used with the SAT enables the MEC-ELM to find and classify objects quickly, with 84% precision and 91% recall in weed detection in the Y’UV colour space and in 0.5 s per frame. The colour implementation is however limited to low resolution images and for this reason a colour level co-occurrence matrix (CLCM) variant of the MEC-ELM is proposed. This variant uses the SAT to produce a CLCM and texture analyses, with texture values processed as an input to the MEC-ELM. This enabled the MEC-ELM to achieve 78–85% precision and 81–93% recall in cattle, weed and quad bike detection and in times between 1 and 2 s per frame. Both implementations were benchmarked on a standard i7 mobile processor. Thus the results presented in this paper demonstrate that the MEC-ELM with SAT grid and CLCM makes an ideal candidate for fast object detection in complex and/or agricultural landscapes.

1. Introduction

Agricultural systems require autonomous robotics for weed spraying, livestock detection and vehicle safety. The ability to detect and process objects quickly is a desire of many of these systems. Agricultural scenarios can be exceedingly complex as compared to other industrial based robotics systems. Object detection algorithms in agriculture may compete with any number of structurally similar or diverse objects. This makes for a complex environment with the potential for many false positives and false negatives. A potential solution to this problem is to adopt a cascading or multiple expert approach. These types of solutions have varying levels of success in both flora and fauna detection and there are numerous implementations, ranging from rudimentary to more complex. These approaches include simple colour and texture detection, as well as broad template matching, and have been implemented for weed, horse and wildlife detection [1–3]. The more complex solutions use a wide range of techniques, which can include large data structures such as deep or exemplar neural networks and varying levels of texture and shape based analyses. These have been applied to assessing the ripeness of bananas and to other generic object detectors [4–7].

The approaches discussed often rely on grey-scale images and, in many cases, the detection of prominent features or standout colour attributes. Processing speed is the key advantage in this case. Deep learning architectures are much slower and more complex and for this reason are often avoided in favour of real-time solutions. Notably, with a high-end GPU it is possible to process a large number of frames per second [8], but in remote mobile computing GPUs might not be attainable. A disadvantage of many approaches is the reliance on prominent key characteristics in the classification process. This often leads to poor overall classification accuracy, particularly in more complex scenarios, e.g., in weed detection there may not be any prominent features; in the case of cattle detection there may be variations in colour within the same breed and in other cases an object's features may appear different at different angles or rotations.

* Corresponding author.

Computers in Industry

journal homepage: www.elsevier.com/locate/compind

The goal of this research is then to explore methods that can be used to deliver both fast and accurate feature extraction and object classification. For this, an implementation of the multiple-expert colour feature extreme learning machine (MEC-ELM) [9] is proposed. The MEC-ELM is a cascading implementation of the Colour Feature Extreme Learning Machine (CF-ELM) [10], which is itself an implementation of the Extreme Learning Machine (ELM) [11] for colour object detection. The ELM, in part due to its efficient implementation and fast processing speeds, has demonstrated suitability for computer vision based problems in agricultural scenarios, including soybean classification [12] and unmanned aerial vision for palm tree detection [13], and has been benchmarked using notable feature extraction techniques [7]. The MEC-ELM can be used as both a feature extraction and classification technique. This can be achieved by adopting the summed area table (SAT) [14] (or integral image) and thereby reducing a landscape image (or video frame) to a grid of coloured blocks. The purpose of this is to provide a generic approach to HAAR features [15] and hence take advantage of the SAT's fast, multi-scale feature extraction architecture. Fast processing may require low resolution image data and this can result in a potential loss of pixel based information. To meet this challenge, the output of the SAT grid is used to generate three colour level co-occurrence matrices (CLCM) [16], one for each colour band (red, green and blue). The outcome will be a texture based analysis, with the values provided as input to each CF-ELM. The two implementations of the MEC-ELM were implemented and their processing speeds and overall accuracy compared. The algorithm is designed with the objective of fast object detection in complex and unpredictable terrain, making it an ideal candidate for use in the agriculture industry. The objective in this case is to implement the MEC-ELM as a weed detector for spraying, for unobtrusive cattle tracking and as a vehicle interaction and avoidance tool. This paper will benchmark the MEC-ELM with pre-recorded video data, for eventual use in a remote laptop interface for interaction with an unmanned aerial vehicle (UAV or drone) and stationary surveillance devices.

2. Theory/calculation

2.1. Multiple-expert colour extreme learning machine (MEC-ELM)

The ELM is a single layer, feedforward neural network that is known for its fast and analytical training phase. In this phase the output of the neural network's hidden layer is stored in a matrix designated H; the output weights are then determined analytically. This can be expressed [17,18]:

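\[
H(W_1,\ldots,W_{\tilde{N}},\, b_1,\ldots,b_{\tilde{N}},\, x_1,\ldots,x_N) =
\begin{bmatrix}
g(W_1 \cdot x_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot x_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot x_N + b_{\tilde{N}})
\end{bmatrix}_{N \times \tilde{N}}
\tag{1}
\]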

where $H$ is the hidden layer output matrix, $N$ is the number of samples used in the training phase and $\tilde{N}$ is the number of neurons in the hidden layer. In the activation function $g(\cdot)$ [19], $W$ is the input weight, $x$ is the input sample pixels and $b$ is the bias.
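The hidden layer output matrix of Eq. (1) can be sketched in a few lines of Python. This is an illustrative sketch only: the sigmoid activation, the uniform weight range of [-1, 1] and the function names are assumptions made for the example, not details taken from the paper.

```python
import math
import random


def sigmoid(z):
    """Assumed activation g(); the paper does not name a specific function here."""
    return 1.0 / (1.0 + math.exp(-z))


def random_hidden_layer(n_inputs, n_hidden, seed=0):
    """Draw random input weights and biases (uniform in [-1, 1], an assumption)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return weights, biases


def hidden_layer_output(samples, weights, biases):
    """Return H with one row per training sample and one column per hidden neuron."""
    H = []
    for x in samples:
        row = []
        for w, b in zip(weights, biases):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b  # W . x + b
            row.append(sigmoid(z))
        H.append(row)
    return H
```

With N samples and Ñ hidden neurons the returned list has N rows of Ñ values each, matching the N × Ñ shape given in Eq. (1).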

The Colour Feature Extreme Learning Machine (CF-ELM) is similar in architecture to the ELM, comprising a single layer, feedforward neural network with a partially connected hidden layer and a fully connected output layer. The hidden layer is divided into 3 sections and this gives the CF-ELM the ability to be used with different colour models, including red, green, blue (RGB), luminance, chrominance red, chrominance blue (Y’UV) and hue, saturation, value (HSV) [20]. Y’UV (also known as YCrCb) is defined by the International Telecommunication Union as ITU-R BT.601 [32]. Like the standard ELM, it uses randomly assigned weights in the hidden layer and, by using the pseudoinverse, it can analytically determine the output weights from a fully connected output layer. By dividing the hidden layer into 3 sections, each colour attribute can be processed in a separate section of the hidden layer and it is for this reason that the number of neurons in the hidden layer must be a multiple of 3. Storing the output of these 3 sections in 3 sections of the H matrix allows the matrix to be used to determine the output weights in the same way as the standard ELM. For Y’UV the CF-ELM hidden layer can be expressed [10]:

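\[
H(W_1,\ldots,W_{\tilde{N}},\, b_1,\ldots,b_{\tilde{N}},\, Y'_1,\ldots,Y'_N,\, U_1,\ldots,U_N,\, V_1,\ldots,V_N) =
\begin{bmatrix}
g(W_1 \cdot Y'_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot Y'_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot Y'_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot Y'_N + b_{\tilde{N}}) \\
g(W_1 \cdot U_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot U_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot U_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot U_N + b_{\tilde{N}}) \\
g(W_1 \cdot V_1 + b_1) & \cdots & g(W_{\tilde{N}} \cdot V_1 + b_{\tilde{N}}) \\
\vdots & & \vdots \\
g(W_1 \cdot V_N + b_1) & \cdots & g(W_{\tilde{N}} \cdot V_N + b_{\tilde{N}})
\end{bmatrix}_{3N \times \tilde{N}}
\tag{2}
\]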

where $Y'$, $U$ and $V$ are the individual colour pixel matrices for each image and are stored in the $H$ matrix at the output of the hidden layer. The hidden layer process is repeated in the output layer, with the hidden layer output $H$ becoming the input that is multiplied by the output weights $\beta$. The output $T$ of the CF-ELM is then the result of $H\beta$:

\[
T = H\beta
\tag{3}
\]

Here $\beta$ can be expressed:

\[
\beta =
\begin{bmatrix}
\beta_1^{T} \\
\vdots \\
\beta_{\tilde{N}}^{T}
\end{bmatrix}_{\tilde{N} \times m}
\tag{4}
\]

where $m$ is the number of neurons in the output layer, which is equivalent to the number of outputs of the ANN. The matrix of target outputs $T$ can be expressed as:

\[
T =
\begin{bmatrix}
t_1^{T} \\
\vdots \\
t_N^{T}
\end{bmatrix}_{N \times m}
\tag{5}
\]

where for each $t_N$ the value is stored based on the input training sample and its desired output. This leaves $\beta$ as the one unknown; by making $\beta$ the subject we get:

\[
\beta = H^{\dagger} T
\tag{6}
\]

where $H^{\dagger}$ is the Moore-Penrose pseudoinverse of matrix $H$. The output values of this process are then stored in $\beta$ and used as the weights in the output layer, removing the need for a long gradient descent based training process.
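As a brief aside on Eq. (6), the following is a standard linear algebra identity rather than part of the original derivation. Assuming $H$ has full column rank (an assumption made here, which typically requires at least as many training samples as hidden neurons), the pseudoinverse has a closed form and the resulting output weights are the least-squares solution:

```latex
\[
H^{\dagger} = (H^{T}H)^{-1}H^{T},
\qquad
\beta = H^{\dagger}T = \arg\min_{\beta}\; \lVert H\beta - T \rVert_{F}^{2}
\]
```

When $H$ is rank deficient the closed form does not apply, and the pseudoinverse is usually computed through a singular value decomposition instead.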

The MEC-ELM is then a set of CF-ELMs, where each CF-ELM can be trained on a different set of sample images, a different colour system or different image analysis techniques. The MEC-ELM becomes a global consensus of all CF-ELMs or experts; the goal is then to find individual CF-ELMs of high classification accuracy but with varying consensus [21]. The method used in this paper was inspired by the Exemplar SVM [22] and the Ensemble ELM (EN-ELM) [23], where training samples are divided among different instances of CF-ELMs. The training phase of the MEC-ELM is depicted in Fig. 1, where $K$ is the number of CF-ELMs, each instance is trained on one $K$th of the training set and $\frac{1N}{K}$ to $\frac{KN}{K}$ (or $N$) denotes the final image for each instance. A predetermined number of values is sent to the input of each CF-ELM, where RGB values are first processed; this will be discussed further in the methodology in Section 3.1.
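The division of the training set among K experts can be sketched as follows. This is a minimal sketch under stated assumptions: the contiguous split and the simple vote-count consensus rule (`min_agree`) are illustrative choices, and both function names are hypothetical; the paper does not specify its consensus rule at this point.

```python
def partition_training_set(samples, k):
    """Split samples into k contiguous, near-equal parts, one per expert.

    Part i covers indices from i*N/K up to (i+1)*N/K (integer division),
    so the final part always ends at N.
    """
    n = len(samples)
    bounds = [(i * n) // k for i in range(k + 1)]
    return [samples[bounds[i]:bounds[i + 1]] for i in range(k)]


def consensus(votes, min_agree):
    """Assumed rule: positive when at least min_agree experts vote positive."""
    return sum(1 for v in votes if v) >= min_agree
```

For example, ten samples split across three experts give parts of 3, 3 and 4 samples, and `consensus([True, False, True], 2)` reports a positive detection.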

2.2. Summed area table (SAT)

The SAT, which is also known as the integral image (II), was made popular by the Viola and Jones object detection framework [24]. The SAT allows fast object detection by converting an image into a summed representation of all pixel values and from these values a HAAR based feature set can be used for broad template matching. The HAAR based feature set can be quite large and the AdaBoost algorithm [25] is often used to find the features most commonly associated with a set of training images. The HAAR features are rectangular features with alternating dark and light areas designed to match the dark and light areas (or light intensity) associated with a target object. These blocks match the block based summation values found in the SAT, which means that these dark and light areas can be matched to areas within the image very quickly. Only four sets of coordinates are required to calculate the light intensity of one sub-block. To create the SAT, all pixel values in the table are stored as summations of the pixel value at the current location and the preceding summation values. In this fashion, the SAT can be generated from just one pass over the image. This can be expressed [14]:

\[
II(i,j) = I(i,j) + II(i-\Delta i,\, j) + II(i,\, j-\Delta j) - II(i-\Delta i,\, j-\Delta j)
\tag{7}
\]

where $II(i,j)$ is a single coordinate of dimensions $i$ and $j$ of the SAT, $I(i,j)$ is the pixel value at that coordinate and the decrements ($\Delta i$, $\Delta j$) refer to the preceding pixel summation value; here $\Delta i, \Delta j = 1$. The sum light intensity value of any rectangular area can then be calculated using the four coordinates of each corner of the rectangle. This can be expressed:

\[
S(i,\, j,\, i-w,\, j-h) = II(i,j) + II(i-w,\, j-h) - II(i-w,\, j) - II(i,\, j-h)
\tag{8}
\]

where $S$ refers to the sum of the rectangle's pixel values, with bottom right coordinates $i$ and $j$ and dimensions $w$ (width) and $h$ (height). The resulting magnitudes are then normalised by dividing the magnitude by the number of pixels within the rectangle. This will produce an average light intensity value between 0 and 255. In this paper red, green and blue SATs were created instead of one SAT for grey-scale (or light intensity). This allowed the CF-ELM to work with each individual colour band as required.
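Eqs. (7) and (8) can be illustrated with a short Python sketch for a single colour band. It is not the authors' C implementation: the function names are hypothetical, and as a simplifying assumption the table is padded with a leading row and column of zeros and rectangles are addressed by their top-left corner rather than the bottom-right corner used in Eq. (8).

```python
def build_sat(band):
    """Build a summed area table in one pass over a 2D list of pixel values.

    The table has one extra zero row and column so that border
    rectangles need no special cases.
    """
    h, w = len(band), len(band[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            sat[y + 1][x + 1] = (band[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    return sat


def rect_sum(sat, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left pixel (x, y): four lookups."""
    return sat[y + h][x + w] + sat[y][x] - sat[y][x + w] - sat[y + h][x]


def rect_mean(sat, x, y, w, h):
    """Average intensity of the rectangle (sum divided by pixel count)."""
    return rect_sum(sat, x, y, w, h) / (w * h)


def sat_grid(sat, blocks_per_side):
    """Reduce the image to blocks_per_side**2 mean-intensity blocks.

    Assumes the image is at least blocks_per_side pixels wide and high.
    """
    h, w = len(sat) - 1, len(sat[0]) - 1
    grid = []
    for by in range(blocks_per_side):
        y0, y1 = by * h // blocks_per_side, (by + 1) * h // blocks_per_side
        for bx in range(blocks_per_side):
            x0, x1 = bx * w // blocks_per_side, (bx + 1) * w // blocks_per_side
            grid.append(rect_mean(sat, x0, y0, x1 - x0, y1 - y0))
    return grid
```

Calling `sat_grid` once per colour band yields the three per-band grids of block averages; with `blocks_per_side=5` this gives 25 blocks per band.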

2.3. Colour-level co-occurrence matrix (CLCM)

The CLCM [16] (or grey-level GLCM) is an image analysis tool based on light intensity and looks at the relationship between pixels and their nearest neighbour [26]. The CLCM has 3 or more possible variants, including first order, second order and third order. Each order describes a tabulation of different combinations of pixels. The first order is essentially a one dimensional histogram of pixel intensities; it does not look at the relationship between neighbouring pixels and can be used to determine statistics such as mean and variance [27]. Second order CLCMs look at the relationship between a pixel and one of its neighbours, creating a two dimensional array (or matrix) of combination occurrences. The second order can be used for texture analyses and will be utilised in this paper. Third or higher order CLCMs look at 2 or more of a pixel's nearest neighbours. These implementations are computationally more expensive and more difficult to interpret; for this reason they were not used in this research. The initialisation of a second order co-occurrence matrix can be expressed [28]:

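\[
P(i,j) = \sum_{x=1}^{n} \sum_{y=1}^{n}
\begin{cases}
1, & \text{if } I(x,y) = i \text{ and } I(x+\Delta x,\, y+\Delta y) = j \\
0, & \text{otherwise}
\end{cases}
\tag{9}
\]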

where $P(i,j)$ represents total events and is an individual element in the CLCM, $n$ by $n$ are the dimensions of a rectangular image, $x$ and $y$ are coordinates within an image and $\Delta x$ and $\Delta y$ are the distance from a pixel to its neighbour in the right direction (these same combinations are checked in the left direction as well); in this case $\Delta x, \Delta y = 1$. For this research the intensity levels were reduced from 256 to 16 during initialisation of the CLCM. This was done to reduce the occurrence of nil values. After initialisation, values are normalised by dividing all values in the CLCM by all possible combinations in the CLCM. Total potential combinations can be calculated from known dimensions as $2 \times n \times n$, where 2 represents the left and right directions. Texture analysis values can then be calculated from the values in the CLCM. Five statistics were chosen; these values were selected as they were the least computationally intensive and provided good results in preliminary testing. These included energy, entropy, contrast, homogeneity and CLCM mean, which can be expressed [28,26]:

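\[
\mathrm{Energy} = \sum_{i,j=0}^{D-1} P(i,j)^{2},
\tag{10}
\]

\[
\mathrm{Entropy} = \sum_{i,j=0}^{D-1} -\ln\bigl(P(i,j)\bigr)\, P(i,j),
\tag{11}
\]

\[
\mathrm{Contrast} = \sum_{i,j=0}^{D-1} P(i,j)\,(i-j)^{2},
\tag{12}
\]

\[
\mathrm{Homogeneity} = \sum_{i,j=0}^{D-1} \frac{P(i,j)}{1+(i-j)^{2}},
\tag{13}
\]

\[
\mathrm{CLCM\ Mean} = \sum_{i,j=0}^{D-1} i\,P(i,j)
\tag{14}
\]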

where $D$ is the number of values within the CLCM. In this research three CLCMs were created to match the three colour band requirements of the CF-ELM. These included a red level, green level and blue level co-occurrence matrix. To avoid confusion these will be referred to as a colour-level co-occurrence matrix (CLCM), but it is noteworthy that a grey-level (GLCM) is more common in literature.
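Eq. (9) and the five statistics of Eqs. (10)–(14) can be sketched in Python for one colour band. Several details are assumptions made for this sketch: the neighbour offset defaults to a purely horizontal step (`dx=1, dy=0`), the matrix is normalised by the number of pairs actually counted so that its entries sum to 1 (the text normalises by $2 \times n \times n$), and zero entries are skipped in the entropy sum. The function names are hypothetical.

```python
import math


def build_clcm(band, levels=16, dx=1, dy=0):
    """Second order co-occurrence matrix for one band of 0..255 values.

    Each neighbouring pair is counted in both directions, which makes
    the matrix symmetric.
    """
    rows, cols = len(band), len(band[0])
    q = [[v * levels // 256 for v in row] for row in band]  # 256 -> 16 levels
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                a, b = q[y][x], q[y2][x2]
                counts[a][b] += 1
                counts[b][a] += 1
                total += 2
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[c / total for c in row] for row in counts]


def texture_stats(P):
    """Energy, entropy, contrast, homogeneity and CLCM mean of matrix P."""
    stats = {"energy": 0.0, "entropy": 0.0, "contrast": 0.0,
             "homogeneity": 0.0, "mean": 0.0}
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if p == 0.0:
                continue  # ln(0) is undefined; empty cells contribute nothing
            d2 = (i - j) ** 2
            stats["energy"] += p * p
            stats["entropy"] += -math.log(p) * p
            stats["contrast"] += p * d2
            stats["homogeneity"] += p / (1 + d2)
            stats["mean"] += i * p
    return stats
```

Running `texture_stats(build_clcm(band))` once for each of the red, green and blue bands gives the three sets of five texture values described above.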

Fig. 1. The MEC-ELM with each CF-ELM trained on a different portion of the training set.


3. Methodology

All instances of the MEC-ELM were programmed and tested using the C programming language; C was chosen due to portability and faster processing speeds. It is also possible to use variants of C in many embedded devices [29]. All benchmarking was conducted using a laptop computer with a Linux based system, 16 gigabytes of RAM, a solid state drive and a 4th generation i7 mobile processor. All time benchmarking was conducted using the clock_gettime function with the CLOCK_MONOTONIC option from the time.h library. All images were stored as JPEG and decompressed using the jpegIO.h library (also known as libjpeg); before pre-processing, images were stored in memory as RGB values between 0 and 255. JPEG was chosen for storage and file transfer reasons; it is also an output available in a number of remote camera interfaces. Images were stored using 4:4:4 sub-sampling, which is perceptually lossless, and were saved using the default setting 4:2:0 (lossy) in libjpeg for user feedback/external processing (smaller file). In agriculture this means that the lesser quality images are better prepared for the potential wireless transfer constraints of remote environments.

3.1. The algorithm

The algorithm is a multiple stage process based primarily on Sadgrove et al. [9,10] for the initialisation of the CF-ELM for remote computer based classification. These stages included (i) an initialisation stage, where the neural network is initialised into memory with initial random weights; (ii) a weight biasing stage, where the weights were biased to the training data, based on the CIW-ELM [30]; (iii) a training stage, where the output weights were analytically determined with assistance from the Clapack library [31]; (iv) a tuning stage, where optimal threshold values were found based on a tuning set of 100 images (this stage was also used to assist in an off-line training process, where optimal individual weight sets for each CF-ELM were saved in text based data files so that they can be imported instead of retraining); and (v) a testing stage, where the individual CF-ELMs were tested on the image frames. The weight biasing, training, tuning and testing stages differed for each of the two implementations and from the standard CF-ELM algorithm. These differences will be discussed. Of the two implementations of the MEC-ELM tested in this paper, the primary difference is that one utilises the CLCM and the other does not. This means that the CLCM MEC-ELM has CF-ELMs with just 5 inputs to match the 5 texture values. The Y’UV MEC-ELM has inputs matching the number of grid blocks from the integral image. In both cases the grid blocks will be referred to as the SAT grid. In this implementation the RGB image is converted to Y’UV before being processed as a SAT grid.

Frame Extraction: prior to testing, each video described in Section 3.2 was extracted into individual frames. The frames were extracted from MPEG video format to JPEG using the FFmpeg package available to bash in Linux. The frames were extracted at two frames per second to match the expected processing time of the algorithm, although this did not take into account the CLCM and tracking algorithms. The video used for testing in the case of the cattle aerial footage was only 18 s long; for this reason six frames per second were extracted instead. To elaborate, the extraction of two frames per second was deemed adequate for object detection in remote environments, but the emphasis in this paper was on evaluating the effectiveness of the algorithm and for this reason more frames were required.

The SAT Grid: during the training, testing and tuning processes, all images were first converted to a summed area table. The SAT was then used to extract an average colour grid of the image or target area. There was no limit on the number of coloured blocks within the grid, but in this research between 25 and 100 blocks were used and this depended on how successful classification was at each block number. As processing speed was a priority, lower block numbers were preferred. For the Y’UV variant of the MEC-ELM these values were sent directly to each CF-ELM for processing. This is on display in Fig. 2, where 3 SAT is the three individual summed area tables for the colour bands Y’, U and V. There is also an example of an image converted to a SAT grid with 196 blocks in Fig. 3.

Scaling: each frame was read as a selection of rectangles. The rectangle size was produced by dividing the width and height of the image frame by a number determined to be effective in pretesting. At each iteration the search rectangle was moved through the image frame (typically by 10–20 pixels at a time). To search with a larger or smaller rectangle the divisor was decreased or increased respectively, and this caused a scale change.

The CLCM: the block RGB values extracted from the SAT were processed into a CLCM for use with the CLCM variant of the MEC-ELM. During this process the red, green and blue level co-occurrence matrices were produced and from each an energy, entropy, contrast, homogeneity and CLCM mean value was produced. The five values were then sent to the five inputs of each CF-ELM for processing. A diagram is on display in Fig. 4. Note the extra stage 3 CLCM, where R, G and B SATs are used to make 3 CLCMs for texture values energy, entropy, contrast, homogeneity and CLCM mean.

Fig. 2. Diagram of the standard Y’UV variant of the MEC-ELM.

Fig. 3. Left image is a SAT grid of the image on the right with 196 blocks.
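The scaling step, in which a search rectangle is derived from a divisor and moved through the frame, can be sketched as a simple generator. The function name, the argument names and the choice to keep every rectangle fully inside the frame are assumptions made for this sketch.

```python
def search_rectangles(frame_w, frame_h, divisors, step):
    """Yield (x, y, w, h) search rectangles, one pass per divisor (scale).

    A smaller divisor gives a larger rectangle. Rectangles advance by
    `step` pixels and are kept fully inside the frame.
    """
    for d in divisors:
        w, h = frame_w // d, frame_h // d
        if w == 0 or h == 0:
            continue  # divisor too large for this frame
        for y in range(0, frame_h - h + 1, step):
            for x in range(0, frame_w - w + 1, step):
                yield (x, y, w, h)
```

Each yielded rectangle would then be reduced to a SAT grid and passed to the experts; using two to four divisors corresponds to the 2–4 scales per dataset mentioned in the discussion.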


Object Detection: to improve the precision and recall rates of the algorithm, areas of interest were saved every time a tested rectangle within the image frame produced a positive classification. If an area produced multiple classifications at multiple scales and in close proximity, a square was drawn in the area based on saved coordinates. Each classification was stored with an (x, y) coordinate for the top right hand corner and the bottom left hand corner of the rectangle. A classification was considered in the same area if coordinates for a tested rectangle were within a percentile area of previously saved coordinates. The number of classifications that triggered a square to be drawn differed for each test set. Typically if 1–10 detections of the target object were found in an area, this was a good indication that something was there. Each search area was within 10–100% radius of the first classified rectangle.

Tracking Assist: to assist tracking objects across multiple frames, a single neuron extreme learning machine (SNELM) was utilised and the output of the SAT was used as its input. The SNELM was initialised based on just one image and the RGB values were reduced to a single 16 bit colour value [33]. Preliminary testing with grey-level values did not improve results. The training image was arbitrarily selected from the tuning set. After an object is detected the base SNELM is updated to reflect the detected object. In the following frame the SNELM is then primed to detect the same or a similar object. As training is based on a single image, the pseudoinverse is not required; this can be expressed:

\[
\beta = \frac{T}{(65536R + 256G + B)\,C}
\tag{15}
\]

where $R$, $G$ and $B$ are the arrays of colour values from the grid extracted from the SAT and $C$ is the number of grid blocks. $\beta$ is one dimensional and serves as an input weights container, with length equal to $C$, and $T$ contains only one value, as only one image is used in training. This value is a target and can be set to 1. Processing an object can then be expressed:

\[
\mathrm{Output} = \sum_{i=0}^{C-1} \beta(i)\,(65536R + 256G + B)
\tag{16}
\]

The output is then checked against a predefined threshold (typically between 0.1 and 1).
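Eqs. (15) and (16) can be illustrated with the following Python sketch. It assumes the equations are applied element-wise, one grid block at a time, and it maps fully black blocks (packed value 0) to a weight of 0 to avoid division by zero; both points, and the function names, are assumptions rather than details from the paper.

```python
def pack_rgb(r, g, b):
    """Combine one block's R, G and B values into a single number."""
    return 65536 * r + 256 * g + b


def train_snelm(grid, target=1.0):
    """Return one weight per block from a single training grid (Eq. 15).

    grid is a list of (R, G, B) block averages; C is its length.
    """
    c = len(grid)
    beta = []
    for r, g, b in grid:
        packed = pack_rgb(r, g, b)
        beta.append(target / (packed * c) if packed else 0.0)
    return beta


def snelm_output(beta, grid):
    """Weighted sum over all blocks (Eq. 16)."""
    return sum(w * pack_rgb(r, g, b) for w, (r, g, b) in zip(beta, grid))
```

Feeding the training grid back through `snelm_output` returns the target, since each of the C blocks contributes target/C; a grid with different colours produces a different value, which is what the threshold check relies on.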

3.2. Datasets

Four datasets were used in training, tuning and testing. The number of images in each dataset is listed in Table 1. The tuning set contained 50 images of the object from the training set and 50 images of random surrounding terrain from the test frames. All images used in testing and tuning were 100 by 100 pixels in resolution. This resolution was chosen based on a trade-off between the higher precision of high resolution images and an improvement in processing time. High resolution images were not considered important due to the SAT grid producing a low resolution version of each image. The resolution of the video frames is based on the resolution of the original pre-recorded video. The four datasets were chosen based on a potential use in the agriculture industry; a particular emphasis was placed on unmanned aerial vehicle (UAV or drone) technology. The datasets included a weed detection scenario involving Cirsium vulgare (or Bull Thistle), two cattle detection scenarios and an all terrain vehicle (ATV or quad bike) scenario. The Bull Thistle scenario was chosen as thistle competes with other more valuable pasture, which is a potential problem for livestock. Detection of the thistle allows for both weed surveillance and targeted spraying, saving money, pasture and the environment. The two cattle detection scenarios involved cattle recorded by a riverbed with a stationary surveillance camera and cattle recorded by a drone hovering over green pasture. The cattle detection scenario displays the algorithm's potential as a cattle tracking and counting algorithm. The quad bike scenario involved a rider driving through a rural countryside. Vehicle detection can be used in both accident detection and prevention. The technical aspects of each dataset are listed:

Bull Thistle: this dataset was sourced in two different ways. The training set was photographed using a 10 mega-pixel Fujifilm handheld camera on default settings and at a fixed distance of 2 m to simulate aerial surveillance. The images were captured in JPEG format at a resolution of 3648 by 2736 before cropping to 100 by 100. The video used to produce the testing frames was captured by a DJI Phantom quadcopter in MP4 for 60 s at 4069 by 2178 pixels, at a distance of 10 m and on default settings. Both were captured from pasture fields on the University of New England SMART Farm (Longitude 151° 35 min 40 s E, Latitude 30° 26 min 09 s S), in sunny conditions between midday and late afternoon.

Cattle Drone: this data was sourced from van Gemert et al. [34], allowing comparison of results of the MEC-ELM and the Exemplar Support Vector Machines (Exemplar-SVM) presented. The video footage was recorded using an AscTec Pelican quadcopter, with a mounted GoPro 3: Black Edition action camera, in overcast conditions and at varying distances (around 30–50 m). The video was recorded in 1920 by 1080 pixels at 60 frames per second. The video selected for testing was 18 s long. The training data was made up of cropped and resized images from the remaining videos in the dataset. To extend the training dataset, images were flipped vertically using a bash command.

Fig. 4. Diagram of the CLCM variant of the MEC-ELM.

Table 1

Number of images in each dataset, with resolution of video frames.

Dataset             Training   Tuning   Testing   Resolution
Bull Thistle        688        100      120       2000 × 1500
Cattle Drone        1331       100      110       1920 × 1080
Cattle Stationary   500        100      120       640 × 480
ATV                 608        100      144       1280 × 720


Cattle Stationary: the dataset is a collection of surveillance videos taken from a nearby farm using a Scoutguard SG860C camera at 640 by 480 pixels, at 16 frames per second for 60 s and in AVI format (before conversion to MP4). The videos included Poll Hereford cattle surrounding a riverbed and other areas for grazing. The training set was made up of images cropped from some of these videos. A video not used to generate the training images was used as the test video. Each video was captured at different times of day and in different weather conditions. The distance to the cow depended on how close they got to the camera.

ATV: the video of the quad bike rider was captured using a DJI Phantom 3 Adv in areas of Lough Coolin and Mount Gable in Ireland in overcast conditions and at varying distances (around 20–50 m). Precise specifications of the video were not available. The video was downloaded from YouTube for fair use in MP4 format and at a resolution of 1280 × 720 pixels. The video was submitted to YouTube by user and channel name “JGKDrone” [35]. As only one video was available the training and test sets were made up of different sections of the 3 min 37 s video; exactly 144 frames or 1 min and 12 s of video were used for testing sequences and the rest was used as the training set. Areas that were overly dark or shadowy were avoided for reasons that will be explored in the discussion, although these were still used in the training dataset. Frames used in testing were cropped to surround the quad bike. Images were flipped vertically using bash to increase the number of training images. Some training images overlapped slightly with the test set.

4. Results

The results from each dataset are split, with the Y’UV input results in Table 2 and the CLCM input results in Table 3. The tables include the name of each dataset, the number of blocks used in the SAT grid, the average accuracy of individual CF-ELMs achieved in the tuning phase, the average time taken in testing per frame and the precision and recall rates for all frames. Accuracy, precision and recall can be expressed:

\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},
\tag{17}
\]

\[
\mathrm{Precision} = \frac{TP}{TP + FP},
\tag{18}
\]

\[
\mathrm{Recall} = \frac{TP}{TP + FN},
\tag{19}
\]

where TP is true positive, TN is true negative, FP is false positive and FN is false negative.
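Eqs. (17)–(19) translate directly into code. In the sketch below the function names are illustrative and returning 0.0 for an empty denominator is an assumption made to keep the functions total; any counts passed in are the caller's own and are not taken from the paper's results.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all classifications that were correct (Eq. 17)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Fraction of positive classifications that were correct (Eq. 18)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that were found (Eq. 19)."""
    return tp / (tp + fn) if (tp + fn) else 0.0
```

With hypothetical counts of 45 true positives, 40 true negatives, 10 false positives and 5 false negatives, these give an accuracy of 0.85, a precision of about 0.82 and a recall of 0.9.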

The results in Tables 2 and 3 indicate better overall performance in the cattle and ATV datasets for the MEC-ELM with CLCM texture inputs. The MEC-ELM with Y’UV inputs performed best on the thistle dataset with a precision rate of 98% and recall at 84%. The Y’UV variant produced much faster processing times, with all datasets processing in around half a second to a second per frame. The CLCM variant took between 1 and 2 s longer due to the time needed to process the CLCM and get the texture values. The Y’UV variant did not perform very well in the cattle datasets. In both cases it required more grid blocks and recorded many false positives. This included the reflection of the cattle in the water in the case of the stationary camera and a large number of false positives in the tree line and on the helipad in the drone dataset; this dataset was also the slowest of the four because of the multiple scales required to get true positives. The CLCM variant recorded a number of false positives on the helipad as well, particularly in the last ten frames. Discarding the last ten frames puts the precision at 84%. Sample output frames from each of the 4 datasets are in Fig. 5, where B (cattle stationary) and D (bull thistle) are from the Y’UV variant and A (quad bike) and C (cattle drone) are from the CLCM variant. Note the detection of the cow reflection in B. This problem did not occur in the CLCM variant.

In Fig. 6 there is a receiver operating characteristic (ROC) curve depicting the average classification rates for different sized SAT grids, with area under the curve (AUC). There were four different sized SAT grids tested and these tests were conducted using the tuning images, with 50 images of the target object and 50 images of surrounding landscape. For this test the thistle dataset with the Y’UV variant was chosen. This was chosen as a demonstration dataset, as the difference between true positive and false positive rates appeared most obvious. The test was conducted once with SAT grids of sizes 9, 25, 100 and 2500. The four MEC-ELMs were trained and tested on the tuning images and the average TP and FP rates from the four were saved. The average rates were then used to make the ROC curve. As can be seen, the larger the SAT grid the better the results, with the best results coming from the SAT grid with 2500 blocks. Notably the results are not as good as the tuning accuracies found in the result Tables 2 and 4. The testing in this case was based on an on-line training method where all weights were randomised and tested immediately after. The MEC-ELMs used to get the precision and recall rates were trained using an off-line method, where the CF-ELMs were initialised a number of times and the weights for the CF-ELM producing the best results were saved in a text based file [9]. This meant the results in Fig. 6 were based on the average of randomly trained classifiers, rather than the best of 100 classifiers. In further testing, processing with 9 blocks took an average of 0.35 s per frame, 25 blocks took 0.44 s per frame, 100 blocks took 1.1 s per frame and 2500 blocks took around 67 s per frame.

Table 2

Testing results for each dataset using Y’UV inputs based on all frames of test videos, with accuracy detected during the tuning stage.

Y’UV Dataset        SAT grid blocks   Accuracy in tuning   Time/frame (s)   Precision   Recall
Bull Thistle        25                97%                  0.44             98%         84%
Cattle Drone        100               81%                  1.19             23%         53%
Cattle Stationary   100               89.25%               0.53             84%         78%
ATV                 25                93%                  0.49             91%         78%

5. Discussion

Multiple expert or cascading approaches to object detection are often limited to rudimentary approaches and this is mostly due to time constraints, particularly in the pursuit of real-time results. This paper has presented a multiple-expert colour feature extreme learning machine (MEC-ELM) for real-time feature extraction and object classification. The algorithm allowed real-time results by adopting a summed area table and hence converted the images into a low resolution and multiple scale environment. This allowed the MEC-ELM to process video frames and classify objects in less than a second per frame for Y’UV inputs and 1–2 s in the case of the CLCM implementation on a standard mobile CPU. The multiple scale environment gave the MEC-ELM the ability to process at multiple scales per second, with 2–4 different scales used in each of the datasets. This means that objects can be detected up close, far away or in smaller forms (a calf or thistle in rosette). Another advantage of using the SAT grid is the ability to produce low resolution images. Although this would be a problem for classifiers that require high resolution images, a cascading approach such as the MEC-ELM can improve accuracy by forming a consensus among experts. The SAT grid is essentially a naive scale [36] and at 25 blocks per image, the resolution would be considered quite low. There are many advantages to this: by reducing sections of the image into blocks there is the possibility to reduce noise in the image and, by reducing the size of each frame, the video feed could be processed at a much lower resolution. This would decrease the processing times by a considerable amount. This would also have an impact on hardware restrictions; that is, processing smaller images would lessen the memory requirements for onboard computing and allow faster transfer speeds while lessening storage issues.

Low resolution images have some drawbacks, loss of pixel informationforexamplecanbeaseriousproblem.Effortsshould bemadetopreserveand/orretrievesomeofthislostdataandthis is reasontheCLCMvariantof theMEC-ELMwasproposed. The CLCMwasslowerthantheY’UVvariant,butmanagedtoalleviate some of the problems found in the cattle datasets. The Y’UV variant,althoughverygoodatdetectingobjectswherecolourisof someimportance(theredquadbikeandgreenthistle),struggled detectingmulti-colouredobjects,suchasblackand whitecows anddeliveredalotoffalsepositives.TheCLCMremovedmanyof thefalsepositivesandwasabletodetectmultiplecolourobjects withlittledifﬁculty;1–2satmultiplescales.Thisresultwasan improvement whencompared tosimilarsolutions inliterature.

The “Verschoor Aerial Cow Dataset” was previously tested at 32–144 s per frame and at around 66% precision and 87% recall [34].

The CLCM variant, although it was only tested on one video from the dataset, produced 78% precision and 93% recall in 1.9 s. These times make the MEC-ELM comparable to the fast processing YOLO object detection framework, which has processing times of around 6–12 s per frame [37] on a standard CPU. It is however difficult to compare at this stage, as YOLO is commonly benchmarked on a high end GPU and the implementation cited above was programmed in Python and not C. A comparison between C and Python is available in Fourment and Gillings [38]. This paper has limited itself to a mobile CPU for the purpose of field based analysis. It is conceivable however that GPUs will become more commonplace in the agricultural environment.

Conveniently, the SAT grid is quite adaptable to different resource constraints, and as the availability of hardware resources increases, the number of blocks used could also increase. This would increase the accuracy of the algorithm as the constraints are lessened. At lower resolutions a GPU would still be advantageous, allowing processing of much larger areas while keeping processing times down.
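To make the texture-based variant discussed above more concrete, the following is a minimal sketch of a generic one-dimensional co-occurrence matrix and three common texture measures. It is not the authors' CLCM: the quantisation step, the pairing of neighbouring block values, and the choice of contrast, energy and homogeneity are all assumptions made for illustration, and the function names are hypothetical.

```python
def quantise(values, levels, max_value=255):
    """Map values in 0..max_value onto integer levels 0..levels-1."""
    return [min(int(v * levels / (max_value + 1)), levels - 1) for v in values]


def cooccurrence_matrix(values, levels, offset=1):
    """Count how often level i is followed by level j, `offset` positions later."""
    matrix = [[0] * levels for _ in range(levels)]
    for a, b in zip(values, values[offset:]):
        matrix[a][b] += 1
    return matrix


def texture_features(matrix):
    """Contrast, energy and homogeneity of a normalised co-occurrence matrix."""
    total = sum(sum(row) for row in matrix)
    features = {"contrast": 0.0, "energy": 0.0, "homogeneity": 0.0}
    if total == 0:
        return features
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            p = count / total  # joint probability of the level pair (i, j)
            features["contrast"] += (i - j) ** 2 * p
            features["energy"] += p * p
            features["homogeneity"] += p / (1 + abs(i - j))
    return features


if __name__ == "__main__":
    block_means = [0, 127, 128, 255]      # illustrative values for one colour band
    levels = quantise(block_means, 2)     # two levels keep the example readable
    matrix = cooccurrence_matrix(levels, 2)
    print(matrix, texture_features(matrix))
```

In a pipeline of this shape, the resulting texture values rather than the raw block colours would form the classifier input, which is one plausible reason a texture variant copes better with multi-coloured objects.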

Notably, the results were not fully balanced across the datasets; the differences between precision and recall in some of the datasets, for example, could have been much closer. This could be achieved with more precise tuning or a well established tracking algorithm [39,40]. The tracking algorithm developed was minimal to save processing time; future research could explore more sophisticated tracking algorithms [41]. Another issue that could be resolved is the problem of shadows or overcast conditions. The quad bike video, for example, went through a lot of transitional lighting and the classification would suffer during these phases. This can depend on the training set of course, but the incorporation of illumination invariant or illumination robust images would likely further improve the performance of the algorithm.
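The balance between precision and recall mentioned above can be made explicit with a short sketch, assuming the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The counts used below are hypothetical and are not taken from the test videos.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts; 0.0 when undefined."""
    detected = true_positives + false_positives   # everything the detector reported
    actual = true_positives + false_negatives     # everything that was really there
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts: a loose threshold finds more objects but raises false alarms.
    print("loose threshold :", precision_recall(45, 30, 5))
    # A stricter threshold removes false alarms at the cost of missed objects.
    print("strict threshold:", precision_recall(40, 10, 10))
```

Tuning a detection threshold moves counts between false positives and false negatives, which is why tighter tuning (or tracking that suppresses isolated false detections) can bring the two measures closer together.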

Table 3
Testing results for each dataset using the CLCM texture values based on all frames of the test videos, with accuracy detected during the tuning stage.

CLCM Dataset        SAT grid blocks   Accuracy in tuning   Time/frame (s)   Precision   Recall
Bull Thistle        25                91%                  1.2              84%         81%
Cattle Drone        25                93.5%                1.9              78%         93%
Cattle Stationary   25                94%                  1.3              85%         84%
ATV                 25                90%                  2.3              84%         91%

Fig. 5. Sample result from detection in each dataset.

Fig. 6. ROC curve measuring the average classification rates for different sized SAT grids.

The results demonstrate the ability of the MEC-ELM as a low resolution real-time object detection algorithm for drone and surveillance video capture in the agriculture industry. The best choice of algorithm depends on the dataset, with the Y’UV variant performing better in weed detection and the CLCM variant performing better in cattle detection, while both performed well in ATV detection.

The choice of datasets and capture methods gives a good indication of how the algorithms will perform in a live scenario. The algorithms were benchmarked on a laptop, while real-time wireless transfer has already been established [42]. Calibrating the algorithm for live use should now be a process of choosing the right frame rate and compression methods. JPEG images were chosen for this reason, as the format produces smaller image sizes for image transfer. Global positioning systems (GPS) can be incorporated into drone and surveillance equipment to give location feedback.

The usefulness of the algorithms can be exemplified in each of the chosen datasets. In weed detection, the algorithm could be used to locate infestations quickly. A farm hand or ground based robot could then be used to deliver chemical to affected areas. Delivering chemicals to precise areas rather than entire areas would save chemical and reduce cost. The cattle detection scenario could be used to track and count cattle in an unobtrusive way. ATV detection could be used for vehicle safety monitoring, locating missing or injured persons and in collision avoidance systems. In each case the algorithm could be used in a standalone device such as a drone, unmanned ground vehicle (UGV) or stationary surveillance camera.

This paper has demonstrated through quantifiable properties the MEC-ELM's potential as a real time object detection algorithm for on board and/or remote computer based technology. The datasets used placed particular emphasis on drones and displayed the algorithm's ability to perform with aerial video data. This is particularly important given the convenience of drones, with their ability to fly long distances over difficult terrain and report back results in real time [42]. This point is particularly important in agriculture, but the MEC-ELM was designed to be a generic approach to object detection in complex environments and for this reason could prove useful in many different scenarios, particularly where fast processing is required.

6. Conclusion

Two implementations of the multiple expert colour feature extreme learning machine have been benchmarked as real time feature extraction and object classification algorithms. This included a low resolution Y’UV colour implementation and a CLCM texture based implementation. The Y’UV implementation produced superior results in the datasets where colours were uniform and a defining characteristic, including weed detection with a precision of 98% and a recall of 84%, while producing lower accuracy in multiple coloured datasets, such as cattle detection. The CLCM variant was able to produce consistent results through all the datasets and higher results overall in three, including quad bike detection with 84% precision and 91% recall, cattle detection from a stationary camera at 85% precision and 84% recall and cattle detection from a drone at 78% precision and 93% recall. The Y’UV variant was able to process video frames in approximately half a second for three of the datasets and just over 1 s for cattle detection from a drone. The CLCM processing times were between 1.2 and 2.3 s per frame. These processing times indicate the algorithm's ability to process the images within a time frame suitable for agricultural robotics applications. This is particularly notable given its ability as a low resolution classifier. Future research will involve testing the algorithm on an embedded device attached to a drone or land vehicle and testing the algorithm in different lighting conditions using different illumination equalisation techniques. There is also potential for testing the MEC-ELM on a GPU for a comparison to similar real-time object detection algorithms.

Acknowledgements

Dr Paul Meek, NSW Department of Primary Industries for the provision of cattle surveillance data. Animal Ethics UNE AEC12-042, collected under scientific licence SciLicSL100634.

Mr Andrew Rieker of V-TOL Aerospace PTY Limited, for providing footage of weeds using their quadcopter.

Mr Paul Arnott, UNE Kirby Smart Farm, for access to paddocks for photography of weeds.

Mr E. Sadgrove is supported by an Australian Postgraduate Award.

References

[1] T. Berge, S. Goldberg, K. Kaspersen, J. Netland, Towards machine vision based site-specific weed management in cereals, Comput. Electron. Agric. 81 (2012) 79–86.

[2] M.S. Uddin, A.Y. Akhi, Horse detection using Haar like features, Int. J. Comput. Theory Eng. 8 (5) (2016) 415–418.

[3] J. Wawerla, S. Marshall, G. Mori, K. Rothley, P. Sabzmeydani, Bearcam: automated wildlife monitoring at the arctic circle, Mach. Vis. Appl. 20 (2009) 303–317, doi:http://dx.doi.org/10.1007/s00138-008-0128-0.

[4] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: unified, real-time object detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015) 779–788.

[5] T. Malisiewicz, A. Gupta, A.A. Efros, Ensemble of Exemplar-SVMs for object detection and beyond, International Conference on Computer Vision (ICCV), IEEE, 2011, pp. 89–96.

[6] P.M. Pandiyan, C. Hema, P. Krishnan, S. Radzi, Colour recognition algorithm using a neural network model in determining the ripeness of a banana, International Conference on Man-Machine Systems (ICoMMS), Batu Ferringhi, Malaysia, 2009 pp. 2B71–2B–74.

[7] J. Xu, H. Zhou, G.-B. Huang, Extreme learning machine based fast object recognition, 15th International Conference on Information Fusion (FUSION), IEEE, Singapore, 2012, pp. 1490–1496.

[8] X. Chen, H. Mulam, An implementation of faster RCNN with study for region sampling, Tech. rep., Cornell University, 2017.

[9] E.J. Sadgrove, G. Falzon, D. Miron, D. Lamb, Fast object detection in pastoral landscapes using a multiple expert colour feature extreme learning machine, The International Tri-Conference for Precision Agriculture (PA17), 1st Asian-Australia Conference on Precision Pastures and Livestock Farming (2017).

[10] E.J. Sadgrove, G. Falzon, D. Miron, D. Lamb, Fast object detection in pastoral landscapes using a colour feature extreme learning machine, Comput. Electron. Agric. 139 (2017) 204–212.

[11] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing 70 (2006) 489–501.

[12] R. Moreno, F. Corona, A. Lendasse, M. Graña, L.S. Galvão, Extreme learning machines for soybean classification in remote sensing hyperspectral images, Neurocomputing 128 (2014) 207–216, doi:http://dx.doi.org/10.1016/j.neucom.2013.03.057. http://www.sciencedirect.com/science/article/pii/S0925231213010102.

[13] S. Malek, Y. Bazi, N. Alajlan, Efficient framework for palm tree detection in UAV images, IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 7 (12) (2014) 4692–4703.

[14] G. Facciolo, N. Limare, E. Meinhardt-Llopis, Integral images for block matching, Image Process. On Line 4 (2014) 344–369.

[15] A. Mohamed, A. Issam, B. Mohamed, B. Abdellatif, Real-time detection of vehicles using the Haar-like features and artificial neuron networks, Procedia Comput. Sci. 73 (2015) 24–31.

[16] M. Benco, R. Hudec, S. Matuska, M. Zachariasova, One-dimensional color-level co-occurrence matrices, ELEKTRO, IEEE, 2012, doi:http://dx.doi.org/10.1109/ELEKTRO.2012.6225600.

[17] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: a new learning scheme of feedforward neural networks, Proceedings. 2004 IEEE International Joint Conference on Neural Networks, vol. 2 (2004) 985–990.

[18] Ö.F. Ertugrul, Y. Kaya, A detailed analysis on extreme learning machine and novel approaches based on ELM, (2015). http://www.openscienceonline.com/journal/ajcse.

[19] G.S.S. da Gomes, T.B. Ludermir, L.M.M.R. Lima, Comparison of new activation functions in neural network for forecasting financial time series, Neural Comput. Appl. 20 (3) (2011) 417–439.

[20] M. Loesdau, S. Chabrier, A. Gabillon, Hue and saturation in the RGB colour space, Conference: 6th International Conference, Vol. 8509 of Lecture Notes in Computer Science, Cherbourg, France, 2014, pp. 203–212.

[21] Studio Encoding Parameters of Digital Television for Standard 4:3 and Wide-screen 16:9 Aspect Ratios: Recommendation ITU-R BT.601-6: (Question ITU-R 1/6), International Telecommunication Union, 2007. URL https://books.google.com.au/books?id=NpAwPwAACAAJ.

[22] S. Bashbaghi, E. Granger, R. Sabourin, G.-A. Bilodeau, Ensembles of exemplar-svms for video face recognition from a single sample per person, 12th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (2015) 1–6.


[23] N. Liu, H. Wang, Ensemble based extreme learning machine, IEEE Signal Process. Lett. 17 (8) (2010) 754–757.

[24] P. Viola, M. Jones, Robust real-time object detection, Int. J. Comput. Vis. 57 (2) (2004) 137–154.

[25] R. Wang, Adaboost for feature selection, classification and its relation with SVM, a review, Phys. Procedia 25 (2012) 800–807.

[26] A.H. Bishak, Z. Ghandriz, T. Taheri, Face recognition using a co-occurrence matrix of local average binary pattern (CMLABP), Cyber J. Multidiscip. J. Sci. Technol. J. Sel. Areas Telecommun. (JSAT) (2012) 15–19.

[27] G.N. Srinivasan, G. Shoba, Statistical texture analysis, World Academy of Science, Engineering and Technology, vol. 36 (2008) 1264–1269.

[28] A. Eleyan, H. Demirel, Co-occurrence matrix and its statistical features as a new approach for face recognition, Turk. J. Elect. Eng. Comput. Sci. 19 (1) (2011) 97–107.

[29] J. Leskela, J. Nikula, M. Salmela, OpenCL embedded profile prototype in mobile device, Signal Processing Systems, IEEE, 2009, doi:http://dx.doi.org/10.1109/SIPS.2009.5336267.

[30] J. Tapson, P. de Chazal, A. van Schaik, Explicit computation of input weights in extreme learning machines, International Conference on Extreme Learning Machines, Vol. 1 of Algorithms and Theories, Springer, Switzerland, 2015, pp. 41–49.

[31] Netlib.org, The LAPACKE C interface to LAPACK, 2013, November. http://www.netlib.org/lapack/lapacke.html.

[32] Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios, Tech. rep., International Telecommunications Union, Electronic Publication, Geneva, 2015.

[33] S. Hartig, Basic image analysis and manipulation in ImageJ Chapter 14, (2013) Unit 14.15.

[34] J.C. van Gemert, C.R. Verschoor, P. Mettes, K. Epema, L.P. Koh, S. Wich, Nature conservation drones for automatic localization and counting of animals, European Conference on Computer Vision Workshop, ECCV Workshop on Computer Vision in Vehicle Technology (2014) 255–270.

[35] JGK Drone, Quad biking drone footage, 2016, November. https://www.youtube.com/watch?v=x5I-y7bWD9E.

[36] D.E. Culler, J.P. Singh, A. Gupta, Workload-driven evaluation, Parallel Computer Architecture: A Hardware/software Approach, 1st ed., (1998), pp. 266.

[37] J. Redmon, A. Farhadi, Yolo9000: better, faster, stronger, (2016, December).

[38] M. Fourment, M. Gillings, A comparison of common programming languages used in bioinformatics, J. BMC Bioinf. 9 (2008) 82.

[39] D. Ghosh, N. Kaabouch, A survey on image mosaicing techniques, J. Vis. Commun. Image Represent. 34 (2016) 1–11.

[40] B.D. Lucas, T. Kanade, An iterative image registration technique with an application to stereo vision (IJCAI), Proceedings of the 7th International Joint Conference on Artificial Intelligence, vol. 81 (1981).

[41] Y. Wu, J. Lim, M.-H. Yang, Object tracking benchmark, IEEE Trans. Pattern Anal. Mach. Intell. 37 (9) (2015) 1834–1848, doi:http://dx.doi.org/10.1109/TPAMI.2014.2388226.

[42] G. Zhou, C. Li, P. Cheng, Unmanned aerial vehicle (UAV) real-time video registration for forest fire monitoring, Vol. 3 of Conference: Geoscience and Remote Sensing Symposium, IEEE International, Seoul, South Korea, 2005.