您好,欢迎来到意榕旅游网。
搜索
您的当前位置:首页Fast

Fast

来源:意榕旅游网
Fastnearestneighbourtestingalgorithmforsmallfeaturesizes

O.T.Yıdız,L.AkarunandH.L.Akın

Afastnearestneighbouralgorithmwithalogarithmicexpectedtestingtimeforsmallfeaturesizesispresented.IthasbeentestedonaRobotVisionapplicationforYUVspace.Thealgorithmtimerequirementisbetterthanamultilayerperceptrontrainedforthesamepurpose.

Thealgorithmcanbeeasilyextendedtok-nearestneighbourwithoutaffectingthetimecomplexity.Thedifferenceoccursinthelaststep.Wesearchtwoclosestinstancesateacharrayin1-nearestneighbour,whereaswesearchtheclosest2kinstancesink-nearestneighbour.Asanexample,assumethatwewanttofind1-nearestneighbourof(x1¼2,x2¼3,x3¼4)fortheexampledatasetinFig.1.ThetwoclosestinstancestothatinstanceareshowninFig.2.Ifwesearchtheseteninstances,wefindthat(x1¼2,x2¼2,x3¼4)isthenearestneighbour.

Introduction:1(k)nearestneighbouristhesimplestandmostextensivelystudiednonparametriclearningalgorithminmachinelearning.Althoughithasaverygoodperformancecomparedtoitsparametricopponentsandhasnolearningphaseduetoitsnonpara-metricnature,ithasaseriousdrawback:lineartestingtime.Tofindouttheclassofatestcase,itmustbecomparedwithN(numberofinstancesinthetrainingset)elementsandtheelement(kelements)withtheminimumdistancetothetestcasemustbefound.Tocompareoneinstancewithanother,severalfunctionsexist,suchasEuclidiandistance(L2norm),Mahalonobisdistance,etc.Eachofthesefunc-tionscanbethoughtasamappingfromdfeaturesoftheinstancetoasinglenumber(distance).So,theexpectedtestingtimeofthenearestneighbouralgorithmisO(dN).Severalsolutionshavebeenproposedforreducingthislineartestingtime.SomeofthemareK-dtrees[1],geometrichashing[2]andR-trees[3].InthisLetter,weproposeanovelalgorithmbasedonbinarysearchtoefficientlycalculate1(k)nearestneighbourinO(d!logN)time.Ifdissmall,i.e.lessthanorequalto5,andNissignificantlylargerthand,thealgorithmbehaveslikeanO(logN)algorithm.Wealsoshowthatthiscorrespondstoestimatingthereducedorderingofvectorsfromdifferentconditionalorderings[4].

Algorithm:Thenovelalgorithmmainlydependsonbinarysearchonsortedinstances.Asapreprocessingstep,d!sortedarraysareprepared.Ifaninstancehasdfeatures,thenthereared!possiblepermutationsofthefeatures.Foreachpossiblepermutation,allinstancesaresortedinalphabeticorderaccordingtothatpermutation.Thisisdefinedintheliteratureastheconditionalorderingofasetofvectors[4].The1-nearestneighbourwhichwearelookingforisthenearestvectorinthereducedorderingofthevectors.Wewillshowthatthiscanbeestimatedfromdifferentconditionalorderingsthatcorrespondtopermutationsofvectorcomponents.Foreachpermuta-tion,NelementscanbesortedinO(NlogN)timeusingthequicksortalgorithm.Soford!possiblepermutations,thepreprocessingsteprequiresO(d!NlogN)timeandO(d!N)memoryspace.Fig.1showsadatasetwithd¼3features,N¼10instancessortedin6¼3!differentways.Thepreprocessingstepofthealgorithmcanbethoughtasthelearningphaseofthenearestneighbouralgorithm.

Fig.1Datasetafterpreprocessingstep

Inthetesting,eachtestinstancextissearchedind!sortedarraysusingbinarysearch.Ifanexactmatchisfound,thereisnoneedtosearchotherarraysfor1-nearestneighbour.Otherwise,wesearchthetwoclosestinstancesateachofthed!arrays.Thesetwoclosestinstancesareleftandrightborderinstancesinthebinarysearch.BecausesearchinganelementinasortedarraywithNelementsusingbinarysearchtakesO(logN)time,thealgorithmhasatimecomplexityofO(d!logN).Afterfindingthetwoclosestinstancesateacharray,wesearchthese2d!instancesfor1-nearestneighbour.ComparedtothebinarysearchthisstepofthealgorithmtakesonlyO(d!)time,whichissmallerthansearchingarrays.

theproblem.First,becauseofthelightingconditions,usuallyalargenumberofthepixelsofthetestframesareexactlythesameasthepixelsinthetrainingframes.Assumingthatthelabellercorrectlylabelsthepixels,thenearestneighbouralgorithmwillhave100%performanceonthesepixels,whereasMLPwillnothavethesameperformance.Secondly,sincethecolourspaceissmoothlydistributed(smallchangesinY,U,Vvalueswillresultinnochangeincolourclassification),nearestneighbourwillbehavebetterthanMLP.

MLPhasfourinputunits,Y,U,V,andabiasunit,tenoutputunitsorcoloursandinourtrainingsevenhiddenunits.TotestaninstancewithMLP(seetherelatedcolumninTable1):

3Â7multiplicationsforthefirstlayer,and7Â10multiplicationsforthesecondlayer,total91,floatingpointmultiplicationsmustbecarriedout.Tosumupthemultiplicationstoobtainhiddenvaluesandoutputvalues91floatingpointadditionoperationsmustbeperformed

Therearealsosevensigmoidoperations,whicharemuchmorecostlythanmultiplicationbecauseoftheexponentiationoperation.Therearealsosevenadditionandsevendivisionoperationsinthesigmoidfunction

Tofindthebestclass,ninecomparisonsmustbeperformedontheoutputunits

Thenovelnearestneighbourhassixsortedarrayswith200000instancesineach.Totestaninstancewiththenovelnearestneighbour(seetherelatedcolumninTable1):

Searchinganinstanceinonearrayrequireslog(200000)¼17compar-isonoperations.Sixarraysrequire102comparisonoperations

Tofindnearestneighbourin12instances,36integermultiplications,36integeradditions,36integersubtractions,and11comparisonsmustbeperformed

OnepossibleextensionofthealgorithmmaybetoconvertY,U,Vsortedvaluesintoalongintegerandperformingalloperationsonthese

values.Then,thesecondstepinnearestneighbourwillreduceto11comparisonoperations.

Conclusion:Wehavedevelopedanovelalgorithmthatestimatesthenearestneighbourfromdifferentconditionalorderingsofvectorcomponents.Thealgorithmiscomputationallyadvantageousespe-ciallywhenthenumberoftrainingsamplesislarge.Wehaveappliedthealgorithmtoacolourquantisationprobleminarobotapplication.Wearecurrentlyworkingonthestatisticalanalysisofthealgorithm.#IEE2004

ElectronicsLettersonlineno:20040144doi:10.1049/el:20040144

24October2003

O.T.Yıldız,L.AkarunandH.L.Akın(.DepartmentOfComputerEngineering,Bog¸iUniversity,Bebek,Istanbul)˘azicE-mail:yildizol@cmpe.boun.edu.trReferences

12345

Bentley,J.L.,andWeide,B.W.:‘Optimalexpected-timealgorithmsfor

closestpointproblems’,ACMTrans.Math.Soft.,1980,6,(4),pp.563–580

Califano,A.,andMohan,R.:‘Multidimensionalindexingforrecognizingvisualshapes’.Proc.IEEEConf.ComputerVisionandPatternRecognition,Maui,HI,USA,1991,pp.28–34

Guttman,A.:‘R-trees:adynamicindexstructureforspatialsearching’.Proc.ACMSIGMOD,Boston,MA,USA,1984,pp.47–57

Hardie,R.C.,andArce,G.R.:‘RankinginRpanditsuseinmultivariateimageestimation’,IEEETrans.CircuitsSyst.VideoTechnol.,1991,1,(2),pp.197–209

Akin,H.L.,Pavlova,P.,andYildiz,O.T.:‘‘‘Cerberus2002’’Robocup2002:RobotSoccerWorldCupVI’.2002Int.RobocupSymp.Pre-proceedings,Fukuoka,Japan,2002,p.448

ELECTRONICSLETTERS5thFebruary2004Vol.40No.3

因篇幅问题不能全部显示,请点此查看更多更全内容

Copyright © 2019- yrrf.cn 版权所有

违法及侵权请联系:TEL:199 1889 7713 E-MAIL:2724546146@qq.com

本站由北京市万商天勤律师事务所王兴未律师提供法律服务