Justin King∗
Systems & Tech. Group, IBM Rochester
jkking@us.ibm.com

Kiran Lakkaraju
Dept. of Computer Science, UIUC
klakkara@illinois.edu

Adam Slagell
NCSA, UIUC
slagell@illinois.edu
ABSTRACT
In recent years, it has become important for researchers, security incident responders and educators to share network logs, and many log anonymization tools and techniques have been put forth to sanitize this sensitive data source in order to enable more collaboration. Unfortunately, many new attacks have been created, in parallel, that try to exploit weaknesses in the anonymization process. In this paper, we present a taxonomy that relates similar kinds of attacks in a meaningful way. We also present a new adversarial model which we can map into the taxonomy by the types of attacks that can be perpetrated by a particular adversary. This has helped us to negotiate the trade-offs between data utility and trust, by giving us a way to specify the strength of an anonymization scheme as a measure of the types of adversaries it protects against.
Categories and Subject Descriptors

C.2.3 [Network Operations]: Network Monitoring; K.4.1 [Public Policy Issues]: Privacy; K.4.3 [Organizational Impacts]: Computer-supported collaborative work; K.6.m [Miscellaneous]: Security

General Terms

Security, Verification

Keywords

Adversarial Model, Anonymization, Network Logs, Taxonomy
1. INTRODUCTION

As many have argued, sharing of network traces, flow data and other logs is vitally important to research, industry and pedagogy [22, 16, 14]. The network and security research communities rely on large, diverse and non-synthetic [12] datasets for empirical studies. Incident responders need to share real logs to collaborate on investigations of attacks that cross organizational boundaries. Educators and those creating educational materials need example data for student projects and exercises.

∗This work was performed when the author was at UIUC.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC'09, March 8–12, 2009, Honolulu, Hawaii, U.S.A. Copyright 2009 ACM 978-1-60558-166-8/09/03 ... $5.00.
However, it has frequently been pointed out that these sorts of data are often very sensitive [22, 18, 23, 17]. There are security concerns about what information may be revealed regarding the data collector's network and systems, and there are legal questions about betraying the trust of the users whose private actions are being recorded [19]. Anonymization techniques have been developed to alleviate this conflict created by the need for security and trust on the one hand and high utility datasets on the other [14, 23, 25, 26, 13]. In fact, several tools have been created so that data owners can sanitize network data [20, 15, 9, 21]. While very useful, these tools alone do not solve the entire problem. One must know how to create anonymization policies that provide the necessary level of assurance, especially in light of many recently developed attacks against anonymization [8, 4, 5, 18]. These tools alone do nothing if one does not know how to use them intelligently.
As a corollary to this trade-off between information loss and trust, there are always at least two actors with opposing interests in the creation of an anonymization policy. The data owner and the people whose behavior the data describes want policies to protect their privacy and security interests. Even if the data does not directly reveal sensitive information about the data owner's network or services, Internet service providers and such do not want to violate the consumer's trust or their own privacy policies. On the other hand, the person(s) doing analysis of the data, whether it is a researcher or someone investigating a specific intrusion, needs data as close to the original as possible. Alterations can affect analysis, and they want these minimized. Therefore one party wants more anonymization of the data and the other wants less.
Balancing these two different needs requires the ability to specify constraints on which fields must and must not be anonymized, and with what algorithms such fields may be anonymized. In other work [6], we have developed a first order predicate logic in which these constraints may be specified to generate anonymization policies or test them for compliance. However, one must still come up with mechanisms to create intelligent constraints. In [10], we investigated the needs of the party performing the analysis so that they could specify minimal constraints on what cannot be anonymized, trying to measure the utility of anonymized data. This work, instead, focuses on how one can specify minimal constraints on what must be anonymized to meet the minimum assurance level of the data owner.

A natural way to express one's security or trust requirements is in terms of the types of attacks or the type of adversary they must withstand. With this in mind, we have created a taxonomy of the attacks against network log anonymization based upon common attack preconditions. We have created a more thorough and mutually exclusive taxonomy than previous attempts which not only reflects all currently known attacks, but we believe it will be able to incorporate future attacks. Furthermore, since it is based upon attack preconditions, namely what information is necessary for the adversary to perform the attack, it readily translates into constraints for our predicate logic. Therefore, constraints can be easily derived to prevent a single attack, a class of attacks, or arbitrary collections of attacks through simple logical conjunctions of these statements. The other natural way to express a level of assurance is to specify the type of adversary that must be protected against. Therefore, our second contribution in this paper is the development of an adversarial model based upon adversarial means and capabilities. Lastly, we tie together the adversarial model and attack taxonomy so one can translate between constraints based upon either adversaries or the classes of attacks which they can perpetrate.

This paper is organized as follows: Section 2 presents our taxonomy of attacks. Section 3 introduces the basics of our adversarial model, including the notation we have developed to express the model. Section 4 establishes our mapping between the taxonomy and adversarial model. Section 5 discusses related work, and Section 6 presents future work and our conclusions.
2. ATTACK TAXONOMY

Classifying attacks against log anonymization is an early step towards a comprehensive study of the security of anonymization policies. If network owners can select classes of attacks that they wish to prevent, they can then ensure that their anonymization policies meet their security constraints, while allowing as much non-private information as possible to be revealed, thus increasing a dataset's utility.
2.1 Motivations and Requirements

As described previously, we wish to provide network owners with a taxonomy of attacks, the classes of which they can select to prevent, rather than having to focus on individual attacks. We also wish to formally express relationships between attacks, allowing for expression of attack groupings in a logic about anonymization. This taxonomy must be complete (every known attack can be placed in at least one class) and mutually exclusive (no attack can be a member of more than one class). The classes must be fine-grained enough for network owners to select specific classes without seriously impacting the utility of a log. Finally, the classes must be tied together in a more concrete way than a description in natural language.
2.2 Methodology

Nineteen attacks from multiple sources [23, 15, 11, 2, 14, 4, 22, 7, 5, 24] were collected as representative of current attacks against log anonymization. These attacks varied significantly in both sophistication and mechanisms used. The attacks included all the attacks specifically mentioned both by Pang and Paxson [15] and by Slagell et al. [22], and so the attack sample is a strict superset of those considered for those previous taxonomies discussed in Section 5.

Each attack was analyzed for its pre-conditions. We chose to organize around pre-conditions because they capture the initial knowledge of the adversary and properties of the anonymized log. Every anonymization attack requires some initial knowledge and some specific log properties. By organizing around pre-conditions, we can identify the knowledge and log properties that are crucial to the execution of an anti-anonymization attack. Section 3 will demonstrate how we can take this initial knowledge and log properties into account when modeling an adversary and when determining whether a given adversary can carry out a particular attack. A graph was constructed, with nodes for each attack under consideration, as well as for pre-conditions for each attack. In cases where two or more nodes shared a common pre-condition, that condition was represented with a single node. Edges were added to the graph connecting pre-conditions to the attacks they enabled. Attempts were then made to satisfactorily group attacks with common pre-conditions.
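To make the grouping step concrete, the following Python sketch builds such a pre-condition graph and extracts its connected components. The attack names and pre-condition sets below are illustrative placeholders rather than the paper's full data from Figure 1, and the helper names are our own.

from collections import defaultdict

# Hypothetical, abbreviated pre-condition data; the paper's full graph
# (Figure 1) covers nineteen attacks.
preconditions = {
    "#1 web browsing history": {"unanonymized ports", "consistent IP pseudonyms"},
    "#2 host port behavior":   {"unanonymized ports", "consistent IP pseudonyms"},
    "#12 hash dictionary":     {"hash function anonymizes data"},
    "#13 chosen plaintext":    {"crypto function susceptible to chosen plaintext"},
}

# Drop pre-conditions shared by too many attacks (e.g. consistent IP
# pseudonyms), since they do not help distinguish classes.
IGNORED = {"consistent IP pseudonyms"}

# Link attacks that share a non-ignored pre-condition.
by_precondition = defaultdict(set)
for attack, pres in preconditions.items():
    for p in pres - IGNORED:
        by_precondition[p].add(attack)

adjacency = defaultdict(set)
for attacks in by_precondition.values():
    for a in attacks:
        adjacency[a] |= attacks - {a}

# Connected components of this graph are candidate taxonomy classes.
def components(nodes, adjacency):
    seen, groups = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, group = [n], set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            group.add(cur)
            stack.extend(adjacency[cur] - seen)
        groups.append(group)
    return groups

print(components(preconditions.keys(), adjacency))

With the toy data above, attacks #1 and #2 fall into one component (shared unanonymized ports) while #12 and #13 each form their own, mirroring how the full graph yields the classes described next.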
2.3 Results

The full graph using pre-conditions is illustrated in Figure 1. We note that a large number of the attacks (9 of 19) require some form of consistent IP pseudonyms in order to identify individual machines in the log, and an additional attack specifically requires prefix-preserving pseudonyms [23]. Because this pre-condition does not help us distinguish many attacks, we ignore it when generating our taxonomy. Given this, we now have six unconnected subgraphs that can be treated as classes in our taxonomy. Two of these subgraphs each involve a single attack/pre-condition pair, and as both of these attacks essentially involve potential cryptographic weaknesses in the anonymization functions themselves, we group them together into a single class.

Remarkably, this analysis reveals a high-level taxonomy virtually identical to those given by Slagell et al. [22]. Several of these classes share no common pre-conditions, and the only pre-condition shared between classes is a consistent IP mapping (such as a prefix-preserving mapping [23]) that enables identification of individual machines in the log. The major classes in our taxonomy, separated in Figure 1 with black, dashed lines, are as follows:
• Fingerprinting: The process of matching attributes of an anonymized object against attributes of a known object to discover a mapping between anonymized and unanonymized objects.
• Structure Recognition: The act of recognizing structure between objects to use one mapping to discover multiple mappings between anonymized and unanonymized objects.
• Known Mapping: Exploiting the discovery of a mapping between unanonymized and anonymized data in one reference, to undo anonymization of that same data in multiple places.
• Data Injection: Injecting information to be logged with the purpose of later recognizing that data in an anonymized form.
• Cryptographic: Attacking the cryptographic primitives that form the basis for anonymization algorithms.
[Figure 1: Grouping of Anonymization Attacks by Pre-conditions. The graph connects each of the nineteen attacks (#1–#19) to its mandatory and helpful pre-conditions, such as consistent IP pseudonyms, unanonymized port numbers, cleartext timestamps, cleartext header information, and access to multiple logs.]

Ten of the nineteen attacks are classified as Fingerprinting attacks. Our original criticism of the Slagell et al. classification was that it was not fine-grained enough to allow for policy creation that provided a high level of both security and utility. Thus, we have attempted to create subclasses of the Fingerprinting attack class. We first analyzed common prerequisites between attacks. However, no simple approach, as taken to create the major classes, could be found. Instead, we decided to create subclasses using a case-by-case approach:

• Attacks #1, #2, #8, and #9 all relied on unanonymized port numbers (or on being able to extract the port numbers from statistical analysis [4]). We chose to call these attacks Port-based Fingerprinting: fingerprinting by network ports or well-known services.
• Attacks #15 and #16, while not sharing any common pre-conditions, were based on a known machine/user behavior, and so we title these Behavioral Fingerprinting: fingerprinting based on observing an entity's known behavior.
• Attacks #18 and #19 were simple to group together, as both rely on cleartext timestamps. We also decided to group OS Fingerprinting (#17) in this subclass, though we will discuss it further below. We call this Machine Attribute Fingerprinting: fingerprinting based on (relatively) immutable attributes of a machine, such as hardware timer drift or operating system characteristics.
• Alone in its subclass is attack #14, File Transfer Fingerprinting, which is the only fingerprinting attack in our sample which attempts to fingerprint something other than a machine. For this reason, we created a separate subgroup for it, which we believe can be expanded with other examples. We titled this class Network Traffic Fingerprinting: fingerprinting based on recognizing patterns inherent to network traffic such as known file transfer sizes.

Figure 2 illustrates the complete taxonomy.

OS Fingerprinting presents a particular problem for our classification. We feel that OS Fingerprinting is viewed best not as a single attack, but as a group of attacks based on a post-condition. If we are structuring our taxonomy upon pre-conditions, though, it is more sensible to split OS Fingerprinting attacks among the other four subclasses; that is to say, there may be Port-based, Behavioral, Machine Attribute-based, and Network Traffic-based ways to perform OS fingerprinting. Thus, to stop all forms of OS fingerprinting, we need to consider each subclass in turn. We leave these considerations as future work.

2.4 Analysis of Post-conditions

We attempted to analyze attack post-conditions as well. However, a majority of the attacks we examined (10 of 19) were specifically aimed at identifying a machine or deanonymizing an IP address. The remaining nine attacks each have a unique attribute that they deanonymize. As most of these single attack/post-condition pairs seem to be unrelated, and the remaining attacks all focused on the goal of machine identification, we determined that post-condition analysis was not useful for our purposes. A more thorough discussion of our post-condition analysis is available in [6].

3. ADVERSARIAL MODEL

Our model consists of a number of elements, representing logs, log entities, log entities' properties and data, the adversary's knowledge set, patterns, and a collection of four adversarial means. We will explain each of these elements, and Section 3.7 will give a thorough example of how these elements can be combined to describe an adversary's capabilities.

3.1 Entities
[Figure 2: Taxonomy of Attacks Against Anonymization. The root node, Attack, splits into Fingerprinting, Structure Recognition, Known Mapping, Data Injection, and Cryptographic attacks; Fingerprinting is further divided into Port-based, Behavioral, Machine Attribute, and Network Traffic Fingerprinting.]
An entity in our model is any item that an adversary may wish to discover information about. Example entities could be machines, users, or files sent over a network. An entity is essentially synonymous with an object in [3], though we may wish to discover information about entities that is not in a log, even when unanonymized. For example, we may wish to discover a machine's operating system, even though there is no such field in most log formats.

Entities have attributes, some of which may be known to an adversary and some of which may not be known. Entity attributes may be anonymized or unanonymized, and the adversary may wish to discover either form. We will develop notation for anonymized and unanonymized attributes in Section 3.4.

An entity in our model will be denoted as e, with an optional subscript that serves as a way to differentiate entities. For example, a "target" entity may be denoted as et. Attributes will be denoted as the entity's name, followed by a period and the name of the attribute. For example, the target entity's unanonymized IP address may be denoted as et.ipaddr.
3.2 Adversary Knowledge Set

The adversary's knowledge set is a collection of entities and their known attributes. The goal of the adversary is to uncover an entity's attributes and to add them to its knowledge set. Data must be in the adversary's knowledge set before it can be used to uncover more information. We consider adding information to the adversary's knowledge set, but we do not believe there is any reason to consider information loss from this set. We denote the adversary's knowledge set using the symbol K, and we denote addition of information to the adversary's knowledge set using the '⇒' symbol. For example, e.ipaddr ⇒ K means that the entity's unanonymized IP address has been added to the adversary's knowledge set.
The Knows() Assertion

The Knows() assertion lets us state that an adversary knows a specific piece of information. When used to construct attack specifications, Knows() will cause an attack to fail if it is not true, and thus, it is used to specify an attack's prerequisite knowledge. As an example, the statement Knows(et.ipaddr) asserts that et.ipaddr ∈ K, that is, that the adversary knows the target entity's unanonymized IP address.
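As an illustration, a minimal Python sketch of K, the '⇒' addition, and the Knows() assertion might look as follows; the tuple encoding of attributes and the class and method names are assumptions of ours, not notation from the model.

# A minimal sketch of the adversary's knowledge set K.  Facts are keyed by
# (entity, attribute name, anonymization algorithm or None).
class KnowledgeSet:
    def __init__(self):
        self._facts = set()

    def add(self, entity, attr, alg=None):
        """The '=>' operation: add an attribute to K (information is never lost)."""
        self._facts.add((entity, attr, alg))

    def knows(self, entity, attr, alg=None):
        """The Knows() assertion: an attack step fails if this returns False."""
        return (entity, attr, alg) in self._facts

K = KnowledgeSet()
K.add("e_t", "ipaddr")                    # e_t.ipaddr => K
assert K.knows("e_t", "ipaddr")           # Knows(e_t.ipaddr) holds
assert not K.knows("e_t", "ipaddr", "CP") # the CP pseudonym is not yet known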
3.3 Logs

Abbreviation   Meaning
CP             Consistent Pseudonym
PP             Prefix-Preserving IP [23]
BM             Black Marker [20]
RP             Random Permutation [20]

Table 1: Abbreviations for anonymization algorithms and classes of algorithms

Logs are the source of information about entities. When discussing a particular log, we assume that the adversary has some form of that log (either anonymized or unanonymized) in his possession. We will denote logs using the symbol L. We may occasionally place a subscript after this for the few attacks in which multiple logs are involved.
3.4 Anonymized/Unanonymized Data

We will denote unanonymized data using an underline notation. For example, an unanonymized IP address may be denoted as ipaddr.

We will denote anonymized data using an overline notation, followed by a backslash, and then an abbreviation to note what algorithm (or class of algorithms) is used to anonymize. For example, an IP address anonymized using a consistent pseudonym algorithm (such as prefix-preserving IP anonymization [23] or random permutation [20]) may be denoted as ipaddr\CP. A list of abbreviations for use in this model can be found in Table 1.
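To give a feel for two of the algorithm classes in Table 1, here is a toy Python sketch of a consistent pseudonym (CP) and a Black Marker (BM) anonymizer. These stand-ins are our own illustrations, not the schemes of [20, 23]; the key and function names are assumed.

import hmac, hashlib

SECRET = b"per-dataset key"   # assumed secret held by the data owner

def cp(ipaddr):
    # Consistent pseudonym: the same input always yields the same pseudonym,
    # so linkability across records is preserved.
    return "ip-" + hmac.new(SECRET, ipaddr.encode(), hashlib.sha256).hexdigest()[:8]

def bm(value):
    # Black Marker: simply blank the field.
    return "xxx"

print(cp("192.0.2.1") == cp("192.0.2.1"))   # True: consistent pseudonym
print(bm("192.0.2.1"))                      # "xxx"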
3.5 Patterns

Patterns are a way to match information in the adversary's knowledge set. Because of the large variety of computations matching information can involve, we choose not to formalize the construction of a pattern beyond specifying its inputs and outputs. However, Coull's adversarial model [3] may make a good model for a pattern in many cases of fingerprinting or structure recognition attacks. While our model makes use of very unspecified patterns, we view this as being a powerful construct that can be broadly applied. Patterns take a set of input attributes, which may be either anonymized or unanonymized. The key to these inputs is that they are pre-conditions, in that the pattern cannot be matched against if these inputs are not available. For example, a data injection pattern using unanonymized, obscure TCP options cannot be recognized if those options are anonymized (especially by a Black Marker algorithm). Likewise, a pattern relying on packet sizes cannot be applied to a NetFlow log, as it does not report individual packets, but rather, the size of the entire flow between hosts. Thus, if our goal is to stop the application of a pattern, we simply need to prevent the adversary from gaining knowledge of the pattern's pre-conditions.
Patterns are denoted using the symbol P. We have formalized the pre-conditions by listing them in angle brackets after the name of the pattern. For example, if pattern Pex has e.ipaddr\CP and e.tcpoptions as prerequisites, then the full notation will be Pex⟨e.ipaddr\CP, e.tcpoptions⟩.
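A minimal Python sketch of a pattern with explicit pre-condition inputs follows. The pattern body is deliberately left as an arbitrary predicate, mirroring the model's unspecified matching computation; the class, field, and example names are our own assumptions.

class Pattern:
    def __init__(self, name, required_inputs, predicate):
        self.name = name
        self.required_inputs = set(required_inputs)  # the pattern's pre-conditions
        self.predicate = predicate                   # arbitrary matching logic

    def applicable(self, available_attrs):
        # The pattern cannot be matched if any pre-condition input is missing,
        # e.g. because the anonymization policy removed or blacked out that field.
        return self.required_inputs <= set(available_attrs)

    def matches(self, attrs):
        return self.applicable(attrs) and self.predicate(attrs)

# P_inj<tcpoptions>: recognize a previously injected, unusual TCP option string.
P_inj = Pattern("P_inj", {"tcpoptions"},
                lambda a: a["tcpoptions"] == "obscure-injected-options")

print(P_inj.matches({"tcpoptions": "obscure-injected-options"}))  # True
print(P_inj.applicable({"flowsize": 1500}))   # False: required input not present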
3.6 Adversarial Means

Below we consider four "means", or capabilities, of an adversary in our model. Each of these means is in relation to an attribute in a record (usually an attribute belonging to an entity of interest). In combination, these means (along with the Knows() assertion) can describe any attack. The '→' symbol represents that a means produces a given value. For example, f(...) → g says that the means f, when successfully used, produces information g.

3.6.1 The Observe() Means

Form: Observe(L, e.attr) → e.attr

The Observe() means represents knowledge that an adversary can directly learn from a log. This information may be either anonymized or unanonymized, as it is presented in the log in question. Observe() cannot directly un-anonymize data, nor can it infer data not found in the log.
As an example, the act of observing an entity's IP address from a log that has not been anonymized can be represented as Observe(et.ipaddr) → et.ipaddr ⇒ K (we note that some authors have considered this a simple attack [11]). A second example, involving a port number anonymized using the Black Marker algorithm, is very similar:

Observe(et.portnum\BM) → et.portnum\BM ⇒ K.
3.6.2 The Compromise() Means

Form: Compromise(e.attr\alg) → e.attr

The Compromise() means directly reveals the unanonymized version of an anonymized attribute, which can then be added to the adversary's knowledge set.
An example use of Compromise() is an attack on a cryptographically weak anonymization algorithm, such as a dictionary attack against a hash of IPv4 addresses. There are only 2^32 possible addresses (which can be even further constrained due to other knowledge available to the adversary), and constructing a dictionary of these addresses would take under an hour at the relatively modest rate of 1.2 million address hashes per second¹. This attack can be represented as Compromise(et.ipaddr\MD5) → et.ipaddr ⇒ K.
We do not anticipate that most adversaries will have a Compromise() ability for most or any attributes; it primarily will serve to identify adversaries who can take advantage of weak anonymization algorithms. The attacks performed using the Compromise() means will generally be those we have classified as Cryptographic.
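A toy Python sketch of the dictionary attack behind this Compromise() example follows. For illustration it only enumerates a /24 network; the same loop over all 2^32 addresses is what makes the full attack practical. The network and address values are hypothetical.

import hashlib
import ipaddress

def build_dictionary(network="192.0.2.0/24"):
    # Map each candidate address's unsalted MD5 hash back to the address.
    table = {}
    for ip in ipaddress.ip_network(network):
        digest = hashlib.md5(str(ip).encode()).hexdigest()
        table[digest] = str(ip)   # anonymized value -> unanonymized value
    return table

table = build_dictionary()
anonymized = hashlib.md5(b"192.0.2.77").hexdigest()  # et.ipaddr\MD5 as observed in the log
print(table.get(anonymized))                         # recovers "192.0.2.77"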
¹We ran an OpenSSL benchmark and found that it could do over 1.2 million MD5 hashes per second on a single core of a 3.00 GHz Xeon.

3.6.3 The Inject() Means

Form: Inject(L, P⟨attr1, attr2, attr3, ...⟩) → ∅
The Inject() means injects a pattern (involving attributes attr1, attr2, attr3, ...) into a log for later recognition by Match(). It must be performed before any Observe(), Compromise(), or Match() can be done to an individual log (as an injection must happen while the log is being collected, as opposed to the post-collection analysis performed by the other three means). The attacks constructed using an adversary's Inject() means directly map to the class of attacks called "Data Injection" attacks in Section 2.

As an example, consider injecting with a pattern based upon a particular TCP option. If we can construct a pattern Pinj⟨tcpopt⟩ using this TCP option, we can denote its injection to log Lt by Inject(Lt, Pinj⟨tcpopt⟩).
3.6.4 The Match() Means

Form: Match(e, P⟨attr1, attr2, attr3, ...⟩) → e.attr

The Match() means uses a constructed pattern P to reveal an unknown attribute of entity e. It relies on the parameter attributes attr1, attr2, attr3, ..., which may be anonymized or unanonymized, in finding a match and revealing the new attribute (which may also be anonymized or unanonymized). The Match() means is the primary means by which we expect adversaries to launch attacks.

An example use of Match() can be seen using a continuation of the TCP option pattern injected in our Inject() means example. If we have an entity es which we are observing (to see if it is the same as our target entity et that injected the pattern), we can use the Match() means as follows: Match(es, Pinj⟨tcpopt⟩) → et.ipaddr\CP ⇒ K. If es's TCP options match the pattern Pinj, then we infer that es ≡ et, and we then know et's anonymized IP address.
3.7 Constructing an Example Attack

From the Knows() assertion and the four means in Section 3.6 (as well as a set of patterns), we may construct formal descriptions of various attacks against anonymization. Figure 3 demonstrates a data injection attack using obscure TCP options, which we will explore more thoroughly below. Construction of other attacks is quite similar, and further examples are available in the appendix, as well as in the first author's thesis [6].

The attack given in Figure 3 consists of a single Knows() assertion and four means. First, we assert that the adversary must know its own unanonymized IP address (that is, the IP address of the entity in the log that injects the data). Second, the adversary must have the means to Inject() a pattern Pinj using TCP options into the log. Next, after obtaining the log, the adversary loops through all the anonymized entities. For each entity, the IP address (anonymized by an algorithm that produces consistent pseudonyms) and the TCP options used by the entity are observed (Observe()). Finally, we match the entity against the injected pattern. If the pattern matches (Match()) the entity we are examining, we know that the anonymized entity is the same as the adversary's entity, and thus we know the adversary's anonymized IP address.
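As a concrete illustration of the steps just described (formalized in Figure 3 below), here is a minimal Python sketch. The log layout, field names, and the stand-in pseudonym function are assumptions for illustration only.

import hashlib

def cp(ip):
    # Stand-in for a consistent-pseudonym IP anonymizer.
    return "pseudo-" + hashlib.sha256(ip.encode()).hexdigest()[:8]

INJECTED_OPTS = "obscure-injected-options"    # the injected pattern P_inj<eself.tcpoptions>
eself_ipaddr = "203.0.113.9"                  # Knows(eself.ipaddr)

# Anonymized log as released: IPs replaced by consistent pseudonyms,
# TCP options left in the clear.  The second record is the adversary's own
# injected traffic, i.e. the result of Inject(L, P_inj).
log = [
    {"ipaddr_cp": cp("198.51.100.4"), "tcpoptions": "mss=1460"},
    {"ipaddr_cp": cp(eself_ipaddr),   "tcpoptions": INJECTED_OPTS},
]

# FOREACH e in L: Observe() both fields, then Match() against the pattern.
recovered = None
for e in log:
    if e["tcpoptions"] == INJECTED_OPTS:      # Match(e, P_inj) succeeds
        recovered = e["ipaddr_cp"]            # eself.ipaddr\CP => K

print(recovered == cp(eself_ipaddr))          # True: pseudonym of the injector recovered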
Knows(eself.ipaddr)
Inject(L, Pinj⟨eself.tcpoptions⟩)
FOREACH e ∈ L
    Observe(L, e.ipaddr\CP) ⇒ K
    Observe(L, e.tcpoptions) ⇒ K
    Match(e, Pinj⟨eself.tcpoptions⟩) → eself.ipaddr\CP ⇒ K
ENDFOR

Figure 3: Data Injection Attack Construction: Recover Adversary's Anonymized IP Address

4. MAPPING ADVERSARIES & ATTACKS

Our taxonomy and adversarial model are useful in and of themselves. However, their interrelationship is the key feature that lets network dataset owners develop a safe anonymization policy. This policy can be derived by hand from our model, though our other work [6] has shown how policies safe against a given set of attacks can be derived automatically. Regardless of the method of policy derivation, we seek to show the relationship between our taxonomy and model as a way of selecting the threats to anonymization that concern us.
4.1 Mapping adversaries to attacks

An adversary can perpetrate any attack for which it has the prerequisite knowledge and means. Formally, consider an adversary χ with a set of means and knowledge Mχ. An adversary can perpetrate any attack that can be constructed from the elements of Mχ. In other words, an attack α that requires means and knowledge µα can be perpetrated by χ if and only if µα ⊆ Mχ.

Since any attack that can be perpetrated by an adversary has a means and knowledge set that is a subset of the adversary's means and knowledge set (that is, µα ⊆ Mχ), we can explore all possible combinations of the means and knowledge of the adversary to find all attacks possible, given an adversary. Specifically, since the power set P(Mχ) contains all subsets of Mχ, we know that every attack that χ can perpetrate is a member of P(Mχ). In the case that |Mχ| is finite, we can even enumerate all the elements of P(Mχ), and thus all of χ's possible attacks.
However, note that many (if not most) of the elements of P(Mχ) are not actually attacks. For example, an attack that does not have at least one Observe() means does not read any information from a log, and therefore is not an attack on log anonymization. Second, only Compromise() and Match() actually reveal new information, and thus an attack must contain at least one Compromise() or Match() means. Finally, note that patterns, Compromise(), and Match() all rely on specific information that has been read from a log, and thus, to construct an attack using any of these, we must also include their Observe() counterparts. This limits the set of potential attacks that we can construct. Because of this, if we call our mapping of adversaries to attacks γ, then we might say γ(χ) ⊂ P(Mχ).
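A small Python sketch of the mapping γ from an adversary to the attacks it can perpetrate follows: an attack α is feasible iff its required means-and-knowledge set µα is a subset of Mχ. The string encoding of means and the two example attacks are illustrative and deliberately smaller than the example adversary of Section 4.3.

M_chi = {
    "Knows(eself.ipaddr)",
    "Inject(L, P_inj<tcpoptions>)",
    "Observe(L, e.ipaddr\\CP)",
    "Observe(L, e.tcpoptions)",
    "Match(e, P_inj<tcpoptions>)",
}

mu = {
    "alpha_inj": {"Knows(eself.ipaddr)", "Inject(L, P_inj<tcpoptions>)",
                  "Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpoptions)",
                  "Match(e, P_inj<tcpoptions>)"},
    "alpha_tim": {"Observe(L, e.ipaddr\\CP)", "Observe(L, r.timestamp)",
                  "Match(e.timestamps[], P_td)", "Match(e.timerdrift, P_t<et.timerdrift>)"},
}

def gamma(M_chi, mu):
    # gamma(chi): the attacks whose requirement sets are subsets of M_chi.
    return {alpha for alpha, req in mu.items() if req <= M_chi}

print(gamma(M_chi, mu))   # {'alpha_inj'}: this adversary cannot observe timestamps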
4.2 Deriving adversaries from attacks

We can also establish a reverse mapping, that is, γ−1 from an attack to the set of adversaries that can perpetrate it. While this adversary set will generally be uncountably infinite (as there are an uncountably infinite number of patterns we may use for a Match()), we can discuss a minimal adversary that can perpetrate α simply as the adversary with exactly the means and knowledge such that α can be constructed. Thus, a minimal adversary χmin for attack α is simply the adversary with means and knowledge Mχmin = µα.
Given a set of attacks A, we can construct a minimal adversary χmin by simply taking the union of every means and knowledge set required for the attacks in A; formally,

    A ⊆ γ(χmin)  ⟺  ⋃α∈A µα ≡ Mχmin

Note that A ⊆ γ(χmin). This is because it is entirely possible for the union of two or more attacks' prerequisite means and knowledge sets to combine and allow for an additional attack. The new attack must have all of its prerequisites in the union of the original attacks' means and knowledge sets. In some cases (for example, attacks #8 and #9 in our taxonomy), an attack α1 may have a means and knowledge set that is a superset of another attack α2's means and knowledge set, that is, µα2 ⊂ µα1, and in this rare case, the minimal adversary capable of perpetrating α1 will also be able to carry out α2. We believe that this will only occur with highly similar attacks.
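A short Python sketch of constructing the minimal adversary for a set of attacks A follows. The requirement sets for the two example attacks are hypothetical; as noted above, γ(χmin) may strictly contain A, since the union of requirements can enable attacks outside A.

def minimal_adversary(attack_requirements):
    """attack_requirements: dict mapping attack name -> set of required means/knowledge."""
    M_min = set()
    for req in attack_requirements.values():
        M_min |= req          # union of every mu_alpha for alpha in A
    return M_min

A = {
    "alpha_8": {"Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpflags)"},
    "alpha_9": {"Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpflags)",
                "Match(e, P_profile<tcpflags>)"},
}
M_chi_min = minimal_adversary(A)
# Every attack in A is feasible for chi_min (mu_alpha is a subset of M_chi_min).
print(all(req <= M_chi_min for req in A.values()))   # True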
4.3 Example Mappings

Consider an adversary χ, who can inject data into a log as it is being collected, and then recover the log in anonymized form. However, assume that the log is anonymized such that only pseudonyms remain for the IP addresses, and all other fields except the timestamps and TCP options are removed. If the adversary is familiar with anonymization attack literature, he or she may be able to use the attacks described in some papers to reproduce the methods used for the attacks. Assume that the adversary knows the target entity's unanonymized IP address, clock drift, TCP options, and operating system. A reasonable representation of χ's means and knowledge sets is:
Mχ = { Observe(L, e.ipaddr\CP),
       Observe(L, r.timestamp),
       Observe(L, e.tcpoptions),
       Match(e.timestamps[], Ptd),
       Match(e.timerdrift, Pt⟨et.timerdrift⟩),
       Match(e, Pinj⟨et.tcpoptions⟩),
       Inject(L, Ptd),
       Inject(L, Pinj⟨et.tcpoptions⟩) }

Kχ = { et.ipaddr, et.timerdrift, et.tcpoptions, et.os }

By examining Figure 3, we can see that all the necessary knowledge and means are present to launch the TCP options injection attack; we call this αinj. We can also see from Figure 5 that we have the means for Kohno et al.'s hardware timer drift attack [7], which we call αtim. αinj and αtim are members of γ(χ), and thus, both are possible for χ to perpetrate.
Our example χ above is not the minimal adversary that can perpetrate both attacks: the means Inject(L, Ptd) and the knowledge et.os are unnecessary for either attack (neither is present in µαinj or µαtim). However, these extras may make other attacks possible for χ. For example, it may be possible to use Inject(L, Ptd) to inject a timing pattern into the log based on the adversary's known unique timer drift.
In order to protect against the attacks that we know χ can perform, we need to change the anonymization policy, which changes the available Observe() means. For example, we could anonymize IP addresses with a Black Marker algorithm, making traffic from various entities much harder (or impossible) to distinguish. We could also anonymize timestamps and TCP options.
5. RELATED WORK

5.1 Prior Taxonomies

Pang and Paxson [15] were the first to suggest an attack classification in the domain of log anonymization. They described four basic types of attacks. However, these classifications were neither complete nor mutually exclusive [22]. Furthermore, the classifications are too broad. The plethora of attacks that fall within any one class makes elimination of all attacks in a class, while feasible, a serious detriment to the utility of a log.
Slagell et al.'s classification [22] introduces two additional categories of attacks not considered by Pang and Paxson, and reorganizes the previous, sometimes-overlapping categories into three mutually exclusive sets. This reclassification addresses the first two problems of Pang and Paxson's taxonomy, but it is still too coarse. We require a more fine-grained taxonomy to usefully correlate with our adversarial model.
5.2 Prior Adversarial Models

Coull et al. recently developed their own adversarial model in conjunction with an entropy measurement [3]. While this model is convincing from an information theoretic perspective, it gives no hard guarantees on specific, known attacks, even ignoring all active attacks. We view Coull et al.'s model and the associated metrics as an additional tool that complements our own adversarial model. An analogous contrast would be between signature and anomaly based intrusion detection.
The inspiration for our model comes from Avoine's generic model for attacking secure RFID tags [1]. Avoine's model consists of three channels: "forward", "backward", and "memory". The adversary also has a set of five "means", or capabilities: Query(), Send(), Execute(), Execute*(), and Reveal(). Our model has two (implicit) channels. The forward channel is data injected into the log, and the backward channel is the log itself. We have no need for an equivalent to the memory channel. Our means roughly map to theirs as follows: Query() ⟺ Observe(), Send() ⟺ Inject(), Execute() ⟺ Match(), and Reveal() ⟺ Compromise().
6. CONCLUSIONS AND FUTURE WORK

We have presented a new taxonomy and adversarial model to describe attacks against network log anonymization. This taxonomy is the first to explore and classify a large variety of attacks, and it is the first to classify by a rigorous, empirical evaluation of attack properties. Our adversarial model is capable of capturing all currently known anti-anonymization attacks, and we believe it will be able to incorporate new attacks as they are discovered. Furthermore, we are able to map an adversary constructed with a specific means and knowledge set into a class of potential attacks (and vice versa).
The impact is to provide a methodology that allows organizations to create safer anonymization policies for existing log sanitizers. By allowing data owners to make more informed decisions, they can more easily balance the trade-offs between utility and trust, thus allowing for more collaboration. Hence, this work complements our other work in this area evaluating how anonymization affects utility and developing a flexible anonymization framework [10, 20].

Like any new taxonomy, its value will only be determined in the future by those who have found it useful towards their goals. We have found it useful as it maps so nicely into a first order predicate logic that can express constraints on what cannot be anonymized for utility and what must be anonymized to prevent specific attacks or classes of attacks [6]. In the future, we plan on using this predicate logic along with our anonymization framework to see if anonymization policies meet specified utility and trust requirements.
7. ACKNOWLEDGEMENTS

This material is based upon work supported by the National Science Foundation (NSF) under Award No. CNS 0524643. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the NSF.
8. APPENDIX
Observe(L, et.ipaddr\MD5) ⇒ K
Compromise(et.ipaddr\MD5) → et.ipaddr ⇒ K

Figure 4: Cryptographic Attack Construction - Dictionary Attack on MD5-hashed IPv4 Addresses

Knows(et.ipaddr)
Knows(et.timerdrift)
FOREACH e ∈ L
    Observe(L, e.ipaddr\CP) ⇒ K
    FOREACH record r ∈ e
        Observe(L, r.timestamp) → e.timestamps[] ⇒ K
    ENDFOR
    Match(e.timestamps[], Ptd) → e.timerdrift ⇒ K
    Match(e.timerdrift, Pt⟨et.timerdrift⟩) → et.ipaddr\CP ⇒ K
ENDFOR

Figure 5: Machine Attribute Fingerprinting Attack Construction - Hardware Timer Drift

Observe(L, et.ipaddr\CP) ⇒ K
FOREACH record r ∈ et
    Observe(r.timestamp) ⇒ K
    Observe(r.port) ⇒ K
    Observe(r.flowsize) ⇒ K
    Match(r, Psessions⟨r.timestamp⟩) → et.sessions[] ⇒ K
ENDFOR
FOREACH session s ∈ et.sessions[]
    Match(s, Pwebsites⟨s.port, s.flowsizes⟩) → s.website ⇒ K
ENDFOR

Figure 6: Port-based Fingerprinting Attack Construction: Web Browsing Activity Identification
9. REFERENCES

[1] Avoine, G., "Adversary Model for Radio Frequency Identification," Cryptology ePrint Archive, Report 2005/049, 2005.
[2] Bethencourt, J., Franklin, J., and Vernon, M., "Mapping Internet Sensors with Probe Response Attacks," 14th USENIX Security Symposium, Aug. 2005.
[3] Coull, S., Wright, C. V., Keromytis, A., Monrose, F., and Reiter, M., "Taming the Devil: Techniques for Evaluating Anonymized Network Data," 15th Network and Distributed System Security Symposium (NDSS '08), Feb. 2008.
[4] Coull, S., Wright, C. V., Monrose, F., Collins, M., and Reiter, M., "Playing Devil's Advocate: Inferring Sensitive Information from Anonymized Network Traces," 14th Network and Distributed System Security Symposium (NDSS '07), Feb. 2007.
[5] Coull, S., Collins, M., Wright, C. V., Monrose, F., and Reiter, M., "On Web Browsing Privacy in Anonymized NetFlows," 16th USENIX Security Symposium, Aug. 2007.
[6] King, J., "A Taxonomy, Model, and Method for Secure Network Log Anonymization," Master's Thesis, University of Illinois at Urbana-Champaign, Apr. 2008.
[7] Kohno, T., Broido, A., and Claffy, K. C., "Remote Physical Device Fingerprinting," IEEE Transactions on Dependable and Secure Computing, Apr. 2005.
[8] Koukis, D., Antonatos, S., and Anagnostakis, K., "On the Privacy Risks of Publishing Anonymized IP Network Traces," 10th IFIP Open Conference on Communications and Multimedia Security (CMS '06), Oct. 2006.
[9] Koukis, D., Antonatos, S., Antoniades, D., Markatos, E., and Trimintzios, P., "A Generic Anonymization Framework for Network Traffic," IEEE International Conference on Communications (ICC '06), Jun. 2006.
[10] Lakkaraju, K. and Slagell, A., "Evaluating the Utility of Anonymized Network Traces for Intrusion Detection," 4th SecureComm Conference, Sep. 2008.
[11] Lincoln, P., Porras, P., and Shmatikov, V., "Privacy-preserving Sharing and Correction of Security Alerts," 13th USENIX Security Symposium, Aug. 2004.
[12] McHugh, J., "Testing Intrusion Detection Systems: A Critique of the 1998 and 1999 DARPA Intrusion Detection System Evaluations as Performed by Lincoln Laboratory," ACM Transactions on Information and System Security, Vol. 3(4), Nov. 2000.
[13] Øverlier, L., Brekne, T., and Årnes, A., "Non-expanding Transaction Specific Pseudonymization for IP Traffic Monitoring," 4th International Conference on Cryptology and Network Security (CANS '05), Dec. 2005.
[14] Pang, R., Allman, M., Paxson, V., and Lee, J., "The Devil and Packet Trace Anonymization," SIGCOMM Computer Communication Review, Vol. 36(1):29–38, 2006.
[15] Pang, R. and Paxson, V., "A High-level Programming Environment for Packet Trace Anonymization and Transformation," ACM Conference of the Special Interest Group on Data Communication (SIGCOMM '03), Aug. 2003.
[16] Porras, P. and Shmatikov, V., "Large-scale Collection and Sanitization of Network Security Data: Risks and Challenges," Workshop on New Security Paradigms (NSPW '06), Sep. 2006.
[17] Ramaswamy, R. and Wolf, T., "High-Speed Prefix-Preserving IP Address Anonymization for Passive Measurement Systems," IEEE/ACM Transactions on Networking, Vol. 15(1), Jan. 2007.
[18] Ribeiro, B., Chen, W., Miklau, G., and Towsley, D., "Analyzing Privacy in Enterprise Packet Trace Anonymization," 15th Network and Distributed System Security Symposium (NDSS '08), Feb. 2008.
[19] Sicker, D., Ohm, P., and Grunwald, D., "Legal Issues Surrounding Monitoring during Network Research," Internet Measurement Conference (IMC '07), Jun. 2007.
[20] Slagell, A., Lakkaraju, K., and Luo, K., "FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs," 20th Large Installation System Administration Conference (LISA '06), Dec. 2006.
[21] Slagell, A., Li, Y., and Luo, K., "Sharing Network Logs for Computer Forensics: A New Tool for the Anonymization of NetFlow Records," First Computer Network Forensics Research Workshop, held in conjunction with IEEE SecureComm, Sep. 2005.
[22] Slagell, A. and Yurcik, W., "Sharing Computer Network Logs for Security and Privacy: A Motivation for New Methodologies of Anonymization," First International Workshop on the Value of Security through Collaboration (SECOVAL '05), Sep. 2005.
[23] Xu, J., Fan, J., Ammar, M., and Moon, S., "Prefix-preserving IP Address Anonymization: Measurement-based Security Evaluation and a New Cryptography-based Scheme," 10th IEEE International Conference on Network Protocols (ICNP '02), Nov. 2002.
[24] Zalewski, M. and Stearns, W., "Passive OS Fingerprinting Tool," www.stearns.org/p0f/README, viewed Aug. 1, 2008.
[25] Zhang, Q. and Li, X., "An IP Address Anonymization Scheme with Multiple Access Levels," Advances in Data Communications and Wireless Networks (ICOIN '06), Jan. 2006.
[26] Zhang, Q., Wang, J., and Li, X., "On the Design of Fast Prefix-Preserving IP Address Anonymization Scheme," International Conference on Information and Communications Security (ICICS '07), Dec. 2007.