Justin King∗
Systems & Tech. Group, IBM Rochester
jkking@us.ibm.com

Kiran Lakkaraju
Dept. of Computer Science, UIUC
klakkara@illinois.edu

Adam Slagell
NCSA, UIUC
slagell@illinois.edu
ABSTRACT
In recent years, it has become important for researchers, security incident responders and educators to share network logs, and many log anonymization tools and techniques have been put forth to sanitize this sensitive data source in order to enable more collaboration. Unfortunately, many new attacks have been created, in parallel, that try to exploit weaknesses in the anonymization process. In this paper, we present a taxonomy that relates similar kinds of attacks in a meaningful way. We also present a new adversarial model which we can map into the taxonomy by the types of attacks that can be perpetrated by a particular adversary. This has helped us to negotiate the trade-offs between data utility and trust, by giving us a way to specify the strength of an anonymization scheme as a measure of the types of adversaries it protects against.
Categories and Subject Descriptors

C.2.3 [Network Operations]: Network Monitoring; K.4.1 [Public Policy Issues]: Privacy; K.4.3 [Organizational Impacts]: Computer-supported collaborative work; K.6.m [Miscellaneous]: Security

General Terms

Security, Verification

Keywords

Adversarial Model, Anonymization, Network Logs, Taxonomy
1. INTRODUCTION

As many have argued, sharing of network traces, flow data and other logs is vitally important to research, industry and pedagogy [22, 16, 14]. The network and security research communities rely on large, diverse and non-synthetic [12] datasets for empirical studies. Incident responders need to share real logs to collaborate on investigations of attacks that cross organizational boundaries. Educators and those creating educational materials need example data for student projects and exercises.

∗This work was performed when the author was at UIUC.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC'09, March 8–12, 2009, Honolulu, Hawaii, U.S.A. Copyright 2009 ACM 978-1-60558-166-8/09/03 ... $5.00.
However, it has frequently been pointed out that these sorts of data are often very sensitive [22, 18, 23, 17]. There are security concerns about what information may be revealed regarding the data collector's network and systems, and there are legal questions about betraying the trust of the users whose private actions are being recorded [19]. Anonymization techniques have been developed to alleviate this conflict created by the need for security and trust on the one hand and high utility datasets on the other [14, 23, 25, 26, 13]. In fact, several tools have been created so that data owners can sanitize network data [20, 15, 9, 21]. While very useful, these tools alone do not solve the entire problem. One must know how to create anonymization policies that provide the necessary level of assurance, especially in light of many recently developed attacks against anonymization [8, 4, 5, 18]. These tools alone do nothing if one does not know how to use them intelligently.
As a corollary to this trade-off between information loss and trust, there are always at least two actors with opposing interests in the creation of an anonymization policy. The data owner and the people whose behavior the data describes want policies to protect their privacy and security interests. Even if the data does not directly reveal sensitive information about the data owner's network or services, Internet service providers and such do not want to violate the consumer's trust or their own privacy policies. On the other hand, the person(s) doing analysis of the data, whether it is a researcher or someone investigating a specific intrusion, needs data as close to the original as possible. Alterations can affect analysis, and they want these minimized. Therefore one party wants more anonymization of the data and the other wants less.
Balancing these two different needs requires the ability to specify constraints on which fields must and must not be anonymized, and with what algorithms such fields may be anonymized. In other work [6], we have developed a first order predicate logic in which these constraints may be specified to generate anonymization policies or test them for compliance. However, one must still come up with mechanisms to create intelligent constraints. In [10], we investigated the needs of the party performing the analysis so that they could specify minimal constraints on what cannot be anonymized, trying to measure the utility of anonymized data. This work, instead, focuses on how one can specify minimal constraints on what must be anonymized to meet the minimum assurance level of the data owner.

A natural way to express one's security or trust requirements is in terms of the types of attacks or the type of adversary they must withstand. With this in mind, we have created a taxonomy of the attacks against network log anonymization based upon common attack preconditions. We have created a more thorough and mutually exclusive taxonomy than previous attempts which not only reflects all currently known attacks, but we believe it will be able to incorporate future attacks. Furthermore, since it is based upon attack preconditions, namely what information is necessary for the adversary to perform the attack, it readily translates into constraints for our predicate logic. Therefore, constraints can be easily derived to prevent a single attack, a class of attacks, or arbitrary collections of attacks through simple logical conjunctions of these statements. The other natural way to express a level of assurance is to specify the type of adversary that must be protected against. Therefore, our second contribution in this paper is the development of an adversarial model based upon adversarial means and capabilities. Lastly, we tie together the adversarial model and attack taxonomy so one can translate between constraints based upon either adversaries or the classes of attacks which they can perpetrate.

This paper is organized as follows: Section 2 presents our taxonomy of attacks. Section 3 introduces the basics of our adversarial model, including the notation we have developed to express the model. Section 4 establishes our mapping between the taxonomy and adversarial model. Section 5 discusses related work, and Section 6 presents future work and our conclusions.
2. ATTACK TAXONOMY

Classifying attacks against log anonymization is an early step towards a comprehensive study of the security of anonymization policies. If network owners can select classes of attacks that they wish to prevent, they can then ensure that their anonymization policies meet their security constraints, while allowing as much non-private information as possible to be revealed, thus increasing a dataset's utility.
2.1 Motivations and Requirements

As described previously, we wish to provide network owners with a taxonomy of attacks, the classes of which they can select to prevent, rather than having to focus on individual attacks. We also wish to formally express relationships between attacks, allowing for expression of attack groupings in a logic about anonymization. This taxonomy must be complete (every known attack can be placed in at least one class) and mutually exclusive (no attack can be a member of more than one class). The classes must be fine-grained enough for network owners to select specific classes without seriously impacting the utility of a log. Finally, the classes must be tied together in a more concrete way than a description in natural language.
2.2 Methodology

Nineteen attacks from multiple sources [23, 15, 11, 2, 14, 4, 22, 7, 5, 24] were collected as representative of current attacks against log anonymization. These attacks varied significantly in both sophistication and mechanisms used. The attacks included all the attacks specifically mentioned both by Pang and Paxson [15] and by Slagell et al. [22], and so the attack sample is a strict superset of those considered for those previous taxonomies discussed in Section 5.

Each attack was analyzed for its pre-conditions. We chose to organize around pre-conditions because they capture the initial knowledge of the adversary and properties of the anonymized log. Every anonymization attack requires some initial knowledge and some specific log properties. By organizing around pre-conditions, we can identify the knowledge and log properties that are crucial to the execution of an anti-anonymization attack. Section 3 will demonstrate how we can take this initial knowledge and log properties into account when modeling an adversary and when determining whether a given adversary can carry out a particular attack. A graph was constructed, with nodes for each attack under consideration, as well as for pre-conditions for each attack. In cases where two or more nodes shared a common pre-condition, that condition was represented with a single node. Edges were added to the graph connecting pre-conditions to the attacks they enabled. Attempts were then made to satisfactorily group attacks with common pre-conditions.
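To make the grouping step concrete, the following Python sketch builds such a pre-condition graph and extracts its connected components. The attack names and pre-condition sets below are illustrative placeholders rather than the paper's full data from Figure 1, and the helper names are our own.

from collections import defaultdict

# Hypothetical, abbreviated pre-condition data; the paper's full graph
# (Figure 1) covers nineteen attacks.
preconditions = {
    "#1 web browsing history": {"unanonymized ports", "consistent IP pseudonyms"},
    "#2 host port behavior":   {"unanonymized ports", "consistent IP pseudonyms"},
    "#12 hash dictionary":     {"hash function anonymizes data"},
    "#13 chosen plaintext":    {"crypto function susceptible to chosen plaintext"},
}

# Drop pre-conditions shared by too many attacks (e.g. consistent IP
# pseudonyms), since they do not help distinguish classes.
IGNORED = {"consistent IP pseudonyms"}

# Link attacks that share a non-ignored pre-condition.
by_precondition = defaultdict(set)
for attack, pres in preconditions.items():
    for p in pres - IGNORED:
        by_precondition[p].add(attack)

adjacency = defaultdict(set)
for attacks in by_precondition.values():
    for a in attacks:
        adjacency[a] |= attacks - {a}

# Connected components of this graph are candidate taxonomy classes.
def components(nodes, adjacency):
    seen, groups = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, group = [n], set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            group.add(cur)
            stack.extend(adjacency[cur] - seen)
        groups.append(group)
    return groups

print(components(preconditions.keys(), adjacency))

With the toy data above, attacks #1 and #2 fall into one component (shared unanonymized ports) while #12 and #13 each form their own, mirroring how the full graph yields the classes described next.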
2.3 Results

The full graph using pre-conditions is illustrated in Figure 1. We note that a large number of the attacks (9 of 19) require some form of consistent IP pseudonyms in order to identify individual machines in the log, and an additional attack specifically requires prefix-preserving pseudonyms [23]. Because this pre-condition does not help us distinguish many attacks, we ignore it when generating our taxonomy. Given this, we now have six unconnected subgraphs that can be treated as classes in our taxonomy. Two of these subgraphs each involve a single attack/pre-condition pair, and as both of these attacks essentially involve potential cryptographic weaknesses in the anonymization functions themselves, we group them together into a single class.

Remarkably, this analysis reveals a high-level taxonomy virtually identical to those given by Slagell et al. [22]. Several of these classes share no common pre-conditions, and the only pre-condition shared between classes is a consistent IP mapping (such as a prefix-preserving mapping [23]) that enables identification of individual machines in the log. The major classes in our taxonomy, separated in Figure 1 with black, dashed lines, are as follows:
• Fingerprinting: The process of matching attributes of an anonymized object against attributes of a known object to discover a mapping between anonymized and unanonymized objects.
• Structure Recognition: The act of recognizing structure between objects to use one mapping to discover multiple mappings between anonymized and unanonymized objects.
• Known Mapping: Exploiting the discovery of a mapping between unanonymized and anonymized data in one reference, to undo anonymization of that same data in multiple places.
• Data Injection: Injecting information to be logged with the purpose of later recognizing that data in an anonymized form.
• Cryptographic: Attacking the cryptographic primitives that form the basis for anonymization algorithms.
[Figure 1: Grouping of Anonymization Attacks by Pre-conditions. The graph connects each of the nineteen attacks (#1–#19) to its mandatory and helpful pre-conditions, such as consistent IP pseudonyms, unanonymized port numbers, cleartext timestamps, cleartext header information, and access to multiple logs.]

Ten of the nineteen attacks are classified as Fingerprinting attacks. Our original criticism of the Slagell et al. classification was that it was not fine-grained enough to allow for policy creation that provided a high level of both security and utility. Thus, we have attempted to create subclasses of the Fingerprinting attack class. We first analyzed common prerequisites between attacks. However, no simple approach, as taken to create the major classes, could be found. Instead, we decided to create subclasses using a case-by-case approach:

• Attacks #1, #2, #8, and #9 all relied on unanonymized port numbers (or on being able to extract the port numbers from statistical analysis [4]). We chose to call these attacks Port-based Fingerprinting: fingerprinting by network ports or well-known services.
• Attacks #15 and #16, while not sharing any common pre-conditions, were based on a known machine/user behavior, and so we title these Behavioral Fingerprinting: fingerprinting based on observing an entity's known behavior.
• Attacks #18 and #19 were simple to group together, as both rely on cleartext timestamps. We also decided to group OS Fingerprinting (#17) in this subclass, though we will discuss it further below. We call this Machine Attribute Fingerprinting: fingerprinting based on (relatively) immutable attributes of a machine, such as hardware timer drift or operating system characteristics.
• Alone in its subclass is attack #14, File Transfer Fingerprinting, which is the only fingerprinting attack in our sample which attempts to fingerprint something other than a machine. For this reason, we created a separate subgroup for it, which we believe can be expanded with other examples. We titled this class Network Traffic Fingerprinting: fingerprinting based on recognizing patterns inherent to network traffic such as known file transfer sizes.

Figure 2 illustrates the complete taxonomy.

OS Fingerprinting presents a particular problem for our classification. We feel that OS Fingerprinting is viewed best not as a single attack, but as a group of attacks based on a post-condition. If we are structuring our taxonomy upon pre-conditions, though, it is more sensible to split OS Fingerprinting attacks among the other four subclasses; that is to say, there may be Port-based, Behavioral, Machine Attribute-based, and Network Traffic-based ways to perform OS fingerprinting. Thus, to stop all forms of OS fingerprinting, we need to consider each subclass in turn. We leave these considerations as future work.

2.4 Analysis of Post-conditions

We attempted to analyze attack post-conditions as well. However, a majority of the attacks we examined (10 of 19) were specifically aimed at identifying a machine or deanonymizing an IP address. The remaining nine attacks each have a unique attribute that they deanonymize. As most of these single attack/post-condition pairs seem to be unrelated, and the remaining attacks all focused on the goal of machine identification, we determined that post-condition analysis was not useful for our purposes. A more thorough discussion of our post-condition analysis is available in [6].

3. ADVERSARIAL MODEL

Our model consists of a number of elements, representing logs, log entities, log entities' properties and data, the adversary's knowledge set, patterns, and a collection of four adversarial means. We will explain each of these elements, and Section 3.7 will give a thorough example of how these elements can be combined to describe an adversary's capabilities.

3.1 Entities
[Figure 2: Taxonomy of Attacks Against Anonymization. The root node, Attack, splits into Fingerprinting, Structure Recognition, Known Mapping, Data Injection, and Cryptographic attacks; Fingerprinting is further divided into Port-based, Behavioral, Machine Attribute, and Network Traffic Fingerprinting.]
An entity in our model is any item that an adversary may wish to discover information about. Example entities could be machines, users, or files sent over a network. An entity is essentially synonymous with an object in [3], though we may wish to discover information about entities that is not in a log, even when unanonymized. For example, we may wish to discover a machine's operating system, even though there is no such field in most log formats.

Entities have attributes, some of which may be known to an adversary and some of which may not be known. Entity attributes may be anonymized or unanonymized, and the adversary may wish to discover either form. We will develop notation for anonymized and unanonymized attributes in Section 3.4.

An entity in our model will be denoted as e, with an optional subscript that serves as a way to differentiate entities. For example, a "target" entity may be denoted as et. Attributes will be denoted as the entity's name, followed by a period and the name of the attribute. For example, the target entity's unanonymized IP address may be denoted as et.ipaddr.
3.2 Adversary Knowledge Set

The adversary's knowledge set is a collection of entities and their known attributes. The goal of the adversary is to uncover an entity's attributes and to add them to its knowledge set. Data must be in the adversary's knowledge set before it can be used to uncover more information. We consider adding information to the adversary's knowledge set, but we do not believe there is any reason to consider information loss from this set. We denote the adversary's knowledge set using the symbol K, and we denote addition of information to the adversary's knowledge set using the '⇒' symbol. For example, e.ipaddr ⇒ K means that the entity's unanonymized IP address has been added to the adversary's knowledge set.
The Knows() Assertion

The Knows() assertion lets us state that an adversary knows a specific piece of information. When used to construct attack specifications, Knows() will cause an attack to fail if it is not true, and thus, it is used to specify an attack's prerequisite knowledge. As an example, the statement Knows(et.ipaddr) asserts that et.ipaddr ∈ K, that is, that the adversary knows the target entity's unanonymized IP address.
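As an illustration, a minimal Python sketch of K, the '⇒' addition, and the Knows() assertion might look as follows; the tuple encoding of attributes and the class and method names are assumptions of ours, not notation from the model.

# A minimal sketch of the adversary's knowledge set K.  Facts are keyed by
# (entity, attribute name, anonymization algorithm or None).
class KnowledgeSet:
    def __init__(self):
        self._facts = set()

    def add(self, entity, attr, alg=None):
        """The '=>' operation: add an attribute to K (information is never lost)."""
        self._facts.add((entity, attr, alg))

    def knows(self, entity, attr, alg=None):
        """The Knows() assertion: an attack step fails if this returns False."""
        return (entity, attr, alg) in self._facts

K = KnowledgeSet()
K.add("e_t", "ipaddr")                    # e_t.ipaddr => K
assert K.knows("e_t", "ipaddr")           # Knows(e_t.ipaddr) holds
assert not K.knows("e_t", "ipaddr", "CP") # the CP pseudonym is not yet known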
3.3 Logs

Abbreviation   Meaning
CP             Consistent Pseudonym
PP             Prefix-Preserving IP [23]
BM             Black Marker [20]
RP             Random Permutation [20]

Table 1: Abbreviations for anonymization algorithms and classes of algorithms

Logs are the source of information about entities. When discussing a particular log, we assume that the adversary has some form of that log (either anonymized or unanonymized) in his possession. We will denote logs using the symbol L. We may occasionally place a subscript after this for the few attacks in which multiple logs are involved.
3.4 Anonymized/Unanonymized Data

We will denote unanonymized data using an underline notation. For example, an unanonymized IP address may be denoted as ipaddr.

We will denote anonymized data using an overline notation, followed by a backslash, and then an abbreviation to note what algorithm (or class of algorithms) is used to anonymize. For example, an IP address anonymized using a consistent pseudonym algorithm (such as prefix-preserving IP anonymization [23] or random permutation [20]) may be denoted as ipaddr\CP. A list of abbreviations for use in this model can be found in Table 1.
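To give a feel for two of the algorithm classes in Table 1, here is a toy Python sketch of a consistent pseudonym (CP) and a Black Marker (BM) anonymizer. These stand-ins are our own illustrations, not the schemes of [20, 23]; the key and function names are assumed.

import hmac, hashlib

SECRET = b"per-dataset key"   # assumed secret held by the data owner

def cp(ipaddr):
    # Consistent pseudonym: the same input always yields the same pseudonym,
    # so linkability across records is preserved.
    return "ip-" + hmac.new(SECRET, ipaddr.encode(), hashlib.sha256).hexdigest()[:8]

def bm(value):
    # Black Marker: simply blank the field.
    return "xxx"

print(cp("192.0.2.1") == cp("192.0.2.1"))   # True: consistent pseudonym
print(bm("192.0.2.1"))                      # "xxx"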
3.5 Patterns

Patterns are a way to match information in the adversary's knowledge set. Because of the large variety of computations matching information can involve, we choose not to formalize the construction of a pattern beyond specifying its inputs and outputs. However, Coull's adversarial model [3] may make a good model for a pattern in many cases of fingerprinting or structure recognition attacks. While our model makes use of very unspecified patterns, we view this as being a powerful construct that can be broadly applied. Patterns take a set of input attributes, which may be either anonymized or unanonymized. The key to these inputs is that they are pre-conditions, in that the pattern cannot be matched against if these inputs are not available. For example, a data injection pattern using unanonymized, obscure TCP options cannot be recognized if those options are anonymized (especially by a Black Marker algorithm). Likewise, a pattern relying on packet sizes cannot be applied to a NetFlow log, as it does not report individual packets, but rather, the size of the entire flow between hosts. Thus, if our goal is to stop the application of a pattern, we simply need to prevent the adversary from gaining knowledge of the pattern's pre-conditions.
Patterns are denoted using the symbol P. We have formalized the pre-conditions by listing them in angle brackets after the name of the pattern. For example, if pattern Pex has e.ipaddr\CP and e.tcpoptions as prerequisites, then the full notation will be Pex⟨e.ipaddr\CP, e.tcpoptions⟩.
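A minimal Python sketch of a pattern with explicit pre-condition inputs follows. The pattern body is deliberately left as an arbitrary predicate, mirroring the model's unspecified matching computation; the class, field, and example names are our own assumptions.

class Pattern:
    def __init__(self, name, required_inputs, predicate):
        self.name = name
        self.required_inputs = set(required_inputs)  # the pattern's pre-conditions
        self.predicate = predicate                   # arbitrary matching logic

    def applicable(self, available_attrs):
        # The pattern cannot be matched if any pre-condition input is missing,
        # e.g. because the anonymization policy removed or blacked out that field.
        return self.required_inputs <= set(available_attrs)

    def matches(self, attrs):
        return self.applicable(attrs) and self.predicate(attrs)

# P_inj<tcpoptions>: recognize a previously injected, unusual TCP option string.
P_inj = Pattern("P_inj", {"tcpoptions"},
                lambda a: a["tcpoptions"] == "obscure-injected-options")

print(P_inj.matches({"tcpoptions": "obscure-injected-options"}))  # True
print(P_inj.applicable({"flowsize": 1500}))   # False: required input not present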
3.6 Adversarial Means

Below we consider four "means", or capabilities, of an adversary in our model. Each of these means is in relation to an attribute in a record (usually an attribute belonging to an entity of interest). In combination, these means (along with the Knows() assertion) can describe any attack. The '→' symbol represents that a means produces a given value. For example, f(...) → g says that the means f, when successfully used, produces information g.

3.6.1 The Observe() Means

Form: Observe(L, e.attr) → e.attr

The Observe() means represents knowledge that an adversary can directly learn from a log. This information may be either anonymized or unanonymized, as it is presented in the log in question. Observe() cannot directly un-anonymize data, nor can it infer data not found in the log.
As an example, the act of observing an entity's IP address from a log that has not been anonymized can be represented as Observe(et.ipaddr) → et.ipaddr ⇒ K (we note that some authors have considered this a simple attack [11]). A second example, involving a port number anonymized using the Black Marker algorithm, is very similar:

Observe(et.portnum\BM) → et.portnum\BM ⇒ K.
3.6.2 The Compromise() Means

Form: Compromise(e.attr\alg) → e.attr

The Compromise() means directly reveals the unanonymized version of an anonymized attribute, which can then be added to the adversary's knowledge set.
An example use of Compromise() is an attack on a cryptographically weak anonymization algorithm, such as a dictionary attack against a hash of IPv4 addresses. There are only 2^32 possible addresses (which can be even further constrained due to other knowledge available to the adversary), and constructing a dictionary of these addresses would take under an hour at the relatively modest rate of 1.2 million address hashes per second¹. This attack can be represented as Compromise(et.ipaddr\MD5) → et.ipaddr ⇒ K.
We do not anticipate that most adversaries will have a Compromise() ability for most or any attributes; it primarily will serve to identify adversaries who can take advantage of weak anonymization algorithms. The attacks performed using the Compromise() means will generally be those we have classified as Cryptographic.
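A toy Python sketch of the dictionary attack behind this Compromise() example follows. For illustration it only enumerates a /24 network; the same loop over all 2^32 addresses is what makes the full attack practical. The network and address values are hypothetical.

import hashlib
import ipaddress

def build_dictionary(network="192.0.2.0/24"):
    # Map each candidate address's unsalted MD5 hash back to the address.
    table = {}
    for ip in ipaddress.ip_network(network):
        digest = hashlib.md5(str(ip).encode()).hexdigest()
        table[digest] = str(ip)   # anonymized value -> unanonymized value
    return table

table = build_dictionary()
anonymized = hashlib.md5(b"192.0.2.77").hexdigest()  # et.ipaddr\MD5 as observed in the log
print(table.get(anonymized))                         # recovers "192.0.2.77"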
¹We ran an OpenSSL benchmark and found that it could do over 1.2 million MD5 hashes per second on a single core of a 3.00 GHz Xeon.

3.6.3 The Inject() Means

Form: Inject(L, P⟨attr1, attr2, attr3, ...⟩) → ∅
The Inject() means injects a pattern (involving attributes attr1, attr2, attr3, ...) into a log for later recognition by Match(). It must be performed before any Observe(), Compromise(), or Match() can be done to an individual log (as an injection must happen while the log is being collected, as opposed to the post-collection analysis performed by the other three means). The attacks constructed using an adversary's Inject() means directly map to the class of attacks called "Data Injection" attacks in Section 2.

As an example, consider injecting with a pattern based upon a particular TCP option. If we can construct a pattern Pinj⟨tcpopt⟩ using this TCP option, we can denote its injection to log Lt by Inject(Lt, Pinj⟨tcpopt⟩).
3.6.4 The Match() Means

Form: Match(e, P⟨attr1, attr2, attr3, ...⟩) → e.attr

The Match() means uses a constructed pattern P to reveal an unknown attribute of entity e. It relies on the parameter attributes attr1, attr2, attr3, ..., which may be anonymized or unanonymized, in finding a match and revealing the new attribute (which may also be anonymized or unanonymized). The Match() means is the primary means by which we expect adversaries to launch attacks.

An example use of Match() can be seen using a continuation of the TCP option pattern injected in our Inject() means example. If we have an entity es which we are observing (to see if it is the same as our target entity et that injected the pattern), we can use the Match() means as follows: Match(es, Pinj⟨tcpopt⟩) → et.ipaddr\CP ⇒ K. If es's TCP options match the pattern Pinj, then we infer that es ≡ et, and we then know et's anonymized IP address.
3.7 Constructing an Example Attack

From the Knows() assertion and the four means in Section 3.6 (as well as a set of patterns), we may construct formal descriptions of various attacks against anonymization. Figure 3 demonstrates a data injection attack using obscure TCP options, which we will explore more thoroughly below. Construction of other attacks is quite similar, and further examples are available in the appendix, as well as in the first author's thesis [6].

The attack given in Figure 3 consists of a single Knows() assertion and four means. First, we assert that the adversary must know its own unanonymized IP address (that is, the IP address of the entity in the log that injects the data). Second, the adversary must have the means to Inject() a pattern Pinj using TCP options into the log. Next, after obtaining the log, the adversary loops through all the anonymized entities. For each entity, the IP address (anonymized by an algorithm that produces consistent pseudonyms) and the TCP options used by the entity are observed (Observe()). Finally, we match the entity against the injected pattern. If the pattern matches (Match()) the entity we are examining, we know that the anonymized entity is the same as the adversary's entity, and thus we know the adversary's anonymized IP address.
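As a concrete illustration of the steps just described (formalized in Figure 3 below), here is a minimal Python sketch. The log layout, field names, and the stand-in pseudonym function are assumptions for illustration only.

import hashlib

def cp(ip):
    # Stand-in for a consistent-pseudonym IP anonymizer.
    return "pseudo-" + hashlib.sha256(ip.encode()).hexdigest()[:8]

INJECTED_OPTS = "obscure-injected-options"    # the injected pattern P_inj<eself.tcpoptions>
eself_ipaddr = "203.0.113.9"                  # Knows(eself.ipaddr)

# Anonymized log as released: IPs replaced by consistent pseudonyms,
# TCP options left in the clear.  The second record is the adversary's own
# injected traffic, i.e. the result of Inject(L, P_inj).
log = [
    {"ipaddr_cp": cp("198.51.100.4"), "tcpoptions": "mss=1460"},
    {"ipaddr_cp": cp(eself_ipaddr),   "tcpoptions": INJECTED_OPTS},
]

# FOREACH e in L: Observe() both fields, then Match() against the pattern.
recovered = None
for e in log:
    if e["tcpoptions"] == INJECTED_OPTS:      # Match(e, P_inj) succeeds
        recovered = e["ipaddr_cp"]            # eself.ipaddr\CP => K

print(recovered == cp(eself_ipaddr))          # True: pseudonym of the injector recovered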
Knows(eself.ipaddr)
Inject(L, Pinj⟨eself.tcpoptions⟩)
FOREACH e ∈ L
    Observe(L, e.ipaddr\CP) ⇒ K
    Observe(L, e.tcpoptions) ⇒ K
    Match(e, Pinj⟨eself.tcpoptions⟩) → eself.ipaddr\CP ⇒ K
ENDFOR

Figure 3: Data Injection Attack Construction: Recover Adversary's Anonymized IP Address

4. MAPPING ADVERSARIES & ATTACKS

Our taxonomy and adversarial model are useful in and of themselves. However, their interrelationship is the key feature that lets network dataset owners develop a safe anonymization policy. This policy can be derived by hand from our model, though our other work [6] has shown how policies safe against a given set of attacks can be derived automatically. Regardless of the method of policy derivation, we seek to show the relationship between our taxonomy and model as a way of selecting the threats to anonymization that concern us.
4.1 Mapping adversaries to attacks

An adversary can perpetrate any attack for which it has the prerequisite knowledge and means. Formally, consider an adversary χ with a set of means and knowledge Mχ. An adversary can perpetrate any attack that can be constructed from the elements of Mχ. In other words, an attack α that requires means and knowledge µα can be perpetrated by χ if and only if µα ⊆ Mχ.

Since any attack that can be perpetrated by an adversary has a means and knowledge set that is a subset of the adversary's means and knowledge set (that is, µα ⊆ Mχ), we can explore all possible combinations of the means and knowledge of the adversary to find all attacks possible, given an adversary. Specifically, since the power set P(Mχ) contains all subsets of Mχ, we know that every attack that χ can perpetrate is a member of P(Mχ). In the case that |Mχ| is finite, we can even enumerate all the elements of P(Mχ), and thus all of χ's possible attacks.
However, note that many (if not most) of the elements of P(Mχ) are not actually attacks. For example, an attack that does not have at least one Observe() means does not read any information from a log, and therefore is not an attack on log anonymization. Second, only Compromise() and Match() actually reveal new information, and thus an attack must contain at least one Compromise() or Match() means. Finally, note that patterns, Compromise(), and Match() all rely on specific information that has been read from a log, and thus, to construct an attack using any of these, we must also include their Observe() counterparts. This limits the set of potential attacks that we can construct. Because of this, if we call our mapping of adversaries to attacks γ, then we might say γ(χ) ⊂ P(Mχ).
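A small Python sketch of the mapping γ from an adversary to the attacks it can perpetrate follows: an attack α is feasible iff its required means-and-knowledge set µα is a subset of Mχ. The string encoding of means and the two example attacks are illustrative and deliberately smaller than the example adversary of Section 4.3.

M_chi = {
    "Knows(eself.ipaddr)",
    "Inject(L, P_inj<tcpoptions>)",
    "Observe(L, e.ipaddr\\CP)",
    "Observe(L, e.tcpoptions)",
    "Match(e, P_inj<tcpoptions>)",
}

mu = {
    "alpha_inj": {"Knows(eself.ipaddr)", "Inject(L, P_inj<tcpoptions>)",
                  "Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpoptions)",
                  "Match(e, P_inj<tcpoptions>)"},
    "alpha_tim": {"Observe(L, e.ipaddr\\CP)", "Observe(L, r.timestamp)",
                  "Match(e.timestamps[], P_td)", "Match(e.timerdrift, P_t<et.timerdrift>)"},
}

def gamma(M_chi, mu):
    # gamma(chi): the attacks whose requirement sets are subsets of M_chi.
    return {alpha for alpha, req in mu.items() if req <= M_chi}

print(gamma(M_chi, mu))   # {'alpha_inj'}: this adversary cannot observe timestamps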
4.2 Deriving adversaries from attacks

We can also establish a reverse mapping, that is, γ−1 from an attack to the set of adversaries that can perpetrate it. While this adversary set will generally be uncountably infinite (as there are an uncountably infinite number of patterns we may use for a Match()), we can discuss a minimal adversary that can perpetrate α simply as the adversary with exactly the means and knowledge such that α can be constructed. Thus, a minimal adversary χmin for attack α is simply the adversary with means and knowledge Mχmin = µα.
Given a set of attacks A, we can construct a minimal adversary χmin by simply taking the union of every means and knowledge set required for the attacks in A; formally,

    A ⊆ γ(χmin)  ⟺  ⋃α∈A µα ≡ Mχmin

Note that A ⊆ γ(χmin). This is because it is entirely possible for the union of two or more attacks' prerequisite means and knowledge sets to combine and allow for an additional attack. The new attack must have all of its prerequisites in the union of the original attacks' means and knowledge sets. In some cases (for example, attacks #8 and #9 in our taxonomy), an attack α1 may have a means and knowledge set that is a superset of another attack α2's means and knowledge set, that is, µα2 ⊂ µα1, and in this rare case, the minimal adversary capable of perpetrating α1 will also be able to carry out α2. We believe that this will only occur with highly similar attacks.
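A short Python sketch of constructing the minimal adversary for a set of attacks A follows. The requirement sets for the two example attacks are hypothetical; as noted above, γ(χmin) may strictly contain A, since the union of requirements can enable attacks outside A.

def minimal_adversary(attack_requirements):
    """attack_requirements: dict mapping attack name -> set of required means/knowledge."""
    M_min = set()
    for req in attack_requirements.values():
        M_min |= req          # union of every mu_alpha for alpha in A
    return M_min

A = {
    "alpha_8": {"Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpflags)"},
    "alpha_9": {"Observe(L, e.ipaddr\\CP)", "Observe(L, e.tcpflags)",
                "Match(e, P_profile<tcpflags>)"},
}
M_chi_min = minimal_adversary(A)
# Every attack in A is feasible for chi_min (mu_alpha is a subset of M_chi_min).
print(all(req <= M_chi_min for req in A.values()))   # True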
4.3 Example Mappings

Consider an adversary χ, who can inject data into a log as it is being collected, and then recover the log in anonymized form. However, assume that the log is anonymized such that only pseudonyms remain for the IP addresses, and all other fields except the timestamps and TCP options are removed. If the adversary is familiar with anonymization attack literature, he or she may be able to use the attacks described in some papers to reproduce the methods used for the attacks. Assume that the adversary knows the target entity's unanonymized IP address, clock drift, TCP options, and operating system. A reasonable representation of χ's means and knowledge sets is:
Mχ = { Observe(L, e.ipaddr\CP),
       Observe(L, r.timestamp),
       Observe(L, e.tcpoptions),
       Match(e.timestamps[], Ptd),
       Match(e.timerdrift, Pt⟨et.timerdrift⟩),
       Match(e, Pinj⟨et.tcpoptions⟩),
       Inject(L, Ptd),
       Inject(L, Pinj⟨et.tcpoptions⟩) }

Kχ = { et.ipaddr, et.timerdrift, et.tcpoptions, et.os }

By examining Figure 3, we can see that all the necessary knowledge and means are present to launch the TCP options injection attack; we call this αinj. We can also see from Figure 5 that we have the means for Kohno et al.'s hardware timer drift attack [7], which we call αtim. αinj and αtim are members of γ(χ), and thus, both are possible for χ to perpetrate.
Our example χ above is not the minimal adversary that can perpetrate both attacks: the means Inject(L, Ptd) and the knowledge et.os are unnecessary for either attack (neither is present in µαinj or µαtim). However, these extras may make other attacks possible for χ. For example, it may be possible to use Inject(L, Ptd) to inject a timing pattern into the log based on the adversary's known unique timer drift.
In order to protect against the attacks that we know χ can perform, we need to change the anonymization policy, which changes the available Observe() means. For example, we could anonymize IP addresses with a Black Marker algorithm, making traffic from various entities much harder (or impossible) to distinguish. We could also anonymize timestamps and TCP options.
5. RELATED WORK

5.1 Prior Taxonomies

Pang and Paxson [15] were the first to suggest an attack classification in the domain of log anonymization. They described four basic types of attacks. However, these classifications were neither complete nor mutually exclusive [22]. Furthermore, the classifications are too broad. The plethora of attacks that fall within any one class makes elimination of all attacks in a class, while feasible, a serious detriment to the utility of a log.
Slagell et al.'s classification [22] introduces two additional categories of attacks not considered by Pang and Paxson, and reorganizes the previous, sometimes-overlapping categories into three mutually exclusive sets. This reclassification addresses the first two problems of Pang and Paxson's taxonomy, but it is still too coarse. We require a more fine-grained taxonomy to usefully correlate with our adversarial model.
5.2 Prior Adversarial Models

Coull et al. recently developed their own adversarial model in conjunction with an entropy measurement [3]. While this model is convincing from an information theoretic perspective, it gives no hard guarantees on specific, known attacks, even ignoring all active attacks. We view Coull et al.'s model and the associated metrics as an additional tool that complements our own adversarial model. An analogous contrast would be between signature and anomaly based intrusion detection.
The inspiration for our model comes from Avoine's generic model for attacking secure RFID tags [1]. Avoine's model consists of three channels: "forward", "backward", and "memory". The adversary also has a set of five "means", or capabilities: Query(), Send(), Execute(), Execute*(), and Reveal(). Our model has two (implicit) channels. The forward channel is data injected into the log, and the backward channel is the log itself. We have no need for an equivalent to the memory channel. Our means roughly map to theirs as follows: Query() ⟺ Observe(), Send() ⟺ Inject(), Execute() ⟺ Match(), and Reveal() ⟺ Compromise().
6. CONCLUSIONS AND FUTURE WORK

We have presented a new taxonomy and adversarial model to describe attacks against network log anonymization. This taxonomy is the first to explore and classify a large variety of attacks, and it is the first to classify by a rigorous, empirical evaluation of attack properties. Our adversarial model is capable of capturing all currently known anti-anonymization attacks, and we believe it will be able to incorporate new attacks as they are discovered. Furthermore, we are able to map an adversary constructed with a specific means and knowledge set into a class of potential attacks (and vice versa).
The impact is to provide a methodology that allows organizations to create safer anonymization policies for existing log sanitizers. By allowing data owners to make more informed decisions, they can more easily balance the trade-offs between utility and trust, thus allowing for more collaboration. Hence, this work complements our other work in this area evaluating how anonymization affects utility and developing a flexible anonymization framework [10, 20].

Like any new taxonomy, its value will only be determined in the future by those who have found it useful towards their goals. We have found it useful as it maps so nicely into a first order predicate logic that can express constraints on what cannot be anonymized for utility and what must be anonymized to prevent specific attacks or classes of attacks [6]. In the future, we plan on using this predicate logic along with our anonymization framework to see if anonymization policies meet specified utility and trust requirements.
7. ACKNOWLEDGEMENTS

This material is based upon work supported by the National Science Foundation (NSF) under Award No. CNS 0524643. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the NSF.
8. APPENDIX
Observe(L, et.ipaddr\MD5) ⇒ K
Compromise(et.ipaddr\MD5) → et.ipaddr ⇒ K

Figure 4: Cryptographic Attack Construction - Dictionary Attack on MD5-hashed IPv4 Addresses

Knows(et.ipaddr)
Knows(et.timerdrift)
FOREACH e ∈ L
    Observe(L, e.ipaddr\CP) ⇒ K
    FOREACH record r ∈ e
        Observe(L, r.timestamp) → e.timestamps[] ⇒ K
    ENDFOR
    Match(e.timestamps[], Ptd) → e.timerdrift ⇒ K
    Match(e.timerdrift, Pt⟨et.timerdrift⟩) → et.ipaddr\CP ⇒ K
ENDFOR

Figure 5: Machine Attribute Fingerprinting Attack Construction - Hardware Timer Drift

Observe(L, et.ipaddr\CP) ⇒ K
FOREACH record r ∈ et
    Observe(r.timestamp) ⇒ K
    Observe(r.port) ⇒ K
    Observe(r.flowsize) ⇒ K
    Match(r, Psessions⟨r.timestamp⟩) → et.sessions[] ⇒ K
ENDFOR
FOREACH session s ∈ et.sessions[]
    Match(s, Pwebsites⟨s.port, s.flowsizes⟩) → s.website ⇒ K
ENDFOR

Figure 6: Port-based Fingerprinting Attack Construction: Web Browsing Activity Identification
9. REFERENCES

[1] Avoine, G., "Adversary Model for Radio Frequency Identification," Cryptology ePrint Archive, Report 2005/049, 2005.
[2] Bethencourt, J., Franklin, J., and Vernon, M., "Mapping Internet Sensors with Probe Response Attacks," 14th USENIX Security Symposium, Aug. 2005.
[3] Coull, S., Wright, C. V., Keromytis, A., Monrose, F., and Reiter, M., "Taming the Devil: Techniques for Evaluating Anonymized Network Data," 15th Network and Distributed System Security Symposium (NDSS '08), Feb. 2008.
[4] Coull, S., Wright, C. V., Monrose, F., Collins, M., and Reiter, M., "Playing Devil's Advocate: Inferring Sensitive Information from Anonymized Network Traces," 14th Network and Distributed System Security Symposium (NDSS '07), Feb. 2007.
[5] Coull, S., Collins, M., Wright, C. V., Monrose, F., and Reiter, M., "On Web Browsing Privacy in Anonymized NetFlows," 16th USENIX Security Symposium, Aug. 2007.
[6] King, J., "A Taxonomy, Model, and Method for Secure Network Log Anonymization," Master's Thesis, University of Illinois at Urbana-Champaign, Apr. 2008.
[7] Kohno, T., Broido, A., and Claffy, K. C., "Remote Physical Device Fingerprinting," IEEE Transactions on Dependable and Secure Computing, Apr. 2005.
[8] Koukis, D., Antonatos, S., and Anagnostakis, K., "On the Privacy Risks of Publishing Anonymized IP Network Traces," 10th IFIP Open Conference on Communications and Multimedia Security (CMS '06), Oct. 2006.
[9] Koukis, D., Antonatos, S., Antoniades, D., Markatos, E., and Trimintzios, P., "A Generic Anonymization Framework for Network Traffic," IEEE International Conference on Communications (ICC '06), Jun. 2006.
[10] Lakkaraju, K. and Slagell, A., "Evaluating the Utility of Anonymized Network Traces for Intrusion Detection," 4th SecureComm Conference, Sep. 2008.
[11] Lincoln, P., Porras, P., and Shmatikov, V., "Privacy-preserving Sharing and Correction of Security Alerts," 13th USENIX Security Symposium, Aug. 2004.
[12] McHugh, J., "Testing Intrusion Detection Systems: A Critique of the 1998 and 1999 DARPA Intrusion Detection System Evaluations as Performed by Lincoln Laboratory," ACM Transactions on Information and System Security, Vol. 3(4), Nov. 2000.
[13] Øverlier, L., Brekne, T., and Årnes, A., "Non-expanding Transaction Specific Pseudonymization for IP Traffic Monitoring," 4th International Conference on Cryptology and Network Security (CANS '05), Dec. 2005.
[14] Pang, R., Allman, M., Paxson, V., and Lee, J., "The Devil and Packet Trace Anonymization," SIGCOMM Computer Communication Review, Vol. 36(1):29–38, 2006.
[15] Pang, R. and Paxson, V., "A High-level Programming Environment for Packet Trace Anonymization and Transformation," ACM Conference of the Special Interest Group on Data Communication (SIGCOMM '03), Aug. 2003.
[16] Porras, P. and Shmatikov, V., "Large-scale Collection and Sanitization of Network Security Data: Risks and Challenges," Workshop on New Security Paradigms (NSPW '06), Sep. 2006.
[17] Ramaswamy, R. and Wolf, T., "High-Speed Prefix-Preserving IP Address Anonymization for Passive Measurement Systems," IEEE/ACM Transactions on Networking, Vol. 15(1), Jan. 2007.
[18] Ribeiro, B., Chen, W., Miklau, G., and Towsley, D., "Analyzing Privacy in Enterprise Packet Trace Anonymization," 15th Network and Distributed System Security Symposium (NDSS '08), Feb. 2008.
[19] Sicker, D., Ohm, P., and Grunwald, D., "Legal Issues Surrounding Monitoring during Network Research," Internet Measurement Conference (IMC '07), Jun. 2007.
[20] Slagell, A., Lakkaraju, K., and Luo, K., "FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs," 20th Large Installation System Administration Conference (LISA '06), Dec. 2006.
[21] Slagell, A., Li, Y., and Luo, K., "Sharing Network Logs for Computer Forensics: A New Tool for the Anonymization of NetFlow Records," First Computer Network Forensics Research Workshop, held in conjunction with IEEE SecureComm, Sep. 2005.
[22] Slagell, A. and Yurcik, W., "Sharing Computer Network Logs for Security and Privacy: A Motivation for New Methodologies of Anonymization," First International Workshop on the Value of Security through Collaboration (SECOVAL '05), Sep. 2005.
[23] Xu, J., Fan, J., Ammar, M., and Moon, S., "Prefix-preserving IP Address Anonymization: Measurement-based Security Evaluation and a New Cryptography-based Scheme," 10th IEEE International Conference on Network Protocols (ICNP '02), Nov. 2002.
[24] Zalewski, M. and Stearns, W., "Passive OS Fingerprinting Tool," www.stearns.org/p0f/README, viewed Aug. 1, 2008.
[25] Zhang, Q. and Li, X., "An IP Address Anonymization Scheme with Multiple Access Levels," Advances in Data Communications and Wireless Networks (ICOIN '06), Jan. 2006.
[26] Zhang, Q., Wang, J., and Li, X., "On the Design of Fast Prefix-Preserving IP Address Anonymization Scheme," International Conference on Information and Communications Security (ICICS '07), Dec. 2007.