Systemsbiology
Vol.22no.92006,pages1152–1153
doi:10.1093/bioinformatics/btl038
FANMOD:atoolforfastnetworkmotifdetection
SebastianWernickeÃandFlorianRasche
¨tJena,Ernst-Abbe-Platz2,07743Jena,GermanyInstitutfu¨rInformatik,Friedrich-Schiller-Universita
ReceivedonJanuary3,2006;revisedandacceptedonFebruary1,2006
AdvanceAccesspublicationFebruary2,2006AssociateEditor:JonathanWren
ABSTRACT
Summary:Motifsaresmallconnectedsubnetworksthatanetworkdisplaysinsignificantlyhigherfrequenciesthanwouldbeexpectedforarandomnetwork.Theyhaverecentlygatheredmuchattentionasaconcepttouncoverstructuraldesignprinciplesofcomplexbiologi-calnetworks.FANMODisatoolforfastnetworkmotifdetection;itreliesonrecentlydevelopedalgorithmstoimprovetheefficiencyofnetworkmotifdetectionbysomeordersofmagnitudeoverexistingtools.Thisfacilitatesthedetectionoflargermotifsinbiggernetworksthanprevi-ouslypossible.AdditionalbenefitsofFANMODaretheabilitytoanalyzecolorednetworks,agraphicaluserinterfaceandtheabilitytoexportresultstoavarietyofmachine-andhuman-readablefileformatsinclud-ingcomma-separatedvaluesandHTML.
Availability:Thetoolisfreelyavailableonlineathttp://www.minet.uni-jena.de/~wernicke/motifs/andrunsunderLinux,MacOSandWindows.
Contact:wernicke@minet.uni-jena.de
beenproposedbyKashtanetal.(2004).However,amongsomeotherdrawbacks,thisalgorithmprovidesonlynon-uniformsam-plingandscalespoorlyasmotifsizeincreases;amoredetailedanalysisoftheseproblemsisgivenbyWernicke(2005).
FANMODisatoolfornetworkmotifdetectionthatimplementsanovelalgorithmcalledRAND-ESU(Wernicke,2005)toenumerateandsamplesubgraphs.Thisalgorithmisordersofmagnitudefasterthananyotherexistingalgorithmforthistask,facilitatingthedetec-tionoflargermotifsinbiggernetworksthanpreviouslypossible.Moreover,FANMODallowsformotifdetectionincolorednet-works,somethingnotpossiblewithotherexistingtools.
2COMPARISONWITHEXISTINGTOOLS
1INTRODUCTION
Manybiologicalnetworkscontaincertainsmallsubnetworksinsignificantlyhigherfrequenciesthanrandomnetworks.Miloetal.(2002,2004)proposedtousesuchoverabundant‘topologicalmodules’(Vespignani,2003)foruncoveringthestructuraldesignprinciplesofbiologicalnetworks,therebycoiningthetermnetworkmotifsforthem.Theanalysisofnetworkmotifshasledtointer-estingresults,e.g.intheareasofprotein–proteininteractionprediction(AlbertandAlbert,2004),hierarchicalnetworkdecom-position(Itzkovitzetal.,2005)andtheanalysisoftemporalgeneexpressionpatterns(Kaliretal.,2001).
Findingnetworkmotifsconsistsofthreecomputationallyexpensivesubtasks:
Findingwhichsubgraphsoccurintheinputnetworkandinwhatnumber.
Determiningwhichofthesesubgraphsaretopologicallyequiva-lent(i.e.isomorphic)andgroupingthemintosubgraphclassesaccordingly.
Determiningwhichsubgraphclassesaredisplayedatmuchhigherfrequenciesthaninrandomgraphs(underaspecifiedrandomgraphmodel).
Someworkhasbeenspentonthesecondsubtaskbut—untilrecently—considerablylessontheothertwo.Inordertospeedupthefirstsubtask,analgorithmforsamplingsubgraphshas
ÃTowhomcorrespondenceshouldbeaddressed.
WeareawareoftwotoolsthatperformsomewhatsimilartasksasFANMODandallowforthedetectionandanalysisofnetworkmotifsindirectedandundirectednetworks,namelyMFINDER(Kashtanetal.,2002)andMAVISTO(Schreiberand
¨bbermeyer,2005).SomeworksalsomentionPAJEKSchwo
(BatageljandMrvar,2003)inthiscontext,amulti-functionaltoolfornetworkanalysis.However,PAJEKisoflimiteduseinnetworkmotifanalysis;whileitsupportsthesearchforalloccur-rencesofacertainpatterninanetwork,theenumerationofsub-graphsandstatisticalcomparisonwithrandomgraphsarenotsufficientlysupported.
BothMFINDERandMAVISTOsupportthedetectionofnetworkmotifsconsistingofuptoeightvertices,butotherwisethesetoolshaveadifferentfocus:MFINDERisacommand-linetoolthatisconcernedsolelywiththedetectionofnetworkmotifswhereasMAVISTOvisualizesoccurencesofamotifinanetworkbyaforce-directedgraphlayoutalgorithm.MFINDERalsoincorporatesabroadrangeofrandomgraphmodelsfordeterminingthefrequencyofsubgraphsinrandomgraphs.AtoolnamedMDRAWhasrecentlybeenreleasedinordertovisualizetheoutputofMFINDER,itisavailablefromthesamewebsiteasMFINDER.ThemaindisadvantageofusingMFINDERandMAVISTOfornetworkmotifdetectionisthattheemployedalgorithmsforsubgraphenumerationandsampling(thelatterbeingonlysupportedbyMFINDER)arecomparablyslowandscalepoorlyasthesub-graphsizeincreases.Asanexample,onalaptopequippedwitha1.5GHzPentiumMprocessorand512MBRAM,enumeratingall1.4·106size-5subgraphsinthetranscriptionalnetworkofEscherichiaColi(Shen-Orretal.,2002)requires620swithMAVISTOwhereasMFINDERtakes180s.OurtoolFANMODperformsthistaskinlessthan10s.
OtheradvantagesofFANMODoverthetwoexistingtoolsincludetheabilitytohandlenetworkswithcoloredverticesand
ÓTheAuthor2006.PublishedbyOxfordUniversityPress.Allrightsreserved.ForPermissions,pleaseemail:journals.permissions@oxfordjournals.org
1152
Fanmod
Fig.1.Detectingsize-4networkmotifswithcolorededgesinthetranscriptionalnetworkofE.ColiusingtheFANMODinterface(left).Viaanexportfilter(middle),theobtainedresultscanbeexportedtoHTML(right).
edgesinordertomodeldifferenttypesofinteractionsbetweendifferentkindsofentities(e.g.tofindmotifsinprotein–geneinteractionnetworks)andtheabilitytoaccuratelypredicttheoverallrunningtimeofitsmotifdetectionalgorithm(contrarytopreviousalgorithms,RAND-ESUallowsforaquickandaccurateestimationofthetotalnumberofsize-ksubgraphsinagivennetwork).WhileFANMODdoesnotincorporateamoduleforvisualizingconcreteappearancesofmotifsinanetwork,variousoutputformats(includ-ingcomma-separatedvaluesandHTML)facilitatetheanalysisandfurtherprocessingofresults.
HTMLexportfunctionwithvariousfilters(e.g.afilterforavoidingso-called‘danglingmotifs’thatcontaindegree-onevertices)allowsforthequickinspectionanddistributionofresults(seeFig.1foranexample).
ACKNOWLEDGEMENTS
ThisworkwassupportedbytheDeutscheTelekomStiftungandtheDeutscheForschungsgemeinschaft(DFG)projectPEAL(parame-terizedcomplexityandexactalgorithms,NI369/1).ConflictofInterest:nonedeclared.
3IMPLEMENTATIONANDFEATURES
TheFANMODtooliswrittenintheC++programminglanguageandconsistsofapproximately7000linesofnon-librarycode.Thegraphicaluserinterfaceandothersystem-dependentfeaturesareimplementedwiththewxWIDGETSframework(Smartetal.,2005),whichisavailableforabroadrangeofplatformsincludingLinux,MacOSandWindows.
FANMODcandetectnetworkmotifsuptoasizeofeightvertices.Forthis,allsubgraphsofthegivensizeareeitherenumer-atedoruniformlysampledintheinputnetworkusingthealgorithmdescribedbyWernicke(2005).Thesubgraphsaregroupedintoisomorphicsubgraphclassesusinganimplementationofthecano-nicalgraph-labelingalgorithmNAUTY(McKay,1981).Finally,FANMODdeterminesthefrequencyofsubgraphclassesinauser-specifiednumberofrandomgraphs.Therandomgraphsaregeneratedfromtheoriginalnetworkbyswitchingedgesbetweenvertices;theusermaychoosebetweendifferentswitchingschemesinordertopreservecertaingraphproperties(suchasthenumberofbidirectionaledgesindirectednetworks)duringtherandomization.Forcolorednetworks,motifsofsizeuptosevenvertices(depend-ingonthenumberofcolorsthatareusedforedgesandvertices)canbedetected.Thespeedofthetoolisnotaffectedbycolors,ingeneralitisevenalittlefasterbecausethecanonicalgraphlabelingisfacilitated.Therandomnetworkscanoptionallypreservethenumberofedgesbetweenverticesofdifferentcolors.
Thecalculatedsignificanceofeachsubgraphinthenetwork(expressedasP-ValuesandZ-Scoreswithrespecttothegeneratedrandomnetworks)canbeexportedtoavarietyofformats.An
REFERENCES
Albert,I.andAlbert,R.(2004)Conservednetworkmotifsallowprotein–proteininter-actionprediction.Bioinformatics,20,3346–3352.
Batagelj,V.andMrvar,A.(2003)Pajek—analysisandvisualizationoflargenetworks.
¨nger,M.andMutzel,P.(eds),GraphDrawingSoftware.Springer-Verlag,InJu
pp.77–103.
Itzkovitz,S.etal.(2005)Coarse-grainingandself-dissimilarityofcomplexnetworks.
Phys.Rev.E.,71,016127.
Kalir,S.etal.(2001)Orderinggenesinaflagellapathwaybyanalysisofexpression
kineticsfromlivingbacteria.Science,292,2080–2083.
Kashtan,N.,Itzkovitz,S.,Milo,R.andAlon,U.(2002)Mfindertoolguide.Technical
report,DepartmentofMolecularCellBiologyandComputerScienceandAppliedMathematics,WeizmanInstituteofScience,Israel.
Kashtan,N.etal.(2004)Efficientsamplingalgorithmforestimatingsubgraphcon-centrationsanddetectingnetworkmotifs.Bioinformatics,20,1746–1758.McKay,B.D.(1981)Practicalgraphisomorphism.Congr.Numer.,30,45–87.
Milo,R.etal.(2004)Superfamiliesofdesignedandevolvednetworks.Science,
303,1538–1542.
Milo,R.etal.(2002)Networkmotifs:simplebuildingblocksofcomplexnetworks.
Science,298,824–827.
¨bbermeyer,H.(2005)Mavisto:atoolfortheexplorationofSchreiber,F.andSchwo
networkmotifs.Bioinformatics,21,3572–3574.
Shen-Orr,S.S.etal.(2002)Networkmotifsinthetranscriptionalregulationnetworkof
EscherichiaColi.Nat.Gen.,31,64–68.
Smart,J.,Hock,K.andCsomor,S.(2005)Cross-PlatformGUIProgrammingwith
wxWidgets.PrenticeHallPTR.
Vespignani,A.(2003)Evolutionthinksmodular.Nat.Gen.,35,118–119.
Wernicke,S.(2005)Afasteralgorithmfordetectingnetworkmotifs.InProceedingsof
the5thWorkshoponAlgorithmsinBioinformatics(WABI’05),LectureNotesinBioinformatics.Vol.3692,pp.165–177.(Longversionsubmitted).
1153
因篇幅问题不能全部显示,请点此查看更多更全内容