Journal of US-China Medical Science, ISSN 1548-6648, USA (Serial No. 50)

Triple-type Theory of Statistics and Its Role of Guidance for Scientific Research Work

HU Liang-ping, LIU Hong-mei
(Beijing Consulting Center of Biomedical Statistics, Beijing 100850)

Abstract: Objective To offer a practical solution to the situation in which many researchers make frequent errors and can hardly capture the essence of statistics despite learning and training in related courses, in the hope of improving the validity of statistical research design and statistical analysis, and thereby the overall quality of the world's research projects and academic publications, so as to guarantee the scientific reliability and exactness of our research endeavors. Methods Based on long experience in teaching and research in statistics, the author points out the crux of the misuse of statistics and proposes a "triple-type theory of statistics", which may provide a key to solving this issue. The "triple-type theory" is summarized as follows: any statistical problem in practical research presents itself in three types: the expressive type, the prototype and the standardized type. It is of critical importance to understand their correlations in a specific study before applying statistical analysis methods, which helps researchers "see the essence through the superficial phenomena", i.e., apply statistics rationally and properly. Results Our investigations reveal that for certain problems, the three types are mutually identical; for some problems, the "prototype" is the "standardized type"; for some others, however, the three types are distinct from each other. Particularly in some multifactor experimental studies, researchers have committed the "incomplete control error" in setting experimental groups, which leads to the non-existence of the "standardized type" corresponding to the "prototype". This difficult problem could be resolved by introducing the concept and method of "splitting groups". Conclusion Once the three types and their correlations for each specific question are clarified, a good research design, proper presentation and interpretation of data, and a sound selection and application of statistical methods can be carried out naturally and easily. The triple-type theory of statistics can help researchers avoid committing statistical errors, or at least reduce the rate of misuse, and therefore enhance the quality and level of our biomedical research. It may also facilitate the compilation of new textbooks and the reform of teaching methodology, providing guidelines for the application and development of biomedical statistics.

Key words: expressive type; prototype; standardized type; research design; statistical analysis

HU Liang-ping (1955-), male, M.D., professor of Beijing Consulting Center of Biomedical Statistics; research field: medical statistics.

INTRODUCTION

Published data indicate that China has made remarkable achievements in science and technology, but there is still a big gap between China and the world's leading countries. Bibliometrics experts have found that the number of China's scientific and technical journals and articles covered or cited by the Science Citation Index (SCI) and the Engineering Index (EI) is increasing; however, it remains limited and not comparable with that of the US, UK, Netherlands and Germany. What are the causes? The author, after investigation, finds that many researchers' command of research design and statistical analysis is surprisingly inadequate, which is the actual root cause of the problem [1]. Statistics textbooks, filled with standardized models, are far from instructive and practical for research work. In fact, researchers have to study multiple factors, observation indexes and latent variables, most of which are not the "standardized types" described in the teaching materials. What they often come across in research work are practical problems that seem self-evident yet are elusive in essence, and that sometimes even present themselves in a pseudo form (defined as the "expressive type"). It is common sense that we must find the essential factors of the problem (defined as the "prototype"), convert the "prototype" into the "standardized type", and then use classical statistical methods to solve the problem. The author attempts to generalize and summarize the above thinking process, which can be applied to solve various simple or complex problems, and brings forth a new theory: the "Triple-type Theory of Statistics".
MATERIALS AND METHODS

1. Triple-type Theory of Statistics

Statistics has become an indispensable tool in biomedical research. A sound application of statistical techniques relies on the rational guidance of statistical thinking and theories. Here the crux of the matter is "seeing the essence through the superficial phenomena". A large number of practical statistical problems in biomedical research manifest themselves in "expressive types" with pseudo features. Many researchers take these "expressive types" for the "standardized types" described in most textbooks and indiscriminately apply ready-made methods to solve the problems. Quite often, they make errors. To ensure a scientifically valid research design and an accurate analysis of research findings, researchers must fully understand and find the "prototype" of the problem, which reflects the essentials of its "expressive type", convert the "prototype" into the "standardized type", and then select the proper statistical methods. This is the "Triple-type theory of statistics", a new theory proposed by the author in the hope of helping researchers solve practical statistical problems in a creative way [2].

The "expressive type" of most problems may present itself in two situations. First, the research design is nonstandard and, in addition, the presentation or interpretation of the research data or procedures is vague or ambiguous; second, the research design is standard, whereas the presentation or interpretation of the research data or procedures is vague or ambiguous.

The "prototype" reflects the essence of a problem. It contains some hints as to how many factors and how many levels of each factor are involved, though such information is not manifest in form.
The "standardized type" presents, in the most comprehensible way, the information regarding the relevant factors, the number of factors, the levels of each factor and the way these levels are combined. Analysts must further detect any repeated-measurement factors, consider whether the factors take effect at different points of time, and figure out which factors have a major and which a minor effect on the observation index. A proper type of research design can then be determined. Once research data are obtained, it will be easy to select appropriate statistical methods in accordance with the background of the research, the objective of the statistical analysis and the properties of the data.

2. Application of Triple-type Theory of Statistics

Case 1. A researcher wanted to investigate the efficacy of drug A and drug B in elevating white blood cell (WBC) counts. The values of WBC counts prior to and after drug administration were used as the index for assessing therapeutic effect. He set the following 4 groups, with 20 mice in each group, and observed the values of WBC counts.

Group 1. The blank control group
Group 2. Single drug A group
Group 3. 2-drug (drug A and drug B) combination group
Group 4. The blank control group of Group 3

Question: Are there any errors in his research design? What are the "expressive type", the "prototype" and the "standardized type" of the experimental design respectively?

3. Error Discrimination

The investigator needn't have set two blank control groups and thereby wasted 20 mice. To observe whether the combined use of drug A and drug B produces a synergistic or an antagonistic effect, the researcher should have set not only a single drug A group and a 2-drug (A and B) combination group, but a single drug B group as well. The "expressive type", "prototype" and "standardized type" of the experimental design are shown in Table 1, Table 2 and Table 3 respectively.
Table 1 Expressive type of case 1
Group categories                                Elevated values of WBC counts (x̄ ± s)
① Blank control                                 X
② Single drug A                                 X
③ 2-drug (drug A and drug B) combination        X
④ Blank control of Group 3                      X

Obviously, we can find two errors in the above "expressive type": "overuse of the blank control" and "incomplete arrangement in setting control groups", i.e., some groups have corresponding control groups while others do not.

Table 2 Prototype of case 1
Group categories                                Elevated values of WBC counts (measurement unit) (x̄ ± s)
① Blank control                                 X
② Single drug A                                 X
③ Single drug B                                 X
④ 2-drug (drug A and drug B) combination        X

A "single drug B group" is added in the "prototype" so as to demonstrate the effects produced by drug A alone, by drug B alone, by the combined use of drug A and drug B, and by the use of neither drug. If the experiment is conducted on the basis of such a "prototype", it will be extremely difficult to analyze the data, since each group contains 20 changed values of WBC counts. Many people may indiscriminately apply the classical models in statistics textbooks, categorize the data as "quantitative data of single-factor with four-level design", and consequently select the method of "analysis of variance or rank sum test for single-factor with four-level design". Actually, they are misled by the superficial information, since there are three possibilities for a 4-group experimental design: a real single-factor design with four levels; an incomplete combination of the levels of two or more factors; and a complete combination of two factors with two levels each. The researcher must determine the proper type of the design before selecting appropriate analysis methods to process the data. Case 1 falls into the third type. Next, we convert the "prototype" into the corresponding "standardized type", which is shown in Table 3.
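The step from group labels to factor levels can be sketched in code. The sketch below is only an illustration under stated assumptions: the group labels, the `FACTOR_LEVELS` mapping and the function `classify_design` are hypothetical names introduced here, not part of the original study. It recodes each group as a pair of levels (drug A used or not, drug B used or not) and checks whether the groups form a complete 2×2 combination, which is one way to tell the incomplete arrangement of the expressive type from the complete arrangement of the prototype.

```python
from itertools import product

# Hypothetical recoding of each group label into levels of two factors:
# (drug A used?, drug B used?). The labels are illustrative assumptions.
FACTOR_LEVELS = {
    "blank control": (False, False),
    "single drug A": (True, False),
    "single drug B": (False, True),
    "drug A + drug B": (True, True),
}


def classify_design(groups):
    """Report which cells of the 2x2 layout are missing or duplicated."""
    cells = [FACTOR_LEVELS[g] for g in groups]
    all_cells = set(product([False, True], repeat=2))
    missing = sorted(all_cells - set(cells))
    duplicated = sorted({c for c in cells if cells.count(c) > 1})
    return {
        "complete_2x2": not missing and not duplicated,
        "missing": missing,
        "duplicated": duplicated,
    }


# Expressive type of Case 1: two blank controls, no single drug B group.
# (Group 4, "blank control of Group 3", is treated as a blank control.)
expressive = ["blank control", "single drug A", "drug A + drug B", "blank control"]
# Prototype of Case 1: one blank control, single drug B group added.
prototype = ["blank control", "single drug A", "single drug B", "drug A + drug B"]

print(classify_design(expressive))
print(classify_design(prototype))
```

For the expressive type, the check reports the cell (drug A not used, drug B used) as missing and the blank control cell as duplicated; for the prototype, all four cells appear exactly once.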
Table 3 Standardized type (I) of case 1
                                  Elevated values of WBC counts (measurement unit) (x̄ ± s)
Whether or not drug A is used     Whether or not drug B is used:   No       Yes
No                                                                 X        X
Yes                                                                X        X

If "whether or not drug A is used" and "whether or not drug B is used" have equal impact on the experimental results, the statistical structure illustrated in Table 3 is called a "2-factor factorial design" or "2×2 factorial design". If the quantitative data satisfy independence, normality and homogeneity of variance, which are the prerequisites for a parametric test, the method of "analysis of variance with 2-factor factorial design" should be employed to analyze the data. Otherwise, it is necessary to use a variable transformation to convert the data into variables that meet the prerequisites of parametric tests before applying the above technique.

If the two factors ("whether or not drug A is used" and "whether or not drug B is used") have very different impacts (one major, the other minor) on the experimental results, the structure shown in Table 3 is called a "nested design with two factors" in statistics, i.e., the secondary factor is nested under the primary factor. Such data are better illustrated in Table 4, "standardized type" (II) of Case 1. Here we assume that "whether or not drug A is used" is the primary factor and "whether or not drug B is used" is the secondary factor.
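The difference between the factorial and the nested reading of the same four groups can be made explicit with the linear models usually associated with each design. The notation is an assumption introduced here for illustration and does not appear in the original: $y_{ijk}$ is the elevated WBC count of the $k$-th mouse at level $i$ of drug A and level $j$ of drug B, $\mu$ is the overall mean, and $\varepsilon_{ijk}$ is the random error.

```latex
% 2x2 factorial design: both factors on an equal footing, with interaction
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i, j \in \{1, 2\},\quad k = 1, \dots, 20

% Two-factor nested design: drug B (secondary) nested under drug A (primary)
y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk},
\qquad i, j \in \{1, 2\},\quad k = 1, \dots, 20
```

In the factorial model, the interaction term $(\alpha\beta)_{ij}$ carries the synergistic or antagonistic effect of combining the two drugs. In the nested model, the effect of drug B is assessed separately within each level of drug A, so there is no separate interaction term.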
Table 4 Standardized type (II) of case 1
Whether or not drug A is used     Whether or not drug B is used     Elevated values of WBC counts (measurement unit) (x̄ ± s)
No                                No                                X
                                  Yes                               X
Yes                               No                                X
                                  Yes                               X

Experimental data should be subjected to "analysis of variance with 2-factor nested design" if they satisfy independence, normality and homogeneity of variance, which are the prerequisites of a parametric test. Otherwise, the data must first be converted, by means of a variable transformation, into variables that meet the prerequisites of parametric tests before the 2-factor nested technique is applied.

Case 2. A doctor collected some relevant clinical data (Table 5) and compared 8 sets of data between the two experimental groups respectively, using the t-test for quantitative data of group design.

Questions: Are there any errors in his research design? What are the "expressive type", the "prototype" and the "standardized type" of the experimental design respectively?

Table 5 Comparison of the average amplitude (AP) of EGG waves of patients from two groups prior to and after surgery (μV, x̄ ± s) (expressive type)

4. Error Discrimination

Table 5 evidently presents the "expressive type" of the problem. Two experimental factors (group categories and observation time) affect the results of the quantitative measurement in this table. The t-test for group design is generally used to analyze quantitative data of a single-factor design with two levels (or group design). In this case, the quantitative data were deliberately split into multiple single-factor designs with two levels, and the whole design was broken up. As a result, the data were utilized less effectively and it became impossible to investigate the impact of the interaction between the factors on the observation results. Therefore, the reliability of the results was much decreased.
The fundamental error is that the researcher failed to figure out the real type of experimental design corresponding to the quantitative data and indiscriminately applied a "standardized type" model from statistics textbooks to analyze an "expressive type" problem.

"Group categories" is a group factor in the experiment; that is, the patients have been divided into two independent groups. It may be misleading to put "observation time" under "group categories", as the layout is then likely to be seen as a "single-factor with eight-level design". What is the "prototype" corresponding to the quantitative data? We simply modify the format of Table 5 and present the "prototype" of the problem (see Table 6).

Table 6 Comparison of the average amplitude (AP) of EGG waves of patients from two groups prior to and after surgery (μV, x̄ ± s) (prototype)

There are two experimental factors in Table 6. "Therapy methods" is a group factor, which serves as the criterion for dividing subjects into two independent groups. In each group, every patient's average amplitude (AP) value of EGG waves is observed repeatedly at 4 different points of time. In other words, "observation time" is a repeated-measurement factor. As the experiment involves two experimental factors, the design falls into the type of "two-factor design with repeated measures". However, judging from the data listed in Table 6, it is quite possible to misinterpret the two factors as having a "parallel relationship". Table 7 presents the data in a more evident way, which reflects the essential structure of the research design.
Table 7 Comparison of the average amplitude (AP) of EGG waves of patients from two groups prior to and after surgery (standardized type)

Table 7 is the "standardized type" of the research design. If the quantitative data meet the prerequisites of parametric tests, "ANOVA with 2-factor repeated measures design" should be selected. (These prerequisites are normally independence, normality and homogeneity of variance; repeated-measures data, however, do not meet the condition of independence, so more advanced statistical methods are required to process them.) Otherwise, the data must first be converted, by a variable transformation, into variables that meet the prerequisites of parametric tests before the above technique is applied.

It should be mentioned here that the preoperative index of the patients in the two groups must be similar; otherwise, the results are not comparable. If the preoperative values differ, they may be regarded as "covariates", and the method of "univariate analysis of covariance with 2-factor repeated measures design" is a more appropriate alternative.
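The restructuring from Table 5 to Table 7 amounts to recording, for every measurement, which patient it belongs to, which therapy group that patient is in, and at which time point it was taken. A minimal sketch follows. The patient identifiers, group names, time labels and AP values are all made up for illustration, since the original table entries are not reproduced here, and `to_long` is a hypothetical helper rather than the author's procedure.

```python
# Hypothetical time labels; the source states only that each patient is
# measured at 4 points of time prior to and after surgery.
TIME_POINTS = ["t1", "t2", "t3", "t4"]

# Made-up wide records: one row per patient, one AP value per time point.
wide = [
    {"patient": 1, "therapy": "group 1", "ap": [210.0, 180.0, 195.0, 205.0]},
    {"patient": 2, "therapy": "group 1", "ap": [220.0, 175.0, 190.0, 215.0]},
    {"patient": 3, "therapy": "group 2", "ap": [215.0, 160.0, 170.0, 185.0]},
    {"patient": 4, "therapy": "group 2", "ap": [205.0, 150.0, 165.0, 180.0]},
]


def to_long(records):
    """Return one row per (patient, time). Therapy is the between-subject
    group factor; time is the within-subject repeated factor."""
    rows = []
    for rec in records:
        for time, value in zip(TIME_POINTS, rec["ap"]):
            rows.append({
                "patient": rec["patient"],
                "therapy": rec["therapy"],
                "time": time,
                "ap": value,
            })
    return rows


long_rows = to_long(wide)

# The 2 x 4 = 8 cells that the expressive type showed as 8 separate sets.
cells = sorted({(r["therapy"], r["time"]) for r in long_rows})
print(len(long_rows), "measurements in", len(cells), "cells")

# Every patient contributes a value at each time point, which is why the
# measurements within a patient cannot be treated as independent.
per_patient = {}
for r in long_rows:
    per_patient.setdefault(r["patient"], []).append(r["time"])
print(all(times == TIME_POINTS for times in per_patient.values()))
```

Keeping the patient identifier in every row is the design choice that matters here: it preserves the link between the repeated values of one patient, which is exactly the information lost when the 8 cells are compared with separate t-tests.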
There are numerous similar or more complex statistical problems in practical biomedical research. For further details, please consult the author's previous books [3].

RESULTS

Our investigations reveal that for certain problems, the three types are mutually identical; for some problems, the "prototype" is the "standardized type"; for some others, however, the three types are distinct from each other. Particularly in some multifactor experimental studies, researchers have committed the "incomplete control error" in setting experimental groups, which leads to the non-existence of the "standardized type" corresponding to the "prototype". This difficult problem could be resolved by introducing the concept and method of "splitting groups". In fact, any problem can be solved by using the triple-type theory of statistics correctly to see the essence through the superficial phenomena.

DISCUSSION

The seemingly abstract "triple-type theory of statistics" is actually very concrete and applicable in real situations. There are numerous similar statistical problems in scientific research work. Particularly in some cases, "categories" and "treatment" are used in a broad sense to refer to specific groups, which may be misleading to researchers. They are likely to assume that "categories" and "treatment" always represent "a factor". In fact, in many situations, "categories" and "treatment" reflect the results of a complete or incomplete combination of the levels of multiple factors. Correspondingly, the problems or data here present themselves as "expressive types". When the essential structure of "categories" and "treatment" is captured and made manifest, the problem or data are labeled as the "prototype". Such data are then converted into the structure of the "standardized type", which can be analyzed directly using statistical methods, on condition that all the factors represented by "categories" and "treatment" are listed clearly, or the original groups are recombined in accordance with specialized and statistical knowledge, and that, additionally, the experimental information on whether the factors have a major or minor impact on the observation results and on the sequence in which the factors are applied is clarified. Therefore, the "Triple-type theory of statistics" is not only easily comprehensible, but also important for scientific research.

If we expand our horizon, we may easily discover that the application of the "Triple-type theory of statistics" is not confined to biomedical research. It is applicable in other research fields such as anthroposociology, psychology and the study of the natural laws of universal change and development. Whenever people are faced with a practical problem, it is necessary to see the essence through the superficial phenomena, seek and detect the crux of the problem, reveal the "prototype" behind the "expressive type", and then find the "standardized type" (i.e., the most appropriate solution) which can be used to solve the problem.

Statistics textbooks filled with "standardized type" models have inflicted "trauma" on many learners. Long-term and high-level statistics training may help those innocent "victims"; however, the task is time-consuming and impractical. It is wise to compile statistics textbooks that integrate theory with practice, in the hope of improving the quality of our scientific projects and academic publications. Statistics textbooks compiled in line with the "Triple-type theory of statistics" can correct various misuses of statistics. The theory may also provide relevant guidelines for compiling textbooks in other disciplines or for making various problem-solving plans. Hence it is of great benefit in enhancing people's ability to understand and analyze problems. In sum, the potential and profound significance of promoting and popularizing the "Triple-type Theory of Statistics" is inestimable.
REFERENCES

[1] HU L.P., ZHANG T.M. Analysis of factors affecting the validity of research findings and academic papers. Science Watch. 2006, (4): 9-19.
[2] HU L.P., LIU H.G. Triple-type theory of statistics and its application in biomedical research. Chinese Medical Journal. 2005, 85(27): 1936-1940.
[3] HU L.P. Chief compiler. Application of Triple-type Theory of Statistics in Experimental Design. Beijing: People's Military Medical Press. 2006: 1-10.
[4] HU L.P., LIU H.G. Misleading impact of "category" on applications of statistics. Chinese Journal of Medical Writing. 2000, 7(7): 729-731.

(Edited by Jane Chen)