Identifying the genes that are indispensable for an organism's life, and characterizing them, is one of the central questions in current biological research; computational approaches to the prediction of essential genes would therefore be helpful. The performance of a predictor is usually measured by the area under the receiver operating characteristic curve (AUC). We propose a novel method that uses genetic algorithms to maximize the partial AUC restricted to a specific interval of low false positive rate (FPR), the region relevant to follow-up experimental validation. Our predictor uses various features based on sequence information, protein-protein interaction network topology, and gene expression profiles. A feature selection wrapper was developed to alleviate the over-fitting problem and to weigh each feature's relevance to prediction. We evaluated our method using the proteome of budding yeast. Our implementation of genetic algorithms maximizing the partial AUC below an FPR of 0.05 or 0.10 outperformed other popular classification methods.
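To make the optimization target concrete, the following is a minimal sketch of how a partial AUC over a low-FPR interval can be computed and then used as the fitness function of a simple genetic algorithm. It is not the authors' implementation: the function names `partial_auc` and `evolve_weights`, the linear scoring of features, and the specific operators (truncation selection, uniform crossover, Gaussian mutation) are all assumptions chosen for illustration, and the partial AUC here is the raw, unnormalized area, so a perfect ranking scores `max_fpr`.

```python
import random


def partial_auc(scores, labels, max_fpr=0.1):
    """Unnormalized trapezoidal area under the ROC curve for FPR in [0, max_fpr]."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Walk the examples from highest to lowest score; tied scores share one ROC point.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        s = scores[order[i]]
        while i < len(order) and scores[order[i]] == s:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def evolve_weights(X, y, max_fpr=0.1, pop_size=30, generations=40, seed=0):
    """Hypothetical GA: evolve linear feature weights that maximize partial AUC."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(w):
        scores = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
        return partial_auc(scores, y, max_fpr)

    pop = [[rng.uniform(-1, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]  # truncation selection; elites survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Gaussian mutation applied to each gene with probability 0.2
            child = [g + rng.gauss(0, 0.1) if rng.random() < 0.2 else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best)
```

In this sketch the magnitude of each evolved weight gives a rough indication of a feature's contribution to the ranking, which is loosely analogous to, but much simpler than, the feature selection wrapper mentioned above. A real evaluation would measure the partial AUC on held-out data rather than on the training set.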
Korean Society for Biochemistry and Molecular Biology.