Lukasz Kurgan, Ph.D.

Robert J. Mattauch Endowed Professor and Vice Chair of Computer Science

  • Engineering East Hall, Room E4268, Richmond, VA, United States
lkurgan@vcu.edu

Data scientist specializing in high-throughput structural bioinformatics of proteins & small RNAs.

Biography

Lukasz Kurgan received his M.Sc. degree (with honors) in Automation and Robotics from AGH University of Science and Technology (Poland) in 1999 and a Ph.D. degree in Computer Science from the University of Colorado at Boulder in 2003. He joined the University of Alberta in 2003, where he received tenure in 2007 and was promoted to the rank of Professor in 2013. He moved to Virginia Commonwealth University in 2016 as the Robert J. Mattauch Endowed Professor of Computer Science.

Industry Expertise

Education/Learning
Research
Computer Software
Biotechnology
Pharmaceuticals

Areas of Expertise

Structural Bioinformatics
Intrinsically Disordered Proteins
Protein-ligand (drug) interactions
Computer-aided molecular modeling
Big Data Analysis
Drug Repurposing
Drug Repositioning
Structural Genomics

Accomplishments

Member of Faculty Opinions

2021-09-02

Inducted as member of the "Big Data & Analytics" section of the "Bioinformatics, Biomedical Informatics & Computational Biology" area.

Author of the winning flDPnn algorithm of the international Critical Assessment of Protein Intrinsic Disorder Prediction (CAID) challenge

2021-04-19

CAID is a worldwide competition that identifies the most accurate methods for predicting intrinsically disordered protein regions. The results were recently published in Nature Methods (https://www.nature.com/articles/s41592-021-01117-3), followed by a commentary article in the same journal that highlights our win (https://www.nature.com/articles/s41592-021-01123-5).

Fellow of the Kosciuszko Foundation Collegium of Eminent Scientists

2018-01-30

With citation for "outstanding achievements and contributions to the Polish scientific community."

Education

University of Colorado at Boulder

Ph.D.

Computer Science

2003

AGH University of Science and Technology (Poland)

M.Sc.

Automation and Robotics

1999

Affiliations

  • Professor, Department of Computer Science, Virginia Commonwealth University
  • Adjunct Professor, Department of Electrical and Computer Engineering, University of Alberta

Media Appearances

Computer science research team gains international recognition for method that accurately predicts intrinsic disorder in proteins

VCU news  online

2021-05-19

A computer science research team from VCU Engineering won an international challenge for their novel method of predicting intrinsically disordered proteins. Kurgan's award-winning method now appears in the journal Nature Communications (https://www.nature.com/articles/s41467-021-24773-7). The editors of Nature Communications also placed Kurgan's article on the Editor's Highlights page, which features a small selection of articles the editorial team believes to be particularly interesting or important.

VCU professors join elite bioengineering institute

Commonwealth Times  online

2018-04-16

Three professors were inducted into the American Institute for Medical and Biological Engineering (AIMBE) at a formal ceremony on April 9, 2018. Kurgan was nominated for his work in structural bioinformatics, using computer programs to study the structures of proteins and DNA.

VCU's Kurgan supercomputer programs help biologists to speed up hypothesis generation to understand proteins

Supercomputing Online News  online

2017-07-24

“We have manually curated but understand less than 1 percent of these proteins, and right now there’s over 80 million to solve,” said Kurgan, a Qimonda-endowed professor and data scientist. “A program can solve these proteins faster than a single human and can help researchers speed up hypothesis generation.”

Research Grants

Integrated prediction of intrinsic disorder and disorder functions with modular multi-label deep learning

NSF

2021-08-31

Proteins are remarkable biological machines. Hundreds of millions of protein sequences were decoded over the last two decades, creating a significant knowledge gap related to the fact that we do not know what most of them do. A common way to decipher protein functions relies on the sequence-to-structure-to-function paradigm where protein function is learned from the protein structure that is produced from the sequence. However, recent research has identified a large family of intrinsically disordered proteins that lack a stable structure under physiological conditions and which therefore cannot be characterized using structure-based approaches. These proteins are particularly abundant in eukaryotes and are involved in the pathogenesis of numerous human diseases. The discovery of the intrinsically disordered proteins has prompted the development of a new generation of computational methods that predict the presence of intrinsic disorder directly from protein sequences. A recently completed Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment has shown that these methods are fast and provide accurate results. However, while intrinsic disorder can be readily and accurately identified in protein sequences, its function remains a mystery. This proposal will conceptualize, design, implement, test and deploy an innovative machine learning method that provides highly accurate and integrated predictions of disorder and disorder functions directly from protein sequences. The team will utilize this method to produce functional annotations of disorder on an unprecedented scale of tens of millions of proteins, addressing the knowledge gap problem for this protein family. In the long run, this project will advance understanding of fundamental biological processes and related human health issues in the context of the intrinsically disordered proteins.
This project will also train STEM students and researchers via high-school outreach and multidisciplinary teaching and mentoring of undergraduate and graduate students and postdoctoral researchers, producing highly skilled researchers who are sought after by industry and academia.

High-throughput annotation of cellular functions of intrinsic disorder in proteins

NSF

2016-10-01

One of the fundamental problems in molecular biology is to decipher functions of millions of uncharacterized protein sequences that are rapidly generated by high-throughput genome sequencing. The sequence-to-structure-to-function paradigm was used for decades to determine functions of proteins. However, recent research has broadened this paradigm by adding new players, proteins with intrinsic disorder (ID). They are highly abundant and cannot be solved with the currently used structure-driven approach. While there are many widely used computational methods that accurately predict ID in protein sequences, methods for the prediction of the many functions of ID are lacking. This project will develop a family of novel, accurate, and high-throughput computational methods that predict all major functions of ID in protein sequences. It will produce putative functional annotations on an unprecedented scale of thousands of species, addressing the problem of high-rate acquisition of raw sequence data and contributing to the increase of the rate of scientific discovery. These results will advance our understanding of fundamental biological processes and human health given the high prevalence of ID in human diseases and the attractiveness of proteins with ID as drug targets.

High-throughput characterization, prediction, and applications of protein disorder

NSERC

2012-03-01

For years, scientists were convinced that proteins must fold into precise, rigid molecules to function correctly. This view is changing now. The intrinsically disordered proteins have at least some disordered (also called unfolded/highly flexible) parts and many of them carry out their function without ever fully folding into a rigid molecule. Disorder is highly abundant in nature and its prevalence was shown in several human diseases. However, the characterization of protein disorder is lagging behind the rapidly growing number of known proteins. Experimental annotations of disorder are time-consuming and difficult, and thus computational methods that predict disorder from protein sequences have emerged as a viable alternative to bridge the annotation gap and to investigate the disorder. Although the quality of these predictors continues to rise, more accurate methods and novel methods that address specific characteristics of disorder are urgently needed. Moreover, there is a pressing need to understand and characterize disorder in various proteomes and functional classes of proteins. To this end, our objectives include (1) development of a comprehensive computational platform for accurate, fast, and multi-objective prediction of disorder; and (2) applications and experimental validation of disorder predictions. This work facilitates a more complete understanding of protein disorder, principles of protein folding, and molecular mechanisms of protein function. Our methods provide a cost- and time-effective solution to guide experimentalists, and they are crucial for modern research and development in several areas, including rational drug design, structural genomics, and systems biology.

Courses

CMSC 435 Introduction to Data Science

Virginia Commonwealth University

CMSC 635 Knowledge Discovery and Data Mining

Virginia Commonwealth University

ECE 321 Software Requirements Engineering

University of Alberta

Selected Articles

Intrinsic Disorder in Human RNA-Binding Proteins

Journal of Molecular Biology

2021-10-15

Although RNA-binding proteins (RBPs) are known to be enriched in intrinsic disorder, no previous analysis focused on RBPs interacting with specific RNA types. We fill this gap with a comprehensive analysis of the putative disorder in RBPs binding to six common RNA types: messenger RNA (mRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), non-coding RNA (ncRNA), ribosomal RNA (rRNA), and internal ribosome RNA (irRNA). We also analyze the amount of putative intrinsic disorder in the RNA-binding domains (RBDs) and non-RNA-binding-domain regions (non-RBD regions). Consistent with previous studies, we show that in comparison with the human proteome, RBPs are significantly enriched in disorder. However, closer examination finds significant enrichment in predicted disorder for the mRNA-, rRNA- and snRNA-binding proteins, while the proteins that interact with ncRNA and irRNA are not enriched in disorder, and the tRNA-binding proteins are significantly depleted in disorder. We show a consistent pattern of significant disorder enrichment in the non-RBD regions coupled with low levels of disorder in RBDs, which suggests that disorder is relatively rarely utilized in the RNA-binding regions. Our analysis of the non-RBD regions suggests that disorder harbors posttranslational modification sites and is involved in the putative interactions with DNA. Importantly, we utilize experimental data from DisProt and independent data from Pfam to validate the above observations that rely on the disorder predictions. This study provides new insights into the distribution of disorder across proteins that bind different RNA types and the functional role of disorder in the regions where it is enriched.

flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions

Nature Communications

2021-07-21

Identification of intrinsic disorder in proteins relies in large part on computational predictors, which demands that their accuracy should be high. Since intrinsic disorder carries out a broad range of cellular functions, it is desirable to couple the disorder and disorder function predictions. We report a computational tool, flDPnn, that provides accurate, fast and comprehensive disorder and disorder function predictions from protein sequences. The recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment and results on other test datasets demonstrate that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions. These predictions are substantially better than the results of the existing disorder predictors and methods that predict functions of disorder. Ablation tests reveal that the high predictive performance stems from innovative ways used in flDPnn to derive sequence profiles and encode inputs. flDPnn’s webserver is available at http://biomine.cs.vcu.edu/servers/flDPnn/

DescribePROT: database of amino acid-level protein structure and function predictions

Nucleic Acids Research

2020-10-29

We present DescribePROT, the database of predicted amino acid-level descriptors of structure and function of proteins. DescribePROT delivers a comprehensive collection of 13 complementary descriptors predicted using 10 popular and accurate algorithms for 83 complete proteomes that cover key model organisms. The current version includes 7.8 billion predictions for close to 600 million amino acids in 1.4 million proteins. The descriptors encompass sequence conservation, position specific scoring matrix, secondary structure, solvent accessibility, intrinsic disorder, disordered linkers, signal peptides, MoRFs and interactions with proteins, DNA and RNAs. Users can search DescribePROT by the amino acid sequence and the UniProt accession number and entry name. The pre-computed results are made available instantaneously. The predictions can be accessed via an interactive graphical interface that allows simultaneous analysis of multiple descriptors and can also be downloaded in structured formats at the protein, proteome and whole database scale. The putative annotations included in DescribePROT are useful for a broad range of studies, including investigations of protein function, applied projects focusing on therapeutics and diseases, and the development of predictors for other protein sequence descriptors. Future releases will expand the coverage of DescribePROT. DescribePROT can be accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.
