We are developing bioinformatics tools for interpreting the highly complex data resulting from proteomics and functional genomics platforms in order to maximize the extraction of functionally relevant biological knowledge. These approaches have allowed us to identify signaling pathways regulated as a stem cell differentiates into a heart cell, to discover novel ubiquitylation substrates of viral and bacterial pathogens, and to rapidly and specifically map substrates of acetylation enzyme machinery in yeast.
Mass Spectrometry Interaction Statistics (MiST)
In close collaboration with the Sali lab at UCSF, we have developed the Mass Spectrometry interaction STatistics (MiST) scoring algorithm, which discovers protein interactions in an unbiased manner. MiST performs quality control, processes raw affinity purification mass spectrometry (APMS) data, and prioritizes biologically relevant bait-prey pairs in a set of replicated APMS experiments. The tool reports the MiST score, a weighted sum of three features: (1) normalized protein abundance (abundance); (2) invariability of abundance over replicated experiments (reproducibility); and (3) a measure of how unique a bait-prey pair is compared to all other baits (specificity).
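To make the weighted sum concrete, here is a minimal Python sketch of how the three features could be combined. It is not the MiST implementation: the function names, the entropy-based reproducibility measure, the ratio-based specificity measure, and the weights are all assumptions chosen for illustration.

```python
import math

def abundance_feature(replicates):
    """Mean of a prey's normalized abundance across replicates of one bait."""
    return sum(replicates) / len(replicates)

def reproducibility_feature(replicates):
    """Normalized Shannon entropy across replicates (assumed measure).

    Returns 1.0 when abundance is identical in every replicate and
    approaches 0.0 when it is concentrated in a single replicate.
    """
    total = sum(replicates)
    if total == 0 or len(replicates) < 2:
        return 0.0
    probs = [x / total for x in replicates if x > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(replicates))

def specificity_feature(abundance_by_bait, bait):
    """Share of a prey's total abundance observed with this bait (assumed measure)."""
    total = sum(abundance_by_bait.values())
    return abundance_by_bait[bait] / total if total > 0 else 0.0

def mist_like_score(abundance, reproducibility, specificity,
                    weights=(0.3, 0.3, 0.4)):
    """Weighted sum of the three features; the weights are hypothetical."""
    w_a, w_r, w_s = weights
    return w_a * abundance + w_r * reproducibility + w_s * specificity

# Toy example: one prey seen in three replicates of bait "A" and weakly with bait "B".
replicates_a = [0.5, 0.5, 0.5]
score = mist_like_score(
    abundance_feature(replicates_a),
    reproducibility_feature(replicates_a),
    specificity_feature({"A": 0.75, "B": 0.25}, "A"),
)
```

In practice the weights would be fitted or chosen per dataset rather than fixed as above.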
Network Propagation
Network propagation is a powerful technique that has been applied successfully in fields ranging from network biology to electrical engineering. Briefly, information about given proteins (e.g. their differential expression or activity) is superimposed onto the nodes (proteins or signaling moieties) of a network and then propagated along the edges (molecular interactions) to nearby nodes, iteratively, until convergence is achieved. This can be used to identify subnetworks associated with a gene set or phenotype, and pathways linking one gene to another.
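The iterative scheme described above can be sketched as a random walk with restart, one common formulation of network propagation. The function name, the restart weight alpha, and the toy network are assumptions for illustration only; the sketch assumes an undirected network in which every node has at least one neighbor.

```python
def propagate(adjacency, seed_scores, alpha=0.5, tol=1e-8, max_iter=1000):
    """Random walk with restart on an undirected network.

    adjacency:   dict mapping each node to the set of its neighbors
                 (must be symmetric, with no isolated nodes)
    seed_scores: dict mapping nodes to their initial scores
    alpha:       fraction of score spread to neighbors at each step;
                 the remaining (1 - alpha) is reset to the seed scores
    """
    nodes = list(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    seed = {n: float(seed_scores.get(n, 0.0)) for n in nodes}
    scores = dict(seed)
    for _ in range(max_iter):
        updated = {}
        for n in nodes:
            # Each neighbor m shares its score equally among its own neighbors.
            incoming = sum(scores[m] / degree[m] for m in adjacency[n])
            updated[n] = alpha * incoming + (1 - alpha) * seed[n]
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:  # converged
            break
    return scores

# Toy path network A - B - C with all starting signal on A.
network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
result = propagate(network, {"A": 1.0})
```

After convergence, nodes close to the seed retain more signal than distant ones, which is what allows subnetworks near a gene set to be ranked.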
We work closely with the Ideker lab at UC San Diego, who are experts in Network Propagation, to apply and expand this method.
Statistical Scoring of Genetic Interactions
We develop statistical scoring methods for the quantitative analysis of genetic interactions. Our toolboxes handle raw data processing and quality control, and include visualization modules for post-processing the data. We originally developed these tools for yeast and bacterial studies, but have since extended them to cover CRISPR-based genetic interaction screens in mammalian cells. The modular setup facilitates analysis of both microscopy-based and pooled screens.
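As an illustration of what a quantitative genetic interaction score can look like, the sketch below uses the widely used multiplicative model, in which the expected fitness of a double mutant is the product of the single-mutant fitnesses. This is a generic textbook formulation, not necessarily the scoring used in our toolboxes; the function name and the t-like statistic are assumptions.

```python
import math
import statistics

def interaction_score(double_mutant_fitness, fitness_a, fitness_b):
    """Score a genetic interaction under the multiplicative model.

    double_mutant_fitness: replicate fitness measurements of the double mutant
    fitness_a, fitness_b:  fitness of each single mutant

    Returns (epsilon, t_like): epsilon is observed minus expected fitness
    (negative = aggravating, positive = alleviating); t_like scales epsilon
    by the standard error of the replicates, or is None if undefined.
    """
    expected = fitness_a * fitness_b
    observed = statistics.mean(double_mutant_fitness)
    epsilon = observed - expected
    if len(double_mutant_fitness) < 2:
        return epsilon, None
    std_err = statistics.stdev(double_mutant_fitness) / math.sqrt(len(double_mutant_fitness))
    t_like = epsilon / std_err if std_err > 0 else None
    return epsilon, t_like

# Toy example: the double mutant grows worse than the single mutants predict.
epsilon, t_like = interaction_score([0.4, 0.5, 0.6], fitness_a=0.9, fitness_b=0.8)
```

Dividing by the standard error down-weights interactions supported by noisy replicates, which is the main reason to prefer a statistic over the raw difference.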
Structural Modeling Using Genetic Interactions
In collaboration with the Sali lab at UCSF, we use genetic interactions of point mutants to model the structures of macromolecular assemblies. Using our high-throughput genetics platforms, we determine in vivo genetic interaction scores for point mutants crossed against libraries of deletion mutants. We then quantify the similarity between the genetic interaction profiles of point mutants to derive distance restraints for the corresponding pairs of mutated residues. These distance restraints serve as the basis for determining the structures of protein assemblies with the Integrative Modeling Platform of the Sali lab (https://integrativemodeling.org/).
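The step from profile similarity to distance restraints can be sketched as follows. This is only an illustration under stated assumptions: Pearson correlation stands in for the similarity measure, and the linear mapping, the similarity threshold, and the distance bounds (in angstroms) are hypothetical values, not those used in our modeling. It also does not call the Integrative Modeling Platform itself.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length genetic interaction profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def upper_bound_restraints(profiles, min_similarity=0.5, d_min=5.0, d_max=30.0):
    """Turn similar profiles into upper-bound distance restraints.

    profiles: dict mapping a point mutant (residue label) to its profile of
              interaction scores against the same ordered deletion library.
    Pairs at or above min_similarity (must be < 1) get a bound that shrinks
    linearly from d_max to d_min as similarity rises from the threshold to 1.
    """
    restraints = []
    for (res_i, prof_i), (res_j, prof_j) in combinations(sorted(profiles.items()), 2):
        similarity = pearson(prof_i, prof_j)
        if similarity >= min_similarity:
            fraction = (similarity - min_similarity) / (1.0 - min_similarity)
            bound = d_max - fraction * (d_max - d_min)
            restraints.append((res_i, res_j, similarity, bound))
    return restraints

# Toy profiles for three hypothetical point mutants.
toy_profiles = {
    "K10A": [1.0, 2.0, 3.0, 4.0],
    "R15A": [2.0, 4.0, 6.0, 8.0],
    "D40A": [4.0, 3.0, 2.0, 1.0],
}
restraints = upper_bound_restraints(toy_profiles)
```

Only the pair with correlated profiles yields a restraint here; anticorrelated or uncorrelated pairs are left unrestrained rather than pushed apart.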