Wei Wang is a professor in the Department of Computer Science at the University of California, Los Angeles and the director of the Scalable Analytics Institute (ScAi). She received her PhD in Computer Science from the University of California, Los Angeles in 1999. She was a research staff member at the IBM T. J. Watson Research Center from 1999 to 2002 and a professor of Computer Science at the University of North Carolina at Chapel Hill from 2002 to 2012. Dr. Wang's research interests include big data analytics, data mining, bioinformatics and computational biology, and databases. She has filed seven patents and has published one monograph and more than 170 research papers in international journals and major peer-reviewed conference proceedings.
Dr. Wang received IBM Invention Achievement Awards in 2000 and 2001. She received an NSF Faculty Early Career Development (CAREER) Award in 2005 and was named a Microsoft Research New Faculty Fellow the same year. She was honored with the 2007 Phillip and Ruth Hettleman Prize for Artistic and Scholarly Achievement at UNC. She was recognized with an IEEE ICDM Outstanding Service Award in 2012, an Okawa Foundation Research Award in 2013, and an ACM SIGKDD Service Award in 2016. Dr. Wang has been an associate editor of the IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Big Data, ACM Transactions on Knowledge Discovery in Data, Journal of Knowledge and Information Systems, Data Mining and Knowledge Discovery, IEEE/ACM Transactions on Computational Biology and Bioinformatics, and International Journal of Knowledge Discovery in Bioinformatics. She serves on the organizing and program committees of international conferences including ACM SIGMOD, ACM SIGKDD, ACM BCB, VLDB, ICDE, EDBT, ACM CIKM, IEEE ICDM, SIAM DM, SSDBM, RECOMB, and BIBM. She was elected to the Board of Directors of the ACM Special Interest Group on Bioinformatics, Computational Biology, and Biomedical Informatics (SIGBio) in 2015.
Alignment-free RNASeq Analysis
RNASeq has proven to be a revolutionary means of exploring the transcriptome because it provides deep coverage and base-pair-level resolution. Traditional RNASeq quantification tools require fragments to be aligned to either a genome or a transcriptome, which entails a time-consuming and intricate alignment step. To improve the performance of RNASeq quantification, alignment-free methods have recently been proposed that quantify transcript abundances using k-mers in the transcriptome, demonstrating that an efficient alignment-free method for transcriptome quantification is feasible. I will present our contribution in this space. Our methods partition the transcriptome into disjoint transcript clusters based on sequence similarity and employ the notion of sig-mers, a special type of k-mer uniquely associated with a single cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable to that of state-of-the-art methods. This enables us to perform transcript quantification on each cluster independently, reducing one complex optimization problem to smaller optimization tasks that can be run in parallel. As a result, we need only a small percentage of all k-mers and therefore require less computation, while still delivering results of comparable quality.
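To make the sig-mer idea concrete, here is a minimal sketch under stated assumptions, not the implementation presented in the talk. It assumes the transcript clusters are already given, treats a sig-mer simply as a k-mer that occurs in exactly one cluster, and ignores reverse complements and sequencing errors. The function names (`kmers`, `find_sigmers`, `count_sigmers`) and the toy sequences are hypothetical, chosen only for illustration. The abundance-estimation step that would consume these counts is not shown.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def find_sigmers(clusters, k):
    """Map each cluster name to the k-mers found in that cluster only.

    clusters: dict of cluster name -> list of transcript sequences.
    Assumption: a sig-mer is any k-mer present in exactly one cluster.
    """
    owners = defaultdict(set)  # k-mer -> names of clusters containing it
    for name, transcripts in clusters.items():
        for seq in transcripts:
            for km in kmers(seq, k):
                owners[km].add(name)

    sigmers = {name: set() for name in clusters}
    for km, names in owners.items():
        if len(names) == 1:
            sigmers[next(iter(names))].add(km)
    return sigmers


def count_sigmers(reads, sigmers, k):
    """Count sig-mer occurrences in the reads, grouped by cluster.

    K-mers that are not sig-mers are skipped, so each cluster's counts
    can be handed to an independent per-cluster estimation task.
    """
    lookup = {km: name for name, kms in sigmers.items() for km in kms}
    counts = {name: Counter() for name in sigmers}
    for read in reads:
        for km in kmers(read, k):
            name = lookup.get(km)
            if name is not None:
                counts[name][km] += 1
    return counts


# Toy data (hypothetical): two clusters of two short transcripts each.
clusters = {
    "A": ["ACGTAC", "ACGTTT"],
    "B": ["GGGCCC", "GGGTAC"],
}
reads = ["ACGTT", "GGGCC", "CGTAC"]

sigmers = find_sigmers(clusters, k=3)
counts = count_sigmers(reads, sigmers, k=3)
```

In this toy example the k-mers GTA and TAC occur in both clusters, so they are not sig-mers and are ignored during counting. Because each cluster's counts are kept separately, the per-cluster estimation tasks could be dispatched in parallel, which is the property the abstract relies on.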