Line 7: | Line 7: | ||
| style="padding:2px;" | <h2 id="mp-tfa-h2" style="margin:3px; background:#fff7bd; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #f2ea7e; text-align:left; color:#000; padding:0.2em 0.4em;">T19 Integration of Information to Identify Objects in Big Data</h2> | | style="padding:2px;" | <h2 id="mp-tfa-h2" style="margin:3px; background:#fff7bd; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #f2ea7e; text-align:left; color:#000; padding:0.2em 0.4em;">T19 Integration of Information to Identify Objects in Big Data</h2> | ||
|- | |- | ||
− | | style="color:#000;" | <div id="mp-tfa" style="padding:2px 5px"> | + | | style="color:#000;" | <div id="mp-tfa" style="padding:2px 5px">'''Length:''' 3 hours (half day) |
− | + | ||
− | '''Description:''' | + | '''Intended Audience:''' This tutorial is intended for both researchers and practitioners from a variety of areas, |
− | + | e.g., cancer research, health care, communication, business processes, and | |
− | '''Presenter:''' Grace S. Shieh | + | databases, who are interested in integration of information (including several data |
+ | sets of the same type or data sets of distinct types) to filter noise in the information and in | |
+ | applying machine learning and statistical methods to identify objects of interest, e.g., | ||
+ | the true mutations in DNA of cancer patients. | ||
+ | |||
+ | '''Description:''' Big data has tremendous potential to transform businesses and research but raises | ||
+ | significant challenges in pre-processing, extracting useful information, and | |
+ | integrating information to identify objects of interest. In this tutorial, I will present | |
+ | some statistical and machine learning methods for the fusion and analysis of big data in | |
+ | cancer research, e.g., DNA sequencing data, gene expression data (RNA-seq) from | ||
+ | The Cancer Genome Atlas (TCGA), protein expression and clinical features of | ||
+ | cancer patients. This tutorial aims to cover both useful statistical and data mining | |
+ | methods and cutting-edge directions. | |
+ | |||
+ | Topics include the following: (1) integration of data sets to filter noise in the | ||
+ | information, (2) sampling of big data to reduce the computational burden while retaining | |
+ | a certain level of prediction accuracy, (3) applying machine learning/statistics to identify true | |
+ | objects, e.g., true mutations in DNA sequencing data of cancer patients, and (4) | ||
+ | integration of distinct types of information to identify objects, e.g., using DNA, | ||
+ | RNA gene expression and protein data, and clinical features of cancer patients to | ||
+ | find novel drug targets for cancers and identify prognosis markers of cancer patients. | ||
+ | |||
+ | '''Prerequisites:''' Basic knowledge of probability and statistics, data mining or | ||
+ | databases will be helpful. | ||
+ | |||
+ | '''Presenter:''' [mailto:gshieh@webmail.stat.sinica.edu.tw Grace S. Shieh] | ||
+ | |||
+ | '''Grace S. Shieh''' is a full research fellow/professor at the Institute of Statistical Science, | |
+ | Academia Sinica/National Taiwan University. She received her PhD in Statistics | |
+ | from the University of Wisconsin-Madison, taught at the University of Missouri-Columbia | |
+ | in 1990-94, and joined ISS-AS in 1994; she branched into computational biology | |
+ | in 2000. Her research expertise includes integration of data (information), | ||
+ | information quality, machine learning, directional statistics and association. She has | ||
+ | worked on problems of integrating distinct types of information (data) to uncover | ||
+ | novel drug targets and find prognosis markers for cancers, preprocessing in | ||
+ | information fusion, and integrating several data sets (especially those from cutting-edge | |
+ | biotechnologies such as next-generation sequencing) to identify true mutations in | |
+ | DNA, among others. Her research was funded by government agencies as well as IT | ||
+ | companies such as Taiwan Semiconductor Manufacturing Company. She has | ||
+ | published numerous papers and is an elected fellow of the International Statistical | |
+ | Institute. She has served as a committee member, session chair, organizer and | |
+ | workshop/tutorial lecturer for numerous international conferences. She is also an | ||
+ | associate editor for Statistical Methodology, Frontiers in Statistical Methodology | |
+ | and Genetics, and STAT. | |
+ | <div align="right"> | ||
+ | [[Tutorials| Back to Tutorials]] | ||
+ | </div> | ||
</div> | </div> | ||
|- | |- |
Revision as of 13:21, 24 February 2016
|