June 20th 2018 - 9am-12.30pm - Hyatt Regency Cambridge - 575 Memorial Dr.

Machine Learning and Advanced Analytics for Integrative Analysis

In the big data era of medicine, data are growing explosively in both volume and variety. The field of precision medicine research lacks the computational tools to reliably integrate multiple types of bio-data. There is a strong need for integrative machine learning models to make better use of heterogeneous information in decision making and knowledge discovery. How data from multiple sources are incorporated into a learning system is a key step for a successful analysis.

The growth in biomedical data provides an unprecedented opportunity to understand disease from multiple angles and make precise data-driven decisions. This meeting will outline the challenges and plot a route for machine learning experts and data scientists to optimise the use of these data.


  • Peter Stetson, Chief Health Informatics Officer, Deputy Physician-In-Chief, Office of the Physician-In-Chief, Memorial Sloan Kettering Cancer Center
  • Cory McLean, Senior Software Engineer, Google Brain
  • Heming Xing, Senior Principal Scientist, Precision Immunology Cluster, Sanofi
  • Riccardo Sabatini, Chief Data Scientist, Orionis Biosciences
  • Marghoob Mohiyuddin, Research Leader, Bioinformatics, Roche
  • Lee Lichtenstein, Associate Director, Somatic Computational Methods, The Broad Institute
  • Shanrong Zhao, Director, Computational Biology and Bioinformatics, Pfizer
  • Renato Umeton, Head of Data Science, Dana-Farber Cancer Institute
  • Gyungah Jun, Director, Head of Neurogenetics & Integrated Genomics, Eisai Andover Innovation Medicines (AiM) Institute
  • Linda Zhou, Director, Research and Life Sciences Solutions, Western Digital
  • Brad Chapman, Senior Research Scientist, Harvard University
  • Vibhor Gupta, Director, Pangaea Group
  • Pablo Cingolani, Principal Scientist, AstraZeneca
  • Nathanael Fillmore, Associate Director for Machine Learning and Predictive Analytics, MAVERIC, VA Boston Healthcare System
  • Bin Li, Director, Computational Biology, Takeda
  • Stephanie Hintzen, Bioinformatics System Programmer, Dana-Farber Cancer Institute
  • Jack Pollard, Head of Cancer Bioinformatics, Sanofi
  • Joanna Fueyo, Visiting Scholar, Bioinformatics, Boston University
  • Mark Kon, Professor of Mathematics and Statistics, Boston University
  • John Quackenbush, Professor of Computational Biology and Bioinformatics, Harvard T.H. Chan School of Public Health
  • Leonid Peshkin, Lecturer on Systems Biology, Harvard Medical School
  • Will Chen, VP Product Management and Business Development, Precision Medicine, Elsevier

Points of discussion:

  • Evaluating the proposition: Advanced analytics + biodata = revolutionised medicine?
  • How are data science and machine learning transforming our understanding of disease and helping us develop treatments?
  • Exploring integrative analysis, e.g. of health record data, genetics, expression, metabolomics and imaging.
  • A bio archive for algorithms and a pharma-grade AI chemical library.
  • Issues of reproducibility in AI applications.
  • Challenges in using machine learning for the integration of large-scale functional genomics.
  • The power of querying multiple levels of evidence to test complex hypotheses enabled by systems biology frameworks and more holistic thinking.
  • Selecting machine learning approaches to establish phenotype-genotype relationships and integrate multiple data types.

Meeting output:

  • A definition of specific challenge areas of highest value for the application of machine learning in analysing bio-data.
  • Definitions of strengths and weaknesses of AI/data integration techniques and tools currently available and in the pipeline.
  • Identified areas of opportunity for collaboration between industry and academia.
  • Strategic initiatives that are currently being pursued to progress the use of advanced analytics, e.g. case studies of success, technology development and consortia.
  • Recommendations for applying AI in the discovery and development of drugs. What types of machine learning applications are currently in use and in development at pharmaceutical and biotechnology companies?