Data Fusion based on Coupled Matrix and Tensor Factorizations

Recent technological advances enable us to collect huge amounts of data from multiple sources; however, extracting meaningful information remains the main challenge. In complex problems, the structure we are looking for is often buried in the data. In those cases in particular, looking at the data from different aspects, that is, jointly analyzing data from multiple sources (data fusion, also called multi-block, multi-view or multi-set data analysis), increases the chances of capturing the hidden patterns. For instance, in metabolomics, biological fluids are measured using a variety of analytical techniques such as Liquid Chromatography-Mass Spectrometry (LC-MS), Gas Chromatography-Mass Spectrometry (GC-MS) and Nuclear Magnetic Resonance (NMR) Spectroscopy, with the ultimate goal of identifying chemicals related to certain conditions such as diseases. Data measured using different analytical methods are often complementary, and their fusion enhances knowledge discovery.

We develop mathematical models and algorithms for data fusion. Our earlier work focused on coupled matrix and tensor factorizations, which proved useful especially in recommender system applications. With the goal of fusing data sets for pattern/biomarker discovery, we later focused on structure-revealing data fusion models that can identify shared and unshared factors in coupled data sets.
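
To make the idea of a coupled matrix and tensor factorization concrete, here is a minimal sketch in Python with NumPy. It assumes a third-order tensor X and a matrix Y that share their first mode, models X with a CP (PARAFAC) decomposition and Y with a matrix factorization, and ties the two together through a common factor matrix A. The function name cmtf_als, the variable names and the synthetic data are illustrative assumptions, and the alternating least squares scheme used here is just one simple way to fit such a model; it is not necessarily the algorithm implemented in the CMTF Toolbox.

```python
import numpy as np


def cmtf_als(X, Y, rank, n_iter=100, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] and Y ~ A @ V.T, sharing A.

    Hypothetical helper for illustration: alternating least squares on
    ||X - [[A, B, C]]||^2 + ||Y - A V^T||^2.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A, B, C, V = (rng.standard_normal((n, rank)) for n in (I, J, K, M))
    history = []
    for _ in range(n_iter):
        # The shared factor A is fitted to the tensor and the matrix at once.
        gram = (B.T @ B) * (C.T @ C) + V.T @ V
        A = (np.einsum("ijk,jr,kr->ir", X, B, C) + Y @ V) @ np.linalg.pinv(gram)
        # B and C appear only in the tensor model.
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        # V appears only in the matrix model.
        V = Y.T @ A @ np.linalg.pinv(A.T @ A)
        # Record the coupled objective after this sweep.
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        history.append(np.sum((X - X_hat) ** 2) + np.sum((Y - A @ V.T) ** 2))
    return A, B, C, V, history


# Synthetic, noise-free example: a 10 x 8 x 6 tensor and a 10 x 5 matrix
# generated from the same rank-2 factor in the first mode (assumed sizes).
rng = np.random.default_rng(1)
A0, B0, C0, V0 = (rng.standard_normal((n, 2)) for n in (10, 8, 6, 5))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
Y = A0 @ V0.T
A, B, C, V, history = cmtf_als(X, Y, rank=2)
```

Because each update solves its least squares subproblem exactly, the values in history should not increase from one sweep to the next. In this sketch every component is shared between the two data sets; the structure-revealing models mentioned above go further and allow some components to be present in only one of them.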

Data fusion is a challenging task due to the heterogeneity of the data sets (matrices vs. higher-order tensors, different noise characteristics), and there are many open research questions. We have been working on these questions with the goal of addressing the challenges in real applications, in particular omics data fusion and multimodal neuroimaging data analysis.

 PI: Evrim Acar

  • JODA: Joint Data Analysis for Enhanced Knowledge Discovery in Metabolomics (funded by the Danish Council for Independent Research | Technology and Production Sciences and Sapere Aude Program under the projects 11-116328 and 11-120947, from March 2012 to December 2016, Denmark)
  • Multimodal Neuroimaging Data Fusion (funded by Simula Metropolitan Center for Digital Engineering, Norway)

 COLLABORATORS:

  • Rasmus Bro, Faculty of Science, University of Copenhagen, Denmark
  • Mathias Nilsson, School of Chemistry, The University of Manchester, UK
  • Lars Ove Dragsted, Faculty of Science, University of Copenhagen, Denmark
  • Gozde Gurdeniz, Faculty of Science, University of Copenhagen, Denmark
  • Michael Saunders, Stanford University, Stanford, CA
  • Anne Tjønneland, Danish Cancer Society, Denmark
  • Tormod Næs, Nofima, Norway
  • Tamara G. Kolda, Sandia National Labs, Livermore, CA
  • A. Taylan Cemgil, Bogazici University, Istanbul, Turkey
  • Beyza Ermis, Bogazici University, Istanbul, Turkey
  • Bulent Yener, Rensselaer Polytechnic Institute, Troy, NY
  • Vagelis Papalexakis, University of California Riverside, CA
  • Age K. Smilde, University of Amsterdam, Netherlands
  • Tulay Adali, University of Maryland Baltimore County, MD
  • Urban J. Wunsch, Chalmers University of Technology, Goteborg, Sweden

SOFTWARE: The MATLAB CMTF Toolbox

  • Version 1.1  (Dec. 2014)
  • Version 1.0  (April 2013)

DATA:

RELATED PUBLICATIONS
Coupled Matrix and Tensor Factorizations (Models and Algorithms)

Metabolomics

Neuroscience

Social Network Analysis

Bioinformatics

Environmental Sciences

TALKS AT CONFERENCES/WORKSHOPS & SEMINARS:

WORKSHOPS/MINISYMPOSIA:

LITERATURE SURVEYS:

COURSES:

  • Data fusion as part of the multi-way data analysis course, University of Copenhagen, June 13, 2019
  • Copenhagen School of Chemometrics, May 25-26, 2016
  • ODIN Course on Data Fusion, November 25, 2015

 Maintained by: Evrim Acar

 
