• Digital-Health-Conference-2020

Invited Speakers: Profile Details

Christophe Van Neste
Christophe Van Neste is a postdoctoral fellow at KAUST, specializing in cellular and genetic biotechnology.


In 2008, Christophe Van Neste obtained a Master's degree in Bio-Engineering at Ghent University (Ghent, Belgium), specializing in cellular and genetic biotechnology. In the following years he studied philosophy, and in 2010 he obtained a Master's degree in Philosophical Sciences at Ca' Foscari University (Venice, Italy), with a focus on philosophy of science and epistemic justification for the growth of scientific knowledge. Thereafter, he became a doctoral fellow in the lab of Prof. Dieter Deforce (Ghent University), where he worked as a bioinformatician on a project to apply massively parallel sequencing to forensic DNA profiling analyses. During the spring of 2014, he did an internship at the Illumina headquarters in San Diego (USA), where he had the opportunity to work closely with the Illumina BaseSpace development team and Illumina's forensic team. He defended his doctoral dissertation "Porting forensic DNA analysis to deep sequencing" before an international committee in June 2015.

Shortly thereafter, he committed fully to cancer research and joined the team of Prof. Frank Speleman at the Ghent Center for Medical Genetics to work on replicative stress in neuroblastoma. In 2016 he won an FWO-supported research mandate to work on "Computational probing of replicative stress resistance and induced G-quadruplex resolving processes in embryonic stem or cancer cells." Inspired by the technical challenges of this work, he joined the machine learning and knowledge-mining lab of Prof. Vladimir Bajic (KAUST, Saudi Arabia) in July 2018. There, he currently focuses on extracting important insights from cancer literature by applying natural language processing and AI algorithms. For further information: http://www.van-neste.be

All sessions by Christophe Van Neste

  • Day 3: Wednesday, January 22nd
Session 6: Spotlight on Young Talent (Chair: Prof. Robert Hoehndorf)
2:35 pm

The Use of Custom Embeddings Generated from PubMed Corpora for Cancer Research

In natural language processing, one of the big open questions is “what is the optimal approach to embed our natural language in a vector space?” An embedding essentially transforms words into series of numbers. Ideally, those numbers represent semantic meaning: in a multidimensional space, the different dimensions should correspond to different types of meaning (e.g. size of an entity, sex of an animal), which a computer algorithm can then use to make inferences.
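
As a minimal sketch of this idea, the Python snippet below uses a hand-made two-dimensional embedding whose axes stand for size and sex. The words, the vector values, and the helper names (`cosine_similarity`, `nearest`, `analogy`) are all hypothetical and chosen only for illustration. Real embeddings are learned from text, and their dimensions are rarely this easy to interpret.

```python
import math

# Hypothetical 2-dimensional embedding: (size, sex).
# The values are hand-picked for illustration, not learned from any corpus.
EMBEDDING = {
    "bull": (0.9, 1.0),
    "cow": (0.9, -1.0),
    "stallion": (0.8, 1.0),
    "mare": (0.8, -1.0),
    "rooster": (0.1, 1.0),
    "hen": (0.1, -1.0),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(vector, exclude=()):
    """Return the vocabulary word whose vector is closest (Euclidean) to `vector`."""
    candidates = [word for word in EMBEDDING if word not in exclude]
    return min(candidates, key=lambda word: math.dist(EMBEDDING[word], vector))


def analogy(a, b, c):
    """Infer d such that 'a is to b as c is to d' via vector arithmetic."""
    target = tuple(
        y - x + z for x, y, z in zip(EMBEDDING[a], EMBEDDING[b], EMBEDDING[c])
    )
    return nearest(target, exclude={a, b, c})


if __name__ == "__main__":
    # Similar animals point in similar directions.
    print(cosine_similarity(EMBEDDING["bull"], EMBEDDING["stallion"]))
    # Moving along the "sex" axis: bull -> cow, so rooster -> ?
    print(analogy("bull", "cow", "rooster"))
```

Because each axis here carries one type of meaning, simple arithmetic on the vectors is enough to make the kind of inference described above.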

Institutions and corporations endowed with big text data claim that only large corpora produce performant embeddings. In this presentation, we investigate the minimal corpus size that is useful for extracting cancer-related statements. To this end, we developed a literature knowledge mining tool, “sina” (https://github.com/dicaso/sina), that extracts statements relevant to specific conditions and the research question at hand by selecting a specific corpus of documents with which to establish a custom word embedding.
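
To illustrate the general idea of a corpus-specific embedding, the sketch below selects a topic-specific sub-corpus and derives word vectors from co-occurrence counts within it. This is not sina's implementation or API. The example documents, the function names (`select_corpus`, `cooccurrence_embedding`), and the use of raw co-occurrence counts in place of a trained embedding model are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical mini-corpus standing in for PubMed abstracts.
DOCUMENTS = [
    "mycn amplification drives neuroblastoma progression",
    "replicative stress is elevated in neuroblastoma cells",
    "mycn induces replicative stress in tumor cells",
    "statins lower cholesterol in cardiac patients",
]


def select_corpus(documents, keyword):
    """Keep only the documents that mention `keyword` as a whole token."""
    return [doc for doc in documents if keyword in doc.split()]


def cooccurrence_embedding(documents, window=2):
    """Build one count vector per word from neighbors within `window` tokens.

    Returns (vocabulary, vectors), where vectors[word][i] counts how often
    vocabulary[i] appeared near `word` in the selected documents.
    """
    counts = defaultdict(Counter)
    for doc in documents:
        tokens = doc.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    counts[word][tokens[j]] += 1
    vocabulary = sorted(counts)
    vectors = {
        word: [counts[word][context] for context in vocabulary]
        for word in vocabulary
    }
    return vocabulary, vectors


if __name__ == "__main__":
    corpus = select_corpus(DOCUMENTS, "neuroblastoma")
    vocabulary, vectors = cooccurrence_embedding(corpus)
    print(len(corpus), "documents,", len(vocabulary), "words")
    print("mycn ->", vectors["mycn"])
```

Restricting the corpus before building the vectors means that unrelated vocabulary (here, the cardiology sentence) never enters the embedding, which is the trade-off a small custom corpus makes against a large general one.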

Building 19, Hall 1, 14:35 - 14:55