Knowledge transfer across cell lines using hybrid Gaussian process models with entity embedding vectors


Clemens Hutter, Moritz von Stosch, Mariano Nicolas Cruz Bournazou, and Alessandro Butté


Biotechnology and Bioengineering, Aug. 2021.

DOI: 10.1002/bit.27907



To date, a large number of experiments are performed to develop a biochemical process. The generated data is used only once, to make decisions during development. If we could exploit data from already developed processes to make predictions for a novel process, we could significantly reduce the number of experiments needed. Processes for different products differ in behaviour, and typically only a subset behave similarly. Therefore, effective learning on process data spanning multiple products requires a sensible representation of the product identity. We propose to represent the product identity (a categorical feature) by embedding vectors that serve as input to a Gaussian Process regression model. We demonstrate how the embedding vectors can be learned from process data and show that they capture an interpretable notion of product similarity. The performance improvement over traditional one-hot encoding is evaluated on a simulated cross-product learning task. All in all, the proposed method could enable significant reductions in wet-lab experiments.
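To illustrate why the representation of product identity matters, the following is a minimal sketch, not the authors' implementation. It assumes a plain squared-exponential kernel over the concatenation of one continuous process feature and a product vector; the hybrid model structure from the paper is not reproduced. All names (`rbf_kernel`, `build_inputs`, `gp_posterior_mean`, `predict_for_product`), the toy data, and the hand-picked embedding values are hypothetical. Under one-hot encoding every pair of distinct products is equally far apart, whereas embedding vectors can place similar products close together so that data from one informs predictions for the other.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def build_inputs(x, product_ids, product_vectors):
    """Append each row's product vector to its continuous process features."""
    return np.hstack([x, product_vectors[product_ids]])

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean of a zero-mean GP with i.i.d. Gaussian observation noise."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

# Hypothetical setup: three products, only product 0 has training data.
x_train = np.linspace(0.0, 3.0, 10)[:, None]
y_train = np.sin(x_train[:, 0])
ids_train = np.zeros(10, dtype=int)
x_test = np.array([[1.5]])

# One-hot: all distinct products are at the same distance (sqrt(2)).
one_hot = np.eye(3)
# Hand-picked embeddings (assumption): product 1 resembles product 0,
# product 2 does not. In the paper these vectors are learned from data.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 0.0]])

def predict_for_product(product_vectors, product_id):
    """Predict at x_test for the given product under the given representation."""
    X_tr = build_inputs(x_train, ids_train, product_vectors)
    X_te = build_inputs(x_test, np.array([product_id]), product_vectors)
    return gp_posterior_mean(X_tr, y_train, X_te)[0]

for name, vectors in [("one-hot", one_hot), ("embedding", embeddings)]:
    preds = [predict_for_product(vectors, p) for p in range(3)]
    print(name, np.round(preds, 3))
```

With one-hot vectors, the predictions for products 1 and 2 are damped by the same factor relative to product 0, so no product benefits more than another from the existing data. With the embedding vectors, the prediction for the nearby product 1 stays close to that of product 0, while the distant product 2 falls back towards the prior mean. The sketch does not cover how the embedding vectors are learned from process data.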


Gaussian Process Regression, Embedding Vector, Transversal Data Analysis, Hybrid semi-parametric modeling, bioprocess development, cell culture



Copyright Notice: © 2021 C. Hutter, M. von Stosch, M. N. Cruz Bournazou, and A. Butté.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.