Machine Learning Workflow for Morphological Classification of Galaxies
Bernd Doser, Kai L. Polsterer, Andreas Fehlner, Sebastian Trujillo-Gomez
Abstract

As part of the EU-funded Center of Excellence SPACE (Scalable Parallel Astrophysical Codes for Exascale), seven commonly used astrophysics simulation codes are being optimized to exploit exascale computing platforms. Exascale cosmological simulations will produce several petabytes of data that will need to be analyzed and that hold enormous potential for scientific breakthroughs. Our tool Spherinator uses generative deep learning to reduce these complex data sets to a low-dimensional space, making it possible to study the morphological structure of simulated galaxies. Because the latent space is spherical, the HiPSter module can provide explorative visualization using Hierarchical Progressive Surveys (HiPS) in the Aladin software. Here we present a machine-learning workflow that covers all stages, from data collection through preprocessing, training, and prediction to final deployment. The workflow ensures full reproducibility by tracking the code, data, and environment, and it scales to large data volumes and complex pipelines. We use only open-source software and standards that align with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles. In this way, we can distribute the workflow reliably and enable collaboration by sharing code, data, and results efficiently.
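
To illustrate why a spherical latent space lends itself to sky-map style visualization, the following minimal sketch projects a latent vector onto the unit sphere and converts it to longitude and latitude, the kind of coordinates a hierarchical tiling of the sphere such as HiPS is built on. It assumes a three-dimensional latent space; the function names `to_unit_sphere` and `to_lon_lat` are hypothetical and are not taken from Spherinator or HiPSter.

```python
import math


def to_unit_sphere(z):
    """Project a 3-D latent vector onto the unit sphere (L2 normalization).

    Assumption: the latent space is three-dimensional.
    """
    norm = math.sqrt(sum(c * c for c in z))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / norm for c in z)


def to_lon_lat(p):
    """Convert a unit vector to (longitude, latitude) in degrees."""
    x, y, z = p
    lon = math.degrees(math.atan2(y, x)) % 360.0
    # Clamp to [-1, 1] to guard against rounding errors before asin.
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return lon, lat


# Hypothetical latent vector, as an encoder might produce for one galaxy.
latent = (0.0, 2.0, 0.0)
point = to_unit_sphere(latent)
lon, lat = to_lon_lat(point)
```

Under these assumptions, every encoded galaxy receives a position on the sphere, so nearby positions can be grouped into tiles at increasing resolution and browsed like a sky survey.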