Comparative analysis of machine learning techniques for feature selection and classification of Fast Radio Bursts
Ailton J. B. Júnior, Jéferson A. S. Fortunato, Leonardo J. Silvestre, Thonimar V. Alencar, Wiliam S. Hipólito-Ricaldi
Abstract
Fast Radio Bursts (FRBs) are millisecond-duration radio transients of extragalactic origin, exhibiting a wide range of physical and observational properties. Distinguishing between repeating and non-repeating FRBs remains a key challenge in understanding their nature. In this work, we apply unsupervised machine learning techniques to classify FRBs based on both primary observables from the CHIME catalog and physically motivated derived features. We evaluate three hybrid pipelines that combine dimensionality reduction with clustering: PCA + k-means, t-SNE + HDBSCAN, and t-SNE + Spectral Clustering. To identify optimal hyperparameters, we implement a comprehensive grid search using a custom scoring function that prioritizes recall while penalizing excessive cluster fragmentation and noise. Feature relevance is assessed using principal component loadings, mutual information with the known repeater label, and the sensitivity of the $F_2$ score to feature permutation. Our results demonstrate that the derived features, including redshift, luminosity, and spectral properties such as the spectral index and the spectral running, significantly enhance classification performance. Finally, we identify a set of FRBs currently labeled as non-repeaters that consistently cluster with known repeaters across all methods, highlighting promising candidates for future follow-up observations and reinforcing the utility of unsupervised approaches in FRB population studies.
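To make the recall-weighted scoring idea concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the scoring function used in this work: the names `f_beta`, `flag_repeater_like`, and `pipeline_score`, the penalty weights, and the rule of flagging the cluster that holds the most known repeaters are all hypothetical. Only the $F_\beta$ definition, $F_\beta = (1+\beta^2)\,PR / (\beta^2 P + R)$ with precision $P$ and recall $R$, is standard; setting $\beta = 2$ weights recall more heavily than precision.

```python
from collections import Counter

NOISE = -1  # assumed convention: density-based clusterers label unassigned points -1


def f_beta(y_true, y_pred, beta=2.0):
    """Standard F-beta score for boolean labels; beta=2 favours recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def flag_repeater_like(cluster_ids, is_repeater):
    """Hypothetical rule: flag every source in the cluster with most known repeaters."""
    counts = Counter(c for c, r in zip(cluster_ids, is_repeater) if r and c != NOISE)
    if not counts:
        return [False] * len(cluster_ids)
    best = counts.most_common(1)[0][0]
    return [c == best for c in cluster_ids]


def pipeline_score(cluster_ids, is_repeater, cluster_penalty=0.02, noise_penalty=0.5):
    """Hypothetical score: F2 minus penalties for extra clusters and noise points."""
    pred = flag_repeater_like(cluster_ids, is_repeater)
    n_clusters = len({c for c in cluster_ids if c != NOISE})
    noise_frac = sum(c == NOISE for c in cluster_ids) / len(cluster_ids)
    fragmentation = cluster_penalty * max(n_clusters - 2, 0)  # no penalty up to 2 clusters
    return f_beta(is_repeater, pred) - fragmentation - noise_penalty * noise_frac


# Toy example (fabricated labels for illustration only, not catalog data).
cluster_ids = [0, 0, 0, 1, 1, 1, 1, NOISE]
is_repeater = [True, True, False, False, False, False, True, False]

flagged = flag_repeater_like(cluster_ids, is_repeater)
score = pipeline_score(cluster_ids, is_repeater)
# Sources labeled non-repeaters that fall in the repeater-dominated cluster.
candidates = [i for i, (f, r) in enumerate(zip(flagged, is_repeater)) if f and not r]
print(score, candidates)
```

In a grid search, a score of this kind would be evaluated for each hyperparameter combination of a given pipeline and the highest-scoring configuration retained. The `candidates` list mirrors the idea of non-repeaters that cluster with known repeaters.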