TY - GEN
T1 - Uncertainty estimation through quantile forest for prescriptive scheduling of data processing at ALMA
AU - Carrasco, Rodrigo A.
AU - Aburto, Luis
AU - García Yus, Jorge
AU - De Rodt, Alfredo
AU - Speroni, Gianfranco
N1 - Publisher Copyright: © 2024 SPIE.
PY - 2024
Y1 - 2024
AB - The Atacama Large Millimeter/submillimeter Array (ALMA) is a prominent astronomical observatory known for its detailed imaging capabilities. Efficient scheduling of ALMA's data processing tasks, especially those involving complex pipeline executions, is crucial for maximizing operational productivity. This paper addresses the challenge by developing a predictive model that estimates the runtime of these tasks, enabling more effective scheduling and resource management. Our approach employs the Light Gradient Boosting Machine (LGBM) and Quantile Forest models to predict processing times and quantify uncertainties. The use of these models is innovative, as it not only provides accurate predictions but also offers insights into the variability of processing times. This is particularly beneficial for handling the dynamic nature of the data processing workload at ALMA. We enhance the model's performance and reliability by incorporating variable scaling and logarithmic transformations. To determine the best model, we comprehensively evaluated seven different machine-learning techniques. Our results show that the LGBM model and quantile estimation outperform traditional methods in predicting task durations. This leads to more efficient scheduling, as it allows the system to account for potential delays and optimize the sequencing of jobs. The quantile approach, in particular, offers a robust method for dealing with the inherent uncertainty in processing times. Our predictive tool has demonstrated a substantial reduction in overall flow time, decreasing it by 5.7%. Further improvements were achieved using stochastic scheduling techniques, which leverage the uncertainty estimates provided by our model. This research highlights the potential of machine learning to significantly enhance the operational efficiency of large-scale observatories like ALMA, providing a scalable and practical solution for managing complex data processing tasks.
KW - data processing
KW - job processing time prediction
KW - online scheduling
KW - predictive analytics
KW - quantile forest
UR - http://www.scopus.com/inward/record.url?scp=85201816550&partnerID=8YFLogxK
U2 - 10.1117/12.3019492
DO - 10.1117/12.3019492
M3 - Conference contribution
AN - SCOPUS:85201816550
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Observatory Operations
A2 - Benn, Chris R.
A2 - Chrysostomou, Antonio
A2 - Storrie-Lombardi, Lisa J.
PB - SPIE
T2 - Observatory Operations: Strategies, Processes, and Systems X 2024
Y2 - 17 June 2024 through 20 June 2024
ER -