TY - JOUR
T1 - A Fine-Tuned BERT-Based Model for Individual Log Anomaly Detection in Operational Monitoring at Paranal Observatory
AU - Catalán, Andrés H.
AU - Carrasco, Rodrigo A.
AU - Ruz, Gonzalo A.
AU - Gil, Juan P.
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2025
Y1 - 2025
AB - In operational environments such as astronomical observatories, continuous monitoring of system logs is critical yet challenging due to the vast volume of data generated. Manual inspection of these logs is impractical, necessitating automated methods capable of accurate and scalable anomaly detection. This paper proposes a novel sentiment-aware anomaly detection framework, leveraging Bidirectional Encoder Representations from Transformers with In-Task Pre-Training and Fine-Tuning (BERT-ITPT-FiT), to classify textual logs based on their inherent sentiment polarity. Specifically tailored for the Paranal Observatory, our approach incorporates a computationally efficient, character-based preprocessing strategy that retains essential semantic elements, such as acronyms and technical terms, thereby eliminating the need for traditional parsing. This method efficiently processes a substantial dataset comprising 7,359,506 log entries from three VLTI instruments in an average runtime of only 114.23 seconds. Extensive experiments utilizing real operational data demonstrate that our architecture achieves an effective throughput of approximately 3,958 logs per second during the combined embedding generation and model training phases while consistently delivering excellent classification performance. It also significantly outperforms existing deep learning methods combined with conventional embeddings in both in-instrument and cross-instrument scenarios, achieving F1-Scores exceeding 99.99% and 99.96%, respectively. This work emphasizes a critical balance between computational efficiency and classification performance, providing a robust and scalable solution for anomaly detection in high-stakes observatory operations.
KW - Bidirectional encoder representations from transformers
KW - log anomaly detection
KW - sentiment analysis
KW - word embedding
UR - https://www.scopus.com/pages/publications/105010223057
U2 - 10.1109/ACCESS.2025.3586586
DO - 10.1109/ACCESS.2025.3586586
M3 - Article
AN - SCOPUS:105010223057
SN - 2169-3536
VL - 13
SP - 117464
EP - 117478
JO - IEEE Access
JF - IEEE Access
ER -