Intelligent monitoring and fault diagnosis for ATLAS TDAQ: a complex event processing solution

Magnoni, Luca (2012) Intelligent monitoring and fault diagnosis for ATLAS TDAQ: a complex event processing solution. PhD thesis, Università degli studi di Ferrara.

Full text: MagnoniL_phd_tesi.pdf (PDF, 11MB)

    Abstract

    Effective monitoring and analysis tools are fundamental in modern IT infrastructures to gain insight into the overall system behavior and to deal promptly and effectively with failures. In recent years, Complex Event Processing (CEP) technologies have emerged as effective solutions for information processing in fields as disparate as wireless sensor networks and financial analysis. This thesis proposes an innovative approach to monitoring and operating complex, distributed computing systems, with particular reference to the ATLAS Trigger and Data Acquisition (TDAQ) system currently in use at the European Organization for Nuclear Research (CERN). The result of this research, the AAL project, is currently used to provide ATLAS data acquisition operators with automated error detection and intelligent system analysis.

    The thesis begins by describing the TDAQ system and its control architecture, with a focus on the monitoring infrastructure and the expert system used for error detection and automated recovery. It then discusses the limitations of the current approach and how it can be improved to maximize ATLAS TDAQ operational efficiency. Event processing methodologies are then laid out, with a focus on CEP techniques for stream processing and pattern recognition. The open-source Esper engine, the CEP solution adopted by the project, is subsequently analyzed and discussed. Next, the AAL project is introduced as the automated and intelligent monitoring solution developed as the result of this research. AAL requirements and governing factors are listed, with a focus on how stream processing functionalities can enhance the TDAQ monitoring experience. The AAL processing model is then introduced and the architectural choices are justified. Finally, real applications to TDAQ error detection are presented.

    The main conclusion from this work is that CEP techniques can be successfully applied to detect error conditions and system misbehavior. Moreover, the AAL project demonstrates a real application of CEP concepts for intelligent monitoring in the demanding TDAQ scenario. The adoption of AAL by several TDAQ communities shows that automation and intelligent system analysis were not properly addressed in the previous infrastructure. The results of this thesis will benefit researchers evaluating intelligent monitoring techniques on large-scale distributed computing systems.

    Document type: PhD thesis
    Date: 30 March 2012
    Supervisors: Luppi, Eleonora; Lehmann Miotto, Giovanna
    Cycle coordinator: Ruggiero, Valeria
    Institution: Università degli studi di Ferrara
    Doctoral programme: Cycle XXIV, year 2009 > Mathematics and Computer Science
    Structure: Department > Mathematics
    Subjects: Area 01 - Mathematical and computer sciences > INF/01 Computer Science
    Keywords: event processing, cep, intelligent monitoring, aal, shifter assistant, atlas tdaq
    Deposited on: 27 Feb 2013 08:38
