Hierarchical Fault Detection and Diagnosis for Transformer Architectures

arXiv CS.AI

CC BY

Abstract

Transformers now underpin critical AI systems across industry and research.

Yet their faults can silently alter model behavior without runtime errors, and existing techniques offer little support for tracing these failures to their component and root cause.

Such faults evade detection because loss and numerical values stay normal, and the visible symptom rarely identifies the component responsible.

We present DEFault++, a hierarchical learning-based technique that first detects a fault, then identifies the affected component, and finally pinpoints the root cause within that component, helping developers debug transformer models effectively.

DEFault++ organizes component-level runtime measurements with a Fault Propagation Graph (FPG), a structural prior over the architecture's dependency paths, and reports the evidence behind each diagnosis.
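
The role of a dependency graph in localizing a fault can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the component names, the edges in `FPG`, the per-component anomaly scores, the threshold, and the functions `upstream_of` and `diagnose` are hypothetical, and the rule used here (pick the anomalous component with no anomalous upstream component) is a simple stand-in, not the learned technique in DEFault++. The sketch covers detection and component localization only, and returns the scores that served as evidence.

```python
from collections import deque

# Hypothetical dependency edges for one transformer block (an assumption,
# not the graph used by DEFault++). Each key feeds the components it lists.
FPG = {
    "embedding": ["attention"],
    "attention": ["layernorm"],
    "layernorm": ["ffn"],
    "ffn": ["output_head"],
    "output_head": [],
}


def upstream_of(graph, node):
    """Return every component with a dependency path leading into `node`."""
    # Invert the edges so we can walk from a component to its inputs.
    parents = {name: [] for name in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            parents[dst].append(src)
    seen, queue = set(), deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def diagnose(graph, scores, threshold=0.5):
    """Toy two-stage diagnosis: detect a fault, then localize a component.

    `scores` maps component name -> anomaly score in [0, 1]; both the scores
    and the threshold are hypothetical stand-ins for runtime measurements.
    """
    # Stage 1 (detection): any component at or above the threshold.
    anomalous = {c for c, s in scores.items() if s >= threshold}
    if not anomalous:
        return {"faulty": False, "component": None, "evidence": {}}
    # Stage 2 (localization): anomalies propagate downstream, so prefer an
    # anomalous component that has no anomalous component upstream of it.
    roots = [c for c in anomalous if not (upstream_of(graph, c) & anomalous)]
    component = max(roots, key=lambda c: scores[c])
    evidence = {c: scores[c] for c in sorted(anomalous)}
    return {"faulty": True, "component": component, "evidence": evidence}


if __name__ == "__main__":
    scores = {"embedding": 0.1, "attention": 0.8, "layernorm": 0.7,
              "ffn": 0.9, "output_head": 0.6}
    print(diagnose(FPG, scores))
```

In this toy run the feed-forward block has the highest score, yet the sketch reports the attention component, because every other anomalous component lies downstream of it. That is the intuition behind using dependency paths as a structural prior: the most visible symptom need not be the component responsible.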

To train and evaluate it, we construct DEFault-bench, a benchmark of 5,556 labeled runs from mutation testing across seven models, nine tasks, and both encoder and decoder architectures.

DEFault++ improves fault detection over four prior techniques, reaching an F1 of 0.826 to 0.909, and in a developer study with 21 participants, it raises repair accuracy from 57.1% to 83.3%.

These results show that transformer fault diagnosis benefits from component-level measurements and architecture-aware reasoning rather than model-level behavior alone, and DEFault-bench provides a foundation for further research on transformer fault diagnosis.
