A Dataset for Evaluating Online Anomaly Detection Approaches for Discrete Multivariate Time Series
Lucas Correia, Jan-Christoph Goos, Thomas Bäck, Anna V. Kononova · November 21, 2024
Summary
The PATH dataset is introduced for evaluating online unsupervised anomaly detection methods on multivariate time series. It is generated by simulating a realistic automotive powertrain and is intended to be diverse, extensive, and non-trivial. The dataset is available in contaminated and clean versions, so it supports both unsupervised and semi-supervised anomaly detection, as well as time series generation and forecasting. It contains 16 logged signals from the subsystems of a fully electric vehicle, including motor speed, torque, battery state of charge, and pedal inputs, chosen using domain knowledge to represent powertrain behavior. The study builds the dataset from realistic drive cycles so that classical methods can be compared against deep learning methods, and so that anomaly detection methods can be tested in the presence of contamination. The dataset and its source code are accessible online.
Introduction
Background
Overview of the PATH dataset
Importance of evaluating unsupervised anomaly detection methods
Context of automotive powertrain simulation
Objective
Aim of using the PATH dataset
Focus on classical vs. deep learning methods in anomaly detection
Dataset Overview
Structure and Components
Contaminated and clean versions
Support for unsupervised and semi-supervised settings
Time series generation and forecasting capabilities
Signal Description
16 logged signals from electric vehicle subsystems
Selection based on domain knowledge for powertrain behavior representation
Signal Types
Motor speed, torque, battery state of charge, pedal inputs
Realistic drive cycle representation
Dataset Generation
Signal Generation
Methods for creating realistic drive cycles
Techniques for simulating powertrain behavior
Contamination
Strategies for introducing anomalies
Levels of contamination for testing robustness
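As a rough illustration of the contamination idea outlined above, the following sketch mixes anomalous sequences into an otherwise clean training set until they make up a chosen fraction of it. It is not the authors' generation procedure: the function name `contaminate`, the `rate` parameter, and the list-of-sequences representation are all assumptions made for this example.

```python
import random


def contaminate(normal_seqs, anomalous_seqs, rate, seed=0):
    """Mix anomalous sequences into a clean set (hypothetical helper).

    Returns a shuffled list of (sequence, label) pairs in which roughly
    `rate` of the sequences are anomalous (label 1). The labels are kept
    only for bookkeeping; an unsupervised method would not see them.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    rng = random.Random(seed)
    # Number of anomalous sequences needed so that they form `rate`
    # of the final mixed set: n_anom / (n_normal + n_anom) = rate.
    n_anom = round(rate * len(normal_seqs) / (1.0 - rate))
    if n_anom > len(anomalous_seqs):
        raise ValueError("not enough anomalous sequences for this rate")
    chosen = rng.sample(anomalous_seqs, n_anom)
    mixed = [(seq, 0) for seq in normal_seqs] + [(seq, 1) for seq in chosen]
    rng.shuffle(mixed)
    return mixed
```

Calling `contaminate` with `rate=0.0` returns only the clean sequences, which corresponds to the semi-supervised setting, while a positive `rate` approximates an unsupervised setting with contaminated training data.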
Evaluation Framework
Method Comparison
Classical vs. deep learning approaches
Criteria for assessing anomaly detection effectiveness
Performance Metrics
Quantitative measures for evaluating methods
Statistical analysis for comparing results
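The outline does not name the quantitative measures used. As one common possibility, the sketch below computes point-wise precision, recall, and F1 from anomaly scores and a fixed threshold. The choice of metric, the function name `precision_recall_f1`, and the strictly-greater-than thresholding rule are assumptions for illustration only.

```python
def precision_recall_f1(scores, labels, threshold):
    """Point-wise precision, recall, and F1 for thresholded anomaly scores.

    `scores` are per-time-step anomaly scores, `labels` are ground-truth
    0/1 flags, and a time step is flagged when its score exceeds
    `threshold` (all assumptions made for this sketch).
    """
    preds = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    # Guard against division by zero when nothing is flagged or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

Because the result depends directly on `threshold`, evaluating the same scores at several thresholds is a simple way to see how sensitive a method's reported performance is to that choice.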
Implementation and Accessibility
Data Availability
Online access to the PATH dataset
Source code for generating and processing data
Tools and Software
Recommendations for data analysis tools
Compatibility with various programming languages
Conclusion
Summary of Findings
Insights from using the PATH dataset
Implications for anomaly detection research
Future Directions
Potential improvements to the dataset
Areas for further investigation
Basic info
papers
computational engineering, finance, and science
machine learning
systems and control
artificial intelligence