# industry_anomaly_detect

**Repository Path**: geek_dog/industry_anomaly_detect

## Basic Information

- **Project Name**: industry_anomaly_detect
- **Description**: A simple project for anomaly detection using audio
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-08
- **Last Updated**: 2025-09-26

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Industrial Anomaly Detection

Industrial Anomaly Detection is a deep learning-based system for detecting anomalies in industrial equipment from audio signals. It supports multiple neural network architectures for audio anomaly detection, including Convolutional Neural Networks (CNN), Transformer-based models, Deep Neural Networks (DNN), LSTM, and FSMN.

## Features

- Multiple model architectures: CNN, Transformer, DNN, LSTM, and FSMN
- Audio feature extraction using MFCC, Mel-spectrogram, or combined features
- Comprehensive training and evaluation pipeline
- TensorBoard integration for monitoring training progress
- Support for both CPU and GPU training
- Detailed performance metrics including ROC AUC, PR AUC, and classification reports

## Requirements

- Python 3.6+
- PyTorch
- torchaudio
- librosa
- scikit-learn
- pandas
- tqdm
- matplotlib
- tensorboard

Install the required packages:

```bash
pip install -r requirement.txt
```

## Project Structure

```
industry_anomaly_detect/
├── configs/              # Configuration files
├── dataset/              # Dataset directory (to be created)
├── deploy/               # Deployment files
├── docs/                 # Documentation
├── models/               # Trained models storage
├── source/               # Source code
│   ├── modules/          # Neural network models
│   ├── utils/            # Utility functions
│   └── train_wraper.py   # Training wrapper
├── train.py              # Main training script
├── train.sh              # Training script
└── requirement.txt       # Python dependencies
```

## Dataset

Download the [MIMII DUE](https://zenodo.org/records/4740355) dataset from Zenodo:
https://zenodo.org/records/4740355

## Configuration

The system is configured through `configs/config.yaml`. Key configuration options include:

### Model Configuration

- `type`: Model architecture (`"conv"`, `"transformer"`, or `"dnn"`)
- `device`: Training device (`"cuda"` or `"cpu"`)
- `batch_size`: Training batch size
- `epochs`: Number of training epochs
- `learning_rate`: Learning rate for optimization

### Data Configuration

- `feature_type`: Feature extraction method (`"mfcc"`, `"melspectrogram"`, or `"combined"`)
- `sample_rate`: Audio sample rate
- Feature-specific parameters for MFCC, Mel-spectrogram, and combined features

### Paths Configuration

- `train_data`: Path to normal training audio files
- `test_data`: Path to test audio files (normal and anomalous)
- `model_dir`: Directory in which to save trained models

## Usage

### Training

To train a model with the default configuration:

```bash
python train.py
```

To train with a custom configuration file:

```bash
python train.py --config path/to/your/config.yaml
```

### Feature Extraction Options

1. **MFCC**: Mel-frequency cepstral coefficients
2. **Mel-spectrogram**: Mel-scale spectrogram features
3. **Combined**: Combination of MFCC and Mel-spectrogram features

### Model Architectures

1. **Convolutional Neural Network (conv)**: CNN-based autoencoder
2. **Transformer (transformer)**: Transformer-based autoencoder
3. **Deep Neural Network (dnn)**: Fully connected DNN autoencoder

## Monitoring Training

TensorBoard logs are generated during training and can be viewed with:

```bash
tensorboard --logdir=models/[your_trained_store_directory]
```

## Dataset Structure

The dataset should be organized as follows:

- Training data: a directory containing only normal (non-anomalous) audio files
- Test data: a directory containing both normal and anomalous audio files

Audio files should be in WAV format.
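Putting the configuration options together, a minimal `configs/config.yaml` might look like the sketch below. The grouping and exact key names (`model:`, `data:`, `paths:`, `n_mfcc`) are assumptions inferred from the option list, not copied from the repository; consult the shipped config file for the authoritative layout.

```yaml
# Hypothetical sketch of configs/config.yaml -- section and key names
# are inferred from the documented options, not taken from the repo.
model:
  type: "conv"            # "conv", "transformer", or "dnn"
  device: "cuda"          # "cuda" or "cpu"
  batch_size: 32
  epochs: 100
  learning_rate: 0.001

data:
  feature_type: "mfcc"    # "mfcc", "melspectrogram", or "combined"
  sample_rate: 16000
  n_mfcc: 40              # example of a feature-specific parameter

paths:
  train_data: "dataset/train"   # normal audio only
  test_data: "dataset/test"     # normal + anomalous audio
  model_dir: "models"
```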
## Results

The system evaluates performance using:

- ROC AUC score
- PR AUC score
- Classification reports with precision, recall, and F1-score
- Accuracy metrics for normal and anomalous samples

Model performance metrics are logged during both the training and testing phases.

## License

This project is licensed under the terms described in the LICENSE file.
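As an illustration of how the reported metrics are derived, ROC AUC and PR AUC can be computed from per-file anomaly scores with scikit-learn (already listed under Requirements). The scores and labels below are made-up example data, and `average_precision_score` is used here as the PR AUC; the project's own evaluation code may differ in detail.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Made-up example data: higher score = more anomalous.
# Labels: 0 = normal, 1 = anomaly.
labels = np.array([0, 0, 0, 1, 1])
scores = np.array([0.10, 0.25, 0.40, 0.80, 0.95])

roc_auc = roc_auc_score(labels, scores)           # how well anomalies rank above normals
pr_auc = average_precision_score(labels, scores)  # average precision, used as PR AUC

print(f"ROC AUC: {roc_auc:.3f}")  # 1.000 here: the scores separate the classes perfectly
print(f"PR AUC:  {pr_auc:.3f}")
```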