# industry_anomaly_detect

**Repository Path**: geek_dog/industry_anomaly_detect

## Basic Information

- **Project Name**: industry_anomaly_detect
- **Description**: A simple project for anomaly detection using audio
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-08
- **Last Updated**: 2025-09-26

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Industrial Anomaly Detection

Industrial Anomaly Detection is a deep learning-based system for detecting anomalies in industrial equipment from audio signals. It supports multiple neural network architectures for audio anomaly detection, including Convolutional Neural Networks (CNN), Transformer-based models, Deep Neural Networks (DNN), LSTM, and FSMN.

## Features

- Multiple model architectures: CNN, Transformer, DNN, LSTM, and FSMN
- Audio feature extraction using MFCC, Mel-spectrogram, or combined features
- Comprehensive training and evaluation pipeline
- TensorBoard integration for monitoring training progress
- Support for both CPU and GPU training
- Detailed performance metrics including ROC AUC, PR AUC, and classification reports

## Requirements

- Python 3.6+
- PyTorch
- torchaudio
- librosa
- scikit-learn
- pandas
- tqdm
- matplotlib
- tensorboard

Install the required packages:

```bash
pip install -r requirement.txt
```

## Project Structure

```
industry_anomaly_detect/
├── configs/              # Configuration files
├── dataset/              # Dataset directory (to be created)
├── deploy/               # Deployment files
├── docs/                 # Documentation
├── models/               # Trained models storage
├── source/               # Source code
│   ├── modules/          # Neural network models
│   ├── utils/            # Utility functions
│   └── train_wraper.py   # Training wrapper
├── train.py              # Main training script
├── train.sh              # Training script
└── requirement.txt       # Python dependencies
```

## Dataset

Download the [MIMII DUE](https://zenodo.org/records/4740355) dataset from Zenodo:
https://zenodo.org/records/4740355

## Configuration

The system is configured through `configs/config.yaml`. Key configuration options include:

### Model Configuration

- `type`: Model architecture (`"conv"`, `"transformer"`, or `"dnn"`)
- `device`: Training device (`"cuda"` or `"cpu"`)
- `batch_size`: Training batch size
- `epochs`: Number of training epochs
- `learning_rate`: Learning rate for optimization

### Data Configuration

- `feature_type`: Feature extraction method (`"mfcc"`, `"melspectrogram"`, or `"combined"`)
- `sample_rate`: Audio sample rate
- Feature-specific parameters for MFCC, Mel-spectrogram, and combined features

### Paths Configuration

- `train_data`: Path to normal training audio files
- `test_data`: Path to test audio files (normal and anomalous)
- `model_dir`: Directory in which to save trained models

## Usage

### Training

To train a model with the default configuration:

```bash
python train.py
```

To train with a custom configuration file:

```bash
python train.py --config path/to/your/config.yaml
```

### Feature Extraction Options

1. **MFCC**: Mel-frequency cepstral coefficients
2. **Mel-spectrogram**: Mel-scale spectrogram features
3. **Combined**: Combination of MFCC and Mel-spectrogram features

### Model Architectures

1. **Convolutional Neural Network (conv)**: CNN-based autoencoder
2. **Transformer (transformer)**: Transformer-based autoencoder
3. **Deep Neural Network (dnn)**: Fully connected DNN autoencoder

## Monitoring Training

TensorBoard logs are generated during training and can be viewed with:

```bash
tensorboard --logdir=models/[your_trained_store_directory]
```

## Dataset Structure

The dataset should be organized as follows:

- Training data: a directory containing only normal (non-anomalous) audio files
- Test data: a directory containing both normal and anomalous audio files

Audio files should be in WAV format.
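Putting the configuration options together, a minimal `configs/config.yaml` might look like the sketch below. The grouping and exact key names (`model:`, `data:`, `paths:`, `n_mfcc`) are assumptions inferred from the option list, not copied from the repository; consult the shipped config file for the authoritative layout.

```yaml
# Hypothetical sketch of configs/config.yaml -- section and key names
# are inferred from the documented options, not taken from the repo.
model:
  type: "conv"            # "conv", "transformer", or "dnn"
  device: "cuda"          # "cuda" or "cpu"
  batch_size: 32
  epochs: 100
  learning_rate: 0.001

data:
  feature_type: "mfcc"    # "mfcc", "melspectrogram", or "combined"
  sample_rate: 16000
  n_mfcc: 40              # example of a feature-specific parameter

paths:
  train_data: "dataset/train"   # normal audio only
  test_data: "dataset/test"     # normal + anomalous audio
  model_dir: "models"
```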
## Results

The system evaluates performance using:

- ROC AUC score
- PR AUC score
- Classification reports with precision, recall, and F1-score
- Accuracy metrics for normal and anomalous samples

Model performance metrics are logged during both the training and testing phases.

## License

This project is licensed under the terms described in the LICENSE file.
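As an illustration of how the reported metrics are derived, ROC AUC and PR AUC can be computed from per-file anomaly scores with scikit-learn (already listed under Requirements). The scores and labels below are made-up example data, and `average_precision_score` is used here as the PR AUC; the project's own evaluation code may differ in detail.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Made-up example data: higher score = more anomalous.
# Labels: 0 = normal, 1 = anomaly.
labels = np.array([0, 0, 0, 1, 1])
scores = np.array([0.10, 0.25, 0.40, 0.80, 0.95])

roc_auc = roc_auc_score(labels, scores)           # how well anomalies rank above normals
pr_auc = average_precision_score(labels, scores)  # average precision, used as PR AUC

print(f"ROC AUC: {roc_auc:.3f}")  # 1.000 here: the scores separate the classes perfectly
print(f"PR AUC:  {pr_auc:.3f}")
```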