# FastBERT

**Repository Path**: natural-language-processing/FastBERT

## Basic Information

- **Project Name**: FastBERT
- **Description**: FastBERT: a fast and robust inference speed-up method. "FastBERT: a Self-distilling BERT with Adaptive Inference Time", ACL 2020
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-27
- **Last Updated**: 2022-05-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# FastBERT

![](https://img.shields.io/badge/license-MIT-000000.svg)

Source code for ["FastBERT: a Self-distilling BERT with Adaptive Inference Time"](https://www.aclweb.org/anthology/2020.acl-main.537/) (ACL 2020).

## Good News

**2020/07/05 - Update**: The PyPI version of FastBERT has been released. See [fastbert-pypi](https://pypi.org/project/fastbert/).

Install ``fastbert`` with ``pip``:

```sh
$ pip install fastbert
```

## Requirements

``python >= 3.4.0``

Install all the requirements with ``pip``:

```sh
$ pip install -r requirements.txt
```

## Quick start on the Chinese Book review dataset

Download the pre-trained Chinese BERT parameters from [here](https://share.weiyun.com/gHkb3N6L) and save them to the ``models`` directory as ``Chinese_base_model.bin``.

Run the following command to validate our FastBERT with ``Speed=0.5`` on the Book review dataset:

```sh
$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
    --pretrained_model_path ./models/Chinese_base_model.bin \
    --vocab_path ./models/google_zh_vocab.txt \
    --train_path ./datasets/douban_book_review/train.tsv \
    --dev_path ./datasets/douban_book_review/dev.tsv \
    --test_path ./datasets/douban_book_review/test.tsv \
    --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
    --encoder bert --fast_mode --speed 0.5 \
    --output_model_path ./models/douban_fastbert.bin
```

Meaning of each option:

```
usage:
  --pretrained_model_path  Path to the pre-trained parameters used to initialize the model.
  --vocab_path             Path to the vocabulary.
  --train_path             Path to the training set.
  --dev_path               Path to the validation set.
  --test_path              Path to the test set.
  --epochs_num             Number of fine-tuning epochs.
  --batch_size             Batch size.
  --distill_epochs_num     Number of self-distillation epochs.
  --encoder                Type of encoder.
  --fast_mode              Whether to enable FastBERT's fast (adaptive inference) mode.
  --speed                  The Speed value from the paper, i.e. the uncertainty threshold for early exits.
  --output_model_path      Path where the output model parameters are saved.
```

Test results on the Book review dataset:

```
Test results at fine-tuning epoch 3 (baseline): Acc.=0.8688; FLOPs=21785247744;
Test results at self-distillation epoch 1:      Acc.=0.8698; FLOPs=6300902177;
Test results at self-distillation epoch 2:      Acc.=0.8691; FLOPs=5844839008;
Test results at self-distillation epoch 3:      Acc.=0.8664; FLOPs=5170940850;
Test results at self-distillation epoch 4:      Acc.=0.8664; FLOPs=5170940327;
Test results at self-distillation epoch 5:      Acc.=0.8664; FLOPs=5170940327;
```
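The FLOPs reduction above comes from adaptive inference: every transformer layer carries a small student classifier, and a sample stops climbing the stack as soon as the current prediction is confident enough, where ``Speed`` is the threshold on the normalized entropy of that prediction. Below is a minimal PyTorch sketch of that mechanism; the ``layers``, ``classifiers``, and ``hidden`` handles are hypothetical placeholders for illustration, not this repository's API.

```python
import torch

def normalized_entropy(probs: torch.Tensor) -> torch.Tensor:
    """Uncertainty measure from the paper: entropy of the predicted
    distribution, divided by log(N) so it always lies in [0, 1]."""
    n = probs.size(-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return entropy / torch.log(torch.tensor(float(n)))

def adaptive_forward(layers, classifiers, hidden, speed=0.5):
    """Run one sample through the encoder, exiting at the first layer whose
    student classifier is confident enough (uncertainty < speed).
    `layers` and `classifiers` are hypothetical lists of modules; `hidden`
    is a [1, seq_len, hidden_size] tensor holding a single sample."""
    probs = None
    for layer, classifier in zip(layers, classifiers):
        hidden = layer(hidden)
        probs = torch.softmax(classifier(hidden[:, 0]), dim=-1)  # predict from [CLS]
        if normalized_entropy(probs).item() < speed:
            return probs  # early exit: the remaining layers are skipped
    return probs  # no early exit: this is the final (teacher) prediction
```

A larger ``--speed`` loosens the threshold, so more samples exit at shallow layers and total FLOPs drop further, usually at a small cost in accuracy.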
## Quick start on the English Ag.news dataset

Download the pre-trained English BERT parameters from [here](https://share.weiyun.com/gHkb3N6L) and save them to the ``models`` directory as ``English_uncased_base_model.bin``.

Download ``ag_news.zip`` from [here](https://share.weiyun.com/ZctQJP8h) and unzip it into the ``datasets`` directory.

Run the following command to validate our FastBERT with ``Speed=0.5`` on the Ag.news dataset:

```sh
$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
    --pretrained_model_path ./models/English_uncased_base_model.bin \
    --vocab_path ./models/google_uncased_en_vocab.txt \
    --train_path ./datasets/ag_news/train.tsv \
    --dev_path ./datasets/ag_news/test.tsv \
    --test_path ./datasets/ag_news/test.tsv \
    --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
    --encoder bert --fast_mode --speed 0.5 \
    --output_model_path ./models/ag_news_fastbert.bin
```

Test results on the Ag.news dataset:

```
Test results at fine-tuning epoch 3 (baseline): Acc.=0.9447; FLOPs=21785247744;
Test results at self-distillation epoch 1:      Acc.=0.9308; FLOPs=2172009009;
Test results at self-distillation epoch 2:      Acc.=0.9311; FLOPs=2163471246;
Test results at self-distillation epoch 3:      Acc.=0.9314; FLOPs=2108341649;
Test results at self-distillation epoch 4:      Acc.=0.9314; FLOPs=2108341649;
Test results at self-distillation epoch 5:      Acc.=0.9314; FLOPs=2108341649;
```

## Datasets

More datasets can be downloaded from [here](https://share.weiyun.com/ZctQJP8h).

## Other implementations

There are other excellent implementations of FastBERT:

* BitVoyage/FastBERT (PyTorch): https://github.com/BitVoyage/FastBERT

## Acknowledgement

This work was funded by the 2019 Tencent Rhino-Bird Elite Training Program and was done while the first author was an intern at Tencent.

If you use this code, please cite the paper:

```
@inproceedings{weijie2020fastbert,
  title={{FastBERT}: a Self-distilling BERT with Adaptive Inference Time},
  author={Weijie Liu and Peng Zhou and Zhe Zhao and Zhiruo Wang and Haotang Deng and Qi Ju},
  booktitle={Proceedings of ACL 2020},
  year={2020}
}
```
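As an implementation note: the stage controlled by ``--distill_epochs_num`` trains each intermediate student classifier to imitate the final classifier's output distribution, so no extra labels are needed. Below is a minimal sketch of that objective using hypothetical ``student_logits``/``teacher_logits`` tensors rather than this repository's actual variables; see ``run_fastbert.py`` for the real training loop.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits):
    """Sum over layers of KL(student || teacher) between each intermediate
    classifier's distribution and the final classifier's distribution.

    student_logits: list of [batch, num_labels] tensors, one per layer
                    (hypothetical names; not this repo's variables).
    teacher_logits: [batch, num_labels] tensor from the final classifier;
                    detached so gradients only update the students."""
    log_p_t = F.log_softmax(teacher_logits.detach(), dim=-1)
    loss = teacher_logits.new_zeros(())
    for logits in student_logits:
        log_p_s = F.log_softmax(logits, dim=-1)
        p_s = log_p_s.exp()
        # KL(p_s || p_t), averaged over the batch
        loss = loss + (p_s * (log_p_s - log_p_t)).sum(dim=-1).mean()
    return loss
```

Because the teacher's soft predictions stand in for gold labels, this step can in principle consume unlabeled text as well, which is how the paper frames self-distillation.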