# Examples
To run the example scripts in this folder, you must first install `auto_gptq` as described in the repository's main README.
## Quantization
Commands in this section should be run from the `quantization` folder.
### Basic Usage
To execute `basic_usage.py`, use a command like this:

```shell
python basic_usage.py
```
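For orientation, below is a minimal sketch of the kind of workflow `basic_usage.py` demonstrates: quantize a small model with a few tokenized calibration examples and save it. The model name and output directory are placeholders, and the script itself does more; treat its source as authoritative.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_model_dir = "facebook/opt-125m"  # small model, quick to quantize
quantized_model_dir = "opt-125m-4bit"       # placeholder output directory

tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
# GPTQ needs tokenized calibration examples; real scripts draw them from a dataset.
examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library "
        "with user-friendly apis, based on the GPTQ algorithm."
    )
]

quantize_config = BaseQuantizeConfig(
    bits=4,          # quantize weights to 4-bit
    group_size=128,  # share quantization parameters per 128-column group
    desc_act=False,  # True improves accuracy slightly but slows inference
)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
model.quantize(examples)            # run GPTQ on the calibration examples
model.save_quantized(quantized_model_dir)
```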
To execute `basic_usage_with_wikitext2.py`, use a command like this:

```shell
python basic_usage_with_wikitext2.py
```
Note: there is about a 0.6 ppl (perplexity) degradation on the opt-125m model when using AutoGPTQ, compared to GPTQ-for-LLaMa.
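If you want to reproduce numbers like this yourself, perplexity on wikitext2 can be measured with plain `transformers`. The sketch below uses fixed, non-overlapping 2048-token windows (opt-125m's context length), which slightly overstates perplexity compared to a sliding-window evaluation, but is enough for a before/after comparison:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # swap in your quantized model dir to compare
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).cuda().eval()

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids.cuda()

seq_len = 2048  # opt-125m's maximum context length
nlls, n_tokens = [], 0
for begin in range(0, ids.size(1), seq_len):
    chunk = ids[:, begin : begin + seq_len]
    if chunk.size(1) < 2:
        break
    with torch.no_grad():
        # with labels == input_ids, the model returns the mean next-token NLL
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss * (chunk.size(1) - 1))  # un-average per chunk
    n_tokens += chunk.size(1) - 1

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"wikitext2 perplexity: {ppl.item():.2f}")
```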
### Quantize with Alpaca
To execute `quant_with_alpaca.py`, use a command like this:

```shell
CUDA_VISIBLE_DEVICES=0 python quant_with_alpaca.py --pretrained_model_dir "facebook/opt-125m"
```
The alpaca dataset used here is a cleaned version provided by gururise in [AlpacaDataCleaned](https://github.com/gururise/AlpacaDataCleaned).
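As a rough sketch of how such calibration data gets prepared (the script's own loading and preprocessing may differ), alpaca rows can be formatted into prompts and tokenized. The `yahma/alpaca-cleaned` dataset id is an assumption here: it is one Hub mirror of the cleaned data.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m", use_fast=True)

# Assumption: "yahma/alpaca-cleaned" mirrors AlpacaDataCleaned on the Hub;
# the script may load the data from a local file instead.
rows = load_dataset("yahma/alpaca-cleaned", split="train[:128]")

def to_prompt(row):
    # Alpaca rows have "instruction", optional "input", and "output" fields.
    if row["input"]:
        return (f"Instruction:\n{row['instruction']}\n\n"
                f"Input:\n{row['input']}\n\nOutput:\n{row['output']}")
    return f"Instruction:\n{row['instruction']}\n\nOutput:\n{row['output']}"

# Tokenized examples of this form can be passed to model.quantize(...)
examples = [tokenizer(to_prompt(row)) for row in rows]
```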
## Evaluation
Commands in this section should be run from the `evaluation` folder.
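All three evaluation scripts follow the same pattern: load the base model with `transformers` and the quantized model with `auto_gptq`, then run the same task on each. A minimal loading sketch, with paths as placeholders matching the commands below:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

base_model_dir = "PATH/TO/BASE/MODEL/DIR"            # placeholder
quantized_model_dir = "PATH/TO/QUANTIZED/MODEL/DIR"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_model_dir)
# The unquantized baseline, loaded the usual transformers way.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_dir, torch_dtype=torch.float16
).to("cuda:0")
# The GPTQ-quantized model, loaded through auto_gptq.
quantized_model = AutoGPTQForCausalLM.from_quantized(
    quantized_model_dir, device="cuda:0"
)
```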
### Language Modeling Task
The `run_language_modeling_task.py` script gives an example of using `LanguageModelingTask` to evaluate a model's performance on a language modeling task, before and after quantization, using the `tatsu-lab/alpaca` dataset.
To execute this script, use a command like this:

```shell
CUDA_VISIBLE_DEVICES=0 python run_language_modeling_task.py --base_model_dir PATH/TO/BASE/MODEL/DIR --quantized_model_dir PATH/TO/QUANTIZED/MODEL/DIR
```
Use the `--help` flag to see detailed descriptions of additional command-line arguments.
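For orientation, the sketch below shows roughly how the script wires up `LanguageModelingTask` (the other two task scripts follow the same pattern with their own task classes). The keyword arguments are assumptions recalled from the example script and may not match the current interface exactly, so treat the script source and `--help` as authoritative.

```python
from auto_gptq.eval_tasks import LanguageModelingTask

# Assumed interface: the column-name kwargs below are guesses based on the
# example script, not a documented API; check the script source.
task = LanguageModelingTask(
    model=base_model,           # loaded as in the sketch above
    tokenizer=tokenizer,
    data_name_or_path="tatsu-lab/alpaca",
    prompt_col_name="prompt",   # assumption
    label_col_name="output",    # assumption
)

print(f"base model: {task.run()}")
task.model = quantized_model    # swap models and re-run the same task
print(f"quantized model: {task.run()}")
```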
### Sequence Classification Task
The `run_sequence_classification_task.py` script gives an example of using `SequenceClassificationTask` to evaluate a model's performance on a sequence classification task, before and after quantization, using the `cardiffnlp/tweet_sentiment_multilingual` dataset.
To execute this script, use a command like this:

```shell
CUDA_VISIBLE_DEVICES=0 python run_sequence_classification_task.py --base_model_dir PATH/TO/BASE/MODEL/DIR --quantized_model_dir PATH/TO/QUANTIZED/MODEL/DIR
```
Use the `--help` flag to see detailed descriptions of additional command-line arguments.
### Text Summarization Task
The `run_text_summarization_task.py` script gives an example of using `TextSummarizationTask` to evaluate a model's performance on a text summarization task, before and after quantization, using the `samsum` dataset.
To execute this script, use a command like this:

```shell
CUDA_VISIBLE_DEVICES=0 python run_text_summarization_task.py --base_model_dir PATH/TO/BASE/MODEL/DIR --quantized_model_dir PATH/TO/QUANTIZED/MODEL/DIR
```
Use the `--help` flag to see detailed descriptions of additional command-line arguments.
## Push To Hub
Commands in this section should be run from the `push_to_hub` folder.
You can upload and share your quantized model on the Hugging Face Hub using the `push_to_hub` function.
`push_quantized_model_to_hf_hub.py` provides a simple example of uploading a quantized model, its tokenizer, and configs all at once.
First, you need to log in: run the following command in the virtual environment where Hugging Face Transformers is installed:

```shell
huggingface-cli login
```
Then run the script like this:

```shell
python push_quantized_model_to_hf_hub.py --quantized_model_dir PATH/TO/QUANTIZED/MODEL/DIR --tokenizer_dir PATH/TO/TOKENIZER/DIR --repo_id REPO/ID
```
Use the `--help` flag to see detailed descriptions of additional command-line arguments.
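For reference, the upload itself can also be done directly with `huggingface_hub`; below is a hedged sketch of the kind of flow the script performs, where the repo id and folder path are placeholders:

```python
from huggingface_hub import HfApi, create_repo

repo_id = "your-username/opt-125m-4bit-gptq"  # placeholder repo id
folder = "PATH/TO/QUANTIZED/MODEL/DIR"        # weights, tokenizer files, configs

create_repo(repo_id, exist_ok=True)  # no-op if the repo already exists
HfApi().upload_folder(folder_path=folder, repo_id=repo_id)
```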