
dengfeng-ai/tangshi-lora

Analyzed March 16, 2026

Overview

tangshi-lora (GitHub) is a project for fine-tuning the Qwen2.5-1.5B-Instruct large language model on Tang dynasty poetry using QLoRA (Quantized Low-Rank Adaptation). It is a modern, parameter-efficient follow-up to tangshi-gpt, leveraging current PEFT (Parameter-Efficient Fine-Tuning) methods to enable high-quality, domain-specific text generation with minimal compute resources.

The repository provides a full pipeline: from downloading and preprocessing a large-scale Tang poetry corpus, to instruction-tuning dataset creation, QLoRA-based training, and both quantitative (BLEU, perplexity) and qualitative evaluation. A pre-trained LoRA adapter is included for immediate use, and the codebase is designed for easy reproducibility and extension.

Architecture

The project is organized into modular components:

  • Data Preparation: Scripts in data/ fetch the chinese-poetry corpus via selective HTTP downloads, then preprocess it into an Alpaca-style instruction-tuning dataset. Only regular-form poems (绝句/律诗, i.e., quatrains and regulated verse) by the top 50 poets are included, with traditional Chinese converted to simplified using OpenCC.
  • Training: train/train.py loads the Qwen model with 4-bit quantization (via bitsandbytes), applies LoRA adapters to key attention modules, and fine-tunes using Hugging Face's SFTTrainer from the trl library. Training configuration is managed via train/config.yaml.
  • Evaluation & Inference: Scripts in eval/ provide quantitative evaluation (BLEU-4, perplexity), qualitative side-by-side comparison, and single-shot generation using the fine-tuned model.
  • Outputs: The outputs/checkpoint/ directory contains a ready-to-use LoRA adapter, tokenizer config, and chat template.
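The "~0.28% trainable parameters" figure cited below can be sanity-checked from Qwen2.5-1.5B's published dimensions. This is a back-of-the-envelope sketch, not the repo's code: it assumes the adapters target the q/k/v/o attention projections, which is a common choice but not confirmed by the source.

```python
# Rough estimate of LoRA-trainable parameters for Qwen2.5-1.5B.
# Assumes adapters on the q/k/v/o attention projections only (an
# assumption); model dimensions are from Qwen's published config.
hidden = 1536      # hidden_size
kv_dim = 2 * 128   # num_key_value_heads * head_dim (grouped-query attention)
layers = 28
rank = 16          # LoRA rank used by this repo

def lora_params(d_in, d_out, r):
    # A LoRA adapter adds two low-rank matrices: (d_in x r) and (r x d_out).
    return r * (d_in + d_out)

per_layer = (
    lora_params(hidden, hidden, rank)    # q_proj
    + lora_params(hidden, kv_dim, rank)  # k_proj
    + lora_params(hidden, kv_dim, rank)  # v_proj
    + lora_params(hidden, hidden, rank)  # o_proj
)
trainable = per_layer * layers
total = 1_540_000_000  # ~1.54B total parameters
print(f"trainable: {trainable:,} ({100 * trainable / total:.2f}%)")
```

With these assumed target modules the estimate lands at roughly 4.4M adapter parameters, consistent with the ~0.28% figure reported in Key Features.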

Key Features

  • Efficient Fine-Tuning: Utilizes QLoRA (rank 16, alpha 32, 4-bit NF4 quantization) to adapt a 1.5B parameter model with only ~0.28% trainable parameters, making training feasible on a single T4 GPU (EC2 g4dn.xlarge).
  • Domain-Specific Instruction Tuning: Converts ~49K Tang poems into ~13K style-imitation instructions in Alpaca format, e.g., 模仿李白的风格,写一首诗 ("Write a poem in the style of Li Bai").
  • Modern Evaluation: Provides both character-level and word-level BLEU-4 (with jieba segmentation), as well as perplexity, for rigorous quantitative assessment. Qualitative scripts allow for side-by-side comparison of base vs fine-tuned generations.
  • Reproducibility: All steps are scriptable, with clear configuration and deterministic data splits to prevent leakage.
  • Plug-and-Play Adapter: Pre-trained LoRA weights are included, so users can skip training and immediately perform evaluation or inference.
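To make the Alpaca format concrete, here is a minimal sketch of what one style-imitation sample might look like. The field names follow the standard Alpaca convention; the exact prompt wording and any helper function names are illustrative assumptions, not taken from the repo's preprocess.py.

```python
# Hypothetical sketch of one Alpaca-style training sample.
# The prompt template is an assumption; the repo may word it differently.
def make_sample(poet: str, title: str, poem: str) -> dict:
    return {
        # "Write a poem in {poet}'s style"
        "instruction": f"模仿{poet}的风格,写一首诗",
        "input": "",
        "output": f"《{title}》\n{poem}",
    }

sample = make_sample(
    "李白", "静夜思",
    "床前明月光,疑是地上霜。\n举头望明月,低头思故乡。",
)
print(sample["instruction"])
```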

Project Structure

  • data/download.py: Downloads the Tang poetry corpus from GitHub via HTTP
  • data/preprocess.py: Preprocesses and formats the dataset for instruction tuning
  • train/train.py: Runs QLoRA training using SFTTrainer
  • eval/eval.py: Computes BLEU and perplexity metrics
  • eval/qualitative.py: Prints side-by-side base vs fine-tuned generations
  • eval/generate.py: Single-shot inference with the fine-tuned model
  • outputs/checkpoint/: Pre-trained LoRA adapter and configs

How It Works

  1. Data Download: Run data/download.py to fetch the raw JSON corpus from chinese-poetry.
  2. Preprocessing: Use data/preprocess.py to filter, convert, and format the poems into instruction-tuning samples, with a deterministic train/test split.
  3. Training: Launch training via train/train.py and train/config.yaml, which applies QLoRA to the Qwen model and fine-tunes on the dataset.
  4. Evaluation/Inference: Evaluate the model quantitatively (eval/eval.py), qualitatively (eval/qualitative.py), or generate new poems (eval/generate.py) using the provided adapter.
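The deterministic train/test split mentioned in step 2 can be sketched as a seeded shuffle. The seed value and split ratio below are illustrative assumptions, not the repo's actual settings:

```python
import random

def deterministic_split(samples, test_ratio=0.05, seed=42):
    """Shuffle with a fixed seed so every run yields the identical split,
    keeping test poems from leaking into the training set."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(samples) * test_ratio))
    test = [samples[i] for i in idx[:n_test]]
    train = [samples[i] for i in idx[n_test:]]
    return train, test

poems = [f"poem_{i}" for i in range(100)]
train, test = deterministic_split(poems)
train2, test2 = deterministic_split(poems)
assert test == test2  # same seed -> identical split on every run
```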

Results

Fine-tuning yields dramatic improvements: BLEU-4 scores increase by orders of magnitude, and perplexity drops from 54.3 to 19.8 on the test set. The fine-tuned model generates poetry much closer in style and content to reference Tang poems, as shown in both quantitative metrics and qualitative samples.
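Since perplexity is the exponential of the mean per-token cross-entropy loss, the reported drop from 54.3 to 19.8 corresponds to roughly a one-nat reduction in loss. A quick check:

```python
import math

# Perplexities reported above for base vs fine-tuned model.
ppl_base, ppl_ft = 54.3, 19.8

# Perplexity = exp(mean negative log-likelihood), so the implied
# mean loss (in nats per token) is just the natural log.
loss_base = math.log(ppl_base)
loss_ft = math.log(ppl_ft)
print(f"mean loss: {loss_base:.2f} -> {loss_ft:.2f} nats/token")
```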