# Minimind-Pytorch

**Repository Path**: olivegame/minimind-pytorch

## Basic Information

- **Project Name**: Minimind-Pytorch
- **Description**: A learning exercise in reproducing MiniMind
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-17
- **Last Updated**: 2026-03-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: minimind

## README

# MiniMind-Pytorch

## Introduction
A learning exercise in reproducing MiniMind. `minimind.pptx` collects some of the concepts covered along the way, together with links 🔗

## Installation

1.  Create a new virtual environment:

```bash
conda create -n minimind python=3.10
# Activate the environment so the packages below are installed into it
conda activate minimind
```

2.  Install the dependencies:

```bash
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
```

## Training and Evaluation

#### MiniMind2-series-moe model

![Dataset](images/dataset.jpg)

Training was run on four RTX 3090 GPUs.
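Before launching a multi-GPU run, it can help to confirm that a dataset file passed via `--data_path` parses cleanly. The following is a minimal sketch, assuming only the usual JSONL convention of one JSON object per line; the helper name `summarize_jsonl` is hypothetical and not part of this repository, and no particular field names are assumed.

```python
import json
from pathlib import Path


def summarize_jsonl(path, max_errors=5):
    """Count parseable records in a JSONL file and collect the top-level keys seen.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    total, bad_lines, keys = 0, [], set()
    with Path(path).open(encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # Remember only the first few offending line numbers
                if len(bad_lines) < max_errors:
                    bad_lines.append(lineno)
                continue
            total += 1
            if isinstance(record, dict):
                keys.update(record.keys())
    return {"records": total, "bad_lines": bad_lines, "keys": sorted(keys)}
```

Calling `summarize_jsonl("../dataset/pretrain_hq.jsonl")` returns the record count, the line numbers of the first few lines that failed to parse, and the set of keys found, which makes a quick sanity check before a long training job.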

1.  Pretraining (learning knowledge):

```bash
CUDA_VISIBLE_DEVICES=3,4,5,6 \
torchrun \
--standalone \
--nproc_per_node=4 \
train_pretrain.py \
--data_path ../dataset/pretrain_hq.jsonl \
--use_wandb \
--wandb_project MiniMind-Pretrain \
--save_weight pretrain \
--use_moe 1
```

Training curves:

![Training curves](https://foruda.gitee.com/images/1773739999179764430/36d2bec0_13961851.png "Screenshot")
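The pretraining command above exposes physical GPUs 3 to 6 through `CUDA_VISIBLE_DEVICES` and starts four processes with `--nproc_per_node=4`. Inside each process, CUDA renumbers the visible devices from 0, so the four GPUs appear as devices 0 to 3. A minimal sketch of a pre-launch check that the process count does not exceed the number of exposed GPUs; the function names `visible_gpu_ids` and `check_launch` are hypothetical and not part of this repository.

```python
import os


def visible_gpu_ids(env=None):
    """Return the ids listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: every GPU on the machine is visible
    return [part.strip() for part in raw.split(",") if part.strip()]


def check_launch(nproc_per_node, env=None):
    """Raise ValueError if more processes are requested than GPUs are exposed."""
    ids = visible_gpu_ids(env)
    if ids is not None and nproc_per_node > len(ids):
        raise ValueError(
            f"nproc_per_node={nproc_per_node} but only {len(ids)} GPU(s) are visible: {ids}"
        )
    return ids
```

For the settings used here, `check_launch(4, {"CUDA_VISIBLE_DEVICES": "3,4,5,6"})` passes, while requesting five processes with the same variable raises an error.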

2.  Supervised fine-tuning (learning the conversation format)

- First, train on sft_512.jsonl

```bash
CUDA_VISIBLE_DEVICES=3,4,5,6 \
torchrun \
--nproc_per_node=4 \
--master_port=29501 \
train_full_sft.py \
--use_wandb \
--wandb_project MiniMind-Full-SFT_512 \
--data_path ../dataset/sft_512.jsonl \
--save_weight full_sft \
--use_moe 1 \
--batch_size 32 \
--num_workers 8 \
--use_compile 0
```

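Training curves:

![SFT training curves (sft_512)](https://foruda.gitee.com/images/1773740087126420461/6f609670_13961851.png "Screenshot")

- Then continue on sft_2048.jsonl, starting from the first-stage weights (`--from_weight full_sft`)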

```bash
CUDA_VISIBLE_DEVICES=3,4,5,6 \
torchrun \
--nproc_per_node=4 \
--master_port=29501 \
train_full_sft.py \
--use_wandb \
--wandb_project MiniMind-Full-SFT_2048 \
--data_path ../dataset/sft_2048.jsonl \
--save_weight full_sft \
--use_moe 1 \
--batch_size 32 \
--num_workers 8 \
--use_compile 0 \
--from_weight full_sft
```

Training curves:

![SFT training curves (sft_2048)](https://foruda.gitee.com/images/1773740136103520388/c8c2f587_13961851.png "Screenshot")
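Both SFT commands above pass `--batch_size 32` together with `--nproc_per_node=4`. Assuming `--batch_size` is the per-process batch size, as is typical for data-parallel training with `torchrun` (this is not verified against the training script here), the number of samples consumed per optimizer step can be sketched as follows. The `accumulation_steps` parameter is a hypothetical addition for the case where gradient accumulation is used.

```python
import math


def effective_batch_size(per_device_batch, world_size, accumulation_steps=1):
    """Samples consumed per optimizer step in data-parallel training."""
    return per_device_batch * world_size * accumulation_steps


def steps_per_epoch(num_samples, per_device_batch, world_size, accumulation_steps=1):
    """Optimizer steps needed to see every sample once (last step may be partial)."""
    per_step = effective_batch_size(per_device_batch, world_size, accumulation_steps)
    return math.ceil(num_samples / per_step)


# SFT settings from the commands above: 32 per process on 4 GPUs
print(effective_batch_size(32, 4))  # 128
```

Under that assumption, each SFT optimizer step sees 128 samples when no gradient accumulation is applied.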

3.  Reinforcement learning with DPO

```bash
CUDA_VISIBLE_DEVICES=3,4,5,6 \
torchrun \
--nproc_per_node=4 \
--master_port=29501 \
train_dpo.py \
--use_wandb \
--wandb_project MiniMind-DPO \
--data_path ../dataset/dpo.jsonl \
--save_weight dpo \
--use_moe 1 \
--batch_size 8 \
--num_workers 8 \
--use_compile 0 \
--from_weight full_sft
```

Training curves:

![DPO training curves](https://foruda.gitee.com/images/1773740176528283851/5ea1eb97_13961851.png "Screenshot")
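For readers new to DPO, the per-pair objective from the standard DPO formulation can be sketched in plain Python. This illustrates the published loss, not the code in `train_dpo.py`, which is not shown here; the function name and the default `beta=0.1` are illustrative assumptions. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # How much more (or less) likely each response is under the policy
    # than under the reference model, in log space
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) equals softplus(-x); branch to avoid overflow in exp
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference, the margin is zero and the loss equals `log(2)`. The loss drops below that value as the policy raises the chosen response relative to the rejected one, and rises above it in the opposite case.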

4.  Evaluation

```bash
python eval_llm.py \
--load_from model \
--save_dir out \
--weight dpo \
--use_moe 1
```

Let's ask it a few simple questions \(^o^)/~:

![Sample model responses](https://foruda.gitee.com/images/1773739823471926576/cdf9bed6_13961851.png "Screenshot")

Looks like it is still a rather dim-witted GPT 😄