1 change: 1 addition & 0 deletions llm/README.md
@@ -14,6 +14,7 @@ The following notebooks are actively maintained in sync with MindSpore and MindS
| 2 | [distilgpt2](./distilgpt2/) | Includes notebooks for DistilGPT-2 finetuning and inference on causal language modeling (text generation) tasks. |
| 3 | [bert](./bert/) | Includes notebooks for finetuning BERT on SWAG dataset for Multiple Choice tasks using MindSpore NLP |
| 4 | [esm](./esmforproteinfolding/) | Includes notebooks for EsmForProteinFolding finetuning and inference tasks |
| 5 | [ernie4.5](./ernie4_5/) | Includes notebooks for Ernie 4.5 inference tasks |
Copilot AI Jan 23, 2026

The capitalization in the model name reference is inconsistent. Line 17 uses "Ernie 4.5" while the notebook title and description use "ERNIE 4.5" (all caps). For consistency with the notebook's own usage and typical naming conventions, this should be "ERNIE 4.5".

Suggested change
| 5 | [ernie4.5](./ernie4_5/) | Includes notebooks for Ernie 4.5 inference tasks |
| 5 | [ERNIE 4.5](./ernie4_5/) | Includes notebooks for ERNIE 4.5 inference tasks |


### Community-Driven / Legacy Applications

316 changes: 316 additions & 0 deletions llm/ernie4_5/inference_ernie4_5.ipynb
@@ -0,0 +1,316 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0c7132da",
"metadata": {},
"source": [
"# ERNIE 4.5 Model Inference and Applications with MindSpore NLP\n",
"\n",
"## Overview\n",
"\n",
"This lab shows how to load, run inference with, and build applications on the ERNIE 4.5 large language model, using the MindSpore 2.7.0 AI framework and the MindSpore NLP 0.5.1 toolkit on Ascend 800I/T A2 hardware.\n",
"\n",
"ERNIE 4.5 is Baidu's open-source family of large models, covering both dense and Mixture-of-Experts (MoE) architectures, with strong performance on Chinese understanding, multimodal interaction, and long-text processing. This case study demonstrates how to load the model weights quickly through MindSpore's `AutoClass` interfaces and build a dialogue application on top of the model.\n",
"\n",
"## Environment\n",
"\n",
"This case study targets **Ascend 800I/T A2** hardware, with the following software stack:\n",
"\n",
"| Python | MindSpore | MindSpore NLP |\n",
"| :----- | :-------- | :------------ |\n",
"| 3.10 | 2.7.0 | 0.5.1 |"
]
},
{
"cell_type": "markdown",
"id": "20bb5f2e",
"metadata": {},
"source": [
"### Installing Dependencies\n",
"\n",
"First, we need to install MindNLP and related dependency libraries. If they are not yet installed in your environment, run the following commands:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3041a225",
"metadata": {},
"outputs": [],
"source": [
"# Install MindSpore NLP\n",
"# !pip install mindnlp==0.5.1\n",
"# Install common text-processing libraries\n",
"# !pip install jieba\n",
"# !pip install sentencepiece"
]
},
{
"cell_type": "markdown",
"id": "eca64203",
"metadata": {},
"source": [
"### Configuring the Runtime\n",
"\n",
"Import the required libraries and set the MindSpore execution mode. For large-model inference, we use Ascend as the compute backend."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d59a07c",
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"import mindspore\n",
"import mindnlp\n",
"\n",
"# Use the Ascend backend for large-model inference\n",
"mindspore.set_context(device_target=\"Ascend\")\n",
"\n",
"print(f\"MindSpore version: {mindspore.__version__}\")\n",
"print(f\"MindNLP version: {mindnlp.__version__}\")"
]
},
{
"cell_type": "markdown",
"id": "188668d7",
"metadata": {},
"source": [
"## Data Preparation\n",
"\n",
"For large-model inference, we usually do not need to download a large training dataset the way typical CV or NLP tasks do. In real application development, however, we may still need to prepare specific prompts or test cases.\n",
"\n",
"Here we create a simple test set that simulates inputs from an application scenario."
Comment on lines +81 to +85
Copilot AI Jan 23, 2026

The section titled "数据准备" (Data Preparation) at lines 85-89 is somewhat misleading. Unlike typical ML tasks that require dataset preparation, this section simply creates a list of test prompts. The narrative acknowledges this, but the section title might still cause confusion. Consider renaming it to "测试用例准备" (Test Case Preparation) or "推理样本准备" (Inference Sample Preparation) to reflect the content more accurately.
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1471afc3",
"metadata": {},
"outputs": [],
"source": [
"# Simulated application-scenario inputs (test prompts kept in Chinese)\n",
"test_cases = [\n",
"    \"请简要介绍一下什么是混合专家模型(MoE)?\",\n",
"    \"写一首关于秋天丰收的七言绝句。\",\n",
"    \"请分析以下句子的情感倾向:'这家餐厅的服务真是太糟糕了,我再也不会来了。'\",\n",
"    \"使用Python写一个冒泡排序算法。\"\n",
"]\n",
"\n",
"print(\"Test cases ready.\")"
]
},
{
"cell_type": "markdown",
"id": "71617376",
"metadata": {},
"source": [
"## Building and Loading the Model\n",
"\n",
"This section demonstrates how to load the ERNIE 4.5 model using MindSpore NLP's `Transformers` interface."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8f53408",
"metadata": {},
"outputs": [],
"source": [
"# Load the tokenizer\n",
"# The tokenizer converts natural-language text into token IDs the model can understand.\n",
"\n",
"from mindnlp.transformers import AutoTokenizer\n",
"from mindnlp.transformers import AutoModelForCausalLM\n",
"\n",
"MODEL_NAME = \"baidu/ERNIE-4.5-0.3B-PT\"\n",
"\n",
"print(f\"Loading tokenizer: {MODEL_NAME} ...\")\n",
"try:\n",
"    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)\n",
"    print(\"Tokenizer loaded.\")\n",
"except Exception as e:\n",
"    print(f\"Failed to load tokenizer; check the network or the model name. Error: {e}\")\n",
"\n",
"# Load the model\n",
"\n",
"print(f\"Loading model: {MODEL_NAME} ...\")\n",
"\n",
"# Load the model weights\n",
"# mindspore_dtype=mindspore.bfloat16 uses the more efficient bfloat16 precision on NPU\n",
"try:\n",
"    model = AutoModelForCausalLM.from_pretrained(\n",
"        MODEL_NAME,\n",
"        mindspore_dtype=mindspore.bfloat16\n",
"    ).to('npu')\n",
"    # Put the model in evaluation mode\n",
"    model.set_train(False)\n",
"    print(\"Model loaded on NPU (bfloat16).\")\n",
"except Exception as e:\n",
"    print(f\"Failed to load the model. Error: {e}\")"
]
},
{
"cell_type": "markdown",
"id": "b4775da6",
"metadata": {},
"source": [
"## Application Development: Building a Dialogue Generation Function\n",
"\n",
"To make multi-turn dialogue and task-specific inference convenient, we wrap the model's generation procedure in a single function, analogous to the \"validation\" or \"inference\" step in the ResNet case study."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd42b1ca",
"metadata": {},
"outputs": [],
"source": [
"def chat_with_ernie(query, history=[], max_length=2048, temperature=0.7, top_p=0.9):\n",
Copilot AI Jan 23, 2026

The history parameter is defined with a default mutable argument (empty list). This is a common Python pitfall that can lead to unexpected behavior across multiple function calls. The default list will be shared across all calls to the function when no history is provided. Although the function doesn't currently modify the history parameter, it's better to use None as the default and initialize inside the function: history=None and then if history is None: history = [] in the function body.
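The pitfall the comment describes can be shown in isolation. The `chat` function below is a hypothetical stand-in for `chat_with_ernie`, not part of the notebook:

```python
def chat(query, history=None):
    # Use None as the sentinel value: a literal [] default would be created
    # once at definition time and shared by every call that omits history.
    if history is None:
        history = []
    history.append(query)
    return history

# Each call without an explicit history starts from a fresh list.
first = chat("hello")
second = chat("world")
```

With a `[]` default, `second` would come back as `["hello", "world"]`, because both calls would mutate the same list object.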
Copilot AI Jan 23, 2026

The function documentation states this is a "对话生成函数" (dialogue generation function) that accepts a history parameter for multi-turn conversations, but the implementation doesn't support this functionality. The function name chat_with_ernie also implies chat/dialogue capability. If multi-turn conversation support is intended for future enhancement, consider renaming the function to generate_with_ernie to better reflect its current single-turn generation capability, or implement the history handling as the name and documentation suggest.
"    \"\"\"\n",
"    Dialogue generation function based on ERNIE 4.5.\n",
"    \n",
"    Args:\n",
"        query (str): The user's question\n",
"        history (list): Conversation history\n",
"        max_length (int): Maximum generation length\n",
"        temperature (float): Sampling temperature; controls output diversity\n",
"        top_p (float): Nucleus-sampling threshold\n",
"    \n",
"    Returns:\n",
"        str: The model's generated answer\n",
"    \"\"\"\n",
"    # 1. Build the prompt\n",
"    # Note: this example targets the pretrained ERNIE 4.5 model and tokenizes the raw query directly, without a chat template.\n",
"    # For a chat-tuned ERNIE 4.5 model, first assemble history and query into a prompt with its official chat template, then tokenize.\n",
" inputs = tokenizer(query, return_tensors=\"ms\")\n",
Comment on lines +173 to +190
Copilot AI Jan 23, 2026

The history parameter is documented in the docstring and accepted as a function argument, but it is never actually used in the function implementation. The function only processes the current query without incorporating any conversation history. Either the history parameter should be implemented to support multi-turn conversations, or it should be removed from both the function signature and the docstring.
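As the comment notes, `history` is accepted but never used. A minimal, framework-independent sketch of folding history into the prompt follows; the `User:`/`Assistant:` template is a hypothetical placeholder, not ERNIE's official chat template:

```python
def build_prompt(query, history=None):
    # Concatenate past (user, assistant) turns and the new query into one
    # prompt string, which would then be passed to the tokenizer.
    history = history or []
    parts = [f"User: {u}\nAssistant: {a}" for u, a in history]
    parts.append(f"User: {query}\nAssistant:")
    return "\n".join(parts)

prompt = build_prompt("And in French?", history=[("Say hi.", "Hi!")])
```

A chat-tuned model would replace this ad-hoc template with the one it was fine-tuned on, typically applied via the tokenizer's chat-template support.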
"    # Move input tensors to the NPU, matching the model's device\n",
"    inputs = {k: v.to('npu:0') for k, v in inputs.items()}\n",
"    \n",
"    # 2. Generation configuration\n",
"    # In MindSpore 2.7 + MindSpore NLP 0.5.1, the generate interface works much like Huggingface's\n",
"    outputs = model.generate(\n",
"        inputs[\"input_ids\"],\n",
"        max_length=max_length,\n",
Comment on lines +196 to +198
Copilot AI Jan 23, 2026

The max_length parameter is used in the generate() call, but according to modern transformer APIs (including HuggingFace transformers which MindNLP is designed to be compatible with), max_length represents the total length including input tokens. For generation tasks, it's more common and clearer to use max_new_tokens to specify only the number of tokens to generate, excluding the input. This prevents confusion and ensures consistent behavior regardless of input length. Consider using max_new_tokens instead of max_length if supported by MindNLP 0.5.1.
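The difference between the two limits can be made concrete. This helper is purely illustrative, not part of MindNLP:

```python
def tokens_to_generate(prompt_len, max_length=None, max_new_tokens=None):
    # max_new_tokens bounds the generated tokens directly, so the budget
    # does not shrink as prompts grow; max_length bounds the total sequence,
    # prompt included, and can even leave a budget of zero.
    if max_new_tokens is not None:
        return max_new_tokens
    return max(0, max_length - prompt_len)

# A 100-token prompt under max_length=2048 leaves 1948 new tokens,
# while max_new_tokens=256 always yields 256.
budget_total = tokens_to_generate(100, max_length=2048)
budget_new = tokens_to_generate(100, max_new_tokens=256)
```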
" do_sample=True,\n",
" temperature=temperature,\n",
" top_p=top_p,\n",
" pad_token_id=tokenizer.pad_token_id,\n",
" eos_token_id=tokenizer.eos_token_id\n",
" )\n",
" \n",
"    # 3. Decode the output: decode only the newly generated portion, so the input text is never clipped by mistake\n",
" generated_ids = outputs[0][inputs[\"input_ids\"].shape[-1]:]\n",
" response = tokenizer.decode(generated_ids, skip_special_tokens=True)\n",
" \n",
" return response.strip()\n",
"\n",
"print(\"Inference function ready.\")\n"
]
},
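The `temperature` and `top_p` arguments passed to `generate` above control sampling. A minimal, framework-independent sketch of nucleus (top-p) filtering over a toy probability distribution:

```python
def top_p_filter(probs, top_p=0.9):
    # Keep the smallest set of highest-probability tokens whose cumulative
    # probability reaches top_p, then renormalize over that set.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        total += p
        if total >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {idx: p / norm for idx, p in kept}

# With top_p=0.7, only the two most likely tokens survive filtering.
filtered = top_p_filter([0.5, 0.3, 0.1, 0.1], top_p=0.7)
```

Lower `top_p` truncates the tail of the distribution more aggressively, trading diversity for determinism, which is why the notebook's default of 0.9 keeps generation fairly varied.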
{
"cell_type": "markdown",
"id": "87b53bb4",
"metadata": {},
"source": [
"## Results\n",
"\n",
"In this section, we run the test cases prepared in Section 3 through the ERNIE 4.5 model to demonstrate its capabilities across different task domains."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b65e8acc",
"metadata": {},
"outputs": [],
"source": [
"# Knowledge Q&A task\n",
"# Tests the model's grasp of domain knowledge.\n",
"\n",
"query_1 = test_cases[0] # The MoE question\n",
"print(f\"Q: {query_1}\")\n",
"\n",
"start_time = time.time()\n",
"response_1 = chat_with_ernie(query_1)\n",
"end_time = time.time()\n",
"\n",
"print(f\"A: {response_1}\")\n",
"print(f\"Inference time: {end_time - start_time:.2f} s\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f36aedc7",
"metadata": {},
"outputs": [],
"source": [
"# Creative-writing task\n",
"# Tests the model's creative writing ability.\n",
"\n",
"query_2 = test_cases[1] # Compose a poem\n",
"print(f\"Q: {query_2}\")\n",
"response_2 = chat_with_ernie(query_2)\n",
"print(f\"A: \\n{response_2}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9972bcce",
"metadata": {},
"outputs": [],
"source": [
"# Sentiment-analysis task\n",
"# Tests the model's understanding of sentiment in natural language.\n",
"\n",
"query_3 = test_cases[2] # Sentiment analysis\n",
"print(f\"Q: {query_3}\")\n",
"response_3 = chat_with_ernie(query_3)\n",
"print(f\"A: \\n{response_3}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33c0675a",
"metadata": {},
"outputs": [],
"source": [
"# Logic and code-generation task\n",
"# Tests the model's logical reasoning and coding ability.\n",
"\n",
"query_4 = test_cases[3] # Bubble sort\n",
"print(f\"Q: {query_4}\")\n",
"response_4 = chat_with_ernie(query_4)\n",
"print(f\"A: \\n{response_4}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "mind",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.14"
Comment on lines +22 to +311
Copilot AI Jan 23, 2026

There is a discrepancy between the documented Python version and the kernel metadata. Line 22 states the environment uses "Python 3.10", but the kernel metadata at line 316 shows "version": "3.11.14". This inconsistency could confuse users about the actual requirements. Please ensure the documented version matches the tested environment, or clarify that multiple Python versions are supported.
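One way to keep such drift from surprising users is a runtime guard in the notebook's first cell. A sketch, assuming the documented minimum of Python 3.10:

```python
import sys

def python_ok(min_version=(3, 10)):
    # Compare the running interpreter against the documented minimum version.
    return sys.version_info[:2] >= min_version

# The notebook documents Python 3.10; warn early if the kernel is older.
if not python_ok():
    print(f"Warning: Python {sys.version_info.major}.{sys.version_info.minor} "
          "is older than the documented 3.10 requirement.")
```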
}
},
"nbformat": 4,
"nbformat_minor": 5
}