Commit 7439092

Authored by: wangbochao789, sunjinghua.vendor, WangDaBenShi, sunjh2222
refactor: update the inference service UI, improve GPU count selection and tooltip, remove the unused key exchange endpoint (#66)

* Update submodule version
* refactor: update the inference service UI, improve GPU count selection and tooltip, remove the unused key exchange endpoint
* fix: docx/xlsx/pptx files were detected as zip type
* merge main
* Update submodule version
* fix: docx/xlsx/pptx files were detected as zip type
* chore: update API endpoints in environment files and enhance user agreement component with scroll position tracking

---------

Co-authored-by: sunjinghua.vendor <[email protected]>
Co-authored-by: wangbochao.vendor <[email protected]>
Co-authored-by: sjh <[email protected]>
1 parent e1036a6 commit 7439092

9 files changed

Lines changed: 266 additions & 310 deletions

back/src/utils/util_file_validation.py

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ def validate_file_type(
     current_pos = file_obj.tell()
     file_obj.seek(0)
 
-    file_data = file_obj.read(512)
+    file_data = file_obj.read()
     file_obj.seek(current_pos)
 
     if len(file_data) == 0:
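
The commit message ties this change to docx/xlsx/pptx uploads being detected as plain zip files. The diff does not show the detection logic itself, so the following is only a minimal sketch of one plausible mechanism, using the standard-library `zipfile` module. The function name `detect_ooxml_kind` and the prefix table are hypothetical and not taken from the repository. The sketch shows why a 512-byte prefix may not be enough: a zip archive keeps its central directory at the end of the file, so member names such as `word/document.xml` cannot be listed from a leading slice.

```python
import io
import zipfile

# Hypothetical mapping from top-level archive folders to Office formats.
OOXML_PREFIXES = {"word/": "docx", "xl/": "xlsx", "ppt/": "pptx"}


def detect_ooxml_kind(file_data: bytes) -> str:
    """Return 'docx', 'xlsx', 'pptx', 'zip', or 'unknown' for raw file bytes."""
    # Zip-based files start with the local file header signature.
    if not file_data.startswith(b"PK\x03\x04"):
        return "unknown"
    try:
        with zipfile.ZipFile(io.BytesIO(file_data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        # The central directory is missing, e.g. because only a prefix
        # of the file was read. All we can say is "some kind of zip".
        return "zip"
    # Office Open XML packages carry a content-types manifest at the root.
    if "[Content_Types].xml" not in names:
        return "zip"
    for prefix, kind in OOXML_PREFIXES.items():
        if any(name.startswith(prefix) for name in names):
            return kind
    return "zip"
```

Under these assumptions, passing the full file contents yields the Office type, while passing only the first 512 bytes of the same file falls back to `zip`, which matches the symptom described in the commit message.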

front/.env

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 FRONTEND_DEPLOY_STAGE=DEVELOPMENT
 NEXT_PUBLIC_EDITION=SELF_HOSTED
-FRONTEND_CORE_API=https://112.111.7.69:30382/console/api
-FRONTEND_APP_API=https://112.111.7.69:30382/api
+FRONTEND_CORE_API=https://10.210.0.49:5023/console/api
+FRONTEND_APP_API=https://10.210.0.49:5023/api
 
 # SENTRY
 NEXT_PUBLIC_SENTRY_DSN=

front/.env.example

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 FRONTEND_DEPLOY_STAGE=DEVELOPMENT
 NEXT_PUBLIC_EDITION=SELF_HOSTED
-FRONTEND_CORE_API=https://112.111.7.69:30382/console/api
-FRONTEND_APP_API=https://112.111.7.69:30382/api
+FRONTEND_CORE_API=https://10.210.0.49:5023/console/api
+FRONTEND_APP_API=https://10.210.0.49:5023/api
 
 # SENTRY
 NEXT_PUBLIC_SENTRY_DSN=

front/app/(appLayout)/inferenceService/platform/page.tsx

Lines changed: 39 additions & 37 deletions
@@ -513,7 +513,7 @@ const InferenceService = () => {
             <Input placeholder="请输入服务名称" />
           </Form.Item>
           <PlusCircleOutlined
-            style={{ color: '#0E5DD8' }}
+            style={{ color: '#0E5DD8', cursor: 'pointer' }}
             onClick={() => {
               const current = form.getFieldValue(['services', name]) || {}
               add({ name: current?.name || '', model_num_gpus: current?.model_num_gpus || undefined })
@@ -522,47 +522,49 @@ const InferenceService = () => {
           {index !== 0 && (
             <MinusCircleOutlined
               className="dynamic-delete-button"
-              style={{ color: '#0E5DD8' }}
+              style={{ color: '#0E5DD8', cursor: 'pointer' }}
               onClick={() => remove(name)}
             />
           )}
         </div>
-        <Form.Item
-          name={[name, 'model_num_gpus']}
-          validateTrigger={['onChange', 'onBlur']}
-          rules={[
-            {
-              required: true,
-              type: 'number',
-              message: '请输入显卡数量',
-            },
-            {
-              validator: (_, value) => {
-                if (!Number.isInteger(value) || value < 1)
-                  return Promise.reject(new Error('显卡数量需为大于等于1的整数'))
-                return Promise.resolve()
+        <div className='flex items-center gap-[8px]'>
+          <Form.Item
+            name={[name, 'model_num_gpus']}
+            validateTrigger={['onChange', 'onBlur']}
+            rules={[
+              {
+                required: true,
+                message: '',
               },
-            },
-          ]}
-          style={{ marginBottom: 0 }}
-        >
-          <div className='flex items-center gap-[8px]'>
-            <Select placeholder="分配显卡数量" style={{ width: '80%' }} options={[{ label: '1', value: 1 }, { label: '2', value: 2 }, { label: '4', value: 4 }, { label: '8', value: 8 }]} />
-            <Tooltip
-              placement="top"
-              title={
-                <div className='text-xs leading-relaxed max-w-[280px]'>
-                  <p>运行大模型占用的显存主要由以下组成(以Qwen3-32B、精度为FP16为例计算,占用显存约为64+8+2=74G):</p>
-                  <p>1)模型权重:32B × 2 = 64 GB,固定不变。</p>
-                  <p>2)KV缓存:2 × 并发数 × 32K × 64 × 128 × 8 × 2,示例为并发1上下文32K,约8G;并发翻倍显存同步增加。</p>
-                  <p>3)激活值与开销:推理中间计算与框架额外占用约1-2G。</p>
-                </div>
-              }
-            >
-              <QuestionCircleOutlined style={{ color: '#0E5DD8' }} />
-            </Tooltip>
-          </div>
-        </Form.Item>
+              {
+                validator: (_, value) => {
+                  if (value === undefined || value === null || value === '')
+                    return Promise.reject(new Error('请输入显卡数量'))
+                  const numValue = Number(value)
+                  if (!Number.isInteger(numValue) || numValue < 1)
+                    return Promise.reject(new Error('显卡数量需为大于等于1的整数'))
+                  return Promise.resolve()
+                },
+              },
+            ]}
+            style={{ width: '80%', marginBottom: 0 }}
+          >
+            <Select placeholder="分配显卡数量" style={{ width: '100%' }} options={[{ label: '1', value: 1 }, { label: '2', value: 2 }, { label: '4', value: 4 }, { label: '8', value: 8 }]} />
+          </Form.Item>
+          <Tooltip
+            placement="top"
+            title={
+              <div className='text-xs leading-relaxed max-w-[280px]'>
+                <p>运行大模型占用的显存主要由以下组成(以Qwen3-32B、精度为FP16为例计算,占用显存约为64+8+2=74G):</p>
+                <p>1)模型权重:32B × 2 = 64 GB,固定不变。</p>
+                <p>2)KV缓存:2 × 并发数 × 32K × 64 × 128 × 8 × 2,示例为并发1上下文32K,约8G;并发翻倍显存同步增加。</p>
+                <p>3)激活值与开销:推理中间计算与框架额外占用约1-2G。</p>
+              </div>
+            }
+          >
+            <QuestionCircleOutlined style={{ color: '#0E5DD8', cursor: 'pointer' }} />
+          </Tooltip>
+        </div>
       </div>
     </Form.Item>
   ))}
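
The tooltip text in the hunk above is in Chinese. In English it says that GPU memory for a large model is made up of three parts, using Qwen3-32B at FP16 as the example (about 64 + 8 + 2 = 74 G): 1) model weights, 32B × 2 = 64 GB, fixed; 2) KV cache, 2 × concurrency × 32K × 64 × 128 × 8 × 2, about 8 G at concurrency 1 with a 32K context, growing in proportion to concurrency; 3) activations and framework overhead, about 1 to 2 G. The sketch below reproduces that arithmetic. The tooltip does not label its factors, so reading 64 as the layer count, 128 as the head dimension, 8 as the number of KV heads, and the trailing 2 as bytes per FP16 value is an assumption, as are all function and parameter names.

```python
GIB = 1024 ** 3  # bytes in one GiB


def kv_cache_bytes(concurrency: int, context_tokens: int,
                   num_layers: int = 64, head_dim: int = 128,
                   num_kv_heads: int = 8, bytes_per_value: int = 2) -> int:
    """KV cache size following the tooltip formula (factor meanings assumed)."""
    # Leading 2: one tensor for keys and one for values.
    return (2 * concurrency * context_tokens * num_layers
            * head_dim * num_kv_heads * bytes_per_value)


def estimate_vram_gb(concurrency: int = 1, context_tokens: int = 32 * 1024,
                     num_params: float = 32e9, bytes_per_param: int = 2,
                     overhead_gb: float = 2.0) -> float:
    """Rough total in G, mixing GB and GiB loosely as the tooltip does."""
    weights_gb = num_params * bytes_per_param / 1e9  # 32B x 2 bytes = 64
    kv_gb = kv_cache_bytes(concurrency, context_tokens) / GIB
    return weights_gb + kv_gb + overhead_gb


if __name__ == "__main__":
    print(estimate_vram_gb())               # tooltip example: 74.0
    print(estimate_vram_gb(concurrency=2))  # KV cache doubles: 82.0
```

Such an estimate could help a user choose among the 1, 2, 4, and 8 GPU options offered by the `Select`, although the per-GPU memory it should be compared against is not stated in the diff.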

front/app/api/internal/key_exchange/route.ts

Lines changed: 0 additions & 57 deletions
This file was deleted.
