diff --git a/docs/doc/en/vision/hand_gesture_classification.md b/docs/doc/en/vision/hand_gesture_classification.md
index 500b64e1..51434b90 100644
--- a/docs/doc/en/vision/hand_gesture_classification.md
+++ b/docs/doc/en/vision/hand_gesture_classification.md
@@ -2,45 +2,265 @@
title: MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection
---
+
## Introduction
The `MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection` can classify various hand gestures.
-The current dataset used is the `14-class static hand gesture dataset` with a total of 2850 samples divided into 14 categories.
-[Dataset Download Link (Baidu Netdisk, Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g)
+The following describes how to preprocess [AI-model-estimated hand landmarks](./hand_landmarks.md) into features, which are then classified with LinearSVC (a linear Support Vector Machine classifier). The detailed implementation is in `MaixPy/projects/app_hand_gesture_classifier/LinearSVC.py`, and a usage example can be found in the app implementation at `MaixPy/projects/app_hand_gesture_classifier/main.py`.
-
+**Users can add any distinguishable hand gestures for training.**
+
+## Usage
+
+### Preprocessing
+The raw `hand_landmarks` output of the AI model is preprocessed as follows to derive usable features:
+
+```python
+import numpy as np
+
+def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
+    # 21 landmarks, each with x, y (and optionally z)
+    hand_landmarks = np.array(hand_landmarks).reshape((21, -1))
+    vector = hand_landmarks[:,:2]
+    # offsets of the 20 remaining landmarks relative to the wrist (landmark 0)
+    vector = vector[1:] - vector[0]
+    # normalize by the image boundary so features are resolution-independent
+    vector = vector.astype('float64') / boundary[:vector.shape[1]]
+    if not is_left: # mirror right hands so both hands share one feature space
+        vector[:,0] *= -1
+    return vector
+```
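+
+As a quick sanity check, here is a minimal sketch of calling `preprocess` on made-up landmark values (real input comes from the hand landmark detector):
+
+```python
+# Hypothetical flat landmark list: 21 points x 3 coords, mostly zeros,
+# just to illustrate the shapes; not real detector output
+dummy = [0.0] * (21 * 3)
+dummy[3] = 10.0  # x of landmark 1
+dummy[4] = 20.0  # y of landmark 1
+
+feat = preprocess(dummy, is_left=False, boundary=(320, 224, 1))
+print(feat.shape)  # (20, 2): 20 wrist-relative offsets, normalized by width/height
+```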
+
+### Import Modules
+The `LinearSVC` module ships with the built-in app; add its directory to `sys.path` as shown below, or copy the `LinearSVC.py` implementation directly from that directory (`target_dir` below):
+```python
+# To import LinearSVC
+target_dir = '/maixapp/apps/hand_gesture_classifier/'
+import sys
+if target_dir not in sys.path:
+ sys.path.insert(0, target_dir)
+
+from LinearSVC import LinearSVC, LinearSVCManager
+```
+
+### Classifier (LinearSVC)
+
+Introduction to the LinearSVC classifier's functions and usage.
+
+#### Initialization, Loading, and Exporting
+```python
+# Initialize
+clf = LinearSVC(C=1.0, learning_rate=0.01, max_iter=500)
+# Load
+clf = LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz")
+# Export
+clf.save("my_clf_dump.npz")
+```
+*Initialization Method Parameters*
+1. C=1.0 (Regularization Parameter)
+   - Controls the strength of the regularization in the SVM.
+   - A larger C value punishes misclassifications more strictly, potentially leading to overfitting.
+   - A smaller C value tolerates some misclassification, improving generalization but potentially underfitting.
+   Default: 1.0, a balance between accuracy and generalization.
+
+2. learning_rate=0.01 (Learning Rate)
+   - Controls the step size of each weight update during gradient descent.
+   - Too large a learning rate may make the optimization diverge; too small a value makes convergence slow.
+   Default: 0.01, a moderate value that approaches the optimum gradually.
+
+3. max_iter=500 (Maximum Iterations)
+   - Specifies the maximum number of optimization rounds during training.
+   - More iterations give the model more chances to converge, but too many may overfit.
+   - Too small a max_iter may stop the optimization prematurely, leading to underfitting.
+   Default: 1000 (the example above passes 500 explicitly), enough iterations to ensure convergence.
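+
+If you are unsure how to choose these values, a simple sweep over candidate `C` values is a reasonable starting point. Below is a minimal sketch that reuses `X_train`/`y_train` from the training example in the next section (ideally you would score on held-out data rather than the training set):
+
+```python
+# Sketch: compare a few C values by training accuracy
+for C in (0.1, 1.0, 10.0):
+    clf = LinearSVC(C=C, learning_rate=0.01, max_iter=500)
+    clf.fit(clf.scaler.fit_transform(X_train), y_train)
+    acc = np.mean(clf.predict(clf.scaler.transform(X_train)) == y_train)
+    print(f"C={C}: train acc {acc:.3f}")
+```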
+
+*Loading and Exporting Method Parameters*
+1. filename: str
+   - The target file path; both relative and absolute paths are supported.
+   - Required; there is no default value.
+
+#### Training and Prediction (Classification)
+After initialization, the classifier must be trained before it can be used for classification.
+
+If you load a previously trained classifier, it can be used for classification directly.
+
+**Note: every training run is a full retrain, so previous training results are lost. Export a backup of the classifier as needed.**
+
+```python
+import numpy as np
+
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"] # Raw features
+y_train = npzfile["y"] # Label ids
+
+clf.fit(clf.scaler.fit_transform(X_train), y_train) # Train the SVM after feature normalization
+
+# Evaluate on the training set
+y_pred = clf.predict(clf.scaler.transform(X_train)) # Predict labels after feature normalization
+recall_count = len(y_train)
+right_count = np.sum(y_pred == y_train)
+print(f"right/recall= {right_count}/{recall_count}, acc: {right_count/recall_count}")
+
+# Prediction
+X_test = X_train[:5]
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+# y_pred = clf.predict(feature_test) # Predict labels only
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+print(f"pred: {y_pred}, conf: {y_conf}")
+# Corresponding class names:
+# name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink")
+```
+
+Since every training run is a full retrain, you must maintain the storage of previously trained features and their labels yourself in order to dynamically add or remove classes.
+
+To simplify usage and reduce this extra bookkeeping, a `Classifier Manager (LinearSVCManager)` is provided, described in the next section.
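+
+For reference, that manual bookkeeping looks roughly like the following sketch, where `X_new`/`y_new` are hypothetical freshly collected features and labels; this is exactly what `LinearSVCManager` automates:
+
+```python
+# Keep all samples around, append the new ones, then do a full retrain
+X_all = np.concatenate([X_train, X_new]) # X_new: hypothetical new features
+y_all = np.concatenate([y_train, y_new]) # y_new: hypothetical new labels
+clf.fit(clf.scaler.fit_transform(X_all), y_all) # full retrain from scratch
+np.savez("trainSets.npz", X=X_all, y=y_all) # persist for the next session
+```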
+
+
+### Classifier Manager (LinearSVCManager)
+
+Introduction to the LinearSVCManager's functions and usage.
+
+#### Initialization, Loading, and Exporting
+
+Both initialization and loading require valid `X` and `Y` (features and their corresponding labels).
+
+Their lengths must be equal and the elements must correspond one-to-one; otherwise an error is raised.
+
+```python
+# Constructor signature, used for both initialization and loading:
+# def __init__(self, clf: LinearSVC=LinearSVC(), X=None, Y=None, pretrained=False)
+
+# Initialize with default LinearSVC parameters
+clfm = LinearSVCManager(X=X_train, Y=y_train)
+# Initialize with specific LinearSVC parameters
+clfm = LinearSVCManager(LinearSVC(C=1.0, learning_rate=0.01, max_iter=100), X_train, y_train)
+
+# To load, pass a loaded LinearSVC and set pretrained=True to skip a pointless retrain on the spot;
+# make sure X_train, y_train are the data that the LinearSVC was previously trained on
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True)
+
+# Export parameters via the inner LinearSVC's (clfm.clf) save
+clfm.clf.save("my_clf_dump.npz")
+# Export the features and labels used for training
+np.savez("trainSets.npz",
+         X = X_train,
+         y = y_train,
+         )
+```
-This app is implemented in `MaixPy/projects/app_hand_gesture_classifier/main.py`, and the main logic is as follows:
+#### Accessing Training Data Used
+
+`clfm.samples` is a Python tuple:
+1. `clfm.samples[0]` is `X`
+2. `clfm.samples[1]` is `Y`
+
+**Treat it as read-only and do not modify it directly. If you do change it, you must call `clfm.train()` yourself to retrain the model.**
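+
+For example, a read-only sketch that counts how many stored samples each class currently has:
+
+```python
+X_stored, y_stored = clfm.samples # read-only view of the stored training data
+classes, counts = np.unique(y_stored, return_counts=True)
+for c, n in zip(classes, counts):
+    print(f"class {c}: {n} samples")
+```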
+
+#### Adding or Removing
+
+**When adding, ensure `X_new` and `y_new` have the same length and that their shapes match the previous `X_train` and `y_train`.**
+
+All of them are numpy arrays; you can check via their `shape` attribute.
+
+```python
+# Add new data
+clfm.add(X_new, y_new)
+
+# Remove data
+mask_ge_4 = clfm.samples[1] >= 4 # Mask for class labels >= 4
+indices_ge_4 = np.where(mask_ge_4)[0]
+clfm.rm(indices_ge_4)
+```
+
+These operations mainly maintain `clfm.samples`, and each call ends by invoking `clfm.train()` to retrain the model.
+
+So, depending on the amount of training data, the updated model is ready to use after a short wait.
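+
+As an illustration, here is a sketch of registering a brand-new gesture as class index `14` (the next free index after the 14 built-in classes); `collect_feature()` is a hypothetical helper that runs `preprocess` on one detected hand and is not part of the library:
+
+```python
+NEW_CLASS_ID = 14 # next free index after the built-in classes
+X_new = np.array([collect_feature().flatten() for _ in range(20)]) # 20 samples of the new gesture
+y_new = np.full(len(X_new), NEW_CLASS_ID)
+clfm.add(X_new, y_new) # triggers a full retrain internally
+```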
+
+
+#### Prediction
+
+```python
+y_pred, y_conf = clfm.test(X_test) # Predict labels with confidence
+```
+
+This is equivalent to:
+
+```python
+clf = clfm.clf
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+```
+
+#### Example (Simplified Version of the Demo Video App)
+
+Note:
+- The `preprocess` implementation is omitted; copy it from the Preprocessing section.
+- The `LinearSVC` module import is omitted; copy it from the Import Modules section.
+
+With those pasted in, the classification and prediction part below runs as a single file:
+
+```python
+from maix import camera, display, image, nn, app
+import numpy as np
+
+# Paste the preprocess() implementation and the LinearSVC import snippet here
+
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink") # Easy-to-understand class names
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"]
+y_train = npzfile["y"]
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True) # Initialize LinearSVCManager with the preloaded classifier
+
+detector = nn.HandLandmarks(model="/root/models/hand_landmarks.mud")
+cam = camera.Camera(320, 224, detector.input_format())
+disp = display.Display()
+
+# Loading screen
+img = cam.read()
+img.draw_string(100, 112, "Loading...\nwait up to 10s", color = image.COLOR_GREEN)
+disp.show(img)
+
+while not app.need_exit():
+    img = cam.read()
+    objs = detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8)
+    for obj in objs:
+        hand_landmarks = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1)) # Preprocessing
+        features = np.array([hand_landmarks.flatten()])
+        class_idx, pred_conf = clfm.test(features) # Get the predicted class
+        class_idx, pred_conf = class_idx[0], pred_conf[0] # Batch in, batch out; take the first element
+        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}\n{name_classes[class_idx]}({class_idx})={pred_conf*100:.2f}%'
+        img.draw_string(obj.points[0], obj.points[1], msg, color = image.COLOR_RED if obj.class_id == 0 else image.COLOR_GREEN, scale = 1.4, thickness = 2)
+        detector.draw_hand(img, obj.class_id, obj.points, 4, 10, box=True)
+    disp.show(img)
+```
+
+The current `X_train` is based on the `14-class static hand gesture dataset`, which contains 2850 samples divided into 14 classes. The dataset can be downloaded from [Baidu Netdisk (password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g).
+
+
+
-1. Load the `14-class static hand gesture dataset` processed by the **Hand Keypoint Detection** model, extracting `20` relative wrist coordinate offsets.
-2. Initially train on the first `4` classes to support basic gesture recognition.
-3. Use the **Hand Keypoint Detection** model to process the camera input and visualize classification results on the screen.
-4. Tap the top-right `class14` button to add more samples and retrain the model for full `14-class` gesture recognition.
-5. Tap the bottom-right `class4` button to remove the added samples and retrain the model back to the `4-class` mode.
-6. Tap the small area between the buttons to display the last training duration at the top of the screen.
-7. Tap the remaining large area to show the currently supported gesture classes on the left side—**green** for supported, **yellow** for unsupported.
## Demo Video
-
+The implementation of this app is located at `MaixPy/projects/app_hand_gesture_classifier/main.py`, with the main logic as follows:
-1. The video demonstrates the `14-class` mode after executing step `4`, recognizing gestures `1-10` (default mapped to other meanings), **OK**, **thumbs up**, **finger heart** (requires the back of the hand, hard to demonstrate in the video but can be verified), and **pinky stretch**—a total of `14` gestures.
+1. Load the `14-class static hand gesture dataset`, already processed by `hand keypoint detection` into `20` coordinate offsets relative to the wrist per sample.
+2. Initially train on the first `4` classes **or directly load the pre-trained `14`-class classifier parameters (switchable in the source code)** to support gesture recognition.
+3. Load the `hand keypoint detection` model, process the camera input, and visualize the classifier's results on the screen.
+4. Click `class14` in the upper right corner to add the remaining classes' samples and retrain up to `14`-class gesture recognition.
+5. Click `class4` in the lower right corner to remove the samples added in the previous step and retrain back to `4`-class gesture recognition.
+6. Click the small area between the buttons to display the duration of the last training run at the top.
+7. Click the remaining large area to display the currently supported classes on the left: green means supported, yellow means not supported.
-2. Then, step `5` is executed, reverting to the `4-class` mode, where only gestures **1**, **5**, **10** (fist), and **OK** are recognizable. Other gestures fail to produce correct results. During this process, step `7` was also executed, showing the current `4-class` mode—only the first 4 gestures are green, and the remaining 10 are yellow.
+
-3. Step `4` is executed again, restoring the `14-class` mode, and previously unrecognized gestures in the `4-class` mode are now correctly identified.
+1. The demo video shows the `14-class` mode after executing step `4` **or the bold part of step `2`**. It recognizes gestures `1-10` (mapped to other English meanings by default), OK, thumbs-up, finger heart (requires the back of the hand, hard to demonstrate on camera but verifiable), and pinky stretch: `14` gestures in total.
+2. Then step `5` is executed to revert to the `4-class` mode, where only gestures `1`, `5`, `10` (fist), and OK are recognized; the remaining gestures no longer produce correct results. Step `7` is also executed, confirming the current `4-class` mode: only the first 4 gestures are shown in green, while the other 10 are yellow.
+3. Step `4` is executed again to restore the `14-class` mode, and the gestures the `4-class` mode could not recognize are correctly identified once more.
+4. Finally, two-handed recognition is demonstrated: gestures from both hands are identified correctly at the same time.
-4. Finally, dual-hand recognition is demonstrated, and both hands' gestures are accurately recognized simultaneously.
## Others
-The demo video captures the **maixvision** screen preview window in the top-right corner, matching the actual on-screen display.
-
-For detailed implementation, please refer to the source code and the above analysis.
+**The demo video captures the screen preview window in the upper right corner of MaixVision, which matches the actual on-screen display.**
-Further development or modification can be directly done based on the source code, which includes comments for guidance.
+**For more detailed usage or secondary development, read the source code together with the analysis above; it contains comments for guidance.**
-If you need additional assistance, feel free to leave a message on **MaixHub** or send an email to the official company address.
+If you still have questions or need assistance, post on `MaixHub` or send an email to the company at `support@sipeed.com`. **Please use the subject `[help][MaixPy] gesture classification: xxx`**.
\ No newline at end of file
diff --git a/docs/doc/en/vision/hand_landmarks.md b/docs/doc/en/vision/hand_landmarks.md
index 325beb95..f850f1cb 100644
--- a/docs/doc/en/vision/hand_landmarks.md
+++ b/docs/doc/en/vision/hand_landmarks.md
@@ -1,5 +1,5 @@
---
-tite: 3D Coordinate Detection of 21 Hand Keypoints with MaixPy MaixCAM
+title: 3D Coordinate Detection of 21 Hand Keypoints with MaixPy MaixCAM
update:
- date: 2024-12-31
version: v1.0
diff --git a/docs/doc/zh/vision/hand_gesture_classification.md b/docs/doc/zh/vision/hand_gesture_classification.md
index 9447adee..de7f1921 100644
--- a/docs/doc/zh/vision/hand_gesture_classification.md
+++ b/docs/doc/zh/vision/hand_gesture_classification.md
@@ -7,44 +7,260 @@ title: MaixCAM MaixPy 基于手部关键点检测结果进行进行手势分类
Hand gestures can be classified by `MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection`.
-The dataset currently used is the `14-class static hand gesture dataset` ([download link (Baidu Netdisk, password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g)); it contains 2850 samples in 14 classes.
+Features are obtained from the upstream [AI model estimating hand landmarks](./hand_landmarks.md), and LinearSVC (a linear Support Vector Machine classifier) provides self-trainable classification of arbitrary gestures. Details are in `MaixPy/projects/app_hand_gesture_classifier/LinearSVC.py`; for a usage example see the app implementation at `MaixPy/projects/app_hand_gesture_classifier/main.py`.
+
+**Users can add any other distinguishable gestures for training.**
+
+## Usage
+
+### Preprocessing
+The raw `hand_landmarks` output of the AI hand landmark model is preprocessed as follows to derive usable features:
+
+```python
+import numpy as np
+
+def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
+    # 21 landmarks, each with x, y (and optionally z)
+    hand_landmarks = np.array(hand_landmarks).reshape((21, -1))
+    vector = hand_landmarks[:,:2]
+    # offsets of the 20 remaining landmarks relative to the wrist (landmark 0)
+    vector = vector[1:] - vector[0]
+    # normalize by the image boundary so features are resolution-independent
+    vector = vector.astype('float64') / boundary[:vector.shape[1]]
+    if not is_left: # mirror right hands so both hands share one feature space
+        vector[:,0] *= -1
+    return vector
+```
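+
+As a quick sanity check, here is a minimal sketch of calling `preprocess` on made-up landmark values (real input comes from the hand landmark detector):
+
+```python
+# Hypothetical flat landmark list: 21 points x 3 coords, mostly zeros,
+# just to illustrate the shapes; not real detector output
+dummy = [0.0] * (21 * 3)
+dummy[3] = 10.0  # x of landmark 1
+dummy[4] = 20.0  # y of landmark 1
+
+feat = preprocess(dummy, is_left=False, boundary=(320, 224, 1))
+print(feat.shape)  # (20, 2): 20 wrist-relative offsets, normalized by width/height
+```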
+
+### Import Modules
+The `LinearSVC` module ships with the built-in app; add its directory to `sys.path` as shown below, or copy the `LinearSVC.py` implementation directly from that directory (`target_dir` below):
+```python
+# To import LinearSVC
+target_dir = '/maixapp/apps/hand_gesture_classifier/'
+import sys
+if target_dir not in sys.path:
+ sys.path.insert(0, target_dir)
+
+from LinearSVC import LinearSVC, LinearSVCManager
+```
+
+### Classifier (LinearSVC)
+
+Introduction to the LinearSVC classifier's functions and usage.
+
+#### Initialization, Loading, and Exporting
+```python
+# Initialize
+clf = LinearSVC(C=1.0, learning_rate=0.01, max_iter=500)
+# Load
+clf = LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz")
+# Export
+clf.save("my_clf_dump.npz")
+```
+*Initialization Method Parameters*
+1. C=1.0 (Regularization Parameter)
+   - Controls the regularization strength of the SVM.
+   - A larger C punishes misclassification more heavily; the model tries to classify every sample strictly, which may overfit.
+   - A smaller C tolerates some misclassification, improving generalization but possibly underfitting.
+   Default: 1.0, moderate regularization balancing accuracy and generalization.
+
+2. learning_rate=0.01 (Learning Rate)
+   - Controls the step size of weight updates, i.e. how fast parameters change in each gradient descent step.
+   - Too large a learning rate may keep the optimization from converging, or even make it diverge.
+   - Too small a learning rate makes convergence slow and training long.
+   Default: 0.01, a moderate value that approaches the optimum gradually.
+
+3. max_iter=500 (Maximum Iterations)
+   - The maximum number of optimization rounds during training.
+   - More iterations give the model more chances to find the optimum, but too many may make training slow or overfit.
+   - Too small a max_iter may stop before convergence, leading to underfitting.
+   Default: 1000 (the example above passes 500 explicitly), enough training rounds for convergence.
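+
+If you are unsure how to choose these values, a simple sweep over candidate `C` values is a reasonable starting point. Below is a minimal sketch that reuses `X_train`/`y_train` from the training example in the next section (ideally you would score on held-out data rather than the training set):
+
+```python
+# Sketch: compare a few C values by training accuracy
+for C in (0.1, 1.0, 10.0):
+    clf = LinearSVC(C=C, learning_rate=0.01, max_iter=500)
+    clf.fit(clf.scaler.fit_transform(X_train), y_train)
+    acc = np.mean(clf.predict(clf.scaler.transform(X_train)) == y_train)
+    print(f"C={C}: train acc {acc:.3f}")
+```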
+
+*Loading and Exporting Method Parameters*
+1. filename: str
+   - The target file path; both relative and absolute paths are supported.
+   - Required; there is no default value.
+
+#### Training and Prediction (Classification)
+After initialization, the classifier must be properly trained before it can handle classification tasks.
+
+If you directly load a previously exported classifier backup, it can be used for classification right away.
+
+**Every training run is a full retrain, so previous training results are lost. Recommendation: export a backup of the current classifier as needed.**
+
+```python
+import numpy as np
+
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"] # Raw features
+y_train = npzfile["y"] # Label ids
+
+clf.fit(clf.scaler.fit_transform(X_train), y_train) # Train the SVM after feature normalization
+
+# Evaluate on the training set
+y_pred = clf.predict(clf.scaler.transform(X_train)) # Predict labels after feature normalization
+recall_count = len(y_train)
+right_count = np.sum(y_pred == y_train)
+print(f"right/recall= {right_count}/{recall_count}, acc: {right_count/recall_count}")
+
+# Prediction
+X_test = X_train[:5]
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+# y_pred = clf.predict(feature_test) # Predict labels only
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+print(f"pred: {y_pred}, conf: {y_conf}")
+# Corresponding class names:
+# name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink")
+```
+
+Since every training run is a full retrain, when using the classifier directly you must also maintain the storage of previously trained features and their labels yourself in order to dynamically add or remove classes.
+
+To simplify usage and reduce this extra burden, a `Classifier Manager (LinearSVCManager)` is provided, described in the next section.
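+
+For reference, that manual bookkeeping looks roughly like the following sketch, where `X_new`/`y_new` are hypothetical freshly collected features and labels; this is exactly what `LinearSVCManager` automates:
+
+```python
+# Keep all samples around, append the new ones, then do a full retrain
+X_all = np.concatenate([X_train, X_new]) # X_new: hypothetical new features
+y_all = np.concatenate([y_train, y_new]) # y_new: hypothetical new labels
+clf.fit(clf.scaler.fit_transform(X_all), y_all) # full retrain from scratch
+np.savez("trainSets.npz", X=X_all, y=y_all) # persist for the next session
+```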
+
+
+### Classifier Manager (LinearSVCManager)
+
+Introduction to the LinearSVCManager's functions and usage.
+
+#### Initialization, Loading, and Exporting
+
+Both `initialization` and `loading` require valid `X`, `Y` (features and their corresponding labels) inputs.
+
+Their lengths must be equal and the elements must correspond one-to-one; otherwise an error is raised.
+
+```python
+# Constructor signature, used for both initialization and loading:
+# def __init__(self, clf: LinearSVC=LinearSVC(), X=None, Y=None, pretrained=False)
+
+# Initialize with default LinearSVC parameters
+clfm = LinearSVCManager(X=X_train, Y=y_train)
+# Initialize with specific LinearSVC parameters
+clfm = LinearSVCManager(LinearSVC(C=1.0, learning_rate=0.01, max_iter=100), X_train, y_train)
+
+# To load, pass a loaded LinearSVC and set pretrained=True to skip a pointless retrain on the spot;
+# make sure X_train, y_train are the data that the LinearSVC was previously trained on
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True)
+
+# Export parameters via the inner LinearSVC's (clfm.clf) save
+clfm.clf.save("my_clf_dump.npz")
+# Export the features and labels used for training
+np.savez("trainSets.npz",
+         X = X_train,
+         y = y_train,
+         )
+```
+
+#### Accessing the Training Data in Use
+
+`clfm.samples` is a Python 2-tuple:
+1. `clfm.samples[0]` is `X`
+2. `clfm.samples[1]` is `Y`
+
+**Treat it as read-only and do not modify it directly. If you do change it, you must call `clfm.train()` manually to retrain.**
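+
+For example, a read-only sketch that counts how many stored samples each class currently has:
+
+```python
+X_stored, y_stored = clfm.samples # read-only view of the stored training data
+classes, counts = np.unique(y_stored, return_counts=True)
+for c, n in zip(classes, counts):
+    print(f"class {c}: {n} samples")
+```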
+
+#### Adding or Removing
+
+**When adding, ensure `X_new` and `y_new` have the same length and that their shapes match the previous `X_train` and `y_train`.**
+
+All of them are numpy arrays; you can check via their `shape` attribute.
+
+```python
+# Add new data
+clfm.add(X_new, y_new)
+
+# Remove data
+mask_ge_4 = clfm.samples[1] >= 4 # Mask for class labels >= 4
+indices_ge_4 = np.where(mask_ge_4)[0]
+clfm.rm(indices_ge_4)
+```
+
+These operations mainly maintain `clfm.samples`, and each call ends by invoking `clfm.train()` to retrain the model.
+
+So, depending on the amount of training data, the updated model can be used directly after a short wait.
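+
+As an illustration, here is a sketch of registering a brand-new gesture as class index `14` (the next free index after the 14 built-in classes); `collect_feature()` is a hypothetical helper that runs `preprocess` on one detected hand and is not part of the library:
+
+```python
+NEW_CLASS_ID = 14 # next free index after the built-in classes
+X_new = np.array([collect_feature().flatten() for _ in range(20)]) # 20 samples of the new gesture
+y_new = np.full(len(X_new), NEW_CLASS_ID)
+clfm.add(X_new, y_new) # triggers a full retrain internally
+```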
+
+
+#### Prediction
+
+```python
+y_pred, y_conf = clfm.test(X_test) # Predict labels with confidence
+```
+
+This is equivalent to:
+
+```python
+clf = clfm.clf
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+```
+
+#### Example (Simplified Version of the Demo Video App)
+
+Note:
+- The `preprocess` implementation is omitted; copy it from the `Preprocessing` section.
+- The `LinearSVC` module is omitted; copy it from the `Import Modules` section.
+
+With those pasted in, the classification and prediction part below runs as a single file:
+
+```python
+from maix import camera, display, image, nn, app
+import numpy as np
+
+# Paste the preprocess() implementation and the LinearSVC import snippet here
+
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink") # append "class N", "class N+1", ... as needed; easy-to-understand class names
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"]
+y_train = npzfile["y"]
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True) # Initialize LinearSVCManager with the preloaded classifier
+
+detector = nn.HandLandmarks(model="/root/models/hand_landmarks.mud")
+cam = camera.Camera(320, 224, detector.input_format())
+disp = display.Display()
+
+# Loading screen
+img = cam.read()
+img.draw_string(100, 112, "Loading...\nwait up to 10s", color = image.COLOR_GREEN)
+disp.show(img)
+
+while not app.need_exit():
+    img = cam.read()
+    objs = detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8)
+    for obj in objs:
+        hand_landmarks = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1)) # Preprocessing
+        features = np.array([hand_landmarks.flatten()])
+        class_idx, pred_conf = clfm.test(features) # Get the predicted class
+        class_idx, pred_conf = class_idx[0], pred_conf[0] # Batch in, batch out; take the first element
+        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}\n{name_classes[class_idx]}({class_idx})={pred_conf*100:.2f}%'
+        img.draw_string(obj.points[0], obj.points[1], msg, color = image.COLOR_RED if obj.class_id == 0 else image.COLOR_GREEN, scale = 1.4, thickness = 2)
+        detector.draw_hand(img, obj.class_id, obj.points, 4, 10, box=True)
+    disp.show(img)
+```
+
+The current `X_train` is based on the original `14-class static hand gesture dataset` ([download link (Baidu Netdisk, password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g)), which contains 2850 samples divided into 14 classes.
+## Demo Video
+
The app implementation is at `MaixPy/projects/app_hand_gesture_classifier/main.py`; the main logic is:
1. Load the `14-class static hand gesture dataset`, already processed by `hand keypoint detection` into `20` coordinate offsets relative to the wrist per sample
-2. Initially train on the first `4` classes to support gesture recognition
+2. Initially train on the first `4` classes **or directly load the pre-trained `14`-class classifier parameters (switchable in the source code)** to support gesture recognition
3. Load the `hand keypoint detection` model, process the camera input, and visualize the classifier's results on the screen
4. Click `class14` in the upper right corner to add the remaining classes' samples and retrain up to `14`-class gesture recognition
5. Click `class4` in the lower right corner to remove the samples added in the previous step and retrain back to `4`-class gesture recognition
6. Click the small area between the buttons to display the duration of the last training run at the top
7. Click the remaining large area to display the currently supported classes on the left: green means supported, yellow means not supported
-
-
-## Demo Video
-1. The video shows the `14-class` mode after executing step `4` above: it recognizes gestures `1-10` (mapped to other English meanings by default), OK, thumbs-up, finger heart (requires the back of the hand, hard to demonstrate on camera but verifiable), and pinky stretch, `14` gestures in total.
-
+1. The demo video shows the `14-class` mode after executing step `4` above **or the bold part of step `2`**. It recognizes gestures `1-10` (mapped to other English meanings by default), OK, thumbs-up, finger heart (requires the back of the hand, hard to demonstrate on camera but verifiable), and pinky stretch: `14` gestures in total.
2. Then step `5` is executed to revert to the `4-class` mode, where only gestures `1`, `5`, `10` (fist), and OK are recognized; the remaining gestures no longer produce correct results. Step `7` is also executed, confirming the current `4-class` mode: only the first 4 gestures are shown in green, while the other 10 are yellow.
-
3. Step `4` is executed again to restore the `14-class` mode, and the gestures the `4-class` mode could not recognize are correctly identified once more.
-
4. Finally, two-handed recognition is demonstrated: gestures from both hands are identified correctly at the same time.
## Others
-The demo video captures the maixvision screen preview window in the top right corner, matching the actual on-screen display.
-
-For the detailed implementation, see the source code and the analysis above.
+**The demo video captures the screen preview window in the upper right corner of MaixVision, which matches the actual on-screen display.**
-Secondary development or modification can also be done directly on the source code, which contains comments.
+**For more detailed usage or secondary development, read the source code together with the analysis above; it contains comments for guidance.**
-If you still need assistance, you can post on maixhub or send an email to the company mailbox.
\ No newline at end of file
+If you still have questions or need assistance, post on `MaixHub` or send an email to the company at `support@sipeed.com`. **Please use the subject `[help][MaixPy] gesture classification: xxx`**.
\ No newline at end of file
diff --git a/docs/doc/zh/vision/hand_landmarks.md b/docs/doc/zh/vision/hand_landmarks.md
index 95b06062..c698237d 100644
--- a/docs/doc/zh/vision/hand_landmarks.md
+++ b/docs/doc/zh/vision/hand_landmarks.md
@@ -1,5 +1,5 @@
---
-tite: MaixPy MaixCAM 3D Coordinate Detection of 21 Hand Keypoints
+title: MaixPy MaixCAM 3D Coordinate Detection of 21 Hand Keypoints
update:
- date: 2024-12-31
version: v1.0
diff --git a/projects/app_hand_gesture_classifier/main.py b/projects/app_hand_gesture_classifier/main.py
index 414c5439..318ce700 100644
--- a/projects/app_hand_gesture_classifier/main.py
+++ b/projects/app_hand_gesture_classifier/main.py
@@ -45,7 +45,7 @@ def timer(name):
# print(f"X_train: {X_train[0]}")
# print(f"y_train: {y_train[0]}")
-if 0:
+if 1:
with timer("加载") as r:
clfm = LinearSVCManager(LinearSVC.load("clf_dump.npz"), X_train, y_train, pretrained=True)
last_train_time = r['passed']
@@ -130,7 +130,7 @@ def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
if not class_nums_changing and current_n_classes == 4:
class_nums_changing = True
if class_nums_changing:
- img.draw_string(30, 112, "Release to upgrade to class 14\n and please wait.", color = image.COLOR_RED)
+ img.draw_string(30, 112, "Release to upgrade to class 14\n and please wait for Training be done.", color = image.COLOR_RED)
else:
if class_nums_changing:
class_nums_changing = False
@@ -145,7 +145,7 @@ def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
if not class_nums_changing and current_n_classes == 14:
class_nums_changing = True
if class_nums_changing:
- img.draw_string(30, 112, "Release to retrain to class 4\n and please wait.", color = image.COLOR_RED)
+ img.draw_string(30, 112, "Release to retrain to class 4\n and please wait for Training be done.", color = image.COLOR_RED)
else:
if class_nums_changing:
class_nums_changing = False