diff --git a/docs/doc/en/vision/hand_gesture_classification.md b/docs/doc/en/vision/hand_gesture_classification.md
index 500b64e1..51434b90 100644
--- a/docs/doc/en/vision/hand_gesture_classification.md
+++ b/docs/doc/en/vision/hand_gesture_classification.md
@@ -2,45 +2,265 @@
 title: MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection
 ---
+
 ## Introduction
 
 The `MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection` can classify various hand gestures.
 
-The current dataset used is the `14-class static hand gesture dataset` with a total of 2850 samples divided into 14 categories.
-[Dataset Download Link (Baidu Netdisk, Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g)
+The following describes how features are preprocessed from the [hand landmarks estimated by the AI model](./hand_landmarks.md) and then classified with LinearSVC (a linear support vector machine classifier). The detailed implementation is in `MaixPy/projects/app_hand_gesture_classifier/LinearSVC.py`, and a usage example can be found in the app implementation in `MaixPy/projects/app_hand_gesture_classifier/main.py`.
 
-![](../../assets/handposex_14class.jpg)
+**Users can add any distinguishable hand gestures for training.**
+
+## Usage
+
+### Preprocessing
+The following function preprocesses the raw `hand_landmarks` output of the AI model into usable features:
+
+```python
+def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
+    hand_landmarks = np.array(hand_landmarks).reshape((21, -1))
+    vector = hand_landmarks[:,:2]    # keep only the x and y coordinates
+    vector = vector[1:] - vector[0]  # 20 offsets relative to the wrist point
+    vector = vector.astype('float64') / boundary[:vector.shape[1]]  # normalize by image size
+    if not is_left: # mirror right hands so both hands share one feature space
+        vector[:,0] *= -1
+    return vector
+```
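+
+To make the feature layout concrete, here is a minimal sketch of calling `preprocess` (the landmark values are dummy placeholders, not real model output): each hand yields a `(20, 2)` array of wrist-relative offsets, which is flattened into a 40-dimensional vector before being handed to the classifier.
+
+```python
+import numpy as np
+
+# A flat 21 x (x, y, z) list, standing in for the model's hand_landmarks output
+dummy_landmarks = list(range(21 * 3))
+
+feature = preprocess(dummy_landmarks, is_left=False, boundary=(320, 224, 1))
+print(feature.shape)            # (20, 2): 20 keypoint offsets relative to the wrist
+print(feature.flatten().shape)  # (40,): the flattened feature vector for the classifier
+```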
+
+### Import Modules
+The `LinearSVC` module ships with the built-in app, so add its directory to `sys.path` before importing. Alternatively, you can copy the `LinearSVC.py` implementation directly out of that directory (`target_dir`):
+```python
+# To import LinearSVC
+target_dir = '/maixapp/apps/hand_gesture_classifier/'
+import sys
+if target_dir not in sys.path:
+    sys.path.insert(0, target_dir)
+
+from LinearSVC import LinearSVC, LinearSVCManager
+```
+
+### Classifier (LinearSVC)
+
+This section introduces the LinearSVC classifier's functions and usage.
+
+#### Initialization, Loading, and Exporting
+```python
+# Initialize
+clf = LinearSVC(C=1.0, learning_rate=0.01, max_iter=500)
+# Load
+clf = LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz")
+# Export
+clf.save("my_clf_dump.npz")
+```
+*Initialization Method Parameters*
+1. C=1.0 (Regularization Parameter)
+    - Controls the strength of the regularization in the SVM.
+    - A larger C value punishes misclassifications more strictly, potentially leading to overfitting.
+    - A smaller C value allows some misclassifications, improving generalization but potentially underfitting.
+   Default: 1.0, a balanced trade-off between accuracy and generalization.
+
+2. learning_rate=0.01 (Learning Rate)
+    - Controls the step size of each weight update during gradient descent.
+    - Too large a learning rate may cause the optimization process to diverge,
+    - while too small a learning rate leads to slow convergence and long training times.
+   Default: 0.01, a moderate value that lets the model approach the optimal solution gradually.
+
+3. max_iter=500 (Maximum Iterations)
+    - Specifies the maximum number of optimization rounds during training.
+    - More iterations give the model more chances to converge, but too many may result in overfitting.
+    - Too small a max_iter value may stop optimization prematurely, leading to underfitting.
+   Default: 1000, sufficient iterations to ensure convergence.
+
+*Loading and Exporting Method Parameters*
+1. filename: str
+    - The target file path; both relative and absolute paths are supported.
+    - This is a required parameter.
+   Default: None
+
+#### Training and Prediction (Classification)
+After initialization, the classifier must be trained before it can be used for classification.
+
+If you load a previously trained classifier instead, it can be used for classification directly.
+
+**Note: Every training session is a full retrain, meaning previous training results are lost. It is recommended to export a backup of the classifier whenever needed.**
+
+```python
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"] # Raw features
+y_train = npzfile["y"] # Labels
+
+clf.fit(clf.scaler.fit_transform(X_train), y_train) # Train the SVM after feature normalization
+
+# Recall on the training set
+y_pred = clf.predict(clf.scaler.transform(X_train)) # Predict labels after feature normalization
+recall_count = len(y_train)
+right_count = np.sum(y_pred == y_train)
+print(f"right/recall= {right_count}/{recall_count}, acc: {right_count/recall_count}")
+
+# Prediction
+X_test = X_train[:5]
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+# y_pred = clf.predict(feature_test) # Predict labels
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+print(f"pred: {y_pred}, conf: {y_conf}")
+# Corresponding class names
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink")
+```
+
+Since every training run is a full retrain, when using the classifier directly you must store the previously used features and corresponding labels yourself in order to dynamically add or remove classes.
+
+To simplify usage and reduce this extra workload, the `Classifier Manager (LinearSVCManager)` described in the next section is provided as a wrapper.
+
+
+### Classifier Manager (LinearSVCManager)
+
+This section introduces the LinearSVCManager's functions and usage.
+
+#### Initialization, Loading, and Exporting
+
+Both initialization and loading must be given valid `X` and `Y` (features and their corresponding labels).
+
+The two must have equal length and correspond element-wise, otherwise an error is raised.
+
+```python
+# Initialization, Loading
+def __init__(self, clf: LinearSVC=LinearSVC(), X=None, Y=None, pretrained=False)
+
+# Initialize with default LinearSVC parameters
+clfm = LinearSVCManager(X=X_train, Y=y_train)
+# Initialize with specific LinearSVC parameters
+clfm = LinearSVCManager(LinearSVC(C=1.0, learning_rate=0.01, max_iter=100), X_train, y_train)
+
+# Loading requires a loaded LinearSVC and pretrained=True to avoid a pointless retrain on the spot
+# Make sure X_train, y_train really are the data previously used to train this LinearSVC
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True)
+
+# Export the parameters with the save method of the LinearSVC (clfm.clf)
+clfm.clf.save("my_clf_dump.npz")
+# Export the features and labels used for training
+np.savez("trainSets.npz",
+         X = X_train,
+         y = y_train,
+         )
+```
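+
+Because the classifier and its training set must be kept in sync, it can help to wrap the two exports above into a single backup/restore pair. Below is a minimal sketch under the same assumptions; the helper names and file paths are illustrative, not part of the library:
+
+```python
+def save_backup(clfm, clf_path="my_clf_dump.npz", sets_path="my_trainSets.npz"):
+    # Persist the SVM parameters together with the exact samples that trained them
+    clfm.clf.save(clf_path)
+    np.savez(sets_path, X=clfm.samples[0], y=clfm.samples[1])
+
+def load_backup(clf_path="my_clf_dump.npz", sets_path="my_trainSets.npz"):
+    npzfile = np.load(sets_path)
+    # pretrained=True skips retraining because the samples match the saved classifier
+    return LinearSVCManager(LinearSVC.load(clf_path),
+                            npzfile["X"], npzfile["y"], pretrained=True)
+```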
-This app is implemented in `MaixPy/projects/app_hand_gesture_classifier/main.py`, and the main logic is as follows:
+
+#### Accessing Training Data Used
+
+clfm.samples is a Python tuple:
+1. clfm.samples[0] is `X`
+2. clfm.samples[1] is `Y`
+
+**Treat it as read-only and do not modify it directly; if you do make changes, you must call `clfm.train()` manually to retrain the model.**
+
+#### Adding or Removing
+
+**When adding, make sure X_new and y_new have the same length, and that their shapes match the previous X_train and y_train.**
+
+All of them are NumPy arrays; you can check their shapes via the `shape` attribute.
+
+```python
+# Add new data
+clfm.add(X_new, y_new)
+
+# Remove data
+mask_ge_4 = clfm.samples[1] >= 4 # Mask for class labels >= 4
+indices_ge_4 = np.where(mask_ge_4)[0]
+clfm.rm(indices_ge_4)
+```
+
+These operations mainly update `clfm.samples`, and each of them ends by calling `clfm.train()` to retrain the model.
+
+So, after a short wait that depends on the size of the training set, the updated model can be used directly.
+
+
+#### Prediction
+
+```python
+y_pred, y_conf = clfm.test(X_test) # Predict labels
+```
+
+This is equivalent to:
+
+```python
+clf = clfm.clf
+feature_test = clf.scaler.transform(X_test) # Feature normalization
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # Predict labels with confidence
+```
+
+#### Example (Simplified Version of the Demo Video)
+
+Note:
+- The `preprocess` implementation is omitted here; copy it in from the Preprocessing section.
+- The `LinearSVC` module import is omitted here; copy it in from the Import Modules section.
+
+The classification and prediction part can be run as a single file:
+
+```python
+from maix import camera, display, image, nn, app
+import numpy as np
+
+# Add the missing pieces below me
+
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink") # Easy-to-understand class names
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # Preload features and labels (name_classes indices)
+X_train = npzfile["X"]
+y_train = npzfile["y"]
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True) # Initialize the LinearSVCManager with the preloaded classifier
+
+detector = nn.HandLandmarks(model="/root/models/hand_landmarks.mud")
+cam = camera.Camera(320, 224, detector.input_format())
+disp = display.Display()
+
+# Loading screen
+img = cam.read()
+img.draw_string(100, 112, "Loading...\nwait up to 10s", color = image.COLOR_GREEN)
+disp.show(img)
+
+while not app.need_exit():
+    img = cam.read()
+    objs = detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8)
+    for obj in objs:
+        hand_landmarks = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1)) # Preprocessing
+        features = np.array([hand_landmarks.flatten()])
+        class_idx, pred_conf = clfm.test(features) # Get the predicted class
+        class_idx, pred_conf = class_idx[0], pred_conf[0] # Batched inputs give batched outputs; take the first element
+        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}\n{name_classes[class_idx]}({class_idx})={pred_conf*100:.2f}%'
+        img.draw_string(obj.points[0], obj.points[1], msg, color = image.COLOR_RED if obj.class_id == 0 else image.COLOR_GREEN, scale = 1.4, thickness = 2)
+        detector.draw_hand(img, obj.class_id, obj.points, 4, 10, box=True)
+    disp.show(img)
+```
+
+The current `X_train` is based on the "14-Class Static Hand Gesture Dataset", which consists of 2850 samples divided into 14 classes. The dataset can be downloaded from the [Baidu Netdisk link (Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g).
+
+![](../../assets/handposex_14class.jpg)
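+
+As stated in the introduction, you are not limited to these 14 classes. The sketch below outlines how new gesture samples could be collected and registered as a 15th class. It assumes the `detector`, `cam`, and `clfm` objects from the example above, that the stock classes keep indices 0-13 (so the new label is `14`), and that you hold the new gesture in front of the camera while it runs; it is an illustration, not part of the library:
+
+```python
+def collect_and_add(clfm, detector, cam, n_samples=30, new_label=14):
+    # Collect n_samples feature vectors of the gesture currently shown to the
+    # camera, then register them as one new class.
+    feats = []
+    while len(feats) < n_samples:
+        img = cam.read()
+        for obj in detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8):
+            lm = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1))
+            feats.append(lm.flatten())
+    X_new = np.array(feats, dtype="float64")
+    y_new = np.full(len(X_new), new_label)
+    clfm.add(X_new, y_new)  # retrains internally, so this may take a moment
+
+# collect_and_add(clfm, detector, cam)
+# name_classes = name_classes + ("my_gesture",)  # also give the new class a display name
+```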
 
-1. Load the `14-class static hand gesture dataset` processed by the **Hand Keypoint Detection** model, extracting `20` relative wrist coordinate offsets.
-2. Initially train on the first `4` classes to support basic gesture recognition.
-3. Use the **Hand Keypoint Detection** model to process the camera input and visualize classification results on the screen.
-4. Tap the top-right `class14` button to add more samples and retrain the model for full `14-class` gesture recognition.
-5. Tap the bottom-right `class4` button to remove the added samples and retrain the model back to the `4-class` mode.
-6. Tap the small area between the buttons to display the last training duration at the top of the screen.
-7. Tap the remaining large area to show the currently supported gesture classes on the left side—**green** for supported, **yellow** for unsupported.
 
 ## Demo Video
 
-
+The implementation of this app is located at `MaixPy/projects/app_hand_gesture_classifier/main.py`, with the main logic as follows:
 
-1. The video demonstrates the `14-class` mode after executing step `4`, recognizing gestures `1-10` (default mapped to other meanings), **OK**, **thumbs up**, **finger heart** (requires the back of the hand, hard to demonstrate in the video but can be verified), and **pinky stretch**—a total of `14` gestures.
+1. Load the `14-class static hand gesture dataset`, already processed by `hand keypoint detection` into `20` coordinate offsets relative to the wrist per sample.
+2. Initially train on the first `4` classes, **or directly load the pre-trained `14-class` classifier parameters (switchable in the source code)**, to support gesture recognition.
+3. Load the `hand keypoint detection` model to process the camera input, and visualize the classifier's results on the screen.
+4. Click the `class14` button in the upper right corner to add the remaining classes' samples and retrain, reaching `14-class` gesture recognition.
+5. Click the `class4` button in the lower right corner to remove the samples added in the previous step and retrain, reverting to `4-class` gesture recognition.
+6. Click the small area between the buttons to display the duration of the last classifier training at the top.
+7. Click any other large area to display the currently supported classes on the left side: green means supported, yellow means not supported.
 
-2. Then, step `5` is executed, reverting to the `4-class` mode, where only gestures **1**, **5**, **10** (fist), and **OK** are recognizable. Other gestures fail to produce correct results. During this process, step `7` was also executed, showing the current `4-class` mode—only the first 4 gestures are green, and the remaining 10 are yellow.
+
-3. Step `4` is executed again, restoring the `14-class` mode, and previously unrecognized gestures in the `4-class` mode are now correctly identified.
+1. The demo video shows the `14-class` mode after executing step `4` **or the bold part of step `2`**. It can recognize gestures `1-10` (whose default labels are the corresponding English meanings), "OK", thumbs-up, finger heart (requires the back of the hand, hard to demonstrate in the video but verifiable on your own), and pinky stretch, a total of `14` gestures.
+2. Then, step `5` is executed to revert to the `4-class` mode, where only gestures `1`, `5`, `10` (fist), and "OK" can be recognized, while the remaining gestures no longer produce correct results. Step `7` is also executed to show that the current mode is `4-class`: only the first 4 gestures are displayed in green, and the remaining 10 are shown in yellow.
+3. Step `4` is executed again to restore the `14-class` mode, and the gestures that could not be recognized in the `4-class` mode are correctly identified again.
+4. Finally, recognition with both hands is demonstrated, showing that gestures from both hands can be identified correctly at the same time.
 
-4. Finally, dual-hand recognition is demonstrated, and both hands' gestures are accurately recognized simultaneously.
 
 ## Others
 
-The demo video captures the **maixvision** screen preview window in the top-right corner, matching the actual on-screen display.
-
-For detailed implementation, please refer to the source code and the above analysis.
+**The demo video is captured from the screen preview window in the upper right corner of MaixVision, and it matches what is actually shown on the device screen.**
 
-Further development or modification can be directly done based on the source code, which includes comments for guidance.
+**For more detailed usage or secondary development, please read the source code with the analysis above as a guide; it includes comments.**
 
-If you need additional assistance, feel free to leave a message on **MaixHub** or send an email to the official company address.
+If you still have questions or need assistance, you can post on `maixhub` or send an `e-mail` to the company at `support@sipeed.com`. **Please use the subject `[help][MaixPy] gesture classification: xxx`**.
\ No newline at end of file
diff --git a/docs/doc/en/vision/hand_landmarks.md b/docs/doc/en/vision/hand_landmarks.md
index 325beb95..f850f1cb 100644
--- a/docs/doc/en/vision/hand_landmarks.md
+++ b/docs/doc/en/vision/hand_landmarks.md
@@ -1,5 +1,5 @@
 ---
-tite: 3D Coordinate Detection of 21 Hand Keypoints with MaixPy MaixCAM
+title: 3D Coordinate Detection of 21 Hand Keypoints with MaixPy MaixCAM
 update:
   - date: 2024-12-31
     version: v1.0
diff --git a/docs/doc/zh/vision/hand_gesture_classification.md b/docs/doc/zh/vision/hand_gesture_classification.md
index 9447adee..de7f1921 100644
--- a/docs/doc/zh/vision/hand_gesture_classification.md
+++ b/docs/doc/zh/vision/hand_gesture_classification.md
@@ -7,44 +7,260 @@ title: MaixCAM MaixPy 基于手部关键点检测结果进行进行手势分类
 
 由`MaixCAM MaixPy 基于手部关键点检测结果进行进行手势分类`可分类手势。
 
-目前使用的数据集为`14 类静态手势数据集`,[数据集下载地址(百度网盘 Password: 6urr )](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g),数据集共 2850 个样本,分为 14 类。
+通过前置 [AI 模型估计手部关键点](./hand_landmarks.md) 获取特征,再由 LinearSVC(支持向量机线性分类算法)提供自训练分类各种手势的能力。详细实现位于 `MaixPy/projects/app_hand_gesture_classifier/LinearSVC.py`,使用案例见 app 实现,其位于 `MaixPy/projects/app_hand_gesture_classifier/main.py`。
+
+**用户可自行添加其他任意可区分手势进行训练。**
+
+## 使用
+
+### 预处理
+以下是对 AI 模型估计手部关键点的原始输出 hand_landmarks 数据结构进行预处理、得到待使用特征的实现:
+
+```python
+def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
+    hand_landmarks = np.array(hand_landmarks).reshape((21, -1))
+    vector = hand_landmarks[:,:2]    # 仅保留 x, y 坐标
+    vector = vector[1:] - vector[0]  # 20 个相对手腕点的偏移
+    vector = vector.astype('float64') / boundary[:vector.shape[1]]  # 按图像尺寸归一化
+    if not is_left: # mirror,镜像右手,使左右手共用同一特征空间
+        vector[:,0] *= -1
+    return vector
+```
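+
+为了更直观地理解特征形状,下面给出一个调用 `preprocess` 的最小示例(其中的关键点数值为示意用的占位数据,并非真实模型输出):每只手得到 `(20, 2)` 的相对手腕偏移数组,展平成 40 维特征向量后再送入分类器。
+
+```python
+import numpy as np
+
+# 21 个点 x (x, y, z) 的扁平列表,模拟模型输出的 hand_landmarks
+dummy_landmarks = list(range(21 * 3))
+
+feature = preprocess(dummy_landmarks, is_left=False, boundary=(320, 224, 1))
+print(feature.shape)            # (20, 2):20 个相对手腕的关键点偏移
+print(feature.flatten().shape)  # (40,):展平后送入分类器的特征向量
+```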
+
+### 导入模块
+`LinearSVC` 模块随内置 app 提供,导入前需先将其目录加入 `sys.path`;也可直接前往该目录(`target_dir`)拷贝 `LinearSVC.py` 实现:
+```python
+# 为了导入 LinearSVC
+target_dir = '/maixapp/apps/hand_gesture_classifier/'
+import sys
+if target_dir not in sys.path:
+    sys.path.insert(0, target_dir)
+
+from LinearSVC import LinearSVC, LinearSVCManager
+```
+
+### 分类器(LinearSVC)
+
+介绍分类器(LinearSVC)的各项功能和使用方法。
+
+#### 初始化,加载和导出
+```python
+# 初始化
+clf = LinearSVC(C=1.0, learning_rate=0.01, max_iter=500)
+# 加载
+clf = LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz")
+# 导出
+clf.save("my_clf_dump.npz")
+```
+*初始化方法参数*
+1. C=1.0(正则化参数)
+    - 控制支持向量机(SVM)的正则化强度。
+    - C 值越大,对误分类的惩罚越高,模型会尝试严格分类每个样本,可能会导致过拟合。
+    - C 值越小,允许一定程度的误分类,提高泛化能力,可能会导致欠拟合。
+   默认值:1.0,适中的正则化,平衡准确性和泛化能力。
+
+2. learning_rate=0.01(学习率)
+    - 控制权重更新的步长大小,即每次梯度下降优化时,参数调整的速度。
+    - 学习率过大,可能会导致优化过程无法收敛,甚至发散。
+    - 学习率过小,优化过程收敛速度过慢,训练时间较长。
+   默认值:0.01,通常为适中的学习率,确保模型逐步逼近最优解。
+
+3. max_iter=500(最大迭代次数)
+    - 控制训练过程中执行的最大优化轮数。
+    - 迭代次数越多,模型有更多机会找到最优解,但过多的迭代可能会导致训练时间过长或过拟合。
+    - 如果 max_iter 过小,可能在尚未收敛时就提前停止,导致欠拟合。
+   默认值:1000,允许模型有足够的训练轮次来收敛。
+
+*加载和导出方法参数*
+1. filename: str
+    - 目标文件路径,支持相对和绝对路径
+    - 必须提供
+   默认值:无
+
+#### 训练和预测(分类)
+分类器初始化后需要进行有效训练才能完成后续分类任务。
+
+若直接加载先前的分类器备份,即可直接用于分类。
+
+**每次训练都是全量训练,即会丢失先前训练结果。建议:有需要请及时导出当前分类器备份。**
+
+```python
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # 预加载特征和ID(name_classes 索引)
+X_train = npzfile["X"] # 原始特征
+y_train = npzfile["y"] # 标签id
+
+clf.fit(clf.scaler.fit_transform(X_train), y_train) # 标准化特征后训练SVM
+
+# 训练集召回统计
+y_pred = clf.predict(clf.scaler.transform(X_train)) # 标准化特征后预测类别
+recall_count = len(y_train)
+right_count = np.sum(y_pred == y_train)
+print(f"right/recall= {right_count}/{recall_count}, acc: {right_count/recall_count}")
+
+# 预测
+X_test = X_train[:5]
+feature_test = clf.scaler.transform(X_test) # 标准化特征
+# y_pred = clf.predict(feature_test) # 预测类别
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # 带置信度地预测类别
+print(f"pred: {y_pred}, conf: {y_conf}")
+# 对应的类别名
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink")
+```
+
+由于每次都是全量训练,直接使用分类器时,还需要手动维护先前训练的特征和对应标签的存储,才能实现动态增删改分类类别。
+
+为了简化使用并降低额外负担,现封装了 `分类器管理器(LinearSVCManager)`,见下节。
+
+
+### 分类器管理器(LinearSVCManager)
+
+介绍分类器管理器(LinearSVCManager)的各项功能和使用方法。
+
+#### 初始化,加载和导出
+
+无论 `初始化` 或 `加载` 都必须提供有效 X,Y(对应特征和标签)输入。
+
+且保证长度相等,元素一一对应,否则会报错。
+
+```python
+# 初始化,加载
+def __init__(self, clf: LinearSVC=LinearSVC(), X=None, Y=None, pretrained=False)
+
+# 使用默认参数的 LinearSVC 进行初始化
+clfm = LinearSVCManager(X=X_train, Y=y_train)
+# 使用特定参数的 LinearSVC 进行初始化
+clfm = LinearSVCManager(LinearSVC(C=1.0, learning_rate=0.01, max_iter=100), X_train, y_train)
+
+# 加载必须使用加载的 LinearSVC 并且指定 pretrained=True 避免无意义的当场二次训练
+# 且需要保证 X_train, y_train 确实是先前用来训练 LinearSVC 的
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True)
+
+# 导出参数请使用 LinearSVC (clfm.clf) 的 save
+clfm.clf.save("my_clf_dump.npz")
+# 导出用于训练的特征和标签
+np.savez("trainSets.npz",
+         X = X_train,
+         y = y_train,
+         )
+```
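+
+由于分类器和其训练集需要保持同步,可以把上面两处导出封装为一对备份/恢复函数。以下为同样假设下的最小示意,函数名与文件路径均为示例,并非库自带接口:
+
+```python
+def save_backup(clfm, clf_path="my_clf_dump.npz", sets_path="my_trainSets.npz"):
+    # 同时保存 SVM 参数与训练它所用的全部样本
+    clfm.clf.save(clf_path)
+    np.savez(sets_path, X=clfm.samples[0], y=clfm.samples[1])
+
+def load_backup(clf_path="my_clf_dump.npz", sets_path="my_trainSets.npz"):
+    npzfile = np.load(sets_path)
+    # 样本与分类器一致,指定 pretrained=True 跳过重新训练
+    return LinearSVCManager(LinearSVC.load(clf_path),
+                            npzfile["X"], npzfile["y"], pretrained=True)
+```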
+
+#### 访问已用于训练的数据
+
+clfm.samples 为一个 python 二元组:
+1. clfm.samples[0] 为 `X`
+2. clfm.samples[1] 为 `Y`
+
+**请不要直接修改,仅供只读访问;若确有修改,需手动调用 `clfm.train()` 重新训练。**
+
+#### 添加或移除
+
+**添加请确保 X_new 和 y_new 长度相等,且形状对应先前的 X_train 和 y_train。**
+
+皆为 numpy 数组,可自行通过 shape 属性确认。
+
+```python
+# 添加
+clfm.add(X_new, y_new)
+
+# 移除
+mask_ge_4 = clfm.samples[1] >= 4 # 类别标签大于等于 4 的掩码
+indices_ge_4 = np.where(mask_ge_4)[0]
+clfm.rm(indices_ge_4)
+```
+
+以上操作主要处理 `clfm.samples`,且每次会在结尾调用 `clfm.train()` 再训练。
+
+因此,根据待训练数据规模,等待些许时间后,更新后的模型便可直接应用。
+
+
+#### 预测
+
+```python
+y_pred, y_conf = clfm.test(X_test) # 预测类别
+```
+
+等价于
+
+```python
+clf = clfm.clf
+feature_test = clf.scaler.transform(X_test) # 标准化特征
+y_pred, y_conf = clf.predict_with_confidence(feature_test) # 带置信度地预测类别
+```
+
+#### 示例(效果视频简化版本)
+
+注意:
+- 此处缺失 preprocess 实现,请从 `预处理` 一节拷贝过来
+- 此处缺失 LinearSVC 模块,请从 `导入模块` 一节拷贝过来
+
+分类预测部分如下,可单文件运行:
+
+```python
+from maix import camera, display, image, nn, app
+import numpy as np
+
+# 添加在我下面
+
+name_classes = ("one", "five", "fist", "ok", "heartSingle", "yearh", "three", "four", "six", "Iloveyou", "gun", "thumbUp", "nine", "pink") # , "class N", "class N+1", ...) # 易于理解的标签名
+npzfile = np.load("/maixapp/apps/hand_gesture_classifier/trainSets.npz") # 预加载特征和ID(name_classes 索引)
+X_train = npzfile["X"]
+y_train = npzfile["y"]
+clfm = LinearSVCManager(LinearSVC.load("/maixapp/apps/hand_gesture_classifier/clf_dump.npz"), X_train, y_train, pretrained=True) # 使用预加载分类器初始化 LinearSVCManager
+
+detector = nn.HandLandmarks(model="/root/models/hand_landmarks.mud")
+cam = camera.Camera(320, 224, detector.input_format())
+disp = display.Display()
+
+# Loading screen
+img = cam.read()
+img.draw_string(100, 112, "Loading...\nwait up to 10s", color = image.COLOR_GREEN)
+disp.show(img)
+
+while not app.need_exit():
+    img = cam.read()
+    objs = detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8)
+    for obj in objs:
+        hand_landmarks = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1)) # 预处理
+        features = np.array([hand_landmarks.flatten()])
+        class_idx, pred_conf = clfm.test(features) # 获取预测类别
+        class_idx, pred_conf = class_idx[0], pred_conf[0] # 复数输入、复数返回,取第一个元素
+        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}\n{name_classes[class_idx]}({class_idx})={pred_conf*100:.2f}%'
+        img.draw_string(obj.points[0], obj.points[1], msg, color = image.COLOR_RED if obj.class_id == 0 else image.COLOR_GREEN, scale = 1.4, thickness = 2)
+        detector.draw_hand(img, obj.class_id, obj.points, 4, 10, box=True)
+    disp.show(img)
+```
+
+目前使用的 `X_train` 基于的原始数据集为`14 类静态手势数据集`,[数据集下载地址(百度网盘 Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g),该数据集共 2850 个样本,分为 14 类。
 
 ![](../../assets/handposex_14class.jpg)
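+
+如引言所述,分类并不限于这 14 类。下面的示意展示了如何采集新手势样本并注册为第 15 类:假设沿用上例中的 `detector`、`cam`、`clfm` 对象,原有类别保持索引 0-13(新标签即为 `14`),并在运行期间将新手势保持在镜头前。该代码仅为示意,并非库自带接口:
+
+```python
+def collect_and_add(clfm, detector, cam, n_samples=30, new_label=14):
+    # 采集 n_samples 个当前镜头前手势的特征向量,并注册为一个新类别
+    feats = []
+    while len(feats) < n_samples:
+        img = cam.read()
+        for obj in detector.detect(img, conf_th = 0.7, iou_th = 0.45, conf_th2 = 0.8):
+            lm = preprocess(obj.points[8:8+21*3], obj.class_id == 0, (img.width(), img.height(), 1))
+            feats.append(lm.flatten())
+    X_new = np.array(feats, dtype="float64")
+    y_new = np.full(len(X_new), new_label)
+    clfm.add(X_new, y_new)  # 内部会全量再训练,需等待片刻
+
+# collect_and_add(clfm, detector, cam)
+# name_classes = name_classes + ("my_gesture",)  # 同时为新类别补充显示名
+```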
 
+## 效果视频
+
 该 app 实现位于 `MaixPy/projects/app_hand_gesture_classifier/main.py`,主要逻辑是
 
 1. 加载 `14 类静态手势数据集` 经 `手部关键点检测` 处理后的 `20` 个相对手腕的坐标偏移
-2. 初始训练前 `4` 个分类,以支持手势识别
+2. 初始训练前 `4` 个分类 **或直接加载预训练的 `14` 分类器参数(源码可切换)**,以支持手势识别
 3. 加载 `手部关键点检测` 模型处理摄像头并通过该分类器将结果可视化在屏幕上
 4. 点击右上角 `class14` 可增添剩余分类样本再训练以达到 `14` 分类手势
 5. 点击右下角 `class4` 可移除上一步添加的分类样本再训练以达到 `4` 分类手势
 6. 点击按钮之间的小块区域,可在顶部显示分类器上一次训练的时长
 7. 点击其余大块区域,可在左侧显示当前支持的分类类别,绿色表示支持,黄色表示不支持
-
-
-## 效果视频
-
-1. 视频内容为执行了上述第 `4` 步后的 `14` 分类模式,可识别手势 `1-10` (默认对应其他英文释义),ok,大拇指点赞,比心(需要手背,拍摄时不好演示,可自行验证),小拇指伸展 一共 `14` 种手势。
-
+1. 视频演示内容为执行了上述第 `4` 步 **或第 `2` 步加粗部分** 后的 `14` 分类模式,可识别手势 `1-10`(默认对应其他英文释义)、ok、大拇指点赞、比心(需要手背,拍摄时不好演示,可自行验证)、小拇指伸展,一共 `14` 种手势。
 2. 紧接着执行第 `5` 步,回退到 `4` 分类模式,仅可识别 1,5,10(握拳)和 ok,其余的手势都无法识别到正常结果。期间也有执行 第 `7` 步展示了当前是 `4` 分类模式,因为除了前 4 种手势为绿,后 10 种全部为黄色显示。
-
 3. 再就是执行第 `4` 步,恢复到 `14` 分类模式,`4` 分类模式无法识别的手势现在也恢复正确识别了。
-
 4. 末尾展示了双手的识别,实测可同时正确识别两只手的手势。
 
 ## 其它
 
-效果视频为捕获的 maixvision 右上的屏幕预览窗口而来,和屏幕实际显示内容一致。
-
-详细实现可见源码和上述分析了。
+**效果视频为捕获 maixvision 右上角的屏幕预览窗口而来,与屏幕实际显示内容一致。**
 
-二次开发或修改也可直接基于源码完成,内附有注释。
+**更详细的使用方法或二次开发请参考上述分析阅读源码,内附有注释。**
 
-如确实仍有需要协助的,可与 maixhub 上发帖留言或发 email 到公司邮箱。
\ No newline at end of file
+如仍有疑惑或需要协助,可于 `maixhub` 上发帖留言或发 `e-mail` 到公司邮箱 `support@sipeed.com`,**标题请使用`[help][MaixPy] gesture classification: xxx`**。
\ No newline at end of file
diff --git a/docs/doc/zh/vision/hand_landmarks.md b/docs/doc/zh/vision/hand_landmarks.md
index 95b06062..c698237d 100644
--- a/docs/doc/zh/vision/hand_landmarks.md
+++ b/docs/doc/zh/vision/hand_landmarks.md
@@ -1,5 +1,5 @@
 ---
-tite: MaixPy MaixCAM 人手部 21 个关键点三维坐标检测
+title: MaixPy MaixCAM 人手部 21 个关键点三维坐标检测
 update:
   - date: 2024-12-31
     version: v1.0
diff --git a/projects/app_hand_gesture_classifier/main.py b/projects/app_hand_gesture_classifier/main.py
index 414c5439..318ce700 100644
--- a/projects/app_hand_gesture_classifier/main.py
+++ b/projects/app_hand_gesture_classifier/main.py
@@ -45,7 +45,7 @@ def timer(name):
 # print(f"X_train: {X_train[0]}")
 # print(f"y_train: {y_train[0]}")
 
-if 0:
+if 1:
     with timer("加载") as r:
         clfm = LinearSVCManager(LinearSVC.load("clf_dump.npz"), X_train, y_train, pretrained=True)
     last_train_time = r['passed']
@@ -130,7 +130,7 @@ def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
             if not class_nums_changing and current_n_classes == 4:
                 class_nums_changing = True
             if class_nums_changing:
-                img.draw_string(30, 112, "Release to upgrade to class 14\n and please wait.", color = image.COLOR_RED)
+                img.draw_string(30, 112, "Release to upgrade to class 14\n and please wait for training to be done.", color = image.COLOR_RED)
         else:
             if class_nums_changing:
                 class_nums_changing = False
@@ -145,7 +145,7 @@ def preprocess(hand_landmarks, is_left=False, boundary=(1,1,1)):
             if not class_nums_changing and current_n_classes == 14:
                 class_nums_changing = True
             if class_nums_changing:
-                img.draw_string(30, 112, "Release to retrain to class 4\n and please wait.", color = image.COLOR_RED)
+                img.draw_string(30, 112, "Release to retrain to class 4\n and please wait for training to be done.", color = image.COLOR_RED)
         else:
             if class_nums_changing:
                 class_nums_changing = False