diff --git a/samples/04_gis_analysts_data_scientists/classifying_human_activity_using _tabPFN_classifier.ipynb b/samples/04_gis_analysts_data_scientists/classifying_human_activity_using _tabPFN_classifier.ipynb new file mode 100644 index 0000000000..e9a0716900 --- /dev/null +++ b/samples/04_gis_analysts_data_scientists/classifying_human_activity_using _tabPFN_classifier.ipynb @@ -0,0 +1,2180 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Leveraging TabPFN for Human Activity Recognition Using Mobile Dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Table of Contents \n", + "* [Introduction](#1) \n", + "* [Necessary imports](#2)\n", + "* [Connect to ArcGIS](#3)\n", + "* [Access the datasets](#4) \n", + "* [Prepare training data for TabPFN](#5)\n", + " * [Data preprocessing for TabPFN classifier model](#6) \n", + " * [Visualize training data](#9)\n", + "* [Model training](#10) \n", + " * [Define the TabPFN classifier model ](#11)\n", + " * [Fit the model](#12)\n", + " * [Visualize results in validation set](#13)\n", + "* [Predict using TabPFN classifier model](#14)\n", + "* [Accuracy assessment: Compute model metric](#16)\n", + "* [Conclusion](#17)\n", + "* [TabPFN license information](#18) " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Introduction " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Human Activity Recognition (HAR) using mobile data has become an important area of research and application due to the increasing ubiquity of smartphones, wearables, and other mobile devices that can collect a wealth of sensor data. HAR is a crucial task in various fields, including healthcare, fitness, workplace safety, and smart cities, where the goal is to classify human activities (e.g., walking, running, sitting) based on sensor data. Traditional methods for HAR often require substantial computational resources and complex hyperparameter tuning, making them difficult to deploy in real-time applications. TabPFN (Tabular Prior-Data Fitted Network), a Transformer-based model designed for fast and efficient classification of small tabular datasets, offers a promising solution to overcome these challenges.\n", + "\n", + "TabPFN’s advantages are particularly well-suited for various HAR use cases. In healthcare, it aids in fall detection for the elderly, chronic disease monitoring, providing timely interventions. For fitness and wellness, it can classify activities such as walking or running in real-time, enhancing user experience in mobile apps and wearable devices. It enhances workplace safety by identifying risky workers' activities in hazardous industrial environments, such as in mining and on oil rigs, ensuring safety and reducing accidents. Furthermore, in the case of smart cities and urban mobility, HAR data from pedestrians and commuters can be efficiently classified to optimize traffic flow, public transport systems, and urban planning initiatives. Additionally, HAR supports emergency response efforts during disasters by locating people in need of help. TabPFN's speed, simplicity, and effectiveness make it an ideal choice for these real-time HAR applications." 
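For readers who want a feel for the classifier outside of ArcGIS, the short sketch below shows a standalone, scikit-learn-style use of the open-source `tabpfn` package on a small synthetic dataset. The package name and the `device`/`N_ensemble_configurations` keyword arguments mirror what is passed to `MLModel` later in this notebook; the standalone fit/predict usage itself is an illustrative assumption and is not part of the workflow that follows, which drives the same classifier through `arcgis.learn`.

```python
# Minimal standalone sketch (assumption: the open-source `tabpfn` package with the
# v1-style API whose keyword arguments match those passed to MLModel later on).
# The rest of this notebook drives the same classifier via arcgis.learn instead.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

# Small synthetic stand-in for sensor-derived tabular features
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier(device='cpu', N_ensemble_configurations=32)
clf.fit(X_train, y_train)           # no hyperparameter tuning required
print(clf.score(X_test, y_test))    # mean accuracy on the held-out split
```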
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Necessary imports " + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: total: 0 ns\n", + "Wall time: 1.01 ms\n" + ] + } + ], + "source": [ + "%matplotlib inline\n", + "import matplotlib.pyplot as plt\n", + "\n", + "import pandas as pd\n", + "from sklearn.discriminant_analysis import LinearDiscriminantAnalysis\n", + "from sklearn.preprocessing import StandardScaler\n", + "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report\n", + "\n", + "from arcgis.gis import GIS\n", + "from arcgis.learn import MLModel, prepare_tabulardata" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Connect to ArcGIS " + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "gis = GIS(\"home\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Access the datasets \n", + "\n", + "Here we will access the train and test datasets. The Human Activity Recognition (HAR) training dataset consists of 1,020 rows and 561 features, capturing sensor data from mobile devices to classify human activities like walking, running, and sitting. The data includes measurements from accelerometers, gyroscopes, and GPS, providing insights into movement patterns while ensuring that location data remains anonymized for privacy protection. Features such as BodyAcc (body accelerometer), GravityAcc (gravity accelerometer), BodyAccJerk, BodyGyro (body gyroscope), and BodyGyroJerk are used to capture dynamic and rotational movements. Time-domain and frequency-domain features are extracted from these raw signals, helping to distinguish between various activities based on patterns in acceleration, rotation, and speed, making the dataset ideal for activity classification tasks." + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " train_har_dataset\n", + " \n", + "
HAR dataset
CSV by api_data_owner\n", + "
Last Modified: January 10, 2025\n", + "
0 comments, 3 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# access the training data\n", + "data_table = gis.content.get('1fafacc88bc3491696f981758a72de50')\n", + "data_table" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "# Download the train data and saving it in local folder\n", + "data_path = data_table.get_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
tBodyAcc-mean()-XtBodyAcc-mean()-YtBodyAcc-mean()-ZtBodyAcc-std()-XtBodyAcc-std()-YtBodyAcc-std()-ZtBodyAcc-mad()-XtBodyAcc-mad()-YtBodyAcc-mad()-ZtBodyAcc-max()-X...fBodyBodyGyroJerkMag-kurtosis()angle(tBodyAccMean,gravity)angle(tBodyAccJerkMean),gravityMean)angle(tBodyGyroMean,gravityMean)angle(tBodyGyroJerkMean,gravityMean)angle(X,gravityMean)angle(Y,gravityMean)angle(Z,gravityMean)subjectActivity
00.271144-0.033031-0.121829-0.987884-0.867081-0.945087-0.991858-0.906651-0.943951-0.909793...-0.6859320.0076290.068842-0.762768-0.751408-0.7866470.234417-0.04034522STANDING
10.278211-0.020855-0.103400-0.996593-0.980402-0.988998-0.997065-0.977596-0.988861-0.941245...-0.7719010.0017270.295551-0.035877-0.360496-0.6614640.2212400.22332317STANDING
20.276012-0.015713-0.103117-0.982340-0.834824-0.973649-0.986465-0.862017-0.976193-0.914101...-0.2064140.1273910.0285810.0533580.637500-0.8267210.212775-0.01828025STANDING
30.272753-0.016910-0.101737-0.997409-0.996203-0.983416-0.997425-0.996439-0.984400-0.944557...-0.8947350.0964690.3193190.2293980.267721-0.6720920.195500-0.19452716SITTING
40.275565-0.014967-0.107715-0.995365-0.988601-0.988218-0.995880-0.989519-0.986747-0.937645...-0.8882580.1523360.217308-0.3776480.733588-0.7495990.048129-0.15665411SITTING
\n", + "

5 rows × 563 columns

\n", + "
" + ], + "text/plain": [ + " tBodyAcc-mean()-X tBodyAcc-mean()-Y tBodyAcc-mean()-Z tBodyAcc-std()-X \\\n", + "0 0.271144 -0.033031 -0.121829 -0.987884 \n", + "1 0.278211 -0.020855 -0.103400 -0.996593 \n", + "2 0.276012 -0.015713 -0.103117 -0.982340 \n", + "3 0.272753 -0.016910 -0.101737 -0.997409 \n", + "4 0.275565 -0.014967 -0.107715 -0.995365 \n", + "\n", + " tBodyAcc-std()-Y tBodyAcc-std()-Z tBodyAcc-mad()-X tBodyAcc-mad()-Y \\\n", + "0 -0.867081 -0.945087 -0.991858 -0.906651 \n", + "1 -0.980402 -0.988998 -0.997065 -0.977596 \n", + "2 -0.834824 -0.973649 -0.986465 -0.862017 \n", + "3 -0.996203 -0.983416 -0.997425 -0.996439 \n", + "4 -0.988601 -0.988218 -0.995880 -0.989519 \n", + "\n", + " tBodyAcc-mad()-Z tBodyAcc-max()-X ... fBodyBodyGyroJerkMag-kurtosis() \\\n", + "0 -0.943951 -0.909793 ... -0.685932 \n", + "1 -0.988861 -0.941245 ... -0.771901 \n", + "2 -0.976193 -0.914101 ... -0.206414 \n", + "3 -0.984400 -0.944557 ... -0.894735 \n", + "4 -0.986747 -0.937645 ... -0.888258 \n", + "\n", + " angle(tBodyAccMean,gravity) angle(tBodyAccJerkMean),gravityMean) \\\n", + "0 0.007629 0.068842 \n", + "1 0.001727 0.295551 \n", + "2 0.127391 0.028581 \n", + "3 0.096469 0.319319 \n", + "4 0.152336 0.217308 \n", + "\n", + " angle(tBodyGyroMean,gravityMean) angle(tBodyGyroJerkMean,gravityMean) \\\n", + "0 -0.762768 -0.751408 \n", + "1 -0.035877 -0.360496 \n", + "2 0.053358 0.637500 \n", + "3 0.229398 0.267721 \n", + "4 -0.377648 0.733588 \n", + "\n", + " angle(X,gravityMean) angle(Y,gravityMean) angle(Z,gravityMean) subject \\\n", + "0 -0.786647 0.234417 -0.040345 22 \n", + "1 -0.661464 0.221240 0.223323 17 \n", + "2 -0.826721 0.212775 -0.018280 25 \n", + "3 -0.672092 0.195500 -0.194527 16 \n", + "4 -0.749599 0.048129 -0.156654 11 \n", + "\n", + " Activity \n", + "0 STANDING \n", + "1 STANDING \n", + "2 STANDING \n", + "3 SITTING \n", + "4 SITTING \n", + "\n", + "[5 rows x 563 columns]" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read the downloaded data\n", + "train_har_data = pd.read_csv(data_path)\n", + "train_har_data.head(5)" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(1020, 563)" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "train_har_data.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we will access the test dataset, which is significantly larger, containing 6,332 samples." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " test_har_dataset\n", + " \n", + "
HAR dataset
CSV by api_data_owner\n", + "
Last Modified: January 10, 2025\n", + "
0 comments, 0 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# access the test data\n", + "test_data_table = gis.content.get('e65312babe5b4efbaa2842235b79f653')\n", + "test_data_table" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [], + "source": [ + "# Download the test data and save it to a local folder\n", + "test_data_path = test_data_table.get_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
tBodyAcc-mean()-XtBodyAcc-mean()-YtBodyAcc-mean()-ZtBodyAcc-std()-XtBodyAcc-std()-YtBodyAcc-std()-ZtBodyAcc-mad()-XtBodyAcc-mad()-YtBodyAcc-mad()-ZtBodyAcc-max()-X...fBodyBodyGyroJerkMag-kurtosis()angle(tBodyAccMean,gravity)angle(tBodyAccJerkMean),gravityMean)angle(tBodyGyroMean,gravityMean)angle(tBodyGyroJerkMean,gravityMean)angle(X,gravityMean)angle(Y,gravityMean)angle(Z,gravityMean)subjectActivity
00.288585-0.020294-0.132905-0.995279-0.983111-0.913526-0.995112-0.983185-0.923527-0.934724...-0.710304-0.1127540.030400-0.464761-0.018446-0.8412470.179941-0.0586271STANDING
10.278419-0.016411-0.123520-0.998245-0.975300-0.960322-0.998807-0.974914-0.957686-0.943068...-0.8614990.053477-0.007435-0.7326260.703511-0.8447880.180289-0.0543171STANDING
20.276629-0.016570-0.115362-0.998139-0.980817-0.990482-0.998321-0.979672-0.990441-0.942469...-0.6992050.1233200.1225420.693578-0.615971-0.8478650.185151-0.0438921STANDING
30.277293-0.021751-0.120751-0.997328-0.961245-0.983672-0.997596-0.957236-0.984379-0.940598...-0.5729950.0129540.080936-0.2343130.117797-0.8479710.188982-0.0373641STANDING
40.277175-0.014713-0.106756-0.999188-0.990526-0.993365-0.999211-0.990687-0.992168-0.943323...-0.7659010.105620-0.090278-0.1324030.498814-0.8497730.188812-0.0350631STANDING
\n", + "

5 rows × 563 columns

\n", + "
" + ], + "text/plain": [ + " tBodyAcc-mean()-X tBodyAcc-mean()-Y tBodyAcc-mean()-Z tBodyAcc-std()-X \\\n", + "0 0.288585 -0.020294 -0.132905 -0.995279 \n", + "1 0.278419 -0.016411 -0.123520 -0.998245 \n", + "2 0.276629 -0.016570 -0.115362 -0.998139 \n", + "3 0.277293 -0.021751 -0.120751 -0.997328 \n", + "4 0.277175 -0.014713 -0.106756 -0.999188 \n", + "\n", + " tBodyAcc-std()-Y tBodyAcc-std()-Z tBodyAcc-mad()-X tBodyAcc-mad()-Y \\\n", + "0 -0.983111 -0.913526 -0.995112 -0.983185 \n", + "1 -0.975300 -0.960322 -0.998807 -0.974914 \n", + "2 -0.980817 -0.990482 -0.998321 -0.979672 \n", + "3 -0.961245 -0.983672 -0.997596 -0.957236 \n", + "4 -0.990526 -0.993365 -0.999211 -0.990687 \n", + "\n", + " tBodyAcc-mad()-Z tBodyAcc-max()-X ... fBodyBodyGyroJerkMag-kurtosis() \\\n", + "0 -0.923527 -0.934724 ... -0.710304 \n", + "1 -0.957686 -0.943068 ... -0.861499 \n", + "2 -0.990441 -0.942469 ... -0.699205 \n", + "3 -0.984379 -0.940598 ... -0.572995 \n", + "4 -0.992168 -0.943323 ... -0.765901 \n", + "\n", + " angle(tBodyAccMean,gravity) angle(tBodyAccJerkMean),gravityMean) \\\n", + "0 -0.112754 0.030400 \n", + "1 0.053477 -0.007435 \n", + "2 0.123320 0.122542 \n", + "3 0.012954 0.080936 \n", + "4 0.105620 -0.090278 \n", + "\n", + " angle(tBodyGyroMean,gravityMean) angle(tBodyGyroJerkMean,gravityMean) \\\n", + "0 -0.464761 -0.018446 \n", + "1 -0.732626 0.703511 \n", + "2 0.693578 -0.615971 \n", + "3 -0.234313 0.117797 \n", + "4 -0.132403 0.498814 \n", + "\n", + " angle(X,gravityMean) angle(Y,gravityMean) angle(Z,gravityMean) subject \\\n", + "0 -0.841247 0.179941 -0.058627 1 \n", + "1 -0.844788 0.180289 -0.054317 1 \n", + "2 -0.847865 0.185151 -0.043892 1 \n", + "3 -0.847971 0.188982 -0.037364 1 \n", + "4 -0.849773 0.188812 -0.035063 1 \n", + "\n", + " Activity \n", + "0 STANDING \n", + "1 STANDING \n", + "2 STANDING \n", + "3 STANDING \n", + "4 STANDING \n", + "\n", + "[5 rows x 563 columns]" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# read the test data\n", + "test_har_data = pd.read_csv(test_data_path).drop([\"Unnamed: 0\"], axis=1)\n", + "test_har_data.head(5)" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(6332, 563)" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "test_har_data.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Prepare training data for TabPFN " + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [], + "source": [ + "# View column names except the columns - 'subject','Activity'\n", + "ls = list(train_har_data.columns)\n", + "X = [item for item in ls if item not in ['subject','Activity']]" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['tBodyAcc-mean()-X',\n", + " 'tBodyAcc-mean()-Y',\n", + " 'tBodyAcc-mean()-Z',\n", + " 'tBodyAcc-std()-X',\n", + " 'tBodyAcc-std()-Y',\n", + " 'tBodyAcc-std()-Z',\n", + " 'tBodyAcc-mad()-X',\n", + " 'tBodyAcc-mad()-Y',\n", + " 'tBodyAcc-mad()-Z',\n", + " 'tBodyAcc-max()-X',\n", + " 'tBodyAcc-max()-Y',\n", + " 'tBodyAcc-max()-Z',\n", + " 'tBodyAcc-min()-X',\n", + " 'tBodyAcc-min()-Y',\n", + " 'tBodyAcc-min()-Z',\n", + " 'tBodyAcc-sma()',\n", + " 'tBodyAcc-energy()-X',\n", + " 'tBodyAcc-energy()-Y',\n", + " 'tBodyAcc-energy()-Z',\n", + " 'tBodyAcc-iqr()-X',\n", 
+ " 'tBodyAcc-iqr()-Y',\n", + " 'tBodyAcc-iqr()-Z',\n", + " 'tBodyAcc-entropy()-X',\n", + " 'tBodyAcc-entropy()-Y',\n", + " 'tBodyAcc-entropy()-Z',\n", + " 'tBodyAcc-arCoeff()-X,1',\n", + " 'tBodyAcc-arCoeff()-X,2',\n", + " 'tBodyAcc-arCoeff()-X,3',\n", + " 'tBodyAcc-arCoeff()-X,4',\n", + " 'tBodyAcc-arCoeff()-Y,1',\n", + " 'tBodyAcc-arCoeff()-Y,2',\n", + " 'tBodyAcc-arCoeff()-Y,3',\n", + " 'tBodyAcc-arCoeff()-Y,4',\n", + " 'tBodyAcc-arCoeff()-Z,1',\n", + " 'tBodyAcc-arCoeff()-Z,2',\n", + " 'tBodyAcc-arCoeff()-Z,3',\n", + " 'tBodyAcc-arCoeff()-Z,4',\n", + " 'tBodyAcc-correlation()-X,Y',\n", + " 'tBodyAcc-correlation()-X,Z',\n", + " 'tBodyAcc-correlation()-Y,Z',\n", + " 'tGravityAcc-mean()-X',\n", + " 'tGravityAcc-mean()-Y',\n", + " 'tGravityAcc-mean()-Z',\n", + " 'tGravityAcc-std()-X',\n", + " 'tGravityAcc-std()-Y',\n", + " 'tGravityAcc-std()-Z',\n", + " 'tGravityAcc-mad()-X',\n", + " 'tGravityAcc-mad()-Y',\n", + " 'tGravityAcc-mad()-Z',\n", + " 'tGravityAcc-max()-X',\n", + " 'tGravityAcc-max()-Y',\n", + " 'tGravityAcc-max()-Z',\n", + " 'tGravityAcc-min()-X',\n", + " 'tGravityAcc-min()-Y',\n", + " 'tGravityAcc-min()-Z',\n", + " 'tGravityAcc-sma()',\n", + " 'tGravityAcc-energy()-X',\n", + " 'tGravityAcc-energy()-Y',\n", + " 'tGravityAcc-energy()-Z',\n", + " 'tGravityAcc-iqr()-X',\n", + " 'tGravityAcc-iqr()-Y',\n", + " 'tGravityAcc-iqr()-Z',\n", + " 'tGravityAcc-entropy()-X',\n", + " 'tGravityAcc-entropy()-Y',\n", + " 'tGravityAcc-entropy()-Z',\n", + " 'tGravityAcc-arCoeff()-X,1',\n", + " 'tGravityAcc-arCoeff()-X,2',\n", + " 'tGravityAcc-arCoeff()-X,3',\n", + " 'tGravityAcc-arCoeff()-X,4',\n", + " 'tGravityAcc-arCoeff()-Y,1',\n", + " 'tGravityAcc-arCoeff()-Y,2',\n", + " 'tGravityAcc-arCoeff()-Y,3',\n", + " 'tGravityAcc-arCoeff()-Y,4',\n", + " 'tGravityAcc-arCoeff()-Z,1',\n", + " 'tGravityAcc-arCoeff()-Z,2',\n", + " 'tGravityAcc-arCoeff()-Z,3',\n", + " 'tGravityAcc-arCoeff()-Z,4',\n", + " 'tGravityAcc-correlation()-X,Y',\n", + " 'tGravityAcc-correlation()-X,Z',\n", + " 'tGravityAcc-correlation()-Y,Z',\n", + " 'tBodyAccJerk-mean()-X',\n", + " 'tBodyAccJerk-mean()-Y',\n", + " 'tBodyAccJerk-mean()-Z',\n", + " 'tBodyAccJerk-std()-X',\n", + " 'tBodyAccJerk-std()-Y',\n", + " 'tBodyAccJerk-std()-Z',\n", + " 'tBodyAccJerk-mad()-X',\n", + " 'tBodyAccJerk-mad()-Y',\n", + " 'tBodyAccJerk-mad()-Z',\n", + " 'tBodyAccJerk-max()-X',\n", + " 'tBodyAccJerk-max()-Y',\n", + " 'tBodyAccJerk-max()-Z',\n", + " 'tBodyAccJerk-min()-X',\n", + " 'tBodyAccJerk-min()-Y',\n", + " 'tBodyAccJerk-min()-Z',\n", + " 'tBodyAccJerk-sma()',\n", + " 'tBodyAccJerk-energy()-X',\n", + " 'tBodyAccJerk-energy()-Y',\n", + " 'tBodyAccJerk-energy()-Z',\n", + " 'tBodyAccJerk-iqr()-X',\n", + " 'tBodyAccJerk-iqr()-Y',\n", + " 'tBodyAccJerk-iqr()-Z',\n", + " 'tBodyAccJerk-entropy()-X',\n", + " 'tBodyAccJerk-entropy()-Y',\n", + " 'tBodyAccJerk-entropy()-Z',\n", + " 'tBodyAccJerk-arCoeff()-X,1',\n", + " 'tBodyAccJerk-arCoeff()-X,2',\n", + " 'tBodyAccJerk-arCoeff()-X,3',\n", + " 'tBodyAccJerk-arCoeff()-X,4',\n", + " 'tBodyAccJerk-arCoeff()-Y,1',\n", + " 'tBodyAccJerk-arCoeff()-Y,2',\n", + " 'tBodyAccJerk-arCoeff()-Y,3',\n", + " 'tBodyAccJerk-arCoeff()-Y,4',\n", + " 'tBodyAccJerk-arCoeff()-Z,1',\n", + " 'tBodyAccJerk-arCoeff()-Z,2',\n", + " 'tBodyAccJerk-arCoeff()-Z,3',\n", + " 'tBodyAccJerk-arCoeff()-Z,4',\n", + " 'tBodyAccJerk-correlation()-X,Y',\n", + " 'tBodyAccJerk-correlation()-X,Z',\n", + " 'tBodyAccJerk-correlation()-Y,Z',\n", + " 'tBodyGyro-mean()-X',\n", + " 'tBodyGyro-mean()-Y',\n", + " 'tBodyGyro-mean()-Z',\n", + " 
'tBodyGyro-std()-X',\n", + " 'tBodyGyro-std()-Y',\n", + " 'tBodyGyro-std()-Z',\n", + " 'tBodyGyro-mad()-X',\n", + " 'tBodyGyro-mad()-Y',\n", + " 'tBodyGyro-mad()-Z',\n", + " 'tBodyGyro-max()-X',\n", + " 'tBodyGyro-max()-Y',\n", + " 'tBodyGyro-max()-Z',\n", + " 'tBodyGyro-min()-X',\n", + " 'tBodyGyro-min()-Y',\n", + " 'tBodyGyro-min()-Z',\n", + " 'tBodyGyro-sma()',\n", + " 'tBodyGyro-energy()-X',\n", + " 'tBodyGyro-energy()-Y',\n", + " 'tBodyGyro-energy()-Z',\n", + " 'tBodyGyro-iqr()-X',\n", + " 'tBodyGyro-iqr()-Y',\n", + " 'tBodyGyro-iqr()-Z',\n", + " 'tBodyGyro-entropy()-X',\n", + " 'tBodyGyro-entropy()-Y',\n", + " 'tBodyGyro-entropy()-Z',\n", + " 'tBodyGyro-arCoeff()-X,1',\n", + " 'tBodyGyro-arCoeff()-X,2',\n", + " 'tBodyGyro-arCoeff()-X,3',\n", + " 'tBodyGyro-arCoeff()-X,4',\n", + " 'tBodyGyro-arCoeff()-Y,1',\n", + " 'tBodyGyro-arCoeff()-Y,2',\n", + " 'tBodyGyro-arCoeff()-Y,3',\n", + " 'tBodyGyro-arCoeff()-Y,4',\n", + " 'tBodyGyro-arCoeff()-Z,1',\n", + " 'tBodyGyro-arCoeff()-Z,2',\n", + " 'tBodyGyro-arCoeff()-Z,3',\n", + " 'tBodyGyro-arCoeff()-Z,4',\n", + " 'tBodyGyro-correlation()-X,Y',\n", + " 'tBodyGyro-correlation()-X,Z',\n", + " 'tBodyGyro-correlation()-Y,Z',\n", + " 'tBodyGyroJerk-mean()-X',\n", + " 'tBodyGyroJerk-mean()-Y',\n", + " 'tBodyGyroJerk-mean()-Z',\n", + " 'tBodyGyroJerk-std()-X',\n", + " 'tBodyGyroJerk-std()-Y',\n", + " 'tBodyGyroJerk-std()-Z',\n", + " 'tBodyGyroJerk-mad()-X',\n", + " 'tBodyGyroJerk-mad()-Y',\n", + " 'tBodyGyroJerk-mad()-Z',\n", + " 'tBodyGyroJerk-max()-X',\n", + " 'tBodyGyroJerk-max()-Y',\n", + " 'tBodyGyroJerk-max()-Z',\n", + " 'tBodyGyroJerk-min()-X',\n", + " 'tBodyGyroJerk-min()-Y',\n", + " 'tBodyGyroJerk-min()-Z',\n", + " 'tBodyGyroJerk-sma()',\n", + " 'tBodyGyroJerk-energy()-X',\n", + " 'tBodyGyroJerk-energy()-Y',\n", + " 'tBodyGyroJerk-energy()-Z',\n", + " 'tBodyGyroJerk-iqr()-X',\n", + " 'tBodyGyroJerk-iqr()-Y',\n", + " 'tBodyGyroJerk-iqr()-Z',\n", + " 'tBodyGyroJerk-entropy()-X',\n", + " 'tBodyGyroJerk-entropy()-Y',\n", + " 'tBodyGyroJerk-entropy()-Z',\n", + " 'tBodyGyroJerk-arCoeff()-X,1',\n", + " 'tBodyGyroJerk-arCoeff()-X,2',\n", + " 'tBodyGyroJerk-arCoeff()-X,3',\n", + " 'tBodyGyroJerk-arCoeff()-X,4',\n", + " 'tBodyGyroJerk-arCoeff()-Y,1',\n", + " 'tBodyGyroJerk-arCoeff()-Y,2',\n", + " 'tBodyGyroJerk-arCoeff()-Y,3',\n", + " 'tBodyGyroJerk-arCoeff()-Y,4',\n", + " 'tBodyGyroJerk-arCoeff()-Z,1',\n", + " 'tBodyGyroJerk-arCoeff()-Z,2',\n", + " 'tBodyGyroJerk-arCoeff()-Z,3',\n", + " 'tBodyGyroJerk-arCoeff()-Z,4',\n", + " 'tBodyGyroJerk-correlation()-X,Y',\n", + " 'tBodyGyroJerk-correlation()-X,Z',\n", + " 'tBodyGyroJerk-correlation()-Y,Z',\n", + " 'tBodyAccMag-mean()',\n", + " 'tBodyAccMag-std()',\n", + " 'tBodyAccMag-mad()',\n", + " 'tBodyAccMag-max()',\n", + " 'tBodyAccMag-min()',\n", + " 'tBodyAccMag-sma()',\n", + " 'tBodyAccMag-energy()',\n", + " 'tBodyAccMag-iqr()',\n", + " 'tBodyAccMag-entropy()',\n", + " 'tBodyAccMag-arCoeff()1',\n", + " 'tBodyAccMag-arCoeff()2',\n", + " 'tBodyAccMag-arCoeff()3',\n", + " 'tBodyAccMag-arCoeff()4',\n", + " 'tGravityAccMag-mean()',\n", + " 'tGravityAccMag-std()',\n", + " 'tGravityAccMag-mad()',\n", + " 'tGravityAccMag-max()',\n", + " 'tGravityAccMag-min()',\n", + " 'tGravityAccMag-sma()',\n", + " 'tGravityAccMag-energy()',\n", + " 'tGravityAccMag-iqr()',\n", + " 'tGravityAccMag-entropy()',\n", + " 'tGravityAccMag-arCoeff()1',\n", + " 'tGravityAccMag-arCoeff()2',\n", + " 'tGravityAccMag-arCoeff()3',\n", + " 'tGravityAccMag-arCoeff()4',\n", + " 'tBodyAccJerkMag-mean()',\n", + " 'tBodyAccJerkMag-std()',\n", + " 
'tBodyAccJerkMag-mad()',\n", + " 'tBodyAccJerkMag-max()',\n", + " 'tBodyAccJerkMag-min()',\n", + " 'tBodyAccJerkMag-sma()',\n", + " 'tBodyAccJerkMag-energy()',\n", + " 'tBodyAccJerkMag-iqr()',\n", + " 'tBodyAccJerkMag-entropy()',\n", + " 'tBodyAccJerkMag-arCoeff()1',\n", + " 'tBodyAccJerkMag-arCoeff()2',\n", + " 'tBodyAccJerkMag-arCoeff()3',\n", + " 'tBodyAccJerkMag-arCoeff()4',\n", + " 'tBodyGyroMag-mean()',\n", + " 'tBodyGyroMag-std()',\n", + " 'tBodyGyroMag-mad()',\n", + " 'tBodyGyroMag-max()',\n", + " 'tBodyGyroMag-min()',\n", + " 'tBodyGyroMag-sma()',\n", + " 'tBodyGyroMag-energy()',\n", + " 'tBodyGyroMag-iqr()',\n", + " 'tBodyGyroMag-entropy()',\n", + " 'tBodyGyroMag-arCoeff()1',\n", + " 'tBodyGyroMag-arCoeff()2',\n", + " 'tBodyGyroMag-arCoeff()3',\n", + " 'tBodyGyroMag-arCoeff()4',\n", + " 'tBodyGyroJerkMag-mean()',\n", + " 'tBodyGyroJerkMag-std()',\n", + " 'tBodyGyroJerkMag-mad()',\n", + " 'tBodyGyroJerkMag-max()',\n", + " 'tBodyGyroJerkMag-min()',\n", + " 'tBodyGyroJerkMag-sma()',\n", + " 'tBodyGyroJerkMag-energy()',\n", + " 'tBodyGyroJerkMag-iqr()',\n", + " 'tBodyGyroJerkMag-entropy()',\n", + " 'tBodyGyroJerkMag-arCoeff()1',\n", + " 'tBodyGyroJerkMag-arCoeff()2',\n", + " 'tBodyGyroJerkMag-arCoeff()3',\n", + " 'tBodyGyroJerkMag-arCoeff()4',\n", + " 'fBodyAcc-mean()-X',\n", + " 'fBodyAcc-mean()-Y',\n", + " 'fBodyAcc-mean()-Z',\n", + " 'fBodyAcc-std()-X',\n", + " 'fBodyAcc-std()-Y',\n", + " 'fBodyAcc-std()-Z',\n", + " 'fBodyAcc-mad()-X',\n", + " 'fBodyAcc-mad()-Y',\n", + " 'fBodyAcc-mad()-Z',\n", + " 'fBodyAcc-max()-X',\n", + " 'fBodyAcc-max()-Y',\n", + " 'fBodyAcc-max()-Z',\n", + " 'fBodyAcc-min()-X',\n", + " 'fBodyAcc-min()-Y',\n", + " 'fBodyAcc-min()-Z',\n", + " 'fBodyAcc-sma()',\n", + " 'fBodyAcc-energy()-X',\n", + " 'fBodyAcc-energy()-Y',\n", + " 'fBodyAcc-energy()-Z',\n", + " 'fBodyAcc-iqr()-X',\n", + " 'fBodyAcc-iqr()-Y',\n", + " 'fBodyAcc-iqr()-Z',\n", + " 'fBodyAcc-entropy()-X',\n", + " 'fBodyAcc-entropy()-Y',\n", + " 'fBodyAcc-entropy()-Z',\n", + " 'fBodyAcc-maxInds-X',\n", + " 'fBodyAcc-maxInds-Y',\n", + " 'fBodyAcc-maxInds-Z',\n", + " 'fBodyAcc-meanFreq()-X',\n", + " 'fBodyAcc-meanFreq()-Y',\n", + " 'fBodyAcc-meanFreq()-Z',\n", + " 'fBodyAcc-skewness()-X',\n", + " 'fBodyAcc-kurtosis()-X',\n", + " 'fBodyAcc-skewness()-Y',\n", + " 'fBodyAcc-kurtosis()-Y',\n", + " 'fBodyAcc-skewness()-Z',\n", + " 'fBodyAcc-kurtosis()-Z',\n", + " 'fBodyAcc-bandsEnergy()-1,8',\n", + " 'fBodyAcc-bandsEnergy()-9,16',\n", + " 'fBodyAcc-bandsEnergy()-17,24',\n", + " 'fBodyAcc-bandsEnergy()-25,32',\n", + " 'fBodyAcc-bandsEnergy()-33,40',\n", + " 'fBodyAcc-bandsEnergy()-41,48',\n", + " 'fBodyAcc-bandsEnergy()-49,56',\n", + " 'fBodyAcc-bandsEnergy()-57,64',\n", + " 'fBodyAcc-bandsEnergy()-1,16',\n", + " 'fBodyAcc-bandsEnergy()-17,32',\n", + " 'fBodyAcc-bandsEnergy()-33,48',\n", + " 'fBodyAcc-bandsEnergy()-49,64',\n", + " 'fBodyAcc-bandsEnergy()-1,24',\n", + " 'fBodyAcc-bandsEnergy()-25,48',\n", + " 'fBodyAcc-bandsEnergy()-1,8.1',\n", + " 'fBodyAcc-bandsEnergy()-9,16.1',\n", + " 'fBodyAcc-bandsEnergy()-17,24.1',\n", + " 'fBodyAcc-bandsEnergy()-25,32.1',\n", + " 'fBodyAcc-bandsEnergy()-33,40.1',\n", + " 'fBodyAcc-bandsEnergy()-41,48.1',\n", + " 'fBodyAcc-bandsEnergy()-49,56.1',\n", + " 'fBodyAcc-bandsEnergy()-57,64.1',\n", + " 'fBodyAcc-bandsEnergy()-1,16.1',\n", + " 'fBodyAcc-bandsEnergy()-17,32.1',\n", + " 'fBodyAcc-bandsEnergy()-33,48.1',\n", + " 'fBodyAcc-bandsEnergy()-49,64.1',\n", + " 'fBodyAcc-bandsEnergy()-1,24.1',\n", + " 'fBodyAcc-bandsEnergy()-25,48.1',\n", + " 
'fBodyAcc-bandsEnergy()-1,8.2',\n", + " 'fBodyAcc-bandsEnergy()-9,16.2',\n", + " 'fBodyAcc-bandsEnergy()-17,24.2',\n", + " 'fBodyAcc-bandsEnergy()-25,32.2',\n", + " 'fBodyAcc-bandsEnergy()-33,40.2',\n", + " 'fBodyAcc-bandsEnergy()-41,48.2',\n", + " 'fBodyAcc-bandsEnergy()-49,56.2',\n", + " 'fBodyAcc-bandsEnergy()-57,64.2',\n", + " 'fBodyAcc-bandsEnergy()-1,16.2',\n", + " 'fBodyAcc-bandsEnergy()-17,32.2',\n", + " 'fBodyAcc-bandsEnergy()-33,48.2',\n", + " 'fBodyAcc-bandsEnergy()-49,64.2',\n", + " 'fBodyAcc-bandsEnergy()-1,24.2',\n", + " 'fBodyAcc-bandsEnergy()-25,48.2',\n", + " 'fBodyAccJerk-mean()-X',\n", + " 'fBodyAccJerk-mean()-Y',\n", + " 'fBodyAccJerk-mean()-Z',\n", + " 'fBodyAccJerk-std()-X',\n", + " 'fBodyAccJerk-std()-Y',\n", + " 'fBodyAccJerk-std()-Z',\n", + " 'fBodyAccJerk-mad()-X',\n", + " 'fBodyAccJerk-mad()-Y',\n", + " 'fBodyAccJerk-mad()-Z',\n", + " 'fBodyAccJerk-max()-X',\n", + " 'fBodyAccJerk-max()-Y',\n", + " 'fBodyAccJerk-max()-Z',\n", + " 'fBodyAccJerk-min()-X',\n", + " 'fBodyAccJerk-min()-Y',\n", + " 'fBodyAccJerk-min()-Z',\n", + " 'fBodyAccJerk-sma()',\n", + " 'fBodyAccJerk-energy()-X',\n", + " 'fBodyAccJerk-energy()-Y',\n", + " 'fBodyAccJerk-energy()-Z',\n", + " 'fBodyAccJerk-iqr()-X',\n", + " 'fBodyAccJerk-iqr()-Y',\n", + " 'fBodyAccJerk-iqr()-Z',\n", + " 'fBodyAccJerk-entropy()-X',\n", + " 'fBodyAccJerk-entropy()-Y',\n", + " 'fBodyAccJerk-entropy()-Z',\n", + " 'fBodyAccJerk-maxInds-X',\n", + " 'fBodyAccJerk-maxInds-Y',\n", + " 'fBodyAccJerk-maxInds-Z',\n", + " 'fBodyAccJerk-meanFreq()-X',\n", + " 'fBodyAccJerk-meanFreq()-Y',\n", + " 'fBodyAccJerk-meanFreq()-Z',\n", + " 'fBodyAccJerk-skewness()-X',\n", + " 'fBodyAccJerk-kurtosis()-X',\n", + " 'fBodyAccJerk-skewness()-Y',\n", + " 'fBodyAccJerk-kurtosis()-Y',\n", + " 'fBodyAccJerk-skewness()-Z',\n", + " 'fBodyAccJerk-kurtosis()-Z',\n", + " 'fBodyAccJerk-bandsEnergy()-1,8',\n", + " 'fBodyAccJerk-bandsEnergy()-9,16',\n", + " 'fBodyAccJerk-bandsEnergy()-17,24',\n", + " 'fBodyAccJerk-bandsEnergy()-25,32',\n", + " 'fBodyAccJerk-bandsEnergy()-33,40',\n", + " 'fBodyAccJerk-bandsEnergy()-41,48',\n", + " 'fBodyAccJerk-bandsEnergy()-49,56',\n", + " 'fBodyAccJerk-bandsEnergy()-57,64',\n", + " 'fBodyAccJerk-bandsEnergy()-1,16',\n", + " 'fBodyAccJerk-bandsEnergy()-17,32',\n", + " 'fBodyAccJerk-bandsEnergy()-33,48',\n", + " 'fBodyAccJerk-bandsEnergy()-49,64',\n", + " 'fBodyAccJerk-bandsEnergy()-1,24',\n", + " 'fBodyAccJerk-bandsEnergy()-25,48',\n", + " 'fBodyAccJerk-bandsEnergy()-1,8.1',\n", + " 'fBodyAccJerk-bandsEnergy()-9,16.1',\n", + " 'fBodyAccJerk-bandsEnergy()-17,24.1',\n", + " 'fBodyAccJerk-bandsEnergy()-25,32.1',\n", + " 'fBodyAccJerk-bandsEnergy()-33,40.1',\n", + " 'fBodyAccJerk-bandsEnergy()-41,48.1',\n", + " 'fBodyAccJerk-bandsEnergy()-49,56.1',\n", + " 'fBodyAccJerk-bandsEnergy()-57,64.1',\n", + " 'fBodyAccJerk-bandsEnergy()-1,16.1',\n", + " 'fBodyAccJerk-bandsEnergy()-17,32.1',\n", + " 'fBodyAccJerk-bandsEnergy()-33,48.1',\n", + " 'fBodyAccJerk-bandsEnergy()-49,64.1',\n", + " 'fBodyAccJerk-bandsEnergy()-1,24.1',\n", + " 'fBodyAccJerk-bandsEnergy()-25,48.1',\n", + " 'fBodyAccJerk-bandsEnergy()-1,8.2',\n", + " 'fBodyAccJerk-bandsEnergy()-9,16.2',\n", + " 'fBodyAccJerk-bandsEnergy()-17,24.2',\n", + " 'fBodyAccJerk-bandsEnergy()-25,32.2',\n", + " 'fBodyAccJerk-bandsEnergy()-33,40.2',\n", + " 'fBodyAccJerk-bandsEnergy()-41,48.2',\n", + " 'fBodyAccJerk-bandsEnergy()-49,56.2',\n", + " 'fBodyAccJerk-bandsEnergy()-57,64.2',\n", + " 'fBodyAccJerk-bandsEnergy()-1,16.2',\n", + " 'fBodyAccJerk-bandsEnergy()-17,32.2',\n", + " 
'fBodyAccJerk-bandsEnergy()-33,48.2',\n", + " 'fBodyAccJerk-bandsEnergy()-49,64.2',\n", + " 'fBodyAccJerk-bandsEnergy()-1,24.2',\n", + " 'fBodyAccJerk-bandsEnergy()-25,48.2',\n", + " 'fBodyGyro-mean()-X',\n", + " 'fBodyGyro-mean()-Y',\n", + " 'fBodyGyro-mean()-Z',\n", + " 'fBodyGyro-std()-X',\n", + " 'fBodyGyro-std()-Y',\n", + " 'fBodyGyro-std()-Z',\n", + " 'fBodyGyro-mad()-X',\n", + " 'fBodyGyro-mad()-Y',\n", + " 'fBodyGyro-mad()-Z',\n", + " 'fBodyGyro-max()-X',\n", + " 'fBodyGyro-max()-Y',\n", + " 'fBodyGyro-max()-Z',\n", + " 'fBodyGyro-min()-X',\n", + " 'fBodyGyro-min()-Y',\n", + " 'fBodyGyro-min()-Z',\n", + " 'fBodyGyro-sma()',\n", + " 'fBodyGyro-energy()-X',\n", + " 'fBodyGyro-energy()-Y',\n", + " 'fBodyGyro-energy()-Z',\n", + " 'fBodyGyro-iqr()-X',\n", + " 'fBodyGyro-iqr()-Y',\n", + " 'fBodyGyro-iqr()-Z',\n", + " 'fBodyGyro-entropy()-X',\n", + " 'fBodyGyro-entropy()-Y',\n", + " 'fBodyGyro-entropy()-Z',\n", + " 'fBodyGyro-maxInds-X',\n", + " 'fBodyGyro-maxInds-Y',\n", + " 'fBodyGyro-maxInds-Z',\n", + " 'fBodyGyro-meanFreq()-X',\n", + " 'fBodyGyro-meanFreq()-Y',\n", + " 'fBodyGyro-meanFreq()-Z',\n", + " 'fBodyGyro-skewness()-X',\n", + " 'fBodyGyro-kurtosis()-X',\n", + " 'fBodyGyro-skewness()-Y',\n", + " 'fBodyGyro-kurtosis()-Y',\n", + " 'fBodyGyro-skewness()-Z',\n", + " 'fBodyGyro-kurtosis()-Z',\n", + " 'fBodyGyro-bandsEnergy()-1,8',\n", + " 'fBodyGyro-bandsEnergy()-9,16',\n", + " 'fBodyGyro-bandsEnergy()-17,24',\n", + " 'fBodyGyro-bandsEnergy()-25,32',\n", + " 'fBodyGyro-bandsEnergy()-33,40',\n", + " 'fBodyGyro-bandsEnergy()-41,48',\n", + " 'fBodyGyro-bandsEnergy()-49,56',\n", + " 'fBodyGyro-bandsEnergy()-57,64',\n", + " 'fBodyGyro-bandsEnergy()-1,16',\n", + " 'fBodyGyro-bandsEnergy()-17,32',\n", + " 'fBodyGyro-bandsEnergy()-33,48',\n", + " 'fBodyGyro-bandsEnergy()-49,64',\n", + " 'fBodyGyro-bandsEnergy()-1,24',\n", + " 'fBodyGyro-bandsEnergy()-25,48',\n", + " 'fBodyGyro-bandsEnergy()-1,8.1',\n", + " 'fBodyGyro-bandsEnergy()-9,16.1',\n", + " 'fBodyGyro-bandsEnergy()-17,24.1',\n", + " 'fBodyGyro-bandsEnergy()-25,32.1',\n", + " 'fBodyGyro-bandsEnergy()-33,40.1',\n", + " 'fBodyGyro-bandsEnergy()-41,48.1',\n", + " 'fBodyGyro-bandsEnergy()-49,56.1',\n", + " 'fBodyGyro-bandsEnergy()-57,64.1',\n", + " 'fBodyGyro-bandsEnergy()-1,16.1',\n", + " 'fBodyGyro-bandsEnergy()-17,32.1',\n", + " 'fBodyGyro-bandsEnergy()-33,48.1',\n", + " 'fBodyGyro-bandsEnergy()-49,64.1',\n", + " 'fBodyGyro-bandsEnergy()-1,24.1',\n", + " 'fBodyGyro-bandsEnergy()-25,48.1',\n", + " 'fBodyGyro-bandsEnergy()-1,8.2',\n", + " 'fBodyGyro-bandsEnergy()-9,16.2',\n", + " 'fBodyGyro-bandsEnergy()-17,24.2',\n", + " 'fBodyGyro-bandsEnergy()-25,32.2',\n", + " 'fBodyGyro-bandsEnergy()-33,40.2',\n", + " 'fBodyGyro-bandsEnergy()-41,48.2',\n", + " 'fBodyGyro-bandsEnergy()-49,56.2',\n", + " 'fBodyGyro-bandsEnergy()-57,64.2',\n", + " 'fBodyGyro-bandsEnergy()-1,16.2',\n", + " 'fBodyGyro-bandsEnergy()-17,32.2',\n", + " 'fBodyGyro-bandsEnergy()-33,48.2',\n", + " 'fBodyGyro-bandsEnergy()-49,64.2',\n", + " 'fBodyGyro-bandsEnergy()-1,24.2',\n", + " 'fBodyGyro-bandsEnergy()-25,48.2',\n", + " 'fBodyAccMag-mean()',\n", + " 'fBodyAccMag-std()',\n", + " 'fBodyAccMag-mad()',\n", + " 'fBodyAccMag-max()',\n", + " 'fBodyAccMag-min()',\n", + " 'fBodyAccMag-sma()',\n", + " 'fBodyAccMag-energy()',\n", + " 'fBodyAccMag-iqr()',\n", + " 'fBodyAccMag-entropy()',\n", + " 'fBodyAccMag-maxInds',\n", + " 'fBodyAccMag-meanFreq()',\n", + " 'fBodyAccMag-skewness()',\n", + " 'fBodyAccMag-kurtosis()',\n", + " 'fBodyBodyAccJerkMag-mean()',\n", + " 
'fBodyBodyAccJerkMag-std()',\n", + " 'fBodyBodyAccJerkMag-mad()',\n", + " 'fBodyBodyAccJerkMag-max()',\n", + " 'fBodyBodyAccJerkMag-min()',\n", + " 'fBodyBodyAccJerkMag-sma()',\n", + " 'fBodyBodyAccJerkMag-energy()',\n", + " 'fBodyBodyAccJerkMag-iqr()',\n", + " 'fBodyBodyAccJerkMag-entropy()',\n", + " 'fBodyBodyAccJerkMag-maxInds',\n", + " 'fBodyBodyAccJerkMag-meanFreq()',\n", + " 'fBodyBodyAccJerkMag-skewness()',\n", + " 'fBodyBodyAccJerkMag-kurtosis()',\n", + " 'fBodyBodyGyroMag-mean()',\n", + " 'fBodyBodyGyroMag-std()',\n", + " 'fBodyBodyGyroMag-mad()',\n", + " 'fBodyBodyGyroMag-max()',\n", + " 'fBodyBodyGyroMag-min()',\n", + " 'fBodyBodyGyroMag-sma()',\n", + " 'fBodyBodyGyroMag-energy()',\n", + " 'fBodyBodyGyroMag-iqr()',\n", + " 'fBodyBodyGyroMag-entropy()',\n", + " 'fBodyBodyGyroMag-maxInds',\n", + " 'fBodyBodyGyroMag-meanFreq()',\n", + " 'fBodyBodyGyroMag-skewness()',\n", + " 'fBodyBodyGyroMag-kurtosis()',\n", + " 'fBodyBodyGyroJerkMag-mean()',\n", + " 'fBodyBodyGyroJerkMag-std()',\n", + " 'fBodyBodyGyroJerkMag-mad()',\n", + " 'fBodyBodyGyroJerkMag-max()',\n", + " 'fBodyBodyGyroJerkMag-min()',\n", + " 'fBodyBodyGyroJerkMag-sma()',\n", + " 'fBodyBodyGyroJerkMag-energy()',\n", + " 'fBodyBodyGyroJerkMag-iqr()',\n", + " 'fBodyBodyGyroJerkMag-entropy()',\n", + " 'fBodyBodyGyroJerkMag-maxInds',\n", + " 'fBodyBodyGyroJerkMag-meanFreq()',\n", + " 'fBodyBodyGyroJerkMag-skewness()',\n", + " 'fBodyBodyGyroJerkMag-kurtosis()',\n", + " 'angle(tBodyAccMean,gravity)',\n", + " 'angle(tBodyAccJerkMean),gravityMean)',\n", + " 'angle(tBodyGyroMean,gravityMean)',\n", + " 'angle(tBodyGyroJerkMean,gravityMean)',\n", + " 'angle(X,gravityMean)',\n", + " 'angle(Y,gravityMean)',\n", + " 'angle(Z,gravityMean)']" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "X" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "561" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(X)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Data preprocessing for TabPFN classifier model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To process the training data for the TabPFN model, we will use Linear Discriminant Analysis (LDA) to reduce the number of features from the original 561 to below the TabPFN model's maximum limit of 100. By applying LDA, we can preserve the most relevant information for classification while reducing the complexity of the input data, making it suitable for the TabPFN model, which requires a compact input format for efficient processing and predictions." 
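As a side note, scikit-learn caps the number of LDA components at `n_classes - 1`, so with the six activity labels in this dataset the reduction can yield at most five components, comfortably under TabPFN's 100-feature limit. The sketch below expresses the same scale-then-reduce step as a single scikit-learn `Pipeline`; it is an optional, equivalent formulation that assumes `train_har_data` from the cells above, not a replacement for the explicit code in the next cell.

```python
# Optional sketch: the same scale-then-LDA reduction as a reusable scikit-learn
# Pipeline. LDA can produce at most n_classes - 1 components, so the six HAR
# activities give five LDA features, well under TabPFN's 100-feature limit.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X_full = train_har_data.drop(columns=['Activity'])   # assumes train_har_data from above
y_full = train_har_data['Activity']

reducer = Pipeline([
    ('scale', StandardScaler()),
    ('lda', LinearDiscriminantAnalysis(n_components=min(100, y_full.nunique() - 1))),
])
X_reduced = reducer.fit_transform(X_full, y_full)    # shape (1020, 5)

# Keeping the fitted `reducer` lets the identical projection be applied to new
# data later with reducer.transform(new_X), without refitting.
```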
+ ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(1020, 6)" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Data processing to reduce the features to 100 or less as required for TabPFN models\n", + "X = train_har_data.drop(columns=['Activity'])\n", + "y = train_har_data['Activity']\n", + "scaler = StandardScaler()\n", + "X_scaled = scaler.fit_transform(X)\n", + "lda = LinearDiscriminantAnalysis(n_components=min(100, len(set(y)) - 1))\n", + "X_reduced_lda = lda.fit_transform(X_scaled, y)\n", + "X_train_lda_df = pd.DataFrame(X_reduced_lda, columns=[f'LDA{i+1}' for i in range(X_reduced_lda.shape[1])])\n", + "X_train_lda_df['Activity'] = y.reset_index(drop=True)\n", + "X_train_lda_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Index(['LDA1', 'LDA2', 'LDA3', 'LDA4', 'LDA5', 'Activity'], dtype='object')" + ] + }, + "execution_count": 36, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Visualize the final processed training data columns\n", + "X_train_lda_df.columns" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We define the explanatory variables as follows: In the training dataframe above we use `Activity` as the target label to be predicted, using the rest of the features as explanatory variables `X`. We define the explanatory variables as follows: " + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# define the explanatory vairables\n", + "X = list(X_train_lda_df.columns)\n", + "X =X[:-1]\n", + "len(X)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once the explanatory variables `X` are defined, they are used as input in the `prepare_tabulardata` method from the tabular learner in `arcgis.learn`. The method takes the feature layer or a spatial dataframe containing the dataset and prepares it for fitting the model.\n", + "\n", + "The input parameters required for the tool are used as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [], + "source": [ + "data = prepare_tabulardata(X_train_lda_df, 'Activity', explanatory_variables=X)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Visualize training data " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To get a sense of what the training data looks like, the `show_batch()` method will randomly pick a few training samples and visualize them. The samples show the explanatory variables and the `Activity` target label to predict." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
ActivityLDA1LDA2LDA3LDA4LDA5
28STANDING-19.431367-11.1744150.816325-0.6871532.809134
130SITTING-18.377005-9.4293100.342490-0.757008-4.074501
311WALKING17.852727-1.390446-8.1646044.176480-0.160302
734WALKING_UPSTAIRS23.6336402.3011212.475455-10.447339-0.302447
847LAYING-26.62091314.783496-0.7058470.511383-0.568820
\n", + "
" + ], + "text/plain": [ + " Activity LDA1 LDA2 LDA3 LDA4 LDA5\n", + "28 STANDING -19.431367 -11.174415 0.816325 -0.687153 2.809134\n", + "130 SITTING -18.377005 -9.429310 0.342490 -0.757008 -4.074501\n", + "311 WALKING 17.852727 -1.390446 -8.164604 4.176480 -0.160302\n", + "734 WALKING_UPSTAIRS 23.633640 2.301121 2.475455 -10.447339 -0.302447\n", + "847 LAYING -26.620913 14.783496 -0.705847 0.511383 -0.568820" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data.show_batch(rows=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Model training \n", + "First, we initialize the model as follows:" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Model initialization \n", + "\n", + "The default, initialization of the TabPFN classifier model object is shown below:" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [], + "source": [ + "tabpfn_classifier = MLModel(data, 'tabpfn.TabPFNClassifier', device='cpu', N_ensemble_configurations=32)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Fit the model \n", + "\n", + "Next, we will train the model.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [], + "source": [ + "tabpfn_classifier.fit()" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.9901960784313726" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "tabpfn_classifier.score()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see the model score is showing excellent results." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Visualize results in validation set \n", + "\n", + "It is a good practice to see the results of the model viz-a-viz ground truth. The code below picks random samples and shows us the `Activity` which is the ground truth or target state and model predicted `Activity_results` side by side. This enables us to preview the results of the model we trained." + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
ActivityLDA1LDA2LDA3LDA4LDA5Activity_results
101LAYING-27.92574617.698939-0.2119930.137931-0.872844LAYING
299WALKING22.2306590.572533-8.3181691.2750160.139479WALKING
693SITTING-18.977440-10.1230490.551960-0.028800-3.010251SITTING
884WALKING23.685543-0.261172-11.3384813.464041-0.628183WALKING
967SITTING-18.285170-11.0177981.873937-0.702550-1.983438SITTING
\n", + "
" + ], + "text/plain": [ + " Activity LDA1 LDA2 LDA3 LDA4 LDA5 \\\n", + "101 LAYING -27.925746 17.698939 -0.211993 0.137931 -0.872844 \n", + "299 WALKING 22.230659 0.572533 -8.318169 1.275016 0.139479 \n", + "693 SITTING -18.977440 -10.123049 0.551960 -0.028800 -3.010251 \n", + "884 WALKING 23.685543 -0.261172 -11.338481 3.464041 -0.628183 \n", + "967 SITTING -18.285170 -11.017798 1.873937 -0.702550 -1.983438 \n", + "\n", + " Activity_results \n", + "101 LAYING \n", + "299 WALKING \n", + "693 SITTING \n", + "884 WALKING \n", + "967 SITTING " + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "tabpfn_classifier.show_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Predict using the TabPFN classifier model \n", + "\n", + "Once the TabPFN classifier is trained on the smaller dataset of 1,020 samples, we can use it to predict the classes of a larger dataset containing 6,332 samples. Given TabPFN’s ability to process data efficiently with a single forward pass, it can handle this larger dataset quickly, classifying each sample based on the patterns learned during training. Since the model is optimized for fast and scalable predictions, it will generate class predictions for all samples. \n", + "\n", + "Before using the trained TabPFN model to predict the classes of the test dataset, we will first apply Linear Discriminant Analysis (LDA) to reduce the test data to the same feature space as the training data. This ensures consistency between the training and test datasets, enabling the trained TabPFN model to effectively classify the larger test sample." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(6332, 6)\n" + ] + } + ], + "source": [ + "# Align tset data with the train data format \n", + "X = test_har_data.drop(columns=['Activity'])\n", + "y = test_har_data['Activity']\n", + "scaler = StandardScaler()\n", + "X_scaled = scaler.fit_transform(X)\n", + "lda = LinearDiscriminantAnalysis(n_components=min(100, len(set(y)) - 1)) \n", + "X_reduced_lda = lda.fit_transform(X_scaled, y)\n", + "X_test_lda_df = pd.DataFrame(X_reduced_lda, columns=[f'LDA{i+1}' for i in range(X_reduced_lda.shape[1])])\n", + "X_test_lda_df['Activity'] = y.reset_index(drop=True)\n", + "print(X_test_lda_df.shape) " + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
LDA1LDA2LDA3LDA4LDA5Activity
0-10.188443-8.6413770.6066691.1209833.836137STANDING
1-9.735631-6.7166750.537841-0.5433852.295157STANDING
2-8.954351-7.3762960.798942-0.5074652.508069STANDING
3-10.400401-7.2673211.0351340.2727382.034312STANDING
4-9.596161-6.9800610.480017-0.2845371.103180STANDING
\n", + "
" + ], + "text/plain": [ + " LDA1 LDA2 LDA3 LDA4 LDA5 Activity\n", + "0 -10.188443 -8.641377 0.606669 1.120983 3.836137 STANDING\n", + "1 -9.735631 -6.716675 0.537841 -0.543385 2.295157 STANDING\n", + "2 -8.954351 -7.376296 0.798942 -0.507465 2.508069 STANDING\n", + "3 -10.400401 -7.267321 1.035134 0.272738 2.034312 STANDING\n", + "4 -9.596161 -6.980061 0.480017 -0.284537 1.103180 STANDING" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "X_test_lda_df.head(5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Predict " + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [], + "source": [ + "activity_predicted_tabpfn = tabpfn_classifier.predict(X_test_lda_df, prediction_type='dataframe')" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
LDA1LDA2LDA3LDA4LDA5Activityprediction_results
632718.0957832.5937758.7043375.4242570.443448WALKING_DOWNSTAIRSWALKING_DOWNSTAIRS
632817.0942861.9972845.2707522.8398470.550822WALKING_DOWNSTAIRSWALKING_DOWNSTAIRS
632915.9095941.5378034.8872374.771153-0.157321WALKING_DOWNSTAIRSWALKING_DOWNSTAIRS
633011.9449850.8346600.116338-6.2852360.045984WALKING_UPSTAIRSWALKING_UPSTAIRS
633114.5755701.737412-0.866397-5.458011-1.150021WALKING_UPSTAIRSWALKING_UPSTAIRS
\n", + "
" + ], + "text/plain": [ + " LDA1 LDA2 LDA3 LDA4 LDA5 Activity \\\n", + "6327 18.095783 2.593775 8.704337 5.424257 0.443448 WALKING_DOWNSTAIRS \n", + "6328 17.094286 1.997284 5.270752 2.839847 0.550822 WALKING_DOWNSTAIRS \n", + "6329 15.909594 1.537803 4.887237 4.771153 -0.157321 WALKING_DOWNSTAIRS \n", + "6330 11.944985 0.834660 0.116338 -6.285236 0.045984 WALKING_UPSTAIRS \n", + "6331 14.575570 1.737412 -0.866397 -5.458011 -1.150021 WALKING_UPSTAIRS \n", + "\n", + " prediction_results \n", + "6327 WALKING_DOWNSTAIRS \n", + "6328 WALKING_DOWNSTAIRS \n", + "6329 WALKING_DOWNSTAIRS \n", + "6330 WALKING_UPSTAIRS \n", + "6331 WALKING_UPSTAIRS " + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "activity_predicted_tabpfn.tail(5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Accuracy assessment \n", + "\n", + "Next, we will evaluate the model's performance. This will print out multiple model metrics that we can use to assess the model quality. These metrics include a combination of multiple evaluation criteria, such as `accuracy`, `precision`, `recall` and `F1-Score`, which collectively measure the model's performance on the validation set." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy: 96.83%\n", + "Precision: 0.97\n", + "Recall: 0.97\n", + "F1 Score: 0.97\n", + "\n", + "Classification Report:\n", + " precision recall f1-score support\n", + "\n", + " LAYING 1.00 1.00 1.00 1219\n", + " SITTING 1.00 0.83 0.91 1119\n", + " STANDING 0.86 0.99 0.93 1197\n", + " WALKING 1.00 1.00 1.00 1031\n", + "WALKING_DOWNSTAIRS 1.00 0.99 1.00 835\n", + " WALKING_UPSTAIRS 0.99 1.00 1.00 931\n", + "\n", + " accuracy 0.97 6332\n", + " macro avg 0.97 0.97 0.97 6332\n", + " weighted avg 0.97 0.97 0.97 6332\n", + "\n" + ] + } + ], + "source": [ + "# Extract ground truth and predictions\n", + "y_true = activity_predicted_tabpfn['Activity']\n", + "y_pred = activity_predicted_tabpfn['prediction_results']\n", + "\n", + "# Calculate Accuracy\n", + "accuracy = accuracy_score(y_true, y_pred)\n", + "print(f'Accuracy: {accuracy * 100:.2f}%')\n", + "\n", + "# Calculate Precision \n", + "precision = precision_score(y_true, y_pred, average='weighted', zero_division=0)\n", + "print(f'Precision: {precision:.2f}')\n", + "\n", + "# Calculate Recall \n", + "recall = recall_score(y_true, y_pred, average='weighted', zero_division=0)\n", + "print(f'Recall: {recall:.2f}')\n", + "\n", + "# Calculate F1-Score \n", + "f1 = f1_score(y_true, y_pred, average='weighted', zero_division=0)\n", + "print(f'F1 Score: {f1:.2f}')\n", + "\n", + "# classification_report \n", + "print(\"\\nClassification Report:\")\n", + "print(classification_report(y_true, y_pred))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The performance metrics obtained from the trained TabPFN model on the test dataset of 6,332 samples indicate excellent classification quality.\n", + "\n", + "`Accuracy (96.81%)` : The model correctly classified approximately 97% of the samples, which is a strong indication of its ability to generalize well to unseen data, despite being trained on a smaller dataset of just 1,020 samples.\n", + "\n", + "`Precision (0.97)` : Precision measures the proportion of true positive predictions among all positive predictions made by the model. 
A precision of 0.97 means that 97% of the predicted positive activity classes are correct, indicating that the model rarely makes false positive errors.\n", + "\n", + "`Recall (0.97)` : Recall represents the model's ability to correctly identify all relevant instances of a class. A recall of 0.97 means that the model correctly identifies 97% of all actual positive instances, with minimal false negatives.\n", + "\n", + "`F1 Score (0.97)` : The F1 Score is the harmonic mean of precision and recall, and a value of 0.97 shows that the model balances precision and recall very well. This indicates that the model is both highly accurate and sensitive in detecting the correct activity classes.\n", + "\n", + "Overall, these metrics demonstrate that the TabPFN model performs exceptionally well, achieving near-perfect classification with minimal errors. This performance is particularly impressive given that it was trained on a relatively small sample size of 1,020 data points, highlighting its efficiency and effectiveness in handling human activity recognition tasks." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusion " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This project highlights the powerful capabilities of the TabPFN classifier for Human Activity Recognition (HAR) tasks. Even with a training dataset of just 1,020 samples, the model achieved impressive results on a larger test dataset of 6,332 samples, with an accuracy of 96.81%, and precision, recall, and F1 scores all reaching 0.97. The TabPFN model's speed, simplicity, and strong performance in classifying human activities, highlight its potential for applications in healthcare, fitness, smart cities and disaster relief operations, offering an efficient and scalable solution for HAR systems." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### TabPFN license information " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "| License Description |\n", + "|:------------------- |\n", + "| Built with TabPFN - tabpfn.TabPFNClassifier |" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "pro3.5_LearnLesson2025", + "language": "python", + "name": "pro3.5_learnlesson2025" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.11" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}