-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
4681f7b
commit ef24401
Showing
1 changed file
with
44 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
Source of the original data: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip . Original description: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones | ||
|
||
The attached R script (run_analysis.R) performs the following to clean up the data: | ||
|
||
1. Merges the training and test sets to create one data set | ||
|
||
2. Reads file features.txt and extracts only the measurements on the mean and standard deviation for each measurement. | ||
|
||
3. Reads activity_labels.txt and applies descriptive activity names to name the activities in the data set: | ||
walking | ||
|
||
walkingupstairs | ||
|
||
walkingdownstairs | ||
|
||
sitting | ||
|
||
standing | ||
|
||
laying | ||
|
||
4. The script also appropriately labels the data set with descriptive names: all feature names (attributes) and activity names are converted to lower case, underscores and brackets () are removed. | ||
Then it merges the 10299x66 data frame containing features with 10299x1 data frames containing activity labels and subject IDs. | ||
|
||
The result is saved as merged.txt, data frame such that the first column contains subject IDs, the second column activity names, and the last 66 columns are measurements. Subject IDs are integers between 1 and 30 inclusive. Names of the attributes are similar to the following: | ||
|
||
tbodyacc-mean-x | ||
|
||
tbodyacc-mean-y | ||
|
||
tbodyacc-mean-z | ||
|
||
tbodyacc-std-x | ||
|
||
tbodyacc-std-y | ||
|
||
tbodyacc-std-z | ||
|
||
tgravityacc-mean-x | ||
|
||
tgravityacc-mean-y | ||
|
||
5. Finally, the script creates a 2nd, independent tidy data set with the average of each measurement for each activity and each subject. | ||
The result is saved as averages.txt, the first column contains subject IDs, the second column contains activity names (see below), and then the averages for each of the 66 attributes are in columns 3...68. There are 30 subjects and 6 activities, thus 180 rows in this data set with averages. |