TREETS

Install

pip install treets

Example for a quick data analysis on phased studies.

import treets.core as treets
import pandas as pd

Take a brief look at the food logging dataset and the reference information sheet.

treets.file_loader('data/col_test_data/yrt*').head(2)

	Unnamed: 0	original_logtime	desc_text	food_type	PID
0	0	2021-05-12 02:30:00 +0000	Milk	b	yrt1999
1	1	2021-05-12 02:45:00 +0000	Some Medication	m	yrt1999
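
The path passed to file_loader above ends in a wildcard (yrt*), which suggests it matches several per-participant files. As a rough illustration only, and not the treets implementation, the sketch below shows how a glob pattern can be expanded and the matching CSV files combined using the standard library. The helper name load_matching_csv and the assumption that the files are CSV are hypothetical.

```python
import csv
import glob


def load_matching_csv(pattern):
    """Read every CSV file matching a glob pattern into one list of row dicts.

    Hypothetical helper for illustration; not part of treets.
    """
    rows = []
    # sorted() makes the load order deterministic across platforms
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows
```

Calling load_matching_csv('data/col_test_data/yrt*') would return one dictionary per logged row, keyed by the column names in each file's header.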

pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx').head(2)

	mCC_ID	Participant_Study_ID	Study Phase	Intervention group (TRE or HABIT)	Start_Day	End_day	Eating_Window_Start	Eating_Window_End
0	yrt1999	2	S-REM	TRE	2021-05-12	2021-05-14	00:00:00	23:59:00
1	yrt1999	2	T3-INT	TRE	2021-05-15	2021-05-18	08:00:00	18:00:00

Call the summarize_data_with_experiment_phases() function to build a table containing the analytic information we want.

df = treets.summarize_data_with_experiment_phases(
    treets.file_loader('data/col_test_data/yrt*'),
    pd.read_excel('data/col_test_data/toy_data_17May2021.xlsx'),
)

Participant yrt1999 didn't log any food items in the following day(s):
2021-05-18
Participant yrt2000 didn't log any food items in the following day(s):
2021-05-12
2021-05-13
2021-05-14
2021-05-15
2021-05-16
2021-05-17
2021-05-18
Participant yrt1999 have bad logging day(s) in the following day(s):
2021-05-12
2021-05-15
Participant yrt1999 have bad window day(s) in the following day(s):
2021-05-15
2021-05-17
Participant yrt1999 have non adherent day(s) in the following day(s):
2021-05-12
2021-05-15
2021-05-17

df

	mCC_ID	Participant_Study_ID	Study Phase	Intervention group (TRE or HABIT)	Start_Day	End_day	Eating_Window_Start	Eating_Window_End	phase_duration	caloric_entries_num	...	logging_day_counts	%_logging_day_counts	good_logging_days	%_good_logging_days	good_window_days	%_good_window_days	outside_window_days	%_outside_window_days	adherent_days	%_adherent_days
0	yrt1999	2	S-REM	TRE	2021-05-12	2021-05-14	00:00:00	23:59:00	3 days	7	...	3	100.0%	2.0	66.67%	3.0	100.0%	0.0	0.0%	2.0	66.67%
1	yrt1999	2	T3-INT	TRE	2021-05-15	2021-05-18	08:00:00	18:00:00	4 days	8	...	3	75.0%	2.0	50.0%	1.0	25.0%	2.0	50.0%	1.0	25.0%
2	yrt2000	3	T3-INT	TRE	2021-05-12	2021-05-14	08:00:00	16:00:00	3 days	0	...	0	0.0%	0.0	0.0%	0.0	0.0%	0.0	0.0%	0.0	0.0%
3	yrt2000	3	T3-INT	TRE	2021-05-15	2021-05-18	08:00:00	16:00:00	4 days	0	...	0	0.0%	0.0	0.0%	0.0	0.0%	0.0	0.0%	0.0	0.0%
4	yrt2001	4	T12-A	TRE	NaT	NaT	NaN	NaN	NaT	0	...	0	nan%	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

5 rows × 32 columns
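
The percentage columns in the table above appear to be day counts divided by the phase duration: in the first row, 2 good logging days out of a 3-day phase gives 66.67%, and in the second row 3 logging days out of 4 gives 75.0%. The sketch below reproduces that arithmetic. The function name percent_of_phase is hypothetical, and the rounding to two decimals is an assumption inferred from the displayed values.

```python
def percent_of_phase(day_count, phase_days):
    """Format day_count as a percentage of the phase length, e.g. '66.67%'."""
    return f"{round(100 * day_count / phase_days, 2)}%"


print(percent_of_phase(2, 3))  # good logging days, first row -> 66.67%
print(percent_of_phase(3, 4))  # logging days, second row -> 75.0%
```

The sketch does not handle phases with a missing duration, such as the last row of the table, where the output shows nan%.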

Look at the statistics for the first row of the resulting dataset.

df.iloc[0]

mCC_ID                                           yrt1999
Participant_Study_ID                                   2
Study Phase                                        S-REM
Intervention group (TRE or HABIT)                    TRE
Start_Day                            2021-05-12 00:00:00
End_day                              2021-05-14 00:00:00
Eating_Window_Start                             00:00:00
Eating_Window_End                               23:59:00
phase_duration                           3 days 00:00:00
caloric_entries_num                                    7
medication_num                                         0
water_num                                              0
first_cal_avg                                   5.916667
first_cal_std                                   2.240722
last_cal_avg                                   19.666667
last_cal_std                                   12.933323
mean_daily_eating_window                           13.75
std_daily_eating_window                        11.986972
earliest_entry                                       4.5
2.5%                                              4.5375
97.5%                                            27.5625
duration mid 95%                                  23.025
logging_day_counts                                     3
%_logging_day_counts                              100.0%
good_logging_days                                    2.0
%_good_logging_days                               66.67%
good_window_days                                     3.0
%_good_window_days                                100.0%
outside_window_days                                  0.0
%_outside_window_days                               0.0%
adherent_days                                        2.0
%_adherent_days                                   66.67%
Name: 0, dtype: object
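
Two of the values above can be cross-checked against each other. In this output, mean_daily_eating_window equals last_cal_avg minus first_cal_avg, and duration mid 95% equals the 97.5% value minus the 2.5% value. The snippet below verifies both using the numbers printed above. It only checks the arithmetic and makes no claim about how treets computes these columns internally; the variable names are chosen for illustration.

```python
import math

# Values copied from the df.iloc[0] output above
first_cal_avg = 5.916667
last_cal_avg = 19.666667
mean_daily_eating_window = 13.75
pct_2_5 = 4.5375
pct_97_5 = 27.5625
duration_mid_95 = 23.025

# Average eating window = average last caloric entry - average first caloric entry
window_matches = math.isclose(
    last_cal_avg - first_cal_avg, mean_daily_eating_window, abs_tol=1e-6
)

# Middle-95% duration = 97.5th percentile - 2.5th percentile
duration_matches = math.isclose(pct_97_5 - pct_2_5, duration_mid_95, abs_tol=1e-6)

print(window_matches, duration_matches)
```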

Example for a quick data analysis on non-phased studies.

Take a look at the original dataset.

df = treets.file_loader('data/test_food_details.csv')
df.head(2)

	Unnamed: 0	ID	unique_code	research_info_id	desc_text	food_type	original_logtime	foodimage_file_name
0	1340147	7572733	alqt14018795225	150	Water	w	2017-12-08 17:30:00+00:00	NaN
1	1340148	411111	alqt14018795225	150	Coffee White	b	2017-12-09 00:01:00+00:00	NaN

Preprocess the data to create features we may need in further analysis, such as float time and the week count since the first week.

df = treets.load_food_data(df,'unique_code', 'original_logtime',4)
df.head(2)

	Unnamed: 0	ID	unique_code	research_info_id	desc_text	food_type	original_logtime	date	float_time	time	week_from_start	year
0	1340147	7572733	alqt14018795225	150	Water	w	2017-12-08 17:30:00+00:00	2017-12-08	17.500000	17:30:00	1	2017
1	1340148	411111	alqt14018795225	150	Coffee White	b	2017-12-09 00:01:00+00:00	2017-12-08	24.016667	00:01:00	1	2017
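
In the output above, the entry logged at 00:01 on 2017-12-09 is assigned to the date 2017-12-08 with a float_time of 24.016667. This suggests that the last argument to load_food_data (4) acts as the hour at which a new day starts, so entries before 04:00 count toward the previous day. That reading is an assumption drawn from the output, not from the treets source. A minimal sketch of such a conversion, using a hypothetical function name:

```python
from datetime import datetime, timedelta


def to_study_day(timestamp, day_start_hour=4):
    """Return (date, float_time) with the day rolling over at day_start_hour.

    Hypothetical helper; day_start_hour=4 mirrors the 4 passed to load_food_data.
    """
    float_time = timestamp.hour + timestamp.minute / 60 + timestamp.second / 3600
    date = timestamp.date()
    if float_time < day_start_hour:
        # Early-morning entries belong to the previous day, past hour 24
        date -= timedelta(days=1)
        float_time += 24
    return date, float_time


print(to_study_day(datetime(2017, 12, 8, 17, 30)))
print(to_study_day(datetime(2017, 12, 9, 0, 1)))
```

Both timestamps map to 2017-12-08, with float times of 17.5 and roughly 24.0167, which agrees with the two rows shown in the table.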

Call the summarize_data() function to build a table containing the analytic information we want.

df = treets.summarize_data(df, 'unique_code', 'float_time', 'date')
df.head(2)

	unique_code	num_days	num_total_items	num_f_n_b	num_medications	num_water	first_cal_avg	first_cal_std	last_cal_avg	last_cal_std	eating_win_avg	eating_win_std	good_logging_count	first_cal variation (90%-10%)	last_cal variation (90%-10%)	2.5%	95%	duration mid 95%
0	alqt1148284857	13	149	96	19	34	7.821795	6.710717	23.485897	4.869082	15.664103	8.231201	146	2.966667	9.666667	4.535000	26.813333	22.636667
1	alqt14018795225	64	488	484	3	1	7.525781	5.434563	25.858594	3.374839	18.332813	6.603913	484	13.450000	3.100000	4.183333	27.438333	23.416667

Look at the statistics for the first row of the resulting dataset.

df.iloc[0]

unique_code                      alqt1148284857
num_days                                     13
num_total_items                             149
num_f_n_b                                    96
num_medications                              19
num_water                                    34
first_cal_avg                          7.821795
first_cal_std                          6.710717
last_cal_avg                          23.485897
last_cal_std                           4.869082
eating_win_avg                        15.664103
eating_win_std                         8.231201
good_logging_count                          146
first_cal variation (90%-10%)          2.966667
last_cal variation (90%-10%)           9.666667
2.5%                                      4.535
95%                                   26.813333
duration mid 95%                      22.636667
Name: 0, dtype: object
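
The column names first_cal variation (90%-10%) and last_cal variation (90%-10%) suggest a spread measured as the 90th percentile minus the 10th percentile of the daily times. The sketch below illustrates that idea only. The interpolation method treets uses is not stated here, so linear interpolation is an assumption, and the helper names and sample times are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0-100) of values using linear interpolation between ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def spread_90_10(values):
    """Difference between the 90th and 10th percentiles."""
    return percentile(values, 90) - percentile(values, 10)


# Toy first-meal times in hours, invented for illustration
first_meal_times = [7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0]
print(spread_90_10(first_meal_times))  # 11.5 - 7.5 = 4.0
```

A small spread indicates a consistent first meal time from day to day; a large one indicates an irregular schedule.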

Clean text in food loggings

# import the dataset
df = treets.file_loader('data/col_test_data/yrt*')
df.head(3)

	Unnamed: 0	original_logtime	desc_text	food_type	PID
0	0	2021-05-12 02:30:00 +0000	Milk	b	yrt1999
1	1	2021-05-12 02:45:00 +0000	Some Medication	m	yrt1999
2	2	2021-05-12 04:45:00 +0000	bacon egg	f	yrt1999

treets.clean_loggings(df, 'desc_text', 'PID').head(3)

	PID	desc_text	cleaned
0	yrt1999	Milk	[milk]
1	yrt1999	Some Medication	[medication]
2	yrt1999	bacon egg	[bacon, egg]

We can see that words are lower-cased, modifiers are removed (second row), and entries are split into individual items (third row).
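
These three effects (lower-casing, dropping modifiers, splitting into items) can be illustrated with a small standard-library sketch. It is not the treets implementation: the function name and the MODIFIERS word list are hypothetical, and a real cleaner would likely rely on a much larger vocabulary.

```python
import re

# Hypothetical list of modifier words to discard
MODIFIERS = {"some", "a", "an", "the", "of"}


def clean_description(text):
    """Lower-case a food description and split it into individual item words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in MODIFIERS]


for description in ["Milk", "Some Medication", "bacon egg"]:
    print(clean_description(description))
```

For the three descriptions shown in the table, the sketch yields ['milk'], ['medication'] and ['bacon', 'egg'].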

Visualizations

# import the dataset
df = treets.file_loader('data/test_food_details.csv')
df.head(2)

	Unnamed: 0	ID	unique_code	research_info_id	desc_text	food_type	original_logtime	foodimage_file_name
0	1340147	7572733	alqt14018795225	150	Water	w	2017-12-08 17:30:00+00:00	NaN
1	1340148	411111	alqt14018795225	150	Coffee White	b	2017-12-09 00:01:00+00:00	NaN

Make a scatter plot of people’s breakfast times.

# create required features for function first_cal_mean_with_error_bar()
df['original_logtime'] = pd.to_datetime(df['original_logtime'])
df['local_time'] = treets.find_float_time(df, 'original_logtime')
df['date'] = treets.find_date(df, 'original_logtime')

# call the function
treets.first_cal_mean_with_error_bar(df,'unique_code', 'date', 'local_time')
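
Judging by its name, first_cal_mean_with_error_bar() plots, for each person, the mean time of the first caloric entry of the day with an error bar. The sketch below computes that kind of summary without plotting. It assumes that the food_type values 'f' and 'b' (food and beverage) are the caloric ones and that the error bar is a sample standard deviation. Both are assumptions, as are the function name and the toy records.

```python
from collections import defaultdict
from statistics import mean, stdev

CALORIC_TYPES = {"f", "b"}  # assumption: food and beverage count as caloric


def first_caloric_stats(records):
    """Map person -> (mean, std) of their earliest caloric entry per day.

    Each record is (person, date, float_time, food_type).
    """
    earliest = defaultdict(dict)  # person -> {date: earliest float_time}
    for person, date, float_time, food_type in records:
        if food_type not in CALORIC_TYPES:
            continue
        days = earliest[person]
        days[date] = min(float_time, days.get(date, float_time))

    stats = {}
    for person, days in earliest.items():
        times = list(days.values())
        # A single day has no spread, so report 0.0 instead of raising
        spread = stdev(times) if len(times) > 1 else 0.0
        stats[person] = (mean(times), spread)
    return stats


# Toy records, invented for illustration
records = [
    ("p1", "2021-05-12", 6.5, "w"),  # water is ignored
    ("p1", "2021-05-12", 8.0, "b"),
    ("p1", "2021-05-12", 12.0, "f"),
    ("p1", "2021-05-13", 9.0, "f"),
]
print(first_caloric_stats(records))
```

Here the first caloric entries are 8.0 and 9.0, so the mean for p1 is 8.5.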

Use swarmplot to visualize each person’s eating time distribution.

treets.swarmplot(df, 50, 'unique_code', 'date', 'local_time')

FleischerResearchLab/treets
