Data augmentation #7
base: main
Conversation
Some comments
benchmark_utils/transformation.py
Outdated
```diff
@@ -6,15 +6,14 @@
 # - getting requirements info when all dependencies are not installed.
 with safe_import_context() as import_ctx:
     import numpy as np
+    from numpy import concatenate
```
Prefer using `np.concatenate` instead of importing the function, as this makes it easier to read the code.
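A minimal sketch of the suggested style, using hypothetical arrays (the variable names are illustrative, not from the PR):

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# Module-qualified call: the NumPy origin is obvious at the call site
stacked = np.concatenate([a, b], axis=0)
print(stacked.shape)  # (4, 3)
```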
benchmark_utils/transformation.py (Outdated)
```python
X_augm = to_numpy(X)
y_augm = y
for i in range(n_augmentation):
    transform = ChannelsDropout(probability=probability)
```
Why not instantiate this object outside the loop?
Also, a seed should be given, otherwise the benchmark is not reproducible.
So I am not sure why you made this change, can you comment?
I put it inside the loop because, when it was outside, the augmentation was always giving the same `X_tr`.
And I have the same issue when I fix the seed.
Maybe we could predefine a sequence of seeds, so that each transformation gets a different seed. Do you have another idea?
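One way to sketch the "sequence of seeds" idea with plain NumPy, assuming NumPy >= 1.17 for `SeedSequence` (the variable names are hypothetical, not from the PR):

```python
import numpy as np

# Spawn one independent child seed per transformation
children = np.random.SeedSequence(0).spawn(3)
rngs = [np.random.default_rng(s) for s in children]

# Each transformation draws different values...
draws = [rng.uniform() for rng in rngs]
assert len(set(draws)) == 3

# ...and rebuilding from the same root seed reproduces them exactly
children2 = np.random.SeedSequence(0).spawn(3)
draws2 = [np.random.default_rng(s).uniform() for s in children2]
assert draws == draws2
```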
Yes, having a sequence of seeds is indeed a nice idea.
But I would look at the `Transform` object and try to understand what is happening, because the standard for data augmentation is that repeated calls to the transform should give different augmented data.
Can you write a simple example and check the behavior of the object?
Okay, I think I know the exact issue. What defines the transformation is `mask_len_samples` and `mask_start_per_sample`. They are generated "randomly" in `transform.get_augmentation_params`; the problem is that, to choose randomly, the transformation uses `rng.uniform` with a seed that is fixed. So we always get the same parameters for the augmentation, and at each iteration we get the same augmented data.
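The behavior described above can be reproduced with plain NumPy: re-creating a generator from the same integer seed on every call yields identical draws, while a single shared generator advances between calls (`draw_params` is a hypothetical stand-in for `get_augmentation_params`, not the actual braindecode code):

```python
import numpy as np

def draw_params(random_state):
    # Re-creating the RNG from a fixed int seed on every call:
    # the "random" parameter never changes
    rng = np.random.RandomState(random_state)
    return rng.uniform()

# Same parameters on every call -> same augmented data every iteration
assert draw_params(0) == draw_params(0)

# With one shared generator, repeated calls consume fresh draws
shared = np.random.RandomState(0)
assert shared.uniform() != shared.uniform()
```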
benchmark_utils/transformation.py (Outdated)
```python
    probability=probability,
    mask_len_samples=mask_len_samples,
)
```
weird formatting
Some comments about the PR :)
```diff
 def channels_dropout(
-    X, y, n_augmentation, seed=0, probability=0.5, p_drop=0.2
+    X, y, n_augmentation, probability=0.5, p_drop=0.2
```
Suggested change:
```diff
-    X, y, n_augmentation, probability=0.5, p_drop=0.2
+    X, y, n_augmentation, probability=0.5, p_drop=0.2, seed=None
```
```diff
@@ -43,51 +47,58 @@ def channels_dropout(
     The labels.

     """
-    transform = ChannelsDropout(probability=probability, random_state=seed)
+    seed = gen_seed()
```
Suggested change:
```diff
-    seed = gen_seed()
+    rng = np.random.RandomState(seed)
```
For a reproducible sequence of random numbers, the best approach is to use a random number generator, which will sample a different number for each draw, but in a reproducible order.
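A minimal illustration of the point above: two generators built from the same seed produce different numbers within a run, but the exact same sequence across runs.

```python
import numpy as np

rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

seq_a = [rng_a.uniform() for _ in range(3)]
seq_b = [rng_b.uniform() for _ in range(3)]

# Each draw differs from the previous one...
assert len(set(seq_a)) == 3
# ...but the whole sequence is reproducible from the seed
assert seq_a == seq_b
```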
```python
transform = ChannelsDropout(
    probability=probability,
    random_state=next(seed)
)
```
Suggested change:
```diff
 transform = ChannelsDropout(
     probability=probability,
-    random_state=next(seed)
+    random_state=rng
 )
```
```diff
     return X_augm, y_augm


 def smooth_timemask(
-    X, y, n_augmentation, sfreq, seed=0, probability=0.5, second=0.1
+    X, y, n_augmentation, sfreq, probability=0.8, second=0.2
```
Suggested change:
```diff
-    X, y, n_augmentation, sfreq, probability=0.8, second=0.2
+    X, y, n_augmentation, sfreq, probability=0.8, second=0.2, seed=None
```
```diff
@@ -13,7 +13,7 @@
     SmoothTimeMask,
 )
 from braindecode.models import ShallowFBCSPNet
-from numpy import linspace, pi
+from numpy import linspace
```
Suggested change:
```diff
-from numpy import linspace
+import numpy as np
```
```diff
@@ -96,25 +96,25 @@ def set_objective(self, X, y, sfreq):
         mask_len_samples=int(sfreq * second),
         random_state=seed,
     )
-    for second in linspace(0.1, 2, 10)
+    for second in linspace(0.1, 2, 3)
```
Suggested change:
```diff
-    for second in linspace(0.1, 2, 3)
+    for second in np.linspace(0.1, 2, 3)
```
Why did you decrease the number of samples?
It was only to have faster results
```python
"augmentation": [
    "SmoothTimeMask",
],
"covariances_estimator": ["oas"],
```
Suggested change:
```diff
-"augmentation": [
-    "SmoothTimeMask",
-],
-"covariances_estimator": ["oas"],
+"covariances_estimator": ["oas"],
```
```diff
         ]

     elif self.augmentation == "ChannelDropout":
         transforms = [
             ChannelsDropout(
                 probability=self.proba, p_drop=prob, random_state=seed
             )
-            for prob in linspace(0, 1, 10)
+            for prob in linspace(0, 1, 3)
```
Suggested change:
```diff
-            for prob in linspace(0, 1, 3)
+            for prob in np.linspace(0, 1, 3)
```
```diff
                 random_state=seed,
             )
-            for prob in linspace(0, 2 * pi, 10)
+            for phase_freq in linspace(0, 1, 3)
```
Suggested change:
```diff
-            for phase_freq in linspace(0, 1, 3)
+            for phase_freq in np.linspace(0, 1, 3)
```
Co-authored-by: Thomas Moreau <[email protected]>
The augmentation process is now working as we expected, and we improved the way we compute the scores.