Batch size variable #3

Open

dwalke04 opened this issue Nov 5, 2021 · 5 comments

dwalke04 commented Nov 5, 2021

Hello,
I noticed you have an option for batch size in your code. Does this need to be specified? Is it possible to analyze studies that have batches with different sizes?

My second question relates to run time. It has taken over 20 hours for the train function to run. My study included about 10,000 peaks, 3 batches, and 150 samples. Is this expected? Is there any way to speed up the code? This was on an i7 processor with 8 cores. It looks like the model runs ~1700 training iterations. Is it possible to change this?

Thanks!


dwalke04 commented Nov 6, 2021

I'm also receiving another error when I run the training function:
ImportError: dlopen(/opt/anaconda3/lib/python3.8/site-packages/scipy/linalg/_solve_toeplitz.cpython-38-darwin.so, 2): no suitable image found. Did find:
/opt/anaconda3/lib/python3.8/site-packages/scipy/linalg/_solve_toeplitz.cpython-38-darwin.so: open() failed with errno=23


dwalke04 commented Nov 8, 2021

One more question: I've managed to run the code successfully and have my results. The batch correction seems to have worked, but the number of features in my final results table has been reduced from around 11,000 to 4,000. Are there filtering criteria applied during the correction? If so, how is that threshold determined/set?

luyiyun (Owner) commented Nov 10, 2021

batch_size is a hyperparameter of deep neural network training: it is the number of samples sent to the network per parameter update (the mini-batch size). It does not refer to the number of samples in an experimental batch.
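
To illustrate the distinction, here is a minimal PyTorch sketch (my own example, not this repository's training loop; the data shapes are borrowed from the study described above):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical peak table: 150 samples x 10,000 peaks.
X = torch.randn(150, 10_000)
dataset = TensorDataset(X)

# batch_size controls how many samples feed each gradient update;
# it is unrelated to the 3 experimental batches in the study.
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for (xb,) in loader:
    # xb has shape (32, 10_000), except possibly the final mini-batch.
    pass  # forward pass, loss, backward pass, and optimizer step go here
```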


luyiyun commented Nov 10, 2021

The reduction in feature count is due to the data preprocessing (a rough sketch follows the list):

  1. remove peaks with more than 20% zero values
  2. for each remaining peak, impute its zeros with half of its minimum non-zero value
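
A sketch of these two steps with numpy/pandas (my own illustration, not the repository's preprocessing code; the variable names and synthetic data are made up):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical peak table: rows = samples, columns = peaks.
vals = rng.random((150, 11_000))
vals[rng.random(vals.shape) < 0.15] = 0.0  # sprinkle in some zeros
df = pd.DataFrame(vals)

# Step 1: drop peaks where more than 20% of samples are zero.
zero_frac = (df == 0).mean(axis=0)
df = df.loc[:, zero_frac <= 0.2]

# Step 2: replace each remaining peak's zeros with half of its
# minimum non-zero value.
x = df.to_numpy()
col_min = np.where(x > 0, x, np.inf).min(axis=0)  # per-peak non-zero minimum
x = np.where(x == 0, col_min / 2, x)
df = pd.DataFrame(x, index=df.index, columns=df.columns)
```

Step 1 is the filtering step, so it would account for the drop from ~11,000 to ~4,000 features reported above.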


luyiyun commented Nov 10, 2021

You can use -e e1 e2 e3 to adjust the number of training iterations: e1 is the number of autoencoder pretraining iterations, e2 is the number of discriminator pretraining iterations, and e3 is the number of adversarial training iterations.
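
As a sketch of how a three-valued option like this is typically parsed (hypothetical code, not the repository's actual CLI; the default values are placeholders):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "-e", nargs=3, type=int, metavar=("E1", "E2", "E3"),
    default=[1000, 100, 700],  # placeholder defaults, not the real ones
    help="iterations for autoencoder pretraining, discriminator "
         "pretraining, and adversarial training, respectively",
)

# e.g. roughly halving each phase to cut the ~20-hour run time:
args = parser.parse_args(["-e", "500", "50", "350"])
e1, e2, e3 = args.e
print(e1, e2, e3)  # -> 500 50 350
```

Smaller values of e1/e2/e3 trade training time against convergence quality, so it is worth re-checking the correction results after reducing them.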
