Long_read problem #138

Open
ZarulHanifah opened this issue Aug 1, 2023 · 1 comment
Comments

@ZarulHanifah

Hello SemiBin developers,

Thank you for the software. It worked well with my Nanopore data (R9.4.1, Guppy v5, super-accurate basecalling configuration) until I tried --sequencing-type=long_read. Any ideas?

Command:

SemiBin single_easy_bin -i results/proovframe/assem.fasta \
 -b results/minimap2.bam \
--depth-metabat2 results/depth.tsv \
-r /home/mzar0002/pg32_scratch/db/SemiBin_db \
--environment soil \
--sequencing-type=long_read \
-o results/binning/semibin \
-p 8 &> results/log/semibin/log.log

Log:

/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/long_read_cluster.py:77: RuntimeWarning: divide by zero encountered in log
  embedding_new = np.concatenate((embedding, np.log(depth)), axis=1)
Traceback (most recent call last):
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/bin/SemiBin", line 10, in <module>
    sys.exit(main1())
             ^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/main.py", line 1482, in main1
    main2(args, is_semibin2=False)
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/main.py", line 1455, in main2
    single_easy_binning(
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/main.py", line 1183, in single_easy_binning
    binning_long(**binning_kwargs)
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/main.py", line 1061, in binning_long
    cluster_long_read(model,
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/SemiBin/long_read_cluster.py", line 101, in cluster_long_read
    dist_matrix = kneighbors_graph(
                  ^^^^^^^^^^^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/neighbors/_graph.py", line 122, in kneighbors_graph
    ).fit(X)
      ^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/base.py", line 1151, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/neighbors/_unsupervised.py", line 178, in fit
    return self._fit(X)
           ^^^^^^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/neighbors/_base.py", line 498, in _fit
    X = self._validate_data(X, accept_sparse="csr", order="C")
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/base.py", line 604, in _validate_data
    out = check_array(X, input_name="X", **check_params)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/utils/validation.py", line 959, in check_array
    _assert_all_finite(
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/utils/validation.py", line 124, in _assert_all_finite
    _assert_all_finite_element_wise(
  File "/fs03/ie79/Zarul/status_nanopore/C002_D1/.snakemake/conda/4555c0c8960801d84920076de87e12a0_/lib/python3.11/site-packages/sklearn/utils/validation.py", line 173, in _assert_all_finite_element_wise
    raise ValueError(msg_err)
ValueError: Input X contains infinity or a value too large for dtype('float32').
@psj1997
Collaborator

psj1997 commented Sep 19, 2023

Sorry for the late reply.

It seems there is a very large number that cannot be represented as 'float32'. Could you check the largest value in the depth column of the data.csv file? Thanks!

Sincerely
Shaojun
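
For anyone following up on this check: the RuntimeWarning in the log (divide by zero encountered in log) suggests a second thing worth looking at besides very large values. A depth of exactly 0 becomes -inf after np.log, and scikit-learn's finiteness check rejects that as well. Below is a minimal sketch, assuming the depth values have already been loaded into a NumPy array. The summarize_depth helper and the sample values are hypothetical and are not part of SemiBin.

```python
import numpy as np


def summarize_depth(depth):
    """Report values in a depth array that could break a log transform.

    Hypothetical helper for illustration only; not a SemiBin function.
    """
    depth = np.asarray(depth, dtype=np.float32)
    return {
        "min": float(depth.min()),
        "max": float(depth.max()),
        "n_zero": int((depth == 0).sum()),
        "n_nonfinite": int((~np.isfinite(depth)).sum()),
    }


# Made-up depth values; in practice these would come from the depth
# column of data.csv (the column layout is not shown in this thread).
depth = np.array([[12.5], [0.0], [3.2]], dtype=np.float32)

summary = summarize_depth(depth)
print(summary)

# np.log(0) evaluates to -inf, which matches the RuntimeWarning in the log.
with np.errstate(divide="ignore"):
    logged = np.log(depth)

print("log-transformed depth contains infinity:", bool(np.isinf(logged).any()))
```

If n_zero is greater than 0, zero-coverage contigs are a likely source of the infinity. If it is 0, the max value is the next thing to inspect.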
