[BUG] Custom model not able to detect ROIs after training on cyto3, #1071

duckyaisha opened this issue Dec 3, 2024 · 2 comments
duckyaisha commented Dec 3, 2024

Describe the bug
I have been struggling with this for a few days. Please bear with me: I am not very good at coding yet, and trying to fix this problem has tested the limits of my admittedly minimal knowledge of Python and coding in general.
I am trying to generate ROIs of simple, GFP-transfected CHO cells, which I can then load into FIJI to extract fluorescence intensity data from another channel.

The default cyto3 model works very well in general. HOWEVER, I have to set the diameter to between 95 and 107 to capture most of the ROIs; it detects very few ROIs with the default diameter of 30.
The problem is that sometimes it also detects minuscule ROIs that have no area or don't correspond to a cell at all. I thought that by annotating about 40 images in which I removed these small ROIs, I could train a model that would learn not to detect them.
Instead, every trained model detects 0 ROIs, even though cyto3 detects between 40 and 60 per image in the images I'm using for training.
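
For readers hitting the same tiny-ROI problem, one option that avoids retraining is to filter the label mask by area after segmentation. The following is a minimal sketch, assuming the mask is a 2D label image (0 for background, one positive integer per ROI) held as nested lists; real Cellpose masks are NumPy arrays. The helper names `remove_small_labels` and `min_area_from_diameter`, and the 10% fraction, are illustrative choices and not part of Cellpose.

```python
import math
from collections import Counter


def min_area_from_diameter(diameter, fraction=0.1):
    """Area threshold: a fraction of the area of a circle with this diameter.

    The 0.1 default is an arbitrary illustrative value, not a Cellpose setting.
    """
    return fraction * math.pi * (diameter / 2) ** 2


def remove_small_labels(masks, min_area):
    """Set every label covering fewer than min_area pixels to background (0).

    masks: list of rows of ints, where 0 is background. Labels that survive
    keep their original values (they are not renumbered).
    """
    areas = Counter(value for row in masks for value in row if value != 0)
    keep = {label for label, area in areas.items() if area >= min_area}
    return [[value if value in keep else 0 for value in row] for row in masks]


# Toy example: label 1 covers 4 pixels, label 2 covers 1 pixel.
toy_masks = [
    [1, 1, 0],
    [1, 1, 2],
    [0, 0, 0],
]
filtered = remove_small_labels(toy_masks, min_area=2)
print(filtered)
```

With a diameter near 97 pixels, `min_area_from_diameter(97.5)` gives a threshold of roughly 750 pixels, which would discard specks while leaving cell-sized ROIs alone. The right fraction depends on the data and should be checked by eye.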

I have so far tried:

Training from scratch (i.e., no default model): 0 ROIs detected
Training from models other than cyto3 (namely the bac_fluor model, which has worked as a starting model for identifying bacteria in a much more challenging set of images): 0 ROIs detected
Forcing the use of RAdam instead of SGD, as was suggested previously. This required running the training from the command line instead of the GUI, which I did eventually figure out: 0 ROIs detected
Removing all images that have fewer than 20 ROIs from the training set: 0 ROIs detected
Increasing the number of epochs from 100 to 300 to 500 to 1000: 0 ROIs detected in all conditions

Reproducing
Here is a Google Drive link containing the images I have been using for training, plus the _seg.npy outputs: https://drive.google.com/drive/folders/1884j88XyEyndEBN8BfIIyQk4b2wUc2lU?usp=sharing
Could you see whether you can train on these using cyto3 as a baseline?

Run log
Here is the terminal output from the last time I tried to train on cyto3 and make a model without those annoying little ROIs.

```
2024-12-03 12:49:15,919 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f25d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:15,976 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f26d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:16,019 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f27d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:16,063 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f28d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:16,104 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f29d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:16,145 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f30d1.tif'> OME series is missing 1 frames. Missing data are zeroed
2024-12-03 12:49:44,219 [INFO] training with ['scan_Plate_R_p00_0_C04f25d1.tif', 'scan_Plate_R_p00_0_C04f26d1.tif', 'scan_Plate_R_p00_0_C04f27d1.tif', 'scan_Plate_R_p00_0_C04f28d1.tif', 'scan_Plate_R_p00_0_C04f29d1.tif', 'scan_Plate_R_p00_0_C04f30d1.tif']
2024-12-03 12:49:44,219 [INFO] training new model starting at model cyto3
2024-12-03 12:49:44,219 [INFO] training with chan = 0: gray, chan2 = 0: none
2024-12-03 12:49:44,219 [INFO] >> cyto3 << model set to be used
2024-12-03 12:49:44,220 [INFO] ** TORCH MPS version installed and working. **
2024-12-03 12:49:44,221 [INFO] >>>> using GPU (MPS)
2024-12-03 12:49:44,307 [INFO] >>>> loading model /Users/Alexander.Morano/.cellpose/models/cyto3
2024-12-03 12:49:44,388 [INFO] >>>> model diam_mean = 30.000 (ROIs rescaled to this size during training)
GUI_INFO: name of new model: CP_20241203_124811
2024-12-03 12:49:44,388 [INFO] computing flows for labels
100%|█████████████████████████████████████████████| 6/6 [00:14<00:00, 2.43s/it]
2024-12-03 12:49:58,999 [INFO] >>> computing diameters
100%|████████████████████████████████████████████| 6/6 [00:00<00:00, 100.68it/s]
2024-12-03 12:49:59,059 [INFO] >>> using channels [0, 0]
2024-12-03 12:49:59,059 [INFO] >>> normalizing {'lowhigh': None, 'percentile': [1.0, 99.0], 'normalize': True, 'norm3D': True, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
2024-12-03 12:49:59,819 [INFO] >>> n_epochs=1000, n_train=6, n_test=None
2024-12-03 12:49:59,819 [INFO] >>> SGD, learning_rate=0.10000, weight_decay=0.00010, momentum=0.900
2024-12-03 12:50:00,366 [INFO] >>> saving model to /Users/Alexander.Morano/Desktop/cellpose_training/old_plate/untitled_folder/models/CP_20241203_124811
2024-12-03 12:50:01,234 [INFO] 0, train_loss=nan, test_loss=0.0000, LR=0.0000, time 0.87s
2024-12-03 12:50:03,862 [INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 3.50s
2024-12-03 12:50:06,394 [INFO] 10, train_loss=nan, test_loss=0.0000, LR=0.1000, time 6.03s
2024-12-03 12:50:11,418 [INFO] 20, train_loss=nan, test_loss=0.0000, LR=0.1000, time 11.05s
2024-12-03 12:50:16,631 [INFO] 30, train_loss=nan, test_loss=0.0000, LR=0.1000, time 16.26s
2024-12-03 12:50:21,745 [INFO] 40, train_loss=nan, test_loss=0.0000, LR=0.1000, time 21.38s
2024-12-03 12:50:26,784 [INFO] 50, train_loss=nan, test_loss=0.0000, LR=0.1000, time 26.42s
2024-12-03 12:50:31,839 [INFO] 60, train_loss=nan, test_loss=0.0000, LR=0.1000, time 31.47s
2024-12-03 12:50:36,883 [INFO] 70, train_loss=nan, test_loss=0.0000, LR=0.1000, time 36.52s
2024-12-03 12:50:41,890 [INFO] 80, train_loss=nan, test_loss=0.0000, LR=0.1000, time 41.52s
2024-12-03 12:50:46,994 [INFO] 90, train_loss=nan, test_loss=0.0000, LR=0.1000, time 46.63s
2024-12-03 12:50:52,022 [INFO] 100, train_loss=nan, test_loss=0.0000, LR=0.1000, time 51.66s
2024-12-03 12:50:57,150 [INFO] 110, train_loss=nan, test_loss=0.0000, LR=0.1000, time 56.78s
2024-12-03 12:51:02,305 [INFO] 120, train_loss=nan, test_loss=0.0000, LR=0.1000, time 61.94s
2024-12-03 12:51:07,308 [INFO] 130, train_loss=nan, test_loss=0.0000, LR=0.1000, time 66.94s
2024-12-03 12:51:12,327 [INFO] 140, train_loss=nan, test_loss=0.0000, LR=0.1000, time 71.96s
2024-12-03 12:51:17,350 [INFO] 150, train_loss=nan, test_loss=0.0000, LR=0.1000, time 76.98s
2024-12-03 12:51:22,393 [INFO] 160, train_loss=nan, test_loss=0.0000, LR=0.1000, time 82.03s
2024-12-03 12:51:27,499 [INFO] 170, train_loss=nan, test_loss=0.0000, LR=0.1000, time 87.13s
2024-12-03 12:51:32,573 [INFO] 180, train_loss=nan, test_loss=0.0000, LR=0.1000, time 92.21s
2024-12-03 12:51:37,602 [INFO] 190, train_loss=nan, test_loss=0.0000, LR=0.1000, time 97.24s
2024-12-03 12:51:42,637 [INFO] 200, train_loss=nan, test_loss=0.0000, LR=0.1000, time 102.27s
2024-12-03 12:51:47,677 [INFO] 210, train_loss=nan, test_loss=0.0000, LR=0.1000, time 107.31s
2024-12-03 12:51:52,751 [INFO] 220, train_loss=nan, test_loss=0.0000, LR=0.1000, time 112.39s
2024-12-03 12:51:57,823 [INFO] 230, train_loss=nan, test_loss=0.0000, LR=0.1000, time 117.46s
2024-12-03 12:52:02,951 [INFO] 240, train_loss=nan, test_loss=0.0000, LR=0.1000, time 122.58s
2024-12-03 12:52:08,185 [INFO] 250, train_loss=nan, test_loss=0.0000, LR=0.1000, time 127.82s
2024-12-03 12:52:13,248 [INFO] 260, train_loss=nan, test_loss=0.0000, LR=0.1000, time 132.88s
2024-12-03 12:52:18,512 [INFO] 270, train_loss=nan, test_loss=0.0000, LR=0.1000, time 138.15s
2024-12-03 12:52:23,587 [INFO] 280, train_loss=nan, test_loss=0.0000, LR=0.1000, time 143.22s
2024-12-03 12:52:28,610 [INFO] 290, train_loss=nan, test_loss=0.0000, LR=0.1000, time 148.24s
2024-12-03 12:52:33,724 [INFO] 300, train_loss=nan, test_loss=0.0000, LR=0.1000, time 153.36s
2024-12-03 12:52:38,912 [INFO] 310, train_loss=nan, test_loss=0.0000, LR=0.1000, time 158.55s
2024-12-03 12:52:43,991 [INFO] 320, train_loss=nan, test_loss=0.0000, LR=0.1000, time 163.63s
2024-12-03 12:52:49,064 [INFO] 330, train_loss=nan, test_loss=0.0000, LR=0.1000, time 168.70s
2024-12-03 12:52:54,048 [INFO] 340, train_loss=nan, test_loss=0.0000, LR=0.1000, time 173.68s
2024-12-03 12:52:59,019 [INFO] 350, train_loss=nan, test_loss=0.0000, LR=0.1000, time 178.65s
2024-12-03 12:53:04,085 [INFO] 360, train_loss=nan, test_loss=0.0000, LR=0.1000, time 183.72s
2024-12-03 12:53:09,124 [INFO] 370, train_loss=nan, test_loss=0.0000, LR=0.1000, time 188.76s
2024-12-03 12:53:14,248 [INFO] 380, train_loss=nan, test_loss=0.0000, LR=0.1000, time 193.88s
2024-12-03 12:53:19,416 [INFO] 390, train_loss=nan, test_loss=0.0000, LR=0.1000, time 199.05s
2024-12-03 12:53:24,425 [INFO] 400, train_loss=nan, test_loss=0.0000, LR=0.1000, time 204.06s
2024-12-03 12:53:29,634 [INFO] 410, train_loss=nan, test_loss=0.0000, LR=0.1000, time 209.27s
2024-12-03 12:53:34,740 [INFO] 420, train_loss=nan, test_loss=0.0000, LR=0.1000, time 214.37s
2024-12-03 12:53:39,817 [INFO] 430, train_loss=nan, test_loss=0.0000, LR=0.1000, time 219.45s
2024-12-03 12:53:44,851 [INFO] 440, train_loss=nan, test_loss=0.0000, LR=0.1000, time 224.49s
2024-12-03 12:53:49,923 [INFO] 450, train_loss=nan, test_loss=0.0000, LR=0.1000, time 229.56s
2024-12-03 12:53:55,020 [INFO] 460, train_loss=nan, test_loss=0.0000, LR=0.1000, time 234.65s
2024-12-03 12:54:00,040 [INFO] 470, train_loss=nan, test_loss=0.0000, LR=0.1000, time 239.67s
2024-12-03 12:54:05,191 [INFO] 480, train_loss=nan, test_loss=0.0000, LR=0.1000, time 244.82s
2024-12-03 12:54:10,331 [INFO] 490, train_loss=nan, test_loss=0.0000, LR=0.1000, time 249.97s
2024-12-03 12:54:15,347 [INFO] 500, train_loss=nan, test_loss=0.0000, LR=0.1000, time 254.98s
2024-12-03 12:54:20,658 [INFO] 510, train_loss=nan, test_loss=0.0000, LR=0.1000, time 260.29s
2024-12-03 12:54:25,725 [INFO] 520, train_loss=nan, test_loss=0.0000, LR=0.1000, time 265.36s
2024-12-03 12:54:30,811 [INFO] 530, train_loss=nan, test_loss=0.0000, LR=0.1000, time 270.44s
2024-12-03 12:54:35,895 [INFO] 540, train_loss=nan, test_loss=0.0000, LR=0.1000, time 275.53s
2024-12-03 12:54:41,073 [INFO] 550, train_loss=nan, test_loss=0.0000, LR=0.1000, time 280.71s
2024-12-03 12:54:46,296 [INFO] 560, train_loss=nan, test_loss=0.0000, LR=0.1000, time 285.93s
2024-12-03 12:54:51,456 [INFO] 570, train_loss=nan, test_loss=0.0000, LR=0.1000, time 291.09s
2024-12-03 12:54:56,469 [INFO] 580, train_loss=nan, test_loss=0.0000, LR=0.1000, time 296.10s
2024-12-03 12:55:01,617 [INFO] 590, train_loss=nan, test_loss=0.0000, LR=0.1000, time 301.25s
2024-12-03 12:55:06,719 [INFO] 600, train_loss=nan, test_loss=0.0000, LR=0.1000, time 306.35s
2024-12-03 12:55:11,825 [INFO] 610, train_loss=nan, test_loss=0.0000, LR=0.1000, time 311.46s
2024-12-03 12:55:16,819 [INFO] 620, train_loss=nan, test_loss=0.0000, LR=0.1000, time 316.45s
2024-12-03 12:55:21,880 [INFO] 630, train_loss=nan, test_loss=0.0000, LR=0.1000, time 321.51s
2024-12-03 12:55:27,092 [INFO] 640, train_loss=nan, test_loss=0.0000, LR=0.1000, time 326.73s
2024-12-03 12:55:32,194 [INFO] 650, train_loss=nan, test_loss=0.0000, LR=0.1000, time 331.83s
2024-12-03 12:55:37,367 [INFO] 660, train_loss=nan, test_loss=0.0000, LR=0.1000, time 337.00s
2024-12-03 12:55:42,365 [INFO] 670, train_loss=nan, test_loss=0.0000, LR=0.1000, time 342.00s
2024-12-03 12:55:47,405 [INFO] 680, train_loss=nan, test_loss=0.0000, LR=0.1000, time 347.04s
2024-12-03 12:55:52,401 [INFO] 690, train_loss=nan, test_loss=0.0000, LR=0.1000, time 352.04s
2024-12-03 12:55:57,513 [INFO] 700, train_loss=nan, test_loss=0.0000, LR=0.1000, time 357.15s
2024-12-03 12:56:02,576 [INFO] 710, train_loss=nan, test_loss=0.0000, LR=0.1000, time 362.21s
2024-12-03 12:56:07,571 [INFO] 720, train_loss=nan, test_loss=0.0000, LR=0.1000, time 367.20s
2024-12-03 12:56:12,654 [INFO] 730, train_loss=nan, test_loss=0.0000, LR=0.1000, time 372.29s
2024-12-03 12:56:17,701 [INFO] 740, train_loss=nan, test_loss=0.0000, LR=0.1000, time 377.34s
2024-12-03 12:56:22,759 [INFO] 750, train_loss=nan, test_loss=0.0000, LR=0.1000, time 382.39s
2024-12-03 12:56:27,838 [INFO] 760, train_loss=nan, test_loss=0.0000, LR=0.1000, time 387.47s
2024-12-03 12:56:32,834 [INFO] 770, train_loss=nan, test_loss=0.0000, LR=0.1000, time 392.47s
2024-12-03 12:56:37,782 [INFO] 780, train_loss=nan, test_loss=0.0000, LR=0.1000, time 397.42s
2024-12-03 12:56:42,877 [INFO] 790, train_loss=nan, test_loss=0.0000, LR=0.1000, time 402.51s
2024-12-03 12:56:48,049 [INFO] 800, train_loss=nan, test_loss=0.0000, LR=0.1000, time 407.68s
2024-12-03 12:56:53,209 [INFO] 810, train_loss=nan, test_loss=0.0000, LR=0.1000, time 412.84s
2024-12-03 12:56:58,218 [INFO] 820, train_loss=nan, test_loss=0.0000, LR=0.1000, time 417.85s
2024-12-03 12:57:03,236 [INFO] 830, train_loss=nan, test_loss=0.0000, LR=0.1000, time 422.87s
2024-12-03 12:57:08,401 [INFO] 840, train_loss=nan, test_loss=0.0000, LR=0.1000, time 428.04s
2024-12-03 12:57:13,435 [INFO] 850, train_loss=nan, test_loss=0.0000, LR=0.1000, time 433.07s
2024-12-03 12:57:18,535 [INFO] 860, train_loss=nan, test_loss=0.0000, LR=0.1000, time 438.17s
2024-12-03 12:57:23,577 [INFO] 870, train_loss=nan, test_loss=0.0000, LR=0.1000, time 443.21s
2024-12-03 12:57:28,553 [INFO] 880, train_loss=nan, test_loss=0.0000, LR=0.1000, time 448.19s
2024-12-03 12:57:33,504 [INFO] 890, train_loss=nan, test_loss=0.0000, LR=0.1000, time 453.14s
2024-12-03 12:57:38,488 [INFO] 900, train_loss=nan, test_loss=0.0000, LR=0.0500, time 458.12s
2024-12-03 12:57:43,568 [INFO] 910, train_loss=nan, test_loss=0.0000, LR=0.0250, time 463.20s
2024-12-03 12:57:48,638 [INFO] 920, train_loss=nan, test_loss=0.0000, LR=0.0125, time 468.27s
2024-12-03 12:57:53,791 [INFO] 930, train_loss=nan, test_loss=0.0000, LR=0.0063, time 473.43s
2024-12-03 12:57:58,935 [INFO] 940, train_loss=nan, test_loss=0.0000, LR=0.0031, time 478.57s
2024-12-03 12:58:03,986 [INFO] 950, train_loss=nan, test_loss=0.0000, LR=0.0016, time 483.62s
2024-12-03 12:58:09,090 [INFO] 960, train_loss=nan, test_loss=0.0000, LR=0.0008, time 488.72s
2024-12-03 12:58:14,165 [INFO] 970, train_loss=nan, test_loss=0.0000, LR=0.0004, time 493.80s
2024-12-03 12:58:19,211 [INFO] 980, train_loss=nan, test_loss=0.0000, LR=0.0002, time 498.85s
2024-12-03 12:58:24,203 [INFO] 990, train_loss=nan, test_loss=0.0000, LR=0.0001, time 503.84s
/Users/Alexander.Morano/Desktop/cellpose_training/old_plate/untitled_folder/models/CP_20241203_124811 copied to models folder /Users/Alexander.Morano/.cellpose/models
GUI_INFO: selected model CP_20241203_124811, loading now
2024-12-03 12:58:28,847 [INFO] >> CP_20241203_124811 << model set to be used
2024-12-03 12:58:28,848 [INFO] ** TORCH MPS version installed and working. **
2024-12-03 12:58:28,848 [INFO] >>>> using GPU (MPS)
2024-12-03 12:58:28,921 [INFO] >>>> loading model /Users/Alexander.Morano/.cellpose/models/CP_20241203_124811
2024-12-03 12:58:28,983 [INFO] >>>> model diam_mean = 30.000 (ROIs rescaled to this size during training)
2024-12-03 12:58:28,983 [INFO] >>>> model diam_labels = 97.534 (mean diameter of training ROIs)
GUI_INFO: diameter set to 97.53 (but can be changed)
2024-12-03 12:58:29,014 [INFO] >>>> diameter set to diam_labels ( = 97.534 )
2024-12-03 12:58:29,018 [WARNING] <tifffile.TiffFile 'scan_Plate_R_p00_0_C04f26d1.tif'> OME series is missing 1 frames. Missing data are zeroed
GUI_INFO: restore: None
GUI_INFO: normalization checked: computing saturation levels (and optionally filtered image)
{'lowhigh': None, 'percentile': [1.0, 99.0], 'normalize': True, 'norm3D': False, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
[0, 255.0]
(1, 1536, 2048, 3)
GUI_INFO: 42 masks found
GUI_INFO: creating cellcolors and drawing masks
GUI_INFO: loaded in previous changes
2024-12-03 12:58:29,389 [INFO] ** TORCH MPS version installed and working. **
2024-12-03 12:58:29,389 [INFO] >>>> using GPU (MPS)
2024-12-03 12:58:29,460 [INFO] >>>> loading model /Users/Alexander.Morano/.cellpose/models/CP_20241203_124811
2024-12-03 12:58:29,522 [INFO] >>>> model diam_mean = 30.000 (ROIs rescaled to this size during training)
2024-12-03 12:58:29,522 [INFO] >>>> model diam_labels = 97.534 (mean diameter of training ROIs)
{'lowhigh': None, 'percentile': [1.0, 99.0], 'normalize': True, 'norm3D': False, 'sharpen_radius': 0, 'smooth_radius': 0, 'tile_norm_blocksize': 0, 'tile_norm_smooth3D': 1, 'invert': False}
2024-12-03 12:58:29,957 [INFO] No cell pixels found.
2024-12-03 12:58:30,178 [INFO] 0 cells found with model in 0.814 sec
GUI_INFO: 0 masks found
GUI_INFO: creating cellcolors and drawing masks
2024-12-03 12:58:30,237 [INFO] !!! computed masks for scan_Plate_R_p00_0_C04f26d1.tif from new model !!!
```
I suspect the "train_loss=nan, test_loss=0.0000" output(s) have something to do with the failure, but I have no clue how to resolve that problem, or what could be causing it.
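
A quick way to confirm that suspicion on a saved log is to scan the progress lines for a NaN training loss. This is a minimal sketch, assuming the log lines follow the `[INFO] <epoch>, train_loss=<value>, ...` layout shown in the run log; the function name `first_nan_epoch` is hypothetical, not a Cellpose utility.

```python
import math
import re

# Matches progress lines such as:
# "... [INFO] 5, train_loss=nan, test_loss=0.0000, LR=0.0556, time 3.50s"
PROGRESS = re.compile(r"\[INFO\]\s+(\d+), train_loss=([^,\s]+)")


def first_nan_epoch(log_lines):
    """Return the first logged epoch whose train_loss is NaN, or None."""
    for line in log_lines:
        match = PROGRESS.search(line)
        if match is None:
            continue  # not a per-epoch progress line
        epoch = int(match.group(1))
        train_loss = float(match.group(2))  # float("nan") parses fine
        if math.isnan(train_loss):
            return epoch
    return None


sample = [
    "2024-12-03 12:49:59,819 [INFO] >>> n_epochs=1000, n_train=6, n_test=None",
    "2024-12-03 12:50:01,234 [INFO] 0, train_loss=nan, test_loss=0.0000, LR=0.0000, time 0.87s",
]
print(first_nan_epoch(sample))
```

If the first logged epoch is already NaN, as it is here, more epochs will not help: every later line reports NaN as well, and the resulting model finds no cells.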

duckyaisha added the bug label on Dec 3, 2024
@sun1000yao

Hi,

I have a similar training issue. I started by running cyto3 (the best-performing model), then manually corrected the masks, and then trained a new model on those. I did several iterations but still got no cells detected. I am not sure whether this is an inherent bug or due to the image features. Here is an example image used for training.
[image: chx10-GFP-f2_z01-1]

Looking forward to suggestions

@duckyaisha

Hi! I figured out how to fix this: apparently this is a known problem in earlier versions of Cellpose. I switched to the most recent version on GitHub and unchecked (deactivated) the GPU option. Running on CPU takes longer, but with the most recent version of Cellpose and no GPU, it all works fine!!
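
The workaround above can be automated in a script: try the GPU first and retrain on the CPU only if the losses come back as NaN. The following is a rough sketch, not Cellpose code. `train_fn` is a hypothetical callable that the reader would wrap around their own training call; it takes a `use_gpu` flag and returns the list of training losses.

```python
import math


def train_with_cpu_fallback(train_fn):
    """Train on the GPU first; retry on the CPU if any loss is NaN.

    train_fn is a hypothetical callable: train_fn(use_gpu) -> list of floats,
    one training loss per logged epoch.
    Returns a (device_name, losses) tuple.
    """
    losses = train_fn(True)
    if not any(math.isnan(loss) for loss in losses):
        return "gpu", losses
    # A NaN loss suggests the GPU run is unusable, so repeat it on the CPU.
    return "cpu", train_fn(False)


# Stand-in for a real training call: the GPU run yields NaN, the CPU run does not.
def fake_train(use_gpu):
    return [float("nan")] if use_gpu else [0.9, 0.5]


device, losses = train_with_cpu_fallback(fake_train)
print(device, losses)
```

Training twice costs extra time when the GPU run fails, so on a machine where NaN losses are already known to occur it is simpler to disable the GPU from the start, as described above.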
