
Bert-LSR abnormal results #31

Open
Zeyu-Liang opened this issue Nov 5, 2020 · 6 comments
@Zeyu-Liang

@nanguoshun When running the updated Bert-LSR, I found that the results it produces are abnormal. What could be the cause? Thank you for your work.

@Zeyu-Liang
Author

| epoch 0 | step 50 | ms/b 2511.34 | train loss 5.877 | NA acc: 0.94 | not NA acc: 0.00 | tot acc: 0.91
| epoch 0 | step 100 | ms/b 2568.21 | train loss 0.406 | NA acc: 0.97 | not NA acc: 0.00 | tot acc: 0.94
| epoch 0 | step 150 | ms/b 2528.33 | train loss 0.403 | NA acc: 0.98 | not NA acc: 0.00 | tot acc: 0.95
| epoch 0 | step 200 | ms/b 2541.94 | train loss 0.396 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 0 | step 250 | ms/b 2476.36 | train loss 0.411 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 0 | step 300 | ms/b 2550.52 | train loss 0.386 | NA acc: 0.99 | not NA acc: 0.00 | tot acc: 0.96
| epoch 1 | step 350 | ms/b 2462.27 | train loss 0.399 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 400 | ms/b 2482.02 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 450 | ms/b 2594.16 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 500 | ms/b 2545.52 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 550 | ms/b 2510.27 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 1 | step 600 | ms/b 2484.04 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 650 | ms/b 2492.20 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 700 | ms/b 2492.06 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 750 | ms/b 2453.17 | train loss 0.396 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 800 | ms/b 2515.16 | train loss 0.376 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 850 | ms/b 2465.45 | train loss 0.408 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 2 | step 900 | ms/b 2433.18 | train loss 0.398 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 950 | ms/b 2543.93 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1000 | ms/b 2550.19 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1050 | ms/b 2774.69 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1100 | ms/b 2466.55 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1150 | ms/b 2507.94 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 3 | step 1200 | ms/b 2414.12 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1250 | ms/b 2502.15 | train loss 0.402 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1300 | ms/b 2483.49 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1350 | ms/b 2498.14 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1400 | ms/b 2549.41 | train loss 0.377 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1450 | ms/b 2533.72 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 4 | step 1500 | ms/b 2538.75 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1550 | ms/b 2453.19 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1600 | ms/b 2542.53 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1650 | ms/b 2487.09 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1700 | ms/b 2537.71 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1750 | ms/b 2548.65 | train loss 0.370 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 5 | step 1800 | ms/b 2401.13 | train loss 0.401 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1850 | ms/b 2538.25 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1900 | ms/b 2532.69 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 1950 | ms/b 2538.84 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2000 | ms/b 2591.51 | train loss 0.368 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2050 | ms/b 2540.94 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 6 | step 2100 | ms/b 2449.79 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2150 | ms/b 2439.32 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2200 | ms/b 2446.64 | train loss 0.372 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2250 | ms/b 2411.52 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2300 | ms/b 2541.08 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2350 | ms/b 2516.03 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 7 | step 2400 | ms/b 2455.05 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2450 | ms/b 2509.16 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2500 | ms/b 2486.06 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2550 | ms/b 2506.90 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2600 | ms/b 2420.65 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2650 | ms/b 2514.13 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 8 | step 2700 | ms/b 2505.45 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2750 | ms/b 2516.60 | train loss 0.396 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2800 | ms/b 2482.15 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2850 | ms/b 2633.90 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2900 | ms/b 2547.70 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 2950 | ms/b 2544.95 | train loss 0.381 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 3000 | ms/b 2514.13 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 9 | step 3050 | ms/b 2586.53 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3100 | ms/b 2583.68 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3150 | ms/b 2599.92 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3200 | ms/b 2557.08 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3250 | ms/b 2612.56 | train loss 0.371 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3300 | ms/b 2769.04 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 10 | step 3350 | ms/b 2666.34 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3400 | ms/b 2565.08 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3450 | ms/b 2633.72 | train loss 0.372 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3500 | ms/b 2746.64 | train loss 0.391 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3550 | ms/b 2613.45 | train loss 0.384 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3600 | ms/b 2663.32 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 11 | step 3650 | ms/b 2525.16 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3700 | ms/b 2663.08 | train loss 0.382 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3750 | ms/b 2771.10 | train loss 0.383 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3800 | ms/b 2650.00 | train loss 0.389 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3850 | ms/b 2600.85 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3900 | ms/b 2694.03 | train loss 0.368 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 12 | step 3950 | ms/b 2651.42 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4000 | ms/b 2643.70 | train loss 0.385 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4050 | ms/b 2628.64 | train loss 0.394 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4100 | ms/b 2606.33 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4150 | ms/b 2521.79 | train loss 0.394 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4200 | ms/b 2752.30 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 13 | step 4250 | ms/b 2819.76 | train loss 0.367 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4300 | ms/b 2591.43 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4350 | ms/b 2666.80 | train loss 0.389 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4400 | ms/b 2693.22 | train loss 0.386 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4450 | ms/b 2760.62 | train loss 0.381 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4500 | ms/b 2727.32 | train loss 0.366 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 14 | step 4550 | ms/b 2629.03 | train loss 0.392 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4600 | ms/b 2662.55 | train loss 0.378 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4650 | ms/b 2617.05 | train loss 0.387 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4700 | ms/b 2598.81 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4750 | ms/b 2529.91 | train loss 0.397 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4800 | ms/b 2712.70 | train loss 0.405 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 15 | step 4850 | ms/b 2676.14 | train loss 0.388 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 4900 | ms/b 2732.74 | train loss 0.373 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 4950 | ms/b 2750.27 | train loss 0.379 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5000 | ms/b 2618.98 | train loss 0.399 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5050 | ms/b 2756.20 | train loss 0.370 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5100 | ms/b 2676.40 | train loss 0.390 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 16 | step 5150 | ms/b 2546.86 | train loss 0.384 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5200 | ms/b 2693.64 | train loss 0.380 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5250 | ms/b 2665.40 | train loss 0.391 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5300 | ms/b 2625.90 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
| epoch 17 | step 5350 | ms/b 2591.55 | train loss 0.374 | NA acc: 1.00 | not NA acc: 0.00 | tot acc: 0.97
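The log above is the signature of a model that has collapsed onto the majority "NA" (no-relation) class: it is correct on every NA pair (NA acc 1.00), wrong on every real relation (not NA acc 0.00), and the total accuracy simply equals the NA fraction of the training pairs (~0.97). A minimal sketch with made-up labels (the function and label values are illustrative, not from the repo) shows how those three numbers arise:

```python
# Toy reproduction of the three accuracy columns in the log, assuming
# label 0 means "NA" (no relation). Illustrative only.
def class_accuracies(preds, golds, na_label=0):
    na = [(p, g) for p, g in zip(preds, golds) if g == na_label]
    not_na = [(p, g) for p, g in zip(preds, golds) if g != na_label]
    na_acc = sum(p == g for p, g in na) / len(na)
    not_na_acc = sum(p == g for p, g in not_na) / len(not_na)
    tot_acc = sum(p == g for p, g in zip(preds, golds)) / len(preds)
    return na_acc, not_na_acc, tot_acc

# 97 NA pairs, 3 real relations; a collapsed model predicts NA everywhere.
golds = [0] * 97 + [1, 2, 3]
preds = [0] * 100
print(class_accuracies(preds, golds))  # (1.0, 0.0, 0.97)
```

So a flat "not NA acc: 0.00" across 17 epochs means the relation classifier never learns anything beyond the class prior.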

@Zeyu-Liang
Author

All hyperparameters are identical to the updated version; batch size = 10.

@nanguoshun
Owner

Thanks for the feedback, @Zeyu-Liang. We will look into it and get back to you as soon as possible. Thanks!

@nanguoshun
Owner

nanguoshun commented Nov 11, 2020

@Zeyu-Liang We have fixed the issue, and you can now try to reproduce the results on BERT. Note that batch size = 20 (for BERT, a batch size of 15+ is suggested for fine-tuning) and hidden_size = 216 (it should be divisible by 12 due to the hyperparameter constraint of the GCN layers). In total you may need about 50 GB of GPU memory with this setting for BERT-base. Thanks a lot!
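The divisibility constraint above is the usual per-head dimension requirement: a hidden state split across 12 heads must divide evenly. A small sanity-check sketch (the function name and default are assumptions, not the repo's actual config code):

```python
# Hypothetical config check: hidden_size must split evenly across heads,
# so hidden_size = 216 with 12 heads gives an 18-dimensional slice per head.
def check_hidden_size(hidden_size: int, num_heads: int = 12) -> int:
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size={hidden_size} is not divisible by num_heads={num_heads}"
        )
    return hidden_size // num_heads  # per-head dimension

print(check_hidden_size(216))  # 18
```

Running such a check at startup fails fast on an invalid hidden_size instead of producing a shape error deep inside the GCN layers.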

@MingYangi

May I ask whether anyone else hit `RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor` when reproducing the results?

@stvhuang

@SeaYM Try downgrading your PyTorch version to 1.6.0.
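Downgrading works because PyTorch 1.7 started requiring the `lengths` argument of `pack_padded_sequence` to be a 1D int64 tensor on the CPU. An alternative to downgrading is to move the lengths to the CPU at the call site, a minimal sketch (tensor shapes and variable names here are made up, not the repo's actual code):

```python
# Sketch of the forward-compatible fix: pass lengths.cpu() to
# pack_padded_sequence. .cpu() is a no-op for CPU tensors and copies
# CUDA tensors to host memory, so the call is safe on either device.
import torch
from torch.nn.utils.rnn import pack_padded_sequence

x = torch.randn(3, 5, 8)            # (batch, seq_len, hidden), zero-padded
lengths = torch.tensor([5, 3, 2])   # on cuda:0 in the failing run

packed = pack_padded_sequence(x, lengths.cpu(), batch_first=True,
                              enforce_sorted=True)
print(tuple(packed.data.shape))     # (sum(lengths), hidden) = (10, 8)
```

This keeps the code working on PyTorch >= 1.7 without pinning the older version.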
