Harddrive missed and unknown message #24

Open
argawow opened this issue Feb 11, 2021 · 22 comments

Comments

@argawow

argawow commented Feb 11, 2021

Hi,
First, I have been using this script for a long time now. Thanks for it :)

For the past two days, one of my hard drives has been missing from the summary, and I get an unknown message.

Image with the missing drive:
image

Image without the missing drive.
image

Error message:
awk: newline in string 37267 newer... at source line 1
awk: newline in string Extended 19... at source line 1

FreeNAS doesn't report any errors for this drive at the moment. The pool is not degraded.

Thanks for help :)

@edgarsuit
Owner

Does ada1 show up when you run sysctl -n kern.disks (as root)?

@argawow
Author

argawow commented Feb 11, 2021

Hi,
here is the output of the command:

root@freenas:~ # sysctl -n kern.disks
da0 ada7 ada6 ada5 ada4 ada3 ada2 ada1 ada0 cd0

@edgarsuit
Owner

Gotcha, it shows up there, so that's good. The script then checks smartctl to see if SMART is enabled. Run smartctl -i /dev/ada1 and paste the output here.

@edgarsuit
Owner

Sorry, is it /dev/ada1 that's missing? Or /dev/ada2?

@argawow
Author

argawow commented Feb 12, 2021

/dev/ada2 is the missing one :)

here is the output of the command smartctl -i /dev/ada2

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD80EFZX-68UW8N0
Serial Number: xxxxxx
LU WWN Device Id: 5 000cca 254f61940
Firmware Version: 83.H0A83
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Feb 12 12:23:50 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Andreas

@edgarsuit
Owner

Sorry, I missed the notification from your reply and this fell off my radar. I'm kind of at a loss for why this disk is getting excluded.

Try copying this whole block of code into your terminal and see what it spits out:

drives=$(for drive in $(sysctl -n kern.disks); do
    if [ "$(smartctl -i /dev/"${drive}" | grep "SMART support is: Enabled")" ] && ! [ "$(smartctl -i /dev/"${drive}" | grep "Solid State Device")" ]; then
        printf "%s " "${drive}"
    fi
done | awk '{for (i=NF; i!=0 ; i--) print $i }')
echo $drives

This is the code the script uses to figure out which drives should be included in the report. It looks at the smartctl output to check if SMART is enabled and to see if it's an SSD.
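
To see which of the two checks drops a given drive, they can be run one at a time. This is only a minimal sketch, not part of the script: `classify_drive` is a hypothetical helper that reads `smartctl -i` output on stdin and applies the same two grep patterns as the block above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the report script): reads `smartctl -i`
# output on stdin and prints whether the two grep checks shown above
# would include or exclude the drive, and why.
classify_drive() {
    info=$(cat)
    if ! printf '%s\n' "$info" | grep -q "SMART support is: Enabled"; then
        echo "excluded: SMART not enabled"
    elif printf '%s\n' "$info" | grep -q "Solid State Device"; then
        echo "excluded: SSD"
    else
        echo "included"
    fi
}

# Usage on a live system (as root), one line per disk:
#   for drive in $(sysctl -n kern.disks); do
#       printf '%s: ' "$drive"; smartctl -i /dev/"$drive" | classify_drive
#   done
```

If a drive prints "included" here but is still missing from the summary table, the drive list is probably not where it gets dropped.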

@ekaley

ekaley commented Mar 28, 2021

I've been having the same issue. It started when one of my drives began reporting Current Pending Sector errors. The error I got was:
awk: newline in string 9648 newer... at source line 1
awk: newline in string Extended 17... at source line 1

The drive is also excluded from the SMART summary table that the script outputs. However, smartctl and FreeNAS can both see the drive, and the pool and disks all show up fine with the above commands and in the GUI.

@ekaley

ekaley commented Mar 28, 2021

I also tried the SMART report script from https://github.com/Spearfoot/FreeNAS-scripts and get the same awk error. The interesting part is that with either script, the drive is excluded only from the SMART summary table, not from the details.

@ekaley

ekaley commented Mar 28, 2021

image

@ekaley

ekaley commented Mar 28, 2021

image

@Markvis

Markvis commented Apr 3, 2021

@edgarsuit

Ran into this today as well. The issue comes from these two lines:

-v lastTestHours="$(smartctl -l selftest /dev/"$drive" | grep "# 1" | awk '{print $9}')" \

-v lastTestType="$(smartctl -l selftest /dev/"$drive" | grep "# 1" | awk '{print $3}')" \

When a drive starts to have errors, the command smartctl -l selftest /dev/"$drive" | grep "# 1" matches two lines, because the summary line at the end of the log also contains "# 1". It gives something like this:

# 1  Extended offline    Completed without error       00%     32284         -
12 of 12 failed self-tests are outdated by newer successful extended offline self-test # 1

so awk '{print $9}' will print

32284
newer

and awk '{print $3}' will print

Extended
12

Here's a sample output of smartctl -l selftest /dev/"$drive" for your reference

smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.2-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     32284         -
# 2  Short offline       Completed without error       00%     32275         -
# 3  Extended offline    Completed: read failure       10%     32125         3519067072
# 4  Conveyance offline  Completed without error       00%     32093         -
# 5  Short offline       Completed: read failure       60%     32069         3519067072
# 6  Extended offline    Completed: read failure       10%     31957         3519067072
# 7  Conveyance offline  Completed without error       00%     31925         -
# 8  Short offline       Completed: read failure       60%     31901         3519067072
# 9  Extended offline    Completed: read failure       10%     31790         3519067072
#10  Conveyance offline  Completed without error       00%     31758         -
#11  Short offline       Completed: read failure       70%     31734         3519067072
#12  Extended offline    Completed: read failure       10%     31625         3519067072
#13  Conveyance offline  Completed without error       00%     31590         -
#14  Short offline       Completed: read failure       60%     31566         3519067072
#15  Extended offline    Interrupted (host reset)      10%     31458         -
#16  Conveyance offline  Completed without error       00%     31422         -
#17  Short offline       Completed: read failure       10%     31400         3519067072
#18  Extended offline    Completed: read failure       10%     31289         3519067072
#19  Conveyance offline  Completed without error       00%     31255         -
#20  Short offline       Completed: read failure       60%     31231         3519067072
#21  Extended offline    Completed: read failure       10%     31120         3519067072
12 of 12 failed self-tests are outdated by newer successful extended offline self-test # 1
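
One way to avoid the second match, sketched here on the assumption that the newest self-test entry is always the line that begins with `# 1`, is to anchor the pattern at the start of the line and stop after the first hit. `last_test_type` and `last_test_hours` are hypothetical helper names, and reading the hours from the second-to-last field instead of `$9` is my own substitution, not what the script does.

```shell
#!/bin/sh
# Hypothetical helpers: read `smartctl -l selftest` output on stdin.
# Anchoring on ^# 1 skips the trailing summary line that merely ends in "# 1".
last_test_type() {
    awk '/^# 1 /{print $3; exit}'
}

# LifeTime(hours) is the second-to-last column, so count from the end;
# this does not depend on how many words the Status column contains.
last_test_hours() {
    awk '/^# 1 /{print $(NF-1); exit}'
}

# Two lines copied from the sample output above.
sample='# 1  Extended offline    Completed without error       00%     32284         -
12 of 12 failed self-tests are outdated by newer successful extended offline self-test # 1'

printf '%s\n' "$sample" | last_test_type    # prints: Extended
printf '%s\n' "$sample" | last_test_hours   # prints: 32284
```

In the script, the command substitution would then read something like `$(smartctl -l selftest /dev/"$drive" | last_test_hours)`, so each awk variable receives a single line.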

@dak180

dak180 commented May 16, 2021

@Markvis could you let me know if my refactor fixes your issue?

@jamesstanw

I'm facing the same issue, with 3 drives being skipped. If I understand correctly, should I move over to the refactor?

@dak180

dak180 commented Jul 26, 2021

@jamesstanw try it and let me know if it works.

@jamesstanw

sdsds

@jamesstanw try it and let me know if it works.

I've called the script with:
/bin/sh ./service.sh
and it reports:
root@freenas:/mnt/NAS2/NAS2_data/james/scripts # /bin/sh ./service.sh
./service.sh: 73: Syntax error: "(" unexpected
Removing the brackets at line 73 -- sorry I'm poking in the dark here -- shows this:
./service.sh: function: not found
./service.sh: cannot create : No such file or directory
Please edit the config file for your setup

@dak180

dak180 commented Jul 30, 2021

I've called the script with:
/bin/sh ./service.sh

@jamesstanw that will not work; just use ./report.sh -c /path/where/you/want/the/config/file since it requires bash (sh is not bash) and the shebang line in the script will take care of that for you.
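
Running a script as `/bin/sh ./script` ignores its shebang line, which is why bash-only syntax such as the `function` keyword fails there. As a rough sketch for checking which interpreter a script asks for, `script_interpreter` below is a hypothetical helper that prints whatever follows `#!` on the first line.

```shell
#!/bin/sh
# Hypothetical helper: print the interpreter named on a script's shebang line,
# or nothing if the file does not start with "#!".
script_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}

# Usage (file name taken from the comment above):
#   script_interpreter ./report.sh
```

If the result names bash, run the script directly (`./report.sh ...`) or with `bash ./report.sh ...` rather than through `sh`.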

@jamesstanw

jamesstanw commented Jul 30, 2021

I've called the script with:
/bin/sh ./service.sh

@jamesstanw that will not work; just use ./report.sh -c /path/where/you/want/the/config/file since it requires bash (sh is not bash) and the shebang line in the script will take care of that for you.

Sorry a bit of naivete here . . .
'-c /path/where/you/want/the/config/file'
Where should I want the config to be? :)

I've copied your script, marked it executable and set it (through the GUI) to run as a cron job. I want to manually run to make sure all is well.

@dak180

dak180 commented Jul 30, 2021

Where should I want the config to be? :)

Wherever you like. ☺

I've copied your script, marked it executable and set it (through the GUI) to run as a cron job. I want to manually run to make sure all is well.

You should run it manually first; on the first run it will create the config file which you will need to edit before the script will run correctly.

@jamesstanw

Thanks!
I've run the script and get a partial report (pool status but not smart testing results) - that is almost instantaneous. It is throwing the following error, though:
parse error: Invalid numeric literal at line 1, column 9
(the same line is repeated many times)
I am running FreeNAS 11.2 (not TrueNAS). Have I missed a setting somewhere?
Thanks!

@dak180

dak180 commented Aug 8, 2021

I am running FreeNAS 11.2 (not TrueNAS). Have I missed a setting somewhere?
Thanks!

My version of the script has only been tested on 12, never on 11. Since I do not have a system running 11, I do not think I would be able to make the script work there. I would encourage you to move to 12 anyway, though.
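
One possible cause worth ruling out, and this is an assumption that nothing in this thread confirms: the "parse error: Invalid numeric literal" message has the form jq prints when its input is not JSON, and smartctl only gained JSON output (`--json`) in smartmontools 7.0. If the script feeds smartctl output to jq, an older smartctl would produce exactly this kind of failure. A rough sketch for checking the installed version, where `smartctl_major` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl --version` output on stdin and print the
# major version number from its first line ("smartctl 7.1 2019-12-30 ...").
smartctl_major() {
    awk 'NR==1 && $1=="smartctl" {split($2, v, "."); print v[1]; exit}'
}

# Usage on a live system:
#   major=$(smartctl --version | smartctl_major)
#   [ "${major:-0}" -ge 7 ] || echo "smartctl is older than 7.0; --json is not available"
```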

@jamesstanw

I've just looked at the update options and the only version of 12 I have access to is the development version for testing. I'm really tied to the FreeNAS release train.
Frustrating . . . since the script was running great but started to overlook the four oldest disks. It was great to have this automated (big thanks for all the work). Is there some significant difference in the way TrueNAS handles the disks compared to FreeNAS?

@dak180

dak180 commented Aug 9, 2021

Is there some significant difference in the way TrueNAS handles the disks compared to FreeNAS?

No, not the disks; I would suggest reading the release notes from 12.0 through U5 before you update, though.
