tracking "unknown command"
theedge456 opened this issue · 7 comments
Hello,
I don't know if this is the correct place to post this issue.
If not, may someone direct me to the correct one?
I have a ST1000DM010 handled by a gigabyte motherboard a320ma-m.2.
The OS is devuan chimera (based on debian bullseye), using kernel 6.1.37 from kernel.org.
I'm tracking down a problem that shows up in the logs. The disk is rarely used:
it is only used to perform compilation in multicore mode.
kernel: [ 750.902215] sd 5:0:0:0: [sda] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=5s
kernel: [ 750.905013] sd 5:0:0:0: [sda] tag#2 Sense Key : Illegal Request [current]
kernel: [ 750.907809] sd 5:0:0:0: [sda] tag#2 Add. Sense: Unaligned write command
kernel: [ 750.910572] sd 5:0:0:0: [sda] tag#2 CDB: Read(10) 28 00 12 c2 4c 80 00 00 38 00
kernel: [ 750.916106] sd 5:0:0:0: [sda] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=5s
kernel: [ 750.918889] sd 5:0:0:0: [sda] tag#22 Sense Key : Illegal Request [current]
kernel: [ 750.921675] sd 5:0:0:0: [sda] tag#22 Add. Sense: Unaligned write command
kernel: [ 750.924462] sd 5:0:0:0: [sda] tag#22 CDB: Read(10) 28 00 17 14 8d 00 00 00 88 00
kernel: [ 750.930065] ata6: EH complete
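The CDB bytes in the log lines above can be decoded by hand to see which sectors the failing reads targeted. The following is a minimal sketch, not part of openSeaChest: the function names are hypothetical, and the alignment check assumes 512-byte logical sectors on 4096-byte physical sectors, which should be verified for the drive in question.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as a hex string such as '28 00 12 ...'.

    Returns (lba, blocks). Layout per the SCSI block command set:
    byte 0 opcode (0x28), bytes 2-5 LBA (big-endian),
    bytes 7-8 transfer length in logical blocks (big-endian).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


def is_4k_aligned(lba, blocks, logical_per_physical=8):
    """True if both start LBA and length are multiples of one physical sector.

    logical_per_physical=8 is an assumption (512e drive with 4 KiB sectors).
    """
    return lba % logical_per_physical == 0 and blocks % logical_per_physical == 0


if __name__ == "__main__":
    # The two CDBs copied from the kernel log above.
    for cdb in ("28 00 12 c2 4c 80 00 00 38 00",
                "28 00 17 14 8d 00 00 00 88 00"):
        lba, blocks = decode_read10(cdb)
        print(cdb, "-> LBA", lba, "blocks", blocks,
              "4K-aligned:", is_4k_aligned(lba, blocks))
```

Under that sector-size assumption, both failing commands are reads that start and end on a 4 KiB boundary, so the "Unaligned write command" sense text does not describe the commands literally.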
smartctl shows:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 063 006 Pre-fail Always - 811937
3 Spin_Up_Time 0x0003 099 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 099 099 020 Old_age Always - 1397
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 077 060 045 Pre-fail Always - 56586838
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 599
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1399
183 Runtime_Bad_Block 0x0032 086 086 000 Old_age Always - 14
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 066 057 040 Old_age Always - 34 (Min/Max 34/34)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 1398
194 Temperature_Celsius 0x0022 034 007 000 Old_age Always - 34 (0 7 0 0 0)
195 Hardware_ECC_Recovered 0x001a 100 001 000 Old_age Always - 811937
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 589h+00m+00.656s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 21343524455
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 10832830779
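For anyone who wants to scan a table like this automatically, here is a rough sketch that parses the attribute lines and flags the usual suspects. The helper names and the default watch list (5, 197, 198, 199) are my own assumptions, not something smartctl or openSeaChest defines; the rule "normalized VALUE at or below a non-zero THRESH" follows how smartctl reports a failing attribute.

```python
def parse_attribute(line):
    """Split one smartctl -A attribute line into the fields used below."""
    parts = line.split(None, 9)  # the raw value (10th field) may contain spaces
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "thresh": int(parts[5]),
        "raw": parts[9],
    }


def flag_attributes(lines, watch_raw=(5, 197, 198, 199)):
    """Return (id, name, reason) for attributes that deserve a closer look.

    watch_raw is a hypothetical watch list: attributes whose raw counter
    is expected to stay at zero on a healthy drive and link.
    """
    flagged = []
    for line in lines:
        attr = parse_attribute(line)
        if attr["thresh"] > 0 and attr["value"] <= attr["thresh"]:
            flagged.append((attr["id"], attr["name"], "value <= threshold"))
        if attr["id"] in watch_raw:
            raw = int(attr["raw"].split()[0])
            if raw != 0:
                flagged.append((attr["id"], attr["name"], "raw = %d" % raw))
    return flagged


if __name__ == "__main__":
    # A few lines copied from the table above.
    sample = [
        "5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0",
        "7 Seek_Error_Rate 0x000f 077 060 045 Pre-fail Always - 56586838",
        "183 Runtime_Bad_Block 0x0032 086 086 000 Old_age Always - 14",
        "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0",
        "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0",
    ]
    print(flag_attributes(sample))
```

With the default watch list nothing in the sample is flagged; adding 183 to `watch_raw` surfaces the raw count of 14 for Runtime_Bad_Block.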
I compiled the latest version of openSeaChest (2.2.1-6_0_1 X86_64).
I started a long test with this command:
sudo ./openSeaChest_GenericTests -d /dev/sg1 --longGeneric
The result showed no errors. I only saw messages about "unknown command".
Is there a way to display these messages?
I saw on the seagate website that there was an updated firmware for ST1000DM series (ST1000DM004 or ST1000DM007) but not the ST1000DM010, still driven by the CC43 firmware.
Should I try to update the firmware?
Any other hints?
Fabien
Forget this message. Everything went back to normal when I connected the disk to other data and power supply ports on the motherboard.
Hi @theedge456,
Just to confirm, the drive is working properly with new cabling and this message no longer shows in the system logs?
Yes. Lucky me
Fabien
Thanks for confirming that!
I have seen cabling issues before, but not something that shows up like this. A lot of times the attribute 199 will start to increase when there is a problem and there are other symptoms but generally not "unaligned write command", so that is really weird. Cabling issues always have odd symptoms, but this is not one I've seen before.
If it happens again, please update this issue or create a new one and we can see if we can dig a little deeper to find out more information behind the cause, but sometimes it is as simple as replacing a flaky cable.
I'm going to mark this closed for now, but please reopen it if you need to.
I generated this file at that time.
Tell me if it helps.
@theedge456,
I took a look at the file and didn't see anything that would be an indication of a cabling issue like I would expect.
If you run into this again, capturing the device statistics and SMART attributes will likely be most helpful.
openSeaChest_SMART -d <handle> --smartAttributes analyzed --deviceStatistics > debug.txt
Another thing that might help is dumping the SATA phy event counters. This is not part of openSeaChest yet, but it is present in smartctl. It is possible whatever is happening was logged there as well.
@vonericsen,
This is the result of
sg_sat_phy_event --ck_cond --verbose 1>stdout.log 2>stderr.log
Tell me if it helps
ata_phy_event.zip