SATA error

Hi all,

I have a Gentoo amd64 machine running the 2.6.24-gentoo-r5 kernel. From time to time I find the following in dmesg:


[393889.082730] ata1: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 next cpb count 0x0 next cpb idx 0x0
[393889.082739] ata1: CPB 0: ctl_flags 0x9, resp_flags 0x0
[393889.082748] ata1: timeout waiting for ADMA IDLE, stat=0x400
[393889.082756] ata1: timeout waiting for ADMA LEGACY, stat=0x400
[393889.082767] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[393889.082773] ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[393889.082775]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[393889.082778] ata1.00: status: { DRDY }
[393889.394556] ata1: soft resetting link
[393889.550482] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[393889.567612] ata1.00: configured for UDMA/133
[393889.567634] ata1: EH complete
[393889.718673] sd 1:0:0:0: [sdc] 234441648 512-byte hardware sectors (120034 MB)
[393889.718706] sd 1:0:0:0: [sdc] Write Protect is off
[393889.718708] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[393889.718722] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[395590.494740] ata2: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 next cpb count 0x0 next cpb idx 0x0
[395590.494747] ata2: CPB 0: ctl_flags 0x9, resp_flags 0x0
[395590.494755] ata2: timeout waiting for ADMA IDLE, stat=0x400
[395590.494762] ata2: timeout waiting for ADMA LEGACY, stat=0x400
[395590.494773] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[395590.494779] ata2.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[395590.494780]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[395590.494783] ata2.00: status: { DRDY }
[395590.807562] ata2: soft resetting link
[395590.962489] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[395590.987612] ata2.00: configured for UDMA/133
[395590.987634] ata2: EH complete
[395591.139705] sd 2:0:0:0: [sdd] 234441648 512-byte hardware sectors (120034 MB)
[395591.139718] sd 2:0:0:0: [sdd] Write Protect is off
[395591.139720] sd 2:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[395591.139734] sd 2:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

The motherboard is a one-and-a-half-year-old Tyan Tomcat K8E (S2865 AG2NRF). Here is the smartctl output as well:
sdc:


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       89
  3 Spin_Up_Time            0x0007   100   100   025    Pre-fail  Always       -       6080
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       19013
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       28
190 Airflow_Temperature_Cel 0x0022   088   079   000    Old_age   Always       -       50
194 Temperature_Celsius     0x0022   088   079   000    Old_age   Always       -       50
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       82604786
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
202 TA_Increase_Count       0x0032   253   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     18996         -
# 2  Short offline       Completed without error       00%     18972         -
# 3  Short offline       Completed without error       00%     18948         -
# 4  Extended offline    Completed without error       00%     18926         -
# 5  Short offline       Completed without error       00%     18924         -
# 6  Short offline       Completed without error       00%     18900         -
# 7  Short offline       Completed without error       00%     18876         -
# 8  Short offline       Completed without error       00%     18852         -
# 9  Short offline       Completed without error       00%     18828         -
#10  Short offline       Completed without error       00%     18804         -
#11  Short offline       Completed without error       00%     18780         -
#12  Extended offline    Completed without error       00%     18757         -
#13  Short offline       Completed without error       00%     18756         -
#14  Short offline       Completed without error       00%     18731         -
#15  Short offline       Completed without error       00%     18707         -
#16  Short offline       Completed without error       00%     18683         -
#17  Short offline       Completed without error       00%     18659         -
#18  Short offline       Completed without error       00%     18635         -
#19  Short offline       Completed without error       00%     18611         -
#20  Extended offline    Completed without error       00%     18589         -
#21  Short offline       Completed without error       00%     18587         -

sdd:


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0007   100   100   025    Pre-fail  Always       -       5952
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       34
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   253   253   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       19013
 10 Spin_Retry_Count        0x0033   253   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   253   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       28
190 Airflow_Temperature_Cel 0x0022   112   088   000    Old_age   Always       -       42
194 Temperature_Celsius     0x0022   112   088   000    Old_age   Always       -       42
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       24307684
196 Reallocated_Event_Count 0x0032   253   253   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   253   253   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0
202 TA_Increase_Count       0x0032   253   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     18996         -
# 2  Short offline       Completed without error       00%     18972         -
# 3  Short offline       Completed without error       00%     18948         -
# 4  Extended offline    Completed without error       00%     18925         -
# 5  Short offline       Completed without error       00%     18923         -
# 6  Short offline       Completed without error       00%     18899         -
# 7  Short offline       Completed without error       00%     18876         -
# 8  Short offline       Completed without error       00%     18852         -
# 9  Short offline       Completed without error       00%     18828         -
#10  Short offline       Completed without error       00%     18803         -
#11  Short offline       Completed without error       00%     18780         -
#12  Extended offline    Completed without error       00%     18757         -
#13  Short offline       Completed without error       00%     18755         -
#14  Short offline       Completed without error       00%     18731         -
#15  Short offline       Completed without error       00%     18707         -
#16  Short offline       Completed without error       00%     18683         -
#17  Short offline       Completed without error       00%     18659         -
#18  Short offline       Completed without error       00%     18635         -
#19  Short offline       Completed without error       00%     18611         -
#20  Extended offline    Completed without error       00%     18589         -
#21  Short offline       Completed without error       00%     18587         -

lspci output:


00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:05.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
01:06.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
01:08.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 11)

Disk type (both are identical):


MODEL:              SAMSUNG HD120IJ
FIRMWARE:           WZ100-33
match smartmontools Drive Database entry:
MODEL REGEXP:       ^SAMSUNG HD(080H|120I|160J)J$
FIRMWARE REGEXP:    .*
MODEL FAMILY:       SAMSUNG SpinPoint P80 SD series

So what do you think is wrong? Could it be a libata or nvidia driver problem, or is it physical?

Comments

I am seeing a similar error.
The lines from my logs are at the bottom of the linked page. But my controller is not an nvidia one.

The machine had no onboard SATA, so there is a PCI card in it. The drive sometimes stalls and then makes a frantic clicking noise, as if it were "slamming the head".

I googled it a lot, but in most places someone just blurted out that it is a kernel bug, a controller fault, an IRQ conflict, a drive fault, etc., without explaining which parameter or piece of data led them to that conclusion :(

Do you know of any method for finding out what the fault actually is?

-- Ubuntu Gutsy --
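
One modest starting point for the question above is to tally what libata itself reports, so that any guess is at least tied to data in the log. The sketch below is only an illustration: it assumes the dmesg format shown in the first post, the function name `summarize` and the tiny opcode table are made up for this example, and the table is deliberately incomplete (0xe7 is FLUSH CACHE and 0xea is FLUSH CACHE EXT in the ATA command set). It does not diagnose the cause; it only shows which port, which command, and which error reason recur.

```python
import re
from collections import Counter

# Matches the libata "cmd" line, e.g.
# "[393889.082773] ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0"
CMD_RE = re.compile(r"\]\s+(ata\d+)\.\d+: cmd ([0-9a-f]{2})/")
# Matches the following "res" line and captures the textual error reason
RES_RE = re.compile(r"\]\s+res .*Emask (0x[0-9a-f]+) \((\w+)\)")

# Illustrative, incomplete opcode table (assumption: extend as needed)
ATA_OPCODES = {"e7": "FLUSH CACHE", "ea": "FLUSH CACHE EXT"}


def summarize(log_text):
    """Count (port, command, reason) triples found in dmesg text."""
    events = Counter()
    port = opcode = None
    for line in log_text.splitlines():
        m = CMD_RE.search(line)
        if m:
            port, opcode = m.groups()
            continue
        m = RES_RE.search(line)
        if m and port is not None:
            name = ATA_OPCODES.get(opcode, "opcode 0x" + opcode)
            events[(port, name, m.group(2))] += 1
            port = opcode = None
    return events


# Two line pairs copied from the dmesg excerpt in the first post
SAMPLE = """\
[393889.082773] ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[393889.082775]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[395590.494779] ata2.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[395590.494780]          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
"""

for (port, command, reason), count in sorted(summarize(SAMPLE).items()):
    print(port, command, reason, count)
```

If the same command times out on several ports, as in the excerpt from the first post, that is at least a hint to look at what the ports have in common (controller, driver, cabling, power) rather than at a single disk, though it proves nothing on its own.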

I am physically far from the machine, so I cannot tell what it does at those times (whether it clicks or not). These two disks "fortunately only" hold the backups, in raid1, so I hardly notice any change on the system when it happens. It did die properly once, but only once...

--
http://laszlo.co.hu/

For me it is usually the interrupts that cause problems. What does cat /proc/interrupts show?

Regards, Zoli
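
To make the interrupt check above concrete, here is a rough sketch that lists IRQ lines shared by more than one device, given the text of /proc/interrupts. The function name `shared_irqs` is hypothetical, and the sample text is invented purely for illustration; it is not output from the machine discussed in this thread. The parsing assumes device names contain no spaces and that devices sharing a line are comma-separated, which is how the kernel normally prints them.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs with more than one device."""
    shared = {}
    for line in text.splitlines()[1:]:  # first row is the CPU header
        label, sep, rest = line.partition(":")
        # Skip non-numeric rows such as NMI or LOC
        if not sep or not label.strip().isdigit():
            continue
        # Devices are comma-separated at the end of the row; the first
        # segment also carries the counters and controller name, so only
        # the last whitespace-separated token of each segment is kept.
        devices = [seg.split()[-1] for seg in rest.split(",") if seg.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Invented sample, NOT taken from the poster's machine
SAMPLE = """\
           CPU0       CPU1
  0:        123          0   IO-APIC-edge      timer
 21:      45678          0   IO-APIC-fasteoi   sata_nv, eth0
 22:        910          0   IO-APIC-fasteoi   sata_nv
NMI:          0          0   Non-maskable interrupts
"""

print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/interrupts read from disk. A shared line is not an error in itself; it only narrows down which devices are worth testing separately.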

Unfortunately I cannot disable the ethernet ports one at a time. There are two integrated ones, an nvidia and a broadcom, and of these I use the broadcom. From the BIOS I can only switch off both or neither.
But I do not think this has much to do with this error.

--
http://laszlo.co.hu/