Is the WD Purple faulty, or is something else wrong?

UPDATE: last week I received the replacement drive (since my confidence in the Purple was shaken, I asked for a Red instead; hopefully this one will not give me any trouble)

I recently bought a WD Purple (on another forum they recommended either the Red or this one, and in the end I chose the Purple...)
I am aware that it is recommended primarily for surveillance systems; I need it at home in a NAS for light torrent traffic and Time Machine backups (so typically more or less continuous, low-speed reads/writes). It sits in a Sharkoon external enclosure.

However, I found some interesting errors in the log of the Raspberry it is attached to.
Spurred on by this, I ran badblocks on it with the drive connected to a desktop machine (I would have started with this anyway, but it takes so damn long that I instead began copying the stuff over from my other external drives, figuring that if there was a problem it would show up anyway; well, looks like that worked out :D).
The first 80% or so goes smoothly at a roughly steady pace, but from there on it slows down drastically. It is currently stuck here: the 'elapsed' value advances every 5-10 minutes, and the percentage is essentially not moving:


$ sudo badblocks -svn /dev/sdi
Checking for bad blocks in non-destructive read-write mode
From block 0 to 2930266583
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: 83.19% done, 48:09:17 elapsed. (0/0/0 errors)

currently: Testing with random pattern: 83.19% done, 48:29:45 elapsed. (0/0/0 errors)

So, is the drive bad, could the problem be with the enclosure, or is some setting/kernel/etc. causing it? (these are my guesses, in decreasing order of likelihood :D)

hdparm:


$ sudo hdparm -I /dev/sdi

/dev/sdi:

ATA device, with non-removable media
Model Number: WDC WD30PURX-64P6ZY0
Serial Number: WD-WCC4NDA2NYJD
Firmware Revision: 80.00A80
Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
Supported: 9 8 7 6 5
Likely used: 9
Configuration:
Logical         max     current
cylinders       16383   16383
heads           16      16
sectors/track   63      63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 5860533168
Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 2861588 MBytes
device size with M = 1000*1000: 3000592 MBytes (3000 GB)
cache/buffer size = unknown
Nominal Media Rotation Rate: 5400
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
Media Card Pass-Through
* General Purpose Logging feature set
* 64-bit World wide name
* URG for READ_STREAM[_DMA]_EXT
* URG for WRITE_STREAM[_DMA]_EXT
* IDLE_IMMEDIATE with UNLOAD
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Gen3 signaling speed (6.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* Idle-Unload when NCQ is active
* NCQ priority information
* READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Write Same (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12] (vendor specific)
unknown 206[13] (vendor specific)
unknown 206[14] (vendor specific)
Security:
Master password revision code = 65534
supported
not enabled
not locked
not frozen
not expired: security count
supported: enhanced erase
438min for SECURITY ERASE UNIT. 438min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50014ee25fcf21ff
NAA : 5
IEEE OUI : 0014ee
Unique ID : 25fcf21ff
Checksum: correct

the end of dmesg:


[839547.898949] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839547.915060] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839547.915062] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839547.915704] sd 26:0:0:0: [sdi] Unhandled error code
[839547.915707] sd 26:0:0:0: [sdi]
[839547.915709] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
[839547.915710] sd 26:0:0:0: [sdi] CDB:
[839547.915711] Read(16): 88 00 00 00 00 01 22 95 1a 68 00 00 00 80 00 00
[839547.915719] end_request: I/O error, dev sdi, sector 4875164264
[839547.915722] quiet_error: 20 callbacks suppressed
[839547.915723] Buffer I/O error on device sdi, logical block 609395533
[839547.915728] Buffer I/O error on device sdi, logical block 609395534
[839547.915729] Buffer I/O error on device sdi, logical block 609395535
[839547.915731] Buffer I/O error on device sdi, logical block 609395536
[839547.915732] Buffer I/O error on device sdi, logical block 609395537
[839547.915734] Buffer I/O error on device sdi, logical block 609395538
[839547.915736] Buffer I/O error on device sdi, logical block 609395539
[839547.915737] Buffer I/O error on device sdi, logical block 609395540
[839547.915739] Buffer I/O error on device sdi, logical block 609395541
[839547.915740] Buffer I/O error on device sdi, logical block 609395542
[839585.802913] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839585.818810] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839585.818812] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839616.920001] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839616.935854] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839616.935856] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839647.908818] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839647.924918] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839647.924921] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839678.897791] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839678.913885] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839678.913887] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839709.890552] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839709.906652] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839709.906654] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839741.003493] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839741.019702] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839741.019705] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839741.020348] sd 26:0:0:0: [sdi] Unhandled error code
[839741.020350] sd 26:0:0:0: [sdi]
[839741.020352] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
[839741.020353] sd 26:0:0:0: [sdi] CDB:
[839741.020355] Read(16): 88 00 00 00 00 01 22 95 1a 80 00 00 00 80 00 00
[839741.020362] end_request: I/O error, dev sdi, sector 4875164288
[839800.932808] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839800.948811] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839800.948814] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839831.985714] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839832.001765] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839832.001768] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839863.038450] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839863.054528] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839863.054531] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839894.027653] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839894.043425] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839894.043428] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839925.016411] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839925.032600] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839925.032602] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839956.005481] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839956.021436] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839956.021438] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[839956.022158] sd 26:0:0:0: [sdi] Unhandled error code
[839956.022159] sd 26:0:0:0: [sdi]
[839956.022160] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
[839956.022161] sd 26:0:0:0: [sdi] CDB:
[839956.022162] Read(16): 88 00 00 00 00 01 22 95 22 00 00 00 00 80 00 00
[839956.022176] end_request: I/O error, dev sdi, sector 4875166208
[839995.990121] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[839996.006184] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[839996.006186] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[840027.075191] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[840027.091243] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[840027.091246] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[840058.064081] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[840058.080084] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[840058.080087] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[840089.053124] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[840089.068979] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[840089.068981] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0
[840120.041972] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
[840120.061915] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b591480
[840120.061917] xhci_hcd 0000:00:14.0: xHCI xhci_drop_endpoint called with disabled ep ffff88029b5914c0

and finally smartctl:


$ sudo smartctl --all /dev/sdi
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-34-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: WDC WD30PURX-64P6ZY0
Serial Number: WD-WCC4NDA2NYJD
LU WWN Device Id: 5 0014ee 25fcf21ff
Firmware Version: 80.00A80
User Capacity: 3.000.592.982.016 bytes [3,00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Oct 2 09:01:13 2014 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (41280) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 414) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   088   064   051    Pre-fail  Always       -       82502
  3 Spin_Up_Time            0x0027   185   182   021    Pre-fail  Always       -       5733
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       85
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       517
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       49
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       75
194 Temperature_Celsius     0x0022   112   108   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Comments

Hi!

This is clearly faulty.
I assume you ran badblocks in non-destructive mode, so it will try to read back all the data.
Since it does not succeed (see the SMART read error rate and the I/O errors in dmesg), it waits a very long time and keeps retrying over and over.
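
The failing reads in the dmesg excerpt can be tied to the point where badblocks stalls with a bit of arithmetic. What follows is a minimal Python sketch, not something from the original thread. It assumes the badblocks default block size of 1024 bytes (no -b option was given) and the 512-byte logical sectors reported by hdparm; the function name decode_read16 is only illustrative.

```python
# Values copied from the badblocks and dmesg output in the post.
CDB = "88 00 00 00 00 01 22 95 1a 68 00 00 00 80 00 00"
LAST_BADBLOCKS_BLOCK = 2930266583   # "From block 0 to 2930266583"
BADBLOCKS_BLOCK_SIZE = 1024         # assumption: badblocks default block size
SECTOR_SIZE = 512                   # logical sector size reported by hdparm


def decode_read16(cdb_hex):
    """Return (lba, transfer_length) from a SCSI READ(16) CDB hex dump."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(raw) != 16 or raw[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(raw[2:10], "big")      # bytes 2..9: logical block address
    length = int.from_bytes(raw[10:14], "big")  # bytes 10..13: number of blocks
    return lba, length


lba, length = decode_read16(CDB)

# Convert the 512-byte sector number into a badblocks block number,
# then into a position relative to the whole device.
badblocks_block = lba * SECTOR_SIZE // BADBLOCKS_BLOCK_SIZE
percent = 100 * badblocks_block / (LAST_BADBLOCKS_BLOCK + 1)

print(f"LBA {lba}, {length} sectors")
print(f"badblocks block {badblocks_block}, {percent:.2f}% of the device")
```

Under these assumptions the decoded LBA equals the sector in the end_request line (4875164264) and lands at about 83.19% of the device, which is where the badblocks progress stopped moving.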

Based on the SMART data, the read error rate has exceeded the "threshold" value, so this HDD is due for a warranty replacement.

I would connect it directly to SATA and give it another run.

If it had actually produced failed reads, the pending sector count would be growing; in your case the counter sits at zero.

The Raw Read Error value, for example, typically keeps growing all the time on Seagate drives, and for them it is not a warranty case. With WD this series is fairly new, so who knows...
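
To cross-check the attribute table against the rule documented for smartctl (an attribute counts as failed when its normalized VALUE is less than or equal to THRESH), here is a minimal Python sketch. The rows are copied from the smartctl output in the post; the helper name failed_attributes is made up for illustration, and checking WORST as well as VALUE is a choice of this sketch.

```python
# A few rows copied from the smartctl attribute table in the post.
SMART_ROWS = """\
1 Raw_Read_Error_Rate 0x002f 088 064 051 Pre-fail Always - 82502
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
"""


def failed_attributes(table):
    """Return names of attributes whose VALUE or WORST is at or below THRESH."""
    failed = []
    for line in table.splitlines():
        fields = line.split()
        name = fields[1]
        # Columns 4-6 are the normalized VALUE, WORST and THRESH.
        value, worst, thresh = (int(f) for f in fields[3:6])
        if min(value, worst) <= thresh:
            failed.append(name)
    return failed


print("failed attributes:", failed_attributes(SMART_ROWS))
```

With the values from the post the list comes back empty: Raw_Read_Error_Rate has a normalized value of 88 (worst 64) against a threshold of 51, and the reallocated and pending sector counters are at zero, which is consistent with the PASSED overall assessment in the same output.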