Thread: HP Enterprise SSDs stop after 32K hours

  1. #1
    Join Date
    Jan 2007
    Location
    Pacific Northwest, USA
    Posts
    33,143
    Blog Entries
    18

    HP Enterprise SSDs stop after 32K hours

    Don't know if this affects anyone here, but it's interesting:

    HP SSDs stop after 32K hours

    A firmware update is probably in order if you've got one of these. Just shows to go you that programmers make bonehead mistakes even today.

  2. #2

    I wonder who really makes those drives, and if they made the custom HP firmware, or if HP wrote it themselves.

  3. #3
    Join Date
    Oct 2008
    Location
    Kamloops, BC, Canada
    Posts
    5,864
    Blog Entries
    44

    For a period of time, Nimbus sold rebranded SSDs from OCZ.

  4. #4
    Join Date
    Jan 2013
    Location
    Marietta, GA
    Posts
    3,336

    You mean this isn't normal? That is about the maximum of how long modern electronics producers think a product should last before the cowsumer must throw it away and buy a new one.

  5. #5
    Join Date
    Sep 2003
    Location
    Ohio/USA
    Posts
    7,777
    Blog Entries
    2

    So I should run out and buy stacks of these timed out bricks and update the firmware?
    What I collect: 68K/Early PPC Mac, DOS/Win 3.1 era machines, Amiga/ST, C64/128
    Nubus/ISA/VLB/MCA/EISA cards of all types
    Boxed apps and games for the above systems
    Analog video capture cards/software and complete systems

  6. #6

    Originally posted by SomeGuy:
    You mean this isn't normal? That is about the maximum of how long modern electronics producers think a product should last before the cowsumer must throw it away and buy a new one.
    Not if the product is still supported and under onsite maintenance. As a CE for a company that’s not HPE, I can tell you that there is almost nothing more expensive for a company than finding a bug like this and then having to send hundreds of people out to all of its accounts to update this stuff, on overtime and racking up mileage.

    For the server stuff they might at least be able to say that firmware is the customer’s responsibility, but I’ll bet that doesn’t fly for disk arrays.

  7. #7
    Join Date
    Jan 2007
    Location
    Pacific Northwest, USA
    Posts
    33,143
    Blog Entries
    18

    I'm assuming that the 32K hours is power-on time. So a disk array running 24x7 with a bunch of these drives installed at the same time will go dark in about 3 years, 9 months. Not so great.
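
    The arithmetic behind that estimate can be sketched as follows. This assumes "32K" means exactly 2^15 = 32,768 power-on hours and that the count is held in a signed 16-bit integer; the thread confirms neither, and the function names are made up for illustration.

```python
HOURS_LIMIT = 2 ** 15  # assumption: "32K hours" means 32,768


def wrap_int16(value):
    """Interpret an integer as a signed 16-bit two's-complement value."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def uptime_breakdown(hours):
    """Split continuous power-on hours into (years, days, hours).

    Uses 365-day years, so leap days are ignored.
    """
    days, rem_hours = divmod(hours, 24)
    years, rem_days = divmod(days, 365)
    return years, rem_days, rem_hours


if __name__ == "__main__":
    print(uptime_breakdown(HOURS_LIMIT))  # (years, days, hours) of 24x7 running
    print(wrap_int16(HOURS_LIMIT - 1))    # last hour count that still fits
    print(wrap_int16(HOURS_LIMIT))        # one hour later the value is negative
```

    Under those assumptions the limit works out to 3 years, 270 days and 8 hours of continuous running, which matches the "3 years, 9 months" figure.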

  8. #8

    Both integer and floating point numbers have their limits in this kind of application. A case in point was a design that used a floating point time accumulator. It needed to be fast, something in the µs to ms range. After about two weeks of running, the code would look at the time difference between two events, but the floating point value had stopped increasing: the accumulated number had grown too large and the increment was too small to change it.
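
    The stall itself is easy to reproduce. Here is a minimal sketch, assuming a single-precision (32-bit) accumulator counting seconds; the format and tick sizes in the case above are not stated, so this shows the mechanism and does not reconstruct that system. The helper names are hypothetical, and the search only probes powers of two, which is where the spacing between representable values doubles.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def stall_point(increment):
    """Return the first power of two at which adding `increment` to a
    single-precision accumulator no longer changes the accumulator."""
    inc = f32(increment)
    value = 1.0
    while f32(value + inc) != value:
        value *= 2.0
    return value


if __name__ == "__main__":
    print(stall_point(1e-6))  # seconds accumulated in 1 us ticks
    print(stall_point(1e-3))  # seconds accumulated in 1 ms ticks
```

    With 1 µs ticks the accumulator stops advancing at 32 seconds, and with 1 ms ticks at 32,768 seconds (a little over 9 hours). A wider format or a coarser tick pushes the stall further out, but it does not remove it.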
    Another place where one often sees issues is in temperature control systems, where an A/D converter reads the temperature value. The reading often needs an offset. That offset can push a valid value over the edge if it is applied to the raw A/D data as an integer.
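
    A small sketch of that failure mode, assuming an 8-bit converter whose reading is offset in a register of the same width. The width, the sample values and the function names are all invented for illustration.

```python
ADC_BITS = 8                   # assumption: 8-bit converter and register
ADC_MAX = (1 << ADC_BITS) - 1  # largest raw reading, 255


def add_offset_wrapping(raw, offset):
    """Apply the offset in a register of ADC_BITS width: the result wraps."""
    return (raw + offset) & ADC_MAX


def add_offset_saturating(raw, offset):
    """Apply the offset in a wider integer, then clamp to the valid range."""
    return max(0, min(ADC_MAX, raw + offset))


if __name__ == "__main__":
    # A valid high reading plus a small positive offset.
    print(add_offset_wrapping(250, 10))
    print(add_offset_saturating(250, 10))
```

    In the wrapping version a reading near full scale turns into a reading near zero, so a control loop would see "cold" exactly when the sensor says "hot". Clamping is one possible remedy; doing the math in a wider type and range-checking the result is another.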
    This always goes back to having test cases at a level where the behavior isn't masked by 30 layers of IF...ELSE...ENDIF statements. There should be test procedures at each decision point. Each should be the rollup of all the previous exceptions plus any new ones that might be needed. Of course, with today's OO applications one often depends on the quality of method code written by someone else, who may not even work for your company any more, and which is already too complicated for the current idiot programmers to figure out, so they leave it as is and go on with life or death.
    Anyone who thinks a large application can be properly tested at final coding, using only the specification, is fooling themselves. That may be enough to get the customer to pay up, but for any kind of application that can do harm, it is nowhere near enough.
    Dwight
    Last edited by Dwight Elvey; December 4th, 2019 at 02:37 PM.

  9. #9

    I gave it a little more thought and read a few comments, and all seem to assume it was an integer issue. It could just as likely have been a floating point number. Floating point numbers have a fixed number of significant bits as well. Also, the fact that the failure bricks the SSD tends to tell me that the timing resolution was finer than hours. It is more likely a floating point number that is larger than 16 bits. Such timers are often used for timing events such as how long the write pulse is on. It is quite likely that it is burning out the IC by cooking it: it is waiting for the timer to time out and that never happens. There are safe ways to use integer timers as well as floating point ones. The fact that it bricks the IC would tell me that the count is likely much wider than 16 bits.
    Dwight
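
    One of the usual safe ways to use an integer timer is to compare elapsed ticks modulo the counter width, and never to compare the counter against an absolute deadline. A minimal sketch, assuming a free-running 16-bit tick counter; the width and function names are hypothetical, and nothing here is a claim about what the HP firmware actually does.

```python
COUNTER_BITS = 16                  # assumption: free-running 16-bit tick counter
MASK = (1 << COUNTER_BITS) - 1


def expired_naive(now, start, timeout):
    """Compare the counter against a deadline held in a wider variable.

    If start + timeout exceeds the counter's range, the wrapped counter
    can never reach the deadline and the timeout never fires.
    """
    deadline = start + timeout
    return now >= deadline


def expired_safe(now, start, timeout):
    """Compare elapsed ticks, computed modulo the counter width."""
    elapsed = (now - start) & MASK
    return elapsed >= timeout


if __name__ == "__main__":
    start, timeout = 65000, 1000   # deadline falls past the 16-bit wrap
    now = (start + timeout) & MASK  # counter value 1000 ticks later
    print(expired_naive(now, start, timeout))
    print(expired_safe(now, start, timeout))
```

    The safe form works as long as the timeout is shorter than one full counter period and the check runs at least once per period. The naive form is the "waiting for a timeout that never happens" case.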
