MonsterMegs - Reboot and Failed Raid Drive - Storm – Incident details

Reboot and Failed Raid Drive - Storm

Resolved
Operational
Started 4 months ago
Lasted about 3 hours

Affected

Web Hosting Servers

Operational from 9:59 PM to 1:06 AM

Storm

Operational from 9:59 PM to 1:06 AM

Updates
  • Resolved
    Resolved

    The firmware upgrades have been completed and everything is back online. We found that earlier firmware versions of the Samsung 990 Pro had an issue that caused the drives to fail well before the end of their expected lifespan. We applied the latest firmware, which addresses that issue, and both drives in the server are now the latest production version.

    With that said, we do not anticipate any further hard drive failures.

  • Identified
    Update

    The server has been back online for about 30 minutes now. We are waiting for the RAID array to finish rebuilding, and then we will proceed with the firmware updates.

  • Identified
    Update

    The server has just gone down for the hard drive replacement. Please anticipate several shorter downtimes over the next couple of hours as we apply these firmware updates in rescue mode.

  • Identified
    Update

    We checked the server in rescue mode and the drive did not show up. We have brought the server back online on the single remaining drive, and the datacenter will be replacing the failed drive very shortly. Once the server is back online, we will perform firmware updates on all internal server components in the hope that this stops the hard drives from failing or bricking themselves before the end of their lifespan.

  • Identified
    Identified

    We are going to perform an emergency reboot to check a reported failed hard drive in our RAID setup. While doing this, we will most likely apply firmware updates to the motherboard and hard drives to try to resolve the repeated reports of failed drives.

    We will update as we determine the course of action.