[Post-Mortem] Network connectivity issues

Description of the Incident:

On September 29, 2017, we experienced a sequence of events that led to intermittent degraded performance across many Hover services.

At 01:00 UTC on September 29, Tucows (the parent company of Hover) was the target of a sophisticated DNS attack. This was followed by an unrelated double failure of core network equipment at our main Canadian data center, which houses Hover services in addition to other Tucows infrastructure. The equipment failure was determined to have been caused by an undocumented software limitation.

This complex combination of events impacted the following services intermittently throughout the incident:

  • Domain registration and management services via Hover.com
  • Hover Email services
  • Hover Customer support via email and live chat
  • Hover DNS services (ns1.hover.com and ns2.hover.com)

 

Description of the Fix:

The Tucows operations team recovered quickly from the core network equipment failure, but the DNS attack continued until 13:10 UTC on September 29, when it was stopped and systems started responding reliably again. The network equipment failure made it more difficult for us to identify that we were under a DNS attack, which slowed our response.

The Tucows and Hover operations teams are working to create more separation between components of our DNS infrastructure, which will improve resilience to similar attacks in the future.

In addition, Hover has put together a communication plan to share status updates more regularly at hoverstatus.com when services are impacted.

We apologize for any inconvenience this issue has caused and thank you for your patience and understanding.



