FH-Dave
10-26-2004, 11:34 PM
This email notification has been sent to all customers.
Dear Valued Customers,
Note: All times mentioned below are in local time, GMT -04:00. Please do not reply to this email directly, since your reply may not be attended to.
As everybody has been made aware, on 10/15/2004 DDOS (Distributed Denial Of Service) attacks were launched against us. The attack targeted a shared IP on WIN2 (66.150.196.200). We worked with Internap to minimize any impact and were able to normalize the traffic. The DDOS attack on 10/15/2004 made IIS on WIN2 intermittent. However, web sites were still being served most of the time. No other customers were affected.
Starting at 8 PM on Sunday, 10/24/2004, we received another round of DDOS attacks against us, in particular against 66.150.196.200. Within two hours, we were able to normalize the traffic by adding the attackers' IPs to our switch ACL/firewall. During the next hours, we kept monitoring our network and kept adding more and more rules to our switch ACL. The attack continued well into the next morning, and everything was still performing well thanks to the filtered traffic. No customers even realized that we were still under attack. On Monday (10/25/2004) afternoon, at around 1:50 PM, our switch started to behave abnormally because it was simply too overloaded by the filtering processes. We quickly contacted Internap to have the switch rebooted. We also asked Internap to null route 66.150.196.200 so as to take all load off the switch. However, the switch simply refused to work normally again.
Thus at around 3:00 PM, we dispatched technicians to the data center. After a few hours of working on the switch, we realized that the switch configuration had been corrupted. We reset the switch configuration to the default settings and started to configure the settings for all VLANs as well as the routing tables. By 6:51 PM, the switch started to work again and traffic started to flow from/to all servers as we reconfigured the switch. By 7:30 PM, traffic had resumed normally to all servers. However, we still had 66.150.196.200 null routed since it was still a target of the DDOS attacks. During the period from 1:50 PM to 7:30 PM, our network was sporadic. Some people may not have been able to reach our network at all, while some were able to reach their websites fine. Others had problems only with certain services, et cetera. In any case, we will consider this a total network outage.
While in the data center, we prepared a dedicated firewall server to be placed between Internap's core switch and our core switch. The main job of this firewall is to take over the filtering task for all traffic coming into our network, so that the switch no longer has to do it. This way we have clean traffic going into our core switch and to all servers. By 10:30 PM, we started to load firewall rules, and by 11:28 PM 66.150.196.200 was being routed again. Traffic returned to normal shortly after, as we kept adding more and more IPs to our firewall deny list.
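To illustrate the deny-list filtering described above, here is a minimal Python sketch. It is not the actual firewall configuration: the `DenyList` class and the example addresses (taken from the reserved documentation ranges) are assumptions made purely for illustration.

```python
import ipaddress


class DenyList:
    """Hypothetical deny list: stores blocked sources and checks incoming addresses."""

    def __init__(self):
        self._networks = set()

    def block(self, cidr):
        # Accepts a single address ("203.0.113.7") or a network ("198.51.100.0/24").
        # strict=False normalizes an address with host bits set to its network.
        self._networks.add(ipaddress.ip_network(cidr, strict=False))

    def is_blocked(self, source_ip):
        # A source is dropped if it falls inside any blocked network.
        addr = ipaddress.ip_address(source_ip)
        return any(addr in net for net in self._networks)

    def __len__(self):
        return len(self._networks)


deny = DenyList()
deny.block("203.0.113.7")      # a single attacking host (made-up address)
deny.block("198.51.100.0/24")  # a whole attacking subnet (made-up range)

print(deny.is_blocked("203.0.113.7"))    # True
print(deny.is_blocked("198.51.100.42"))  # True
print(deny.is_blocked("192.0.2.10"))     # False
```

A linear scan like this is fine for a few hundred entries; a real packet filter would normally use a lookup structure designed for much larger rule sets.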
The DDOS attack is still going on now (Tuesday, 10/26/2004), and our dedicated firewall is filtering out the attacks as we expected. Even at this moment, we are still receiving a moderate amount of attack traffic, but thanks to our firewall, no customers are affected by it.
Some customers may have noticed that today, at around 5:04 PM, we had a brief network outage lasting ~10 minutes, as recorded by our off-site service monitoring (Alertra). This was not the result of a DDOS attack. For reasons that we are still investigating, one of our links to Internap dropped out while the second link to Internap failed to take over the traffic. For your information, we have two Ethernet drops from Internap's switch. Under normal operation, one of the links, where we have put our firewall, is used to pass all incoming traffic. The second Ethernet drop is used to pass all outgoing traffic. The two links are set up in such a way (through a HARP setup) that in the event of either link going down, the other link will be able to take the traffic within a few seconds. There was a glitch that prevented the failover from taking place. We worked with Internap to resolve the problem as soon as possible. By 5:14 PM, all incoming and outgoing traffic was flowing normally through our second link while we investigated the problem on the first link. We were able to bring back the first link shortly after. However, as part of the diagnosis, we had to disable our firewall rules, and thus the DDOS attacks were able to flow freely to 66.150.196.200. Again, customers on WIN2 may have been affected, though intermittently/sporadically. Everything returned to normal by 6:45 PM. No other customers were affected.
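The intended behavior of the two links can be sketched as a small decision function. This is only an illustration of the failover logic described above, not the actual mechanism; the function name and the labels `link1` and `link2` are hypothetical.

```python
def assign_traffic(link1_up, link2_up):
    """Return which link carries each traffic direction, or None if both are down.

    Assumed roles: link1 normally carries incoming traffic (through the firewall),
    link2 normally carries outgoing traffic.
    """
    if link1_up and link2_up:
        # Normal operation: traffic is split by direction.
        return {"incoming": "link1", "outgoing": "link2"}
    if link1_up:
        # Second link is down: the first link takes everything.
        return {"incoming": "link1", "outgoing": "link1"}
    if link2_up:
        # First link is down: the second link takes everything.
        return {"incoming": "link2", "outgoing": "link2"}
    # Neither link is available.
    return None


print(assign_traffic(True, True))   # {'incoming': 'link1', 'outgoing': 'link2'}
print(assign_traffic(False, True))  # {'incoming': 'link2', 'outgoing': 'link2'}
```

The second call corresponds to the state reached at 5:14 PM, when all traffic was flowing through the second link.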
Currently the network is functioning normally. Our firewall is blocking roughly 700+ IPs, and the number is still growing. The load on this firewall is still incredibly low, with 5, 10, and 15 minute load averages of 0.00. We will keep monitoring our network for any unusual traffic and will dedicate resources to maintaining network quality.
We accept responsibility for the network outages caused by these incidents. Although Alertra only detected the brief 10-minute network outage on 10/26/2004, and even though not all customers were affected in the same way or to the same extent, we will offer compensation to all customers in accordance with our SLA (http://www.fluidhosting.com/sla.php). The following compensation applies:
- Customers with websites hosted on WIN2 with the IP address 66.150.196.200. For shared hosting customers whose website IP is 66.150.196.200, your total downtime on 10/25/2004, together with the brief network outage on 10/26/2004, would have been at most 9 hours and 38 minutes. This downtime exceeds our uptime guarantee of 99.9%, and you are entitled to compensation of 30% of your monthly hosting fees.
- Other customers. Other shared hosting customers, VPS customers, reseller customers, colo customers, and dedicated customers would have had a total downtime of at most 5 hours and 40 minutes. This downtime exceeds our uptime guarantee of 99.9%, and you are entitled to compensation of 15% of your monthly hosting fees.
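For reference, the downtime figures above can be checked against the 99.9% guarantee with a short calculation. This sketch assumes the guarantee is measured over the calendar month of October (31 days), which is an assumption and not something stated in the SLA text quoted here. The function names are illustrative.

```python
from datetime import timedelta

# Assumption: uptime is measured over the calendar month of October (31 days).
MINUTES_IN_OCTOBER = 31 * 24 * 60  # 44,640 minutes


def allowed_downtime_minutes(guarantee_percent, period_minutes=MINUTES_IN_OCTOBER):
    """Downtime budget, in minutes, permitted by an uptime guarantee."""
    return period_minutes * (1 - guarantee_percent / 100.0)


def uptime_percent(downtime, period_minutes=MINUTES_IN_OCTOBER):
    """Uptime percentage for the period, given the total downtime."""
    down_minutes = downtime.total_seconds() / 60
    return 100.0 * (1 - down_minutes / period_minutes)


win2_downtime = timedelta(hours=9, minutes=38)   # customers on 66.150.196.200
other_downtime = timedelta(hours=5, minutes=40)  # all other customers

print(round(allowed_downtime_minutes(99.9), 2))  # 44.64
print(round(uptime_percent(win2_downtime), 3))   # 98.705
print(round(uptime_percent(other_downtime), 3))  # 99.238
```

Under that assumption, a 99.9% guarantee allows about 45 minutes of downtime in the month, so both groups fall below the guaranteed level.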
The compensation will be given as a service credit. Please email billing[at]fluidhosting.com within the next 7 days to receive this service credit. Please keep in mind that we do not compensate for loss of sales/opportunities during the network outage, as set forth in our Terms of Service (TOS) and SLA.
Some of you have been our customers since our early days, 3 years ago. During your time with us, you will have noticed that we rarely have unscheduled downtime extending for a long period. In fact, in the last 3 years of operation, we have only had three network outages lasting more than 30 minutes. The first two were on Christmas and New Year's Eve of 2001. Since 01/01/2002, for a duration of almost 23 months, we had no network outage lasting more than 30 minutes until yesterday, 10/25/2004. There was one network outage lasting around 30 minutes on September 23rd, 2003, due to a loss of power in the Internap facility, and another lasting around 4 minutes on January 18th, 2004. According to Alertra, our network uptime has been 99.997% since November of 2003.
I understand that a lot of people were upset by this incident, especially during this peak Halloween season. However, as you know by now, service interruption/unscheduled downtime is not typical of us. We do hope you understand that there is nothing one can do to prevent DDOS attacks. Even large companies such as Authorize.Net, WorldPay, etc. have had to suffer prolonged DDOS attacks recently. What one can do is contact/involve the relevant parties (authorities and ISPs) while at the same time filtering these attacks and hoping to get through them. There is no known mechanism that we can put together that would make us safe against any DDOS attack. We are working hard to make sure that all services run smoothly while the DDOS attacks are still going on.
This incident has brought new perspectives, especially for us. We are once again being tested on our level of hosting maturity, and we believe we can only improve ourselves through our struggles in overcoming the problems we face. The attackers may have been joyful when they saw their attacks succeed in bringing us down momentarily yesterday. But one thing they must not have realized is that we can only come out stronger after this incident. Below is what we are planning/implementing as a way to improve ourselves.
1. Better communications. As a reminder to everybody, we have community forums (http://forums.fluidhosting.com/) that are hosted off network, at our Equinix facility. We also have an IRC channel (server: irc.fluidhosting.com, #support) that is also hosted at our Equinix facility. During the outages yesterday, we kept everybody informed through our community forums. We welcome you all to visit our community forums and even to become a member there. Having said this, we also realized that during the network outage, communications were badly crippled because our mail server was not able to receive/send emails properly. Furthermore, our own website was also unreachable for most people. For this reason, we will move our main website and mail server off network to one of our other facilities (Equinix or Telehouse). This way, in the event of a network outage, communication can still be maintained. We plan to implement this in the next 2-4 weeks.
2. Better firewall infrastructure. Currently we only filter incoming traffic through our first link with Internap. In the event of an outage on the first link, all traffic will fail over to the second link. Currently there is no firewall in place on the second link, and thus when the first link goes down, although the network will stay up, we would be prone to attacks again. We will install another dedicated firewall to filter the second link by this coming Sunday.
3. Better network infrastructure. We will also set up switch redundancy by having two switches running together. Each switch will be connected to a separate link to Internap. We will then connect all servers to both switches. In the event that one switch fails, traffic will fail over to the other switch within seconds, thus minimizing any network downtime. This is the hardest and the most expensive item to implement. We plan to fully implement this before the end of this quarter.
Once again, please accept our sincerest apology for the incident and any inconveniences/losses caused. We see this as an opportunity to test and improve ourselves.
Should you have any further questions/concerns, please do not hesitate to contact us at support[at]fluidhosting.com.
Sincerely,
Dave Tong
Owner/Manager
Fluid Hosting, LLC