Developers Club geek daily blog

2 years, 8 months ago
Our partners at ISPsystem suggested a joint promotion: a free ISPmanager 5 Lite license for every cloud VPS in the Netherlands and the USA until November. We thought, why not...

Especially since a VPS now costs little more than the license itself, thanks to the big sale currently under way. And there is more: we have decided to cut prices across the whole lineup, not just for the S and M servers, because we have put new storage into operation, built exclusively on SSD drives. Cloud VPS are now faster and, most importantly, the service has become stable (not long ago the cloud platform had serious problems caused by the SAN storage, and some of our subscribers were affected; the incident report is below the cut):

ORDER A CLOUD SERVER AT A MAGIC PRICE

S

Cores (vCPU): 1
Memory (vRAM): 1 GB
Disk quota: 40 GB (SSD storage)
Port: 1000 Mbps
Premium traffic: 4 TB
Cisco ASA 5500 firewall included!

$3.99/month (previously $9.00)

M

Cores (vCPU): 2
Memory (vRAM): 2 GB
Disk quota: 60 GB (SSD storage)
Port: 1000 Mbps
Premium traffic: 6 TB
Cisco ASA 5500 firewall included!

$7.99/month (previously $19.00)

L

Cores (vCPU): 4
Memory (vRAM): 4 GB
Disk quota: 80 GB (SSD storage)
Port: 1000 Mbps
Premium traffic: 8 TB
Cisco ASA 5500 firewall included!

$19.99/month (previously $39.00)

XL

Cores (vCPU): 8
Memory (vRAM): 8 GB
Disk quota: 160 GB (SSD storage)
Port: 1000 Mbps
Premium traffic: 10 TB
Cisco ASA 5500 firewall included!

$32.99/month (previously $59.00)
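To put the reductions in perspective, the following is a small sketch that computes the percentage discount for each plan. It assumes that each price line above lists the old monthly price followed by the promotional one, both in US dollars; the names `PLANS` and `discount_percent` are illustrative and do not come from the provider.

```python
# Assumption: each tuple is (old price, promotional price) in USD per month,
# read from the plan list above.
PLANS = {
    "S": (9.00, 3.99),
    "M": (19.00, 7.99),
    "L": (39.00, 19.99),
    "XL": (59.00, 32.99),
}


def discount_percent(old, new):
    """Return the price reduction as a percentage of the old price."""
    return (old - new) / old * 100


# Percentage saved per plan, rounded to one decimal place.
discounts = {
    name: round(discount_percent(old, new), 1)
    for name, (old, new) in PLANS.items()
}

for name, pct in discounts.items():
    print(f"{name}: {pct}% off")
```

Under that reading of the prices, the two smaller plans get the deepest cuts, a little over half off, while L and XL are reduced by somewhat less than half.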

Clouds go down, and ours is no exception: a report on the cloud platform incident

Following the regrettable incident affecting our Virtual Cloud Server/VPS service (plans S, M, L, XL),
and now that work to restore normal operation of the service is complete, we have decided to cover the matter publicly in more detail.

[ORIGINAL MESSAGE]
The problem first appeared on 31.08.2015 at about 20:00 CEST.
A fault was detected in one of the storage platforms, which led to unstable operation of the virtual machines.

Data center staff, together with the equipment vendor, began a detailed analysis of the situation
as soon as a monitoring trigger fired on the increased load on one of the nodes.
Work was carried out to reduce the load and restore normal operation of the platform.

[UPDATE SEPT 1st, 09:45 CEST]
After a thorough analysis by the data center staff and the equipment vendor, the faulty
hardware was replaced at 9:00 CEST. However, contrary to expectations, the load did not return
to normal, and work continued.

[UPDATE SEPT 1st, 12:15]
For the duration of the repair work, a decision was made to limit the storage bandwidth in order to reduce the load.

[UPDATE SEPT 1st, 17:15 CEST]
The investigation of this incident is still under way. The cause of the failure has not yet been established, but the data center staff and the
equipment vendor are doing everything they can to restore the platform as quickly as possible.

[UPDATE SEPT 1st, 23:00 CEST]
The platform has been stabilized, and engineers plan to bring all VPS back up within a few hours.
To keep the platform stable while the work continues, clients have been temporarily blocked from
powering servers on or off or resetting them, so that the load does not increase.

[UPDATE SEPT 2nd, 09.30 CEST]
The data center engineers worked through the night to stabilize the platform.
Operation has been restored, and some of the affected VPS have returned to normal operation. The remaining VPS are now in automatic recovery. Engineers are also manually rechecking every VPS affected by this incident, twice.

We are also announcing plans to move to a different storage platform, one that is fully SSD-based.

[UPDATE SEPT 2nd, 14.00 CEST] 
Platform operation has been restored, and all affected VPS will be fully recovered between 16:00 and 17:00 CEST today.

Migration of all VPS to the new storage platform will begin soon. The platform has already been tested, and preparations for the migration have begun.

[UPDATE SEPT 2nd, 15:30 CEST]
The high-load problem has recurred, and the performance of some VPS is significantly affected as a result.

[UPDATE SEPT 2nd, 18.40 CEST]
The problem recurred at 15:30 CEST. After analysis and recovery work by the data center engineers and the
equipment vendor, the load was stabilized at 17:30 CEST.

Preparations for migration to the new platform are complete, and the migration is scheduled to begin after 20:00 CEST.

[UPDATE SEPT 3rd, 01:00 hrs. CEST] 
As reported earlier, migration of VPS to the new SSD platform has begun.
The first batch of VPS has been migrated successfully, and data center staff are working to restore them to full operation.
According to the plan, restoring the first batch of VPS to normal operation on the new platform will take about two hours.

[FINAL UPDATE SEPT 3rd, 09:30 hrs. CEST]
Apology … 
The problem that caused the VPS failures was a hardware failure in part of the storage platform, which created excessive load on the nodes and in turn led to errors in the operation of the VPS.
The move to a faster and more reliable all-SSD storage platform had been planned earlier; this incident only accelerated it.
Most servers have already been migrated to the new platform.
Within the hour, the restrictions on VPS management will be lifted: reset, power off/on, and the limits on resource consumption that we had to impose to keep the migration smooth.
The remaining groups of VPS will be migrated next. Each pool of VPS will take
approximately 4 hours, and downtime for each VPS will not exceed 5-15 minutes.

This incident is typical neither of us nor of our partners (service providers).
We, the data center staff, and its equipment vendor have made every effort to minimize VPS downtime and, accordingly, our clients' losses.

We apologize once again for this incident. All affected subscribers have been credited with compensation in the form of about 3 months of free service. We hope for your understanding: it is very difficult to rule out hardware failures completely in new products, and nobody is immune to errors. Clouds go down for everyone sooner or later; what matters is the measures taken. Make backups and build in redundancy. For our part, we will try to make the service as stable as possible.

Sincerely yours, the UA-Hosting team.

This article is a translation of the original post at habrahabr.ru/post/267019/
