
How to build a fault-tolerant storage system from Russian-made servers
Cluster node: a Russian-made Etegro server (2× AMD Opteron 6320, 16 GB RAM, 4 HDDs)

The storage systems actually used in Russia today fall roughly into three categories:
  • Very expensive high-end storage arrays;
  • Mid-range arrays (tiering, HDD, hybrid solutions);
  • And budget clusters built from SSD and HDD arrays of "consumer-grade" disks, often assembled by hand.

And it is not a given that the latter are slower or less reliable than high-end gear. They simply take a completely different approach, one that may not suit, say, a bank, but fits almost any mid-sized business or cloud deployment. The general idea: take a lot of cheap, nearly "consumer-grade" hardware and join it into a fault-tolerant configuration, compensating for its weaknesses with the right virtualization software. One example is the domestic RAIDIX, built by our colleagues in St. Petersburg.

And now EMC, known for its devilishly expensive and reliable hardware, has entered this market with software that lets you easily stand up a VMware farm, a virtual storage system, and any other workload on the same x86 servers. And the story began with servers of Russian production.

Russian servers


The initial idea was to take our domestic hardware and use it to stand up the infrastructure pieces missing for import substitution: virtual machine farms, VDI servers, storage systems for application servers and other workloads.

A Russian server is a wonderful piece of hardware that is 100% assembled in Russia and is 100% domestic by all the regulations. In practice, the individual parts are shipped in from China and other countries, the cables are bought locally, and the assembly is done on the territory of the Russian Federation.

It turns out not so bad, actually. The hardware is workable, though reliability is lower than, say, HP's, which is offset by the price. This leads to a situation where not-the-most-stable hardware has to be compensated for by good management software. At this point we started experimenting with EMC ScaleIO.

The experience was positive, and during the experiments it became clear that the hardware does not matter much. That is, it can be swapped for proven gear from well-known brands: slightly more expensive, but fewer support headaches.

As a result the concept changed: now we simply talk about the benefits of ScaleIO on different hardware, including (and above all) the lower price segment.

Down to business: the test results


Here is how ScaleIO works (http://habrahabr.ru/company/croc/blog/248891/): we take servers, pack them full of disks (for example, the same SSDs that were once meant to replace HDDs), and join it all into a cluster:

[diagram: ScaleIO cluster architecture]
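Under VMware this assembly is driven by a vSphere plug-in (more on that below), but on bare Linux the same thing is done by hand with ScaleIO's scli utility. A rough sketch from memory; the protection domain and pool names are invented, the IP is hypothetical, and exact flag spellings differ between ScaleIO versions, so check your version's scli reference:

    # log in to the MDM (the cluster's metadata manager)
    scli --login --username admin

    # register a server as an SDS (data server) in a protection
    # domain and hand its raw disk over to a storage pool
    scli --add_sds --sds_ip 192.168.10.11 \
         --protection_domain_name pd1 \
         --storage_pool_name pool1 \
         --device_path /dev/sdb

    # repeat for every server and every disk in the cluster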

The configuration we checked in the lab this time is the integration of EMC ScaleIO and VMware. Colleagues from Etegro kindly lent us 4 servers, each with two AMD Opteron 6320 processors and 16 GB of RAM. Each held 4 disks. Not the roomiest configuration; I would have preferred servers with 25 2.5-inch bays, but you work with what you have, not with what you wish for.

Here are the servers in the rack:

[photo: the servers in the rack]

I divided the disk space in each server into 3 parts:
  • 10 GB for ESX; that is quite enough for it.
  • 30 GB for ScaleIO's internal needs (more on that later).
  • All the remaining space is served out through ScaleIO.

The first thing to do is install VMware. We will deploy ESX over the network; it is faster that way. The task is nothing new, and a virtual machine with a PXE server earned its place on my laptop long ago.
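For reference, a minimal sketch of what such a PXE setup looks like: a pxelinux menu entry that chains into the ESX installer via VMware's mboot.c32 module. The paths and label names here are illustrative; they depend on where you unpacked the installation image:

    # /tftpboot/pxelinux.cfg/default -- boot the ESX installer over the network
    DEFAULT esx_install
    LABEL esx_install
      KERNEL mboot.c32
      APPEND -c esx/boot.cfg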

[photo: our lab]

As you can see, our lab holds plenty of test equipment. There are 4 more racks to the right and another 12 on the first floor. We can assemble practically any stand at a customer's request.

Software Defined Storage works in such a way that each node may request a rather large amount of data from the other nodes. This means that backend network throughput and response time are critical in such solutions. 10G Ethernet suits this task well, and Nexus switches already live in this rack.
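Before deploying, it is worth making sure the backend network actually delivers its 10G. A quick check with iperf3 (my tool choice, not from the original post; the host name is hypothetical):

    # on the receiving node
    iperf3 -s

    # on the sending node: 4 parallel streams for 10 seconds
    iperf3 -c node2.lab.local -P 4 -t 10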

Here is the scheme of the resulting solution:

[diagram: the resulting solution]

Installing ScaleIO on VMware is very simple. It really consists of 3 steps:
  1. Install the plug-in for vSphere;
  2. Open the plug-in and start the installation, answering the wizard's 15-20 questions along the way;
  3. Watch ScaleIO roll itself out across the servers.


[screenshot: the deployment wizard]

If the wizard finishes without errors, a dedicated ScaleIO section appears in vSphere where you can create LUNs and present them to the ESX servers.

[screenshot: the ScaleIO section in vSphere]

Alternatively, you can use the standard ScaleIO console installed on your local computer.

[screenshot: the ScaleIO console]
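The same operations can also be scripted through scli instead of the GUI. Another hedged sketch: the volume name, size, and SDC address are made up, and flag spellings may differ between versions:

    # carve a 100 GB volume out of the pool
    scli --add_volume --protection_domain_name pd1 \
         --storage_pool_name pool1 \
         --size_gb 100 --volume_name vol01

    # present it to a client (SDC) -- an ESX host in our case
    scli --map_volume_to_sdc --volume_name vol01 --sdc_ip 192.168.10.21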

Now a small performance test. On each host I created a virtual machine with two 50 GB disks and spread them evenly across the datastores. The load was generated with IOmeter. Here are the best results I managed to get under a 100% random, 70% read, 2 KB workload.

[screenshot: IOmeter results]
3000 application IOps from 4 servers with SAS disks is quite a decent result.
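For anyone who wants to reproduce the pattern on a Linux guest, roughly the same workload can be described with fio (my substitute for IOmeter; the target device path is hypothetical, and fio will overwrite it):

    # 100% random, 70/30 read/write, 2 KB blocks, direct I/O
    fio --name=scaleio-test --filename=/dev/sdb \
        --rw=randrw --rwmixread=70 --bs=2k \
        --ioengine=libaio --iodepth=32 --direct=1 \
        --runtime=60 --time_based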

For entertainment I tried to "bring down" the system by pulling nodes out in different orders and at different intervals. If you pull nodes one at a time and give ScaleIO enough time to rebuild, the virtual machines keep working even when only one node is left. If you disconnect 3 nodes within a minute, though, access to the shared space stops until those nodes are brought back. The data then becomes available again, and the array runs an integrity check and a rebuild (if needed) in the background. So the solution turns out reliable enough to use for production tasks.
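During such pull-the-plug experiments it is convenient to watch the rebuild from the command line. I believe scli exposes this through its query commands, though the exact command names and output format vary by version, so treat this as a pointer rather than a recipe:

    # overall cluster state, including degraded/rebuilding capacity
    scli --query_all

    # per-SDS view of the data servers
    scli --query_all_sds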

That is probably all on the virtualization side. Time to sum up.

Summary


Pluses:
  • The price of the solution (CPU + memory + disks) is competitive and in many cases will be significantly lower than servers + storage in comparable single-vendor offers.
  • You can use any hardware and get a replacement for a "big beautiful" storage array for the same tasks. If needed, the servers can be swapped for more powerful ones without buying any additional ScaleIO licenses: it is licensed per terabyte, with no tie to the hardware.
  • The solution is convergent: virtual machines and storage run on the same servers. Very convenient for practically any mid-sized business. All-flash storage is no longer a fantasy at this level.
  • Plus it needs less rack space and consumes less power.
  • Good balancing: IO is spread evenly across all disk resources.
  • The solution can be stretched across 2 different sites, with mirroring between them configured within a single ScaleIO cluster.
  • For synchronous and asynchronous replication between clusters you can use virtual RecoverPoint.

Minuses:
  • First, you have to apply your brain. Expensive solutions are, as a rule, good in that they deploy very quickly and demand almost no special training. Since ScaleIO is, in essence, a self-assembled storage system, you have to understand the architecture, dig through a couple of forums, and experiment on your own data to pick the optimal configuration.
  • Second, you pay for redundancy and fault tolerance with disk space. The conversion factor from raw to usable space depends on the configuration, and you may well need more disks than you initially thought; a worked example follows below.
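A back-of-the-envelope illustration, assuming ScaleIO's two-copy mesh mirroring and a spare reserve of one node's worth of capacity (a common recommendation); the disk sizes are invented, not from our stand:

    raw:    4 servers x 4 disks x 1 TB           = 16 TB
    spare:  one node's worth reserved, 16 x 25%  =  4 TB
    usable: two-copy mirror, (16 - 4) / 2        =  6 TB

So only a bit over a third of the raw space ends up usable; plan disk purchases accordingly.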


As a reminder, you can read about the software side in detail here, and I will gladly answer questions in the comments or by mail at RPokruchin@croc.ru. And in a month, on November 26, my colleagues and I will hold an open ScaleIO test drive featuring a Kärcher vacuum cleaner, a sledgehammer, and other props, until the guys cry "Aaah, dammit!". Registration here.

This article is a translation of the original post at habrahabr.ru/post/269289/

