I decided to try it out after reading the articles "How to Build a Fault-Tolerant Data Storage System from Domestically Made Servers" and "Don't Rush to Throw Out Old Servers: You Can Build a Fast Ethernet Storage System from Them in an Hour".
To begin with, I want to question one picture from the second article. According to the documentation, a cluster can consist of only two nodes, yet three are shown there (in blue).
I am writing this to discuss the problems that came up, since no answer ever arrived from EMC. Yes, the system was deployed on a test stand, and under the licensing terms the vendor provides no technical support for it. But searching the web did not produce the desired result either.
The test stand was configured as follows:
- two nodes with the MDM role (primary and secondary)
- one node with the Tie Breaker role. The GUI for monitoring and administration was also deployed on it.
- three nodes with the Data Server role. On each of them the storage devices were organized as follows: two devices were raw partitions on disks attached over iSCSI, and one device was backed by a large file.
- Windows Server 2012 Standard was the operating system on every node. Each node had 4 GB of RAM and a 1 Gbit/s network connection.
The first oddity appeared after installing the Meta Data Manager on the first node. Before it could be configured, the OS had to be rebooted: an attempt to run the add_primary_mdm command right after the installation finished persistently returned a connection error, even though all the required ports were in the LISTENING state and all the required services were running.
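A port shown as LISTENING does not always mean that a client can actually open a connection to it, so it can be worth probing the port directly before blaming the command. Below is a minimal sketch of such a probe. The host and port in the usage part are placeholders chosen for illustration, not values taken from the product documentation.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the full TCP handshake,
        # which is a stronger check than seeing LISTENING in netstat.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical values: substitute the node address and the port
    # that your installation is documented to use.
    node_host = "127.0.0.1"
    node_port = 6611
    state = "accepts" if port_accepts_connections(node_host, node_port) else "refuses"
    print(f"{node_host}:{node_port} {state} connections")
```

If the probe fails while the port is listed as LISTENING, a local firewall rule or a service that has not finished initializing are the usual suspects, which would be consistent with the reboot helping.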
After that, joining the second node, setting up the cluster, and installing the Data Server roles went without problems.
On each Data Server node, two storage devices in the form of raw partitions on iSCSI-attached disks and one large file on a local disk were connected successfully.
A peculiarity of the iSCSI disks was that they were served by computers on the network that were switched on and off haphazardly and unpredictably. This made it possible to thoroughly exercise the advertised fault-tolerance features, Rebuild and Rebalance. Over two weeks of observation I had no complaints about these aspects. Everything worked flawlessly.
Problems began when I tried to increase the number of devices attached to each Data Server node. I could not find out why new devices would not attach, either through the add_sds_device command or through the GUI. Every operation ended with the error "Communication error", and this happened on every node. At the same time, each of the devices being attached was visible in the OS as a block device and could be formatted, hold files, and be accessed over SMB without any trouble.
However, the most critical error appeared only a few weeks later.
One day I noticed that the cluster was in the Degraded state. There had been problems with electricity, and part of the network had been down during the night. Both Data Manager nodes were in the Secondary state, even though the Tie Breaker node was reachable over the network from both of them.
Forcing a node into the Primary role was impossible: nothing was listening on the administrative port, and the cluster settings could not be exported to a file.
In other words, all Data Server and Data Client nodes were running and exchanging data with each other over the network, the disk volume presented to the client was available, and data integrity was intact.
But it was a dead end: the cluster could neither be reconfigured nor extended with new nodes.
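What makes this state surprising is that, with two managers and a tie breaker, either manager that can still reach the tie breaker sees two of the three cluster members. The sketch below illustrates that with a generic majority-quorum rule. It is an assumption made for illustration, not the vendor's documented election algorithm, and the member names are hypothetical.

```python
def has_quorum(reachable, members):
    """Return True if the reachable members form a strict majority."""
    return len(set(reachable) & set(members)) * 2 > len(set(members))


# Hypothetical member names for a stand with two managers and a tie breaker.
MEMBERS = {"mdm1", "mdm2", "tie_breaker"}

if __name__ == "__main__":
    # A manager that sees itself and the tie breaker holds 2 of 3 votes.
    print(has_quorum({"mdm1", "tie_breaker"}, MEMBERS))
    # A manager that sees only itself holds 1 of 3 votes.
    print(has_quorum({"mdm1"}, MEMBERS))
```

Under such a rule, a manager with access to the tie breaker would be eligible to become Primary, which is why both managers staying in Secondary looks like a fault and not an expected outcome of the outage.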
I tried bringing up a new Primary Data Manager, creating a new cluster, and attaching the existing Secondary node to it. That faint hope died before it was born: the new cluster came up empty (which, in principle, had been clear from the very beginning).
One more minor drawback is that the GUI cannot be resized to fit the current monitor resolution: the window size is fixed and assumes a resolution of at least 1280x1024.
I spent a lot of time searching with Google but found nothing useful.
I then decided to visit the EMC website, where an online consultant chat window appeared. I asked for a contact in technical support and sent that person a letter describing the problems I had found.
In the reply (in Russian) I was asked clarifying questions. I answered them and was promised a response after a while. After waiting a week without an answer, I sent a reminder, but still received nothing in reply.
Based on the testing it describes, the second article linked at the beginning states that
"The fault-tolerance tests passed successfully"
I cannot agree with that. This is the first software-defined distributed storage system I have tested. I will gradually test others as well and write up the results.
This article is a translation of the original post at habrahabr.ru/post/273345/