We may be approaching a turning point in the history of data centers. For many years, advances in flash memory have steadily expanded what our phones, cameras, and media players can do.
The arrival of flash-based solid-state drives made more modern tablets and notebooks possible. Such drives have also begun to appear in data centers.
Almost unnoticed, the pace at which new technologies emerge has sharply increased. NAND flash with a new vertical architecture has surpassed planar flash, making it possible to store considerably more data on each die. New non-volatile technologies, such as Intel-Micron 3D XPoint memory, are already approaching mass production.
Hard drives should not be forgotten either, since their technology also continues to develop. The result of all this will be an unprecedented change in the structure of data centers, one that will begin with hyperscale clouds but will gradually reach enterprises as well.
The technologies that make it possible
At the Flash Memory Summit in 2015, visitors could get acquainted with new data storage technologies and see how they are changing the world of data centers. Samsung vice president Jim Elliott announced the start of sales of a new 256-gigabit, 48-layer chip whose cells store three bits each.
"These chips have twice the sequential read speed and consume 40% less power than our older 128-gigabit devices," Elliott said. He added that the company intends to develop a 1-terabit chip with a 100-layer design.
Representatives of Toshiba, which had announced its own 256-gigabit, 48-layer chip a week earlier, briefly described a stacked-die structure in which incoming data is passed in parallel from the memory arrays directly to the controller by means of through-silicon vias. This approach avoids bottlenecks when data moves from one chip to another and increases throughput.
However, the advances in vertical memory were eclipsed by the announcement from Intel and Micron, which unveiled 3D XPoint. XPoint is a 128-gigabit chip with a latency one thousand times lower than that of flash. The technology could create a new category of non-volatile memory: fast enough to sit on the DRAM bus, and capacious enough to store large amounts of data.
However, the still-popular disk drives are not falling behind. "Hard drives still have a lot ahead of them," says Phil Brace, head of cloud systems at Seagate. "We ship disks with an areal density of 1 Tbit/inch². The technologies now in development, namely shingled magnetic recording, two-dimensional recording, and heat-assisted recording, will be able to reach a density of 5 Tbit/inch², and their cost will be less than one cent per gigabit. The 5-10x price difference between hard drives and SSDs will remain."
It is not surprising that a company that sells hard drives makes such optimistic forecasts. However, Brace's point of view is also shared by vendors of data storage systems.
"Flash has a big speed advantage over disk drives," says Oracle vice president Michael Workman, "but in cloud storage systems that advantage is practically erased. The huge price difference also plays a role. In five years, cloud storage will still be using high-performance hard drives with huge capacity."
Here for a while
The natural reaction of data-center architects to the arrival of many non-volatile technologies (which read data many times faster than disk drives) was to build a cluster of flash chips managed by a chip that emulates a disk controller. This is the SSD, and its replacement of hard drives in critical places (figure 1) is what has been happening in data centers for several years.
Figure 1 – In modern data centers, low-latency elements sit closer to the processor, while high-capacity elements form the basis of large shared arrays. Data moves between tiers over different interfaces
Inside servers, SSDs began to displace small, low-latency internal SAS drives. Elsewhere, large pools of flash became an alternative to, or a front end for, large high-capacity disk arrays. There have even been proposals to replace "cold" storage, the slow high-capacity disks that hold rarely used data, with flash arrays.
However, experts consider the complete replacement of disk drives with SSDs a mistake. One reason is that the existing interfaces, designed for disk drives, are poorly suited to flash chips and cannot extract their full performance.
"Flash arrays have higher throughput than disk drives," Workman explains, "but the SAS interface and, in some cases, the PCI Express (PCIe) bus nullify this advantage." It is better to use NVMe, the specification for access protocols to solid-state drives, than to treat flash arrays as clones of disk drives. But even NVMe lacks the reliability, availability, and serviceability (RAS) that large data stores need. Architects and storage system developers are therefore looking for new solutions. "SAS, SATA, and PCIe are the last century," declares Kevin Conley, chief technology officer of SanDisk.
"Where there is no opportunity to use PCIe, architects use a range of other technologies to connect to large data stores," Workman says. With the growing popularity of 10 GE adapters, iSCSI, which carries SCSI commands in IP packets over 10-gigabit Ethernet, may find a use. Workman also sees Fibre Channel and InfiniBand as possible alternatives.
The weak point of such a system is the server network interface card, for which the SAS interface is not well suited, so new interfaces are needed. Moreover, changes in the software that data centers run are forcing a rethink of how to use flash in the cloud.
Evolution of the software
In the days when data centers belonged to large IT enterprises, the software they ran was fairly uniform. Applications used SQL queries to access relational databases stored on hard drives. A great deal of time and effort went into strategies for finding frequently used sections of a database and moving them to high-performance disks or placing them in a DRAM cache.
With the advent of search engines and big-data analysis methods, the situation changed. In the new hyperscale data centers, data still resided on disk, but the structured relational databases disappeared. They were replaced by key-value stores or simply by large volumes of unstructured document data. MapReduce environments such as Hadoop worked with parallel streams of data flowing from disks into server DRAM arrays for analysis. Then everything changed again.
"Loading a single page of the Amazon website requires about 30 microservices," says Val Bercovici, director of technical strategy at NetApp. "You may have a Redis or Riak key-value store, a Neo4j graph database for determining related products, and a document-oriented MongoDB database. All of this is needed to build one page."
The main difference between the new applications and older ones such as Hadoop is their approach to data storage. For the most part, the new applications assume that the entire data set resides in memory. "The spread of the Memcached API and the emergence of Spark and Redis have led applications to devour memory," warns Riccardo Badalone, president of Diablo Technologies. "We need to find an alternative to DRAM."
A new architecture for new code
Seen from the point of view of the new applications, rather than of the hardware (gigabits, controllers, and buses), cloud servers look entirely different. The main goal for developers should be not to improve the performance of today's disk architecture, but to find a way to move all data into main memory. There is a gradual transition from carrying SCSI over 10 GE or InfiniBand to a common fabric for all memory: the DRAM bus.
"Soon there will be all-flash storage systems built on DDR, DRAM, and flash DIMMs," Workman says. Many agree with him. Badalone says: "It is time to treat flash as memory, not storage. Using all-flash DIMMs, we can put four or even ten times more working data on the server bus."
The result is a radical change in the structure of data storage: a small amount of DRAM gives way to a large array of fast flash memory. Deeper inside the data center we find high-capacity, highly reliable SSDs of tens or hundreds of terabytes that work with the server DIMMs in direct-access mode, bypassing the OS and the hypervisor and thereby reducing latency. Some architects call this disaggregation: storage is distributed throughout the data center, as close to the servers as possible.
As a result, DRAM, flash DIMMs, RDMA-attached memory, and "cold" storage form concentric layers of cache, a seamless architecture that passes smoothly from the first-level cache to persistent storage at the far end of the network (figure 2). From the application's point of view, all data will be in DRAM, and the data-center operator will not notice fluctuations in latency.
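The layered model described above can be sketched as a read-through hierarchy. This is only an illustration under stated assumptions: the tier names follow the article, but the latency figures, capacities, the FIFO eviction rule, and the `Tier` and `Hierarchy` classes are all hypothetical and chosen for readability, not taken from any vendor's design.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    latency_ns: int   # illustrative cost of probing this tier, not a measurement
    capacity: int     # maximum number of keys the tier can hold (assumed)
    data: dict = field(default_factory=dict)

    def put(self, key, value):
        # When full, evict the oldest inserted key (simple FIFO;
        # Python dicts preserve insertion order).
        if key not in self.data and len(self.data) >= self.capacity:
            del self.data[next(iter(self.data))]
        self.data[key] = value


class Hierarchy:
    def __init__(self, tiers):
        self.tiers = tiers  # fastest tier first, backing store last

    def read(self, key):
        """Return (value, total latency in ns) and promote the key upward."""
        cost = 0
        for depth, tier in enumerate(self.tiers):
            cost += tier.latency_ns
            if key in tier.data:
                value = tier.data[key]
                # Copy the value into every faster tier so the next read is cheap.
                for upper in self.tiers[:depth]:
                    upper.put(key, value)
                return value, cost
        raise KeyError(key)


tiers = [
    Tier("DRAM", 100, 2),
    Tier("flash DIMM", 1_000, 4),
    Tier("RDMA-attached memory", 10_000, 8),
    Tier("cold storage", 10_000_000, 1_000_000),
]
h = Hierarchy(tiers)
tiers[-1].put("page", b"payload")   # the data starts out only in cold storage

first = h.read("page")    # misses every cache layer, pays for all four tiers
second = h.read("page")   # now served from the DRAM tier
print(first[1], second[1])
```

The application only ever calls `read`. Which layer actually answered shows up as a difference in latency, which is the sense in which the hierarchy looks like one large memory.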
Figure 2 – The new model is a hierarchy of caches in which each layer uses a different memory technology
However, this system has a drawback. Low latency requires predictable timing for both reads and writes. "The timing characteristics of DDR4 are strictly defined," one of the speakers warned. The DDR4 protocol does not work with memory whose read behavior is deterministic but whose write behavior is entirely unpredictable.
If writes are infrequent, the controller can solve this problem with a reasonably fast memory buffer. It simply places the data in the buffer and writes it out as the hardware becomes free. Fortunately, most modern applications meet this condition of infrequent writes. Programs such as Spark and Redis rarely write anything to storage.
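The buffering idea can be sketched in a few lines. This is a minimal model under assumptions, not a description of any real DIMM controller: the `BufferedWriter` class, its capacity, and the `drain` call that stands in for "the hardware became free" are all hypothetical.

```python
from collections import deque


class BufferedWriter:
    """Accepts writes into a small fast buffer and flushes them to slow media later."""

    def __init__(self, capacity):
        self.capacity = capacity   # number of writes the buffer can hold (assumed)
        self.pending = deque()     # buffered (key, value) pairs, oldest first
        self.media = {}            # stands in for the flash array

    def write(self, key, value):
        # Accepting a write is quick as long as the buffer has room.
        # A full buffer is the case the bus protocol cannot tolerate.
        if len(self.pending) >= self.capacity:
            return False
        self.pending.append((key, value))
        return True

    def read(self, key):
        # The newest buffered value takes priority over what is on the media.
        for k, v in reversed(self.pending):
            if k == key:
                return v
        return self.media.get(key)

    def drain(self, max_items):
        # Called when the media is idle: flush up to max_items of the oldest writes.
        flushed = 0
        while self.pending and flushed < max_items:
            k, v = self.pending.popleft()
            self.media[k] = v
            flushed += 1
        return flushed


w = BufferedWriter(capacity=2)
accepted = [w.write("a", 1), w.write("b", 2), w.write("c", 3)]  # third write finds the buffer full
value_before_flush = w.read("a")   # served from the buffer
flushed = w.drain(1)               # media became free, one write lands
retry = w.write("c", 3)            # room again
print(accepted, value_before_flush, flushed, retry)
```

The sketch makes the condition in the text concrete: as long as writes arrive more slowly than `drain` can retire them, `write` never returns `False` and the memory looks deterministic from the bus side.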
Even in old SQL applications, writes happen less often than data-center managers believe. Shacher Fienblit, chief engineer at Kaminario, found that 97% of users write their entire data set less often than once a day. With a good controller, he says, the write load can be kept within 15% of all the data arriving in a day. Write buffers can cope with that.
Two new waves
Two technology trends could accelerate the transition to this new architecture. One is the non-volatile memory from the Intel-Micron announcement: 3D XPoint.
According to analyst Dave Eggelston, the technology is based on a phase-change element and an ovonic switch, which yields memory with 10 times the capacity of DRAM and 1000 times the performance of flash. "This will make it possible to create a new tier in the hierarchy," says Chuck Sobey.
Intel evidently agrees. At the most recent Intel Developer Forum, the company said it will begin developing DIMMs based on the new memory, along with a controller that extends the DDR4 memory protocol so that it can work with XPoint. These DIMMs will provide the non-volatile memory tier that software developers need.
The second technology was mentioned very often at the Flash Memory Summit: in-memory computing. Many speakers at the summit noted that the move to software such as Spark, which keeps all data in memory, creates natural conditions for doing the work in the DIMM rather than in the CPU.
"If things go this way, then in the future 90-95% of computation will happen in a persistent cache," said Rohit Kshetrapal, CEO of Tegile Systems. Naturally, Micron agrees. The company has bet heavily on its Hybrid Memory Cube technology, in which a processor/interface die is included in the stack of memory cells.
With the arrival of these technologies, data-center architecture will most likely not enter an SSD era at all. Having sidestepped the problems of disk software drivers and of schemes for attaching new memory, it will enter a new, unknown world. In this world, software will treat all memory as main memory, and there will be no disk drivers, storage APIs, or virtualization layers. All of these functions will pass to software-defined controllers and switches.
The server CPU's SRAM caches, DRAM, fast non-volatile memory, and networked flash SSDs will be joined, controller to controller, in a topology that can change along with the applications. Most of a data center's data will now be stored closer to the CPU.
Terabytes of data will be routed live over the DDR memory bus. Low-cost, high-capacity disk drives will take their place as storage for "cold" data. This architecture will differ greatly from what exists today.
This article is a translation of the original post at habrahabr.ru/post/273939/