Developers Club geek daily blog

Low-level optimization of parallel algorithms or SIMD in .NET

2 years, 10 months ago

Today a huge number of tasks demand high system performance. Physical limits do not allow the number of transistors on a processor die to grow indefinitely. Transistors cannot be shrunk without limit: below a certain size, phenomena that go unnoticed in larger active elements begin to appear, and quantum size effects become strong. Transistors stop working as transistors.
Moore's law has nothing to do with it. It was and remains a law of cost, and the growth in the number of transistors on a die is rather a consequence of that law. So, to increase the power of computer systems, other methods have to be found. One of them is the use of multiprocessors and multicomputers. This approach is characterized by a large number of processing elements, which lets subtasks run independently on each computing device.

Read more »


Threads vs processes, using a native Node.js add-on for load testing as an example

2 years, 10 months ago
A little under a year ago I wrote a note about an attempt to build a load-testing tool for Node.js using its built-in facilities (the cluster and net modules). The comments rightly pointed out the need to analyze RPS and compare the tool with other benchmarks. After that comparison I came to the natural conclusion that a multiprocess service will never match a multithreaded one in performance, because exchanging data between processes is very expensive (we will see this in an example later).

Read more »


Parallelizing Strassen's algorithm on Intel® Xeon Phi(TM)

2 years, 10 months ago
Intel Xeon Phi(TM) coprocessors are PCI Express devices with an x86 architecture that deliver high peak performance: up to 1.2 TFLOPS (trillions of floating-point operations per second) in double precision per coprocessor. A Xeon Phi(TM) can run up to 244 threads simultaneously, and this has to be taken into account when programming for maximum efficiency.

Recently, together with Intel, we carried out a small study of how efficiently Strassen's algorithm can be implemented on the Intel Xeon Phi(TM) coprocessor. If you are interested in the subtleties of working with this device, or simply love parallel programming, read on below the cut.


Read more »


Learning English with Scala, using Future and Actor

2 years, 11 months ago
I decided to brush up my English. In particular, I wanted to expand my vocabulary considerably. I know there are plenty of programs that help with this in a game-like form. The catch is that I do not like gamification. I prefer the old-fashioned way: a sheet of paper with a table of words, transcriptions and translations. You study it, then test yourself, for example by covering the translation column. In short, the way I learned at university.

I had heard that the Oxford dictionary website has a list of the 3000 most frequently used words. Here is the list: www.oxfordlearnersdictionaries.com/wordlist/english/oxford3000/Oxford3000_A-B. I decided to take the Russian translations from here: www.translate.ru/dictionary/en-ru. There is only one problem: the content on these sites is not in a format that can be printed and studied. So the idea was born to do all of this in code, and not as a sequential algorithm but with everything parallelized, so that downloading and parsing all the words would not take (3000 words * 2 websites * 1 second) / 60 = 100 minutes. That estimate assumes 1 second per page for downloading and parsing out the translation and transcription (in reality I think it takes about 3 times longer, counting opening and closing the connection, and so on).
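The estimate above, and the effect of running the requests concurrently, can be sketched as follows. This is only an illustration in Java with `CompletableFuture`, not the article's Scala Future/Actor code: the class name, `fetchAndParse`, and the sleep that stands in for a real download are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFetchSketch {
    // Sequential estimate from the text: pages are processed one after another.
    public static double sequentialMinutes(int words, int sites, double secondsPerPage) {
        return words * sites * secondsPerPage / 60.0;
    }

    // Hypothetical stand-in for downloading and parsing one page:
    // it sleeps instead of touching the network, then returns a fake result.
    public static String fetchAndParse(String word, long millisPerPage) {
        try {
            Thread.sleep(millisPerPage);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return word.toUpperCase();
    }

    // Submits every word to a fixed-size pool and collects results in input order.
    public static List<String> fetchAll(List<String> words, int threads, long millisPerPage) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (String w : words) {
                futures.add(CompletableFuture.supplyAsync(() -> fetchAndParse(w, millisPerPage), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : futures) {
                results.add(f.join()); // waits for this page only
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sequentialMinutes(3000, 2, 1.0) + " minutes if done sequentially");
        System.out.println(fetchAll(List.of("apple", "book", "cat"), 3, 50));
    }
}
```

With N worker threads and pages that really take about the same time each, the wall-clock time shrinks roughly N-fold, as long as the remote sites and the network keep up.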


Read more »


Java 8 in parallel: learning to create subtasks and control their execution

2 years, 11 months ago
Hi all.
We continue the series of articles devoted to processing large volumes of data in parallel (a beautiful word, isn't it?).
In the previous article we got acquainted with the interesting tools of the Fork/Join Framework, which allow processing to be split into several parts and the separate tasks to be run in parallel. What is new in this article, you ask? My answer: more informative examples and new mechanisms for high-quality data processing. In parallel :) I will also tell you about resource usage and other features of working in this mode.
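As a reminder of what splitting work with Fork/Join looks like, here is a minimal sketch that sums an array. It is not taken from the article: the class name and the `THRESHOLD` value are assumptions chosen for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSumSketch extends RecursiveTask<Long> {
    // Assumed cutoff below which a range is summed directly instead of being split.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from;
    private final int to; // exclusive

    public ForkJoinSumSketch(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ForkJoinSumSketch left = new ForkJoinSumSketch(data, from, mid);
        ForkJoinSumSketch right = new ForkJoinSumSketch(data, mid, to);
        left.fork();                     // schedule the left half asynchronously
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;   // wait for the left half and combine
    }

    public static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSumSketch(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(parallelSum(data));
    }
}
```

Forking one half and computing the other in the current thread keeps the calling worker busy instead of leaving it idle while both subtasks run elsewhere.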



Everyone interested is welcome below the cut:

Read more »


Atomic processing of data blocks without locks

2 years, 11 months ago
Using algorithms without locks has always been somewhat frightening for developers. It is very hard to imagine organizing data access without locks in such a way that two or more threads cannot process the same data block at the same time. Most developers use standard lock-free containers such as stacks or linked lists, but nothing beyond that. In this article I would like to describe how to organize data access in a multithreaded environment without locks.

The main idea of the method is that each thread uses a separate buffer: it copies the data from the main buffer into it, processes the data, and then swaps the pointer to its own buffer with the pointer to the main buffer.
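One possible reading of this copy, process, swap idea is sketched below in Java with `AtomicReference.compareAndSet`. This is an assumption about the technique, not the author's implementation: the class and method names are hypothetical, and the sketch allocates a fresh private copy on every attempt rather than reusing the old main buffer as the thread's next private buffer.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class CopySwapSketch {
    // Reference ("pointer") to the current main buffer; a published array is never mutated.
    private final AtomicReference<int[]> main;

    public CopySwapSketch(int[] initial) {
        this.main = new AtomicReference<>(initial.clone());
    }

    // Copy the main buffer, process the private copy, then try to publish it.
    // If another thread published first, the CAS fails and the whole step is retried.
    public void update(UnaryOperator<int[]> process) {
        while (true) {
            int[] current = main.get();
            int[] privateCopy = current.clone();
            int[] processed = process.apply(privateCopy);
            if (main.compareAndSet(current, processed)) {
                return;
            }
        }
    }

    public int[] snapshot() {
        return main.get().clone();
    }

    public static void main(String[] args) throws InterruptedException {
        CopySwapSketch buffer = new CopySwapSketch(new int[] {0});
        Runnable worker = () -> {
            for (int i = 0; i < 1000; i++) {
                buffer.update(b -> { b[0]++; return b; });
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(buffer.snapshot()[0]);
    }
}
```

No thread ever waits on a lock here; a thread that loses the race simply repeats its copy and processing against the newer main buffer, so the processing function may run more than once and should have no side effects outside the private copy.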

Read more »


Altera + OpenCL: programming an FPGA without knowing VHDL/Verilog

3 years ago

Hi everyone!

Altera SDK for OpenCL is a set of libraries and applications that compiles code written in OpenCL into firmware for Altera FPGAs. It lets a programmer use an FPGA as an accelerator for high-performance computing without knowing HDL languages, writing code the way they are used to doing for a GPU.

I played with this tool on a simple example and want to tell you about it.

Plan:

Welcome below the cut! Careful, there will be pictures!

Read more »


Pony: the killer of...?

3 years ago
Everyone knows the progressive newcomers to programming, Go, Rust, Nim and Crystal, and all of them are very cool in their own areas.

For example:
  1. Go was born as a super simple, industrial language for solving problems quickly, built on ideas that everyone knows well, though some of them are nailed onto other languages (by 5 mm).
  2. Our second contender is Rust, a winner in life, though because of its difficult development history the community has come to see it as the future, fashionable replacement for C++. Its fate is not yet clear to me, since green threads and IO on top of them are still a struggle; I put it in the same row as C, for microcontrollers, drivers and operating systems.
  3. Crystal… I will say it plainly: it is a super-fast clone of Ruby. There is nothing more to say; it is steeped in Ruby's spirit.
  4. Nim (also Nimushka, or Nimrod): its resemblance to scripting languages gives it a special atmosphere, yet inside it is a rather complex organism. To me it is much like Haxe, with the same feel when programming in it.


And Pony is my favorite, a little pony. Judging by its looks and its name, it would be easy to walk right past this language… In short, I invite you below the cut.

Read more »


The Go scheduler

3 years ago
Translator's preamble: This is a fairly loose translation of a not-so-fresh (June 2013) but clear article about the new scheduler of parallel branches of execution in Go. The merit of this note is that it describes the new scheduling mechanism very simply, in layman's terms, as an introduction. For those who are not satisfied with a layman's explanation and want a detailed treatment, I recommend Scheduling Multithreaded Computations by Work Stealing: 29 pages with a rigorous and difficult mathematical apparatus for performance analysis, and 48 bibliography entries.

Introduction


The new scheduler designed by Dmitry Vyukov became one of the biggest new features in Go 1.1. It gave such a striking performance increase for parallel programs, without any code changes, that I decided to write something about it.

Read more »


Async/await and how it is implemented in C# 5.0

3 years, 1 month ago

A detailed look at the transformation of asynchronous code performed by the compiler


The async mechanism is implemented in the C# compiler with support from the .NET base class libraries. No changes to the runtime were required. This means that the await keyword is implemented by transforming the code into a form we could have written ourselves in earlier versions of C#. To study the generated code you can use the .NET Reflector or ILSpy decompilers. This is not only interesting but also useful for debugging, performance analysis and other kinds of diagnostics of asynchronous code.

Read more »