Mathematical secrets of "big data"

So-called machine learning never ceases to amaze, yet mathematicians are still not entirely sure why it works so well.

A few years ago, at a dinner I had been invited to, the distinguished differential geometer Eugenio Calabi volunteered to let me in on his tongue-in-cheek theory of the difference between pure and applied mathematicians. A pure mathematician who gets stuck in their research will often narrow the problem and so try to get around the obstacle. An applied mathematician, by contrast, concludes that being stuck means it is time to learn more mathematics in order to build more effective tools.

I have always liked this view; it makes clear that applied mathematicians will always find a use for the new concepts and structures that keep emerging in fundamental mathematics. Today, when the study of "big data" (blocks of information too large or too complex to be understood with traditional data-processing methods alone) is on the agenda, this tendency is as relevant as ever.

Investigation materials: "200 years since the birth of Ada Lovelace, the first programmer of mankind"

Date: December 10, 2015, to the head of department No. 8 from investigator id1033.
Request type: initiation of an investigation.
Reason: in connection with suspicious activity by user id1596704383 during the period from July 30, 2005 to December 9, 2015, I request that the necessary resources be provided per Form 2 and that authority be granted under the Observer-z protocol.
Justification: based on data obtained from open sources by the POPSII-2014 ("Juniper") analytics system, unique signatures were detected (assigned identifiers sig8876 through sig8951) that indicate active collection and analysis of network materials in the "Primary source-18" category. In accordance with the order of November 20, 2015 to report without delay any real-world activity related to "Primary source-18", I hereby report that on December 10 at 16:00 Moscow time, user id1596704383 moved to active operations in the real world.

I attach to this request the materials intercepted on December 10, 2015 from the drafts of user id1596704383 on the public resource Habrahabr.

"I am a devil or an angel" (Ada Lovelace, from a letter to Charles Babbage, 1843)

200 years since the birth of Ada Lovelace, the first programmer of mankind

On December 10, 1815, a daughter was born to the poet Byron; in 1842, at the age of 27, she wrote the first program for Babbage's (steam-powered) computer.

"The essence and purpose of the machine will change depending on what information we put into it. The machine will be able to write music, paint pictures, and show science paths that we have never seen anywhere." Ada Lovelace

Ada is a programming language created in 1979-1980 in a project run by the US Department of Defense, with the aim of developing a single programming language for embedded systems (that is, control systems for automated complexes operating in real time). This meant, first of all, onboard control systems for military hardware (ships, aircraft, tanks, missiles, shells, and so on). The language standard was approved on December 10, 1980.

Is it easy to recognize the information on a bank card?

When we talk to our customers, we, as specialists in this area, actively use the corresponding terminology, in particular the word "recognition". Meanwhile, an audience raised on Cuneiform and FineReader often takes this term to mean the task of mapping a cropped image region to a number (a character code). That task is solved today with neural networks, and it is far from the first stage of recognizing information. First the card has to be localized in the image, then the information fields have to be found and segmented into characters. Formally, each of these subtasks is a recognition problem in its own right. And while there are proven approaches and tools for training neural networks, orientation and segmentation require an individual approach every time. If you would like to learn about the approaches we used to recognize a bank card, read on below the cut!

The robust beauty of improper models

- Could you build us a statistical model?
- With pleasure. May I take a look at your historical data?
- We have no data yet. But we need the model all the same.

A familiar dialogue, isn't it? From here, events can unfold in two ways:

A. "Then come back when you have data." This option is trivial and will not be considered.
B. "Tell me which factors you think matter most." The rest of the article is about this one.

Below the cut: what improper models are, why their beauty is robust, and what it costs. All of it is illustrated with the long-suffering data set on the survival of Titanic passengers.

Rational approximations to pi

Check digits by the Damm method

A check digit is often added to identifiers that people may write down or pass on with errors, so that those errors can later be detected.

Examples include the last digit of a credit card number, the ninth character of the VIN of cars sold in the USA, and the last digit of an ISBN.

The Damm check digit algorithm is fairly new and therefore little known. It was published in 2004.

The algorithm detects all single-digit errors and all transpositions of adjacent digits. It is considerably simpler than the Verhoeff algorithm, which has comparable capabilities, and it does not require special characters (such as the X in a 10-digit ISBN).
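The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the function names are hypothetical, and the 10x10 table is the one most commonly reproduced for the Damm algorithm, which is assumed here to be the one the article relies on.

```python
# Quasigroup table commonly reproduced for the Damm algorithm
# (assumption: the article uses the same table). The diagonal is all
# zeros, which is what makes the check digit simply "the final state".
DAMM_TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2),
    (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9),
    (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8),
    (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4),
    (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5),
    (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def damm_check_digit(number: str) -> int:
    """Return the check digit for a string of decimal digits."""
    interim = 0
    for ch in number:
        # The row is the running state, the column is the next digit.
        interim = DAMM_TABLE[interim][int(ch)]
    return interim


def damm_is_valid(number_with_check: str) -> bool:
    """A number with its check digit appended must reduce to 0."""
    return damm_check_digit(number_with_check) == 0


print(damm_check_digit("572"))   # 4
print(damm_is_valid("5724"))     # True
print(damm_is_valid("5274"))     # False: adjacent digits swapped
```

Because the check digit is itself a decimal digit, no extra symbol is ever needed, unlike the X of ISBN-10.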

Writing an FEM solver in fewer than 180 lines of code

Today FEM is probably the most widely used method for solving a broad range of applied engineering problems. Historically it grew out of mechanics, but it was later applied to all kinds of non-mechanical problems as well.

There is a wide variety of software packages today, such as ANSYS, Abaqus, Patran, Cosmos, and others. These packages can solve problems in structural mechanics, fluid mechanics, thermodynamics, electrodynamics, and many other fields. Implementing the method is generally considered rather complex and voluminous.

Here I want to show that today, with modern tools, writing the simplest FEM solver from scratch for a two-dimensional plane stress problem is neither very difficult nor cumbersome. I chose this type of problem because it was the first successful application of the finite element method. And, of course, it is the simplest to implement. I am going to use a linear three-node element, since it is the only plane element for which numerical integration is not required, as will be shown below. For higher-order elements the idea is exactly the same, apart from the integration step (which is not entirely trivial, but rather interesting to implement).
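To illustrate why the linear three-node triangle needs no numerical integration, here is a minimal sketch in plain Python. It is not the article's solver: the function names, the degree-of-freedom ordering, and the material values in the usage lines are assumptions made for the example. The derivatives of the linear shape functions are constant over the element, so the strain-displacement matrix B is constant and the stiffness integral reduces to the product K = t * A * B^T * D * B.

```python
def plane_stress_matrix(young, poisson):
    """3x3 elasticity matrix D for an isotropic material in plane stress."""
    c = young / (1.0 - poisson ** 2)
    return [
        [c, c * poisson, 0.0],
        [c * poisson, c, 0.0],
        [0.0, 0.0, c * (1.0 - poisson) / 2.0],
    ]


def triangle_stiffness(nodes, d_matrix, thickness=1.0):
    """6x6 stiffness matrix of a linear three-node triangle.

    nodes: three (x, y) pairs; DOF order is u1, v1, u2, v2, u3, v3.
    """
    (x1, y1), (x2, y2), (x3, y3) = nodes
    # Twice the signed area of the triangle.
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    area = abs(two_area) / 2.0

    # Derivatives of the shape functions; constant over the element.
    dn_dx = [(y2 - y3) / two_area, (y3 - y1) / two_area, (y1 - y2) / two_area]
    dn_dy = [(x3 - x2) / two_area, (x1 - x3) / two_area, (x2 - x1) / two_area]

    # Strain-displacement matrix B (3x6): rows are eps_x, eps_y, gamma_xy.
    b = [[0.0] * 6 for _ in range(3)]
    for i in range(3):
        b[0][2 * i] = dn_dx[i]
        b[1][2 * i + 1] = dn_dy[i]
        b[2][2 * i] = dn_dy[i]
        b[2][2 * i + 1] = dn_dx[i]

    # D * B (3x6), then K = t * A * B^T * (D * B); no quadrature needed.
    db = [[sum(d_matrix[r][k] * b[k][col] for k in range(3))
           for col in range(6)] for r in range(3)]
    return [[thickness * area * sum(b[k][row] * db[k][col] for k in range(3))
             for col in range(6)] for row in range(6)]


# Hypothetical usage: unit right triangle, E = 1, nu = 0.
d = plane_stress_matrix(1.0, 0.0)
k = triangle_stiffness([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], d)
print(k[0][0])  # 0.75
```

A full solver would still have to assemble these element matrices into a global system, apply constraints and loads, and solve the resulting linear equations; none of that is shown here.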

sin 1° on a calculator

An important clarification: the calculator is an ordinary one, without a sin button. Like the ones used in an accounting office or at a market.

Below the cut: three different solutions from different eras, from ancient Samarkand to the USA of the Cold War.
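For orientation, here is one way to get the value using only the four arithmetic operations such a calculator offers. It is a plain Taylor series sketch and is not claimed to be any of the article's three solutions; the function name and the hand-typed value of pi are assumptions of the example.

```python
# Pi typed in by hand: an ordinary calculator has no pi key either.
PI = 3.14159265358979


def sin_degrees_four_ops(degrees, terms=4):
    """Approximate sin of an angle in degrees with +, -, *, / only.

    Uses the Taylor series x - x^3/3! + x^5/5! - ..., building each
    term from the previous one so no powers or factorials are needed.
    """
    x = degrees * PI / 180.0
    term = x
    total = x
    for n in range(1, terms):
        # Next term: multiply by -x^2 and divide by (2n)(2n + 1).
        term = -term * x * x / ((2 * n) * (2 * n + 1))
        total = total + term
    return total


print(sin_degrees_four_ops(1.0))
```

For an angle as small as 1° the series converges very quickly: already the first two terms agree with the true value to roughly ten decimal places.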

Products and coproducts

This is the fifth article in the series "Category Theory for Programmers". The previous articles have already been published on Habr, translated by Monnoroch:
0. Category Theory for Programmers: Preface
1. Category: The Essence of Composition
2. Types and Functions
3. Categories Great and Small
4. Kleisli Categories

In the header image, Pyotr the pig delivers one tractor to each object of the category.

Follow the arrows

The Ancient Greek playwright Euripides wrote: "Every man is like the company he keeps." This is true of category theory as well. An object of a category can be singled out only by describing the nature of its relationships with other objects (and with itself), where the relationships are morphisms.

To define objects in terms of their relationships, category theory resorts to so-called universal constructions. To do this, one picks a pattern, that is, a diagram of objects and morphisms of a particular shape, and looks at all the constructions in the category under consideration that fit it. If the pattern is common enough and the category is large enough, there will most likely be a great many matching constructions. The idea of a universal construction is to rank these constructions according to some rule and pick the best one.

This process can be compared to a web search. The user's query is our pattern. If the query is not very specific, the search engine will respond with many matching documents, only some of which are relevant. To weed out irrelevant answers, the user refines the query, which increases the precision of the search. Finally, the search engine ranks the hits and, with luck, the desired result will be at the top of the list.

Yandex announces Meteum, its own weather forecasting technology. Accurate down to a single building

Today we are announcing a new technology, Meteum. With it, Yandex.Weather will now build its own weather forecast rather than rely solely on partner data, as it did before.

The forecast will be calculated separately for each point from which you request it, and recalculated every time you look at it, so that it is as up to date as possible.

In this post I want to talk a little about how the world of weather models works today, how our approach differs from the usual ones, why we decided to build our own forecast, and why we believe ours will turn out better than everyone else's.

We built our forecast using a traditional atmospheric model and the most detailed grid, but we also tried to gather every possible source of data on atmospheric conditions, together with statistics on how the weather behaves in practice, and applied machine learning to these data to reduce the probability of error.

There are currently several main models in the world that are used to forecast the weather. Examples are the open-source WRF model and the GFS model, both of which were originally developed in the USA. The NOAA agency is now in charge of its development.

