Many thanks to Kirill Malev of the Merku company for preparing this article. Kirill has spent more than three years applying machine learning in practice to data sets of various sizes. At the company he works on customer churn prediction and natural language processing, paying close attention to commercializing the results. He completed master's studies at the University of Bologna and NGTU.
Today we will show you how to use the Azure cloud platform in practice to solve machine learning problems, using the popular task of predicting which Titanic passengers survived as an example.
We all remember the well-known picture about drawing an owl, so every step in this article is commented in detail. If any step is unclear, feel free to ask questions in the comments.
1 year, 3 months ago
While working on the article "Deep learning on R...", I came across several mentions of t-SNE, a mysterious technique for nonlinear dimensionality reduction and visualization of multidimensional variables (for example, here). I was intrigued and decided to work through it piece by piece. t-SNE stands for t-distributed stochastic neighbor embedding. The Russian rendering, with its "implementation of neighbors", sounds somewhat ridiculous, so from here on I will use the English acronym.
1 year, 4 months ago
The labor market is the classic interplay of supply and demand for labor. On the demand side, many recruitment agencies and job-search portals publish some analytics based on the available vacancies (though not always in the form needed). On the supply side (job seekers), there is far less analytics, and what exists is not universal: most often it is just a breakdown by desired income within a few broad fields, or simply by résumé title.
I wanted a tool that, for an arbitrary selection of résumés (by title, keyword, and so on), would show the main characteristics of that selection: the distribution of salaries, age, and much more, both graphically and as percentiles. The result of that wish is below the cut.
1 year, 4 months ago
Clustering usually means identifying several groups of objects such that characteristics are similar within a group and different between groups. The distinguishing feature of co-clustering is that it groups not only the objects but also the characteristics of those objects. In other words, if the data are presented as a matrix, clustering regroups either the rows or the columns of the matrix, whereas co-clustering regroups both the rows and the columns of the data matrix. As in my previous publications, the use of the methods and the visualization of the solutions are illustrated with survey results. Typical application areas for co-clustering algorithms are bioinformatics, image segmentation, and text analysis.
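The difference between regrouping rows only and regrouping both rows and columns can be sketched on a toy matrix. This is only an illustration in Python: the matrix, the cluster labels, and the function names are hypothetical, and the labels are assumed to be given rather than computed by an actual co-clustering algorithm.

```python
# Hypothetical data: rows are respondents, columns are survey questions,
# 1 means "agree". The block structure is hidden by the original ordering.
matrix = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# Assumed cluster labels; a real co-clustering algorithm would estimate them.
row_labels = [0, 1, 0, 1]
col_labels = [0, 1, 0, 1]


def order_by_label(labels):
    """Return indices sorted by cluster label (stable within a cluster)."""
    return sorted(range(len(labels)), key=lambda i: labels[i])


def regroup_rows(m, row_labels):
    """Clustering of rows only: permute the rows, leave the columns alone."""
    return [m[i] for i in order_by_label(row_labels)]


def regroup_rows_and_columns(m, row_labels, col_labels):
    """Co-clustering view: permute both the rows and the columns."""
    cols = order_by_label(col_labels)
    return [[m[i][j] for j in cols] for i in order_by_label(row_labels)]


rows_only = regroup_rows(matrix, row_labels)
both = regroup_rows_and_columns(matrix, row_labels, col_labels)

for row in both:
    print(row)
```

With rows regrouped alone, similar respondents become adjacent but the columns stay interleaved. Regrouping both dimensions turns the matrix into two dense diagonal blocks, which is the structure co-clustering is meant to expose.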
1 year, 5 months ago
The previous part of this publication considered non-negative matrix factorization as a method for dimensionality reduction and visualization of contingency tables. This part carries out a statistical analysis of the resulting tables using loglinear models. As a reminder, the examples use complex survey data: stratified, clustered, and weighted samples. This circumstance calls for special methods of model estimation and selection. Markov networks, a convenient tool for graphically representing the interactions among the factors of loglinear models, are used to visualize the results.
1 year, 5 months ago
Many people face the question of buying or selling property, and an important criterion is not to buy for more, or sell for less, than other comparable options. The simplest method is comparative: take the average price per square meter in the specific location and, using expert judgment, add or subtract a percentage of the cost for the merits and drawbacks of the specific apartment. But this approach is labor-intensive, imprecise, and cannot account for the full variety of differences between apartments. So I decided to automate the selection of real estate by using data analysis to predict a "fair" price. This publication describes the main stages of that analysis, selects the best predictive model out of eighteen tested models on the basis of three quality criteria, and finishes with the best (undervalued) apartments marked directly on a map, all within a single web application built with R.
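The simple comparative method described above can be sketched as follows. This is a minimal illustration in Python, not the author's R application or any of the eighteen models: the listings, district names, and function names are all hypothetical, and the expert adjustment is reduced to a single percentage.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical listings: (district, area in square meters, asking price).
listings = [
    ("center", 50.0, 100_000.0),
    ("center", 40.0, 88_000.0),
    ("center", 60.0, 102_000.0),
    ("suburb", 50.0, 50_000.0),
    ("suburb", 70.0, 77_000.0),
]


def average_price_per_meter(rows):
    """Mean asking price per square meter for each district."""
    by_district = defaultdict(list)
    for district, area, price in rows:
        by_district[district].append(price / area)
    return {d: mean(values) for d, values in by_district.items()}


def baseline_price(district, area, averages, adjustment_pct=0.0):
    """District average per meter times area, shifted by an expert percentage."""
    return averages[district] * area * (1.0 + adjustment_pct / 100.0)


averages = average_price_per_meter(listings)

# Listings whose asking price is below the unadjusted baseline.
underpriced = [
    (district, area, price)
    for district, area, price in listings
    if price < baseline_price(district, area, averages)
]

for item in underpriced:
    print(item)
```

Even in this toy form the weakness named in the text is visible: the baseline knows only the district and the area, so every other difference between apartments has to be squeezed into the hand-picked `adjustment_pct`.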