Developers Club geek daily blog

Plan vs. actual, dynamics, and profit on one chart with R

4 years, 7 months ago
Every time the previous year's financial results are summed up and the corresponding presentation is prepared, people puzzle over how to fit the key figures on one chart. Whatever the organization's field of activity, the summary usually begins with an analysis of the main financial indicators, separately for each line of business:
  • turnover for the year just ended (the actual figures);
  • the plans set earlier for that year (to analyze plan execution);
  • turnover for the previous year (to understand the dynamics);
  • profitability.
The standard column chart that can be built quickly in Excel gives, to put it mildly, a not very clear result. For example, if the business has four lines, the chart shows 16 columns standing side by side, and an unaccustomed viewer can easily confuse the leaders with the laggards.
Specialists familiar with R can use ggplot2 to build the required chart programmatically, for example, like the one here. The example uses 2012 figures from Unilever's annual report. Planned indicators are not public data, so they had to be invented and were set, for definiteness, at the level of "last year + 5%".
The initial figures are in Excel and look like this (data in millions of euros):
The chart built in RStudio looks as follows:
Check how intuitive the chart is: without looking at the figures, guess which indicator each element of the chart corresponds to. The explanations follow.
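A minimal sketch of how the input table for such a chart could be prepared in R. The figures are made-up placeholders, not Unilever's reported numbers, and all column names are assumptions; only the "last year + 5%" rule for the plan comes from the text. The plotting part is one possible layout, not the author's chart, and runs only if ggplot2 is installed.

```r
# Hypothetical figures (NOT the Unilever data), in millions of euros.
results <- data.frame(
  direction = c("A", "B", "C", "D"),
  prev_year = c(100, 200, 150, 50),
  actual    = c(110, 190, 165, 60),
  profit    = c(11, 19, 33, 3)
)

# The plan is invented as "last year + 5%", as in the text.
results$plan <- results$prev_year * 1.05

# Derived indicators the chart has to convey.
results$plan_execution <- results$actual / results$plan           # > 1: plan exceeded
results$growth         <- results$actual / results$prev_year - 1  # year-over-year change
results$margin         <- results$profit / results$actual         # profitability

# One possible layout: bars for actual turnover coloured by margin,
# open circles for last year, horizontal ticks for the plan.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(results, aes(x = direction)) +
    geom_col(aes(y = actual, fill = margin)) +
    geom_point(aes(y = prev_year), shape = 1, size = 3) +
    geom_errorbar(aes(ymin = plan, ymax = plan), width = 0.9)
  print(p)
}
```

With four lines of business this gives four bars instead of sixteen columns, with the plan and the previous year shown as marks on each bar.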

Extending the capabilities of MS Excel 2010 with R

4 years, 7 months ago
Good afternoon, dear readers.
Today I want to show how to combine the capabilities of the R language with the MS Excel 2010 office suite. Below I will explain how to extend the functionality of the built-in VBA language with R functions, with the help of the RExcel add-in. Installation instructions are easy to find online or on the official site.

Analysis and visualization of real tabular data in R

4 years, 7 months ago
This material will be useful to those who are learning R as a tool for analyzing tabular data and want to see a worked example of the main processing steps.
Below we show loading data from CSV files, parsing text strings with some data cleaning, aggregating data along analytical dimensions, and building charts.
The example makes active use of the data.table, reshape2, stringdist, and ggplot2 packages.

As the "real data", we take information on permits issued for carrying passengers and baggage by passenger taxi in Moscow. The data are made publicly available by the Department of Transport and Road Infrastructure Development of the city of Moscow. Data set page:
The source data have the following format:
1;"А248УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017263";"FORD FOCUS";"7734653292";"1117746207578"
2;"А249УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017264";"FORD FOCUS";"7734653292";"1117746207578"
3;"А245УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017265";"FORD FOCUS";"7734653292";"1117746207578"

1. Loading the primary data
The data can be loaded directly from the site. While loading, we immediately rename the columns to something more convenient.
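# Address of the source CSV file (empty in the original)
url <- ""
# Convenient column names; col_names avoids shadowing base::colnames()
col_names <- c("RowNumber", "RegPlate", "LegalName", "DocNum", "Car", "INN", "OGRN", "Void")
# Read identifiers as text so that leading zeros (e.g. in DocNum) survive;
# the class of the last column ("Void") is detected automatically (NA)
rawdata <- read.table(url, header = TRUE, sep = ";",
                      colClasses = c("numeric", rep("character", 6), NA),
                      col.names = col_names,
                      strip.white = TRUE,
                      blank.lines.skip = TRUE,
                      stringsAsFactors = FALSE,
                      encoding = "UTF-8")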
Now we can start the analysis and visualization...
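The loading step can be tried without network access on the three sample rows shown above. This is only a sketch: the rows carry seven fields, so the eighth "Void" column is dropped here, the header is absent, and base R `aggregate` stands in for the data.table aggregation the post mentions.

```r
# Three sample rows in the format shown above (no header, 7 fields).
sample_lines <- c(
  '1;"А248УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017263";"FORD FOCUS";"7734653292";"1117746207578"',
  '2;"А249УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017264";"FORD FOCUS";"7734653292";"1117746207578"',
  '3;"А245УЕ197";"ООО «ТАКСИ-АВТОЛАЙН»";"017265";"FORD FOCUS";"7734653292";"1117746207578"'
)

col_names <- c("RowNumber", "RegPlate", "LegalName", "DocNum", "Car", "INN", "OGRN")

# Same options as for the real file, but reading from text instead of a URL.
sample_data <- read.table(text = sample_lines, header = FALSE, sep = ";",
                          colClasses = c("numeric", rep("character", 6)),
                          col.names = col_names,
                          strip.white = TRUE,
                          stringsAsFactors = FALSE,
                          encoding = "UTF-8")

# Reading identifiers as character keeps the leading zero in DocNum.
print(sample_data$DocNum)

# Simple aggregation: number of permits per company (INN) and car model.
cars_per_company <- aggregate(RegPlate ~ INN + Car, data = sample_data, FUN = length)
print(cars_per_company)
```

Reading the identifier columns as numbers instead would silently turn "017263" into 17263, which is why the column classes are set explicitly.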

Animated plots in R (and a little about bifurcation, chaos, and attractors)

4 years, 8 months ago
I once needed animated plots for a presentation. The plots themselves caused no problems, but animating them required one more package, animation, which can be installed from CRAN.
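A small sketch of what such an animation could look like. The excerpt does not say which system the post animates, so the logistic map is used here purely as an assumed textbook example of bifurcation and chaos. The helper names are hypothetical. The GIF step needs the animation package plus ImageMagick, so it is wrapped in a function that is not called automatically.

```r
# Logistic map x[n+1] = r * x[n] * (1 - x[n]) (assumed example system).
logistic_orbit <- function(r, x0 = 0.5, n = 200) {
  x <- numeric(n)
  x[1] <- x0
  for (i in seq_len(n - 1)) {
    x[i + 1] <- r * x[i] * (1 - x[i])
  }
  x
}

# One animation frame: the orbit for a given value of r.
plot_frame <- function(r) {
  plot(logistic_orbit(r), type = "l", ylim = c(0, 1),
       xlab = "n", ylab = "x", main = sprintf("r = %.2f", r))
}

# Sweep r from 2.5 to 4 and write the frames to a GIF.
# Requires install.packages("animation") and ImageMagick on the system.
make_gif <- function(file = "logistic.gif") {
  animation::saveGIF(
    for (r in seq(2.5, 4, by = 0.05)) plot_frame(r),
    movie.name = file, interval = 0.1
  )
}

# For r = 2.5 the orbit settles on the fixed point 1 - 1/r.
orbit <- logistic_orbit(2.5)
print(tail(orbit, 1))
```

Calling `make_gif()` would then produce the animated file; as r grows, the frames show the orbit moving from a fixed point through period doubling into chaotic behaviour.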

Let's fix NAs!

4 years, 9 months ago
Quite often there are incomplete data sets in which some variables are undefined. In R, the content of such variables is marked as "Not Available", or NA for short. Accordingly, the question arises of how to deal with undefined values: should they be ignored or modified somehow?
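Both options, ignoring and modifying, can be sketched in a few lines of base R. The vector is made up for illustration, and replacing gaps with the mean is just one simple imputation choice, not necessarily the one the post recommends.

```r
# Made-up vector with two missing values.
x <- c(4, NA, 7, 1, NA)

# Detect and count the missing values.
missing_mask <- is.na(x)
n_missing <- sum(missing_mask)

# Most summaries propagate NA unless told to skip it.
mean_with_na <- mean(x)                 # NA
mean_ignored <- mean(x, na.rm = TRUE)   # mean of 4, 7, 1

# Option 1: ignore (drop) the missing values.
x_dropped <- x[!missing_mask]

# Option 2: modify them, here by substituting the mean of the observed values.
x_filled <- ifelse(missing_mask, mean_ignored, x)

print(x_dropped)
print(x_filled)
```

Dropping shortens the vector, which matters when rows of a data frame must stay aligned. Filling keeps the length but changes the distribution of the data.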

Building a SARIMA model with Python + R

4 years, 9 months ago


Good afternoon, dear readers.
After writing the previous post about time series analysis in Python, I decided to address the remarks made in the comments, but while fixing them I ran into a number of problems, for example when building a seasonal ARIMA model, since I did not find a suitable function in the statsmodels package. As a result, I decided to use functions from R for this, and my search led me to the rpy2 library, which allows calling functions from libraries of that language.
Many will ask «why would this be necessary?»; after all, it is simpler to just take R and do all the work in it. I fully agree with this statement, but it seems to me that if the data require preprocessing, that is easier to do in Python, using the capabilities of R for the analysis where needed.
In addition, I will show how to integrate the output of an R function into IPython Notebook.
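For orientation, this is roughly what the R side of such a setup could look like: a seasonal ARIMA fitted with `arima` from the built-in stats package. The excerpt does not say which R function or data the post uses, so the built-in AirPassengers series and the model orders below are assumptions for illustration only.

```r
# Built-in monthly series; the log transform stabilizes the growing variance.
y <- log(AirPassengers)

# Seasonal ARIMA (0,1,1)(0,1,1) with period 12; the orders are illustrative.
fit <- arima(y,
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast one year ahead and return to the original scale.
fc <- predict(fit, n.ahead = 12)
forecast_values <- exp(fc$pred)

print(fit)
print(round(forecast_values))
```

A function like this is what would be called from Python through rpy2, with the data prepared on the Python side.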

Introduction to parallel computing in R

4 years, 11 months ago
   This article is devoted to the R language. It is not as widespread in the territory of the former USSR as Matlab, let alone Python, but it certainly deserves attention. It should be noted that R is the de facto standard for Data Science (though, as has been well put here, data scientists do not live by R alone). Rich syntax, compatibility with legacy code (which is rather important in scientific applications), the convenient RStudio development environment, and the huge number of libraries on CRAN make R what it is.
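A minimal sketch of parallel execution using the parallel package that ships with R. The excerpt does not show which tools the article covers, so the choice of `makeCluster` and `parSapply`, the two workers, and the toy function are all assumptions.

```r
library(parallel)

# Toy task: a deliberately slow function (hypothetical example).
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

# Start a small cluster of worker processes.
cl <- makeCluster(2)

# Parallel counterpart of sapply(): the inputs are split across the workers.
par_res <- parSapply(cl, 1:20, slow_square)

# Always release the workers when done.
stopCluster(cl)

# The sequential version gives the same result.
seq_res <- sapply(1:20, slow_square)
print(all.equal(par_res, seq_res))
```

For tasks this small, the cost of starting workers outweighs the gain. The approach pays off when each call takes noticeably longer than the communication overhead.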

R: a choropleth map of Russia with an enlarged European part

5 years ago

Briefly, the main point: I recently read a post by infotanka, ended up on Tatyana Misyutina's site, and spotted there a choropleth map of Russia with an enlarged European part. It really is a great idea: convenient and clear. I wanted to make myself an R template for the same kind of charts. After all, good ideas should be multiplied by sharing...

R: the ellipse package for visualizing confidence regions

5 years ago
I doubt my ability to write valuable, informative posts, but I would like to be able to comment and ask questions.

The latest post in the R hub, «Visualization of a two-dimensional Gaussian on a plane», described an algorithm for constructing a confidence ellipse from a covariance matrix. The algorithm was accompanied by an example and an R script.

The following information will probably be useful to mephistopheies, the author of the post on Gaussian visualization, and to readers of the R hub. The R repository has a package called ellipse. It contains various procedures for constructing ellipses of confidence regions.

Let's consider an example.
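The post's own example is not included in this excerpt. As a stand-in, here is a short sketch that calls the default method of `ellipse()` from the ellipse package on a covariance matrix. The matrix and centre are made-up values, and the code only runs if the package is installed.

```r
# Made-up covariance matrix and centre of a two-dimensional Gaussian.
Sigma <- matrix(c(2, 0.8,
                  0.8, 1), nrow = 2, byrow = TRUE)
mu <- c(1, -1)

pts <- NULL
if (requireNamespace("ellipse", quietly = TRUE)) {
  # Points on the boundary of the 95% confidence region.
  pts <- ellipse::ellipse(Sigma, centre = mu, level = 0.95, npoints = 100)

  # Draw the ellipse and mark its centre.
  plot(pts, type = "l", asp = 1, xlab = "x", ylab = "y")
  points(mu[1], mu[2], pch = 3)
}
```

The result is a two-column matrix of boundary points, so it can be passed straight to `plot` or `lines`.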

Visualization of a two-dimensional Gaussian on a plane

5 years ago
Good day. While developing a clustering method, I needed to visualize a Gaussian on a plane (essentially, to draw an ellipse) from a given covariance matrix. Somehow I did not realize at first that the simple drawing of an ordinary ellipse from 4 numbers hides some difficulties. It turned out that computing the points of the ellipse involves the eigenvalues and eigenvectors of the covariance matrix, the Mahalanobis distance, and quantiles of the chi-square distribution, which, to be honest, I have not used once since university.
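The ingredients listed above fit together as follows, in a base R sketch. This is the standard textbook construction rather than the author's script, which is not part of the excerpt; the function name and the example matrix are hypothetical.

```r
# Boundary of the region that contains `level` of the probability mass
# of a two-dimensional Gaussian with mean mu and covariance Sigma.
gaussian_ellipse <- function(mu, Sigma, level = 0.95, n = 100) {
  e <- eigen(Sigma, symmetric = TRUE)          # axes: eigenvectors, eigenvalues
  r <- sqrt(qchisq(level, df = 2))             # radius from the chi-square quantile
  theta <- seq(0, 2 * pi, length.out = n)
  unit_circle <- rbind(cos(theta), sin(theta)) # 2 x n points on the unit circle
  shape <- e$vectors %*% diag(sqrt(e$values))  # rotate and stretch the circle
  t(mu + r * shape %*% unit_circle)            # n x 2 matrix of ellipse points
}

# Made-up example values.
Sigma <- matrix(c(2, 0.8,
                  0.8, 1), nrow = 2, byrow = TRUE)
mu <- c(1, -1)
pts <- gaussian_ellipse(mu, Sigma)

# Every point lies at the same squared Mahalanobis distance from the centre,
# equal to the chi-square quantile with 2 degrees of freedom.
d2 <- mahalanobis(pts, center = mu, cov = Sigma)
print(range(d2))
print(qchisq(0.95, df = 2))

plot(pts, type = "l", asp = 1, xlab = "x", ylab = "y")
```

The eigenvectors give the directions of the ellipse axes, the square roots of the eigenvalues give their relative lengths, and the chi-square quantile scales the whole ellipse to the chosen probability level.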

