So, we held the Data Festival, our festival of new technologies, at an exhibition here.
This was the first in a series of events that bring together experts from different areas of business, science, and public administration to talk about data analytics.
Data storage and analysis, once the prerogative of a narrow circle of companies and people, now affect the lives of practically everyone. That is why we started this series of events, in which we tell a wide audience about data and data analytics.
So, here is what happened at the Festival:
First, Andrey Ustyuzhanin (head of joint projects of Yandex and CERN) explained how machine learning helps to study dark matter.
Next, Alexey Vorobyov and Kirill Krasnoshchekov (GUP "NI i PI Genplana Moskvy") talked about using Big Data for city planning.
Natalya Kalaytanova (media expert at the DCA company) talked about how analytics changes the approach to media placements.
Nikita Kotlyarov from Avito talked about using machine learning to block fraudulent ads on Avito.
Yury Kashnitsky from School of Data "Beeline" talked about the importance of outlier analysis, using the example of identifying very successful Playboy models whose measurements do not fit the classical canons.
Rostislav Yavorsky (associate professor at the Department of Data Analysis and Artificial Intelligence, Faculty of Computer Science, Higher School of Economics National Research University) talked about social network analysis.
Sergei Marín, from the Big Data department at Beeline and founder of School of Data "Beeline", talked about using Big Data to create a personalized customer experience for each individual client.
All presentations are available here.
Also, as part of the Festival we held a hackathon on data analysis. The topic of the hackathon was predicting connections between subscribers.
Especially for the hackathon we generated synthetic data, as close to reality as possible, describing a graph of connections between subscribers. The graph had more than one million vertices.
We then added noise to the data in a special way by removing some of the connections. The task was to recover as many of the removed connections as possible without, along the way, creating many new edges that had never existed.
We did not limit the data to the mere fact that a connection exists between two users; we also added information about the volume and type of communication between them.
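The step in which some connections are removed from the graph can be illustrated with a minimal sketch. It assumes edges are hidden uniformly at random; the post only says the data were noised "in a special way", so the real procedure was probably different. The function name `hide_edges`, the `fraction` parameter, and the toy edge list are hypothetical.

```python
import random


def hide_edges(edges, fraction, seed=0):
    """Split an edge list into (kept, hidden) by hiding a random fraction.

    Assumption: uniform random removal. The organizers' actual noising
    procedure is not described in the post.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_hidden = int(len(shuffled) * fraction)
    return shuffled[n_hidden:], shuffled[:n_hidden]


# Toy graph, invented for illustration only.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
kept, hidden = hide_edges(edges, fraction=0.5)
print("given to participants:", kept)
print("to be recovered:", hidden)
```

Participants would see only the `kept` part, and the `hidden` part plays the role of the answer key.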
Description of the fields in the file:
A: ID of subscriber A,
B: ID of subscriber B,
x_A: operator ID of subscriber A,
x_B: operator ID of subscriber B,
c_AB: number of calls from A to B,
d_AB: duration of calls from A to B,
c_BA: number of calls from B to A,
d_BA: duration of calls from B to A,
s_AB: number of SMS messages from A to B,
s_BA: number of SMS messages from B to A
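To make the field list concrete, here is a minimal sketch that reads rows with these fields and ranks missing pairs by their number of common neighbours, a classic link-prediction baseline. This is not the organizers' Benchmark.ipynb. It assumes a comma-separated file with a header row that uses the field names above, and the sample rows and the function names `load_graph` and `common_neighbour_scores` are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Invented sample rows; the real file layout (separator, header) is an assumption.
SAMPLE = """A,B,x_A,x_B,c_AB,d_AB,c_BA,d_BA,s_AB,s_BA
1,2,0,0,3,120,1,40,0,2
1,3,0,1,1,15,0,0,1,0
2,3,0,1,2,60,2,75,0,0
2,4,0,0,5,300,4,210,3,3
3,4,1,0,1,20,0,0,0,0
"""


def load_graph(text):
    """Build an undirected adjacency map from the A and B columns."""
    neighbours = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        a, b = int(row["A"]), int(row["B"])
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def common_neighbour_scores(neighbours):
    """Score every unconnected pair by the number of shared neighbours."""
    scores = {}
    for u, v in combinations(sorted(neighbours), 2):
        if v not in neighbours[u]:
            shared = len(neighbours[u] & neighbours[v])
            if shared:
                scores[(u, v)] = shared
    return scores


graph = load_graph(SAMPLE)
scores = common_neighbour_scores(graph)
print(scores)
```

The sketch ignores the call and SMS columns and compares all pairs of vertices, which is fine for a toy sample but far too slow for a graph with more than a million vertices. A practical solution would limit candidates to, for example, neighbours of neighbours.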
Participants were also given code to get acquainted with the solution format and the internal checks:
Benchmark.ipynb: an example of a simple solution that converts the answer into the special format required for checking results.
Checker.ipynb: the code that checks the quality of a solution.
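The post does not say which metric Checker.ipynb computes. As a stand-in, the sketch below scores a submission by precision, recall, and F1 over undirected edges, which matches the stated goal of recovering many removed connections without adding many that never existed. The function names and the toy edge lists are hypothetical.

```python
def normalise(edges):
    """Treat (a, b) and (b, a) as the same undirected edge."""
    return {tuple(sorted(edge)) for edge in edges}


def score(predicted, hidden):
    """Return (precision, recall, f1) for predicted vs. truly hidden edges.

    Assumption: this is an illustrative metric, not the real checker's.
    """
    predicted, hidden = normalise(predicted), normalise(hidden)
    hits = len(predicted & hidden)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(hidden) if hidden else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
hidden = [(1, 4), (2, 5), (3, 6)]
predicted = [(4, 1), (2, 5), (7, 8), (9, 10)]
precision, recall, f1 = score(predicted, hidden)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision penalises edges that never existed, and recall rewards recovered connections, so a combined score discourages simply submitting every possible pair.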
During the hackathon we realized that the task was more interesting and more difficult than it had seemed to us earlier, so we decided not to limit ourselves to the initial four hours and gave registered participants until 18:00 on Wednesday, December 23. To do this, we quickly moved the hackathon online.
The online format worked as follows:
We created a Google Form in which registered participants provided the following information:
First name and surname (or nickname)
A direct link to the uploaded submission.csv
A comment, in case of questions
The resulting document was visible only to the organizers.
Once a day, or even more often, we:
Downloaded the solutions and ran them through the checker against the original data
Updated the leaderboard and participants' results
After 6 p.m. on Wednesday we summed up the results and determined the winners. They were:
1st place: Alexander Kukushkin. Prize: a certificate for training at School of Data Beeline
2nd place: Anton Ustinov. Prize: a ticket for the Quest
3rd place: Georgy Zubriyenko. Prize: headphones
Alexander posted a description of his solution here.
Well done, everyone! We will formally present all the prizes in the first week of January at the central office of "VimpelCom" in Moscow.
In general, we would like to say a big thank you to all participants of our Festival, and we hope you enjoyed the event and its organization.
This is the first event of its kind, and next year we plan many more. Watch for announcements on Habr and subscribe to news on the School's page.
To round off the year, and in keeping with our aim of telling a wide audience about data analytics, we went on air on radio Komsomolskaya Pravda, where we talked about data analytics, trends, and School of Data. A recording of the broadcast is available here.
Happy upcoming holidays to everyone, and see you in the new year!
This article is a translation of the original post at habrahabr.ru/post/274205/