Some modern approaches in natural language processing

1 year ago
Research results from recent years in speech recognition [1], machine translation [2], sentence-level sentiment analysis [3], and part-of-speech tagging [4] have shown the promise of deep learning neural network algorithms compared with classical methods of natural language processing. However, many problems remain unsolved in question answering and dialogue systems [5, 6]. This article gives an overview of results from applying modern algorithms to natural language processing and understanding tasks. The overview describes several different approaches and does not claim to be exhaustive.

Human: how many legs does a cat have?
Machine: four, i think.
Human: What do you think about messi?
Machine: he ’s a great player.
Human: where are you now?
Machine: i ’m in the middle of nowhere.

(From the paper "A Neural Conversational Model". The lead image is from the movie Ex Machina.)

On taking part in a Beeline hackathon

1 year ago
Last weekend the Museum of Moscow hosted an exhibition, within which Beeline held a hackathon. I decided to drop by, just in case. An interesting challenge was offered: a graph is given in which the vertices are subscribers and each edge records the number of calls from one subscriber to another, their duration, and the number of SMS messages. The data looked like this:

A and B are subscribers, x is the operator, c is the number of calls, d is the call duration, and s is the number of SMS messages. In total there are ~6,000,000 edges. In addition, there was a secret set of edges that had been removed from the graph at random beforehand. The task was to guess which edges those were, that is, to predict from the known set of connections which other connections may exist.

First I took 10,000 pairs of subscribers that were connected and 10,000 pairs that were not. The two main differences were the following:
  1. If subscribers are connected, then one of them almost always has operator 0. This happens because Beeline has information only about its own clients.
  2. If subscribers are not connected, then they almost always have no common contacts.

Roughly speaking, then, my solution was this: if a pair of subscribers has at least one common contact and at least one of them uses operator 0, add an edge between them. The only problem was that the graph had ~1,000,000 subscribers, so checking the number of common contacts for every pair by brute force was impossible. Here, once again, the algorithm already mentioned twice on this site comes to the rescue, in the articles about finding similar groups in VK and about finding related queries. I will briefly describe the idea. Suppose there are 5 edges:
A    B
1    10
2    10
3    10
1    11
2    11

Subscribers 1 and 2 share two contacts, 10 and 11. Group the edges by B and, for each group, write out all pairs of A values:
1    2
1    3
2    3
1    2

Count the frequency of each pair and, lo and behold, the pair (1, 2) has frequency 2. This algorithm maps well onto the MapReduce paradigm, so the 20-line nano-Hadoop comes in very handy here again.
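The grouping-and-counting steps above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not the author's nano-Hadoop code: the function names, the `operator` mapping, and the toy operator values are assumptions made for the example, and the edges are treated as pointing from A to B only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def common_contact_counts(edges):
    """Count shared contacts for every pair of subscribers.

    edges: iterable of (a, b) pairs, read as "a is connected to b".
    Returns a Counter mapping (a1, a2), with a1 < a2, to the number of
    b values that both a1 and a2 are connected to.
    """
    # "Map" step: group the A side of every edge by its B value.
    groups = defaultdict(set)
    for a, b in edges:
        groups[b].add(a)

    # "Reduce" step: emit every pair inside a group and count how often
    # each pair occurs across all groups.
    counts = Counter()
    for members in groups.values():
        for pair in combinations(sorted(members), 2):
            counts[pair] += 1
    return counts


def predict_edges(counts, operator, min_common=1):
    """Apply the rule from the text: enough common contacts, and at
    least one subscriber of the pair uses operator 0.

    operator: hypothetical dict mapping subscriber id to operator id.
    """
    return {
        (a, b)
        for (a, b), n in counts.items()
        if n >= min_common and (operator[a] == 0 or operator[b] == 0)
    }


# The five edges from the example in the text.
edges = [(1, 10), (2, 10), (3, 10), (1, 11), (2, 11)]
counts = common_contact_counts(edges)

# Operator values are invented purely for illustration.
operator = {1: 0, 2: 1, 3: 1}
predicted = predict_edges(counts, operator)
```

On the toy data the pair (1, 2) gets a count of 2 and the other two pairs get 1, matching the list above. Note that a group of k subscribers emits k(k-1)/2 pairs, so very popular contacts dominate the running time on a real graph.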

To check the quality of the solution, I removed 20% of the edges from the graph and tried to guess them. The organizers used the F1 score as the metric. Guessing at random gives F1 ~ 0. The baseline provided by the organizers scores ~0.02. My solution scores ~0.07. It turned out that edge direction is not taken into account during checking, so F1 comes out a little higher: ~0.08.
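For reference, here is a minimal sketch of how an F1 score over sets of edges can be computed, with a switch for ignoring edge direction. The organizers' actual scoring code is not shown in the post, so the function name, the `directed` flag, and the toy edge sets are assumptions made for illustration.

```python
def f1_score_edges(predicted, actual, directed=True):
    """F1 score of predicted edges against the hidden (actual) edges.

    predicted, actual: iterables of (a, b) pairs.
    directed: if False, (a, b) and (b, a) count as the same edge.
    """
    if directed:
        predicted, actual = set(predicted), set(actual)
    else:
        # A frozenset drops the order of the endpoints.
        predicted = {frozenset(e) for e in predicted}
        actual = {frozenset(e) for e in actual}

    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy data: one prediction has the right endpoints in the wrong order.
actual = {(1, 2), (3, 4)}
predicted = {(2, 1), (3, 4), (5, 6)}

f1_directed = f1_score_edges(predicted, actual, directed=True)
f1_undirected = f1_score_edges(predicted, actual, directed=False)
```

With direction checked, only (3, 4) counts as a hit; without it, (2, 1) also matches (1, 2), so the undirected score is higher. This is the same kind of effect as the rise from ~0.07 to ~0.08 described above.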

I also tried to take call duration into account. Indeed, a single common contact with whom both subscribers talked only once and briefly does not at all mean that the subscribers must be connected somehow. But for some reason, in practice this gave me no gain in quality.

A team from Singapore took first prize in MasterCard's Masters of Code hackathon series

1 year ago
MasterCard's Masters of Code series of 12 hackathons ran through 12 cities worldwide in 2015 before reaching the final in San Francisco, where the winning teams from around the world competed for the first prize. The winner of the 24-hour marathon, held on December 5-6, was the team from Singapore, which took home $100,000.

Indie ultra-hardcore development as a path to growth

1 year, 1 month ago
It has been only three years since my friends and I started our initiative: a marathon of developing a game in 48 hours. It is hard to believe, but three years ago a good part of the Habr community looked at us as if we were addicts and suicidal.

Video proof that it is possible to work for 48 hours and hardly zone out

Today only the lazy have not taken part in hackathons or written code at jams. For many, this extreme kind of break from the routine of daily work has even become commonplace, largely thanks to the activity of major companies and their "fat" hackathons with solid material prizes (Microsoft's hackathons, the worldwide BattleHack, VKontakte's events, etc.). Twitch and its Game Development section make an invaluable contribution to the growth of such get-togethers: the chance to talk with colleagues in real time, get advice, and discuss ideas, and, most importantly, to feel part of a world where there are people just as crazy as you and where your jokes are understood. In such an atmosphere development becomes a pleasant and simple task.

An app for organizing parties with friends won MasterCard's London hackathon

1 year, 1 month ago
On November 15-16 London hosted the next stage of Masters of Code, the series of hackathons that the payment service provider MasterCard is holding worldwide. Hackathon participants create business applications using the MasterCard API.


How Microsoft took part in the 8-week AlfaCamp 2.0 hackathon

1 year, 1 month ago
This summer Alfa-Bank held its second AlfaCamp, naming it "AlfaCamp 2.0" and inviting Microsoft, Visa, and IIDF as partners. AlfaCamp 2.0 means eight weeks of hard work by the participating teams, free Microsoft software for development under the BizSpark program, including resources of the Azure cloud platform, Alfa-Bank technologies, and help from the bank's specialists in integrating those technologies into the developers' products and services.

Below the cut you will find the story of how AlfaCamp went and what happened there.

Machine learning hackathon: Come. Train a model. Win

1 year, 1 month ago
Standard plan of any hackathon ↓

Microsoft Azure Machine Learning Hackathon

This weekend a machine learning hackathon organized by Microsoft will take place. Participants will have 2 days to go seriously short on sleep and make the world a better place.

This article will proceed at the same rapid pace at which, I believe, the hackathon itself will pass for most participants. No filler (if you are not familiar with Azure ML, it is still better to read the "filler" or some introductory material first; otherwise things will be unclear), no long definitions, and no long introductions like this one: only what you need to win the hackathon.

The traditional alcohol-free hackathon at Sibirix: writing a free HelpDesk

1 year, 1 month ago
By tradition, at a hackathon we take on a small project, one that would be of practical use. In this quick, hurried way we have already made:
  • Huizhn: a service for showing design mockups to customers, with storage in Google Docs. It was cool.
  • Planing Poker: an old but still fairly well-visited project.
  • KeyRights: a corporate password manager. Perhaps the only hackathon project that we decided to make paid.

This time we decided to go after the sacred: writing a HelpDesk. Completely free, open source, insanely simple, installed in no time. We draw the design and do the markup, all in the week before. We gather in the office on Sunday at 10 in the morning, stock up on energy drinks, and off we go!

November 7 in Moscow: the hackathon "Where Is Our Money? Civic technologies for analyzing government spending and revenue"

1 year, 2 months ago
Many of you probably know our non-profit project Goszatrata, which monitors all government contracts. From the very beginning we built this project with an eye to:
  • developers;
  • convenient work with data;
  • building interesting projects on top of the data we have collected.

We have now collected millions of contracts and hundreds of thousands of organizations that receive state funds; however, our joy would be incomplete unless this data were put to use by everyone.

Developers are one of our priority audiences, which is why on November 7 we are holding a hackathon devoted to one narrow but important topic, working solely with open data: "Where Is Our Money?"

We are the autonomous non-profit organization "Infokultur" and a large number of volunteers who help with open data work.

And of course, we will be glad if someone uses the data we published earlier, but you should not limit yourself to it. You are free to take entirely different information and other data.

How to take part and learn more?

How we took part in the Data Science Week 2015 hackathon

1 year, 4 months ago
I have just returned from the DSW 2015 datathon, where we took second place, and I would like to share my impressions before anything is forgotten.

