First, a brief description of the project. Doctors in clinics dictate information about a patient and the visit into special devices. The dictation is then transcribed (a dedicated department listens to the recordings and types the text), the text is checked, and a template is filled in. The job then moves through a workflow whose stages each carry their own business logic, followed by integration with several external systems. Finally, a letter to the patient is printed and sent. After a while the job is archived, although it can be restored when needed.
Part of the workflow can be carried out on an iPad. There is a separate part for configuring the system and the workflow, and another for editing and approving documents in the browser. Most of the work, however, is done on the user's local machine in the hospital.
There is also a hard requirement that data must never be lost. For that reason everything on the user's local machine is stored in a MySQL database, which is synchronized with the primary server.
The project's domain is health care in a Western European country. The software is deployed in more than 10 clinics, each with its own specifics. Rolling out a new version to a clinic also takes a long time: first it is deployed to the clinic's test server, then the clinic tests it for 3 weeks, and only then is it deployed to the live server.
The databases contain confidential patient information, so each clinic has its own server (and, as practice showed, persuading a clinic to buy a new hard drive is already a problem).
What did we inherit?
As you can guess, the customer did not part ways with the 3 previous teams because their work was good. We inherited:
- about 400,000 lines of C# source code (plus some external libraries)
- source code that was partly lost
- no documentation
- no tests
- no test data
- nobody on the customer's side who knew everything about the system (someone who had been there since the project was created)
- code written in a way that made it impossible to cover with unit tests
- a database of roughly 120 tables
- performance problems
- ADO.NET, Dapper, LINQ and Entity Framework 4 all used for database access
- an MS SQL database
Part of the application's architecture:
A short explanation of the parts shown:
- Desktop Client: the doctor's working application, where most of the system's functions run
- Desktop Client Service: a service used by all Desktop Clients
- Task Scheduler: the system module responsible for moving jobs to the next stage and for other periodic operations
- iPad App (iOS): the application for doctors on the iPad
- Web Admin: the web part for configuring workflows, device access rights, etc.
- Web Client: the web part for doctors, with a fairly limited set of functions
A year and a half ago the system processed 5,000 jobs a week in total. Processing a job means taking it through all the required stages, carrying out the interaction with all external services, and so on.
What do we have now (the main achievements in 1.5 years)?
Performance of some operations improved 10 to 12 times, the number of simultaneous users grew 5 times, the number of jobs processed per week reached 30,000, and the customer calmed down.
Below we describe some of the problems we faced and what we did to overcome them.
Problem 1: how to prove to the customer that everything is bad?
As you can imagine, the customer sees that the program works and can process jobs. Proving that under the hood the system runs either on hacks or at its limit is hard. A related problem: how do you show the customer that the system is improving?
One of the options, and the one we eventually chose, was NDepend, a static code analysis tool, and a very good one. What we especially liked was that it ships with about 150 standard rules.
Among them, the dead-code rules, which find unused classes and methods, were very useful. They were right in about 90% of cases, but there were false positives too: the tool reported that default constructors were unused and that implicit conversion operators were unnecessary, when in fact they were needed.
The speed also left a good impression: analyzing the resulting 40 libraries took 4 to 6 seconds. Integration with Continuous Integration is easy too (though that surprises few people these days). The dependency diagram that NDepend can build was also useful: when we showed it to the customer, it was a very good justification for the time needed to implement one of the features.
Another interesting NDepend feature is writing your own code analysis rules, although we did not need it: the standard rules were enough. Because the project was legacy, we relaxed the standard rules so that they passed, then made them critical (if one is broken, Continuous Integration reports it), and then gradually raised the code quality threshold.
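The "relax the rules, make them critical, then tighten" approach can be sketched as a ratchet. This is a minimal illustration in Python, not NDepend's rule language or API: the rule names, the `RuleResult` type and both functions are hypothetical, and the violation counts are assumed to come from whatever report the analysis tool produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleResult:
    """Number of violations one analysis rule reported in a build."""
    rule: str
    violations: int


def check_ratchet(baseline: dict, current: list) -> list:
    """Return a failure message for every rule that exceeds its allowed count."""
    failures = []
    for result in current:
        allowed = baseline.get(result.rule, 0)  # rules without a baseline must stay clean
        if result.violations > allowed:
            failures.append(
                f"{result.rule}: {result.violations} violations, allowed {allowed}"
            )
    return failures


def tighten(baseline: dict, current: list) -> dict:
    """Lower each threshold to the current count so that progress is not lost."""
    new_baseline = dict(baseline)
    for result in current:
        allowed = baseline.get(result.rule, 0)
        new_baseline[result.rule] = min(allowed, result.violations)
    return new_baseline
```

A CI step would fail the build when `check_ratchet` returns any messages, and a successful build would store the output of `tighten` as the next baseline.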
Another very useful feature is comparing against previous analysis results and showing the difference.
Conclusions on this problem: NDepend is very capable, builds many nice diagrams, and is extensible. Once the customer started seeing the diagrams, he became calmer: he could see progress.
Problem 2: confidentiality of data.
Clinics work with real patients, and their data is confidential. Moreover, it sits on the clinics' own servers. The data includes audio files with doctors' dictations, documents, etc.
We needed a copy of such a database for better testing, but because the data is confidential, the hospitals were not eager to hand it over.
So we wrote an anonymization program that replaces every letter in documents with the digit 1, replaces the entire content of each audio file with an array of zeros, and replaces all names with Name 1, Name 2, etc.
Implementing the program took about 3 days. After that we created an anonymized copy of a live database.
What did we get wrong? We did not expect it to take so long (the net anonymization time turned out to be about 5 days). The anonymization put additional load on the same physical disk that held the live database, which caused problems. Fortunately, we were monitoring just in case, and at the first signs of a long disk read queue we stopped the anonymization program.
We then added the ability to resume anonymization after an interruption and started running the program only at night.
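The anonymization rules and the resume-after-interruption behaviour can be sketched as follows. This is an illustration under assumptions, not the original program: records are plain dictionaries instead of database rows, the field names and the JSON checkpoint file are hypothetical, and names are numbered by record id so that a restart needs no extra state.

```python
import json
from pathlib import Path
from typing import Optional


def mask_letters(text: str) -> str:
    """Replace every letter with '1'; digits, spaces and punctuation stay."""
    return "".join("1" if ch.isalpha() else ch for ch in text)


def silence(audio: bytes) -> bytes:
    """Return zero bytes of the same length as the original audio."""
    return bytes(len(audio))


def anonymize(records: list, checkpoint: Path, limit: Optional[int] = None) -> int:
    """Anonymize records in id order, skipping ids already checkpointed.

    `limit` caps the number of records handled in this run, which stands in
    for an interruption (for example, stopping when the night window ends).
    """
    last_done = 0
    if checkpoint.exists():
        last_done = json.loads(checkpoint.read_text())["last_id"]
    handled = 0
    for record in sorted(records, key=lambda r: r["id"]):
        if record["id"] <= last_done:
            continue  # finished in an earlier run
        if limit is not None and handled >= limit:
            break
        record["patient"] = f"Name {record['id']}"  # assumption: numbered by id
        record["document"] = mask_letters(record["document"])
        record["audio"] = silence(record["audio"])
        # Persist progress after each record so that a restart resumes here.
        checkpoint.write_text(json.dumps({"last_id": record["id"]}))
        handled += 1
    return handled
```

Writing the checkpoint after every record keeps the amount of repeated work after a stop to at most one record, at the cost of one small write per record.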
Conclusions on this problem: confidential data is a very big problem. As later use showed, the benefit of this solution was not especially large, because different hospitals had different business processes and their data structures sometimes differed considerably. Still, the anonymized database undoubtedly gave short-term benefits and partly reassured the customer: he understood that it was a copy of real data, and it was good evidence that new modules would run at a predictable speed on real data.
Problem 3: code that was not written to be tested.
How do you extend legacy code without breaking it? The obvious answer is tests: write unit tests and enjoy life. However, the quality of the legacy code was very low and, as practice showed, it had not been written with testing in mind (singletons, an old version of Entity Framework that is poorly suited to testing). Tests still had to be written for it, so we chose integration tests, and tried to write new features so that they could be covered with unit tests.
For the integration tests we created one additional utility and a database restore function. The programmer had to bring the program to the state just before the action under test and then run the data-saving utility. The test then called the restore function at its start, passing the path to the saved data. The saving utility simply walked through all the tables in the database and saved their data as XML. The restore function dynamically created a new database, restored the data into it and changed the connection strings so that the tests saw this new database. To restore the data we generated INSERT commands and executed them through ADO.NET, which made us independent of Entity Framework and the like. A positive consequence is that the utility is now easy to port to any other language.
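The save and restore steps described above can be sketched like this. The original utility targeted MS SQL through ADO.NET; this sketch swaps in Python's built-in `sqlite3` so that it is self-contained, and the XML layout and function names are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def dump_to_xml(conn: sqlite3.Connection) -> str:
    """Walk every user table and serialize its schema and rows to XML."""
    root = ET.Element("database")
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for name, create_sql in tables:
        table_el = ET.SubElement(root, "table", name=name)
        ET.SubElement(table_el, "schema").text = create_sql
        cursor = conn.execute(f'SELECT * FROM "{name}"')
        columns = [description[0] for description in cursor.description]
        for row in cursor:
            row_el = ET.SubElement(table_el, "row")
            for column, value in zip(columns, row):
                if value is not None:  # NULL is represented by a missing element
                    ET.SubElement(row_el, "col", name=column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def restore_from_xml(xml_text: str) -> sqlite3.Connection:
    """Create a fresh database and replay the dump with generated INSERTs."""
    conn = sqlite3.connect(":memory:")  # stands in for "create a new test DB"
    for table_el in ET.fromstring(xml_text).findall("table"):
        name = table_el.get("name")
        conn.execute(table_el.findtext("schema"))
        for row_el in table_el.findall("row"):
            cols = row_el.findall("col")
            if not cols:  # every column was NULL
                conn.execute(f'INSERT INTO "{name}" DEFAULT VALUES')
                continue
            names = ", ".join(f'"{c.get("name")}"' for c in cols)
            marks = ", ".join("?" for _ in cols)
            conn.execute(
                f'INSERT INTO "{name}" ({names}) VALUES ({marks})',
                [c.text or "" for c in cols],
            )
    conn.commit()
    return conn
```

A test would call `restore_from_xml` first and run against the returned connection, which plays the role of the rewritten connection string. Values are stored as text and rely on the column types to convert them back, so binary columns would need an explicit encoding that this sketch leaves out.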
As practice showed, if the database is prepared properly, a restore takes 1 second. Given that the tests ran nightly in CI, that was acceptable.
Later the utility also caught on with the testers when they started testing with Ranorex: we added command-line arguments, and they began using it to restore the database there as well.
Conclusions on this problem:
- With legacy code, reinventing a few small wheels is almost inevitable.
- As soon as you start thinking about integration tests, think about nightly test runs too.
- Database restore can actually be very convenient.
- Do not store the database restore data in the test project; it is better to put it in a zip archive on Dropbox and keep only a link to the archive.
- A positive side effect of the restore was that it verified the database schema changes we periodically have to make.
Problem 4: performance counters on the target machine.
Another pressing question that came up during the year: how do you show the state of the inherited system, and the fact that it is gradually ceasing to cope with the load? One of the most effective means is to collect performance counters at one-week intervals. There really are a lot of counters, they produce good charts, and they can be exported to Excel for analysis, etc. The only pity is that we did not do this from the very beginning, when there were 5,000 jobs a week.
Another advantage of this approach is that it is very easy to get the customer to agree to it: this is a built-in Windows capability, so we needed almost no persuasion to start collecting statistics.
Conclusions: we should have started recording performance counters at the very beginning of the project. Even so, it is now quite interesting to compare them at one-month intervals.
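Comparing two exported counter logs can be automated with a few lines. The sketch below assumes a CSV export whose first column is a timestamp and whose remaining columns each hold one counter. The counter names used with it and both function names are hypothetical, and the article does not say how the comparison was actually done.

```python
import csv
import io
from statistics import mean


def counter_means(csv_text: str) -> dict:
    """Average every counter column of one exported log."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    result = {}
    for index, counter in enumerate(header[1:], start=1):
        values = []
        for row in samples:
            cell = row[index].strip() if index < len(row) else ""
            try:
                values.append(float(cell))
            except ValueError:
                continue  # skip blank or non-numeric samples
        if values:
            result[counter] = mean(values)
    return result


def growth(before: dict, after: dict) -> dict:
    """Ratio after/before for every counter present in both logs."""
    return {
        name: after[name] / before[name]
        for name in before
        if name in after and before[name] != 0
    }
```

Feeding it the export from an early week and a recent week gives one growth factor per counter, which is easier to show to a customer than two raw charts.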
Problem 5: do not trust RTF-to-HTML converters.
In the legacy code, documents were stored in RTF. The customer asked us to add document editing to the web client. The customer had a license for TxTextControl v 19, so the editor was built with that component. It later turned out, however, that formatting (which requires converting from RTF to HTML and back) causes problems: sometimes a 10 pt font size becomes 10.5 after conversion, along with many other small glitches. Technical support recommended upgrading to version 20, but a closer look showed that version 20 has other defects that did not suit us either.
Conclusions on this problem:
- Do not assume that, because you have a license for a web editor, its technical support will solve all your problems.
- On closer inspection, converting from RTF to HTML and back brought many problems.
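One way to catch font-size drift such as 10 pt becoming 10.5 pt is to compare the sizes in a document before and after the round trip. RTF stores font size in the `\fsN` control word, where N is in half-points, so `\fs20` is 10 pt and `\fs21` is 10.5 pt. The sketch below uses a simple regular expression rather than a full RTF parser, and the function names are hypothetical. It is not part of TxTextControl or of the original project.

```python
import re

# \fsN sets the font size in half-points (N = 20 means 10 pt).
FONT_SIZE = re.compile(r"\\fs(\d+)")


def font_sizes_pt(rtf: str) -> list:
    """Return every font size used in the document, in points."""
    return [int(half_points) / 2 for half_points in FONT_SIZE.findall(rtf)]


def new_sizes(before: str, after: str) -> list:
    """Sizes that appear only after the RTF -> HTML -> RTF round trip."""
    return sorted(set(font_sizes_pt(after)) - set(font_sizes_pt(before)))
```

A regression test could run each sample document through the converter and require `new_sizes` to be empty.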
Problem 6: recovering the lost part for iOS.
Besides the web part, there is also a deployed part for using the application on the iPad. As it turned out, its source code had been lost. After some searching we tried to recover the source code by decompiling (this can be done with the Developer Bundle from RedGate, which ANTS Profiler is part of). The decompilation quality is generally very good: after analyzing the result, within 2 hours we found how authentication for iOS worked in the WCF server and restored it in the latest version of the program.
Conclusion: the Developer Bundle from RedGate really helped us out in that situation.
Problem 7: the profiler.
We also ran into a performance problem: after 40 minutes of work on the client, delays started to appear while typing. We checked everything we could on the server side but found nothing wrong there.
Fortunately, around that time we decided to try ANTS Profiler. Its big advantage is that it is really easy to use, although installation on the local machine requires administrator rights. Starting a program under it takes literally 3 clicks. The fully functional 14-day trial is also welcome: if you need a one-off run on a target machine, it is the most suitable choice.
After running the client under the profiler for 40 minutes, finding the problem took 3 minutes: in one place a list of abbreviations kept accumulating, and on every typed character a check against the abbreviations was performed, which caused the delays.
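The article does not show the offending code, so the following is only a guess at the shape of such a bug: a lookup structure that is appended to on every reload and scanned linearly on every keystroke. Both classes and all names are hypothetical.

```python
class LeakyExpander:
    """Appends on every reload, so the list (and each scan) keeps growing."""

    def __init__(self):
        self.entries = []  # (abbreviation, expansion) pairs

    def reload(self, abbreviations: dict) -> None:
        # Bug: old entries are never cleared, so duplicates pile up.
        self.entries.extend(abbreviations.items())

    def expand(self, word: str) -> str:
        for short, full in self.entries:  # linear scan per typed character
            if short == word:
                return full
        return word  # a miss walks the whole, ever-growing list


class FixedExpander:
    """Replaces the mapping on reload and looks words up in constant time."""

    def __init__(self):
        self.entries = {}

    def reload(self, abbreviations: dict) -> None:
        self.entries = dict(abbreviations)

    def expand(self, word: str) -> str:
        return self.entries.get(word, word)
```

With the leaky version the cost of each keystroke grows with the session length, which matches a symptom that only appears after tens of minutes of work.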
Another advantage is its very good, very clear reports (an example report is shown below). It can also log SQL queries, disk access and network requests.
A drawback: at the highest level of profiling detail it works poorly with COM objects (at least). After some experiments, however, we found that profiling at the method level works without this problem, and we have been using this good tool for over a year.
Problem 8: database performance and the snapshot isolation level.
As already mentioned, because we work in health care, our update process is quite long. About half a year ago, when the volume grew to 20,000 jobs a week, BOIPL started having problems: the system could not keep up with the task queue, and from the user's point of view jobs were not moving through the workflow. The cause was that a table was locked while it was being read, so the commands updating it ran more slowly than we would have liked.
So we were given the task of optimizing the application without changing the application on the server.
It is logical to assume that if you have to optimize without changing the code, you should optimize not the application but the database. The indexes in the database had already been created.
After a week of analysis we decided to try switching the database to snapshot mode. Unexpectedly, this gave remarkably good results: the queue shrank from 400 jobs to 12 (which was within the norm).
Conclusions on this problem: besides creating indexes, it is worth remembering that MS SQL has this mechanism.
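SQL Server has two database-level row-versioning options, and the article does not say which one was enabled. The sketch below only builds the corresponding T-SQL statements. The database name passed in is hypothetical, and the helper functions are illustrative rather than part of any library.

```python
def quote_identifier(name: str) -> str:
    """Quote a SQL Server identifier with brackets, escaping any ']'."""
    return "[" + name.replace("]", "]]") + "]"


def snapshot_statements(database: str, read_committed_snapshot: bool = True) -> list:
    """Build the ALTER DATABASE statements that enable row versioning."""
    target = quote_identifier(database)
    # Lets sessions that explicitly request SNAPSHOT isolation use it.
    statements = [f"ALTER DATABASE {target} SET ALLOW_SNAPSHOT_ISOLATION ON;"]
    if read_committed_snapshot:
        # Makes the default READ COMMITTED level read row versions instead
        # of taking shared locks, so existing queries need no code change.
        statements.append(f"ALTER DATABASE {target} SET READ_COMMITTED_SNAPSHOT ON;")
    return statements
```

Given the constraint of not changing the application, `READ_COMMITTED_SNAPSHOT` is the option that takes effect without touching the code, since `ALLOW_SNAPSHOT_ISOLATION` only helps sessions that set the SNAPSHOT level themselves. Both options keep row versions in tempdb, so tempdb load is worth watching after the switch.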
Problem 9: reports.
There was one more performance problem: an admin panel that could build various kinds of reports. The reports were written inefficiently, and after a while calculating them started to take 40 minutes. For example, they included calculating the productivity of the typists. These reports also often led to table locks (this was before the snapshot-level fix for database performance). By business logic the reports showed information for the previous day: the Indian staff who work at BOIPL started the generation from the admin panel at the beginning of the working day, the file then appeared on a network drive, and everyone looked at it. Because of the table locks, all update and delete operations in the database periodically stalled.
The problem was solved very simply: we created a console utility, scheduled it to run at night with the Windows Task Scheduler, and removed the report from the admin panel. This way we kept the business logic (the report is ready by the start of the working day) and avoided the locking problems.
Conclusion on this problem: it was good that we analyzed how often the reports are created and by when a report has to be ready.
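A nightly report of this kind can be sketched as a small function that a console entry point would call. The input shape (pairs of typist name and completion time), the column names and the function name are assumptions made for illustration. The original utility was written in C# and read from the database.

```python
import csv
import io
from collections import Counter
from datetime import date, datetime, timedelta


def typist_report(jobs: list, today: date) -> str:
    """Count jobs typed per typist during the day before `today`, as CSV text.

    `jobs` is a list of (typist, finished_at) pairs, where finished_at is a
    datetime; only jobs finished on the previous calendar day are counted.
    """
    yesterday = today - timedelta(days=1)
    counts = Counter(
        typist for typist, finished_at in jobs if finished_at.date() == yesterday
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["typist", "jobs_typed"])
    for typist, count in sorted(counts.items()):
        writer.writerow([typist, count])
    return out.getvalue()
```

A script wrapping this function could be registered with the Windows Task Scheduler, for example with `schtasks /Create /SC DAILY /ST 03:00 /TN NightlyReport /TR <path to the script>`, where the task name and path are placeholders. Running it at night means the heavy read happens while no one is updating jobs, and the file is already on the network drive in the morning.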
We have listed some of the problems we faced while working on a legacy project. Were our solutions ideal? Hardly.
But they let us keep the project, develop it and make it stable, and the customer stayed with us.
This article is a translation of the original post at habrahabr.ru/post/272417/