The way to avoid performance problems when implementing a federal-level information system
In the fall of 2018, a customer engaged us to conduct load testing. The customer was implementing an educational project on the platform-as-a-service (PaaS) model: a platform intended to support effective interaction between teachers and students in the educational process.
The customer faced an ambitious task: launch a pilot project in general education schools in six regions of the country within six months, and then roll the project out to all general education schools in the Russian Federation.
The customer chose the first option. They then purchased a solution for the pilot program from an American educational platform developer and enhanced the basic product to meet the established requirements.
In load testing, there are different types of tests. For example:
The main task of load testing, in this case, was to confirm that the system met the stated requirements for maximum performance, which was defined as supporting up to 500,000 simultaneous users. To begin with, we suggested running a performance test consisting of one iteration of load testing to determine the maximum performance.
Preparation of the load testing stand is an integral and very important stage of testing. The ideal stand configuration is an identical copy of an industrial stand, but this is not always possible when testing large-scale systems. In this case, it was critical to choose an optimal configuration to minimize the utilized capacity and obtain data that could be reliably extrapolated to the industrial system.
In this case, Performance Lab engineers were very lucky: the system was not yet in industrial operation, so we had a rare opportunity to conduct testing in the prepared "battlefield environment" itself, which was a big plus.
However, the fact that the system was not in real industrial operation was also a significant minus: it was impossible to collect user statistics about the system's business processes. As a rule, in such situations the load profile is compiled from expert estimates, with the business customer and industry experts involved. In addition, we were able to draw on open sources, and we used official data from Rosstat and the Ministry of Education and Science.
Thus, after collecting and analyzing all of the available data, we had a load profile ready, the requirements for filling the database were defined and the load testing stand was prepared. We were ready to conduct load testing in the conditions which were as close to "battlefield operation" as possible.
From the very start of the project, we wrote the methodology and prepared the load testing facilities (LTF) in parallel. LTF are the capacities that generate the load. To size them properly, we needed to estimate the throughput of an individual load station, which in turn allowed us to calculate the number of load stations required to supply the target load.
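The sizing step described here comes down to a ceiling division. The sketch below is only an illustration under stated assumptions: the function name, the optional `headroom` reserve, and the per-station throughput figure in the usage note are hypothetical and are not taken from the project.

```python
import math


def stations_needed(target_load: float,
                    station_throughput: float,
                    headroom: float = 0.0) -> int:
    """Return how many load stations are needed to supply target_load.

    target_load        -- load to generate (e.g. simultaneous virtual users)
    station_throughput -- load one station can generate, as measured by
                          synthetic tests (same unit as target_load)
    headroom           -- hypothetical safety reserve: fraction of each
                          station's throughput that is deliberately not used
    """
    if target_load <= 0 or station_throughput <= 0:
        raise ValueError("target_load and station_throughput must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable = station_throughput * (1 - headroom)
    # Round up: a fractional station still requires a whole machine.
    return math.ceil(target_load / usable)
```

For example, if one station could sustain 20,000 virtual users (an invented figure), `stations_needed(500_000, 20_000)` would give the number of stations for the 500,000-user target with no reserve.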
The throughput of a load station is measured using synthetic testing, i.e. standardized tests showing the performance of an information technology (IT) system in terms of hardware and software metrics using synthetic monitoring tools.
While determining the throughput of the load stations, the load testing team did not have access to the testing stand machines, so we could not observe the utilization of the stand's own hardware resources during the synthetic tests.
After the first synthetic load testing, it became clear that central processing unit (CPU) utilization at the load station did not exceed 20%. The first concern was the non-optimal performance of the load scripts. After a series of optimizations to the load scripts, we still observed no improvements.
The next possible reason for the throughput limitation was the network channel itself (the load scripts requested a large amount of static web content).
The load testing team moved the load stations onto a single network with a gigabit communication channel instead of 100 Mbps. A few more synthetic tests performed after the move showed a similar result: the CPU load of the load station did not exceed 25% at 3% of the target load.
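A quick back-of-the-envelope check can show whether a network link is even a plausible bottleneck before moving machines around. The following is a minimal sketch, not the team's actual method: the function name, the `efficiency` factor and the average response size in the usage note are assumptions made for illustration.

```python
def max_requests_per_second(link_mbps: float,
                            avg_response_kb: float,
                            efficiency: float = 1.0) -> float:
    """Upper bound on responses per second that fit through a link.

    link_mbps       -- nominal link speed in megabits per second
    avg_response_kb -- average response size in kilobytes (1 kB = 1000 bytes)
    efficiency      -- assumed fraction of nominal bandwidth usable for
                       payload (protocol overhead lowers it below 1.0)
    """
    if link_mbps <= 0 or avg_response_kb <= 0:
        raise ValueError("link_mbps and avg_response_kb must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    bits_per_response = avg_response_kb * 1000 * 8
    return usable_bits_per_second / bits_per_response
```

With a hypothetical average response of 500 kB, comparing `max_requests_per_second(100, 500)` with `max_requests_per_second(1000, 500)` shows the tenfold difference in ceiling between the two links. If the observed request rate stays far below both ceilings, as the unchanged test results suggested here, the limit lies elsewhere.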
Causes on the load station side were thus ruled out, and in due time the team received access to the testing stand machines. Monitoring software to track hardware metrics was installed on the testing stand, and the Performance Lab team continued synthetic load testing.
During the test, it was revealed that, at 10% of the target load, the web server's CPU utilization reached 100%, and that it dropped to 2% once the test was terminated.
The customer was notified of this problem, and then the team needed to correct the test plan and load testing methodology as soon as possible.
Before the problems with the web server's CPU utilization were identified, the methodology had specified stages of 10% of the target intensity (from 10% to 110%) to find the maximum performance. With the current configuration, it was clear that starting at 10% intensity did not make sense. To solve this problem, we decided to add "micro stages" to the test plan: steps of 0.5% of the target load, from 0.5% up to 10% (20 stages), for synthetic performance monitoring.
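The two stage plans can be generated and checked mechanically. The sketch below is illustrative only; the helper names are hypothetical, and exact fractions are used so that repeated 0.5% steps do not accumulate floating-point error.

```python
from fractions import Fraction


def build_stages(start_pct, stop_pct, step_pct):
    """Return load stages (percent of target load) from start to stop inclusive."""
    start, stop, step = (Fraction(str(x)) for x in (start_pct, stop_pct, step_pct))
    if step <= 0 or start > stop:
        raise ValueError("step must be positive and start must not exceed stop")
    stages = []
    level = start
    while level <= stop:
        stages.append(float(level))
        level += step
    return stages


def users_at(stage_pct, target_users=500_000):
    """Convert a stage percentage into a number of simultaneous users."""
    return int(Fraction(str(stage_pct)) * target_users / 100)


original_plan = build_stages(10, 110, 10)   # 10%, 20%, ..., 110%
micro_plan = build_stages(0.5, 10, 0.5)     # 0.5%, 1.0%, ..., 10%
```

Against the 500,000-user target, the first micro stage corresponds to `users_at(0.5)` simultaneous users, which makes it easy to see how much finer the revised plan is than the original 10% steps.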
The results of load testing showed that the system in the given configuration fell far short of the stated performance requirements. The maximum system performance was 2% of the target performance.
The causes of the system performance limitations were localized thanks to synthetic monitoring, and recommendations for eliminating them were described in detail.
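One common way to read maximum performance off a stepped test is to take the highest stage that still satisfies the pass criteria. The sketch below assumes criteria the article does not state (a CPU utilization limit and an error-rate limit), and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StageResult:
    load_pct: float    # stage intensity, percent of target load
    cpu_pct: float     # peak CPU utilization observed on the stand
    error_rate: float  # share of failed requests, from 0.0 to 1.0


def max_passing_stage(results: Iterable[StageResult],
                      cpu_limit: float = 90.0,
                      error_limit: float = 0.01) -> Optional[float]:
    """Return the highest load stage reached before any criterion fails.

    cpu_limit and error_limit are assumed thresholds, not values from the
    project. Returns None if even the lowest stage fails.
    """
    best = None
    for stage in sorted(results, key=lambda s: s.load_pct):
        if stage.cpu_pct > cpu_limit or stage.error_rate > error_limit:
            break  # the first failing stage ends the search
        best = stage.load_pct
    return best
```

Feeding the function one `StageResult` per micro stage yields a single percentage that can be compared directly with the target performance.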
Within a short time and at an early stage the customer obtained insight into the system performance limits, localized bottlenecks and gained an understanding of the amount of work to be done in order to increase performance.
Since the testing was conducted at an early stage, the customer obtained the key data necessary to make a decision about the further development of the project. At this stage, replacing the vendor/developer is still acceptable, and the Performance Lab team is ready to provide support in eliminating the constraints and repeating the test iteration.