Study of the effectiveness of developed algorithms. The main principles underlying the creation of effective algorithms


Several different algorithms can be developed to solve the same problem. Therefore, the task of choosing the most effective algorithms arises. Note that accurately assessing the effectiveness of algorithms is a very difficult task and in each specific case requires special research.

The part of the theory of algorithms that deals with estimating the characteristics of algorithms is called the metric theory of algorithms. The theory of algorithms as a whole can be divided into a descriptive (qualitative) part and a metric (quantitative) part. The first examines algorithms from the point of view of the correspondence they establish between input data and results. The second examines algorithms from the point of view of the complexity of both the algorithms themselves and the "computations" they specify, i.e., the processes of sequential transformation of structural objects. It is important to emphasize that the complexity of algorithms and computations can be defined in various ways, and it may turn out that by one measure algorithm A is more complex than algorithm B, while by another measure it is the other way around.

Most often, algorithms are evaluated by the required memory, the number of operations performed, the solution time, or the computational error. These characteristics often depend on the parameters (dimensions) of the problem and are nonlinear. Therefore, the theory of algorithms includes a direction that assesses the effectiveness of algorithms using asymptotic estimates of functions such as required memory, computation time, and so on. The most significant parameter of such a function is identified, and the behavior of the function is studied as that parameter grows. The goal is to determine the nature of the dependence of the algorithm's characteristics on the parameter: it can be linear (i.e., proportional to the parameter n), logarithmic (i.e., proportional to log n), quadratic (i.e., proportional to n²), and so on. By comparing the asymptotic estimates of algorithms that solve the same problem, one can choose the more efficient algorithm. We say that a quantity T(n) is of order n^x if there exist positive constants k and n₀ such that for all n ≥ n₀ the inequality T(n) ≤ k·n^x holds. For example, T(n) = 3n² + 5n is of order n², since T(n) ≤ 4n² for all n ≥ 5.

Suppose that n is the amount of numerical data received at the input of several different algorithms (A1, A2, A3, A4, A5) that perform calculations at the same speed of 1000 operations per second but have different asymptotic estimates. Table 1.8 shows the values of n that these algorithms can process in 1 second, 1 minute, and 1 hour (values are rounded down to the nearest whole number). The data in Table 1.8 clearly show that the performance of an algorithm (i.e., the amount of data processed per unit of time) depends significantly on the form of its asymptotic estimate.
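To make the idea concrete, here is a small illustrative sketch in Python (the rate of 1000 operations per second is taken from the text; the particular set of growth functions is an assumption) that computes, for each growth function f(n), the largest input size n whose total operation count still fits into a given time budget:

```python
import math

# Illustrative sketch: largest n whose operation count f(n) fits into a budget,
# assuming the algorithm performs 1000 elementary operations per second.
SPEED = 1000  # operations per second (figure taken from the text)

complexities = {
    "n":        lambda n: n,
    "n log2 n": lambda n: n * math.log2(n) if n > 1 else 1,
    "n^2":      lambda n: n ** 2,
    "2^n":      lambda n: 2.0 ** n if n < 1024 else float("inf"),
}

def max_n(f, seconds):
    """Largest n such that f(n) operations can be performed within `seconds`."""
    budget = SPEED * seconds
    n = 1
    while f(n + 1) <= budget:  # brute-force scan: slow but simple
        n += 1
    return n

for name, f in complexities.items():
    sizes = [max_n(f, s) for s in (1, 60, 3600)]
    print(f"{name:9s}  1 s: {sizes[0]:>9}   1 min: {sizes[1]:>9}   1 hour: {sizes[2]:>9}")
```

The linear row grows in direct proportion to the time budget, while the quadratic and especially the exponential rows barely move, which is exactly the effect the table is meant to show.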

Testing of developed algorithms is usually carried out at small values of the parameter n. Such testing gives confidence that the algorithm works, but does not at all guarantee that the task will be completed for large values of n: we may simply not have enough computer memory or time to solve a real problem. Asymptotic estimates are important because they allow one to judge whether computer resources are sufficient for practical calculations within known limits of variation of the parameter n.

Algorithm efficiency is a property of an algorithm that is associated with the computational resources used by the algorithm. The algorithm must be analyzed to determine the resources required by the algorithm. Algorithm efficiency can be thought of as analogous to the manufacturing productivity of repetitive or continuous processes.

To achieve maximum efficiency, we want to reduce the use of resources. However, different resources (such as time and memory) cannot be directly compared, so which of two algorithms is considered more efficient often depends on which factor is more important, such as the requirement for high speed, minimal memory usage, or another measure of efficiency.

Note that this article is NOT about algorithm optimization, which is discussed in articles on program optimization, optimizing compilers, loop optimization, object code optimization, and so on. The term "optimization" itself is misleading, since everything that can actually be done falls under the umbrella of "improvement."

Background

The importance of efficiency with an emphasis on execution time was emphasized by Ada Lovelace in 1843 regarding Charles Babbage's mechanical analytical engine:

"In almost all computing, a large choice of configurations is possible for the successful completion of the process, and various considerations should influence the choice made for the purpose of performing the calculation. The essential thing is to choose a configuration that will minimize the time required to perform the calculation."

Early electronic computers were very limited in both speed and memory. In some cases it was realized that there is a time-memory trade-off: a task either had to use a large amount of memory to achieve high speed, or had to use a slower algorithm that gets by with a small amount of working memory. In such cases, the fastest algorithm for which the available memory was sufficient was used.

Modern computers are much faster than those early computers and have much more memory (gigabytes instead of kilobytes). However, Donald Knuth emphasizes that efficiency remains an important factor:

"In established engineering disciplines, a 12% improvement, easily obtained, is never considered marginal, and I believe the same viewpoint should prevail in programming."

Overview

An algorithm is considered efficient if its resource consumption (or resource cost) is at or below some acceptable level. Roughly speaking, "acceptable" here means "the algorithm will run for a reasonable amount of time on an available computer." Because the processing power and available memory of computers have grown enormously since the 1950s, today's "acceptable level" would not have been acceptable even 10 years ago.

Computer manufacturers periodically release new, usually more powerful, models. The cost of software can be quite high, so in some cases it may be easier and cheaper to obtain better performance by buying a faster computer that is compatible with the existing one.

There are many ways to measure the resources used by an algorithm. The two most commonly used measures are speed and the amount of memory used. Other measures may include transmission speed, temporary disk usage, long-term disk usage, power consumption, total cost of ownership, response time to external signals, and so on. Many of these measures depend on the size of the algorithm's input data (that is, the amount of data to be processed). They may also depend on how the data is presented (for example, some sorting algorithms perform poorly on data that is already sorted or sorted in reverse order).
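As a small hypothetical illustration of the last point (the algorithm, data size, and input orderings below are chosen arbitrarily), the following Python sketch counts how many element comparisons the same insertion sort makes on already sorted, reverse-sorted, and randomly ordered input:

```python
import random

def insertion_sort_comparisons(data):
    """Sort a copy of `data` by insertion sort and return the comparison count."""
    a = list(data)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]   # shift the larger element to the right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 2000
base = list(range(n))
for name, data in [("already sorted", base),
                   ("reverse sorted", base[::-1]),
                   ("random order", random.sample(base, n))]:
    print(f"{name:15s} {insertion_sort_comparisons(data):9d} comparisons")
```

On sorted input the count is roughly n, on reverse-sorted input roughly n²/2, so the very same algorithm can look fast or slow depending on how the data happens to be presented.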

In practice, there are other factors that influence the effectiveness of the algorithm, such as the required accuracy and/or reliability. As explained below, the way an algorithm is implemented can also have a significant effect on actual performance, although many aspects of the implementation are optimization issues.

Theoretical analysis

In the theoretical analysis of algorithms, it is common practice to estimate the complexity of an algorithm by its asymptotic behavior, that is, to express the complexity as a function of the input size n using big-O notation. This estimate is generally quite accurate for large n, but may lead to incorrect conclusions for small n (for example, bubble sort, which is considered slow, may be faster than quicksort if only a few elements need to be sorted).

Notation, name, and typical examples:

  • O(1), constant: determining whether a number is even or odd; using a constant-size lookup table; using a suitable hash function to select an element.
  • O(log n), logarithmic: finding an element in a sorted array using binary search or a balanced search tree; operations on a binomial heap.
  • O(n), linear: finding an element in an unsorted list or an unbalanced tree (worst case); adding two n-bit numbers using ripple carry.
  • O(n log n), quasilinear (log-linear): computing the fast Fourier transform; heapsort; quicksort (best and average case); merge sort.
  • O(n²), quadratic: multiplying two n-digit numbers using the schoolbook algorithm; bubble sort (worst case); Shell sort; quicksort (worst case); selection sort; insertion sort.
  • O(cⁿ), c > 1, exponential: finding an (exact) solution to the traveling salesman problem using dynamic programming; determining whether two logical statements are equivalent using exhaustive search.
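As a small, self-contained illustration of two rows of this list (the data set and the search target are arbitrary assumptions), the sketch below counts the steps taken by a linear search over an unsorted list and by a binary search over a sorted array:

```python
def linear_search_steps(data, target):
    """Scan the list left to right; return the number of elements examined."""
    for steps, value in enumerate(data, start=1):
        if value == target:
            return steps
    return len(data)

def binary_search_steps(sorted_data, target):
    """Binary search over a sorted list; return the number of halving steps."""
    lo, hi, steps = 0, len(sorted_data), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_data[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps

n = 1_000_000
data = list(range(n))
target = n - 1  # worst case for the linear scan
print("linear search:", linear_search_steps(data, target), "steps")  # about n
print("binary search:", binary_search_steps(data, target), "steps")  # about log2(n), i.e. ~20
```

Doubling n adds roughly one step to the binary search but doubles the linear one, which is the practical meaning of the O(log n) and O(n) rows.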

Benchmarking: Measuring Performance

For new versions of software or to provide comparison with rival systems, benchmarks are sometimes used to compare the relative performance of algorithms. If, for example, a new sorting algorithm is released, it can be compared with its predecessors to ensure that the algorithm is at least as efficient on known data as the others. Performance tests can be used by users to compare products from different manufacturers to evaluate which product will best suit their requirements in terms of functionality and performance.

Some benchmark suites provide comparative analysis across various compiled and interpreted languages, such as Roy Longbottom's PC Benchmark Collection, while The Computer Language Benchmarks Game compares the performance of implementations of typical tasks in several programming languages.

Implementation issues

Implementation issues may also affect actual performance. This includes the choice of programming language and the way in which the algorithm is actually coded, the choice of translator for the chosen language or compiler options used, and even the type of operating system. In some cases, a language implemented as an interpreter may be significantly slower than a language implemented as a compiler.

There are other factors that can affect timing or memory usage and that are beyond the programmer's control. These include data alignment, data granularity, garbage collection, instruction-level parallelism, and subroutine calls.

Some processors have the ability to perform vector operations, which allows one operation to process multiple operands. It may or may not be easy to use such features at the programming or compilation level. Algorithms designed for sequential computing may require complete redesign to accommodate parallel computing.

Another issue may arise with processor compatibility, where an instruction may be implemented differently on different models, so that an instruction that is relatively fast on some models may be relatively slow on others. This can be a problem for an optimizing compiler.

Measuring Resource Usage

Measurements are usually expressed as a function of the input size n.

The two most important measurements are:

  • Time: How long the algorithm takes on the CPU.
  • Memory: How much working memory (usually RAM) is needed for the algorithm. There are two aspects to this: the amount of memory for the code and the amount of memory for the data that the code operates on.

For battery-powered computers (such as laptops) or for very long/large calculations (such as supercomputers), a different kind of measurement is of interest:

  • Direct energy consumption: Energy required to run a computer.
  • Indirect energy consumption: Energy required for cooling, lighting, etc.

In some cases, other, less common measurements are needed:

  • Transmission size: Bandwidth may be the limiting factor. Compression can be used to reduce the amount of data transmitted. Displaying a graphic or image (such as the Google logo) can result in tens of thousands of bytes being transferred (48K in this case); compare this with transmitting the six bytes of the word "Google".
  • External memory: Memory required on a disk or other external storage device. This memory can be used for temporary storage or for future use.
  • Response time: This setting is especially important for real-time applications where the computer must respond quickly to external events.
  • Total cost of ownership: This parameter is important when a computer is dedicated to executing a single algorithm.

Time

Theory

This type of test also depends significantly on the choice of programming language, compiler, and compiler options, so the algorithms being compared must be implemented under the same conditions.
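A minimal benchmarking sketch along these lines (the two sorting routines and the data size are illustrative assumptions, not part of the original text) runs both implementations on the same data, in the same interpreter, and times them with the standard timeit module, so they are measured under identical conditions:

```python
import random
import timeit

def builtin_sort(data):
    return sorted(data)               # Timsort from the standard library

def insertion_sort(data):
    a = list(data)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

data = [random.random() for _ in range(1000)]   # identical input for both candidates

for func in (builtin_sort, insertion_sort):
    elapsed = timeit.timeit(lambda: func(data), number=10)
    print(f"{func.__name__:15s} {elapsed:.4f} s for 10 runs")
```

Only relative figures obtained on the same machine, interpreter, and input are meaningful; absolute timings vary between systems.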

Memory

This section deals with the main memory (often RAM) needed by the algorithm. As with the timing analysis above, the analysis of an algorithm typically uses its space complexity to estimate the required working memory as a function of the input size. The result is usually expressed in big-O notation.

There are four aspects of memory usage:

  • The amount of memory required to store the algorithm code.
  • The amount of memory required for the input data.
  • The amount of memory required for any output (some algorithms, such as sorts, frequently rearrange the input and do not require additional memory for the output).
  • The amount of memory required by the computational process during computation (this includes named variables and any stack space required for subroutine calls, which can be significant when using recursion).
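As a small illustrative sketch of the last two points (the reversal functions are assumptions used only for the demonstration), the standard tracemalloc module can compare the extra working memory of an in-place operation with that of one that builds a full copy of its input:

```python
import tracemalloc

def reverse_copy(a):
    return a[::-1]        # builds a second list: O(n) additional memory

def reverse_in_place(a):
    a.reverse()           # rearranges the input: O(1) additional memory
    return a

data = list(range(1_000_000))

for func in (reverse_copy, reverse_in_place):
    work = list(data)                 # fresh, identical input for each run
    tracemalloc.start()
    func(work)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{func.__name__:18s} peak additional memory: {peak / 1_000_000:.1f} MB")
```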

Early electronic computers and home computers had relatively small working memory capacities. Thus, in 1949 the EDSAC had a maximum working memory of 1024 17-bit words, and in 1980 the Sinclair ZX80 was released with 1024 bytes of working memory.

Modern computers can have relatively large amounts of memory (possibly gigabytes), so squeezing an algorithm into a limited amount of memory is much less of a requirement than it used to be. However, the existence of three different categories of memory remains significant:

  • Cache (often static RAM) - runs at speeds comparable to the CPU
  • Main physical memory (often dynamic RAM) - runs slightly slower than the CPU
  • Virtual memory (often on disk) - gives the illusion of huge memory, but works thousands of times slower than RAM.

An algorithm whose required memory fits into the computer's cache runs much faster than one that fits only into main memory, which in turn is much faster than an algorithm forced to resort to virtual memory. Complicating matters, some systems have up to three levels of cache. Different systems have different amounts of these types of memory, so the effect of memory on an algorithm's running time can vary significantly from one system to another.

In the early days of electronic computing, if an algorithm and its data did not fit into main memory, the algorithm could not be used. These days, virtual memory provides massive amounts of memory, but at the cost of performance. If an algorithm and its data fit in the cache, very high speed can be achieved, so minimizing the required memory also helps minimize time. An algorithm that does not fit entirely into the cache but exhibits locality of reference can still run relatively quickly.
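A rough sketch of the effect (assuming NumPy is available; the array size is arbitrary, and the exact ratio depends heavily on the machine) sums a large row-major array first along contiguous rows and then along strided columns; the strided traversal typically runs noticeably slower because it makes poorer use of cache lines:

```python
import time
import numpy as np

a = np.random.rand(5000, 5000)   # C-ordered (row-major) array, roughly 200 MB

t0 = time.perf_counter()
row_total = sum(a[i, :].sum() for i in range(a.shape[0]))   # contiguous access
t1 = time.perf_counter()
col_total = sum(a[:, j].sum() for j in range(a.shape[1]))   # strided access
t2 = time.perf_counter()

assert abs(row_total - col_total) < 1.0  # same result, different traversal order
print(f"row-wise sum:    {t1 - t0:.3f} s")
print(f"column-wise sum: {t2 - t1:.3f} s (usually slower: poor locality of reference)")
```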

Examples of effective algorithms

Criticism of the current state of programming

Programs are becoming slower more rapidly than computers are becoming faster.

David May states:

In widespread systems, halving the number of instructions executed can double battery life, and big data sets provide an opportunity for better algorithms: reducing the number of operations from N × N to N × log(N) has a dramatic effect for large N... For N = 30 billion, this change is equivalent to 50 years of technological improvement.

Competition for the best algorithm

The following competitions invite participation in the development of the best algorithms, the quality criteria of which are determined by the judges:

See also

  • Arithmetic coding is a type of entropy coding with variable code length for efficient data compression
  • An associative array is a data structure that can be made more efficient by using PATRICIA trees or Judy arrays
  • Performance test - a method of measuring comparative execution time in certain cases
  • Best, worst and average case - conventions for estimating execution time for three scenarios
  • Binary search is a simple and effective technique for searching a sorted list
  • Branch table


Main principles for creating effective algorithms

Anyone who develops algorithms must master some basic techniques and concepts. Anyone who has ever faced a difficult task was faced with the question: “Where to start?” One possible way is to look through your stock of common algorithmic methods to see if one of them can be used to formulate a solution to a new problem. Well, if there is no such reserve, then how can you still develop a good algorithm? Where to start? We've all had the frustrating experience of looking at a task and not knowing what to do. Let's look at three general problem-solving techniques that are useful for developing algorithms.

The first method is associated with reducing a difficult problem to a sequence of simpler ones. The hope, of course, is that the simpler problems are easier to handle than the original one, and that a solution of the original problem can be assembled from the solutions of these simpler problems. This procedure is called the method of particular goals (subgoals). The method looks very reasonable, but, like most general methods of problem solving or algorithm design, it is not always easy to apply to a specific problem. Making an intelligent choice of simpler problems is more a matter of art or intuition than of science. There is no general set of rules defining the class of problems that can be solved with this approach. Thinking about any specific problem begins with asking questions. Particular goals can be established once the following questions have been answered:

  • 1. Is it possible to solve part of the problem? Is it possible to solve the rest of the problem by ignoring some conditions?
  • 2. Is it possible to solve the problem for special cases? Is it possible to develop an algorithm that produces a solution that satisfies all the conditions of the problem, but whose input data is limited to some subset of all input data?
  • 3. Is there anything related to the problem that is not well understood? If we try to delve deeper into some of the features of the problem, will we be able to learn something that will help us approach a solution?
  • 4. Is there a known solution to a similar problem? Is it possible to modify its solution to solve the problem under consideration? Is it possible that this problem is equivalent to a known unsolved problem?

The second method of algorithm development is known as the ascent (hill-climbing) method. An ascent algorithm begins by making an initial guess or computing an initial solution to the problem. Then it moves "upward" from the initial solution toward better solutions as quickly as possible. When the algorithm reaches a point from which it is no longer possible to move upward, it stops. Unfortunately, it is not always possible to guarantee that the final solution obtained by the ascent method is optimal, which often limits the use of the method.

In general, ascent methods are classified as "crude." They keep some goal in mind and try to do whatever they can, wherever they can, to get closer to it. This makes them somewhat "short-sighted." The short-sightedness of the ascent method is well illustrated by the following example. Suppose we need to find the maximum of a function y = f(x) given by the graph in Fig. 2.15. If the initial value of the argument is x = a, the ascent method will strive toward the nearest goal, i.e., toward the value of the function at the point x = b, whereas the true maximum of the function is at x = c.

Fig. 2.15. An illustration of the ascent method

In this case the ascent method finds a local maximum, but not the global one. This is the "crudeness" of the ascent method.
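The following sketch (the two-peak objective function and the step size are assumptions chosen purely for illustration) shows exactly this behavior: starting from x = 0 the ascent stops at the local maximum near x = 1, while starting from x = 3 it reaches the global maximum near x = 4:

```python
import math

def f(x):
    # Two peaks: a local maximum near x = 1 (f is about 1) and the global one near x = 4 (f is about 2).
    return math.exp(-(x - 1) ** 2) + 2 * math.exp(-(x - 4) ** 2)

def hill_climb(x, step=0.1):
    """Move to the best neighbouring point until no neighbour improves f."""
    while True:
        best = max((x - step, x, x + step), key=f)
        if best == x:        # no improvement: a peak, possibly only a local one
            return x
        x = best

for start in (0.0, 3.0):
    top = hill_climb(start)
    print(f"start = {start:.1f}  ->  stops at x = {top:.1f}, f = {f(top):.3f}")
```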

The third method is known as working backward: the algorithm starts from the goal or the solution of the problem and moves toward its initial formulation. Then, if the actions involved are reversible, the movement is carried out forward again, from the problem statement to the solution.

Let us look at all three methods in the jeep problem. Suppose you need to cross a 1000-kilometer desert in a jeep using a minimum of fuel. The jeep's fuel tank holds 500 liters, and fuel is consumed uniformly, 1 liter per kilometer. At the starting point there is an unlimited supply of fuel. Since there are no fuel depots in the desert, you need to set up your own storage points and fill them with fuel from the car's tank. So the idea of the problem is clear: you drive away from the starting point with a full tank, set up the first storage point some distance out, leave there some of the fuel from the tank (keeping just enough to get back), return, refuel completely, and try to push the second storage point farther into the desert. But where should these storage points be set up, and how much fuel should be left in each of them?

Let us approach this problem with the working-backward method. At what distance from the end of the desert can we start so as to cross it with exactly k full tanks of fuel? We consider this question for k = 1, 2, 3, ... until we find an integer n such that n full tanks allow us to cross the entire 1000-kilometer desert. For k = 1 the answer is 500 km = 500 liters (point B), as shown in Fig. 2.16.

Fig. 2.16.

You can refuel the car at point B and cross the remaining 500 km of desert. A particular goal was set because the original problem cannot be solved right away.

Now suppose k = 2, i.e., there are two full tanks (1000 l). This situation is illustrated in Fig. 2.16. What is the maximum value of x1 such that, starting with 1000 l of fuel from the point (500 - x1), it is possible to deliver enough fuel to point B to complete the trip as in the case k = 1? One way to determine an acceptable value of x1 is as follows. We fill up at the point (500 - x1), drive x1 kilometers to point B, and pour all the fuel into the storage except the part needed to get back to the point (500 - x1), where the tank becomes empty. Now we fill the second tank, drive x1 kilometers to B, pick up the fuel left there, and drive from B to C with a full tank. The total distance traveled consists of three segments of x1 kilometers each and the segment BC of 500 km. Therefore, from the equation 3·x1 + 500 = 1000 we find x1 = 500/3. Thus two tanks (1000 l) allow us to travel D2 = 500 + x1 = 500(1 + 1/3) km.

Now consider k = 3. From what point can we start with 1500 liters of fuel so that the jeep can deliver 1000 liters to the point (500 - x1)? Let us find the largest value of x2 such that, leaving with 1500 liters of fuel from the point (500 - x1 - x2), we can deliver 1000 liters to the point (500 - x1). We leave the point (500 - x1 - x2), drive to (500 - x1), transfer all the fuel except the x2 liters needed for the return, and come back to the point (500 - x1 - x2) with an empty tank. Repeating this procedure, we spend 4·x2 liters on travel and leave (1000 - 4·x2) liters at the point (500 - x1). Now exactly 500 liters remain at the point (500 - x1 - x2). We fill up with these last 500 liters and drive to the point (500 - x1), spending x2 liters on the way.

Arriving at the point (500 - x1), we have spent a total of 5·x2 liters of fuel on travel, so (1500 - 5·x2) liters remain here. This amount must equal 1000 l, i.e., x2 = 500/5. From this we conclude that 1500 liters allow us to travel D3 = 500 + x1 + x2 = 500(1 + 1/3 + 1/5) km.

Continuing the process of working backward inductively, we find that n tanks of fuel allow us to travel Dn kilometers, where Dn = 500·(1 + 1/3 + 1/5 + ... + 1/(2n - 1)).
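A short verification sketch of this formula (illustrative only) computes Dn and finds the smallest number of tanks whose range exceeds 1000 km:

```python
def D(n):
    """Distance in km reachable with n full tanks: 500*(1 + 1/3 + ... + 1/(2n - 1))."""
    return 500 * sum(1 / (2 * k - 1) for k in range(1, n + 1))

print("D(7) =", round(D(7), 1), "km")   # just under 1000 km

n = 1
while D(n) <= 1000:
    n += 1
print("smallest n with D(n) > 1000:", n)  # a full n-th tank is therefore not entirely needed
```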

We need to find the smallest value of n for which Dn > 1000. Simple calculations show that for n = 7 we have D7 ≈ 977.5 km, i.e., seven tanks, or 3500 liters of fuel, allow us to travel only 977.5 km. A full eighth tank would be more than is needed to transport 3500 liters from point A to a point located 22.5 km (1000 - 977.5) from A. The reader is invited to verify independently that 337.5 liters are sufficient to deliver 3500 liters of fuel to the 22.5 km mark. Thus, in order to cross the desert from A to C by car, 3837.5 liters of fuel are needed.

Now the fuel transportation algorithm can be stated as follows. We start from A with 3837.5 liters. This is just enough fuel to gradually transport 3500 liters to the 22.5 km mark, where the jeep eventually ends up with an empty tank and fuel for 7 full refills. That fuel is enough to transport 3000 liters to a point 22.5 + 500/13 km from A, where the car's tank will again be empty. The next stage of transportation brings the jeep to a point located 22.5 + 500/13 + 500/11 km from A, with the car's tank empty and 2500 l in the storage.

Continuing in this way, we move forward thanks to the analysis carried out by working backward. Eventually the jeep reaches the 500(1 - 1/3) km mark with 1000 liters of fuel. Then we transport 500 liters of fuel to point B, pour them into the car's tank, and drive without stopping to point C (Fig. 2.17).


Fig. 2.17.

For those familiar with infinite series, note that Dn is 500 times the n-th partial sum of the odd harmonic series. Since this series diverges, the algorithm makes it possible to cross a desert of any size. Try modifying this algorithm so that enough fuel is left at various points in the desert to return to point A.

The question arises whether it is possible to travel the 1000 km using less than 3837.5 liters of fuel. It turns out that it is not, although the proof of this statement is quite complicated. However, the following fairly plausible argument can be given. Obviously we act in the best possible way for k = 1. For k = 2 we use the plan for k = 1, and the second tank of fuel is used so as to start as far as possible from point B. The starting premise for k tanks is that we already know how to act best in the case of (k - 1) tanks, and with the help of the k-th tank we move the starting point back as far as possible.

So, in the problem considered, the working-backward method shows up in the fact that the problem is solved, as it were, from the end; the method of particular goals in the fact that the whole problem is not solved at once but in parts; and, finally, the ascent method in the fact that the solution is not found immediately but sequentially, as if by gradually approaching it.

TEST QUESTIONS

  • 1. Give a definition of an object, class, system, model.
  • 2. Name the main types of models.
  • 3. What is simulation modeling?
  • 4. What classifications of models exist?
  • 5. Indicate the main stages of modeling.
  • 6. What is an algorithm?
  • 7. List the properties of the algorithm.
  • 8. What stages are performed in the complete construction of the algorithm?
  • 9. What is an algorithm flowchart?
  • 10. Define a function block.
  • 11. Which algorithm is called structural?
  • 12. Name the main principles underlying the creation of effective algorithms.