On Time Versus Input Size

Introduction

Computer scientists seek not only to find correct solutions to a particular problem, but also to efficiently solve a problem. Even when we have a correct algorithm for solving a computational problem, that particular algorithm may still be far too costly in its use of computer resources to be practical. The key resources consumed by computer programs are running time and memory space. These two measures are the most frequently examined when analyzing the performance of an algorithm. In general, we simply refer to these concepts as time complexity and space complexity.

Time is measured by the number of basic actions carried out during the execution of a program, and space is measured by the amount of computer memory required to store the data generated and manipulated during the execution of these basic actions. For example, consider the problem of evaluating a polynomial . We might measure the performance of an algorithm which evaluates the polynomial by the number of multiplications and additions it must make. Even though on many real systems these operations take different amounts of time to complete, we assume in this model that they take the same amount of time. Abstractions such as these are often made in order to make the running time analysis simpler. In a similar manner, one often abstracts the unit of space. If talking about the amount of space required to compute some function, one might look at the number of intermediate variables required to compute the result, rather than the number of bits actually used to store those numbers.

Binary Addition

Let's start with a standard "grade-school" algorithm for adding two n-bit numbers.

Here are the basic rules for adding bits:

0 + 0 = 0 no carry
1 + 0 = 1 no carry
0 + 1 = 1 no carry
1 + 1 = 0 carry a 1

Adding two 5-bit numbers is done in the same way that we learned to add numbers in grade school. We apply the rules column-wise, starting with least significant bits and keep track of the carries as we progress. The numbers below are written with the most significant bit first:

   cc
  10110
+ 00111
  -----
  11101

We can generalize this procedure to two n-bit numbers and ask about the efficiency of this method of adding two binary numbers. In particular, we're interested in the amount of time (number of steps) that it takes to add two n-bit numbers as a function of n.

T(n) represents the amount of time grade-school addition uses to add two n-bit numbers. But what do we mean by time in this context? In general, a given algorithm will take different amounts of time on the same inputs depending on such factors as processor speed, process scheduling, instruction set, disk speed, and compiler optimizations. Our goal, however, is to define 'time' in a way that transcends implementation details and allows us to make assertions about grade-school addition in a very general yet useful way. By picking the right measurement of 'time,' we can analyze grade-school addition without having to worry about implementation details.

Here is how this works: On any reasonable computer, adding 3 bits (two input bits and a previous carry) and writing down the 2-bit answer (result and carry) can be done in constant time c. Pick any particular computer M and define c to be the time it takes to perform an add-and-carry operation on that computer. Hence, the total time to add two n-bit numbers using grade-school addition is c time for each of n columns, or cn. On a different computer, M', the time to perform an add-and-carry operation may be c'. In this case, the total time required to add two n-bit number using grade-school addition would be c'n.

We can measure 'time' in terms of the number of elementary 'steps' defined in some other way, provided that each such 'step' takes constant time in a reasonable implementation. 'Constant time' means an amount of time independent of the length n of the input. For example, in grade-school addition, the elementary step requires 3 bits of input (2 bits plus the carry bit) to get 2 bits of output. On a typical computer, such an operation takes a fixed amount of time, though the amount of time will be different on different computers. When reasoning about running time, useful to break up any program's operation.

[plot]

When we plot running time versus input size, no matter what machine we use, we get a straight line for each machine. The lines for different machines have different slopes, but in all cases, time grows linearly as input size increases. We arrive at the implementation-independent insight that grade-school addition is a linear-time algorithm in terms of bits added.

To arrive at this insight, we abstracted away the details of the world that we did not currently wish to study in order to focus on and identify similarities among seemingly different things. This process of abstracting away details and determining the rate of resource usage as a function of problem size n is one of the fundamental ideas in computer science.

In doing such an analysis for a given algorithm, the idea is to define the fundamental unit of computation, then look at the running time as a function of the number of bits required to specify the input. We can then study the rate of growth of this function.

=''Example''=

Suppose that we want to multiply two n-bit numbers. We can use the following grade-school multiplication algorithm.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

      • * * * * * * *
    • * * * * * * *
  • * * * * * * *
* * * * * * * *
      • * * * * * * * * * * * * * * * * * * * * *

The computation requires generating n rows, then adding rows two at a time. So, there are n rows of n bits each, for a total of n times cn bits. Abstracting away the implementation details, we can say that the total time is bounded by cn2.

We have established that grade-school addition runs in linear time, whereas grade-school multiplication runs in quadratic time. Note that no matter how dramatic the difference in the constants, the quadratic curve will eventually dominate the linear curve. So, if the constant is very small, multiplication can take less time than addition. But for sufficiently large inputs, multiplication ends up taking more time than addition. Overall, the running time for multiplication is much slower than that for addition, no matter how fast the computer may be.

[plot]

In general, for any algorithm, we can define the input size n to be the number of bits to specify the algorithm’s input. We define Tn to be the worst-case amount of time used on inputs of size n.

Running time of algorithm A on input x: T(n) = max[all permissible inputs x of size n].

In computer science, we are often interested in the growth rate of Tn.

Exercise

How much time does it take to square the number n using grade-school multiplication?

Runtime Notation (Asymptotic Notation)

Here is some useful notation for discussing growth rates, expressed by the symbols O,

For any monotonic functions f, g from the positive integers to the positive integers, we say: f(n) = O(g(n)) if g(n) eventually dominates f(n).

So, f = O(n) means that there is a straight line that can be drawn that stays above f from some point on. More formally, we say that there exists a constant c such that for all sufficiently large n, f(n) ≤ c*g(n). This constant is the slope of the line representing input versus time.

In general,

Analogously, for any monotonic functions f, g from the positive integers to the positive integers, we say: f(n) = Ω(g(n)) if f eventually dominates g(n).

So, f = Ω(n) means that there is a line that can be drawn that stays below f from some point on. Formally, there exists a constant c such that for all sufficiently large n, f(n) ≥ c*g(n).

Also, for any monotonic functions f, g from the positive integers to the positive integers, we say: f(n) = Θ(g(n)) if f(n) = O(g(n)) and f = Ω(g(n)).

So, f = Θ(n) means that f can be sandwiched between two lines from some point on.

The symbol “O” for the “order” of the running time is not unique to computer science. It comes up in other fields. [example] The symbols Ω and Θ, on the other hand, are unique to computer science and were invented by Donald Knuth. They represent cases in which it’s useful, when comparing functions or sequences, to establish a bound or precisely nail down the growth of a function.

Exercises

True or false? %$\begin{eqnarray*} n &=& O(n^2) \n &=& O(n) n^2 &=& \Omega(n log n) • 3n2 + 4n + p = Ω(n2) • n2 log n = Θ(n2) \end{eqnarray*}$%

Note that in the case of n2 log n = Θ(n2), we know that n2 log n = O(n2) is not true but n2 log n = Ω(n2) is true.

Now, we can express growth rates in the following terms. Linear time: T(n) = O(n) Quadratic time: T(n) = O(n2) Cubic time: T(n) = O(n3)

In general, we describe a growth rate as polynomial time if, for some constant k, T(n) = O(nk). For example, we could have T(n) = 13 n5.

Some growth rates are even larger. We define exponential time as follows: For some constant k, T(n) = O(kn). For example, we could have T(n) = n*2n.

We also have logarithmic time, where T(n) = O(log n). For example, we could have T(n) = 15 log2(n).

=''Examples''=

Here’s a simple algorithm for adding two n-bit numbers. We’ll call it nursery-school addition. Input: Two n-bit numbers, a and b Output: a + b Start at a and increment (by 1) b times

What is this algorithm’s running time, T(n).

If b = 000 . . . 0000, then nursery-school addition takes almost no time. If b = 1111 . . . 11111, then nursery-school addition takes c n2n time.

So, in the worst case, this algorithm runs in exponential time.

Here’s a simple algorithm for multiplying two n-bit numbers. We’ll call it kindergarten multiplication.

Input: Two n-bit numbers, a and b Output: a * b

Start with a and add a, b – 1 times To find this algorithm, running time, we pick the worst case input for the input size n. Thus, T(n) = c n2n.

In the worst case, this algorithm also runs in exponential time.

Both nursery-school addition and kindergarten multiplication run in exponential time. Both algorithms scale horribly as input size grows. In contrast, grade-school methods scale polynomially: addition is linear and multiplication is quadratic. Using grade-school algorithms, we can, with reasonable efficiency, add and multiply fairly large numbers.

In general, if T(n) is not polynomial, the algorithm is not efficient. The run time scales too poorly with the input size. This will be the yardstick by which we will measure “efficiency.”

Factoring Efficiency

We have shown that grade-school multiplication is efficient. What about “reverse multiplication,” or factoring?

Let’s define FACTORING(N) to be any method that produces a non-trivial factor of N, or to assert that N is prime when there are no factors.

The simplest algorithm for factoring N is trial division. We check whether any number up to √N divides into N. for k = 2 to √N do if k | N then return “N has a non-trivial factor k” return “N is prime”

The running time is c √N (logN)2 time if we assume that division has the same complexity as multiplication: c (logN)2.. So, in input N, trial factoring uses c√N (logN)2 time.

Is this efficient? No! The input length n = log N. Hence, we’re using c 2n/2 n2 time. The time is exponential in the input length n.

Can we do better? We know of methods for factoring that are sub-exponential (about 2n^1/3 time) but no method that is truly efficient. Establishing whether there is an efficient algorithm for factoring is a very important open problem in computer science. At present, the security of the RSA cryptosystem rests on the assumption that factoring is hard. If there were efficient ways of factoring numbers, the RSA cryptosystem would be broken.

Almost Exponential Time For some constant k, T(n) = 2kth root of n Example: T(n) = 2n

Small Growth Rates Logarithmic Time: T(n) = O(log n) Example: T(n) = 15log2(n) Polylogarithmic Time: For some constant k, T(n) = O(logk(n))

Note: These kind of algorithms can’t possibly read all of their inputs. A very common example of logarithmic time is looking up a word in a sorted dictionary.

Some Big Ones: Doubly Exponential Time Example: 22^kn Triply Exponential Example: 22^2^kn And so forth.

Faster and Faster: 2STACK 2STACK(0) = 1

2STACK(n) = 22STACK(n – 1)

2STACK(1) = 2 2STACK(2) = 4 2STACK(3) = 16 2STACK(4) = 65536

2STACK(5) ≥ 1080 = atoms in universe

2STACK(n) = “tower of n 2’s”

And the inverse of 2STACK: log* 2STACK(0) = 1 log*(1) = 0

2STACK(n) = 22STACK(n – 1)

2STACK(1) = 2 log*(2) = 1 2STACK(2) = 4 log*(4) = 2 2STACK(3) = 16 log*(16) = 3 2STACK(4) = 65536 log*(65536) = 4

2STACK(5) ≥ 1080 = atoms in universe log*(atoms) = 5

log*(n) = # of times you have to apply the log function to n to make it ≤ 1

So an algorithm that can be shown to run in O(n log*n) time is linear time for all practical purposes!!

Ackermann’s Function A(0, n) = n + 1 for n ≥ 0 A(m, 0) = A(m - 1, 1) for m ≥ 1 A(m, n) = A(m - 1, A(m, n - 1)) for m, n ≥ 1

A(4, 2) > # of particles in universe A(5, 2) can’t be written out in this universe

Inverse Ackermann function A(0, n) = n + 1 for n ≥ 0 A(m, 0) = A(m – 1, 1) for m ≥ 1 A(m, n) = A(m – 1, A(m, n – 1)) for m, n ≥ 1

Define: A’(k) = A(k, k) Inverse Ackerman α(n) is the inverse of A’

Practically speaking: n × α(n) ≤ 4n

The inverse Ackermann function – in fact, Θ(n α(n)) arises in the seminal paper of D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362-391, 1983.

Reasoning about Time

On any reasonable computer, adding 3 bits and writing down the 2-bit answer can be done in constant time. Pick any particular computer A and define c to be the time it takes to perform X on that computer. So, the total time to add two n-bit numbers using grade school addition is cn [c time for each of n columns].

But please don't get the impression that our notion of counting “steps” is only meant for numerical algorithms that use numerical operations as fundamental steps.

Here is a general framework in which to reason about “time.”

Suppose we want to evaluate the running time T(n) of pur favorite algorithm DOUG. We want to ask: How much “time” does DOUG take when given an input X?

For concreteness, let’s look at an implementation of the algorithm DOUG in machine language. Now, “time” can be measured as the number of instructions executed when given input X. And T(n) is the worst-case time on all permissible inputs of length n.

In other contexts, we may want to use slightly different notions of “time.”

You can measure “time” as the number of elementary “steps” defined in any other way, provided each such “step” takes constant time in a reasonable implementation. So instead, we can count the number of JAVA bytecode instructions executed when given the input X. Or, when looking at grade school addition and multiplication, we can just count the number of additions.

Algorithms and Time

We have argued that grade school multiplication uses more time than grade school addition. This is a comparison of the complexity of two specific algorithms. To argue that multiplication is an inherently harder problem than addition we would have to show that “the best” addition algorithm is faster than “the best” multiplication algorithm.

Exercises

[TK]

Topic revision: r2 - 2008-04-04 - 14:26:18 - AntonBachin
 
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback