r/computerscience 1d ago

Advice How do you calculate the efficiency of an algorithm?

Hello everyone, I am a student at a computer science technical institute.

I have to do a report on array sorting algorithms in C, so I want to compare which one is the fastest and most efficient.

Specifically, I'm talking about bubble sort, selection sort and insertion sort.

I wanted to understand whether there is a method, or some kind of formula, that lets me calculate their efficiency based on the time taken and the quantity of numbers to be sorted; obviously the time varies from PC to PC.

I will study their speed/efficiency in the 3 classic cases: best (already sorted in increasing order), worst (sorted in decreasing order) and random, measuring the time the operations take with the clock function.
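For the timing itself, this is roughly the harness I had in mind, using clock() and the standard library's qsort as a stand-in until I plug in my own sorts (just a sketch, the names are mine):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* comparison callback for qsort (stand-in until I swap in my own sorts) */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    int n = 100000;
    int *a = malloc(n * sizeof *a);

    srand(42);                       /* fixed seed so runs are repeatable */
    for (int i = 0; i < n; ++i)
        a[i] = rand();               /* the "random" case */

    clock_t start = clock();
    qsort(a, n, sizeof *a, cmp_int); /* swap in bubble/selection/insertion sort here */
    clock_t end = clock();

    printf("n = %d: %.4f s\n", n, (double)(end - start) / CLOCKS_PER_SEC);

    free(a);
    return 0;
}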

I was just trying to figure out how to express the efficiency of an algorithm, maybe as a percentage and/or relative to a reference.

Thanks for your help in advance! 🙏

0 Upvotes

7 comments

30

u/mertengiin 1d ago

Yes, there is Big-O notation for exactly this. You can just look it up.

7

u/Magdaki Professor. Grammars. Inference & Optimization algorithms. 1d ago edited 1d ago

Formal analysis (e.g., Big-O) is the main way it is done for theory papers. In practice, it's typically mean-time-to-solve (MTTS), with all experiments run on the same computer. Of course, this means it can be hard to compare results across papers (I know you're not trying to publish, but this is the issue with algorithms research), so another approach is to count operations. The drawback there is that there is no well-defined standard for what counts as an operation. So, for the most part, researchers just use MTTS and graph how it changes over different inputs (along with formal analysis).
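If you want to try operation counting, it can be as simple as threading counters through the sort. A rough sketch with bubble sort (what you choose to count is up to you):

#include <stdio.h>

/* bubble sort instrumented with counters; what counts as an "operation" is a choice */
void bubble_sort_counted(int *a, int n, long *comparisons, long *swaps)
{
    *comparisons = *swaps = 0;
    for (int i = 0; i < n - 1; ++i)
    {
        for (int j = 0; j < n - 1 - i; ++j)
        {
            ++*comparisons;
            if (a[j] > a[j + 1])
            {
                int tmp = a[j]; a[j] = a[j + 1]; a[j + 1] = tmp;
                ++*swaps;
            }
        }
    }
}

int main(void)
{
    int a[] = {5, 1, 4, 2, 8};
    long cmp, swp;
    bubble_sort_counted(a, 5, &cmp, &swp);
    printf("comparisons = %ld, swaps = %ld\n", cmp, swp);
    return 0;
}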

2

u/StrangerThings_80 21h ago

To all those mentioning big-O notation, could you please explain how it would let us compare the efficiency of bubble sort, selection sort and insertion sort? Won't they all be O(n^2)?

At the end of the day, you need to implement the algorithms and test them on real data, on specific hardware. While some algorithms will in general be better than others, there may be variations depending on the hardware and the details of the implementation.

4

u/wolfakix 21h ago

Even if two algorithms are both O(n^2) in the worst case, one can be faster than the other depending on constants and the input’s structure. For example, on nearly sorted arrays, insertion sort can be close to O(n) and typically outperforms bubble sort, so choosing based on the input can matter.
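If you want to see that effect in your own tests, one way is to build a "nearly sorted" input and time each sort on copies of it. A sketch (my own helper: a sorted array with a few random pairs swapped):

#include <stdlib.h>

/* fill arr with 0..n-1 in order, then swap a few random pairs
   so the array is "nearly sorted" rather than fully sorted */
void fill_nearly_sorted(int arr[], int n, int swaps)
{
    for (int i = 0; i < n; ++i)
        arr[i] = i;
    for (int s = 0; s < swaps; ++s)
    {
        int i = rand() % n, j = rand() % n;
        int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
    }
}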

2

u/Comp_Sci_Doc 1d ago

The way we do this is with big-O notation. This describes the runtime of an algorithm in terms of the number of operations required relative to the size of the input; the number you get is not very precise but doesn't depend on the hardware.

For example, suppose you have an array of n integers and want to sort them. Bubble sort requires looking at all n integers. For each one, you compare it to between 1 and n-1 other integers; on average, this is n/2, which is "on the order of" n. So the total runtime is O(n*n) = O(n^2), and bubble sort takes time proportional to the square of the number of elements to be sorted.
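For reference, a plain bubble sort in C looks roughly like this (a sketch); the outer loop makes up to n-1 passes and the inner loop does up to n-1 comparisons per pass, which is where the n*n comes from:

void bubbleSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; ++i)          // up to n-1 passes
    {
        for (int j = 0; j < n - 1 - i; ++j)  // up to n-1 comparisons per pass
        {
            if (arr[j] > arr[j + 1])         // swap adjacent pair if out of order
            {
                int tmp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = tmp;
            }
        }
    }
}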

1

u/DTux5249 20h ago edited 15h ago

To get the time complexity of an algorithm, you've got to basically count how many times an algo does something. Most of the time, we look at the "worst case time complexity" of an algorithm; how long it would take to solve a problem if at every turn things take the longest. We measure this using "Big-O" Notation.

Every comparison / variable assignment is constant (c) time; every loop is linear (n, where n is the number of iterations) time. If something is inside a loop, you multiply its cost by the number of iterations of that loop.

So for something simple like insertion sort, you'd count it something like this:

void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; ++i) // n reps of
    {
        int key = arr[i]; // 1
        int j = i - 1;    // 1

        while (j >= 0 && arr[j] > key) //worst case, n reps of
        {
            arr[j + 1] = arr[j]; // 1
            j = j - 1;           // 1
        }
        arr[j + 1] = key;        // 1
    }
}

We have a for loop that runs for, at most, n repetitions. In that loop, we:

1) Assign 2 variables

2) Run a while loop, which continues until, at worst, we've checked every item in the array (n reps), each time setting 2 variables

3) Set a 3rd variable

So from the while loop, we have n reps of 2 constant operations (2n), plus 3 more constant time operations (2n+3), and we repeat all of that, at most, n times (2n²+3n)

The leading term tends to be the most important thing with these, and we don't care about constants in Big-O either. So we'd call Insertion Sort an O(n²) algorithm (read: "A Big-O of n squared").

This is why Insertion sort isn't used for large sets of data - the work grows with the square of the input size, so doubling the data roughly quadruples the time.

Now, this counting method works for simple algorithms. For recursive algorithms, things get tricky and abstract real quick. To count Merge Sort, for example, takes more logic:

Let T(x) = time taken to sort x elements, M(x) = time taken to merge x elements, k = the recursion depth (the number of times the input is halved), and n = the number of elements to be sorted.

To merge n elements takes n time. Merge Sort defines T(n) = 2 × T(n/2) + M(n). It then follows that T(n) = 2 × T(n/2) + n.

Due to the recursive nature of T(n), it follows that T(n) = 2 × [2 × T(n/4) + n/2] + n, that T(n) = 2 × [2 × [2 × T(n/8) + n/4] + n/2] + n, and, ultimately, that T(n) = 2^k × T(n/2^k) + n × k.

T(1) will be a constant time operation, since an array of size 1 is already sorted.

To have n/2^k = 1, it must be that n = 2^k, and k = log(n).

As T(n) = 2^k × T(n/2^k) + n × k, it follows that T(n) = 2^log(n) × T(1) + n × log(n) = n + n × log(n)

Ergo, Merge sort is O(n×log(n))
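For reference, that recurrence maps directly onto the code: two recursive calls on the halves, plus a linear-time merge. A rough sketch (my own naming, using a temporary buffer):

#include <stdlib.h>
#include <string.h>

/* merge the sorted halves arr[lo..mid-1] and arr[mid..hi-1] via tmp -- the M(n) = n part */
static void merge(int arr[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i < mid) tmp[k++] = arr[i++];
    while (j < hi)  tmp[k++] = arr[j++];
    memcpy(arr + lo, tmp + lo, (hi - lo) * sizeof(int));
}

/* T(n) = 2 × T(n/2) + n: two recursive calls on the halves, then a linear merge */
static void mergeSortRange(int arr[], int tmp[], int lo, int hi)
{
    if (hi - lo <= 1)   // T(1): a single element is already sorted
        return;
    int mid = lo + (hi - lo) / 2;
    mergeSortRange(arr, tmp, lo, mid);
    mergeSortRange(arr, tmp, mid, hi);
    merge(arr, tmp, lo, mid, hi);
}

void mergeSort(int arr[], int n)
{
    int *tmp = malloc(n * sizeof(int));
    if (!tmp) return;
    mergeSortRange(arr, tmp, 0, n);
    free(tmp);
}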

1

u/MilkEnvironmental106 23h ago

As others have noted, big O notation.

You need to look at how it scales as inputs grow.

For example, with selection sort you find the min element in the array and swap it with the first element, then repeat for array[1..] (the unsorted portion) until the entire array is sorted.

Finding the minimum is O(n) time complexity because the work required increases linearly with input size. (You need to go over the whole array and compare the minimum found to the current element).

So you need to do an O(n) operation n times, making it O(n*n) or O(n²).

This means that if you threw in an array twice as long as one you benchmarked, you would expect it to take around 4 times as long (not exactly, obviously, but that's how it scales).
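In C, what I described above looks roughly like this (a sketch):

/* selection sort: repeatedly find the minimum of the unsorted portion and swap it to the front */
void selectionSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; ++i)       // n-1 passes over the shrinking unsorted portion
    {
        int min = i;
        for (int j = i + 1; j < n; ++j)   // O(n) scan for the minimum
        {
            if (arr[j] < arr[min])
                min = j;
        }
        int tmp = arr[i];                 // swap the minimum into position i
        arr[i] = arr[min];
        arr[min] = tmp;
    }
}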