Thursday 2 March 2017

Lab 5: Algorithm Selection Lab

For this lab, we are trying a few different ways increasing the volume  of a sequence of sound samples, using different algorithms. The purpose is to test which approaches are faster than others by comparing the time elapsed from the  beginning to the end of the process.

During group work, we came up with several different algorithms that had varying results - so the code used for this blog is not my own original code, but the code that was worked on by the group.

Since sound is represented by computers as 16-bit float numbers, we use integers with the type: int16_t 

The first step is creating a naive version and testing its results with a time comparison function:

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>


int main(int argc, char* argv[]) {

    float factor = 0.5;
    int max = 5000000;

    if(argc >= 2) {
        factor = atof(argv[1]);
    }
    if (argc >= 3) {
        max = atoi(argv[2]);
    }

    int16_t* sound = generateSound(max);


    int i;
    struct timeval t1, t2;
    double elapsed; 


    gettimeofday(&t1, NULL);
    adjustVolume(sound, factor, max);
    gettimeofday(&t2, NULL);
    elapsed = (t2.tv_sec - t1.tv_sec) * 1000 + (t2.tv_usec - t1.tv_usec) / 1000;
    printf("%d size sound file, original array, simple: %.2lf\n", max, elapsed);


} // end of main()

int16_t* generateSound(int max) {

    int leftover = max % 8;
    int len = (max + (8 - leftover));
    int16_t* rc = (int16_t*)malloc(len * 2);
    int i;
    for (i = 0; i < max; i++) {
        rc[i] = (int16_t)((rand() % 65536) - 32768);
    }
    return rc;
}


void adjustVolume(int16_t* original, float factor, int max) {

    int i;
    for (i = 0; i < max; i++) {
        original[i] = original[i] * factor;
    }
} 

-------------------------------------------------------------------------------------------------

The code above will give the simple naive version, which will create an array of int16_t values that range from -32768 -> +32767. Then our adjustVolume algorithm will simple go through the entire array and multiply those random values by the factor of 0.5.
This is probably a very slow process, since our program is creating 5 million elements in an array and looping through each one trying to change its value.  This process could run up a lot of memory and time the bigger that number gets.

The second step was to create a secondary algorithm by which we could compare the response times.
Another version we used was a table lookup. In our first naive version, we had to create a random number each time and then do calculations on it (multiply by the factor) in order to get a new number. That takes time and processing power. The purpose of the lookup table is to do all those calculations for every single possible value one time. Then, when a random number is created, we search for the already-calculated answer in the lookup. This saves us the time of calculating the values. The differences to our code look like this:

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/time.h>


int main(int argc, char* argv[]) {

    float factor = 0.5;
    int max = 5000000;

    if(argc >= 2) {
        factor = atof(argv[1]);
    }
    if (argc >= 3) {
        max = atoi(argv[2]);
    }



int16_t* sound = generateSound(max);
int16_t* sound2 = copySound(sound, max); 

    int i;
    struct timeval t1, t2;
    double elapsed; 
 


   gettimeofday(&t1, NULL);
    adjustVolume(sound, factor, max);
    gettimeofday(&t2, NULL);
    elapsed = (t2.tv_sec - t1.tv_sec) * 1000 + (t2.tv_usec - t1.tv_usec) / 1000;
    printf("%d size sound file, original array, simple: %.2lf\n", max, elapsed);
   


   gettimeofday(&t1, NULL);
    adjustVolume3(sound2, factor, max);
    gettimeofday(&t2, NULL);
    elapsed = (t2.tv_sec - t1.tv_sec) * 1000 + (t2.tv_usec - t1.tv_usec) / 1000;
    printf("%d size sound file, original array, table: %.2lf\n", max, elapsed); 


} // end of main() 

int16_t* generateSound(int max) {

    int leftover = max % 8;
    int len = (max + (8 - leftover));
    int16_t* rc = (int16_t*)malloc(len * 2);
    int i;
    for (i = 0; i < max; i++) {
        rc[i] = (int16_t)((rand() % 65536) - 32768);
    }
    return rc;
}


int16_t* copySound(int16_t* original, int max) {

    int leftover = max % 8;
    int len = (max + (8 - leftover));
    int16_t* duplicate = (int16_t*)malloc(len * 2);
    int i;
    for (i = 0; i < max; i++) {
        duplicate[i] = original[i];
    }
    return duplicate;
}


void adjustVolume(int16_t* original, float factor, int max) {

    int i;
    for (i = 0; i < max; i++) {
        original[i] = original[i] * factor;
    }


void adjustVolume3(int16_t* original, float factor, int max) {

    uint16_t i;
    float* table = malloc(65536 * 2);
    for (i = 0; i < 65535; i++) {
        table[i] = (int16_t)i * factor;

    }
    table[65535] = (int16_t)65535 * factor;
    int j;
    for (j = 0; j < max; j++) {
        original[j] = table[(uint16_t)original[j]];

    }
}


----------------------------------------------------------------------------------------
 
This code will make a duplicate of the first array we create with our naive version and use it as sound2. Then in adjustVolume3, we will make all our calculations ahead of time inside our duplicate table. Then we can make changes to the original array by referencing to our lookup table. 

The next step we have to do is to compare the naive version we created with the table lookup version - first in the X86 system and then separately in the Aarch64 system. Note that we are not comparing times between the systems, but comparing the algorithm differences inside each system. So that means we'll need to do these tests on the Xerxes server (X86) and then on the Betty server (Aarch64) and compare the printf responses we get.

[https://wiki.cdot.senecacollege.ca/wiki/SPO600_Servers

We had several algorithms, so I'll point out which responses are directly a result of the naive version and the table lookup versions:

On Betty:

5000000 size sound file, new array, simple: 92.00
5000000 size sound file, original array, simple: 91.00                  *** NAIVE VERSION ***
5000000 size sound file, new array, table: 56.00
5000000 size sound file, original array, table: 56.00                     *** TABLE LOOKUP ***
5000000 size sound file, new array, int hack: 41.00
5000000 size sound file, original array, int hack: 40.00
5000000 size sound file, new array, inline assembly: 0.00
5000000 7size sound file, original array, inline assembly: 4.00
 

As we can see in the Aarch64 system, using the table lookup algorithm was about 63% faster than the naive version, whereas the naive version was about 62% slower than the table lookup version.

On Xerxes:

I tried to copy the code over exactly the same way, but I got segmentation faults with core dump on the adjustVolume3 and adjustVolume4. Everything else appeared to work fine except for the inline assembly part of adjustVolume8, because that code is meant for Aarch64 only. I studied the code for a while and then I asked for some advice and got directed to this github reference for debugging:

https://github.com/SENECA-DSA555/dsa555-w17/wiki/gdb-guide

I ran the code through the debugger and got this:

(gdb) run
Starting program: /home/wawilliams/lab7
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.24-4.fc25.x86_64
5000000 size sound file, new array, simple: 30.00
5000000 size sound file, original array, simple: 23.00

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400f3d in adjustVolume4 (original=0x7ffff66fd010, newSound=0x7ffff4a5f010, factor=0.5, max=5000000) at lab7_Final.c:148
warning: Source file is more recent than executable.
148                     table[i] = (int16_t)i * factor;





 

No comments:

Post a Comment