Tuesday, September 21, 2010

Algorithms to Abstractions

Hi:

In last two classes we have talked about various topics related to computer science and engineering: Algorithms (a precise step-by-step description of how to solve a problem in finite time (i.e. the algorithm execution terminates as well as solves the problem)); Linear search and Binary Search with associated worst case computational complexity; and Computer Architecture as an abstraction/interface between hardware and software; Processor Performance Equation (Execution Time = Instruction Count*Clock Cycle Per Instruction*Clock Cycle Time) and how it highlights the role of Computer Scientists and Engineers; The importance of exploiting parallelism along with associated complexity (e.g. need to be careful to avoid race conditions); The role of Compilers and Operating systems; as well as the importance of layers of abstractions (hardware, operating systems, programming environments etc.)  in computer science and engineering.

There are Wikipedia pages which you can visit to get more information about any of these topics. Specifically, I would like you to visit the Binary search Wikipedia page and answer the following questions:

1) Even though binary search has far better worst case complexity than linear search (log(N) versus N) why in practice linear search may perform better than binary search?
2) Why the best search algorithm may employ a combination of binary search and linear search?

Also visit the page on CPU and try to explain in your words what is the difference between instruction level parallelism, thread level parallelism and data level parallelism. Give examples of where (in what devices) and for what purpose (in which applications) you might be using these different forms of parallelisms.

And finally, explain in your words what are the characteristics of a good abstraction in computing?

Regards,
Sandeep

16 comments:

  1. 1. Linear is better to use in practice even though binary better worst case complexity than linear, because for binary to work all the items are in sorted order.
    2.It is best to employ both binary and linear so that binary can quickly cut a lot of your choices down to 16 or 8 then use linear which is more efficient with smaller lists.
    3. Instruction level parallelism tries to increase the rate at which instructions are executed within a CPU,thread level parallelism tries to increase the number of threads that a CPU can execute at the same time, and data parallelism deals with multiple pieces of data in one instruction.
    4. Good characteristics of abstraction in computing is focusing on one concept/idea at a time.

    ReplyDelete
  2. 1. The binary search uses a sorted list order. If the items are not sorted, then it won't be effective. Instead of taking the time to sort them, most of the time, a linear search will be faster. It is better "in practice" because, obviously, all lists will not be sorted.

    2. It is best to use both because a binary search can knock out a large list of items and then you can use a linear search later when the span is smaller. This increases performance.

    3. Instruction level parallelism deals with fetching and decoding an instruction before the prior instruction is finished executing. Namely, it tries to decrease the time between and of execution of instructions. Thread level parallelism involves executing multiple threads or programs at the same time and in parallel. An example of this is multicore processing and is used in video games and scientific/engineering applications (ie memory intensive programs). Data parallelism deals with multiple pieces of data associated with only ONE instruction. This allows for faster performance when one instruction/operation is being performed on a large data set. This is used in multimedia applications and many scientific applications.

    4. Specificity, detail, and elimination/reduction of factors are all parts of good abstraction in computing. The idea is to focus on one or a small few of concept or ideas at a time.

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. 1. Binary search is the obvious choice when looking purely at worst case complexity. However for binary search the lists need to be sorted, if you have unsorted list binary search is ineffective and linear search would be the way to go.

    2. Binary search is good to limit the search to a certain list, than linear might be more preferred once the list is eliminated to a manageable size.

    3. All three forms a parallelism are trying to make the CPU more efficient. Instruction level parallelism trys to increase the rate at which the CPU can execute instructions, thread level parallelism's goal is to the numbers of threads the CPU can execute instead. An example of ILP is simple pipelining and can be found in most modern general-purpose CPUs. An example of thread level parallelism is multiprocessors. Finally, data parallelism is used in vector CPUs to focus on one instruction at a time with a large set of data, like in multimedia products.

    4. A main characteristic of good abstraction is to make sure the focus remains clear to the audience.

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. 1. A linear search can perform better in practice because data is not always sorted in a way that allows for a binary search. The data could be sorted, but the number of steps taken to sort the list plus the number of steps it takes to perform a binary search may often be greater than the number of steps it takes to perform a linear search.

    2. With a small enough list, a linear search can perform faster than a binary search. This is because, for small numbers, log(N) and N are almost equal and the binary search requires the extra computations involved in finding the median of the list. For this reason, a good search algorithm will employ both linear searches and binary searches. The binary search can be used to cut a massive list down into a smaller one and then the linear search will finish off the process and find the specified element.


    3. Instruction level parallelism tries to take advantage of steps that are not dependent on another to execute more code simultaneously in the same amount of time. Instruction level parallelism is used wherever it can be, but it is often most useful in graphics and scientific work. Thread level parallelism speeds the execution of a program by distributing the workload in the form of threads over multiple processors. Threading is used quite a bit processing intensive applications such as video games. Each thread can be performing its own operations and computations while communicating with the other threads. Data parallelism also takes advantage of multiple processors, but each processor is simply preforming the same task on different parts of the data. Data parallelism is often used in applications in which the same operation must be performed a great number of times; an example of this would be animation software.

    4. A good abstraction turns a complicated process into a tool that is simple to use. The inner workings of the tool are still complex, but the abstraction hides that and allows us to use the tool to create further abstractions.

    ReplyDelete
  7. 1.) In practice, linear search has better performance than binary search because linear search does not require a sorted list. This is obviously better because not all lists are sorted. In fact, it's probably safe to say most lists are not sorted. A binary list is almost useless without a sorted list, so linear search is the way to go in most cases.

    2.) It's better to use both because binary can tackle a large list in the beginning, then linear can come in later when the list is more manageable and smaller. This is the most efficient way to search.

    3.) Instruction level parallelism is meant to start a process before the process before it is finished. It is useful in shortening the gap between and during process/instructions. Thread level parallelism is the practice of executing multiple processes and instructions at the same time. It is used in multicore processors which are very useful for games and video encoding and decoding. Data parallelism is used when many pieces of data are delegated to one task. It is very fast and efficient when performing that one instruction. It is used for multimedia type applications and the like.

    4.) Good abstraction in computing deals with factoring out less important details and distractions so that someone can focus most of their energy and time on just a few ideas at one time.

    ReplyDelete
  8. 1. Binary search is definitely the better choice when dealing with an array in which everything has been sorted, however if the list has yet to be sorted, a linear search will produce much better results.

    2. Using a combination of binary and linear search is preferable because a binary search would be able to sort through a large list and one that list has become a manageable size, a linear search could take over and finish the searching process more efficiently.

    3. The three different forms of parallelism are all designed to make the processing more efficient in the computer. Instruction level parallelism is used to optimize the rate at which instructions are executed in the CPU. Thread level parallelism works to increase the number of threads that can be executed within the CPU at a time. And data parallelism deals with multiple pieces of data at the same time.

    Instruction level parallelism can be seen in pipelining and superscalars. Thread level parallelism can be observed in video games or scientific modeling or calculation programs where multiple different tasks are being done at the same time. And data level parallelism can be used to churn through massive amounts of data coming from one program or instruction.

    4. Good abstraction will simplify the idea and problem and make it more succinct and manageable to anyone looking at the abstraction.

    ReplyDelete
  9. 1. Binary search will only work well when things are in order.

    2. Using binary search and linear search at the same time is a much better idea then using one or the other because binary search can cut your choices down to a smaller amount so that linear search will have an easier time making a decision.

    3. Instruction level parallelism allows only two instructions to be sent out at a time. Thread parallelism allows multiple CPU's to share memory and is able to keep data more up to date. Data parallelism can deal with multiple pieces of data for each instruction.

    4. Good abstraction allows you to simplify ideas down to smaller ones and make it more manageable for other people looking at it.

    ReplyDelete
  10. 1. Linear is often found to be better because for binary to work everything in the array has to be sorted.

    2. It is better to use a combination of the two because the binary search can reduce the size of extremely large list then linear search can take over when the list gets smaller.

    3. Instruction level parallelism is a measure of how many of the operations in a computer program can be performed simultaneously and has been used to make the runtime between the CPU and main memory more efficient. thread level parallelism executes multiple programs at the same time. This is used in multi-core processors to allow the different cores to work on the same application together. data level parallelism deal with multiple pieces of data in the context of one instruction.

    4. good abstraction is focusing on only a few details at a time.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. 1) Binary search, while O(log n), only works on sorted lists. If the input data set is never sorted, then sorting O(n log n) is added - which is worse than the O(n) of linear search

    2) I know of no such algorithms.

    Instruction level parallelism: The degree to which operations can be performed simultaneously without breaking dependence of one operation on the result of another. This is done by hyper-threading CPUs in realtime, and also done by compilers as an optimization step. Applications are any sequential task that can benefit from performance improvements.

    Task-level (thread) parallelism: Performing two distinct tasks in parallel. Each task my share a data source with another task, or they can be completely separate. This is physically performed in multi-core processors, and is performed logically using preemptive multitasking by modern operating systems. Applications are platforms that may be tasked with multiple duties simultaneously.

    Data-level parallelism: Performing the same task across multiple independent data. SIMD hardware, such as GPU cores perform this while executing shaders. An application is rendering graphical scenes at interactive framerates.

    Good abstraction separates the user/caller from implementation details. It simplifies utilization of the resources and also makes the system more robust. It may also provide a means for getting more direct access when it's needed.

    ReplyDelete
  13. 1.)Binary search relies on the fact that elements in an array are sorted; therefore if said array is not sorted, linear search will most likely perform better.
    2.)Since linear search is more efficient with fewer elements and binary search is more efficient when the array is much larger, the best algorithm would be to use binary search until the number of elements within the sorted array is small enough that linear search could maximize the efficiency with its method at that point.
    3.)Instruction level parallelism processes instructions, fetching, decoding, and executing in a manner so that one of these actions is always happening to an instruction, but they are at different stages along the assembly line. This parallelism is utilized in general CPUs to increase general performance through faster completion of instructions (1 instruction per cycle). Thread level parallelism is utilized in multiprocessors and involves the execution of multiple programs at the same time, in parallel. Data level parallelism is more beneficial on vector CPUs, which perform a single operation on a large set of data with one instruction. This parallelism is only efficient if the application needs to have one operation done to a large set of data.
    4.)A good abstraction makes it simple for users to perform potentially complicated tasks. For example, high-level programming languages are examples of well-implemented abstractions because they allow us humans to work with difficult concepts in manageable pieces.

    ReplyDelete
  14. 1.
    Linear search may perform better than binary search because it doesn’t have to be sorted. It is practical when doing a single search in an unsorted list. Binary search has to be sorted. Binary search also gets complicated and slower when there is change in the array. The array will have to be re-sorted in order for binary search to work properly.

    2.
    The best search algorithm may employ a combination of binary search and linear search because it will be more sufficient. When the size of a remaining span falls below a certain small value, then it will be more efficient to use a linear search instead of a binary search.

    3.
    Instruction level parallelism executes more than one instruction at a time like an assembly line. It is frequently used in video creation software and photo processing because they are very monotonous.

    Thread level parallelism executes multiple programs in parallel. For example, a single program that contains more than one thread (function) can be executed separately. This type of technique usage can be seemed in UltraSPARC technology.

    Data parallelism is different from both instruction level parallelism and thread level parallelism in the sense that data parallelism deals with every one piece of data for every instruction. Modern examples include PowerPC-related AltiVec (which is also known as VMX).

    4.
    A good abstraction focuses on simplicity. It takes away all the specific implementations to allow easier usages.

    ReplyDelete
  15. 1. Linear is probably better because it does not have to be sorted like binary does.
    2. The algorithm would have to be set up so that the binary section searches through the large numbers, and then the linear section will be search for the small numbers. The linear search will trigger once the binary search has narrowed down the numbers enough that the linear will be more efficient.
    3. Instruction level parallelism is the number of operands a cpu can perform simultaneously. Thread parallelism is when multiple processors of the same cpu performs processes on the same or different data. Data parallelism is when a processor executes a specific list of processes.
    4. The point of abstraction is to simplify complicated processes.

    ReplyDelete
  16. 1.Binary will work effectively if the items are sorted. Linear search would be more effective to use when it comes to unsorted list.

    2.I think that we use both binary and linear search because with Binary it gets rid of the large list and after its reduce to a smaller size, linear can then do the rest.
    Instruction level executes more than one instruction at a time. Can performed operation at the same time and it increase the rate of the CPU to execute the instructions. Task thread will increase the number of the threads. Data level when instructions can be performed a lot of times in an application.

    3.Good abstraction will allow to get rid of all the detailed tasks.

    ReplyDelete