
1 Algorithms. Starring: Binary Search. Co-Starring: Big-O

2 Purpose: The ability to process a large volume of data effectively is a critical element of systems design. If we had to maintain information on 10 licensed drivers, we could code it almost any way we wished.

3 Purpose: However, if that list grew from 10 to 10 million, the WAY we store, order and retrieve this data would become critically important.

4 Resources: Java Essentials, Chapter 18, p. 703; Java Essentials Study Guide, Chapter 15, p. 235

5 Intro: Knowing all of the rules of English grammar and spelling will not help you give directions from place A to place B if you do not know how to get there. In systems, an analyst can describe a method to a programmer in more abstract terms without knowing the exact syntax of the programming language. Programs are typically based on one or more algorithms.

6 An algorithm is an abstract and formal step-by-step recipe that tells how to perform a certain task or solve a certain problem on a computer. Pseudocode is a loosely formatted outline of a solution that resembles the actual code (here, Java) but without the strict syntax. This is the shorthand that developers use to flesh out a solution.

7 When handling large volumes of data, it makes sense to devise an algorithm that will work effectively with the data. Before you actually implement this algorithm, you need to scope it out and analyze its potential efficiency.

8 Therefore, it is imperative to have an algorithm that efficiently orders (sorts) a large volume of data and another algorithm that efficiently searches that data for a specific element, such as a specific driver's information looked up by SSN.
9 We will cover the following topics:
   A Gentle Introduction to Big-O
   Sequential Search Algorithm
   A Need for Order
   Bubble Sort (a Review)
   Selection Sort
   Binary Search

10 A Gentle Introduction to Big-O: When we begin to discuss algorithms, we MUST be able to evaluate their effectiveness in some way. One way would be to measure their execution (pure clock) time. This method depends heavily on the power of a specific CPU.

11 Also, if the algorithm is inefficient, a powerful CPU can mask the problem only up to a point. We need a more abstract, standard mechanism for evaluating efficiency.

12 We use a more logical Order of Growth methodology, Big-O, to evaluate theoretical efficiency. This method removes the dependence on the relative strengths of the system(s) on which a given algorithm executes. The Big-O growth rates can be summed up with the following chart:

13 O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(a^n)
   (Chart: time t versus n, with curves labeled Constant O(1), Linear, N log n, Quadratic and Exponential)

14 As you can see, a constant growth rate is optimal, whereas an exponential rate is a problem. What do you think the N stands for?

15 Here is a little comparison chart that illustrates the concept:

   N          N^2               N Log(N)
   …          …,000             2,468
   1,000      1,000,000         9,…
   100,000    10,000,000,000    1,660,964

16 We will examine these in depth in our lecture on Big-O. For now, understand that an algorithm with Logarithmic efficiency is preferable to a Quadratic algorithm.

17 Sequential Search Algorithm: Consider an example where we have a database of only 10 licensed drivers. We can create a Driver class and then an array of instances of that class. Order really does not matter, since we have only 10 drivers to search.

18 Adding drivers to the array is efficient, as it takes only one step: myDriverArray[2] = new Driver(/* constructor info */); What do you think the Order of Efficiency would be for the add?
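To make the growth rates from the Big-O slides concrete, here is a minimal sketch that tabulates N, N^2 and N log N for a few list sizes. The class name GrowthTable, its method names and the chosen sizes are assumptions made for illustration (they do not come from the slides), and the logarithm is assumed to be base 2 with the result truncated to a whole number of steps.

```java
// Hypothetical helper class (not from the slides) that tabulates growth rates.
class GrowthTable {
    // Quadratic growth: n * n. Uses long because n * n overflows int quickly.
    static long square(long n) {
        return n * n;
    }

    // Linearithmic growth: n * log2(n), truncated to a whole number of steps.
    // Assumption: the logarithm is base 2.
    static long nLogN(long n) {
        double log2 = Math.log(n) / Math.log(2);
        return (long) Math.floor(n * log2);
    }

    public static void main(String[] args) {
        long[] sizes = {10, 1_000, 100_000, 10_000_000};
        System.out.printf("%12s %22s %16s%n", "N", "N^2", "N log N");
        for (long n : sizes) {
            System.out.printf("%,12d %,22d %,16d%n", n, square(n), nLogN(n));
        }
    }
}
```

Running the sketch shows how quickly the quadratic column outpaces the N log N column as the list grows, which is the reason the choice of algorithm matters far more at 10 million drivers than at 10.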
19 ANS: Constant, O(1)

20 If we needed to look for a specific driver using their SSN, at most how many steps would we need to execute? At least?

21 ANS: At most 10, if the driver is the last item or is not in the array. The best case is 1 step.

22 This is the essence of a Sequential Search: it iterates over each element in a list and stops either when the item is located or when the end of the list is reached. What do you think the Order of Efficiency is in the best and worst cases?

23 ANS: If the driver being searched for is the first in the list, it is Constant, O(1); otherwise it is Linear, O(N).

24 This search is also known as a Linear Search. How is this coded?

25 ANS:
   Driver[] myDriver = new Driver[100];   // assumed to be filled in elsewhere
   String SSN = "";                       // the SSN we are looking for
   for (int i = 0; i < myDriver.length; i++) {
       // skip empty slots, then compare SSNs with equals(), not ==
       if (myDriver[i] != null && myDriver[i].getSSN().equals(SSN))
           return i;                      // found: return the index
   }
   return -1;                             // not found

26 A Need for Order: Our little search works for 10 drivers, but if our list had 1 million drivers, we could expect our linear search to examine up to 1 million elements EACH time we look for a specific Driver. We need a better way to search our list, but before we can think of a more efficient search, we need to order the data in a way that a more advanced search can exploit.

27 We need to make sure that our list is arranged so that the SSNs are ordered from smallest to greatest. It is important to note that, just as there is a cost to performing a search against a list, there is a cost to sorting a list.

28 Therefore, we need to weigh the value of sorting a list so that we can execute an efficient search AGAINST simply leaving the list unordered and performing a linear search.

29 To make a decision, we need to know what the Dominant factor or process is in our application. If the list is fairly static and there will be extensive searches for specific drivers, then the search is the dominant factor, and our solution needs to make sure that the search algorithm is efficient, even at the expense of a costly SORT algorithm.

30 If the list is dynamic and searching is infrequent, then the efficiency of the insert (add) algorithm outweighs the search efficiency. We will learn when we discuss Data Structures that this decision requires evaluating the efficiency (Big-O) of competing ways to store and maintain data.

31 At this point we know only the array or ArrayList as a potential Data Structure, but we will soon cover Linked Lists, Binary Trees and Hash Tables. Let's assume that in our project the list of licensed drivers will hold about 1 million entries and that the list, once loaded, will remain static.

32 Let's also assume that there will be frequent requests for
information on specific drivers. This information gives us our solution: we will order the data so that we can provide an efficient method for searching the list.

33 There are MANY ways to sort a list (MergeSort, QuickSort, InsertionSort). We will cover all of them in a later lecture, but for now we will focus on the Selection Sort, and we will look back at the Bubble Sort.

34 Bubble Sort (a Review): Sort an array in ascending or descending order by comparing the nth element against the (n+1)th element. If they are not in the prescribed order, swap them. After repeated passes over the array, all of the items will be sorted. How will we sort our Driver class list?

35
   int c1, c2, leng;
   Driver temp;
   leng = myDriver.length;
   for (c1 = 0; c1 < (leng - 1); c1++) {
       for (c2 = (c1 + 1); c2 < leng; c2++) {
           // compareTo returns a positive value when c1 sorts after c2
           if (myDriver[c1].compareTo(myDriver[c2]) > 0) {
               // swap the two references
               temp = myDriver[c1];
               myDriver[c1] = myDriver[c2];
               myDriver[c2] = temp;
           }
       }
   }

36 What are we actually swapping here? This sort has a nested for loop. This means that for each element of the list, the inner loop is executed. In effect, the number of steps we perform is on the order of the number of elements squared. That's why we call this an O(n^2) sort.

37 Selection Sort: An algorithm for arranging the elements of an array in (ascending) order. Find the largest element in the array and swap it with the last element, then continue the process for the first n-1 elements.

38 The 1st iteration takes the LARGEST array element and swaps it with the LAST array element. The largest element is now in its correct place and will not be moved again.

39 We logically reduce the size of the array and ignore the last element(s).

40 Steps in selection sort:
   1. Initialize a variable, n, to the size of the array
   2. Find the largest among the first n elements
   3. Make it swap places with the nth element
   4. Decrement n by 1
   5. Repeat steps 2 to 4 while n >= 2

41 SELECTION SORT OUTPUT: initial array; selection sort in progress (figure)
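The selection sort steps listed above can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not code from the slides: the class name SelectionSortSketch is hypothetical, and a plain int array stands in for the Driver array so that the example is self-contained. With Driver objects, the comparison would presumably go through compareTo instead of the > operator.

```java
import java.util.Arrays;

// Hypothetical class name; a sketch of the selection sort steps on an int array.
class SelectionSortSketch {
    // Sorts the array in ascending order by repeatedly moving the largest
    // remaining element to the end of the unsorted part.
    static void selectionSort(int[] a) {
        // n is the logical size of the unsorted part (steps 1, 4 and 5)
        for (int n = a.length; n >= 2; n--) {
            // Step 2: find the index of the largest among the first n elements
            int iMax = 0;
            for (int i = 1; i < n; i++) {
                if (a[i] > a[iMax]) {
                    iMax = i;
                }
            }
            // Step 3: swap it with the nth element, which lives at index n - 1
            int temp = a[iMax];
            a[iMax] = a[n - 1];
            a[n - 1] = temp;
        }
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 25};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // prints [3, 7, 19, 25, 42]
    }
}
```

The outer loop shrinks the logical size of the array by one on each pass, and the inner loop scans what remains, so the number of comparisons grows roughly with the square of the array length.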
43 The same procedure can be used to sort the array in descending order by finding the SMALLEST element in the array.

44 For the same reasons as the bubble sort, this is also an O(n^2) sort.

45 Once we have sorted the list, there is no need to apply a linear (sequential) search unless you need to accumulate data about each driver in the list. We are now free to apply an efficient search algorithm.

46 Binary Search: The concept of a Binary Search is that it repeatedly and logically divides the list in half until the element is found or the logical size of the list is reduced to zero. It is an algorithm for quickly finding the element with a given value in a sorted array.

47 It is used to find the location of a given target value in an array. It works on sorted arrays; unsorted arrays need to be searched element by element.

48 Take a sorted (ascending) array of n elements and search for a given value, x.

49 Locate the middle element. Compare that element with x. A match ends the search.

50 If x is smaller, the target element is in the LEFT half of the array.

51 If x is larger, the target element is in the RIGHT half of