Array
Why not use Arrays for Everything?
1. In unordered array the insertion is fast Q(1), but search takes slow O(N) time.
2. In ordered array, you can search quickly, in O(logN) time, but insertion takes O(N) time.
For both above kinds, the deletion takes O(N) time
Another problem with arrays is that their size is fixed when they are first created with new.
Simple Sorts
Bubble sort
Bubble sort makes about Square(N)/2 comparisons. Both swaps and comparisons are proportional to N2. Bubble sort runs in O(N2) time.
Selection Sort
The Selection sort improves on bubble sort by reducing the number of swaps necessary from O(N2) to O(N). Unfortunately, the number of comparisons remains O(N2).
Insertion Sort
Insertion sort executes twice as fast as bubble sort and somewhat faster that selection sort in normal situation. Insertion Sort runs in O(N2) for random data. For data already sorted (O(N)) or almost sorted, the insertion sort does much better. However, for data, that is reverse sorted, this runs no faster that bubble sort.
Comparisons of simple sorts
There’s probably no point in using the bubble sort, unless you don’t have your algorithm book handy. The bubble sort is so simple that you can write it from memory. Even so, it’s practical only if the amount of data is small.
The selection sort minimizes the number of swaps, but the number of comparisons is still high. This sort might be useful when the amount of data is small and swapping data items is very time-consuming compared with comparing them.
The insertion sort is the most versatile of the three and is the best bet in most situations, assuming the amount of data is small or the data is almost sorted. For larger amounts of data, quicksort is generally considered the fastest approach;
We’ve compared the sorting algorithms in terms of speed. Another consideration for any algorithm is how much memory space it needs. All three of the algorithms in this chapter carry out their sort in place, meaning that, besides the initial array, very little extra memory is required. All the sorts require an extra variable to store an item temporarily while it’s being swapped.
Stacks and Queues
There are basically three operations that can be performed on stacks . They are 1) inserting an item into a stack (push). 2) deleting an item from the stack (pop). 3) displaying the contents of the stack(pip).
Below are some of operations a stack data type normally supports:
Push, pop, top, isempty, isfull, get-size,
All operations except
Queue, Circular queue.
As with stacks, items can be inserted and removed from a queue in O(1) time.
A deque is a double-ended queue. You can insert items at either end and delete them from either end. The methods might be called insertLeft() and insertRight(), and removeLeft() and removeRight().
Priority Queue
A de
Linked list
double ended list
Red-Black Rules
1. Every node is either red or black.
2. The root is always black.
3. If a node is red, its children must be black (although the converse isn’t necessarily true).
4. Every path from the root to a leaf, or to a null child, must contain the same number of black nodes.
The number of black nodes on a path from root to leaf is called the black height.
Fixing Rule Violations
Color Flips on the Way Down
Here’s the rule: Every time the insertion routine encounters a black node that has two red children, it must
change the children to black and the parent to red (unless the parent is the root, which always remains black).
Like ordinary binary search trees, a red-black tree allows for searching, insertion, and deletion in O(log2N) time.
Hash Tables
A hash table is a data structure that offers very fast insertion and searching. Hash tables are significantly faster than trees, which, as we learned in the preceding chapters, operate in relatively fast O(logN) time. Not only are they fast, hash tables are relatively easy to program.
Hash tables do have several disadvantages. They’re based on arrays, and arrays are difficult to expand after they’ve been created. For some kinds of hash tables, performance may degrade catastrophically when a table becomes too full, so the programmer needs to have a fairly accurate idea of how many data items will need to be stored (or be prepared to periodically transfer data to a larger hash table, a time-consuming process). Also, there’s no convenient way to visit the items in a hash table in any kind of order
However, if you don’t need to visit items in order, and you can predict in advance the size of your database, hash tables are unparalleled in speed and convenience.
Open Addressing
Double Hashing
Experience has shown that this secondary hash function must have certain characteristics:
stepSize = constant - (key % constant); where constant is prime and smaller than the array size.
Double hashing requires that the size of the hash table is a prime number.
Separate Chaining
In open addressing, performance degrades badly as the load factor increases above one-half or two-thirds. In separate chaining the load factor can rise above 1 without hurting performance very much. This makes separate chaining a more robust mechanism, especially when it’s hard to predict in advance how much data will be placed in the hash table.
Another approach similar to separate chaining is to use an array at each location in the hash table, instead of a linked list. Such arrays are sometimes called buckets.
Hash Functions
Quick Computation
Random Keys
Use a Prime Number for the Modulo Base
Miscellaneous
1. Insertion sort is preferable to quicksort for small files and for almost-sorted files.
2. Some sorting algorithm that retains secondary order is said to be stable
1. In unordered array the insertion is fast Q(1), but search takes slow O(N) time.
2. In ordered array, you can search quickly, in O(logN) time, but insertion takes O(N) time.
For both above kinds, the deletion takes O(N) time
Another problem with arrays is that their size is fixed when they are first created with new.
Simple Sorts
Bubble sort
Bubble sort makes about Square(N)/2 comparisons. Both swaps and comparisons are proportional to N2. Bubble sort runs in O(N2) time.
Selection Sort
The Selection sort improves on bubble sort by reducing the number of swaps necessary from O(N2) to O(N). Unfortunately, the number of comparisons remains O(N2).
Insertion sort executes twice as fast as bubble sort and somewhat faster that selection sort in normal situation. Insertion Sort runs in O(N2) for random data. For data already sorted (O(N)) or almost sorted, the insertion sort does much better. However, for data, that is reverse sorted, this runs no faster that bubble sort.
Comparisons of simple sorts
There’s probably no point in using the bubble sort, unless you don’t have your algorithm book handy. The bubble sort is so simple that you can write it from memory. Even so, it’s practical only if the amount of data is small.
The selection sort minimizes the number of swaps, but the number of comparisons is still high. This sort might be useful when the amount of data is small and swapping data items is very time-consuming compared with comparing them.
The insertion sort is the most versatile of the three and is the best bet in most situations, assuming the amount of data is small or the data is almost sorted. For larger amounts of data, quicksort is generally considered the fastest approach;
We’ve compared the sorting algorithms in terms of speed. Another consideration for any algorithm is how much memory space it needs. All three of the algorithms in this chapter carry out their sort in place, meaning that, besides the initial array, very little extra memory is required. All the sorts require an extra variable to store an item temporarily while it’s being swapped.
Stacks and Queues
There are basically three operations that can be performed on stacks . They are 1) inserting an item into a stack (push). 2) deleting an item from the stack (pop). 3) displaying the contents of the stack(pip).
Below are some of operations a stack data type normally supports:
Push, pop, top, isempty, isfull, get-size,
All operations except
get-size()
can be performed in time. get-size()
runs in at worst
Queue, Circular queue.
As with stacks, items can be inserted and removed from a queue in O(1) time.
Deques
Priority Queue
A de
Linked list
double ended list
Shell sort run in about O(N*(logN)2) The Shellsort is good for medium-sized arrays, perhaps up to a few thousand items, depending on the particular implementation.
- What is Gap sequence
Quick sort in about O(N*logN) timeRed-Black Rules
1. Every node is either red or black.
2. The root is always black.
3. If a node is red, its children must be black (although the converse isn’t necessarily true).
4. Every path from the root to a leaf, or to a null child, must contain the same number of black nodes.
The number of black nodes on a path from root to leaf is called the black height.
Fixing Rule Violations
- You can change the colors of nodes.
- You can perform rotations.
Color Flips on the Way Down
Here’s the rule: Every time the insertion routine encounters a black node that has two red children, it must
change the children to black and the parent to red (unless the parent is the root, which always remains black).
Like ordinary binary search trees, a red-black tree allows for searching, insertion, and deletion in O(log2N) time.
Hash Tables
A hash table is a data structure that offers very fast insertion and searching. Hash tables are significantly faster than trees, which, as we learned in the preceding chapters, operate in relatively fast O(logN) time. Not only are they fast, hash tables are relatively easy to program.
Hash tables do have several disadvantages. They’re based on arrays, and arrays are difficult to expand after they’ve been created. For some kinds of hash tables, performance may degrade catastrophically when a table becomes too full, so the programmer needs to have a fairly accurate idea of how many data items will need to be stored (or be prepared to periodically transfer data to a larger hash table, a time-consuming process). Also, there’s no convenient way to visit the items in a hash table in any kind of order
However, if you don’t need to visit items in order, and you can predict in advance the size of your database, hash tables are unparalleled in speed and convenience.
Open Addressing
- Linear Probing
- Quadratic Probing
- Double Hashing
Double Hashing
Experience has shown that this secondary hash function must have certain characteristics:
- It must not be the same as the primary hash function.
- It must never output a 0 (otherwise, there would be no step; every probe would land on the same cell, and the algorithm would go into an endless loop).
stepSize = constant - (key % constant); where constant is prime and smaller than the array size.
Double hashing requires that the size of the hash table is a prime number.
Separate Chaining
In open addressing, performance degrades badly as the load factor increases above one-half or two-thirds. In separate chaining the load factor can rise above 1 without hurting performance very much. This makes separate chaining a more robust mechanism, especially when it’s hard to predict in advance how much data will be placed in the hash table.
Another approach similar to separate chaining is to use an array at each location in the hash table, instead of a linked list. Such arrays are sometimes called buckets.
Hash Functions
Quick Computation
Random Keys
Use a Prime Number for the Modulo Base
Miscellaneous
1. Insertion sort is preferable to quicksort for small files and for almost-sorted files.
2. Some sorting algorithm that retains secondary order is said to be stable