Data Structure Description | Priority Queue Operations + Running Times
Empirical Running Times on Provided Data | Best/Worst Case Inputs
Comparison to other data structures | References/Credits

Data Structure Description

Our group chose to implement our priority queue using Splay Trees, a type of binary tree invented by Robert Tarjan and Daniel Sleator in 1985. Splay trees are a self-adjusting data structure which have the unique property that nodes which are frequently accessed are kept near the root of the tree and thus efficiency in accessing them is increased.

To improve upon this model, we added hashes to cache and provide immediate access to nodes when performing inserts on the same priority value. How is this possible since newPriority returns seemingly random numbers, even when priorities are the same? We have implemented a caching scheme for calls to newPriority. Now, before making a call to this function in the provided library, we look in a hash table cache to see if we have called this function before with the same cpuUsage and niceness. If so, then we use the value in the cache instead of calling the function. Our goal in this can be explained as follows: if (compare(p1, p2) == 0), then we would like it to be the case that p1 == p2, a property we can easily determine without an expensive compare function call. Using a hash table lookup ensures that p1 == p2 for the same priority. Our hash is a fixed size, however, and only stores 5 entries per bucket and up to maxNodes entries total. So if a bucket is filled a translation may be evicted. The entry to evict is chosen randomly (thus making our number of compares non-deterministic). Because this is meant to be a cache, however, all that means is the translation will have to be re-entered later.

We have another cache as well, which hashes priorities to nodes in our splay tree. Because it is now the case that nodes with the same priority have the same p1 value, this mapping is valid and saves time in tree traversal.

Also, if it is the case that a different cpuUsage and niceness give the same priority value, when we find the appropriate node we convert the assigned priority value to the cached priority value.

Further, we in truth have a multi-level splay tree, for which the master tree stores nodes corresponding to different priorities, and at each node is a pointer to a sub-tree which corresponds to jobs with the same priority (but different job ids). In this way, we can keep our master tree as small as possible save on priority comparisons.

We implemented our splay tree using the bottom-up method, because we found that the top-down implementation had slightly more comparisons than top-down. Our job table is simply a direct-mapped array whose elements, corresponding to tree nodes, are all malloc'd at the beginning, creating a pool which is drawn from during the course of the program. We have also implemented our delete node using one splay instead of two, which saves on comparisons. Finally, many of our functions are declared as inline in order to save on function calls.

We have two functions we are using for hashing in our hash caches: one which uses Knuth's recommended constant of (sqrt(5) - 1) / 2 for hashing ints (found in CLR chapter 12), the other which just uses mod division for hashing long longs.

Priority Queue Operations + Running Times

Our insert operation performs in the following way:
It first tries to find the priority given the cpuUsage and niceness in our hash cache. If it succeeds, it looks up in another hash, using its priority as a key, for a pointer to the node in our master splay tree which contains another tree of all nodes with the same priority. If it can't find that priority in the cache, then it calls newPriority, inserts the value it gets into the cache, and does a standard btree insert on the master tree, until it gets to the node or location corresponding to its priority. It creates a node if none yet exists for that priority, or inserts the new node into that sub-splay tree of the existing node using a standard binary tree insert. It then splays the new node in the sub-tree to the top of the tree. It also splays the corresponding node in the master tree to the top of the tree.

We did not bother to implement an Increase-Key function, because there is no benefit in splay trees. Instead, we merely delete the node from the tree, adjust its priority, and reinsert the node into the tree.

When we age all of the priorities, we use a stack-based tree traversal in order to save on function calls and space, and destroy and completely reconstruct the splay tree. With our multi-levels of splay trees, it makes things somewhat more complicated, but complete data structure recreation is the main idea. Thus, our worst-case running time for aging is n * O(lg n) = O(n lg n)

Delete works by first disconnecting the left and right children of the node to be deleted. Then the join operation is performed on the left and right subtrees. That tree is then reconnected with the deleted node's parent. The join operation combines two trees, where all elements in the left tree are less than all elements in the right tree. It works by first finding the maximum in the left tree, thus splaying the largest value to the top of the left tree. Then the right tree is set as the right child of this tree. Finally, a splay is performed on the parent of the deleted node.

Again, while deleting is more complicated with our multi-level splay tree, this description can be applied recursively to both levels.

To find the maximum, we have two versions. In the simple, bottom-up version, the maximum is found simply by following right child pointers down the tree. Then the tree is splayed on the found maximum node. In the top-down version, the tree is splayed as we go down the tree. We disabled the top down version because it was somewhat slower than bottom up (in spite of some of the things we read which indicate otherwise...?)

Extract-max is like maximum but it first removes the maximum, replacing it with the left child of the maximum node and then splaying on that node.

We should now point out that all operations -- insert, delete, find, and extract max -- depend on the splay function. Demonstrating that the splay function runs in O(lg n) amortized time will show that all of these operations run in O(lg n) amortized time. In our reference section are the class notes for CS 6998, by Chris Okasaki, at Columbia University. According to the notes, the proof is "probably one the hardest" of the class due to its use of a "magic" potential function. Feel free to inspect the proof, but needless to say the end result is the demonstration that all operations do run in amortized O(lg n) time. An extensive formal proof of the amortized bounds of splay trees can be found in Tarjan and Sleator's paper published in the ACM in 1985; see References/Credits for more info.

We can point out, according to Tarjan and Sleator's "access lemma" (Sleator and Tarjan, p 658), that the constant factor hidden within this big-o amortized cost is:

splay(k,t) <= 3*rank(t)
where rank(t) = lg (|t| + 1)
Or the 3 * lg(size of the tree + 1). This constant factor of 3 is not very large at all, and is indicative of the benefit of using splay trees.


Empirical Running Times on Provided Data

Test File Splash Tree compares* Linked List compares* Splash Tree time* Linked List time*
test.100.a 14203 967642 0.54u (7.3% of LL) 7.43u
test.100.b 14132 968434 0.54u (7.3% of LL) 7.37u
test.500.a 40389 27787093 2.67u (1.2% of LL) 218.04u
test.500.a.rev 39959 27781669 2.67u (0.9% of LL) 299.86u
test.1000.a 161818 151071987 8.88u (0.8% of LL) 1106.40u
test.10000.a 795152 1206286 19.67u (95.6% of LL) 20.57u

* All time tests were run on epic or saga machines with no other significant (>10% CPU) processes running with the 'time' program. The times listed are user times, which should be deterministic regardless of server load. Also note that because our code has an element of randomness in its cache-replacement, the number of compares varies for each run, but the amount by which it varies is statistically insignificant.


Best/Worst Case Inputs

The worst case input for our splay tree priority queue is essentially a series of inserts in increasing priority order, followed by an access of the first job inserted. That is, the set
{1 2 3 4 5 6 7 8 9}
would look like Figure 1 after all the inserts. When job 1 was requested, the tree will have to traverse every node which will take O(n) time. Further, our tree will splay the node 1 to the top, which will touch every node from it to the root and again be O(n). After the node 1 is splayed to the top, it will look like figure 2. But now another access to node 1 will be immediate. Moreover, the tree is now only about half as deep as it used to be after splaying, so the worst-case time for an access to the bottom of the tree has been reduced.

As an interesting side note, if we remove the splay function from Insert and Delete, our code runs noticeable slower and with many more comparisons. From an empirical standpoint, then, the benefits of splaying are clearly measurable.

Thus, the individual worst-case behavior for any operation on a splay tree is O(n) as we demonstrated in our hypothetical data set, but amortized worst-case behavior of a splay tree of size n is O(lg n).


                 9 
               8 
             7
           6
         5
       4
     3
   2
 1
                 
         1
           8 
         6   9
       4   7
     2   5
       3

FIGURE 1FIGURE 2

An absolute best-case example is a bit more difficult to construct. Basically, any operations which are on the same node or small subset of nodes within the tree will be optimal. For example, if you insert a node which happens to be the max and then call Maximum, that node will have been splayed to the root so it can be found immediately. Other nodes recently inserted will also be near the root, so they too can be found quickly. Calls to Extract-Max instead of maximum right after an insert will also be fast, but somewhat less so because they must splay on the left subtree of the max node. Essentially, splay trees are like data structures with a built-in cache, so any sequence of operations which are on recently used nodes will be fast. Otherwise, the running time will be amortized O(lg n).

Additionally, for our data structure, nodes which we insert whose assigned priority we have cached will be faster than nodes we insert whose priority we have not yet cached. The reason is as follows: if we have already seen the priority, we can jump directly to the node in our master splay tree using our hash cache, and insert in the sub-splay tree, which is considerably smaller. This requires zero calls to the provided compare function, and results in tremendous time savings (although is still worst case amortized O(lg n) if every job had the same priority and thus mapped to the same node in our master splay tree).

If we have not cached the priority, however, we will need to do a call to newPriority, and then find the appropriate node in our master splay tree using a series of calls to the provided compare function, before we find the appropriate sub-tree. As expected, this takes longer & runs slower.

Finally, as an aside, we found that splaying, in practice, is a very computationally inexpensive operation and it greatly reduces the number of compares. By keeping the tree well-balanced, we can get an optimal shape like heaps but with the additional information of binary trees.


Comparison to other data structures

Our group spent several days researching different data-structure possibilities, and our choice of splay trees is based on several factors.

Mainly, we chose splay trees because of a paper submitted to Journal of the ACM by Douglas Jones in 1986 entitled "An Empirical Comparison of Priority Queue and Event Set Implementations." His findings concluded that splay trees ranked consistently among the top three of 11 different data structures based on different types of data distributions for priority queue implementations. He compared splay trees to, among other structures, linked lists, implicit heaps, leftist trees, binomial queues, pagodas, skew heaps, and pairing heaps. He did not, however, compare them to Fibonacci heaps, most likely because the seminal work on Fibonacci heaps was released one year later.

Further, according to the class notes for CS 6998 at Columbia University, pairing heaps "are interesting because they are both much simpler and much faster than Fibonacci heaps on real data" for priority queues. Thus, we have the transitive relation: fibonacci heaps < pairing heaps < splay trees. In addition, CLR seems to pan Fibonacci heaps when they describe their large constant factors and use purely for "theoretical" applications. Fibonacci heaps are also a very complicated data structure that looked difficult to implement. Although Fibonacci heaps have O(1) time for all operations except Extract-Min and Delete, it turns out in our scheduler that we are extracting and deleting quite a bit. Splay trees are also not biased to one type of operation, and don't have a hidden constant factor that makes them impractical for small sets. Thus, since splay trees beat out all of the other suggested data structures, by transitivity were faster that Fibonacci heaps, are practically much easier to implement, and tend not to be too punishing or biased towards one kind of operation, we chose them to implement our priority queue.

Moreover, articles by Dean Clark and Andrew M. Liao in Dr. Dobb's Journal and Kent Williams in Computer Language provided anecdotal evidence of the quality and efficiency of splay trees. The O(lg n), but probably less, running time was very appealing, and the fact that they seem to cache data fits in with the cardinal rule of CS as taught in CS 140: Operating Systems -- the only reason computers work today is caching.

But the real reason we selected splay trees is that they were not on the list of suggested data structures and they were not heaps. We figured that everyone is going to do either normal or fibonacci heaps, and that you guys would get very bored of reading and grading the same type of structure over and over and over. So here's to a little variety! Also splay trees have the coolest name.

We should add that one structure we considered but did not have time to try and implement were treaps, or tree-heaps. They are essentially a hybrid, and enforce balanced trees using a second heap field along with a regular tree key. These structures are fairly new, however, and we could not find much theoretical or empirical analysis on this type of structure. So we opted to stick with splay trees.


References/Credits

We spent quite a bit of time researching and owe our findings to many different sources. They include:

Clark, Dean. "Splay trees: for fast access of frequently accessed data. (Tutorial)" Dr. Dobb's Journal v17, n12 (Dec, 1992):56. Online. Available URL (via search through Melvyl): http://www.dbs.cdlib.org/?CSdb=comp

Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms , McGraw-Hill, 1997.

Jones, Douglas. "An Empirical Comparison of Priority Queue and Event Set Implementations." Journal of the ACM. Apr. 1986, Vol. 29 No. 4. Online. Available URL: http://www.acm.org/pubs/articles/journals/cacm/1986-29-4/p300-jones/p300-jones.pdf

Liao, Andrew M. "Self-adjusting data structures: use self-adjusting heuristics to improve the performance of your applications. (technical)" Dr. Dobb's Journal v15, n2 (Feb, 1990):44. Online. Available URL (via search through Melvyl): http://www.dbs.cdlib.org/?CSdb=comp

Nilsson, Stephan. "Treaps in Java." Dr. Dobb's Journal . July 1997. Online. Available URL: http://www.ddj.com/articles/1997/9707/9707e/9707e.htm

Okasaki, Chris. Advanced Data Structures, Lectures 7 and 8: Splay Trees and Pairing Heaps. Oct. 28 and Nov. 4, 1999. Online. Accessed 26 Feb. 2000. Available URL: http://www.cs.columbia.edu/~cdo/6998/

Sleator, Daniel. Notes on Splay Trees. Oct. 8, 1998. Online. Accessed 29 Feb. 2000. Available URL: http://www.cs.cmu.edu:80/afs/cs.cmu.edu/academic/class/15750-s99/www/handouts/splay.ps

Sleator, Daniel, and Robert Tarjan, "Self-Adjusting Binary Search Tress," Journal of the Association for Computing Machinery. 32(3): 652-686, July 1985

Weiss, Mark Alan. Data Structures and Algorithm Analysis in C: Source Code . Second Ed. Online. Available URL: http://www.cs.fiu.edu/~weiss/dsaa_c2e/files.html (Note that we used this source code to analyze treaps, and as a guide for top-down splaying. We did not use any of the source code to implement our splay tree).

Williams, Kent. "Smart searching with C++. (self-optimizing binary search trees implemented using a C++ class) (tutorial)." Computer Language v7, n9 (Sept, 1990):45 (9 pages). Online. Available URL (via search through Melvyl): http://www.dbs.cdlib.org/?CSdb=comp



Last Updated Mon, Mar. 6 2000