A data structure
is encapsulation of data and relevant
operations. It is referred to as an
Abstract Data Type (ADT), signifying that the actual implementation
is left open, subject to satisfying the intended meaning of the
For the most part the exact data representation is not presented;
rather, all access to the data is through the operations.
The Java class is a perfect model for an ADT.
In addition to the encapsulation with accessibility via operations
only, the presentation of operations are in the modern
fashion in which the data takes precedence over
According to this definition, every Java class is a data structure.
However, classically, we have particular
interest in the classes which
are often termed to be part of the Java collections
In Java terms these are contained in the
package. In particular, they are
aggregate data structures typically representing unbounded
structures whose size is parameterized
The focus of interest to this class is:
- Building data structures.
- what data to use
- what operations are of interest
- how to implement the operations
- Comparing and contrasting the efficiency of different
- Using data structures to solve difficult algorithmic problems.
There are many factors involved in these pursuits, but we can
often pin it down to three things:
- Speed (run-time).
- Space usage.
This really should be the first consideration. We say time is
money, but the "time" may be the amount of time you spend on
something, not the time it takes to execute.
- KISS philosophy: Keep It Simple and Stupid
- Always use the simplest data structure (dumbest?) first.
Try to get a working model.
Then build on the model, to make it faster, etc.
Don't spend time making something complex so as to
speed it up if it won't greatly improve the overall speed.
Keep in mind the utility and possible life-cycle of this code
so that you're not wasting your time.
This is the second consideration: try to improve the overall speed.
There are several considerations:
Profiling and tweaking. Try to speed up the most costly operations.
For example, in a database algorithm that involves disk and RAM
operations, the disc operations are at least 1000 times as
costly as the RAM operations, so we want to
focus on making the disc operations less costly.
If you identify some time-consuming operation, look for an
order-of-magnitude improvement, e.g., going from quadratic
time to linear time. This is difficult and involves
finding a better data structure and/or better algorithm.
Advanced data structures usually require extra storage to
hold information about the structure which can be used for
efficient processing. Usually this features is considered
last, but in data-intensive data structures, say, for
a database, it is crucial.
Algorithm Analysis is usually about the "speed" of an algorithm.
How do we measure speed?
- real or wall-clock time
- abstract time
By abstract time
we mean that we decide on some programming
event to count regardless of what constitutes
this event while ignoring various details.
The other issue when measuring speed is that the speed may depend
on the input in which case there are possibly three things of
interest to us:
- best case
- worst case
- average case
The best case is generally not of interest; its primary value is
to ascertain the minimum amount of work an algorithm must do regardless
of its input.
The worst case is often the thing we analyze first, because it represents
an upper bound on the time. This, however, is not the complete story,
because it represents the behavior of a single
some cases, an operation repeated multiple times will have a better
worst-case behavior than any single operation. Such worst-case
analysis is referred to as the amortized
The average case is often a good judge of what to expect in a
"generic" situation. This analysis is usually the most challenging and
often summons deep layers of mathematics to study it. The other issue
is how to characterize a generic situation. Usually the assumption is
that events are "equally likely", but that is often not the case.
For example in a text search, say, to count the occurrences of certain words,
one typically takes advantage of the well-known fact that certain words occur
much more frequently than others.