Why would we want to do this when Java already has everything we need?
There are severals goals:
learn how to implement the operations
understand the significance of one implementation
choice versus another.
use our versions to obtain counts of operation executions
in order to create an empirical analysis of
the effect of using these data structures in applications
In this document we will
focusing on the ArrayList class.
We will follow a method suggested by these steps:
Create illustrative examples to see the ArrayList member
functions work.
Create a "dummy" ArrayList class,
called AListAdapter,
which implements all the relevant interfaces and
member functions that ArrayList implements, but
with implementation functions which effectively say "not implemented."
Create an extension of
AListAdapter
called AList in which actually implement
the member functions. Because this class is an extensions
of the "working" class, AListAdapter,
there is no need to redefine every function, only those
of interest to us.
NetBeans Project
Create a NetBeans project AListDemo (with Main class).
This will
automatically create the package alistdemo in which we'll create
the necessary classes.
ArrayList First Example
alistdemo.Main
package alistdemo;
import java.util.*;
public class Main {
public static void main(String[] args) {
main1(args);
}
public static void main1(String[] args) {
List<String> L = new ArrayList<String>();
for (String s: new String[] {"aaa", "bbb", "ccc", "ddd"} ) {
System.out.println(L.add(s));
}
System.out.println(L.size());
System.out.println(L);
L.add("bbb");
System.out.println(L.size());
for (String s: L) {
System.out.println(s + " ");
}
System.out.println();
System.out.println(L.get(2));
System.out.println(L.set(2, "BBB"));
System.out.println(L);
L.add(2, "xxx");
System.out.println(L);
L.add(L.size(), "yyy"); // same effect as L.add("yyy")
System.out.println(L);
// L.add(L.size()+1, "zzz"); // this would throw an error
System.out.println("search for \"bbb\": " + L.indexOf("bbb"));
System.out.println("search for \"bbb\": " + L.lastIndexOf("bbb"));
System.out.println("search for \"ccc\": " + L.contains("ccc"));
System.out.println(L.remove("bbb"));
System.out.println(L);
System.out.println("remove(2): " + L.remove(2));
System.out.println(L);
L.clear();
System.out.println(L.isEmpty());
}
}
Observe how we express the list declaration operation:
List<String> L = new AList<String>();
Abstractly we might say
the statement is of the form:
Interface obj = new Implementation();
We conceivable could use this:
ArrayList<String> L = new AList<String>();
In general the former declaration is preferred because it allows the application
to change the implementation since it uses only operations specified by
the implementation. If there is some ArrayList-specific operation we wish to
use we would need to cast the object:
((ArrayList) L).operation
Strings from a website
For a more interesting example we want to get a list of string to work
with from the web.
Start by creating two Java Class in the util
package:
Class Name: DocCatcher
package: util
Class Name: WordProcess
package: util
Copy the contents into the classes:
util.DocCatcher
package util;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.HTML;
import java.io.InputStreamReader;
import java.io.IOException;
import java.net.URL;
import java.net.MalformedURLException;
public class DocCatcher {
public static String getBodyContent(String urlstr)
throws MalformedURLException, IOException {
/*
* The following convoluted code is necessary because getParser()
* is a protected method in HTMLEditorKit.
* Here we are creating an anonymous extension of HTMLEditorKit in
* which the getParser method calls the same method of the superclass.
*/
HTMLEditorKit.Parser parser = new HTMLEditorKit() {
@Override
public HTMLEditorKit.Parser getParser() {
return super.getParser();
}
}.getParser();
class DocStatus {
public String content = "";
public boolean body_started = false;
}
final DocStatus status = new DocStatus();
HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
// handle the tags: look for the BODY tag
@Override
public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
if (t == HTML.Tag.BODY) {
status.body_started = true;
}
}
// handle the text between tags: concatenate all text after BODY tag
@Override
public void handleText(char[] text, int position) {
if (status.body_started) {
status.content += String.valueOf(text) + " ";
}
}
};
URL url = new URL(urlstr);
InputStreamReader r = new InputStreamReader(url.openStream());
parser.parse(r, callback, true);
return status.content;
}
}
and
util.WordProcess
package util;
import java.util.*;
import java.util.regex.*; // Matcher, Pattern
public class WordProcess {
public static void getWordsFromURL(
List<String> words, String url) throws Exception {
getWordsFromURL(words, url, Integer.MAX_VALUE);
}
public static void getWordsFromURL(
List<String> words, String url, int max_allowed) throws Exception
{
// a letter followed by 3 or more word characters, optionally
// followed by a single (non-word) character, like punctuation.
String filter = "([A-Za-z]\\w{3,}).?";
String content = DocCatcher.getBodyContent(url);
Matcher match = Pattern.compile(filter).matcher(content);
while (match.find() && words.size() < max_allowed)
{
String next = match.group(1); // first parenthesized group
next = next.toLowerCase();
words.add(next);
}
}
public static void printInRows(List<String> L, int maxrowsize) {
int colpos = 0;
for (String s : L) // this form uses L.iterator() !
{
if (s.length() > maxrowsize) {
System.err.println("\nstring" + s + " is too big\n");
return;
}
if (colpos + s.length() > maxrowsize) {
System.out.println();
colpos = 0;
}
System.out.print(s + " ");
colpos += s.length() + 1;
}
System.out.println();
}
}
Revised main function
In Main, add the import statement:
import util.WordProcess;
and add the code of the main2 function to the Main class:
Rewrite main so that it calls main2
(comment out the previous call):
public static void main(String[] args) {
// main1(args);
main2(args);
}
and then test-run.
AListAdapter
Our AListAdapter demonstrates all the functions
satisfied by ArrayList. The functions are all "unimplemented"
in the sense that their only code is to throw an
UnsupportedOperationException.
Our user-defined version will be an extension
of AListAdapter.
Creating this Adapter makes
it easy to create our own List implementations
without full functionality by simply extending this class.
We should also add "implements Serializable", which then requires
the definition of the mysterious data member:
public static final long serialVersionUID = <some value>;
in this and all other derived classes to avoid a Java compiler warning.
We'll omit for convenience since we do not plan to use the
Serializable aspect.
ArrayList also implements the
Collection and Iterable interfaces.
These are already implied by virtue of being
super-interfaces of List.
The RandomAccess interface requires no implementations and is
only used to guarantee for users of this class that "get" is
O(1).
In order to create this, create a new Java Class:
Class Name: AListAdapter
package: alist
Then insert the following content:
( click to show )
ArrayList Implementation
The basic idea for an array-based list
is that we want a wrapper around an array.
In order to create this, create a new Java Class:
Class Name: AList
package: alist
Then insert the following content:
( click to show )
Test drive
In Main, add the import statement:
import alist.AList;
Make this code
replacement in both main1 and main2 and then
run each of these main programs:
//List <String> L = new ArrayList<String>();
List <String> L = new AList<String>();
Implementation Features
Basic Array handling features
The array must have a certain capacity
as well as a current size.
The private data used to handle that are these:
private int capacity;
private int size = 0;
private E[] data;
It is assumed that all member
functions keep track of size correctly
so that we can give these definitions:
public int size() { return size; }
public boolean isEmpty() { return size == 0; }
public void clear() { size = 0; }
The primary constructor is this:
public AList(int capacity) {
this.capacity = capacity;
data = (E[]) new Object[capacity];
}
Observe how we must deal with creating an array of the unknown
generic type E.
If a user doesn't care to specify the capacity, these are used:
private static final int DEFAULT_CAPACITY = 10;
public AList() { this(DEFAULT_CAPACITY); }
The most basic capacity handler is a private function which
creates a new array and copies over the old one:
private void increaseCapacity(int new_capacity) {
if (new_capacity <= capacity) {
return;
}
E[] new_data = (E[]) new Object[new_capacity];
for (int i = 0; i < capacity; ++i) {
new_data[i] = data[i];
}
capacity = new_capacity;
data = new_data;
}
The most basic member function is the add function
which adds to the array's end:
public boolean add(E elt) {
if (size == capacity) {
increaseCapacity(capacity*2);
}
data[size++] = elt;
return true;
}
It is a design decision of sorts as to how much to increase the
capacity, but it turns out to be more efficient to use exponential
increments than arithmetic increments.
Index-related operations
One of the key notions of a List is that it keeps track of the
position of elements.
The search operations use a linear
searches front to rear or vice-versa
since there is no presumed structure on the data,
public int indexOf(Object obj) {
for (int i=0; i < size; ++i) {
if (data[i].equals(obj)) return i;
}
return -1;
}
public int lastIndexOf(Object obj) {
for (int i=size; i > 0; --i) {
if (data[i-1].equals(obj)) return i-1;
}
return -1;
}
public boolean contains(Object obj) {
return indexOf(obj) != -1;
}
The other operations are insertion or removal at a position
which causes the data to the right of the insert or remove position
to be shifted right or left, respectively.
public void add(int index, E elt) {
if (index > size || index < 0) {
throw new
IndexOutOfBoundsException("Index: " + index + ", " + "Size: " + size);
}
if (size == capacity) {
increaseCapacity(capacity*2);
}
for (int i = size; i > index; --i) { // shift these up
data[i] = data[i-1];
}
data[index] = elt; // insert this one
++size;
}
public E remove(int index) {
if (index >= size || index < 0) {
throw new
IndexOutOfBoundsException("Index: " + index + ", " + "Size: " + size);
}
E saved = data[size];
for (int i=index; i < size-1; ++i) { // shift elements down
data[i] = data[i+1];
}
--size;
return saved;
}
public boolean remove(Object o) {
int i = indexOf(o);
if (i == -1) return false;
remove(i);
return true;
}
The iterator and other support operations
The list's iterator member function
is used implicitly in constructions like this:
List<String< L = new AList<String>();
// add stuff to the list
for (String s: L) {
// the iterator used here to get successive values of s
}
A similar thing can be said about printing a list (or any object):
System.out.println( L );
calls the toString function implicitly.
In order to handle these situations we use the following functions:
public Object[] toArray() { return Arrays.copyOf(data, size); }
public String toString() { return Arrays.toString(toArray()); }
public ListIterator<E> listIterator() {
return Arrays.asList( (E[]) toArray() ).listIterator();
}
public Iterator<E> iterator() { return listIterator(); }
The textbook goes to pains to illustrate how to define the
iterator.
We have side-stepped the issue by:
using the Arrays.asList operator to define
the listIterator function
defining the iterator function as the result of the
listIterator function
Creating the iterator from scratch
If we want to define these ourselves, we can either
defined the iterator function which returns an
implementation of the Iterator interface with these functions:
public boolean hasNext()
public E next()
public void remove()
or define the listIterator function which returns
an implementation of the extended ListIterator interface
with these additional functions:
public int nextIndex()
public E previous()
public int previousIndex()
public void add(E elt)
public void set(E elt)
public boolean hasPrevious()
The add and remove
operations are potentially the most difficult,
but, according to the Java documentation, these operations are
"optional", which means that we can implement them by saying that
they're not implemented, which in Java terms means:
throw new UnsupportedOperationException();
The other point noted in the textbook is that the
implementation class needs to be an inner class
so that it can access the private data. Since this class
need never be reused, we'll make it an anonymous inner class,
giving us a definition like this:
public ListIterator<E> listIterator() {
return new ListIterator<E>() {
private int index = 0;
public boolean hasNext() { return index < size; }
public E next() { return data[index++]; }
public int nextIndex() { return index+1; }
public int previousIndex() { return index-1; }
public E previous() { return data[--index]; }
public void set(E elt) { data[index] = e; }
public boolean hasPrevious() { return index > 0; }
public void remove() { throw new UnsupportedOperationException(); }
public void add(E elt) { throw new UnsupportedOperationException(); }
};
}