Jumat, 06 Juli 2007

linkedlist vs arraylist hu.. hu... *_*

linkedlist vs arraylist

In my three years of talking in #java on efnet I have noticed the abundance of people using linked lists. I too have in my naive youth been taken in by the seemed elegance and flexibility of this data structure. It is an illusion. Linked lists are pretty much always crap. In all my years of programming I have yet to see a linked list used properly.

I will assume from here on that we are talking about SUNs java implementation on a 32 bit platform (I've used 1.4.1 on a Intel P4). First a few numbers:
- Object overhead (called OO from now on) is 4 bytes.
- Reference overhead (called RO from now on) is 4 bytes;

Memory

The ArrayList has a growth algorithm of (n*3)/2+1, meaning that each time the buffer is too small it will create a new one of size (n*3)/2+1 where n is the number of elements of the current buffer. The consequence of this growth pattern is that the amount of memory an ArrayList uses isn't totally trivial to calculate, but it's easy to calculate the worst and best cases: best case is OO+RO+n*RO and worst case is OO+RO+((n*3)/2+1)*RO bytes. For 100'000 objects these are 600'012 and 400'008 respectively. This means ArrayList memory consumption will always lie in between these limits.

LinkedList on the other hand has a simple growth pattern of just adding and removing nodes when it needs to. LinkedList is a doubly linked list so it has one reference for the previous object and one for the next (in addition to the reference to the actual data of course), furthermore the top level object keeps two references: one for the last and one for the first object to speed up adding and removing at the ends. The memory consumption for LinkedList is thus: OO+2*RO+(OO+3*RO)*n. For 100k objects this means 1'600'012 bytes.

The Vector growth pattern is N*2 and is included in the graph for completeness.

Speed

The important thing to remember when comparing LinkedList and ArrayList is that linked lists are more efficient speed wise when inserting and removing at random places in the list multiple times (note though that this advantage of linked lists can quickly be lost because of slow search times). If you're just adding to the end of the list, an ArrayList is what you want. Linked lists will be much slower due to the allocation time of each new node in the list.

A simple benchmark on my machine gave me the times 438 ms for ArrayList to add 200'000 objects and 1484 ms for LinkedList (source). If the ArrayList doesn't need to resize due to you reusing the buffer this time will go down to about 80 ms. This shows that using ArrayList and also reusing your buffers is a good idea.

1 komentar:

madhur mengatakan...

Hi,
Great Explanation. Could you please suggest me a book for "behind the scenes" working of java collections like linked list and arraylist etc.