Monday, August 16, 2010
The “conclusion” drawn from the article is that, given
the strategy with the best expected value is to skip past the first
sqrt(N) - 1 candidates and then choose the next “best so far”.
Translated loosely into a sorting algorithm, it goes something like this:
Ncandidates, calculate the number to skip.
- Find the “best” candidate within the skip distance.
- Move all the better candidates beyond the skip distance to the end.
Nby the number of candidates moved.
- Repeat from Step 1 until we run out of candidates.
- Perform insertion sort.
The marriage sort algorithm is not particularly fast, with a runtime of O(n1.5), but sorting algorithms are fundamental to computing, so I thought it would be fun to implement in Factor.
Note: Factor comes with some sorting algorithms. The
vocabulary implements merge sort and the
implements an in-place insertion sort.
First, some vocabularies and a namespace (we will be using locals to implement a couple of the words):
USING: kernel locals math math.functions sequences sorting.insertion ; IN: sorting.marriage
We can take the loose algorithm and structure the
leaving the bulk of the work for the
(marriage-sort) inner loop:
: marriage-sort ( seq -- ) dup length [ dup sqrt 1 - >fixnum dup 0 > ] [ (marriage-sort) ] while 2drop [ ] insertion-sort ;
We’ll need to find the index of the maximum element in a range:
:: find-max ( from to seq -- i ) from to >= [ f ] [ from from 1 + [ dup to < ] [ 2dup [ seq nth ] bi@ < [ nip dup ] when 1 + ] while drop ] if ;
That leaves the
(marriage-sort) word (probably more complex than
necessary, but it works):
:: (marriage-sort) ( seq end skip -- seq end' ) 0 skip seq find-max skip end [ 2dup < ] [ 2over [ seq nth ] bi@ <= [ 1 - [ seq exchange ] 2keep ] [ [ 1 + ] dip ] if ] while nip 1 - [ seq exchange seq ] keep ;
Some performance numbers (given a 10,000 element random array):
IN: scratchpad 10000 [ random-32 ] replicate IN: scratchpad dup clone [ natural-sort drop ] time Running time: 0.004123694 seconds IN: scratchpad dup clone [ marriage-sort ] time Running time: 0.063077446 seconds IN: scratchpad dup clone [ [ ] insertion-sort ] time Running time: 10.972027614 seconds
As you can see, slower than
natural-sort (which uses merge sort), but
much faster than
insertion-sort, with similar in-place semantics. It’s
worth noting that the code for
insertion-sort seems a little slow and
could probably be sped up quite a bit.
The code and some tests for this is on my GitHub.