## Marriage Sort

Monday, August 16, 2010

Several months ago, someone introduced a sorting algorithm called “Marriage Sort”. The inspiration for it came from an article analyzing how to (mathematically) select the best wife/husband.

The “conclusion” drawn from the article is that, given `N`

candidates,
the strategy with the best expected value is to skip past the first
`sqrt(N) - 1`

candidates and then choose the next “best so far”.

Translated loosely into a sorting algorithm, it goes something like this:

- Given
`N`

candidates, calculate the number to skip. - Find the “best” candidate within the skip distance.
- Move all the better candidates beyond the skip distance to the end.
- Reduce
`N`

by the number of candidates moved. - Repeat from Step 1 until we run out of candidates.
- Perform insertion sort.

The marriage sort algorithm is not particularly fast, with a runtime of
O(n^{1.5}), but sorting algorithms are fundamental to computing,
so I thought it would be fun to implement in
Factor.

*Note: Factor comes with some sorting algorithms. The sorting
vocabulary implements merge sort and the sorting.insertion vocabulary
implements an in-place insertion sort.*

First, some vocabularies and a namespace (we will be using locals to implement a couple of the words):

```
USING: kernel locals math math.functions sequences
sorting.insertion ;
IN: sorting.marriage
```

We can take the loose algorithm and structure the `marriage-sort`

word,
leaving the bulk of the work for the `(marriage-sort)`

inner loop:

```
: marriage-sort ( seq -- )
dup length
[ dup sqrt 1 - >fixnum dup 0 > ]
[ (marriage-sort) ] while 2drop
[ ] insertion-sort ;
```

We’ll need to find the index of the maximum element in a range:

```
:: find-max ( from to seq -- i )
from to >= [ f ] [
from from 1 + [ dup to < ] [
2dup [ seq nth ] bi@ < [ nip dup ] when 1 +
] while drop
] if ;
```

That leaves the `(marriage-sort)`

word (probably more complex than
necessary, but it works):

```
:: (marriage-sort) ( seq end skip -- seq end' )
0 skip seq find-max
skip end [ 2dup < ] [
2over [ seq nth ] bi@ <=
[ 1 - [ seq exchange ] 2keep ]
[ [ 1 + ] dip ] if
] while nip 1 - [ seq exchange seq ] keep ;
```

Some performance numbers (given a 10,000 element random array):

```
IN: scratchpad 10000 [ random-32 ] replicate
IN: scratchpad dup clone [ natural-sort drop ] time
Running time: 0.004123694 seconds
IN: scratchpad dup clone [ marriage-sort ] time
Running time: 0.063077446 seconds
IN: scratchpad dup clone [ [ ] insertion-sort ] time
Running time: 10.972027614 seconds
```

As you can see, slower than `natural-sort`

(which uses merge sort), but
much faster than `insertion-sort`

, with similar in-place semantics. It’s
worth noting that the code for `insertion-sort`

seems a little slow and
could probably be sped up quite a bit.

The code and some tests for this is on my GitHub.