Exploring
delimited continuation in
Common Lisp, in
particular,
shift
and reset
from Danvy and Filinski. There
is clcont package, which supports
shift/reset style continuation with rewriting expression inside
withcall/cc
macro. The implementation shown in this post is simple,
short (less than 50 lines), and uses monadic functions to capture
continuations.
As
Wadler mentioned in various papers,
monads could be used to express continuations. Monads consists of type
constructor M
and following two operations:
unit :: a > M a
(★) :: M a > (a > M b) > M b
It is possible to express concept of Monads in other programming languages than Haskell. Several implementation already exist, such as in Scheme and JavaScript, to name a few. In impure languages, there are not much needs for monad, since the problems solved by monads in purely functional language are solved with different techniques. Though, one of the programming techniques which could be solved with monad, and missing in most language is, expressing first class continuation.
One of the programming languages supporting first class continuation is
Scheme. Scheme has callwithcurrentcontinuation
, which captures
undelimited continuation as first class value. Some of the Scheme
implementations also support delimited continuations. Delimited
continuations are not in the Scheme language specifications so far,
though it
is
possible to implement it with undelimited continuations.
In a programming language without builtin support for continuation, one
need to express the computation in ContinuationPassingStyle (CPS) to
capture the continuation, and that is where monad is used for. Firstly,
defining a structure to represent continuation. The structure CONT
has
single field, a function taking the current continuation.
(defstruct (cont (:constructor makecont (fn)))
(fn #'values :type function))
(defun runcont (c k)
(funcall (contfn c) k))
Defining unit
and (★)
to make CONT
as an instance of monad. In the
code shown below, unit
is renamed to returnc
, and (★)
is renamed
to bindc
.
(defun returnc (x)
(makecont (lambda (k)
(funcall k x))))
(defun bindc (c f)
(makecont (lambda (k)
(runcont c (lambda (x)
(runcont (funcall f x) k))))))
Definition of returnc
makes CPS representation of x
wrapped with
CONT
structure. bindc
runs the continuation c
, then pass the
result to f
, and runs the continuation returned by f
with the
current continuation k
. Note that in bindc
, current continuation k
is not passed to c
but to the returned value of f
.
Some examples:
> (runcont (returnc 'foo) #'values)
FOO
> (runcont (bindc (returnc 21) (lambda (x) (returnc (* x 2)))) #'values)
42
Adding two syntax helper macros, letc*
and progc
. The macro letc*
is similar to let*
, but instead of binding result of pure expressions,
letc*
binds the variable to the argument passed to the continuation.
(defmacro letc* (bindings &body body)
(if (null bindings)
`(progn ,@body)
(destructuringbind (name c) (car bindings)
`(bindc ,c (lambda (,name)
(letc* ,(cdr bindings) ,@body))))))
The second example shown above could be written with letc*
as follows:
> (runcont (letc* ((x (returnc 21)))
(returnc (* x 2)))
#'values)
42
progc
is similar to letc*
, but discards the variable, intended to be
used in the codes where side effects of continuations are the main
concern.
(defmacro progc (&body body)
(if (null (cdr body))
(car body)
(let ((garg (gensym)))
`(bindc ,(car body)
(lambda (,garg)
(declare (ignore ,garg))
(progc ,@(cdr body)))))))
With monadic interfaces for continuation, shift
and reset
could be
expressed as below. The function reset
unwrap the CONT
structure if
the given argument is CONT
, otherwise returns the given value itself.
(defun reset (k)
(if (contp k)
(runcont k #'values)
k))
The macro shift
binds the current continuation to given var
as a
function defined with flet
, and then invoke the expr
. Inside expr
,
bounded continuation could be invoked as an ordinary function. In other
words, shift
captures the continuation from the captured place until
the first appearance of enclosing reset
.
(defmacro shift (var expr)
(let ((gk (gensym))
(garg (gensym)))
`(makecont (lambda (,gk)
(declare (function ,gk))
(flet ((,var (,garg)
(funcall ,gk ,garg)))
(declare (ignorable (function ,var)))
,expr)))))
Evaluating some sample expressions with shift
and reset
. The
resulting value of shift
is a CONT
:
> (shift k (k 2))
#S(CONT :FN #<FUNCTION (LAMBDA (#:G656)) {1004B9A55B}>)
To perform the computation in CONT
, one can apply reset
:
> (reset (shift k (k 2))
2
The resulting CONT
object from shift
could be passed to bindc
, as
those made from returnc
:
> (reset (letc* ((x (shift k (k 2))))
(returnc (+ x 3))))
5
When the expression in shift
returns without calling captured
continuation, the whole computation will escape immediately. Instead of
the sum of x
and y
or an error from +
, following expression
evaluates as symbol FOO
:
> (reset (letc* ((x (returnc 100))
(y (shift k 'foo)))
(returnc (+ x y))))
FOO
Since captured continuation is an ordinary function, it could be invoked
multiple times. In the following expression, the captured computation
with shift
could be viewed as (lambda (x) (+ x 3))
. The captured
continuation is applied twice, which results in (+ (+ 2 3) 3)
:
> (reset (letc* ((x (shift k (k (k 2)))))
(returnc (+ x 3))))
8
Inside the expression of shift
, further computation could be done with
returned value from captured computation. Following expression applies
captured continuation twice as in previous example, then multiplies by
2
:
> (reset (letc* ((x (shift k (* 2 (k (k 2))))))
(returnc (+ x 3))))
16
In the implementation of shift
and reset
shown here, continuations
are captured with monad, hence some wrapping with returnc
and
unwrapping with letc*
are required. In an implementation which has
builtin support of shift
and reset
(e.g.: Racket), the last example could be
written as:
> (reset (+ (shift k (* 2 (k (k 2)))) 3))
16
One common use of continuation is for nondeterministic
programming. Showing an implementation of choice
from
the
Danvy and Filinski's paper,
and its use with function triple
. The problem we have is to find out
all triples of distinct positive integers i
, j
, and k
less than or
equal to a given integer n
that sums to a given integer s
:
(defun fail ()
(shift k 'no))
(defun choice (n)
(shift k (loop for i from 1 to n do (k i) finally 'no)))
(defun triple (n s)
(letc* ((i (choice n))
(j (choice ( i 1)))
(k (choice ( j 1))))
(if (= s (+ i j k))
(returnc (list i j k))
(fail))))
To print the results of triple
, one may write as follows:
> (reset (letc* (ijk (triple 9 15)))
(returnc (print ijk))))
(6 5 4)
(7 5 3)
(7 6 2)
(8 4 3)
(8 5 2)
(8 6 1)
(9 4 2)
(9 5 1)
NIL
Note that unlike the original example in the paper, the version of
shift
in this post returns a value of CONT
structure. The expression
(triple 9 15)
is evaluated as CONT
structure, not the answer values
returned by (returnc (list i j k))
. The answers need to be unwrapped
with letc*
or bindc
to print with print
function.
Another famous problem solved with continuation is, socalled the samefringe problem. Two binary trees have the same fringe if they have exactly the same leaves reading from left to right. The problem is to decide whether two binary trees have the same fringe. Basically, we want to return from the tree traversal as soon as we detect that the trees are different.
(defun donep (x) (eq 'done x))
(defun nextp (x) (not (donep x)))
(defun next (n k) (lambda () (values n k)))
(defun walkerc (tree)
(cond
((null tree) (returnc 'done))
((atom tree) (shift k (next tree #'k)))
(t (progc
(walkerc (car tree))
(walkerc (cdr tree))))))
The function walkerc
takes a tree, returns a CONT
value. Each
element of the tree is converted as a closure with the function next
.
The closure returns two values: the element and the captured
continuation. The resulting CONT
could be viewed as a coroutine
object, which traverses the tree from right to left order.
With walkerc
, function samefringe
could be written as below:
(defun samefringe (t1 t2)
(labels ((rec (r1 r2)
(if (nextp r1)
(and (nextp r2)
(multiplevaluebind (n1 k1) (funcall r1)
(multiplevaluebind (n2 k2) (funcall r2)
(and (eql n1 n2)
(rec (funcall k1 nil)
(funcall k2 nil))))))
(donep r2))))
(rec (reset (walkerc t1))
(reset (walkerc t2)))))
Sample runs:
> (samefringe '((1) (2 (3 (4) 5))) '((1 (2) (3 4) 5)))
T
> (samefringe '((1) (2 (3 (4) 5))) '((1 (2) (4) 5)))
NIL
Both nondeterministic example and coroutine example could be directly written
in continuation passing style (CPS), which shall be slightly efficient since
direct CPS does not require to make the cont
structure.
Te implementation shown in this post is similar to the one done by Matthew D Swank, written in 2006.
]]>Coleslaw is a static site generator written in Common Lisp. As of the
version included in Quicklisp 20161208, Coleslaw supports plain
HTML, Markdown
and reStructuredText for
writing contents. Basic usage of Coleslaw is written
in README
file. Coleslaw uses .coleslawrc
file to configure the site. The
configuration file could be placed under the user's home directory (i.e.:
$HOME/.coleslawrc
), or under the directory containing all the other
files for building the site (e.g.: /path/to/my/site/.coleslawrc
). It
uses Common Lisp port
of Closure Templates
for defining theme templates. Markdown parser used by Coleslaw
is 3bmd.
Coleslaw recommends the use of plugins to customize and extend the
site. The archive page of this site and the pages
showing the posts for each tag are built with simple plugins. Plugins are
simply a Common Lisp source code, which should be a package with
Coleslaw's naming convention. The naming convention is
coleslawNAME_OF_PLUGIN
, where NAME_OF_PLUGIN
is name of the plugin
(e.g.: Package name of the archive
plugin is coleslawarchive
).
Suppose that, the directory containing the static site is structured as
below:
/path/to/my/site/
├── .coleslawrc
├── 404.page
├── about.page
├── plugins
│ ├── archive.lisp
│ └── tags.lisp
├── posts
│ ├── 20111021whattocountinsourcecode.post
│ ├── ...
│ └── 20161227migratingtocoleslaw.post
├── static
│ ├── benchavlinsertmember.html
│ └── ...
├── the.archive
└── themes
└── simple
├── archive.tmpl
├── base.tmpl
├── css
│ └── style.css
├── index.tmpl
├── post.tmpl
└── tag.tmpl
And the contents of .coleslawrc
similar to below:
(:author "8c6794b6"
:charset "UTF8"
:deploydir "/path/for/deploy/"
:domain "http://my.url.com"
:plugins ((mathjax)
(staticpages)
(sitemap)
(archive)
(tags))
:routing ((:post "posts/~a")
(:tagindex "tag/~a")
(:monthindex "date/~a")
(:numericindex "~d")
(:feed "~a.xml")
(:tagfeed "tag/~a.xml"))
:stagingdir "/tmp/coleslawstaging/"
:title "8c6794b6.github.io"
:theme "simple")
By evaluating (coleslaw:main #p"/path/to/my/site")
in Common Lisp,
coleslaw will generate the site contents. Some of the works done are:
Load plugins. Coleslaw will look for coleslawmathjax
,
coleslawstaticpages
, coleslawsitemap
, coleslawarchive
, and
coleslawtags
under /path/to/my/site/plugins
and installed
coleslaw's source directory. Coleslaw seeks the plugins under the
directory named plugins
of the site contents, and under the Coleslaw's
source directory. Plugins mathjax
, staticpages
, and sitemap
are
bundled within default coleslaw installation.
Search the specified theme. Coleslaw will search the
/path/to/my/site/themes
first, then if not found, search for the theme
under Coleslaw's default installation. The name of the theme is
specified in the file .coleslawrc
. There is
a
guide
for writing theme in the Coleslaw source. The theme simple
is the one
used for this site.
Copy the files under the directory named static
in /path/to/my/site
to the deploy directory.
Apply functions according to file extensions. Files for post pages have
.post
in its file name, static pages has .page
, and so on.
For archive
and tags
plugins, which are made for this site, the source
codes and templates are tightly coupled, so the plugins are not much
reusable.
From this SOF Q&A. Linked page was showing a scenario to change committer and author's name and email from old value to new value:
git filterbranch commitfilter '
if [ "$GIT_COMMITTER_NAME" = "<OLD_NAME>" ];
then
GIT_COMMITTER_NAME="<NEW_NAME>";
# ...
# modify other attributes if any, then...
git committree "$@";
else
git committree "$@";
fi' HEAD
Comment in the answer was mentioning git envfilter
will change all commits,
but the one shown above allows conditional behaviour.
There are
couple
articles mentioning it, including the
git add man page. Section EDITING PATCHES
in the manual has detailed explanation.
Suppose that, we want to write contents of following rose tree, by hand:
0

+ 1
 
 ` 2

+ 3

+ 4
 
 + 5
  
  + 6
  
  ` 7
 
 ` 8

` 9
Using Data.Tree.Tree
from containers
package for our purpose, the
definition is:
data Tree a = Node { rootLabel :: a
, subForest :: Forest a }
type Forest a = [Tree a]
This document explores couple alternative ways to write rose tree data structure, mainly focusing on situations when writing by hand.
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE FlexibleInstances #}
> {# LANGUAGE FunctionalDependencies #}
> {# LANGUAGE GeneralizedNewtypeDeriving #}
> {# LANGUAGE MultiParamTypeClasses #}
> {# LANGUAGE TypeFamilies #}
> {# LANGUAGE UndecidableInstances #}
> module Main where
>
> import Data.Tree
> import Control.Monad.Writer
It could be nice if we can write the rose tree in single line like:
> t3' :: Tree Int
> t3' = buildBranch $ pnode 0 (pnode 1 2) 3 (pnode 4 (pnode 5 6 7) 8) 9
Here, pnode
is a polyvariadic function, treating the first argument as
element in itself, and rest as leaves. The code used for writing t3'
is shown in Take 3.
In straightforward way, sample tree may written like below:
> t0 :: Tree Int
> t0 =
> Node 0
> [ Node 1
> [ Node 2 [] ]
> , Node 3 []
> , Node 4
> [ Node 5
> [ Node 6 [], Node 7 []]
> , Node 8 [] ]
> , Node 9 [] ]
The sample tree containing 10 nodes already looks a bit cumbersome to write by hand. From my point of view, things making the typing hard were:
Char
was used instead of Int
, may not be a problem.Pretty simple replacement of type constructors with functions with lower
case letters. Also using variant of node
function named leaf
which
takes empty list as second argument of Node
. The sample tree looks
like below:
> t1 :: Tree Int
> t1 =
> node 0
> [ node 1
> [ leaf 2 ]
> , leaf 3
> , node 4
> [ node 5
> [leaf 6, leaf 7]
> , leaf 8 ]
> , leaf 9 ]
>
> node :: a > [Tree a] > Tree a
> node a fs = Node a fs
>
> leaf :: a > Tree a
> leaf a = Node a []
Confirm that the two trees are identical:
ghci> t1 == t0
True
Now t1
does not contain capital letters in its body, though nested
lists might still look clumsy.
Avoid typing commas and brackets (,
, [
, and ]
), let the
do notation
to take care of node grouping. The sample tree looks like
below:
> t2 :: Tree Int
> t2 = buildTree $ do
> mnode 0 $ do
> mnode 1 $ do
> mleaf 2
> mleaf 3
> mnode 4 $ do
> mnode 5 $ do
> mleaf 6 >> mleaf 7
> mleaf 8
> mleaf 9
Implementation is done with Writer
monad with simple DiffList
:
> newtype DiffList a = DiffList ([a] > [a])
> deriving (Monoid)
>
> instance Show a => Show (DiffList a) where
> show (DiffList f) = show (f [])
>
> type TreeBuilder a = Writer (DiffList a) ()
>
> snoc :: a > DiffList a
> snoc x = DiffList (x:)
>
> runTreeBuilder :: TreeBuilder (Tree a) > [Tree a]
> runTreeBuilder builder = case runWriter builder of (_,DiffList f) > f []
>
> buildTree :: TreeBuilder (Tree a) > Tree a
> buildTree builder = case runTreeBuilder builder of
> [] > error "buildTree: empty tree"
> t:_ > t
>
> mnode :: a > TreeBuilder (Tree a) > TreeBuilder (Tree a)
> mnode x builder = tell (snoc (Node x (runTreeBuilder builder)))
>
> mleaf :: a > TreeBuilder (Tree a)
> mleaf x = tell (snoc (Node x []))
Checking whether that t2
is identical to t0
:
ghci> t2 == t0
True
Use of do notation
has freed us from using commans and brackets,
though introduced redundancy with do
s. Introducing Monoid
wrapper
newtype and polyvariadic function to remove do
s. Sample tree may looks
like below:
> t3 :: Tree Int
> t3 = buildBranch $
> pnode 0
> (pnode 1
> (pleaf 2))
> (pleaf 3)
> (pnode 4
> (pnode 5
> (pleaf 6) (pleaf 7))
> (pleaf 8))
> (pleaf 9)
Now do
s are removed, thought increase of parenthesis is making the
code quite _lisp_y. Implementation is heavily inspired by
HSXML.
> newtype Branch a = Branch (DiffList (Tree a))
> deriving (Show, Monoid)
>
> class Monoid acc => BuildTree acc out ret  ret > out where
> build :: (acc>out) > acc > ret
>
> instance Monoid acc => BuildTree acc (Branch a) (Branch a) where
> build f acc = f acc
>
> instance (BuildTree acc out ret, e ~ acc) => BuildTree acc out (e>ret) where
> build f acc = \t > build f (acc <> t)
>
> pnode :: BuildTree (Branch a) (Branch a) ret => a > Branch a > ret
> pnode x = build (\(Branch (DiffList b)) > Branch (snoc (Node x (b []))))
>
> pleaf :: a > Branch a
> pleaf x = Branch (snoc (Node x mempty))
>
> buildBranch :: Branch a > Tree a
> buildBranch (Branch (DiffList ts)) = case ts [] of
> [] > error "buildBranch: empty branch"
> t:_ > t
Checking again:
ghci> t3 == t0
True
Other data type than Data.Tree.Tree
could be used, though haven't
explored.
In take 2, monadic approach may easy to combine with other monads with
mtl
. For instance, use State
monad and count the number of leaves
while traversing.
In take 3, by defining Branch
as instance of Num class
, t3
could
rewritten in single line, which is shown as t3'
at the beginning of
this document:
t3' = buildBranch $ pnode 0 (pnode 1 2) 3 (pnode 4 (pnode 5 6 7) 8) 9
Checking:
ghci> t3' == t0
True
Purpose of making as Num
instance is merely for helping syntax. Other
functions than fromInteger
may left undefined
:
> instance Num a => Num (Branch a) where
> (+) = undefined
> (*) = undefined
> negate = undefined
> abs = undefined
> signum = undefined
> fromInteger = pleaf . fromInteger
GHC has restriction in context reduction stack, as of ghc7.8.2, default
size is 21. Polyvariadic function taking more than 22 arguments needs
fcontextstack=N
option and increase the context stack with using
large context stack size N.
Change input method with Cx RET C\
or Mx setinputmethod
. There are
three french input methods for English keyboard: frenchprefix
,
frenchpostfix
, and frenchaltpostfix
. frenchprefix
input method will
let the accent to be typed before alphabets, e.g: to type ç
, type ,c
. In
frenchpostfix
input method, type c,
. Help of frenchprefix
input method,
which shown with Ch I
or Mx describeinputmethod
, had a nice table:
Input method: frenchprefix (mode line indicator:FR>)
French (Français) input method with prefix modifiers
effect  prefix  examples
++
acute  '  'e > é
grave  `  `a > à
circumflex  ^  ^a > â
diaeresis  "  "i > ï
cedilla  ~ or ,  ~c > ç ,c > ç
symbol  ~  ~> > » ~< > «
Characters with accents were able to type with these two modes, though the
character œ
was not. Found an entry in emacsdevel thread posted in 2008 which
mentioning french ç and œ. To type in œ, use latinpostfix
or
latinprefix
input method, the key sequence to type œ
is o/2
and /o2
,
respectively. latinprefix
and latinpostfix
input methods contains those
characters typed in with french input methods:
Input method: latinpostfix (mode line indicator:L<)
Latin character input method with postfix modifiers.
This is the union of various input methods originally made for input
of characters from a single LatinN charset.
 postfix  examples
++
acute  '  a' > á
grave  `  a` > à
circumflex  ^  a^ > â
diaeresis  "  a" > ä
tilde  ~  a~ > ã
cedilla  ,  c, > ç
ogonek  ,  a, > ą
breve  ~  a~ > ă
caron  ~  c~ > č
dbl. acute  :  o: > ő
ring  .  u. > ů
dot  .  z. > ż
stroke  /  d/ > đ
nordic  /  d/ > ð t/ > þ a/ > å e/ > æ o/ > ø
others  /  s/ > ß ?/ > ¿ !/ > ¡ // > °
 various  << > « >> > » o_ > º a_ > ª
Doubling the postfix separates the letter and postfix: e.g. a'' > a'
]]>Recently made a build hook for continuous integration with travis, in one of the
repository I have in github, by following
getting started guide.
The package is requiring base >= 4.6.0
, which is from ghc7.6.*
, though the
version used in travis build was base >= 4.5.*
, which is from ghc7.4.1
.
There is an issue in github
mentioning ghc version used by travis builds.
It is still possible to use CPP
, #ifdef
s and choose the appropriate
dependency package version. Might be an opportunity to do this, though I'm not
sure the package worth for having that much maintenance efforts.
Migrated to hakyll 4. Had a thought for doing this but haven't done for
... perhaps more than a year. As
already
mentioned,
Compiler
is not an instance of Arrow
, instead an instance of Monad
, Page
and MetaCompilers have been removed, and couple more changes.
Replaced "8c6794b6.github.com" to "8c6794b6.github.io", since URLs for ghpages
have changed. According to
User, Organization and Project Pages
from GitHub Help, repository with *.github.io
is the latest naming scheme.
Update 404 page, showing bigger lambda face now.
]]>There are several AVL Tree implementations already exist: a package uploaded to hackage, gist snippet, and polymorphic stanamically balanced AVL tree. As for classic data structure and algorithm exercise, decided to write a simple implementation.
> {# LANGUAGE BangPatterns #}
> module AVL where
For taking benchmarks and comparison with Data.Map
from container
package. None of these modules are used in code implementing AVL.
> import Control.DeepSeq (NFData(..), deepseq)
> import Criterion.Main
> import System.Random
> import qualified Data.Map as M
Like other binary trees, AVL tree has leaf constructor and node constructor with right and left branch, with a field to hold height of the node:
> data AVL a
> = Node {# UNPACK #} !Int !(AVL a) !a !(AVL a)
>  Leaf
> deriving (Eq, Show)
There was a note about order of constructors written in comments of Data.Map.Base:
When type has 2 constructors, a forward conditional jump is made
when successfully matching second constructor, in GHC 7.0.
This was still true in GHC 7.6.1, the version used at the time of
writing. This is the reason why Node
constructor came before the
Leaf
constructor.
An alias for leaf node:
> empty :: AVL a
> empty = Leaf
> {# INLINEABLE empty #}
Height of tree. Defining height of Leaf
node as '0'.
> height :: AVL a > Int
> height t = case t of
> Leaf > 0
> Node !n _ _ _ > n
> {# INLINE height #}
Insert element to AVL tree. This function calls rebalance
after
inserting new element. Also, new element is strictly evaluated inside
the local function go
.
> insert :: Ord a => a > AVL a > AVL a
> insert = go where
> go :: Ord a => a > AVL a > AVL a
> go !n Leaf = Node 1 Leaf n Leaf
> go !n (Node h l !n' r) = case compare n' n of
> LT > rebalance $ Node h l n' (insert n r)
> _ > rebalance $ Node h (insert n l) n' r
> {# INLINEABLE insert #}
A function to check whether given element is a member of tree or not. Given element and element of pattern matched node are, again strictly evaluated.
> member :: (Ord a, Eq a) => a > AVL a > Bool
> member _ Leaf = False
> member !x (Node _ l !y r) = case compare x y of
> LT > member x l
> GT > member x r
> EQ > True
> {# INLINEABLE member #}
There could be more functions, delete, merge, etc. But I'm little bit lazy to write them in this post.
Here comes the balancing function. It rebalances tree, four cases are considered: rightright, rightleft, leftleft, and leftright.
Firstly, comparing with Leaf
node, which is not needed to rebalance
any more. Then the height from left node and right node are compared,
leading to next case.
> rebalance :: AVL a > AVL a
> rebalance Leaf = Leaf
> rebalance n1@(Node _ l1 x1 r1) = case compare hL1 hR1 of
> LT > rightIsHeavy
> GT > leftIsHeavy
> EQ > updateHeight n1
> where
> hL1 = height l1
> hR1 = height r1
When right node is heavy, we compare the left node and right node again, and perform the rotations.
> rightIsHeavy = case r1 of
> Leaf > n1
> Node _ l2 x2 r2 >
> case compare (height l2) (height r2) of
The rightright case, single rotation swapping the node in the middle to top and top node to left:
> LT > Node (hL1+2) (Node (hL1+1) l1 x1 l2) x2 r2
The rightleft case, bringing the bottom element to top, top element to left, and middle element to right, with reordering each hanging nodes appropriately.
> GT > case l2 of
> Leaf > n1
> Node h3 l3 x3 r3 >
> Node (h3+1) (Node h3 l1 x1 l3) x3 (Node h3 r3 x2 r2)
Nothing left to do when nodes already has same height, merely returning the given node itself.
> EQ > n1
Leftleft case and leftright cases are symmetrical to above:
> leftIsHeavy = case l1 of
> Leaf > n1
> Node _ l2 x2 r2 >
> case compare (height l2) (height r2) of
> LT > case r2 of
> Leaf > n1
> Node h3 l3 x3 r3 >
> Node (h3+1) (Node h3 l2 x2 l3) x3 (Node h3 r3 x1 r1)
> GT > Node (hR1+2) l2 x2 (Node (hR1+1) r2 x1 r1)
> EQ > n1
> {# INLINE rebalance #}
Nonrecursive height updating function, used in EQ
case of
rebalance
:
> updateHeight :: AVL a > AVL a
> updateHeight t = case t of
> Leaf > Leaf
> Node _ Leaf n Leaf > Node 1 Leaf n Leaf
> Node _ l@(Node h _ _ _) n Leaf > Node (h+1) l n Leaf
> Node _ Leaf n r@(Node h _ _ _) > Node (h+1) Leaf n r
> Node _ l@(Node hl _ _ _) n r@(Node hr _ _ _) > Node h' l n r where
> h'  hl < hr = hr + 1
>  otherwise = hl + 1
> {# INLINE updateHeight #}
That's all for AVL tree to have insert
and member
function. For
testing, a function to check the balance:
> isBalanced :: AVL a > Bool
> isBalanced t = case t of
> Leaf > True
> Node h l _ r >
> abs (h  height l) <= 1 && abs (h  height r) <= 1 &&
> isBalanced l && isBalanced r
> {# INLINE isBalanced #}
Simple check:
ghci> isBalanced $ foldr insert empty [1..1024]
True
ghci> isBalanced $ foldr insert empty [1024,1023..1]
True
Benchmarks:
> instance NFData a => NFData (AVL a) where
> rnf Leaf = ()
> rnf (Node h l x r) = rnf h `seq` rnf l `seq` rnf x `seq` rnf r
>
> avlBenches :: [Benchmark]
> avlBenches =
> let tn n = foldr insert empty [0..n1::Int]
> insertAVL !k =
> let !x = let x' = tn k in x' `deepseq` x'
> in bench ("n=" ++ show k) (whnfIO $ insertRand x (0,k1))
> insertRand t (a,b) = do
> x < getStdRandom (randomR (a,b))
> let y = insert x t
> return $! y
> memberAVL k =
> let x = tn k
> in x `deepseq` bench ("n=" ++ show k) (whnfIO $ memberRand x (0,k1))
> memberRand t (a,b) = do
> x < getStdRandom (randomR (a,b))
> let y = member x t
> return $! y
> mn n = let xs = [0..n1::Int] in M.fromList $ zip xs (repeat ())
> insertMap !k =
> let !x = let x' = mn k in x' `deepseq` x'
> in bench ("n=" ++ show k) (whnfIO $ insertRandM x (0,k1))
> insertRandM m (a,b) = do
> x < getStdRandom (randomR (a,b))
> let y = M.insert x () m
> return $! y
> memberMap k =
> let !x = let x' = mn k in x' `deepseq` x'
> in bench ("n=" ++ show k) (whnfIO $ memberRandM x (0,k1))
> memberRandM m (a,b) = do
> x < getStdRandom (randomR (a,b))
> let y = M.member x m
> return $! y
> benchmarks =
> [ bgroup "AVL"
> [ bgroup "insert" [insertAVL (2^k)  k < [10..14::Int]]
> , bgroup "member" [memberAVL (2^k)  k < [10..14::Int]]
> ]
> , bgroup "Data.Map"
> [ bgroup "insert" [insertMap (2^k)  k < [10..14::Int]]
> , bgroup "member" [memberMap (2^k)  k < [10..14::Int]]
> ]
> ]
> in benchmarks
Using above benchmark as main:
> main :: IO ()
> main = defaultMain avlBenches
Compile, run the benchmark, and see the result in html report:
$ ghc O2 fllvm AVL.lhs mainis AVL o AVL
$ ./AVL o bench.html
Report is here.
Lessons learned: from above simple implementation, a data structure with
performance close to those provided by standard package could be
implemented. Benchmark show that AVL tree insertion was slightly slower
than insertion in Data.Map, performance of member lookup was almost
identical. Though when there is no need for making own implemntation,
just use the implementation from proven package, in most case those
provided data structures has more features, Data.Map.Map has useful
functions like insetWith
, unionWith
, etc.
By the way, during the benchmark I was using nfIO
instead of whnfIO
for a while, resulting to linearly increasing insertion time for while.
asTypeOf
function.
> module AsTypeOf where
When we have a type with multiple varialbles, e.g. Either
:
> rightChar = Right 'a'
In ghci:
ghci> :t rightChar
rightChar :: Either a Char
We are yet not sure for remaining type variables. In above case, we
still do not have a clue to fix the type of Left
constructor.
One way to fix the type used for Left
is wrapping multiple values in
single list:
> rightCharInList = [rightChar, Left True]
Showing the type:
ghci> :t rightCharInList
rightCharInList :: [Either Bool Char]
Or another, using asTypeOf
instead of list:
> rightChar' = rightChar `asTypeOf` Left True
Now the type for Left
constructor is fixed, value is identical to
rightChar
:
ghci> rightChar'
Right 'a'
ghci> :t rightChar'
rightChar' :: Either Bool Char
]]>> mapM_ putStrLn [show 1, show (1,2), show (1,2,3), show (1,2,3,4) ...]
The contents of list is String, but before applying 'show', type of elements differ. Goal of this post is to express above with using template haskell, like:
> mapM_ putStrLn $(tups 3)
Which expands to:
> mapM_ putStrLn [show 1, show (1,2), show (1,2,3))]
in caller module.
> {# LANGUAGE TemplateHaskell #}
>
> module MappingShow where
>
> import Control.Applicative
> import Control.Monad
> import Language.Haskell.TH
A pretty printer for template haskell Q monad:
> printQ :: Ppr a => Q a > IO ()
> printQ q = print . ppr =<< runQ q
Warming up, manually writing the tuple lists, and applying show to them:
> bTup :: Int > ExpQ
> bTup n = tupE [litE (integerL j)  j < [1..fromIntegral n]]
>
> take01 :: ExpQ
> take01 =
> [e
> mapM_ putStrLn [show $(bTup 1), show $(bTup 2), show $(bTup 3)]
> ]
Result:
ghci> printQ take01
Control.Monad.mapM_ System.IO.putStrLn [GHC.Show.show (1),
GHC.Show.show (1, 2),
GHC.Show.show (1, 2, 3)]
We cannot write:
> take02 = [e map putStrLn [show $(bTup i)  i < [1..10]] ]
staging error occurs:
Stage error: `i' is bound at stage 2 but used at stage 1
In the first argument of `bTup', namely `i'
In the expression: bTup i
In the first argument of `show', namely `$(bTup i)'
So we cannot directly build a list of tuple with passing argument, i
in above case. Though, since 'Q' is a Monad, we can run it, and then
rebind it:
> tups :: Int > ExpQ
> tups n = runQ $ do
> ts < foldM (\acc i > (:acc) <$> bTup i) [] [1..n]
> return $ ListE $ map (AppE (VarE 'show)) (reverse ts)
Results:
ghci> printQ $ tups 10
[GHC.Show.show (1),
GHC.Show.show (1, 2),
GHC.Show.show (1, 2, 3),
GHC.Show.show (1, 2, 3, 4),
GHC.Show.show (1, 2, 3, 4, 5),
GHC.Show.show (1, 2, 3, 4, 5, 6),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7, 8),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7, 8, 9),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)]
We can build a list of String elements in caller module, or prepare a template haskell expression which does the 'mapM_ putStrLn':
> mapShowTuples :: Int > ExpQ
> mapShowTuples n = (varE 'mapM_ `appE` varE 'putStrLn) `appE` tups n
Results:
ghci> printQ $ mapShowTuples 8
Control.Monad.mapM_ System.IO.putStrLn [GHC.Show.show (1),
GHC.Show.show (1, 2),
GHC.Show.show (1, 2, 3),
GHC.Show.show (1, 2, 3, 4),
GHC.Show.show (1, 2, 3, 4, 5),
GHC.Show.show (1, 2, 3, 4, 5, 6),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7),
GHC.Show.show (1, 2, 3, 4, 5, 6, 7, 8)]
Sample caller module may look like below:
{# LANGUAGE TemplateHaskell #}
import MappingShow
t8s :: [String]
t8s = $(tups 8)
t8s_view :: IO ()
t8s_view = mapM_ putStrLn t8s
t8s_view_th :: IO ()
t8s_view_th = $(mapShowTuples 8)
main :: IO ()
main = t8s_view >> t8s_view_th
Result of running 'main':
ghci> main
1
(1,2)
(1,2,3)
(1,2,3,4)
(1,2,3,4,5)
(1,2,3,4,5,6)
(1,2,3,4,5,6,7)
(1,2,3,4,5,6,7,8)
1
(1,2)
(1,2,3)
(1,2,3,4)
(1,2,3,4,5)
(1,2,3,4,5,6)
(1,2,3,4,5,6,7)
(1,2,3,4,5,6,7,8)
]]>