Decoding encoding bencoded data with haskell

From Theory.org Wiki
Jump to: navigation, search

This is a crude implementation in haskell that uses list comprehension for parsing.

Here we will make a type for the structure of the bencoding.

> module BEncode (BValue (..)) where

To be able to distinguish digits

> import Char

Summary: a bvalue is either (vim regular expresions):
i\(0\|-\{0,1\}[1-9][0-9]*\)e - an integer in base 10.
\([1-9][0-9]*\):[0-\377]*\{\1\} - a string of exactly \1 characters, a bstring.
l<bvalue>*e - a heterogeneous list of bvalues.
d\(<bstring><bvalue>\)*e - a heterogeneous dictionary with strings as keys.

> data BValue = BInteger Int |
>     BString String |
>     BList [BValue] |
>     BDictionary [(String, BValue)] deriving (Eq, Ord)

The easy thing, the serialization.

> beShow :: BValue -> String
> beShow x = beShows x ""

We do this to have linear time.

> beShows :: BValue -> String -> String
> beShows (BInteger x) s = 'i' : shows x ('e':s)
> beShows (BString x) s = shows (length x) (':' : x ++ s)
> beShows (BList xs) s = 'l' : foldr beShows ('e':s) xs
> beShows (BDictionary xs) s = 'd' : foldr mapMap ('e':s) xs
>     where mapMap (k, v) s = beShows (BString k) (beShows v s)

And now we can make BValue an instance of Show.

> instance Show(BValue) where
>     showsPrec _ x = beShows x

And now the parser.

> beReads :: String -> [(BValue, String)]
> beReads ('i':s) = [(BInteger x, rs) | (x, 'e':rs) <- intReads s]
> beReads ('l':s) = [(BList xs, rs) | (xs, rs) <- beListReads s]
> beReads ('d':s) = [(BDictionary xs, rs) | (xs, rs) <- beDictionaryReads s]
> beReads s = [(BString x, rs) | (l, ':':ss) <- intReads s, (x, rs) <- [splitAt l ss]]
> beListReads ('e':s) = [([], s)]
> beListReads s = [(x:xs, rs) | (x, rs') <- beReads s, (xs, rs) <- beListReads rs']
> beDictionaryReads ('e':s) = [([], s)]
> beDictionaryReads s = [((x, y):xys, rs) | (BString x, rs') <- beReads s,
>                                           (y, rs'') <- beReads rs',
>                                           (xys, rs) <- beDictionaryReads rs'']

For now I don't know how to make reading an integer not looking to exponent part so:

> intReads (x:s) | isDigit x = [(n, rs) | (n, rs) <- intReads' (digitToInt x) s]
> intReads _ = []
> intReads' x (y:s) | isDigit y = [k | k <- intReads' (x * 10 + digitToInt y) s]
> intReads' x s = [(x, s)]

Let's make BValue an instance of Read also.

> instance Read(BValue) where
>     readsPrec _ x = beReads x