Way back in October last year (but only two entries ago) I posted some Perl code and wrote that I’d post the Haskell version sometime soon. Well, that time is now; the code is below. I intend to rewrite it to use Parsec at some point, but I haven’t tried that yet, since this little hacky script works well enough. And look how long it has taken me to blog it!
module Main where

import Text.XML.HaXml.SAX

-- What the scanner is currently looking for in the event stream.
data ParserState = FindEntry | FindKeb | FindText

scan :: ParserState -> [SaxElement] -> [String]
scan _ [] = []
-- An <entry> opens: start looking for its <keb>.
scan FindEntry ( (SaxElementOpen "entry" _) : es ) =
    scan FindKeb es
-- The entry closed before any <keb> turned up.
scan FindKeb ( (SaxElementClose "entry") : es ) =
    "(none)" : (scan FindEntry es)
scan FindKeb ( (SaxElementOpen "keb" _) : es ) =
    scan FindText es
-- Skip a bare newline, then take the first real text inside <keb>.
scan FindText ( (SaxCharData "\n") : es ) =
    scan FindText es
scan FindText ( (SaxCharData txt) : es ) =
    txt : (scan FindEntry es)
-- Any other event is ignored; the state stays the same.
scan st ( _ : es ) = scan st es

findKebs :: String -> [String]
findKebs i =
    let (es, _) = saxParse "" i in
    scan FindEntry es
To understand how it works, the most important line is the type signature “scan :: ParserState -> [SaxElement] -> [String]”, which Haskell does not actually require, since it can infer the type by itself. From that line we know that “scan” is a function that expects a ParserState as its first parameter and a list of SaxElements as its second parameter, and returns a list of Strings. Everything else is a simple matter of recursion and pattern matching :-)
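To watch the recursion and pattern matching at work without installing HaXml, here is a minimal sketch of the same state machine. It assumes a made-up stand-in type, Event, in place of SaxElement (only the three constructors the script matches on, with attributes dropped), and the sample list is invented for illustration; it is not real parser output.

```haskell
-- Hypothetical stand-in for the three SaxElement constructors used above.
-- Not part of HaXml; attributes are left out to keep the sketch small.
data Event = Open String | Close String | Chars String

data ParserState = FindEntry | FindKeb | FindText

-- Same shape as the original scan, with Event in place of SaxElement.
scan :: ParserState -> [Event] -> [String]
scan _ [] = []
scan FindEntry (Open "entry" : es) = scan FindKeb es
scan FindKeb (Close "entry" : es) = "(none)" : scan FindEntry es
scan FindKeb (Open "keb" : es) = scan FindText es
scan FindText (Chars "\n" : es) = scan FindText es
scan FindText (Chars txt : es) = txt : scan FindEntry es
scan st (_ : es) = scan st es

-- Invented event stream: one entry with a <keb>, one without.
sample :: [Event]
sample =
  [ Open "entry", Open "keb", Chars "word", Close "keb", Close "entry"
  , Open "entry", Open "reb", Chars "other", Close "reb", Close "entry"
  ]
```

Loading this into GHCi and evaluating scan FindEntry sample gives ["word","(none)"]: the first entry yields the text of its keb, and the second falls through to the "(none)" case because its entry closes while the scanner is still in FindKeb.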