honeycomb


Honeycomb is a parser combinator library written in pure Nim. It's designed to be simple, straightforward, and easy to extend, with no dependencies outside of Nim's standard library.

Honeycomb was heavily inspired by the excellent Python library parsy, as well as the existing but unmaintained combparser.

Example:

import honeycomb
let
  parser  = ((s("Hello") | s("Greetings")) << c(',') << whitespace) & (regex(r"\w+") << c("!."))
  result1 = parser.parse("Hello, world!")
  result2 = parser.parse("Greetings, peasants.")

assert result1.kind  == success
assert result1.value == @["Hello", "world"]

assert result2.kind  == success
assert result2.value == @["Greetings", "peasants"]
Honeycomb supports the following key features:
  • Predefined parsers and parser constructors for numerous basic parsing needs
  • An extensive library of combinators with which to combine them
  • Support for manually defining custom parsers / combinators
  • Forward-declared parsers to support mutually recursive parser definitions

Key functions and types

Core parser constructors

  • s - parse a literal string
  • c - parse a literal character, or one of a list or range of characters
  • regex - parse a regular expression match
  • nop - always succeed, consuming no input

Predefined parsers

  • eof - A parser that fails if there is any remaining input.
  • anyChar - A parser that succeeds for one character of any non-empty input.
  • whitespace - A parser that expects at least one whitespace character.
  • letter - A parser that expects one ASCII alphabetical character.
  • digit - A parser that expects one ASCII digit character.
  • alphanumeric - A parser that expects one ASCII alphanumeric character.

Parser combinators

  • & or chain - expect multiple parsers one after the other
  • | or oneOf - expect one of multiple parsers, preferring the left
  • >> or then - same as &, but discards the left-hand parser's result instead of creating a seq
  • << or skip - same as &, but discards the right-hand parser's result instead of creating a seq
  • * or times - expect a parser multiple times in a row, or a range of times
  • ! or negla - negative lookahead; expect a parser to fail, but consume no input
  • many - expect a parser 0 or more times
  • atLeast - expect a parser at least n times
  • atMost - expect a parser 0 to n times
  • optional - expect a parser optionally, returning the default value of its result type if it doesn't match
  • orEmpty - expect a parser optionally, returning it in a seq or an empty seq if it doesn't match
  • map - run a custom function on the value of a successful parse
  • mapEach - run a custom function on each value of a successful parse containing a seq
  • result - replace the value of a successful parse with a constant value
  • filter - filter the results of a successful parse by a predicate function
  • validate - validate the result of a successful parse by a predicate function
  • flatten - remove a level of nested seqs from a parser
  • removeEmpty - remove empty seqs from a parser resulting in nested seqs
  • desc - set a custom description to be shown when a parser fails
  • asSeq - wrap a parser's result in a seq
  • asString - convert a parser's result to a string via $

Execution and results

  • parse - execute a parser on an input string, producing a ParseResult
  • error - generate an error message from a failed ParseResult
  • raiseIfFailed - raise a ParseError from a failed ParseResult
  • lineInfo - get the line and column at which a parser ended

Advanced parser construction tools

  • createParser - manually create a parser by defining its processing function
  • succeed - create a successful ParseResult
  • fail - create a failed ParseResult
  • applyParser - apply a parser in the definition of a combinator, failing if the given parser fails
  • fwdcl - create a forward-declared parser which can be initialized later

Types

ParseError = object of CatchableError
Parser[T] = ref object
  body: proc (input: string): ParseResult[T]

A constructed parser.

ParseResult[T] = object
  case kind*: ParseResultKind
  of success:
    value*: T                ## The value of the successful parse.
  of failure:
    expected*: seq[string]   ## A `seq` of expected values for a failed parse.
  tail*: string              ## The remaining unparsed input.
  fromInput*: string         ## The input from which this result was generated.
  

The result of a parser run.

ParseResultKind = enum
  success, failure

Lets

alphanumeric = letter | digit
A parser that expects one ASCII alphanumeric character.
anyChar = anyCharImpl
A parser that succeeds for one character of any non-empty input.
digit = c('0' .. '9')
A parser that expects one ASCII digit character.
eof = eofImpl
A parser that fails if there is any remaining input.
letter = c('a' .. 'z') | c('A' .. 'Z')
A parser that expects one ASCII alphabetical character.
whitespace = regex(r"\s+")
A parser that expects at least one whitespace character.
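
As a rough illustration of how these predefined parsers compose, the sketch below builds a parser for a run of ASCII letters that must span the whole input. It uses only names documented on this page (letter, atLeast, join, <<, eof); the variable names are arbitrary.

```nim
import honeycomb

let
  # One or more letters, joined into a string, followed by end of input.
  word    = letter.atLeast(1).join << eof
  result1 = word.parse("Hello")
  result2 = word.parse("Hello!")

assert result1.kind  == success
assert result1.value == "Hello"
# The trailing '!' is left over, so `eof` fails.
assert result2.kind  == failure
```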

Procs

func `!`[T](a: Parser[T]): Parser[T]

Succeeds if the given parser fails and fails if it succeeds, consuming no input regardless. The resulting value if successful will be the default for type T.

See also:

  • negla - textual equivalent to this operator

Example:

let
  parser  = !s("Hello")
  result1 = parser.parse("Hello, world!")
  result2 = parser.parse("Greetings, peasants!")

assert result1.kind  == failure
assert result1.error == "[1:1] Expected successful negative lookahead"
assert result2.kind  == success
assert result2.value == ""
func `&`[T](a, b: Parser[seq[T]]): Parser[seq[T]]

Expects each parser in sequence from left to right, creating a seq of their results. If one or both of the parsers already results in a seq of the other's type, the two seqs will be merged.

See also:

  • chain - textual equivalent to this operator

Example:

let
  parser = s("Hello, ") & s("world!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == @["Hello, ", "world!"]
func `*`[T](a: Parser[T]; n: int): Parser[seq[T]]

Expects the parser a given number of times, returning a seq of the matches. A Slice[int] may be passed instead to give a range of valid amounts (see the Slice[int] overload of * under Templates).

Note that this will succeed early if the given parser succeeds but doesn't consume any input, in order to prevent infinite loops caused by parsers like nop or atMost. This means it may not work correctly on parsers with non-deterministic behavior or which use/modify external state; this is intentionally undefined behavior.

Example:

let
  parser = s("Hello ") * 3
  result = parser.parse("Hello Hello Hello ")

assert result.kind  == success
assert result.value == @["Hello ", "Hello ", "Hello "]
func `<<`[T](a: Parser[T]; b: Parser): Parser[T]

Expects each parser in sequence from left to right, ignoring the result of the right parser if successful.

See also:

  • skip - textual equivalent to this operator
  • >> / then

Example:

let
  parser = s("Hello, ") << s("world!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "Hello, "
func `>>`[T](a: Parser; b: Parser[T]): Parser[T]

Expects each parser in sequence from left to right, ignoring the result of the left parser if successful.

See also:

  • then - textual equivalent to this operator
  • << / skip

Example:

let
  parser = s("Hello, ") >> s("world!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "world!"
func `|`[T](a, b: Parser[T]): Parser[T]

Succeeds if either parser succeeds, attempting them from left to right.

See also:

  • oneOf - textual equivalent to this operator

Example:

let
  parser  = s("Hello") | s("Greetings")
  result1 = parser.parse("Hello, world!")
  result2 = parser.parse("Greetings, peasants!")

assert result1.kind  == success
assert result1.value == "Hello"
assert result2.kind  == success
assert result2.value == "Greetings"
func c(expect: char): Parser[char] {.raises: [], tags: [].}

Creates a parser matching exactly the given character.

Example:

let
  parser = c('H')
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == 'H'
func c(expect: Slice[char]): Parser[char] {.raises: [], tags: [].}
Creates a parser matching any one character from the given range.

Example:

let
  parser = c('H'..'K')
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == 'H'
func c(expect: string): Parser[char] {.raises: [], tags: [].}
Creates a parser matching any one character from the given string.

Example:

let
  parser = c("HIJK")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == 'H'
func desc[T](a: Parser[T]; description: string): Parser[T]
Add a custom description to a parser, which is shown when the parser fails instead of the default expectation.

Example:

let
  parser = s("Hello, world!").desc("a nice greeting")
  result = parser.parse("Greetings, peasants!")

assert result.kind == failure
assert result.error == "[1:1] Expected a nice greeting"
func error(result1: ParseResult; showPos: bool = true): string

Generate an error message from a failed ParseResult. Returns an empty string if passed a successful result.

If showPos is true, the error message is prepended with the line and column number at which the error occurred.

Example:

let
  parser = s("Hello, world!")
  result = parser.parse("Greetings, peasants!")

assert result.kind  == failure
assert result.error == "[1:1] Expected 'Hello, world!'"
func filter[T](a: Parser[seq[T]]; fn: proc (x: T): bool): Parser[seq[T]]

Filter the seq produced by a successful parse by the given predicate, keeping only those values for which it returns true.
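
A minimal sketch of filter, assuming the signature shown above: it parses a run of digits and drops the zeroes. The input string and variable names are only illustrative.

```nim
import honeycomb
from std/sugar import `=>`

let
  # Parse any number of digits, then keep only the non-zero ones.
  parser = digit.many.filter(x => x != '0')
  result = parser.parse("10203")

assert result.kind  == success
assert result.value == @['1', '2', '3']
```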

func lineInfo(result1: ParseResult): (int, int)

Get the line and column at which a ParseResult's tail begins.
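
A small sketch of lineInfo on a parse that fails at the very start of its input. It assumes the returned tuple is (line, column) with 1-based values, matching the "[1:1]" prefix that error produces for the same failure elsewhere on this page.

```nim
import honeycomb

let
  parser = s("Hello, world!")
  result = parser.parse("Greetings, peasants!")
  # Assumption: the tuple is ordered (line, column).
  (line, column) = result.lineInfo

assert result.kind == failure
assert line   == 1
assert column == 1
```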

func map[T, U](a: Parser[T]; fn: proc (x: T): U): Parser[U]

If the parser is successful, calls fn on the parsed value and succeeds with its return value.

Example:

from std/sugar import `=>`
let
  parser = (s("Hello, ") & s("world!")).map(x => (x[0], x[1]))
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == ("Hello, ", "world!")
func nop[T](): Parser[T]
Creates a parser for the given type which always succeeds, consumes no input, and has a value of the default for type T.
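
A minimal sketch of nop, assuming the result type is supplied as an explicit generic parameter since there are no arguments to infer it from:

```nim
import honeycomb

let
  parser = nop[string]()
  result = parser.parse("Hello")

assert result.kind  == success
assert result.value == ""       # default value for string
assert result.tail  == "Hello"  # no input was consumed
```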
proc parse[T](p: Parser[T]; input: string): ParseResult[T]
Execute a parser on the given input.

Example:

let
  parser = s("Hello, world!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "Hello, world!"
func regex(expect: string): Parser[string] {.raises: [RegexError], tags: [].}

Creates a parser matching the given regex. The regex must match from the start of the input.

Example:

let
  parser = regex(r"\w+, \w+!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "Hello, world!"
func s(expect: string): Parser[string] {.raises: [], tags: [].}

Creates a parser matching exactly the given string.

Example:

let
  parser = s("Hello, world!")
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "Hello, world!"
func validate[T](p: Parser[T]; fn: proc (x: T): bool; expected: string): Parser[T]

Validate the result of a successful parse by the given predicate, failing if it returns false.

Example:

from std/strutils import parseInt
from std/sugar import `=>`
let
  parser  = digit.atLeast(1).join.map(a => a.parseInt)
    .validate(a => a < 500, "number less than 500")
  result1 = parser.parse("874")
  result2 = parser.parse("323")

assert result1.kind == failure
assert result2.kind == success

Converters

converter asString(a: Parser[char]): Parser[string] {.raises: [], tags: [].}
Implicitly converts char parsers to string parsers for ease of use.

Macros

macro applyParser(parser, input, T: untyped)

Applies the given parser, evaluating to its result if successful or returning from the containing function if failed.

This is primarily a tool for simplifying the creation of combinators internally, and should only be used if you know what you're doing. It is only designed to function properly in the context of a block passed to createParser, and will emit a compile-time warning if used outside of this module.

Example:

# Honeycomb exports roughly this definition as the `&` operator.
func sequence[T](a, b: Parser[T]): Parser[seq[T]] =
  createParser(seq[T]):
    {.push warnings: off.} # Hide the warnings, we know what we're doing.
    let result1 = applyParser(a, input, seq[T])
    let result2 = applyParser(b, result1.tail, seq[T])
    {.pop.}
    return succeed(input, @[result1.value, result2.value], result2.tail)

let
  parser = sequence(s("Hello, "), s("world!"))
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == @["Hello, ", "world!"]

Templates

template `&`[T](a, b: Parser[T]): Parser[seq[T]]
Same as &, exists to wrap non-seq parsers in seqs.
template `&`[T](a: Parser[seq[T]]; b: Parser[T]): Parser[seq[T]]
Same as &, exists to wrap non-seq parsers in seqs.
template `&`[T](a: Parser[T]; b: Parser[seq[T]]): Parser[seq[T]]
Same as &, exists to wrap non-seq parsers in seqs.
template `*`[T](p: Parser[T]; n: Slice[int]): Parser[seq[T]]
Same as *, but takes a range of possible amounts, expecting at least the lower bound and at most the upper bound.
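
A short sketch of the range form, assuming matching is greedy up to the upper bound (which is what the removeEmpty example on this page suggests for atMost):

```nim
import honeycomb

let
  # Between two and three digits.
  parser  = digit * (2..3)
  result1 = parser.parse("12345")
  result2 = parser.parse("1a")

assert result1.kind  == success
assert result1.value == @['1', '2', '3']
assert result1.tail  == "45"
# Only one digit is available, which is below the lower bound.
assert result2.kind  == failure
```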
template asSeq[T](a: Parser[T]): Parser[seq[T]]
Wraps a parser's result in a seq.
template asString(a: Parser): Parser[string]
Converts a parser to a string parser via $.
template atLeast[T](a: Parser[T]; n: int): Parser[seq[T]]

Expects the parser n or more times, returning a seq of the matches.
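
For instance, a minimal sketch requiring two or more digits (the inputs are illustrative only):

```nim
import honeycomb

let
  parser  = digit.atLeast(2)
  result1 = parser.parse("1")
  result2 = parser.parse("123")

# A single digit is not enough.
assert result1.kind  == failure
assert result2.kind  == success
assert result2.value == @['1', '2', '3']
```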

template atMost[T](a: Parser[T]; n: int): Parser[seq[T]]

Expects the parser n or fewer times, returning a seq of the matches.
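
A minimal sketch of atMost, assuming it matches greedily and succeeds with an empty seq when there are no matches at all:

```nim
import honeycomb

let
  parser  = digit.atMost(2)
  result1 = parser.parse("123")
  result2 = parser.parse("abc")

assert result1.kind  == success
assert result1.value == @['1', '2']
assert result1.tail  == "3"   # the third digit is left unparsed
# Zero matches is still a success.
assert result2.kind  == success
assert result2.value.len == 0
```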

template become[T](a: var Parser[T]; b: Parser[T])
Initialize a forward-declared parser created with fwdcl, after which it can be used.
template chain[T](p1, p2: Parser[T]; ps: varargs[Parser[T]]): Parser[seq[T]]
Textual alternative to &. Accepts more than two parsers for convenience, chaining them in order.
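
A brief sketch of chain with three parsers, assuming the results are merged into one flat seq in the same way as repeated use of &:

```nim
import honeycomb

let
  parser = chain(s("Hello"), s(", "), s("world!"))
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == @["Hello", ", ", "world!"]
```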
template createParser(T: typedesc; parser_body: untyped): Parser[T]

Convenience template for creating a custom Parser. Expects the parser's result type as a parameter (not a generic!) and a block which will form the body of the parser.

Inside the given block, the following bindings are exposed:

  • let input: string
    The input string to be parsed.

  • func succeed(input: string, value: T, tail: string)
    Creates a successful ParseResult with the given value.

  • func fail(input: string, expected: seq[string], tail: string)
    Creates a failed ParseResult with the given expected values.

The block should return a ParseResult created by either succeed or fail, with the tail consisting of the remaining unparsed input. If the parser failed, the tail should almost always be the entire input; the exception is when a combinator needs to partially consume the input, as the & operator does.

Example:

# This is just a contrived example; for this exact interaction,
#   `s("power level").result(9001)` would be better.
let
  parser = createParser(int):
    if input == "power level": return succeed(input, 9001, "")
    fail(input, @["'power level'"], input)

  result = parser.parse("power level")

assert result.kind  == success
assert result.value == 9001
template flatten[T](p: Parser[seq[seq[T]]]): Parser[seq[T]]

Remove one level of nested seqs from a parser.

Example:

let
  parser = digit.atLeast(3).atLeast(1).flatten
  result = parser.parse("127456")

assert result.kind  == success
assert result.value == @['1', '2', '7', '4', '5', '6']
template fwdcl[T](): var Parser[T]

Create a forward-declared parser.

A forward-declared parser can be used normally, but must be initialized with become before you can call parse on it. A variable containing a forward-declared parser must be declared with var for become to function.

You should only use a forward-declared parser if you absolutely need one. Their primary use case is for creating mutually recursive definitions. If you don't need to use the parser in combinators before it's possible to define it, you probably don't need a forward-declared parser.

Example:

var parser1 = fwdcl[string]()
let parser2 = s("Hello, ") >> parser1 << c('!')

parser1.become(s("world"))

let result = parser2.parse("Hello, world!")

assert result.kind  == success
assert result.value == "world"
template join(a: Parser[seq[string or char]]; delim: string or char = ""): Parser[string]
Joins a seq[string] or seq[char] parser into a single string, using the given delimiter.

Example:

let
  parser = (s("Hello, ") & s("world!")).join
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == "Hello, world!"
template many[T](a: Parser[T]): Parser[seq[T]]

Expects the parser 0 or more times, returning a seq of the matches.
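
A minimal sketch of many, showing both a repeated match and the zero-match case (inputs chosen only for illustration):

```nim
import honeycomb

let
  parser  = s("ab").many
  result1 = parser.parse("ababc")
  result2 = parser.parse("xyz")

assert result1.kind  == success
assert result1.value == @["ab", "ab"]
assert result1.tail  == "c"
# No matches at all still succeeds, with an empty seq.
assert result2.kind  == success
assert result2.value.len == 0
```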

template mapEach[T, U](a: Parser[seq[T]]; fn: proc (x: T): U): Parser[seq[U]]

If the parser is successful, calls fn on each value in the resulting seq, succeeding with a seq of the results.

Example:

from std/strutils import toUpperAscii
let
  parser = (s("Hello, ") & s("world!")).mapEach(toUpperAscii)
  result = parser.parse("Hello, world!")

assert result.kind  == success
assert result.value == @["HELLO, ", "WORLD!"]
template negla(a: Parser): auto
Textual alternative to !.
template oneOf[T](p1, p2: Parser[T]; ps: varargs[Parser[T]]): Parser[T]
Textual alternative to |. Accepts more than two parsers for convenience, attempting them in order.
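
A brief sketch of oneOf with three alternatives; the third greeting is made up for the example:

```nim
import honeycomb

let
  parser = oneOf(s("Hello"), s("Greetings"), s("Salutations"))
  result = parser.parse("Salutations, friends!")

assert result.kind  == success
assert result.value == "Salutations"
```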
template optional[T](a: Parser[T]): Parser[T]

Expects the parser optionally, returning the default value of type T if it doesn't match.
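
A minimal sketch of optional on a string parser, where the default value on a miss is the empty string and no input is consumed:

```nim
import honeycomb

let
  parser  = s("Hello").optional
  result1 = parser.parse("Hello, world!")
  result2 = parser.parse("Greetings, peasants!")

assert result1.kind  == success
assert result1.value == "Hello"
# No match: still a success, with the default string value.
assert result2.kind  == success
assert result2.value == ""
assert result2.tail  == "Greetings, peasants!"
```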

template orEmpty[T](a: Parser[T]): Parser[seq[T]]
Expects the parser optionally, returning its result wrapped in a seq, or an empty seq if it doesn't match.
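
A minimal sketch of orEmpty, following the summary given under "Parser combinators" above (a match is wrapped in a seq, a miss yields an empty seq):

```nim
import honeycomb

let
  parser  = s("Hello").orEmpty
  result1 = parser.parse("Hello, world!")
  result2 = parser.parse("Greetings, peasants!")

assert result1.kind  == success
assert result1.value == @["Hello"]
assert result2.kind  == success
assert result2.value.len == 0
```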
template raiseIfFailed(result1: ParseResult)

Raise a ParseError if the given ParseResult is a failure.

Example:

let
  parser = s("Hello, world!")
  result = parser.parse("Greetings, peasants!")

try:
  result.raiseIfFailed   #! Error: [1:1] Expected 'Hello, world!'
except ParseError:
  discard
template removeEmpty[T](p: Parser[seq[seq[T]]]): Parser[seq[seq[T]]]

Remove empty seqs from a parser returning nested seqs.

Example:

let
  parser = digit.atMost(3).atLeast(4).removeEmpty
  result = parser.parse("127456")

assert result.kind  == success
assert result.value == @[@['1', '2', '7'], @['4', '5', '6']]
template result[T](a: Parser; r: T): Parser[T]

If the parser is successful, succeeds with the given r as value.

Example:

let
  parser = s("power level").result(9001)
  result = parser.parse("power level")

assert result.kind  == success
assert result.value == 9001
template skip(a, b: Parser): auto
Textual alternative to <<.
template then(a, b: Parser): auto
Textual alternative to >>.
template times(a: Parser; n: auto): auto
Textual alternative to *.
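
The textual alternatives can be used with method-call syntax in place of the operators. The sketch below mirrors the operator examples earlier on this page; the variable names are arbitrary.

```nim
import honeycomb

let
  # Equivalent to: s("Hello, ") >> s("world") << c('!')
  greeting = s("Hello, ").then(s("world")).skip(c('!'))
  result1  = greeting.parse("Hello, world!")

  # Equivalent to: s("ab") * 2
  repeated = s("ab").times(2)
  result2  = repeated.parse("abab")

  # Equivalent to: !s("Hello")
  lookahead = negla(s("Hello"))
  result3   = lookahead.parse("Greetings, peasants!")

assert result1.kind  == success
assert result1.value == "world"
assert result2.kind  == success
assert result2.value == @["ab", "ab"]
assert result3.kind  == success
```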