pyPEG – a PEG Parser-Interpreter in Python
Requires Python 3.x or 2.7
Older versions: pyPEG 1.x
Offers parsing and composing capabilities. Implements an intrinsic Packrat parser.
pyPEG uses memoization to speed up parsing. Create a new Parser instance to start with an empty cache memory. This is usually recommended when you parse a different text: the cache memory will not produce wrong results, but starting fresh reduces memory consumption. If you alter the grammar, you must clear the cache memory for the affected things to get correct parsing results. Use the clear_memory() method in that case.
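The following is a minimal sketch of why a memoizing parser needs its cache cleared once a grammar rule changes. It is not pyPEG's implementation: the class TinyPackrat, its match() method and the evaluations counter are hypothetical names made up for illustration, and keying the cache by rule name and remaining text is an assumption chosen to keep the example short.

```python
import re


class TinyPackrat:
    """Toy memoizing matcher (hypothetical; not pyPEG's implementation)."""

    def __init__(self):
        self.memory = {}      # rule name -> {remaining text: cached result}
        self.evaluations = 0  # number of uncached rule evaluations

    def clear_memory(self, rule=None):
        """Forget cached results for one rule, or for all rules if rule is None."""
        if rule is None:
            self.memory.clear()
        else:
            self.memory.pop(rule, None)

    def match(self, rule, pattern, text):
        """Match regex pattern at the start of text, caching per rule and text."""
        cache = self.memory.setdefault(rule, {})
        if text in cache:
            return cache[text]
        self.evaluations += 1
        m = re.match(pattern, text)
        result = (m.group(0), text[m.end():]) if m else None
        cache[text] = result
        return result


p = TinyPackrat()
first = p.match("number", r"\d+", "42 apples")
second = p.match("number", r"\d+", "42 apples")    # served from the cache

# The rule "number" gets a new definition, but the cache still holds the old result.
stale = p.match("number", r"\d+ \w", "42 apples")

# Clearing the cache for that rule forces a fresh evaluation.
p.clear_memory("number")
fresh = p.match("number", r"\d+ \w", "42 apples")
```

In this sketch a different input text simply misses the cache, so reusing the parser stays correct and only costs memory. A changed rule definition, by contrast, returns the stale result until the cache for that rule is cleared.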
The instance variables represent the parser's state.
| Regular expression to scan whitespace; default: |
|
|
| after parsing, |
| string to use to indent while composing; default: four spaces |
| level to indent to; default: |
| original text to parse; set for decorated syntax errors |
| name of the file the text originates from |
| add blanks while composing if grammar would possibly be violated otherwise; default: True |
| keep otherwise cropped things like comments and whitespace; these things are being put into the |
__init__(self)
Initialize instance variables to their defaults.
clear_memory(self, thing=None)
Clear cache memory for packrat parsing.
This method clears the cache memory for thing. If None is given as thing, it clears the cache completely.
| thing for which cache memory is cleared; default: None |
parse(self, text, thing, filename=None)
(Partially) parse text following thing as grammar and return the resulting things.
This method parses as far as possible. It does not raise a SyntaxError if the source text does not parse completely. It returns a SyntaxError object as the result part of the return value if the beginning of the source text does not comply with the grammar thing.
| text to parse |
| grammar for things to parse |
| name of the file the text originates from |
Returns (text, result) with:
| unparsed text |
| generated objects |
| if input does not match types |
| if output classes have wrong syntax for their respective |
| if grammar contains an object of unknown type |
| if grammar contains an illegal cardinality value |
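Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.parse("hello, world!", csl(word))
('!', ['hello', 'world'])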
compose(self, thing, grammar=None)
Compose text using thing with grammar. If thing.compose() exists, execute it, otherwise use grammar to compose.
|
|
|
|
Composed text
| if |
| if |
| if |
Example:
>>> from pypeg2 import Parser, csl, word
>>> p = Parser()
>>> p.compose(['hello', 'world'], csl(word))
'hello, world'
generate_syntax_error(self, msg, pos)
Generate a syntax error construct.
| string with error message |
|
|
Instance of SyntaxError with error text
parse(text, thing, filename=None, whitespace=whitespace, comment=None, keep_feeble_things=False)
Parse text following thing as grammar and return the resulting things or raise an error.
|
|
|
|
|
|
| regular expression to skip |
|
|
| keep otherwise cropped things like comments and whitespace; these things are being put into the |
generated things
| if |
| if input does not match types |
| if output classes have wrong syntax for |
| if |
| if |
Example:
>>> from pypeg2 import parse, csl, word
>>> parse("hello, world", csl(word))
['hello', 'world']
compose(thing, grammar=None, indent="    ", autoblank=True)
Compose text using thing with grammar.
|
|
|
|
| string to use to indent while composing; default: four spaces |
| add blanks if grammar would possibly be violated otherwise; default: True |
composed text
| if input does not match |
| if |
| if |
Example:
>>> from pypeg2 import compose, csl, word
>>> compose(['hello', 'world'], csl(word))
'hello, world'
attributes(grammar, invisible=False)
Iterates over all attributes of a grammar.
This function can be used to iterate through all attributes which will be generated for the top level object of the grammar. If invisible is False, attributes whose names start with an underscore _ are omitted.
Example:
>>> from pypeg2 import attr, name, attributes, word, restline
>>> class Me:
...     grammar = name(), attr("typing", word), restline
...
>>> for a in attributes(Me.grammar): print(a.name)
...
name
typing
how_many(grammar)
Determines how many objects parsing the grammar can produce.
This function is meant to check whether the results of a grammar can be stored in a single object or whether a collection will be needed.
| 0 | if there will be no objects |
| 1 | if there will be a maximum of one object |
| 2 | if there can be more than one object |
| if |
| if |
Example:
>>> from pypeg2 import how_many, word, csl
>>> how_many("some")
0
>>> how_many(word)
1
>>> how_many(csl(word))
2
Base class for all errors pyPEG delivers.
A grammar contains an object of a type which cannot be parsed, for example an instance of an unknown class or of a basic type like float. It can be caused by an int at the wrong place, too.
A grammar contains an object with an illegal value, for example an undefined cardinality.