Pyli/Syntax

From Jonathan Gardner's Tech Wiki

Introduction

This chapter describes the syntax of Pyli, as well as the semantics.

The syntax is very basic and Lisp-like. Expressions are either an atom or a vector. Comments start with # and go to the end of the line.

Note that quoting, traditionally handled with ', `, ,, and @ in Common Lisp, is handled in a very different way in Pyli.

Note: I am not strictly describing where whitespace goes. I think you can figure it out below.

Overview

NOTE: This section is the most correct and up-to-date.

Everything is a function call

Everything, by default, is a function call:

(+ 1 2)
-> 3

The above looks up what '+' is (the addition function), what '1' is (the number 1), and what '2' is (the number 2), then executes the function that '+' refers to with the parameters 1 and 2.
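The lookup-then-call rule can be sketched in a few lines of Python. This is only an illustration, not the Pyli implementation: atoms are modelled as Python strings, vectors as Python lists, and the environment dictionary ENV (a hypothetical name) maps the symbols "+", "1", and "2" to their values.

```python
# Hypothetical environment: every symbol, including "1" and "2", is looked up.
ENV = {
    "+": lambda *args: sum(args),
    "1": 1,
    "2": 2,
}

def evaluate(expr, env):
    """Evaluate an atom (a str) or a non-empty vector (a list) in env."""
    if isinstance(expr, str):
        return env[expr]                      # look up what the symbol refers to
    # Evaluate every item, then call the first with the rest as parameters.
    func, *args = [evaluate(item, env) for item in expr]
    return func(*args)

result = evaluate(["+", "1", "2"], ENV)       # (+ 1 2)
print(result)                                 # 3
```

Nested vectors work the same way, since each parameter is evaluated before the call.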

Very Special Exceptions

There are some exceptions that violate this rule:

(quote foo)
-> "foo"

This will simply return foo, literally. Since symbols are really strings, this is the string "foo".

(inline foo)

This will take whatever foo is, evaluate it expecting a Vector, and patch that vector into whatever context this expression is in.

Quoting

Sometimes, you want to go up a level. That is, you want to talk about the expressions themselves rather than actually execute them. This is what quoting is all about.

(quote (some expression I made up))
(quote abc)

evaluates to:

(some expression I made up)
abc

which are a vector and a string, respectively.

A shorthand is:

&(some expression I made up)
&abc

So the rule is, & means quote.

Evaluating

Sometimes you want to go down a level, that is, you want to talk about the things the expressions represent rather than the expressions. This is done with extra evaluation.

(eval something)
(eval (some expression I made up))

Evaluates to whatever something is, evaluated again. That is, if "something" referred to "apple", then we would look up what "apple" meant. If something were a vector (+ 1 1), then we would evaluate that.

In the second line, we are evaluating (some expression I made up), which means: look up 'some', 'expression', 'I', 'made', and 'up', and then call the function that 'some' referred to with whatever 'expression', 'I', 'made', and 'up' referred to. With that result, evaluate it again, whatever it is, whether it is "apple" (look up what 'apple' is) or '(+ 1 1)'.

There is a convenient shorthand for this as well:

$something
$(some expression I made up)

Combining the Two

We can combine the two in interesting ways.

(define one 1)
&(+ $one one)
-> (+ 1 one)

This will look up the word "one", and replace the second element with that value. The rest are left intact.

$&(+ 1 1)
-> 2

&$(+ 1 1)

In $&(+ 1 1), the $ evaluates the thing that &(+ 1 1) gives. Well, that gives a vector with three items, being "+", "1", and "1". Evaluating that will give you 2.
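The interplay of quote and eval can be sketched in Python. This is a rough model under stated assumptions, not the real evaluator: vectors are lists, atoms are strings, & and $ are already expanded to (quote ...) and (eval ...), and a (eval x) found inside a quoted vector is evaluated once, which is an assumption read off the (+ $one one) example above.

```python
def evaluate(expr, env):
    """Evaluate a Pyli-like expression: str = symbol, list = vector."""
    if isinstance(expr, str):
        return env[expr]                      # look up a symbol
    if not isinstance(expr, list):
        return expr                           # already a value, such as 1
    if expr[0] == "quote":                    # &x
        return quote(expr[1], env)
    if expr[0] == "eval":                     # $x: evaluate, then evaluate again
        return evaluate(evaluate(expr[1], env), env)
    func, *args = [evaluate(item, env) for item in expr]
    return func(*args)

def quote(expr, env):
    """Return expr unevaluated, except for nested (eval x) forms."""
    if isinstance(expr, list):
        if expr[:1] == ["eval"]:
            return evaluate(expr[1], env)     # assumption: a single evaluation
        return [quote(item, env) for item in expr]
    return expr

# Hypothetical environment; (define one 1) is modelled as a plain entry.
ENV = {"+": lambda *args: sum(args), "1": 1, "one": 1}

partly_quoted = evaluate(["quote", ["+", ["eval", "one"], "one"]], ENV)
print(partly_quoted)                          # ['+', 1, 'one']

cancelled = evaluate(["eval", ["quote", ["+", "1", "1"]]], ENV)
print(cancelled)                              # 2
```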

Passing Around Bits of Code

Oftentimes, you have to pass around bits of code that will be executed later. Vectors are, natively, representations of expressions that can be evaluated later on. So you would use the & marker (or the quote special function) to do this.

(if (> var 5)
   &(write "Bigger than 5!")
   &(write "Not bigger than 5."))

If we wrote it the naive way:

(if (> var 5)
   (write "Bigger than 5!")
   (write "Not bigger than 5."))

then before the function that "if" refers to is even called, we would have already called the two write expressions. The "if" function would actually try to evaluate whatever the results of those expressions were, and that would likely be an error.
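The difference between the two versions can be made concrete with a small Python model. Everything here is a sketch under assumptions: "if" is modelled as an ordinary function stored in the environment, write appends to a list named written instead of printing, and string literals are represented as quoted atoms.

```python
written = []  # stands in for the output stream so the effect can be inspected

def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, list):
        return expr                           # a plain value evaluates to itself
    if expr[0] == "quote":
        return expr[1]                        # hand back the code itself
    func, *args = [evaluate(item, env) for item in expr]
    return func(*args)

def make_env(var):
    env = {"var": var, "5": 5, ">": lambda a, b: a > b, "write": written.append}
    # "if" receives its branches as data and evaluates only the selected one.
    env["if"] = lambda cond, then, other: evaluate(then if cond else other, env)
    return env

def message(text):
    return ["write", ["quote", text]]         # (write "text")

quoted_program = ["if", [">", "var", "5"],
                  ["quote", message("Bigger than 5!")],
                  ["quote", message("Not bigger than 5.")]]
naive_program = ["if", [">", "var", "5"],
                 message("Bigger than 5!"),
                 message("Not bigger than 5.")]

evaluate(quoted_program, make_env(7))
print(written)                                # ['Bigger than 5!']
```

Evaluating naive_program in the same environment runs both write calls before "if" is ever entered. In this toy model the leftover results are harmlessly passed through; a stricter evaluator would raise an error at that point.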

Inlining

Oftentimes, we want to create an expression by patching together other vectors. This is easily accomplished with the + function, but that sometimes lacks the clarity that embedding the bits together would provide.

To increase clarity, the inline special function is introduced. It has the following effect:

(define foo &(1 2 3))
(+ (inline foo))
# (+ 1 2 3)
-> 6
(+ 1 2 3 (inline foo) 1 2 3)
# (+ 1 2 3 1 2 3 1 2 3)
-> 18
(+ (inline (* foo 4)))
# (+ (inline (* (1 2 3) 4)))
# (+ (inline &(1 2 3 1 2 3 1 2 3 1 2 3)))
# (+ 1 2 3 1 2 3 1 2 3 1 2 3)
-> 24

To further increase clarity, inlining can be represented with the '@' symbol. The above code is simply:

(define foo &(1 2 3))
(+ @foo)
(+ 1 2 3 @foo 1 2 3)
(+ @(* foo 4))

This is exceptionally useful if you want to pass along the rest args:

(defn foo &(a b (rest args))
   &(+ a b @args))

Which, when called with parameters:

(foo 1 2 3 4 5)
# a => 1
# b => 2
# args => (3 4 5)
# (+ a b @args)
# (+ 1 2 3 4 5)
-> 15
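The splicing that inline performs, including the rest-args case, can be sketched in Python. This is an illustrative model only: vectors are lists, @x is already expanded to (inline x), define is modelled by putting foo straight into the hypothetical ENV dictionary, and the function is named add_all here to avoid clashing with that vector.

```python
def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, list):
        return expr                           # a plain value evaluates to itself
    items = []
    for item in expr:
        if isinstance(item, list) and item[:1] == ["inline"]:
            # Evaluate the inlined expression and splice its items in place.
            items.extend(evaluate(item[1], env))
        else:
            items.append(item)
    func, *args = [evaluate(item, env) for item in items]
    return func(*args)

ENV = {str(n): n for n in range(1, 6)}        # the symbols "1" .. "5"
ENV["+"] = lambda *args: sum(args)
ENV["*"] = lambda vector, count: vector * count   # repeats a vector
ENV["foo"] = ["1", "2", "3"]                  # (define foo &(1 2 3))

def add_all(a, b, *args):
    """Model of (defn foo &(a b (rest args)) &(+ a b @args))."""
    local = {**ENV, "a": a, "b": b, "args": list(args)}
    return evaluate(["+", "a", "b", ["inline", "args"]], local)

ENV["add-all"] = add_all

print(evaluate(["+", ["inline", "foo"]], ENV))                    # 6
print(evaluate(["+", ["inline", ["*", "foo", "4"]]], ENV))        # 24
print(evaluate(["add-all", "1", "2", "3", "4", "5"], ENV))        # 15
```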

Specifics

The following is a detailed description of the syntax in a lazy BNF format.

Text Format

Pyli programs must be written in a way that can be interpreted as Unicode. (I don't have a way to tell if a file is in one encoding or another. One day, this problem will simply go away as the OS will provide this critical service.)

If the encoding can't be determined, it will be assumed to be UTF-8.

Whitespace

Whitespace includes newlines, tabs, spaces, and carriage returns.

Program

A Pyli program is a bunch of expressions mixed with comments.

pyli := comment* (expression comment*)*

A program may be a module or a script. The program is executed, one expression at a time, serially. There is no separate compile or read time evaluation.

Comments

A comment starts at a '#' and ends at the end of the line. There are no comment delimiters like /* ... */ in C. Note that this is very different from Lisp, which uses ';'.

comment := '#' to the end of the line
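A comment stripper for this rule might look like the Python sketch below. It assumes that a '#' inside a quoted string is part of the string rather than the start of a comment; the text above does not say either way, so treat that as a guess. The function name strip_comments is made up for the example.

```python
def strip_comments(source):
    """Remove '#' comments, leaving quoted strings and newlines untouched."""
    out = []
    quote = None                  # quote character of the string we are inside
    i = 0
    while i < len(source):
        ch = source[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < len(source):
                out.append(source[i + 1])     # keep the escaped character as-is
                i += 1
            elif ch == quote:
                quote = None                  # the string ends here
        elif ch in "\"'":
            quote = ch
            out.append(ch)
        elif ch == "#":
            # Skip to the end of the line; the newline itself is kept.
            while i < len(source) and source[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)

print(strip_comments('(+ 1 2) # add\n(write "a # b")\n'))
```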

Expression

An expression can be one of the following:

expression := atom | vector | ampersand-quote | dollar-eval | at-inline | square-vector

The expression, when evaluated, is parsed into either a string (representing the atom) or a vector (representing the appropriate code). To evaluate it, this is simply passed as the only argument to the eval method. (See Pyli/Evaluation Functions.)

Atom

An atom is something like 'foo' or "a quoted string", or a number like 5, 5.0e4, etc...

atom := symbol | quoted-string | number | ...

Quoted String

A quoted string is surrounded by double quotes or single quotes. Inside, you can have text or escape sequences like \t, etc...

quoted-string := '"' text '"' | "'" text "'"
text := text-escape text | not-text-escape text | ε 
text-escape := '\\' | '\n' | '\r' | '\t' | "\'" | '\"'
            | '\b' [01]{8}
            | '\d' digit{3}
            | '\o' oct-digit{3}
            | '\x' hex-digit{2}
            | '\u' hex-digit{4}
            | '\U' hex-digit{8}
            | '\N{' unicode-name '}' # Not implemented yet
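The escape table can be turned into a small decoder, sketched here in Python. It assumes that every numeric escape (\b, \d, \o, \x, \u, \U) names a Unicode code point in the given base, which the grammar suggests but does not state. The names SIMPLE, NUMERIC, and decode are invented for the example, and \N{...} is left out since it is not implemented yet.

```python
import re

SIMPLE = {"\\": "\\", "n": "\n", "r": "\r", "t": "\t", "'": "'", '"': '"'}

# escape letter -> (regex for its digits, numeric base)
NUMERIC = {
    "b": ("[01]{8}", 2),
    "d": ("[0-9]{3}", 10),
    "o": ("[0-7]{3}", 8),
    "x": ("[0-9a-fA-F]{2}", 16),
    "u": ("[0-9a-fA-F]{4}", 16),
    "U": ("[0-9a-fA-F]{8}", 16),
}

ESCAPE = re.compile(
    r"\\(?:(?P<simple>[\\nrt'\"])|"
    + "|".join(f"{letter}(?P<{letter}>{digits})"
               for letter, (digits, _base) in NUMERIC.items())
    + ")"
)

def decode(text):
    """Replace the escape sequences in the body of a quoted string."""
    def replace(match):
        kind = match.lastgroup                # name of the group that matched
        if kind == "simple":
            return SIMPLE[match.group(kind)]
        # chr() raises ValueError for values above 0x10FFFF.
        return chr(int(match.group(kind), NUMERIC[kind][1]))
    return ESCAPE.sub(replace, text)

print(decode(r"\x41\d066\o103\b01000100\u0045"))   # ABCDE
```

Sequences that match none of the productions, such as a backslash followed by q, are left untouched by this sketch rather than rejected.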

Symbol

A symbol is pretty much like a Lisp symbol. There are some limitations but I haven't decided what they are yet.

symbol := much like Lisp

I'd like to use &, $, and @ in symbols, but these are really useful as shortcuts, so I likely won't.

Numbers

Here are the productions for numbers:

number := ('+' | '-')? (integer | integer-with-base | float) ('i' | 'd' | 'f')?

integer := digit+ (',' digit+)*

integer-with-base := '0b' bin-digit+ | '0o' oct-digit+ | '0d' digit+ | '0x' hex-digit+

float := digit+ ('.' digit*)? ('e' ('+'|'-')? digit+)?
       | digit* '.' digit+ ('e' ('+'|'-')? digit+)?

The following define some very simple character classes referenced above:

bin-digit := 0, 1

oct-digit := 0-7

digit := 0-9

hex-digit := 0-9, a-f, A-F

Numbers can be simple integers, which are always read as base 10. A leading + or - is allowed and is not interpreted as an operator. Commas are allowed and ignored. Leading zeros are not interpreted as octal.

1 23 +1234 -1234 1,234,456 099

Or you can specify the base using the '0b' (binary), '0o' (octal), '0x' (hexadecimal) or even '0d' (decimal) prefixes. Hexadecimal allows A-F or a-f for the digits representing 10 to 15.

0b101 0o737 0d995 0xbeAD

These all evaluate to an Integer of the appropriate value. Note that you can't have a hexadecimal Decimal: in a hexadecimal literal, a final 'd' is always interpreted as a digit, never as a type suffix.

Numbers can be represented with decimal points.

12.345 0.345 .435 1.

(Note: Although other locales switch the meanings of . and , we keep the US convention here.)

Numbers can be represented in scientific notation, with 'e' being interpreted as "times ten to the power of".

12.345e24 -4.3e-12

These are all interpreted as Floats of the appropriate value.

Finally, if you append 'i', 'f', or 'd', you will get an Integer, Float, or Decimal. (Decimals are just like floats except they are in base 10 and not 2 internally.)

12e10i => 120,000,000,000 => Integer
12e10f => 1.2e11 => Float
12e10d => Decimal
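The number rules can be checked with a regex-based reader, sketched in Python below. It is one possible reading of the productions, not the actual Pyli lexer: the names NUMBER and parse_number are invented, Python's int, float, and decimal.Decimal stand in for Integer, Float, and Decimal, and an exponent is assumed to need at least one digit.

```python
import re
from decimal import Decimal

NUMBER = re.compile(r"""
    (?P<sign>[+-])?
    (?:
        0b(?P<bin>[01]+) | 0o(?P<oct>[0-7]+)
      | 0d(?P<dec>[0-9]+) | 0x(?P<hex>[0-9a-fA-F]+)
      | (?P<float> [0-9]+\.[0-9]*(?:e[+-]?[0-9]+)?
                 | [0-9]*\.[0-9]+(?:e[+-]?[0-9]+)?
                 | [0-9]+e[+-]?[0-9]+ )
      | (?P<int>[0-9]+(?:,[0-9]+)*)
    )
    (?P<suffix>[idf])?
""", re.VERBOSE)

def parse_number(text):
    match = NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a number: {text!r}")
    suffix = match.group("suffix")
    for name, base in (("bin", 2), ("oct", 8), ("dec", 10), ("hex", 16)):
        if match.group(name) is not None:
            value = int(match.group(name), base)
            break
    else:
        if match.group("int") is not None:
            value = int(match.group("int").replace(",", ""))   # commas ignored
        else:
            value = Decimal(match.group("float"))   # exact until a type is chosen
            suffix = suffix or "f"                   # floats are the default here
    if match.group("sign") == "-":
        value = -value
    if suffix == "i":
        return int(value)
    if suffix == "f":
        return float(value)
    if suffix == "d":
        return Decimal(value)
    return value

print(parse_number("1,234,456"), parse_number("0xbeAD"), parse_number("12e10i"))
```

Because the hexadecimal digits are matched greedily, the final d in 0xbeAD is consumed as a digit, which matches the rule stated above.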

Vector

A vector is a bunch of expressions between parentheses.

vector := "(" expression* ")"

The meaning of this, if evaluated, is to call the function, macro, or generator that the first item evaluates to with the parameters following. Note that the parameters may not all be evaluated, or evaluated in left-to-right order. This depends on what the operator will do.

Ampersand-quote

An ampersand-quote is an expression that will be quoted. It translates to (quote expression).

ampersand-quote := "&" expression

This syntax is interpreted as a call to quote with the expression being the sole argument.

&foo => (quote foo)
&(a b c) => (quote (a b c))

The net effect is to return the value unevaluated if it would normally be evaluated.

Square-vector

A square-vector is surrounded by square brackets. It translates to (Vector expression1 expression2 ...).

square-vector := "[" expression* "]"

Square-vectors are not the equivalent of quoted expressions. Each expression inside the square-vector will be evaluated, unlike quoted expressions.

Note that the ampersand-quote and square-vector relate to each other in a strange way that may be useful. The following two expressions are equivalent:

[&a &b &c]
&(a b c)

Remember this if you'd ever like to quote something with an expression inside that should be evaluated or inlined.
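The equivalence can be checked with a toy evaluator in Python. As before, this is a sketch under assumptions: vectors are lists, & is already expanded to (quote ...), a square-vector is already expanded to a call to Vector, and the symbol x bound to 10 is a made-up example.

```python
def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    if expr[0] == "quote":
        return expr[1]                        # quoted: returned unevaluated
    func, *args = [evaluate(item, env) for item in expr]
    return func(*args)

ENV = {"Vector": lambda *items: list(items), "x": 10}

square = ["Vector", ["quote", "a"], ["quote", "b"], ["quote", "c"]]  # [&a &b &c]
quoted = ["quote", ["a", "b", "c"]]                                  # &(a b c)
mixed = ["Vector", ["quote", "a"], "x"]                              # [&a x]

print(evaluate(square, ENV) == evaluate(quoted, ENV))   # True
print(evaluate(mixed, ENV))                             # ['a', 10]
```

The last line is the useful case: only the unquoted item x is evaluated, so the result mixes literal symbols with computed values.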

Dollar-eval

A dollar-eval says to do an extra eval on the expression. It translates to (eval expression).

dollar-eval := "$" expression

Dollar-eval can cancel out an ampersand-quote:

$&(+ 1 2) # => 3

At-inline

An at-inline expression means to eval the following expression and inline the result in the vector. It translates to (inline expression).

at-inline := "@" expression

[1 2 3 @[4 5 6] 7 8 9] => (1 2 3 4 5 6 7 8 9)
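Taken together, the four shorthand forms amount to a small rewriting step in the reader. The Python sketch below shows one way that step could work; it is not the real Pyli reader. It ignores quoted strings, comments, and number syntax, treats every other run of characters as a symbol, and the names tokenize, read, and parse are invented.

```python
import re

TOKEN = re.compile(r"\s*(?:([()\[\]&$@])|([^\s()\[\]&$@]+))")
PREFIX = {"&": "quote", "$": "eval", "@": "inline"}

def tokenize(source):
    """Split source into punctuation tokens and symbol tokens."""
    return [m.group(1) or m.group(2) for m in TOKEN.finditer(source)]

def read(tokens):
    """Read one expression from the front of the token list."""
    token = tokens.pop(0)
    if token in PREFIX:
        return [PREFIX[token], read(tokens)]  # &x => (quote x), and so on
    if token == "(":
        return read_until(tokens, ")")
    if token == "[":
        return ["Vector"] + read_until(tokens, "]")
    if token in (")", "]"):
        raise SyntaxError(f"unexpected {token}")
    return token                              # a symbol, kept as a string

def read_until(tokens, closer):
    """Read expressions up to the closing bracket (IndexError if missing)."""
    items = []
    while tokens[0] != closer:
        items.append(read(tokens))
    tokens.pop(0)                             # drop the closer
    return items

def parse(source):
    return read(tokenize(source))

print(parse("$&(+ 1 2)"))   # ['eval', ['quote', ['+', '1', '2']]]
```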