4i: Parsing (Useful Idioms)

++alf

Alphabetic characters

Parse alphabetic characters, both upper and lowercase.

Source

    ++  alf  ;~(pose low hig)                               ::  alphabetic

Examples

        > (scan "a" alf)
        ~~a

        > (scan "A" alf)
        ~~~41.

        > (scan "AaBbCc" (star alf))
        "AaBbCc"

++aln

Alphanumeric characters

Parse alphanumeric characters - both alphabetic characters and numbers.

Source

    ++  aln  ;~(pose low hig nud)                           ::  alphanumeric

Examples

        > (scan "0" aln)
        ~~0

        > (scan "alf42" (star aln))
        "alf42"

        > (scan "0123456789abcdef" (star aln))
        "0123456789abcdef"

++alp

Alphanumeric and -

Parse an alphanumeric character or hep, "-".

Source

    ++  alp  ;~(pose low hig nud hep)                       ::  alphanumeric and -

Examples

        > (scan "7" alp)
        ~~7

        > (scan "s" alp)
        ~~s

        > (scan "123abc-" (star alp))
        "123abc-"

++bet

Axis syntax -, +

Parse the hep and lus axis syntax.

Source

    ++  bet  ;~(pose (cold 2 hep) (cold 3 lus))             ::  axis syntax - +

Examples

        > (scan "-" bet)
        2

        > (scan "+" bet)
        3

++bin

Binary to atom

Parse a tape of binary (0s and 1s) and produce its atomic representation.

Source

    ++  bin  (bass 2 (most gon but))                        ::  binary to atom

Examples

        > (scan "0000" bin)
        0

        > (scan "0001" bin)
        1

        > (scan "0010" bin)
        2

        > (scan "100000001111" bin)
        2.063

++but

Binary digit

Parse a single binary digit.

Source

    ++  but  (cook |=(a=@ (sub a '0')) (shim '0' '1'))      ::  binary digit

Examples

        > (scan "0" but)
        0
        > (scan "1" but)
        1
        > (scan "01" but)
        ! {1 2}
        ! 'syntax-error'
        ! exit
        > (scan "01" (star but))
        ~[0 1]

++cit

Octal digit

Parse a single octal digit.

Source

    ++  cit  (cook |=(a=@ (sub a '0')) (shim '0' '7'))      ::  octal digit

Examples

        > (scan "1" cit)
        1
        > (scan "7" cit)
        7
        > (scan "8" cit)
        ! {1 1}
        ! 'syntax-error'
        ! exit
        > (scan "60" (star cit))
        ~[6 0]

++dem

Decimal to atom

Parse a decimal number to an atom.

Source

    ++  dem  (bass 10 (most gon dit))                       ::  decimal to atom

Examples

        > (scan "7" dem)
        7
        > (scan "42" dem)
        42
        > (scan "150000000" dem)
        150.000.000
        > (scan "12456" dem)
        12.456

++dit

Decimal digit

Parse a single decimal digit.

Source

    ++  dit  (cook |=(a=@ (sub a '0')) (shim '0' '9'))      ::  decimal digit

Examples

        > (scan "7" dit)
        7
        > (scan "42" (star dit))
        ~[4 2]
        > (scan "26000" (star dit))
        ~[2 6 0 0 0]

++dog

. optional gap

Dot followed by an optional gap, used with numbers.

Source

    ++  dog  ;~(plug dot gay)                               ::

Examples

    > 1.234.703
                703
    1.234.703

    > (scan "a.        " ;~(pfix alf dog))
    [~~~. ~]

++doh

@p separator

Parse the phonetic base (@p) phrase separator: a double hep, --, followed by an optional gap.

Source

    ++  doh  ;~(plug ;~(plug hep hep) gay)                  ::

Examples

    /> ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
    ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
    /> ~nopfel-botduc-nilnev-dolfyn--
                haspub-natlun-lodmur-holtyd
    ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
    > (scan "--" doh)
    [[~~- ~~-] ~]
    > (scan "--      " doh)
    [[~~- ~~-] ~]

++dun

-- to ~

Parse phep, --, to null, ~.

Source

    ++  dun  (cold ~ ;~(plug hep hep))                      ::  -- (phep) to ~

Examples

    > (scan "--" dun)
    ~
    > (dun [[1 1] "--"])
    [p=[p=1 q=3] q=[~ u=[p=~ q=[p=[p=1 q=3] q=""]]]]

++duz

== to ~

Parse stet, ==, to null ~.

Source

    ++  duz  (cold ~ ;~(plug tis tis))                      ::  == (stet) to ~

Examples

    > (scan "==" duz)
    ~
    > (duz [[1 1] "== |=..."])
    [p=[p=1 q=3] q=[~ u=[p=~ q=[p=[p=1 q=3] q=" |=..."]]]]

++gah

Newline or ' '

Whitespace component, either newline or space.

Source

    ++  gah  (mask [`@`10 ' ' ~])                           ::  newline or ace

Examples

    /> ^-  *  ::  show spaces
                """
                   -
                 -
                  -
                """
    [32 32 32 45 10 32 45 10 32 32 45 0]
    /> ^-  *
                """

                """
    [32 32 32 10 32 10 32 32 0]
    /> ^-  (list ,@)
                %-  scan  :_  (star gah)
                """

                """
    ~[32 32 32 10 32 10 32 32]

++gap

Plural whitespace

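Parse plural whitespace, the separator used in tall runes: an end of line (gaq) followed by any further comments and whitespace. Produces null.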

Source

    ++  gap  (cold ~ ;~(plug gaq (star ;~(pose vul gah))))  ::  plural whitespace
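
Examples

A few hedged illustrations, worked out from the source above rather than copied from a recorded session. gap needs at least an end of line (gaq) before the optional run of comments and whitespace, so two spaces succeed while a lone space does not. The tape escape \0a stands for a newline.

```hoon
> (scan "  " gap)
~
> (scan "  ::  a comment\0a  " gap)
~
::  a single space is not plural whitespace, so this crashes
> (scan " " gap)
```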

++gaq

End of line

Parse an end of line: a newline, two whitespace characters, a whitespace character followed by a comment, or a comment.

Source

    ++  gaq  ;~  pose                                       ::  end of line
                 (just `@`10)
                 ;~(plug gah ;~(pose gah vul))
                 vul
             ==
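
Examples

A minimal sketch of the alternatives in the source above, derived by reading the rule rather than from a recorded session. Wrapping gaq in cold is an assumption made only to keep the printed result simple; gaq itself produces the characters it matched.

```hoon
::  a newline on its own
> (scan "\0a" (cold ~ gaq))
~
::  two whitespace characters
> (scan "  " (cold ~ gaq))
~
::  one whitespace character followed by a comment
> (scan " ::  note\0a" (cold ~ gaq))
~
::  a comment on its own
> (scan "::  note\0a" (cold ~ gaq))
~
::  one space alone matches none of the alternatives, so this crashes
> (scan " " (cold ~ gaq))
```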

++gaw

Classic whitespace

Parse classic (Terran) whitespace: any run of spaces, newlines, and comments, including an empty one. Produces null.

Source

    ++  gaw  (cold ~ (star ;~(pose vul gah)))               ::  classic white
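
Examples

The following sketch, derived from the source above rather than from a recorded session, shows that gaw accepts input that gap would reject: an empty tape and a single space both parse, because the rule is a plain star with no required end of line.

```hoon
> (scan "" gaw)
~
> (scan " " gaw)
~
> (scan " \0a::  note\0a " gaw)
~
```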

++gay

Optional gap

Parse a gap if one is present; otherwise produce null without consuming any input.

Source

    ++  gay  ;~(pose gap (easy ~))                          ::
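
Examples

A hedged sketch of how gay behaves alone and inside dog, worked out from the sources in this section rather than from a recorded session. The composed rule in the last two lines is a hypothetical illustration, not part of the library.

```hoon
> (scan "" gay)
~
> (scan "  " gay)
~
::  gap rejects a lone space and (easy ~) consumes nothing,
::  so scan crashes on the leftover input
> (scan " " gay)
::  dog is dot followed by gay, so the gap after the dot is optional
> (scan "1.2" ;~(plug dit ;~(pfix dog dit)))
[1 2]
> (scan "1.  2" ;~(plug dit ;~(pfix dog dit)))
[1 2]
```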

++gon

Long numbers

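Parse the line break used to wrap long numbers in the shell: a bas, \, an optional gap, and a fas, /. If no such break is present, produce null without consuming any input.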

Source

    ++  gon  ;~(pose ;~(plug bas gay fas) (easy ~))         ::  long numbers \ /

Examples

        > (scan "\\/" gon)
        [~~~5c. ~ ~~~2f.]

        > (gon [[1 1] "\\/"])
        [p=[p=1 q=3] q=[~ u=[p=[~~~5c. ~ ~~~2f.] q=[p=[p=1 q=3] q=""]]]]
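
Since gon is the separator passed to most in bin, dem, and hex, its effect is easiest to see inside one of those rules. The sketch below is derived from the sources in this section rather than from a recorded session; in a tape, \\ is an escaped bas and \0a is a newline.

```hoon
::  a bas immediately followed by a fas is skipped between digits
> (scan "123\\/456" dem)
123.456
::  a gap (here two spaces) may sit between the bas and the fas
> (scan "123\\  /456" dem)
123.456
::  so may a newline followed by indentation, which is how a number wraps
> (scan "123\\\0a  /456" dem)
123.456
```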

++gul

Axis syntax < or >

Parse the gal and gar axis syntax.

Source

    ++  gul  ;~(pose (cold 2 gal) (cold 3 gar))             ::  axis syntax < >

Examples

        > (scan "<" gul)
        2

        > (scan ">" gul)
        3

++hex

Hex to atom

Parse any hexadecimal number to an atom.

Source

    ++  hex  (bass 16 (most gon hit))                       ::  hex to atom

Examples

        > (scan "a" hex)
        10
        > (scan "A" hex)
        10
        > (scan "2A" hex)
        42
        > (scan "1ee7" hex)
        7.911
        > (scan "1EE7" hex)
        7.911
        > (scan "1EE7F7" hex)
        2.025.463
        > `@ux`(scan "1EE7F7" hex)
        0x1e.e7f7

++hig

Uppercase

Parse a single uppercase letter.

Source

    ++  hig  (shim 'A' 'Z')                                 ::  uppercase

Examples

        > (scan "G" hig)
        ~~~47.

        > `cord`(scan "G" hig)
        'G'

        > (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

        > (hig [[1 1] "G"])
        [p=[p=1 q=2] q=[~ [p=~~~47. q=[p=[p=1 q=2] q=""]]]]

++hit

Hex digits

Parse a single hexadecimal digit.

Source

    ++  hit  ;~  pose                                       ::  hex digits
               dit
               (cook |=(a=char (sub a 87)) (shim 'a' 'f'))
               (cook |=(a=char (sub a 55)) (shim 'A' 'F'))
             ==

Examples

        > (scan "a" hit)
        10
        > (scan "A" hit)
        10
        > (hit [[1 1] "a"])
        [p=[p=1 q=2] q=[~ [p=10 q=[p=[p=1 q=2] q=""]]]]
        > (scan "2A" (star hit))
        ~[2 10]
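
The constants 87 and 55 in the source are the ASCII offsets that map 'a' (97) and 'A' (65) to 10. A few further cases, worked out from the source rather than from a recorded session:

```hoon
> (scan "9" hit)
9
> (scan "f" hit)
15
> (scan "F" hit)
15
::  'g' is outside all three ranges, so this crashes
> (scan "g" hit)
```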

++iny

Indentation block

Apply ++rule to indented block starting at current column number, omitting the leading whitespace.

Accepts

sef is a ++rule

Produces

A ++rule.

Source

    ++  inde  |*  sef/rule                                  :: indentation block
      |=  nail  ^+  (sef)
      =+  [har tap]=[p q]:+<
      =+  lev=(fil 3 (dec q.har) ' ')
      =+  eol=(just `@t`10)
      =+  =-  roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
          ;~(simu ;~(plug eol eol) eol)
      ?~  q.roq  roq
      =+  vex=(sef har(q 1) p.u.q.roq)
      =+  fur=p.vex(q (add (dec q.har) q.p.vex))
      ?~  q.vex  vex(p fur)
      =-  vex(p fur, u.q -)
      :+  &3.vex
        &4.vex(q.p (add (dec q.har) q.p.&4.vex))
      =+  res=|4.vex
      |-  ?~  res  |4.roq
      ?.  =(10 -.res)  [-.res $(res +.res)]
      (welp [`@t`10 (trip lev)] $(res +.res))
    ::

Examples

    > (scan "abc" (iny (star ;~(pose prn (just `@`10)))))
    "abc"
    > (scan "abc" (star ;~(pose prn (just `@`10))))
    "abc"
    > (scan "  abc\0ade" ;~(pfix ace ace (star ;~(pose prn (just `@`10)))))
    "abc
        de"
    > (scan "  abc\0ade" ;~(pfix ace ace (iny (star ;~(pose prn (just `@`10))))))
    ! {1 6}
    ! exit
    > (scan "  abc\0a  de" ;~(pfix ace ace (iny (star ;~(pose prn (just `@`10))))))
    "abc
        de"

++low

Lowercase

Parse a single lowercase letter.

Source

    ++  low  (shim 'a' 'z')                                 ::  lowercase

Examples

        > (scan "g" low)
        ~~g
        > `cord`(scan "g" low)
        'g'
        > (scan "abcdefghijklmnopqrstuvwxyz" (star low))
        "abcdefghijklmnopqrstuvwxyz"
        > (low [[1 1] "g"])
        [p=[p=1 q=2] q=[~ [p=~~g q=[p=[p=1 q=2] q=""]]]]

++mes

Hexbyte

Parse a hexbyte.

Source

    ++  mes  %+  cook                                       ::  hexbyte
               |=({a/@ b/@} (add (mul 16 a) b))
             ;~(plug hit hit)

Examples

        > (scan "2A" mes)
        42
        > (mes [[1 1] "2A"])
        [p=[p=1 q=3] q=[~ u=[p=42 q=[p=[p=1 q=3] q=""]]]]
        > (scan "42" mes)
        66

++nix

Letters, numbers, and _

Parse letters, numbers, and cab, _, into a cord atom.

Source

    ++  nix  (boss 256 (star ;~(pose aln cab)))             ::

Examples

    > (scan "as_me" nix)
    q=435.626.668.897

    > `@t`(scan "as_me" nix)
    'as_me'
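
Following the source above, the accepted characters are those matched by aln plus cab. The sketch below is derived from that reading rather than from a recorded session.

```hoon
> `@t`(scan "Foo_42" nix)
'Foo_42'
::  hep is not among the accepted characters, so this crashes
> (scan "as-me" nix)
```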

++nud

Numeric

Parse a single numeric character (a digit), producing the character itself rather than its value.

Source

    ++  nud  (shim '0' '9')                                 ::  numeric

Examples

    > (scan "0" nud)
    ~~0
    > (scan "7" nud)
    ~~7
    > (nud [[1 1] "1"])
    [p=[p=1 q=2] q=[~ [p=~~1 q=[p=[p=1 q=2] q=""]]]]
    > (scan "0123456789" (star nud))
    "0123456789"

++prn

Printable character

Parse any printable character.

Source

    ++  prn  ;~(less (just `@`127) (shim 32 256))

Examples

    > (scan "h" prn)
    ~~h
    > (scan "!" prn)
    ~~~21.
    > (scan "\01" prn)
    ! {1 1}
    ! exit

++qit

Chars in cord

Parse an individual character to its cord atom representation.

Source

    ++  qit  ;~  pose                                       ::  chars in a cord
                 ;~(less bas soq prn)
                 ;~(pfix bas ;~(pose bas soq mes))          ::  escape chars
             ==

Examples

    > (scan "%" qit)
    37
    > `@t`(scan "%" qit)
    '%'
    > (scan "0" qit)
    48
    > (scan "E" qit)
    69
    > (scan "a" qit)
    97
    > (scan "\\0a" qit)
    10
    > `@ux`(scan "\\0a" qit)
    0xa
    > (scan "cord" (star qit))
    ~[99 111 114 100]

++qut

Cord

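Parse a cord. A cord is either delimited by single soqs, in which case a line break of the form \{gap}/ may appear anywhere in the middle, or delimited by triple soqs (triple single quotes), in which case its contents must be in an indented block.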

Source

    ++  qut  ;~  pose                                       ::  cord
                 ;~  less  soqs
                   (ifix [soq soq] (boss 256 (more gon qit)))
                 ==
                 %-  inde  %+  ifix
                   :-  ;~  plug  soqs
                         ;~(pose ;~(plug (plus ace) vul) (just '\0a'))
                       ==
                   ;~(plug (just '\0a') soqs)
                 (boss 256 (star qat))
             ==
    ::

Examples

    > (scan "'cord'" qut)
    q=1.685.221.219
    > 'cord'
    'cord'
    > `@ud`'cord'
    1.685.221.219
    > '''
                Heredoc isn't prohibited from containing quotes
                '''
    'Heredoc isn't prohibited from containing quotes'

++soz

Delimiting '''

Parse a triple-single quote, used for multiline strings.

Source

    ++  soz  ;~(plug soq soq soq)                          ::  delimiting '''

Examples

    > (scan "'''" soz)
    [~~~27. ~~~27. ~~~27.]
    > (rash '"""' soz)
    ! {1 1}
    ! exit

++sym

Term

Parse a term: a lowercase letter followed by any number of lowercase letters, numbers, or hep, -.

Source

    ++  sym
      %+  cook
        |=(a=tape (rap 3 ^-((list ,@) a)))
      ;~(plug low (star ;~(pose nud low hep)))
    ::

Examples

    > (scan "sam-2" sym)
    215.510.507.891
    > `@t`(scan "sam-2" sym)
    'sam-2'
    > (scan "sym" sym)
    7.174.515
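
The restrictions in the source (a leading low, then only nud, low, or hep) are easiest to see from inputs that fail. These cases are worked out from the source rather than taken from a recorded session.

```hoon
> `@t`(scan "a-b-c" sym)
'a-b-c'
::  the first character must be a lowercase letter, so both of these crash
> (scan "Sam" sym)
> (scan "2go" sym)
::  uppercase letters are not accepted after the first character either
> (scan "sAm" sym)
```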

++ven

+>- axis syntax

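Parse axis syntax: a sequence that starts with - or + (bet) and then alternates between < or > (gul) and - or +, producing the composed axis.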

Source

    ++  ven  ;~  (comp |=({a/@ b/@} (peg a b)))             ::  +>- axis syntax
               bet
               =+  hom=`?`|
               |=  tub/nail
               ^-  (like axis)
               =+  vex=?:(hom (bet tub) (gul tub))
               ?~  q.vex
                 [p.tub [~ 1 tub]]
               =+  wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
               ?>  ?=(^ q.wag)
               [p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
             ==

Examples

    ~zod/arvo=/hoon/hoon> (scan "->+" ven)
    11

    ~zod/arvo=/hoon/hoon> `@ub`(scan "->+" ven)
    0b1011

    ~zod/arvo=/hoon/hoon> (peg (scan "->" ven) (scan "+" ven))
    11

    ~zod/arvo=/hoon/hoon> ->+:[[1 2 [3 4]] 5]
    [3 4]
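
The two-character cases make the composition explicit: each symbol contributes 2 (head) or 3 (tail), and peg joins them. The values below are worked out by hand from the source rather than from a recorded session.

```hoon
> (scan "-" ven)
2
> (scan "-<" ven)
4
> (scan "->" ven)
5
> (scan "+<" ven)
6
> (scan "+>" ven)
7
::  - and + must alternate with < and >, so a repeated - crashes
> (scan "--" ven)
```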

++vit

Base64 digit

Parse a single base64 digit: A-Z map to 0-25, a-z to 26-51, 0-9 to 52-61, - to 62, and + to 63.

Source

    ++  vit                                                 ::  base64 digit
      ;~  pose
        (cook |=(a/@ (sub a 65)) (shim 'A' 'Z'))
        (cook |=(a/@ (sub a 71)) (shim 'a' 'z'))
        (cook |=(a/@ (add a 4)) (shim '0' '9'))
        (cold 62 (just '-'))
        (cold 63 (just '+'))
      ==

Examples

    ~zod/arvo=/hoon/hoon> (scan "C" vit)
    2
    ~zod/arvo=/hoon/hoon> (scan "c" vit)
    28
    ~zod/arvo=/hoon/hoon> (scan "2" vit)
    54
    ~zod/arvo=/hoon/hoon> (scan "-" vit)
    62
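
The boundary values of each range follow directly from the offsets in the source ('A' is 65, 'z' is 122, '0' is 48, '9' is 57). This is a sketch derived from that arithmetic rather than from a recorded session.

```hoon
> (scan "A" vit)
0
> (scan "z" vit)
51
> (scan "0" vit)
52
> (scan "9" vit)
61
> (scan "+" vit)
63
```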

++vul

Comments to null

Parse a comment and produce null. Note that a comment must end with a newline character.

Source

    ++  vul  %+  cold   ~                                   ::  comments
             ;~  plug  col  col
               (star prn)
               (just `@`10)
             ==

Examples

    > (scan "::this is a comment \0a" vul)
    ~
    > (scan "::this is a comment " vul)
    ! {1 21}
    ! exit