4i: Parsing (Useful Idioms)

    ++alf

    Alphabetic characters

    Parse alphabetic characters, both upper and lowercase.

    Source

        ++  alf  ;~(pose low hig)                               ::  alphabetic
    

    Examples

            > (scan "a" alf)
            ~~a
    
            > (scan "A" alf)
            ~~~41.
    
            > (scan "AaBbCc" (star alf))
            "AaBbCc"
    

    ++aln

    Alphanumeric characters

    Parse alphanumeric characters - both alphabetic characters and numbers.

    Source

        ++  aln  ;~(pose low hig nud)                           ::  alphanumeric
    

    Examples

            > (scan "0" aln)
            ~~0
    
            > (scan "alf42" (star aln))
            "alf42"
    
            > (scan "0123456789abcdef" (star aln))
            "0123456789abcdef"
    

    ++alp

    Alphanumeric and -

    Parse alphanumeric strings and hep, "-".

    Source

        ++  alp  ;~(pose low hig nud hep)                       ::  alphanumeric and -
    

    Examples

            > (scan "7" alp)
            ~~7
    
            > (scan "s" alp)
            ~~s
    
            > (scan "123abc-" (star alp))
            "123abc-"
    

    ++bet

    Axis syntax -, +

    Parse the hep and lus axis syntax.

    Source

        ++  bet  ;~(pose (cold 2 hep) (cold 3 lus))             ::  axis syntax - +
    

    Examples

            > (scan "-" bet)
            2
    
            > (scan "+" bet)
            3
    

    ++bin

    Binary to atom

    Parse a tape of binary (0s and 1s) and produce its atomic representation.

    Source

        ++  bin  (bass 2 (most gon but))                        ::  binary to atom
    

    Examples

            > (scan "0000" bin)
            0
    
            > (scan "0001" bin)
            1
    
            > (scan "0010" bin)
            2
    
            > (scan "100000001111" bin)
            2.063
    

    ++but

    Binary digit

    Parse a single binary digit.

    Source

        ++  but  (cook |=(a=@ (sub a '0')) (shim '0' '1'))      ::  binary digit
    

    Examples

            > (scan "0" but)
            0
            > (scan "1" but)
            1
            > (scan "01" but)
            ! {1 2}
            ! 'syntax-error'
            ! exit
            > (scan "01" (star but))
            ~[0 1]
    

    ++cit

    Octal digit

    Parse a single octal digit.

    Source

        ++  cit  (cook |=(a=@ (sub a '0')) (shim '0' '7'))      ::  octal digit
    

    Examples

            > (scan "1" cit)
            1
            > (scan "7" cit)
            7
            > (scan "8" cit)
            ! {1 1}
            ! 'syntax-error'
            ! exit
            > (scan "60" (star cit))
            ~[6 0]
    

    ++dem

    Decimal to atom

    Parse a decimal number to an atom.

    Source

        ++  dem  (bass 10 (most gon dit))                       ::  decimal to atom
    

    Examples

            > (scan "7" dem)
            7
            > (scan "42" dem)
            42
            > (scan "150000000" dem)
            150.000.000
            > (scan "12456" dem)
            12.456
    

    ++dit

    Decimal digit

    Parse a single decimal digit.

    Source

        ++  dit  (cook |=(a=@ (sub a '0')) (shim '0' '9'))      ::  decimal digit
    

    Examples

            > (scan "7" dit)
            7
            > (scan "42" (star dit))
            ~[4 2]
            > (scan "26000" (star dit))
            ~[2 6 0 0 0]
    

    ++dog

    . optional gap

    Dot followed by an optional gap, used with numbers.

    Source

        ++  dog  ;~(plug dot gay)                               ::
    

    Examples

        > 1.234.703
                    703
        1.234.703
    
        > (scan "a.        " ;~(pfix alf dog))
        [~~~. ~]
    

    ++doh

    @p separator

    Phonetic base phrase separator

    Source

        ++  doh  ;~(plug ;~(plug hep hep) gay)                  ::
    

    Examples

        /> ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
        ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
        /> ~nopfel-botduc-nilnev-dolfyn--
                    haspub-natlun-lodmur-holtyd
        ~nopfel-botduc-nilnev-dolfyn--haspub-natlun-lodmur-holtyd
        > (scan "--" doh)
        [[~~- ~~-] ~]
        > (scan "--      " doh)
        [[~~- ~~-] ~]
    

    ++dun

    -- to ~

    Parse phep, --, to null, ~.

    Source

        ++  dun  (cold ~ ;~(plug hep hep))                      ::  -- (phep) to ~
    

    Examples

        > (scan "--" dun)
        ~
        > (dun [[1 1] "--"])
        [p=[p=1 q=3] q=[~ u=[p=~ q=[p=[p=1 q=3] q=""]]]]
    

    ++duz

    == to ~

    Parse stet, ==, to null ~.

    Source

        ++  duz  (cold ~ ;~(plug tis tis))                      ::  == (stet) to ~
    

    Examples

        > (scan "==" duz)
        ~
        > (duz [[1 1] "== |=..."])
        [p=[p=1 q=3] q=[~ u=[p=~ q=[p=[p=1 q=3] q=" |=..."]]]]
    

    ++gah

    Newline or ' '

    Whitespace component, either newline or space.

    Source

        ++  gah  (mask [`@`10 ' ' ~])                           ::  newline or ace
    

    Examples

        /> ^-  *  ::  show spaces
                    """
                       -
                     -
                      -
                    """
        [32 32 32 45 10 32 45 10 32 32 45 0]
        /> ^-  *
                    """
    
                    """
        [32 32 32 10 32 10 32 32 0]
        /> ^-  (list ,@)
                    %-  scan  :_  (star gah)
                    """
    
                    """
        ~[32 32 32 10 32 10 32 32]
    

    ++gap

    Plural whitespace

    Separates tall runes

    Source

        ++  gap  (cold ~ ;~(plug gaq (star ;~(pose vul gah))))  ::  plural whitespace
    

    ++gaq

    End of line

    Two spaces, a newline, or comment.

    Source

        ++  gaq  ;~  pose                                       ::  end of line
                     (just `@`10)
                     ;~(plug gah ;~(pose gah vul))
                     vul
                 ==
    

    ++gaw

    Classic whitespace

    Terran whitespace.

    Source

        ++  gaw  (cold ~ (star ;~(pose vul gah)))               ::  classic white
    

    ++gay

    Optional gap

    Optional gap.

    Source

        ++  gay  ;~(pose gap (easy ~))                          ::
    

    ++gon

    Long numbers

    Parse long numbers - Numbers which wrap around the shell with the line

    Source

         ++  gon  ;~(pose ;~(plug bas gay fas) (easy ~))         ::  long numbers \ /
    

    Examples

            > (scan "\\/" gon)
            [~~~5c. ~ ~~~2f.]
    
            > (gon [[1 1] "\\/"])
            [p=[p=1 q=3] q=[~ u=[p=[~~~5c. ~ ~~~2f.] q=[p=[p=1 q=3] q=""]]]]
    

    ++gul

    Axis syntax < or >

    Parse the axis gal and gar axis syntax.

    Source

        ++  gul  ;~(pose (cold 2 gal) (cold 3 gar))             ::  axis syntax < >
    

    Examples

            > (scan "<" gul)
            2
    
            > (scan ">" gul)
            3
    

    ++hex

    Hex to atom

    Parse any hexadecimal number to an atom.

    Source

        ++  hex  (bass 16 (most gon hit))                       ::  hex to atom
    

    Examples

            > (scan "a" hex)
            10
            > (scan "A" hex)
            10
            > (scan "2A" hex)
            42
            > (scan "1ee7" hex)
            7.911
            > (scan "1EE7" hex)
            7.911
            > (scan "1EE7F7" hex)
            2.025.463
            > `@ux`(scan "1EE7F7" hex)
            0x1e.e7f7
    

    ++hig

    Uppercase

    Parse a single uppercase letter.

    Source

        ++  hig  (shim 'A' 'Z')                                 ::  uppercase
    

    Examples

            > (scan "G" hig)
            ~~~47.
    
            > `cord`(scan "G" hig)
            'G'
    
            > (scan "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (star hig))
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    
            > (hig [[1 1] "G"])
            [p=[p=1 q=2] q=[~ [p=~~~47. q=[p=[p=1 q=2] q=""]]]]
    

    ++hit

    Hex digits

    Parse a single hexadecimal digit.

    Source

        ++  hit  ;~  pose                                       ::  hex digits
                   dit
                   (cook |=(a=char (sub a 87)) (shim 'a' 'f'))
                   (cook |=(a=char (sub a 55)) (shim 'A' 'F'))
                 ==
    

    Examples

            > (scan "a" hit)
            10
            > (scan "A" hit)
            10
            > (hit [[1 1] "a"])
            [p=[p=1 q=2] q=[~ [p=10 q=[p=[p=1 q=2] q=""]]]]
            > (scan "2A" (star hit))
            ~[2 10]
    

    ++iny

    Indentation block

    Apply ++rule to indented block starting at current column number, omitting the leading whitespace.

    Accepts

    sef is a ++rule

    Produces

    A ++rule.

    Source

        ++  inde  |*  sef/rule                                  :: indentation block
          |=  nail  ^+  (sef)
          =+  [har tap]=[p q]:+<
          =+  lev=(fil 3 (dec q.har) ' ')
          =+  eol=(just `@t`10)
          =+  =-  roq=((star ;~(pose prn ;~(sfix eol (jest lev)) -)) har tap)
              ;~(simu ;~(plug eol eol) eol)
          ?~  q.roq  roq
          =+  vex=(sef har(q 1) p.u.q.roq)
          =+  fur=p.vex(q (add (dec q.har) q.p.vex))
          ?~  q.vex  vex(p fur)
          =-  vex(p fur, u.q -)
          :+  &3.vex
            &4.vex(q.p (add (dec q.har) q.p.&4.vex))
          =+  res=|4.vex
          |-  ?~  res  |4.roq
          ?.  =(10 -.res)  [-.res $(res +.res)]
          (welp [`@t`10 (trip lev)] $(res +.res))
        ::
    

    Examples

        > (scan "abc" (iny (star ;~(pose prn (just `@`10)))))
        "abc"
        > (scan "abc" (star ;~(pose prn (just `@`10))))
        "abc"
        > (scan "  abc\0ade" ;~(pfix ace ace (star ;~(pose prn (just `@`10)))))
        "abc
            de"
        > (scan "  abc\0ade" ;~(pfix ace ace (iny (star ;~(pose prn (just `@`10))))))
        ! {1 6}
        ! exit
        > (scan "  abc\0a  de" ;~(pfix ace ace (iny (star ;~(pose prn (just `@`10))))))
        "abc
            de"
    

    ++low

    Lowercase

    Parse a single lowercase letter.

    Source

        ++  low  (shim 'a' 'z')                                 ::  lowercase
    

    Examples

            > (scan "g" low)
            ~~g
            > `cord`(scan "g" low)
            'g'
            > (scan "abcdefghijklmnopqrstuvwxyz" (star low))
            "abcdefghijklmnopqrstuvwxyz"
            > (low [[1 1] "g"])
            [p=[p=1 q=2] q=[~ [p=~~g q=[p=[p=1 q=2] q=""]]]]
    

    ++mes

    Hexbyte

    Parse a hexbyte.

    Source

        ++  mes  %+  cook                                       ::  hexbyte
                   |=({a/@ b/@} (add (mul 16 a) b))
                 ;~(plug hit hit)
    

    Examples

            > (scan "2A" mes)
            42
            > (mes [[1 1] "2A"])
            [p=[p=1 q=3] q=[~ u=[p=42 q=[p=[p=1 q=3] q=""]]]]
            > (scan "42" mes)
            66
    

    ++nix

    Letters and -

    Parse Letters and -.

    Source

        ++  nix  (boss 256 (star ;~(pose aln cab)))             ::
    

    Examples

        > (scan "as_me" nix)
        q=435.626.668.897
    
        > `@t`(scan "as_me" nix)
        'as_me'
    

    ++nud

    Numeric

    Parse a numeric character - A number.

    Source

        ++  nud  (shim '0' '9')                                 ::  numeric
    

    Examples

        > (scan "0" nud)
        ~~0
        > (scan "7" nud)
        ~~7
        > (nud [[1 1] "1"])
        [p=[p=1 q=2] q=[~ [p=~~1 q=[p=[p=1 q=2] q=""]]]]
        > (scan "0123456789" (star nud))
        "0123456789"
    

    ++prn

    Printable character

    Parse any printable character.

    Source

        ++  prn  ;~(less (just `@`127) (shim 32 256))
    

    Examples

        > (scan "h" prn)
        ~~h
        > (scan "!" prn)
        ~~~21.
        > (scan "\01" prn)
        ! {1 1}
        ! exit
    

    ++qit

    Chars in cord

    Parse an individual character to its cord atom representation.

    Source

        ++  qit  ;~  pose                                       ::  chars in a cord
                     ;~(less bas soq prn)
                     ;~(pfix bas ;~(pose bas soq mes))          ::  escape chars
                 ==
    

    Examples

        > (scan "%" qit)
        37
        > `@t`(scan "%" qit)
        '%'
        > (scan "0" qit)
        48
        > (scan "E" qit)
        69
        > (scan "a" qit)
        97
        > (scan "\\0a" qit)
        10
        > `@ux`(scan "\\0a" qit)
        0xa
        > (scan "cord" (star qit))
        ~[99 111 114 100]
    

    ++qit

    Chars in cord

    Parse an individual character to its cord atom representation.

    Source

        ++  qit  ;~  pose                                       ::  chars in a cord
                     ;~(less bas soq prn)
                     ;~(pfix bas ;~(pose bas soq mes))          ::  escape chars
                 ==
    

    Examples

        > (scan "%" qit)
        37
        > `@t`(scan "%" qit)
        '%'
        > (scan "0" qit)
        48
        > (scan "E" qit)
        69
        > (scan "a" qit)
        97
        > (scan "\\0a" qit)
        10
        > `@ux`(scan "\\0a" qit)
        0xa
        > (scan "cord" (star qit))
        ~[99 111 114 100]
    

    ++qut

    Cord

    Parse single-soq cord with {gap}/ anywhere in the middle, or triple-single quote (aka triple-soq) cord, between which must be in an indented block.

    Source

         ++  qut  ;~  pose                                       ::  cord
                     ;~  less  soqs
                       (ifix [soq soq] (boss 256 (more gon qit)))
                     ==
                     %-  inde  %+  ifix
                       :-  ;~  plug  soqs
                             ;~(pose ;~(plug (plus ace) vul) (just '\0a'))
                           ==
                       ;~(plug (just '\0a') soqs)
                     (boss 256 (star qat))
                 ==
        ::
    

    Examples

        > (scan "'cord'" qut)
        q=1.685.221.219
        > 'cord'
        'cord'
        > `@ud`'cord'
        1.685.221.219
        > '''
                    Heredoc isn't prohibited from containing quotes
                    '''
        'Heredoc isn't prohibited from containing quotes'
    

    ++soz

    Delimiting '''

    Parse a triple-single quote, used for multiline strings.

    Source

        ++  soz  ;~(plug soq soq soq)                          ::  delimiting '''
    

    Examples

        > (scan "'''" soz)
        [~~~27. ~~~27. ~~~27.]
        > (rash '"""' soz)
        ! {1 1}
        ! exit
    

    ++sym

    Term

    A term: a letter(lowercase), followed by letters, numbers, or -.

    Source

        ++  sym
          %+  cook
            |=(a=tape (rap 3 ^-((list ,@) a)))
          ;~(plug low (star ;~(pose nud low hep)))
        ::
    

    Examples

        > (scan "sam-2" sym)
        215.510.507.891
        > `@t`(scan "sam-2" sym)
        'sam-2'
        > (scan "sym" sym)
        7.174.515
    

    ++ven

    +>- axis syntax

    Axis syntax parser

    Source

        ++  ven  ;~  (comp |=({a/@ b/@} (peg a b)))             ::  +>- axis syntax
                   bet
                   =+  hom=`?`|
                   |=  tub/nail
                   ^-  (like axis)
                   =+  vex=?:(hom (bet tub) (gul tub))
                   ?~  q.vex
                     [p.tub [~ 1 tub]]
                   =+  wag=$(p.tub p.vex, hom !hom, tub q.u.q.vex)
                   ?>  ?=(^ q.wag)
                   [p.wag [~ (peg p.u.q.vex p.u.q.wag) q.u.q.wag]]
                 ==
    

    Examples

        ~zod/arvo=/hoon/hoon> (scan "->+" ven)
        11
    
        ~zod/arvo=/hoon/hoon> `@ub`(scan "->+" ven)
        0b1011
    
        ~zod/arvo=/hoon/hoon> (peg (scan "->" ven) (scan "+" ven))
        11
    
        ~zod/arvo=/hoon/hoon> ->+:[[1 2 [3 4]] 5]
        [3 4]
    

    ++vit

    Base64 digit

    Parse a standard base64 digit.

    Source

        ++  vit                                                 ::  base64 digit
          ;~  pose
            (cook |=(a/@ (sub a 65)) (shim 'A' 'Z'))
            (cook |=(a/@ (sub a 71)) (shim 'a' 'z'))
            (cook |=(a/@ (add a 4)) (shim '0' '9'))
            (cold 62 (just '-'))
            (cold 63 (just '+'))
          ==
    

    Examples

        ~zod/arvo=/hoon/hoon> (scan "C" vit)
        2
        ~zod/arvo=/hoon/hoon> (scan "c" vit)
        28
        ~zod/arvo=/hoon/hoon> (scan "2" vit)
        54
        ~zod/arvo=/hoon/hoon> (scan "-" vit)
        62
    

    ++vul

    Comments to null

    Source

        ++  vul  %+  cold   ~                                   ::  comments
                 ;~  plug  col  col
                   (star prn)
                   (just `@`10)
                 ==
    

    Parse comments and produce a null. Note that a comment must be ended with a newline character.

    Examples

        > (scan "::this is a comment \0a" vul)
        ~
        > (scan "::this is a comment " vul)
        ! {1 21}
        ! exit