Gall Apps
This document describes Gall, the Arvo userspace vane. It is split into two main sections: Agents and Threads.
Agents
Urbit code lives in the following basic categories:
- Runtime (Nock interpreter, persistence engine, IO drivers, jets)
- Kernel vanes (managed by Arvo)
- Userspace agents (managed by Gall, permanent state)
- Userspace imps (managed by Spider, transient state)
This lays out the framework for the third category: Agents.
An agent is a piece of software that is primarily focused on maintaining and distributing a piece of state with a defined structure. It exposes an interface that lets programs read, subscribe to, and manipulate the state. Every event happens in an atomic transaction, so the state is never inconsistent. Since the state is permanent, when the agent is upgraded with a change to the structure of the state, the developer provides a migration function from the old state type to the new state type.
It's not too far off to think of an agent as simply a database with developer-defined logic. But an agent is significantly less constrained than a database. Databases are usually tightly constrained in one or more ways because they need to provide certain guarantees (like atomicity) or optimizations (like indexes). Urbit is a single-level store, so atomicity comes for free. Many applications don't use databases because they need relational indices; rather, they use them for their guarantees around persistence. Some do need the indices, though, and it's not hard to imagine an agent which provides a SQL-like interface.
On the other hand, an agent is also a lot like what many systems call a "service". An agent is permanent and addressable -- a running program can talk to an agent just by naming it. An agent can perform IO, unlike most databases. This is a critical part of an agent: it performs IO along the same transaction boundaries as changes to its state, so if an effect happens, you know that the associated state change has happened. You should be careful, though, to avoid creating implicit non-atomicity -- eg if it comes into an inconsistent state pending a response to an IO action.
But the best way to think about an agent is as a state machine. Like a state machine, any input could happen at any time, and it must react coherently to that input. Ouput (effects) and the next state of the machine are both pure functions of the previous state and the input event. Of course, it's important to ensure there isn't an order of events that could cause your agent to enter an inconsistent state.
We often think of state machines as finite state machines, but of course in this instance, the state is not actually finite (though it should be definable in a regular recursive data type).
Specification
An agent is defined as a core with a set of arms to handle various
events. These handlers usually produce a list of effects and the next
state of the agent. The interface definition can be found in
sys/zuse.hoon
, which at the time of writing is:
++ agent =< form |% +$ step (quip card form) +$ card (wind note gift) +$ note $% [%arvo =note-arvo] [%agent [=ship name=term] =task] == +$ task $% [%watch =path] [%watch-as =mark =path] [%leave ~] [%poke =cage] [%poke-as =mark =cage] == +$ gift $% [%fact path=(unit path) =cage] [%kick path=(unit path) ship=(unit ship)] [%watch-ack p=(unit tang)] [%poke-ack p=(unit tang)] == +$ sign $% [%poke-ack p=(unit tang)] [%watch-ack p=(unit tang)] [%fact =cage] [%kick ~] == ++ form $_ ^| |_ bowl ++ on-init *(quip card _^|(..on-init)) :: ++ on-save *vase :: ++ on-load |~ old-state=vase *(quip card _^|(..on-init)) :: ++ on-poke |~ [mark vase] *(quip card _^|(..on-init)) :: ++ on-watch |~ path *(quip card _^|(..on-init)) :: ++ on-leave |~ path *(quip card _^|(..on-init)) :: ++ on-peek |~ path *(unit (unit cage)) :: ++ on-agent |~ [wire sign] *(quip card _^|(..on-init)) :: ++ on-arvo |~ [wire sign-arvo] *(quip card _^|(..on-init)) :: ++ on-fail |~ [term tang] *(quip card _^|(..on-init)) --
Here's a skeleton example of an implementation:
^- agent:gall =| state=@ |_ bowl:gall +* this . ++ on-init `this :: ++ on-save !>(state) :: ++ on-load |= =old-state=vase `this(state !<(@ old-state-vase)) :: ++ on-poke |= [=mark =vase] ~& > state=state ~& got-poked-with-data=mark =. state +(state) `this :: ++ on-watch |= path `this :: ++ on-leave |= path `this :: ++ on-peek |= path *(unit (unit cage)) :: ++ on-agent |= [wire sign:agent:gall] `this :: ++ on-arvo |= [wire sign-arvo] `this :: ++ on-fail |= [term tang] `this --
We also supply a default-agent
library, which is useful if you don't
want to handle some of the arms. Most of the above could also be
implemented as:
/+ default-agent ^- agent:gall =| state=@ |_ =bowl:gall +* this . default ~(. (default-agent this %|) bowl) :: ++ on-init on-init:default ++ on-save on-save:default ++ on-load on-load:default ++ on-poke |= [=mark =vase] ~& > state=state ~& got-poked-with-data=mark =. state +(state) `this :: ++ on-watch on-watch:default ++ on-leave on-leave:default ++ on-peek on-peek:default ++ on-agent on-agent:default ++ on-arvo on-arvo:default ++ on-fail on-fail:default --
So, an agent is a core with 10 arms. The handlers correspond to different sorts of input. We'll discuss each of these in detail, but first a few general concepts.
Basic concepts
Interacting with an agent
There are two basic ways programs can interact with an agent: you may
subscribe to data as described above, or you may "poke" an agent to send
it a command. Suppose there is an agent for a calendar service; then,
pokes would likely include %create-event
, %modify-event
,
%delete-event
, and %change-time-zone
(along with assocated data).
Subscription paths could include /next-event
and /all-events
.
Agents generally conform to CQRS -- pokes may change state, but subscriptions generally shouldn't. In practice, there may be state changes required to properly initialize the subscription, but these shouldn't change the essential state of the agent.
Cards
Most of the handlers produce a (quip card agent:gall)
, which is just
[(list card) agent:gall]
. The first allows us to produce effects, and
the second allows us to maintain state.
A card is one of two things: a note
or a gift
. You %pass
note
s
and you %give
gift
s. When you pass a note
, you expect zero or more
responses; when you give a gift
, you will not get a response. Since you
may get a response to a note
, you tag it with a wire
so that you can
identify the response.
Thus, we say you "pass a note
along a wire
", or you "give a gift
".
These phrases correspond to producing a card that looks like:
[%pass /my/wire a-note]
or
[%give a-gift]
When you give a gift
, that's the end of the story, but when you pass a
note
, you may get a response. If you do, then the response will come
tagged with the wire
that you used to pass the note
. It's generally
good practice to make your wire
s fairly unique within your agent, since
otherwise you may not be able to distinguish the responses.
Notes
An agent may pass note
s to either Arvo or another agent. If the note
is
to another agent, then it should usually be one of these:
[%pass /my/wire %agent [our.bowl agent-name] %watch /a/path] [%pass /my/wire %agent [our.bowl agent-name] %leave ~] [%pass /my/wire %agent [our.bowl agent-name] %poke %foo-mark !>(poke-data)]
Note that to unsubscribe to a path
, you must send the unsubscription on
the same wire
that sent for the original subscription. Don't subscribe
to separate path
s along the same wire
, because then you can't properly
distinguish them for cancelling (besides not being able to distinguish
the subscription updates). In other words, besides letting you
distinguish your card
s, the wire
also identifies requests for the system.
If the note
is not to another agent but to Arvo itself, then it is a
request to one of the vanes, which are kernel modules that provide
various system services, including IO. Here are some examples:
[%pass /my/wire %arvo %b %wait (add now.hid ~s10)]
This is a request to the %b
vane (Behn, the timer vane) to set a timer
for 10 seconds in the future. After 10 seconds, Behn will respond with
a %wake
card, which you will recieve in +on-arvo
. It will come back
on /my/wire
.
=/ =path /(scot %p our.hid)/home/(scot %da now.hid)/my-file/txt =/ contents=cage [%txt !>(~['text file line 1' 'line 2'])] [%pass /my/wire %arvo %c %info (foal:space:userlib path contents)]
This is a request to the %c
vane (Clay, the filesystem vane) to write
a file to /my-file/txt
in the home desk. You will not receive a
response to this note
.
=/ =path /my-file/txt =/ =moat [da+now.hid da+(add now.hid ~h1) path] [%pass /my/wire %arvo %c %warp our.hid %home ~ %many & moat]
This is a request to the %c
vane to send us a %writ
card
whenever
/my-file/txt
changes in the next hour. There may be many responses to
this note if the file changes many times.
=/ =request:http [%'GET' 'https://example.com' ~ ~] [%pass /my/wire %arvo %i %request request *outbound-config:iris]
This is a request to the %i
vane (Iris, the HTTP client vane) to make
a GET HTTP request to example.com. The response will come as an
%http-response
card.
Gifts
In contrast to the many possible note
s, there are only two types of
gift
s that an agent may give:
[%give %fact (unit path) =cage] [%give %kick (unit path)]
A subscription update is a new piece of subscription content for all
subscribers on a given path
. If no path
is given, then the update is
only given to the program that instigated this request. Typical use of
this mode is in +on-watch
to give an initial update to a new
subscriber to get them up to date.
A subscription close closes the subscription for all subscribers on a
given path
. If no path
is given, then the update is only given to the
program that instigated this request. Typical use of this mode would be
in +on-watch
to produce a single update to a subscription then close
the subscription.
Vases and cages
A vase
is a piece of dynamic data. Structurally, it's a pair of an
explicit reification of a type and an untyped noun. This lets us
represent a value which has a type that isn't known at compile time. A
vase has three operations:
-
!>
is a unary rune that lifts a statically typed value to a dynamically-typedvase
. For example,!>('hi')
gives[#t/@t q=26.984]
. -
!<
is a binary rune that takes a mold and a dynamically typedvase
and reduces it to a statically typed value. If thevase
does not in fact have the type of the mold you give it, then it nest-fails, else it producesvalue
. For example,!<(@t !>('hi'))
produces'hi'
while!<(^ !>('hi'))
causes a nest fail.. -
The compiler takes text and converts it to a
vase
of the compiled code. Agents shouldn't need this directly, but Gall uses this to compile agents tovase
s, on which it calls!<(agent:gall compiled-agent-vase)
.
A cage
is simply a pair of a mark and a vase. A mark is a textual tag
that should correspond to the particular dynamic type in the vase.
In agents, we use vases to represent types which Gall doesn't know about when it was compiled, but which nevertheless need to go outside the agent. In practice, there are three common cases:
-
The data in a "poke" request is of a type that is defined by each agent, so it must by dynamic. This is the input to
++on-poke
as well as the cage in the%poke
case of the%agent
note. -
The data in a subscription update is defined by each agent, so it must be dynamic. This is the cage in
%fact
, both when giving the update and in the input to+on-agent
. -
The state of an agent is also unique to each agent. Most of the time, Gall doesn't interact directly with agent's state, but when upgrading an agent, it must pass the state of the old version of the agent to the new version of the agent. This is the output of
+on-save
and the input to+on-load
.
State
In the definition of an agent, the entire core is wrapped with the ^|
rune, which corresponds to the "iron" variance mode. This means it's
"contravariant" in the input, which allows Gall to write to the new bowl
on every event. It also means that the context of the core is opaque to
Gall, which means you can store state in your context in whatever
structure you want.
This is why most of the handler arms produce not just a list of cards
but also a new agent:gall
. Usually, this will be the "same" agent in
the sense that the code will be the same, but you may have changed the
state that's stored in its (opaque to Gall) context.
The skeleton example above gives an example of keeping an atom as your
state. Define the state in the context of your agent core, then you can
change it with =.
or %=
, as long as you produce the new version of
the agent core.
When you upgrade an agent, you need to extract the state from your
opaque context and produce it to Gall as a dynamically typed vase.
Usually, this will be easy: just call !>
on whatever state you wish
to preserve. This is +on-save
.
When the new agent is about to be started, Gall will call it with
+on-load
with the vase just produced above. This allows you to ingest
your old state and continue right where you left off.
If the type of your state hasn't changed, you can just
=. state !<(state-type old-state-vase)
If it has changed, then it should look more like:
=. state (upgrade-state !<(old-state-type old-state-vase))
It's useful to tag your state with a version number so that the
+upgrade-state
function can take a tagged union of all your old state
types and upgrade from any of them to the current state type. For
example, it may have the following structure:
+$ old-state-types $% [%0 s=state-0] [%1 s=state-1] [%2 s=state-2] == ++ upgrade-state |= old=old-state-types =? old ?=(%0 -.old) (state-0-to-1 s.old) =? old ?=(%1 -.old) (state-1-to-2 s.old) ?> ?=(%2 -.old) s.old
Bowl
The core takes as input a bowl
, which includes useful info like the
current ship, the current time, and a renewable source of entropy. This
information is available to any of the handlers.
Arms
A description of each of the handler arms follows.
+on-init
This arm is called once when the agent is started. It has no input and lets you perform any initial IO.
+on-save
This arm is called immediately before the agent is upgraded. It
packages the permament state of the agent in a vase
for the next version
of the agent. Unlike most handlers, this cannot produce effects.
+on-load
This arm is called immediately after the agent is upgraded. It receives
a vase
of the state of the previously-running version of the agent,
which allows it to cleanly upgrade from the old agent.
+on-poke
This arm is called when the agent is "poked". The input is a cage
, so
it's a pair of a mark and a dynamic vase
.
+on-watch
This arm is called when a program wants to subscribe to the agent on a
particular path
. The agent may or may not need to perform setup steps
to intialize the subscription. It may produce a %give
%subscription-result
to the subscriber to get it up to date, but after
this event is complete, it cannot give further updates to a specific
subscriber. It must give all further updates to all subscribers on a
specific path
.
If this arm crashes, then the subscription is immediately terminated.
More specifcally, it never started -- the subscriber will receive a
negative %watch-ack
. You may also produce an explicit %kick
to
close the subscription without crashing -- for example, you could
produce a single update followed by a %kick
.
+on-leave
This arm is called when a program becomes unsubscribed to you. Subscriptions may close because the subscriber intentionally unsubscribed, but they also could be closed by an intermediary. For example, if a subscription is from another ship which is currently unreachable, Ames may choose to close the subscription to avoid queueing updates indefinitely. If the program crashes while processing an update, this may also generate an unsubscription. You should consider subscriptions to be closable at any time.
+on-peek
This arm is called when a program reads from the agent's "scry" namespace, which should be referentially transparent. Unlike most handlers, this cannot perform IO, and it cannot change the state. All it can do is produce a piece of data to the caller, or not.
If this arm produces [~ ~ data]
, then data
is the value at the the
given path
. If it produces [~ ~]
, then there is no data at the given
path
and never will be. If it produces ~
, then we don't know yet
whether there is or will be data at the given path
.
+on-agent
This arm is called to handle responses to %give
moves to other agents.
It will be one of the following types of response:
-
%poke-ack
: acknowledgment (positive or negative) of a poke. If the value is~
, then the poke succeeded. If the value is[~ tang]
, then the poke failed, and a printable explanation (e.g. a stack trace) is given in thetang
. -
%watch-ack
: acknowledgment (positive or negative) of a subscription. If negative, the subscription is already ended (technically, it never started). -
%fact
: update from the publisher. -
%kick
: notification that the subscription has ended.
+on-arvo
This arm is called to handle responses to %pass
move
s to vanes. The
list of possible responses from the system is statically defined in
sys/zuse.hoon (grep for + sign-arvo
).
+on-fail
If an error happens in +on-poke
, the crash report goes into the
%poke-ack
response. Similarly, if an error happens in
+on-subscription
, the crash report goes into the %watch-ack
response. If a crash happens in any of the other handlers, the report
is passed into this arm.
Threads
As discussed in Agents, Urbit code lives in the following basic categories:
- Runtime (Nock interpreter, persistence engine, IO drivers, jets)
- Kernel vanes (managed by Arvo)
- Userspace agents (managed by Gall, permanent state)
- Userspace threads (managed by Spider, transient state)
This describes the last category: Threads.
A thread is a monadic function that takes arguments and produces a result. It may perform input and output while running, so it is not a pure function. It may fail.
An agent's strength is that it's permanent and bulletproof. All state transitions are defined, and each action it performs is a transaction. Code upgrades preserve existing state.
An agent's weakness is complex input and output. Since each state transition must be explicitly handled, the complexity of an agent explodes with the amount of IO it handles. At best, this results in long and complex code; at worst, unexpected states are mishandled, corrupting permanent state.
A thread's strength is that it can easily perform complex IO operations. It uses what's often called the IO monad (plus the exception monad) to provide a natural framework for IO.
A thread's weakness is that it's impermanent and may fail unexpectedly. In most of its intermediate states, it expects only a small number of events (usually one), so if it receives anything it didn't expect, it fails. When code is upgraded, it's impossible to upgrade a running thread, so it fails.
Thus, for anything that needs to be permament, use an agent. When you need to do a long or complex sequence of IO operations, reduce that to a single logical IO operation by spinning it out into a thread. If you only change your agent's state in response to success of the thread, an IO failure will never result in partially applied state changes.
A thread may also be run from the dojo by prefixing its name with -
and giving it any arguments it requires. If alone, any result will be
printed to the screen; else the output may be piped into an agent or
other sinks.
Specification
+$ thread $-(vase _*form:(strand ,vase))
A thread is a running strand. It receives an argument as a vase and may prodduce a vase of a result.
+$ tid @tasession +$ start-args [parent=(unit tid) use=(unit tid) file=term =vase]
To run a thread, poke Spider with mark %spider-start
and a value of
type start-args
. parent
is an optional parent, use
is an optional
id, file
is the filename, and vase
is the argument. If the parent
is specified (usually to the thread that spawned this thread), then
when the parent ends, the child will end too.
A thread may succeed or fail. You may recieve the return value or be
notified of failure by subscribing to Spider at /thread-result/[tid]
,
where tid
is the same one you passed in use
. Take care to create a
tid
that will be unique from any other that may be created, for
example with:
(scot %ta (cat 3 'my-agent_' (scot %uv (sham eny))))`
The %fact
that comes back on this subscription will have mark either
%thread-fail
or %thread-done
. If it failed, the content will have
type [term tang]
with an explanation of the error. If it succeeded,
the content will be whatever the thread produced.
+$ input [=tid =cage]
You may poke a thread by poking Spider with %spider-input
. tid
is
the id of the thread you wish to poke, and cage
is the data to poke it
with.
You may subscribe to a thread by subscribing to Spider at
/thread/[tid]/path
.
You may stop a thread by poking Spider with %spider-stop
and data
[tid ?]
, where tid
is the thread you wish to stop and the loobean
decides whether to quit nicely (suppresses printfs).
Writing threads
A thread is just a function from a vase of arguments to a strand that produces a vase of the result. We say "just" not because it's obvious yet what that means but because it means we only need to explain how to write strands, and threads are just a particular case.
The type of a strand is defined in lib/strand, but you will rarely touch
those types directly. Most of the time, you will use two functions
pure
and bind
combined with any of the functions defined in
lib/strandio.
Strand fundamentals
When writing strands, you almost always want to start by specializing
strand
to your return type. A strand can produce any type, but you
must specify up front what type it will produce. By convention, we use
the variable m
for this. For example, if you are going to produce a
@ud
, use:
=/ m (strand ,@ud) ^- form:m ...
Note that m
is a core that has three arms:
+form
is the mold of the strand, suitable for ^-
.
+pure
produces a strand that does nothing except return a value. So,
(pure:(strand ,@tas) %foo)
is a strand that produces %foo
without
doing any IO. Thus, the most trivial thread is:
/- spider /+ strandio =, strand=strand:spider ^- thread:spider |= vase =/ m (strand ,vase) ^- form:m (pure:m *vase)
+bind
, used with ;<
, takes two strands and runs one after the other,
letting you use the result of the first strand in the second. For
example:
;< num=@ud bind:m (pure:m 5) (pure:m +(num))
Has the same result as:
(pure:m 6)
Other useful functions
There are many useful strand functions in lib/strandio and some other libraries. For example:
+sleep
waits for the specified time.
+get-time
gets the current time.
+poke
pokes an agent.
+watch
subscribes to an agent.
+fetch-json
produces the JSON at a particular URL.
+retry
tries a strand repeatedly with exponential backoff until it
succeeeds.
+start-thread
starts another thread.
send-raw-card
sends any card.
Examples
/- spider /+ strandio =, strand=strand:spider ^- thread:spider |= arg=vase =/ m (strand ,vase) ^- form:m =+ !<([arg=@dr ~] arg) ;< now-1=@da bind:m get-time:strandio ;< ~ bind:m (sleep:strandio arg) ;< now-2=@da bind:m get-time:strandio (pure:m !>(`@dr`(sub now-2 now-1)))
The first seven lines are the same boilerplate we saw before. After
that, it !<
's the argument to its expected type. It saves the current
time in now-1
, sleeps how long the argument says to, then gets the
time again and produces the difference between the two times. This
number should be close to the argument you passed in, plus a little for
computation time. You can run this at the dojo with -time ~s1
.
/- spider /+ *ph-io =, strand=strand:spider ^- thread:spider |= args=vase =/ m (strand ,vase) ;< ~ bind:m start-simple ;< ~ bind:m (raw-ship ~bud ~) ;< ~ bind:m (dojo ~bud "[%test-result (add 2 3)]") ;< ~ bind:m (wait-for-output ~bud "[%test-result 5]") ;< ~ bind:m end-simple (pure:m *vase)
This is uses lib/ph/io
to run a pH test. Aqua and pH are described
elsewhere, but if you've initialized Aqua by running |start %aqua
and
:aqua +solid
, then you can run -ph-add
to execute this thread.
It starts by running some virtualization initialization with
start-simple
. Then, it boots a guest ship called ~bud, types a line
into the dojo, and waits for it to printed the expected output.
It's easy to see how this building blocks could be combined to perform
complex tests. For example, you could start several ships and have them
|hi
each other, sync from each other, or do anything else. Check out
lib/ph/io
and lib/ph/azimuth
for other useful functions in constructing
pH tests.