cleaned up readme

This commit is contained in:
Ellen Arvidsson 2025-06-24 16:53:42 +02:00
parent 78a4c6df6d
commit 33b8a30610
3 changed files with 480 additions and 453 deletions

466
README.md
View file

@ -2,464 +2,24 @@
experimental type system implementation
## sappho grammar
## getting and building
```sh
git clone ssh://git@gitta.log-e.se:222/lnsol/log-e-sappho.git
cd ./log-e-sappho
cmake . -B build
cd build
make
```
s, t ::= s | t (disjunction, a.k.a. union)
| s & t (conjunction, a.k.a. intersection)
| s => t (implication)
| s -> t (arrow, a.k.a. function)
| forall x . t (forall, polymorphic)
| box t (box, c.f. modular logic)
| a [t..] (type operator application)
| x (variable)
| n (nominal)
| true (true, a.k.a. top)
| false (false, a.k.a. bottom)
| { m : t } (member, a.k.a. atomic record)
## repository content
x ∈ var
n ∈ nominal
a ∈ opname
```
## sappho subtyping rules
```
---- [identity]
s, γ ⊢ s, δ
---- [true-right]
γ ⊢ true, δ
---- [false-left]
false, γ ⊢ δ
s, t, γ ⊢ δ
---- [conj-left]
s & t, γ ⊢ δ
γ ⊢ s, t, δ
---- [disj-right]
γ ⊢ s | t, δ
s, γ ⊢ δ
t, γ ⊢ δ
---- [disj-left]
s | t, γ ⊢ δ
γ ⊢ s, δ
γ ⊢ t, δ
---- [conj-right]
γ ⊢ s & t, δ
// XXX
// remove s => t? always make progress
s => t, s, t, γ ⊢ δ
---- [impl-left]
s => t, s, γ ⊢ δ
// discussion about implication below
// check how "normal sequent calculus handles this"
s, γ ⊢ t, δ
---- [impl-right]
γ ⊢ s => t, δ
// box works as a kind of "forall" for concrete types
box t, t, γ ⊢ δ
---- [box-left]
box t, γ ⊢ δ
// (.)-- filters the contexts.
γ-- ⊢ t, δ--
---- [box-right]
γ ⊢ box t, δ
// using cyclic proof strategy, rule for dealing with aliases/type operators
// becomes really simple. However, this requires well-formedness rules on types
// to exclude typedefs like
// type A = A
// Or that our cycle detection excludes paths between "identical" states.
γ ⊢ expand(a[t..]), δ
----
γ ⊢ a[t..], δ
expand(a[t..]), γ ⊢ δ
----
a[t..], γ ⊢ δ
// member types are filtered according to member name
γ >> m ⊢ δ >> m
---- [member]
γ ⊢ δ
// foralls are unwrapped and variable substituted with fresh name
// i.e. we treat the bound variable nominally
n fresh
γ [n] ⊢ δ [n]
---- [forall]
γ ⊢ δ
c, γ* ⊢ a_1, ..., a_n
∀ I ⊆ {1, ..., n}.
c, γ* ⊢ { a_i | i ∈ I }, δ*
OR
{ b_i | i ∈ ¬I }, γ* ⊢ d, δ*
---- [loads-of-fun]
a_1 -> b_1, ..., a_n -> b_n, γ ⊢ c -> d, δ
```
### context operators
* box filtering
```
(t, γ)-- =def= t, γ-- if t == box s, for some s
γ-- otherwise
```
* member unwrapping
```
(t, γ) >> m =def= s, (γ >> m) if t == { m : s }, for some s
t, (γ >> m) if t == box s, for some s
γ >> m otherwise
```
* forall-unwrapping and substitution
```
(t, γ) [n] =def= s [n/x], (γ [n]) if t == forall x . s, for some x, s
t, (γ [n]) if t == box s, for some s
γ [n] otherwise
```
### Semantic function subtyping
*Here we try to give an intuitive explainer of how to reason about function
subtyping in the semantic subtyping domain.*
Given function types
```
a_1 -> b_1, ..., a_n -> b_n
```
the conjunction (intersection) of `a_i -> b_i`, i.e.
```
intsec = (a_1 -> b_1) & ... & (a_n -> b_n)
```
represents the set of functions that combines the properties of all
`a_i -> b_i` by themselves. To get an idea of what functions this describes,
let's start with reasoning about values, that is, functions in `intsec`.
First we note that any function in this set accepts any argument that are in
either of the `a_i`s. That is, it can take any argument in the set represented
by
```
a = a_1 | ... | a_n
```
Furthermore, a function `f` in `intsec` "promises" to map any value in `a_i` to
a value in its corresponding `b_i`.
Specifically, given `I ⊆ {1..n}`, a value `v` in the intersection
```
a_I_intsec = &[i ∈ I] a_i
```
is mapped to a value in
```
b_I_intsec = &[i ∈ I] b_i
```
that is,
```
v ∈ &[i ∈ I] a_i implies f(v) ∈ &[i ∈ I] b_i
```
*In the following we will "abuse notation" a bit. Try to look at it like
talking to an old friend, and you just instinctively know exactly what they
mean without explaining...*
Now, given a type `c -> d`, how do we know if
```
intsec = &[i ∈ {1..n}] a_i -> b_i <: c -> d
```
?
Or equivalently, friend speaking, if
```
∩[i ∈ {1..n}] a_i -> b_i ⊆ c -> d
```
?
Immediately, it should be obvious that
```
c ⊆ a (= [i ∈ {1..}] a_i)
```
Furthermore, it is necessary that any function in `intsec` will have to map all
values in `c` into `d`, specifically
```
f ∈ intsec, v ∈ c implies f(v) ∈ d
```
Note that this does not mean that any function in `a_i -> b_i` must map all
values in `a_i` into `d`, since it's not necessary that `a_i \ c = ø`.
Specifically, it is not necessary that `b_i ⊆ d`. However, taking `I ⊆ {1..n}`,
assuming
```
a_I_intsec = &[i ∈ I] a_i ⊆ c
```
then, necessarily
```
b_I_intsec = &[i ∈ I] b_i ⊆ d
```
Conversely, if
```
b_I_intsect ⊆ d
```
then any function in `intsec` maps `a_I_intsec` into `d`. If `a_I_intsec`
happens to intersect with `c`, it "covers" a part of `c`. I.e. the set `I` can
be understood as promising to map a part of `c` into `d`.
Defining this formally, we say that `I_cand ⊆ {1..n}` is a (`d`-)*candidate* if,
```
∩[i ∈ I_cand] b_i ⊆ d
```
We can thus see a candidate `I_cand` as a combination, or more specifically,
intersection of the function types `{a_i -> b_i | i ∈ I_cand}`, that according
to the reasoning above promises to map a part of the domain into `d`, namely
```
∩[i ∈ I_cand] a_i
```
A candidate thus "covers" part of the domain or formally speaking, `I`
(`d`-)covers a domain `s` if `I` is a (`d`-)candidate and
```
s ⊆ ∩[i ∈ I] a_i
```
If `IS ⊆ P({1..n})` then we say that `IS` (`d`-)covers `s` if all of `s` is
covered by some `I ∈ IS`, that is
```
s ⊆ [I ∈ IS | I is d-candidate] ∩[i ∈ I] a_i
```
Given the function types `{a_i -> b_i | i ∈ {1..n}}` and its set of candidates
`IC = {I_cand_1..I_cand_k}`, it is fairly easy to see that
```
&[i ∈ {1..n}] (a_i -> b_i) <: c -> d
```
iff `IC` `d`-covers `c`:
```
c ⊆ ( [I_cand ∈ IC] ∩[i ∈ I_cand] a_i )
```
Using some *pro gamer moves*, we can see that this is equivalent to
```
c ⊆ ( ∩[I_cand ∈ IC] [i ∉ I_cand] a_i )
```
This gives us the final form for our subtyping rule
```
c, γ* ⊢ a_i | ... | a_n, δ*
forall I ⊆ {1..n}.
&[i ∈ I] b_i, γ* ⊢ d, δ*
or
c, γ* ⊢ |[i ∉ I] a_i, δ*
---- [loads-o-fun]
a_1 -> b_1, ..., a_n -> b_n, γ ⊢ c -> d, δ
```
### Recursive subtyping
Type aliases creates the possibility of recursive types, and thus we need to
handle recursive subtyping
Handling subtyping co-inductively would enable us to handle the relation
```
type A = { x : { y : A } }
type B = { y : { x : B } }
A <: { x : B }
{ x : B } <: A
```
We have made progress if we "switch context"
If we make progress and end up at the same sequent, co-induction gives us the
conclusion.
Some actions/rules lead to equivalent contexts, i.e. does not switch
context/make progress.
### Other
True ~ { x : True } on the left
False ~ { x : False } on the right
or rather
True ~ { x : True } in positive position
False ~ { x : False } in negative position
### implication
```
γ ⊢ s => t, u => s, δ
if s, then u => s holds
if ¬s, then s => t holds
```
altho
```
γ ⊢ s => t, s, δ
```
Given the msph program
```
nominal n
type Woops[e] = {
member f : (e <: false => n) | (e <: false)
}
type Noooes = {
member f : true
}
```
we can prove that, for any type `t`
```
Noooes <: Woops[t]
```
The derivation looks like this:
```
---- [identity]
true, t <: false n, (t <: false)
---- [impl-right]
true ⊢ (t <: false => n), (t <: false)
---- [disj-right]
true ⊢ (t <: false => n) | (t <: false)
---- [member]
{ f : true } ⊢ { f : (t <: false => n) | (t <: false) }
---- [expand-right]
{ f : true } ⊢ Woops[t]
---- [expand-left]
Nooes ⊢ Woops[t]
---- [impl-right]
⊢ Nooes => Woops[t]
---- [box-right]
⊢ Nooes <: Woops[t]
```
This looks a bit dangerous at first. However, if we translate the implication to
using disjunction and negation, we get
```
(t <: false => N) | (t <: false) == (¬(t <: false) | n) | (t <: false)
```
Considering the two cases
```
(1) t == false
(2) t != false
```
we see that in (1)
```
(¬(t <: false) | n) | (t <: false) ==
(¬(true) | n) | (true) ==
true
```
and in (2)
```
(¬(t <: false) | n) | (t <: false) ==
(¬(false) | n) | false ==
(true | n) | false ==
true
```
## What is true?
Our "set interpretation" of types says `true` represents the full universe of
"things".
What are these things?
For all intents and purposes, a programmer can look at them as the values of
our language.
What does this mean in practical terms?
Well, it means something of type `true` is "a value".
What is "a value" then?
That's a good question. It can be anything really.
An integer?
Yep!
A function?
Yes, m'am!
A solution to the halting problem?
Well, yeah! But good luck constructing that.
A...
I think you get my point. A value can be anything from a simple integer to
nuclear apocalypse. The thing is, from the point of view of our program, it
means "we cannot really say anything here".
But couldn't that be dangerous?
Depends on what you mean by dangerous.
I mean, to be able to give everything a type?
Well, let me ask you this: do you like rat poison?
What? To eat?
Yeah!
Nooo, that'd be dangerous!
Yeah, exactly my point.
What?
You know what rat poison is obviously, despite it being dangerous?
Well, yeah...
Maybe you got it already, but my point is: rat poison is dangerous, but you
still know about it, and its existance. So, something being dangerous does not
mean you have this blank spot of knowledge that you just keep ignoring because
of this perceived danger.
### interpretation of ⊢
what is the interpretation of
γ ⊢ δ?
given a concrete type k
conjunction of γ hold for k
then
disjunction of δ hold for k
This repository contains:
* a library implementation of *sappho* called `spho`.
* a simple language parser (and soon type checker), called `msph`, developed in
tandem with `spho`.
* documentation of the sappho type system in `docs/`

379
docs/about/formalities.md Normal file
View file

@ -0,0 +1,379 @@
# description of sappho formals
## sappho grammar
```
s, t ::= s | t (disjunction, a.k.a. union)
| s & t (conjunction, a.k.a. intersection)
| s => t (implication)
| s -> t (arrow, a.k.a. function)
| forall x . t (forall, polymorphic)
| box t (box, c.f. modular logic)
| a [t..] (type operator application)
| x (variable)
| n (nominal)
| true (true, a.k.a. top)
| false (false, a.k.a. bottom)
| { m : t } (member, a.k.a. atomic record)
x ∈ var
n ∈ nominal
a ∈ opname
```
## sappho subtyping rules
```
---- [identity]
s, γ ⊢ s, δ
---- [true-right]
γ ⊢ true, δ
---- [false-left]
false, γ ⊢ δ
s, t, γ ⊢ δ
---- [conj-left]
s & t, γ ⊢ δ
γ ⊢ s, t, δ
---- [disj-right]
γ ⊢ s | t, δ
s, γ ⊢ δ
t, γ ⊢ δ
---- [disj-left]
s | t, γ ⊢ δ
γ ⊢ s, δ
γ ⊢ t, δ
---- [conj-right]
γ ⊢ s & t, δ
// XXX
// remove s => t? always make progress
s => t, s, t, γ ⊢ δ
---- [impl-left]
s => t, s, γ ⊢ δ
// discussion about implication below
// check how "normal sequent calculus handles this"
s, γ ⊢ t, δ
---- [impl-right]
γ ⊢ s => t, δ
// box works as a kind of "forall" for concrete types
box t, t, γ ⊢ δ
---- [box-left]
box t, γ ⊢ δ
// (.)-- filters the contexts.
γ-- ⊢ t, δ--
---- [box-right]
γ ⊢ box t, δ
// using cyclic proof strategy, rule for dealing with aliases/type operators
// becomes really simple. However, this requires well-formedness rules on types
// to exclude typedefs like
// type A = A
// Or that our cycle detection excludes paths between "identical" states.
γ ⊢ expand(a[t..]), δ
----
γ ⊢ a[t..], δ
expand(a[t..]), γ ⊢ δ
----
a[t..], γ ⊢ δ
// member types are filtered according to member name
γ >> m ⊢ δ >> m
---- [member]
γ ⊢ δ
// foralls are unwrapped and variable substituted with fresh name
// i.e. we treat the bound variable nominally
n fresh
γ [n] ⊢ δ [n]
---- [forall]
γ ⊢ δ
c, γ* ⊢ a_1, ..., a_n
∀ I ⊆ {1, ..., n}.
c, γ* ⊢ { a_i | i ∈ I }, δ*
OR
{ b_i | i ∈ ¬I }, γ* ⊢ d, δ*
---- [loads-of-fun]
a_1 -> b_1, ..., a_n -> b_n, γ ⊢ c -> d, δ
```
### context operators
* box filtering
```
(t, γ)-- =def= t, γ-- if t == box s, for some s
γ-- otherwise
```
* member unwrapping
```
(t, γ) >> m =def= s, (γ >> m) if t == { m : s }, for some s
t, (γ >> m) if t == box s, for some s
γ >> m otherwise
```
* forall-unwrapping and substitution
```
(t, γ) [n] =def= s [n/x], (γ [n]) if t == forall x . s, for some x, s
t, (γ [n]) if t == box s, for some s
γ [n] otherwise
```
### Semantic function subtyping
*Here we try to give an intuitive explainer of how to reason about function
subtyping in the semantic subtyping domain.*
Given function types
```
a_1 -> b_1, ..., a_n -> b_n
```
the conjunction (intersection) of `a_i -> b_i`, i.e.
```
intsec = (a_1 -> b_1) & ... & (a_n -> b_n)
```
represents the set of functions that combines the properties of all
`a_i -> b_i` by themselves. To get an idea of what functions this describes,
let's start with reasoning about values, that is, functions in `intsec`.
First we note that any function in this set accepts any argument that are in
either of the `a_i`s. That is, it can take any argument in the set represented
by
```
a = a_1 | ... | a_n
```
Furthermore, a function `f` in `intsec` "promises" to map any value in `a_i` to
a value in its corresponding `b_i`.
Specifically, given `I ⊆ {1..n}`, a value `v` in the intersection
```
a_I_intsec = &[i ∈ I] a_i
```
is mapped to a value in
```
b_I_intsec = &[i ∈ I] b_i
```
that is,
```
v ∈ &[i ∈ I] a_i implies f(v) ∈ &[i ∈ I] b_i
```
*In the following we will "abuse notation" a bit. Try to look at it like
talking to an old friend, and you just instinctively know exactly what they
mean without explaining...*
Now, given a type `c -> d`, how do we know if
```
intsec = &[i ∈ {1..n}] a_i -> b_i <: c -> d
```
?
Or equivalently, friend speaking, if
```
∩[i ∈ {1..n}] a_i -> b_i ⊆ c -> d
```
?
Immediately, it should be obvious that
```
c ⊆ a (= [i ∈ {1..}] a_i)
```
Furthermore, it is necessary that any function in `intsec` will have to map all
values in `c` into `d`, specifically
```
f ∈ intsec, v ∈ c implies f(v) ∈ d
```
Note that this does not mean that any function in `a_i -> b_i` must map all
values in `a_i` into `d`, since it's not necessary that `a_i \ c = ø`.
Specifically, it is not necessary that `b_i ⊆ d`. However, taking `I ⊆ {1..n}`,
assuming
```
a_I_intsec = &[i ∈ I] a_i ⊆ c
```
then, necessarily
```
b_I_intsec = &[i ∈ I] b_i ⊆ d
```
Conversely, if
```
b_I_intsect ⊆ d
```
then any function in `intsec` maps `a_I_intsec` into `d`. If `a_I_intsec`
happens to intersect with `c`, it "covers" a part of `c`. I.e. the set `I` can
be understood as promising to map a part of `c` into `d`.
Defining this formally, we say that `I_cand ⊆ {1..n}` is a (`d`-)*candidate* if,
```
∩[i ∈ I_cand] b_i ⊆ d
```
We can thus see a candidate `I_cand` as a combination, or more specifically,
intersection of the function types `{a_i -> b_i | i ∈ I_cand}`, that according
to the reasoning above promises to map a part of the domain into `d`, namely
```
∩[i ∈ I_cand] a_i
```
A candidate thus "covers" part of the domain or formally speaking, `I`
(`d`-)covers a domain `s` if `I` is a (`d`-)candidate and
```
s ⊆ ∩[i ∈ I] a_i
```
If `IS ⊆ P({1..n})` then we say that `IS` (`d`-)covers `s` if all of `s` is
covered by some `I ∈ IS`, that is
```
s ⊆ [I ∈ IS | I is d-candidate] ∩[i ∈ I] a_i
```
Given the function types `{a_i -> b_i | i ∈ {1..n}}` and its set of candidates
`IC = {I_cand_1..I_cand_k}`, it is fairly easy to see that
```
&[i ∈ {1..n}] (a_i -> b_i) <: c -> d
```
iff `IC` `d`-covers `c`:
```
c ⊆ ( [I_cand ∈ IC] ∩[i ∈ I_cand] a_i )
```
Using some *pro gamer moves*, we can see that this is equivalent to
```
c ⊆ ( ∩[I_cand ∈ IC] [i ∉ I_cand] a_i )
```
This gives us the final form for our subtyping rule
```
c, γ* ⊢ a_i | ... | a_n, δ*
forall I ⊆ {1..n}.
&[i ∈ I] b_i, γ* ⊢ d, δ*
or
c, γ* ⊢ |[i ∉ I] a_i, δ*
---- [loads-o-fun]
a_1 -> b_1, ..., a_n -> b_n, γ ⊢ c -> d, δ
```
### Recursive subtyping
Type aliases creates the possibility of recursive types, and thus we need to
handle recursive subtyping
Handling subtyping co-inductively would enable us to handle the relation
```
type A = { x : { y : A } }
type B = { y : { x : B } }
A <: { x : B }
{ x : B } <: A
```
We have made progress if we "switch context"
If we make progress and end up at the same sequent, co-induction gives us the
conclusion.
Some actions/rules lead to equivalent contexts, i.e. does not switch
context/make progress.
### Other
True ~ { x : True } on the left
False ~ { x : False } on the right
or rather
True ~ { x : True } in positive position
False ~ { x : False } in negative position
### implication
```
γ ⊢ s => t, u => s, δ
if s, then u => s holds
if ¬s, then s => t holds
```
altho
```
γ ⊢ s => t, s, δ
```
Given the msph program
```
nominal n
type Woops[e] = {
member f : (e <: false => n) | (e <: false)
}
type Noooes = {
member f : true
}
```
we can prove that, for any type `t`
```
Noooes <: Woops[t]
```
The derivation looks like this:
```
---- [identity]
true, t <: false n, (t <: false)
---- [impl-right]
true ⊢ (t <: false => n), (t <: false)
---- [disj-right]
true ⊢ (t <: false => n) | (t <: false)
---- [member]
{ f : true } ⊢ { f : (t <: false => n) | (t <: false) }
---- [expand-right]
{ f : true } ⊢ Woops[t]
---- [expand-left]
Nooes ⊢ Woops[t]
---- [impl-right]
⊢ Nooes => Woops[t]
---- [box-right]
⊢ Nooes <: Woops[t]
```
This looks a bit dangerous at first. However, if we translate the implication to
using disjunction and negation, we get
```
(t <: false => N) | (t <: false) == (¬(t <: false) | n) | (t <: false)
```
Considering the two cases
```
(1) t == false
(2) t != false
```
we see that in (1)
```
(¬(t <: false) | n) | (t <: false) ==
(¬(true) | n) | (true) ==
true
```
and in (2)
```
(¬(t <: false) | n) | (t <: false) ==
(¬(false) | n) | false ==
(true | n) | false ==
true
```

88
docs/about/interpret.md Normal file
View file

@ -0,0 +1,88 @@
# interpretation of sappho
Different aspects of sappho interpreted.
## What is true?
Our "set interpretation" of types says `true` represents the full universe of
"things".
What are these things?
For all intents and purposes, a programmer can look at them as the values of
our language.
What does this mean in practical terms?
Well, it means something of type `true` is "a value".
What is "a value" then?
That's a good question. It can be anything really.
An integer?
Yep!
A function?
Yes, m'am!
A solution to the halting problem?
Well, yeah! But good luck constructing that.
A...
I think you get my point. A value can be anything from a simple integer to
nuclear apocalypse. The thing is, from the point of view of our program, it
means "we cannot really say anything here".
But couldn't that be dangerous?
Depends on what you mean by dangerous.
I mean, to be able to give everything a type?
Well, let me ask you this: do you like rat poison?
What? To eat?
Yeah!
Nooo, that'd be dangerous!
Yeah, exactly my point.
What?
You know what rat poison is obviously, despite it being dangerous?
Well, yeah...
Maybe you got it already, but my point is: rat poison is dangerous, but you
still know about it, and its existance. So, something being dangerous does not
mean you have this blank spot of knowledge that you just keep ignoring because
of this perceived danger.
### interpretation of ⊢
what is the interpretation of
γ ⊢ δ?
given a concrete type k
conjunction of γ hold for k
then
disjunction of δ hold for k