HN Gopher Feed (2017-12-05) - page 1 of 10 ___________________________________________________________________
Dependent Types for F#
109 points by jackfoxy
https://jackfoxy.github.io/DependentTypes/___________________________________________________________________
rob-kuz - 6 hours ago
Very cool library that Jack implemented here using an idea from my
blog. We are using it in production. Works really nice.
[deleted]
dwohnitmok - 3 hours ago
I'm very much a newbie when it comes to F# but as far as I can tell
this isn't an example of what's traditionally called dependent
types, but rather is a way of providing a set of helper functions
and types for "smart" constructors, that is providing a
validation/verification function that takes in some raw type and
gives back an option of either the type you want or none.As a
litmus test (and I'd be happy if either the OP or someone else
chimes in), as an example of dependent types, I'd expect to be able
to compose constraints, e.g. having a type called And<'A, 'B>
and Or<'A, 'B> that would then let me compose constraints e.g.
And
and crucially have their verification
functions automatically compose as well, instead of having to
invent a whole new type GreaterThan3AndIsEven with a whole new
verification function.As far as I can tell right now, that's not
possible with this library and I need to go down the route of
making the custom GreaterThan3AndIsEven type.Another good litmus
test is whether literals also need to be passed through the
verification function or not (or whether you have to pass them
through the verification, get an option, and then unwrap the option
with a comment indicating that it's safe to do so); for a usage of
dependent types I would expect the answer to be no usage of the
verification function would be required.The crucial thing that
dependent types let you do is that you can manipulate types and
values with many of the same tools. Composition (the and and or
from above) is one hallmark of this ability. Another is being able
to call a value-level function in a type signature. For example a
type signature that looks like x : if (a + b == c) then Boolean
else Int FWIW I don't mean to take away from this library; the
practice of using smart constructors is something I'd love all
expressively statically typed languages embrace wholeheartedly.
jackfoxy - 59 minutes ago
OP here.Since Martin-L?f type theory has introduced dependent
types through the dependent function. No one I am aware of has
improved on this approach, or even attempted a different way of
introducing dependent types. And I think theory is very
important.In Practice I think
https://en.wikipedia.org/wiki/Dependent_type gets it right in
that dependent types exist independently of dependent functions.
[A] dependent type is a type whose definition depends on a value.
I think ordinary programmers solving ordinary problems benefit
far more from the type safety dependent types provide than
dependent functions or compile time creation checks. In ordinary
programming most data is independent of the code. There are
notable exceptions, of course.If in practical programming we may
only ever use a term from computer science when it meets the CS
definition in every way, I invite the reader to start a list of
all the terms where this is already not the case.Also ordinary
programmers are more likely to gain at least a superficial
understanding of dependent types when every implementation of
smart constructors does not call the resulting types something
specific to that programming language, but otherwise unconnected
to most anything else.Constructor composition is a neat idea and
merits investigation. The idea of course springs from function
(dependent function) composition. Not exactly what you are
thinking, but I think somewhat related, take a look at the
section Converting between dependent types in the tutorial
https://jackfoxy.github.io/DependentTypes/tutorial.html
phillipcarter - 4 hours ago
This looks neat!
lkitching - 5 hours ago
Does this library actually allow you to express dependent types or
is it just for runtime validation? I can't see any examples in the
documentation of a type depending on a value like the common let
v : Vect 3 Int = [1,2,3] example.There aren't many types shown in
the examples but it looks like let myNonEmptyIntSetOpt = [] |>
Set.ofList |> NonEmptyIntSet.TryCreate would just return None at
runtime rather than a compilation error?
jackfoxy - 5 hours ago
How can you do a compile time check on data that has not entered
the system?You can apply the same technique to Vect 3 [1,2,3] or
any other type.
cstrahan - 28 minutes ago
> How can you do a compile time check on data that has not
entered the system?You would have to supply proof that your
types are correct; when you have a value who's type you'll only
know at run time, you'll need to perform case analysis to bring
additional info about the value's type into scope in the
respective branches. To make concrete what I'm describing here,
I'll use Idris's ++ (concatenation) as an example, which
provides an inductive proof for the length of the resulting
vector: (++) : (xs : Vect m elem) -> (ys : Vect n elem) ->
Vect (m + n) elem (++) [] ys = ys (++) (x::xs) ys = x
:: xs ++ ys The first line is the type signature, which claims
that the length of the resulting vector is the sum of the
lengths of the two argument vectors.The second line pattern
matches on the null/empty constructor of the vector, and then
returns ys. By pattern matching on the null constructor, we
know that m == 0. We also know that n == length ys (whatever
that may be). We know that 0 + x = x, so if we apply that to
this case, we know that: length ([] ++ ys) == length [] +
length ys == 0 + length ys == length ys == n == m + n
(because we know m==0, and 0+n==n) ... so the compiler is
happy here.The third and final line pattern matches on the non-
empty (or cons, if you're a fan of Lisp) constructor. If this
pattern applies, then we can make use of the fact that length
(x::xs) == 1 + length xs. Of course, that that implies length
xs < length (x::xs). That permits us to apply our induction
hypothesis to the (xs ++ ys) sub expression, which tells that
length (xs ++ ys) == length xs + length ys. Putting all of that
together, we see that length (x :: xs ++ ys) == 1 +
length (xs ++ ys) == 1 + length xs + length ys == 1 + (m -
1) + length ys == m + length ys == m + n That concludes
our proof by induction[1].Note that this proof is checked
completely at compile time, and that it is still valid in the
face of types which we won't know until we run our
program.That's what dependent types give you.Here's an example
that would not compile: (++) : (xs : Vect m elem) -> (ys :
Vect n elem) -> Vect (m + n) elem (++) [] ys = ys (++)
(x::xs) ys = x :: x :: xs ++ ys {- !!! note that I cons x here
*twice* -} In conclusion, your library sounds like it could be
an excellent way to provide smart constructors; at the same
time, it's important to not conflate smart constructors with
dependent types. I would urge you to continue developing your
library, though I would recommend switching the name to
something that's more indicative of what it actually does. That
would have the additional benefit of not confusing people who
read your project description and are hearing about dependent
types for the first time.[1]:
https://en.wikipedia.org/wiki/Mathematical_induction
lkitching - 5 hours ago
How do you express the type `Vect 3 Int` using this library? F#
doesn't support it and all the example types appear to depend
on other types, not on values like 3.
jackfoxy - 5 hours ago
I'm pretty sure there are vector libraries available to F#.
Dig around and you will find them. Then just apply dependent
type to a type in that library. I don't work with them
myself.
lkitching - 5 hours ago
I'm not asking specifically about vectors, that's just a
common example of a dependent type. To take another example
from the documentation, can you express a type like type
digits = RegexString (Regex "^[0-9]+$") where all values
of type digits are strings containing only digits? Then
let s1 : digits = "1234" would typecheck but let s2 :
digits = "abc" would not? That would be very surprising
given the F# type system.
vorotato - 5 hours ago
Basically when you take some value the user gives you and
cast it to the dependent type, you can then cover the case
where you got some result because the cast was successful or
you got none. You can't have dependent types at compile time,
at least not completely so, though you could use something
like fslint to catch any data being entered in
code.https://news.ycombinator.com/item?id=7567310
vorotato - 5 hours ago
it would seem that in practice you can go even further,
essentially writing exhaustive proofs. However F* already
supports this and compiles to F#, so I would say if you
want that level of correctness you should probably use a
language which is better at describing the totality of a
function.
[deleted]
jackfoxy - 5 hours ago
a dependent type is a type whose definition depends on a value
https://en.wikipedia.org/wiki/Dependent_type
HumanDrivenDev - 4 hours ago
Is there a reason dependent types aren't more widely used? Is it
just a case of people not wanting to learn anything new, or are
there some show-stopping issues in practice?The idea seems
promising. If I'm not mistaken it would let you statically declare
that an integer had to be between values X and Y, for example.
imh - 2 hours ago
I think it's a an practicality thing. Getting these systems
powerful seems sorta solved, but getting them easy to use seems
tricky. AFAIK it's similar to computer formalized proofs. We can
do them, but it's super verbose and tricky to make it convenient
enough to be worthwhile.
posterboy - 3 hours ago
you can do that in a function preamble (constructor), too.
Dependent types are derived from higher order logic, a thing many
programs simply don't need if they are straight up first order
logic in essence and the use would be constrained to refactoring
for DRY KISS principles, but dependent types are probably not
simple in comparison.
lomnakkus - 3 hours ago
> you can do that in a function preamble (constructor),
too.Well, yeah, but that's a runtime failure. Obviously that's
heaps better than just failing at some arbitrary later time,
but if might be better still to force the program writer to
consider up front what should happen if construction were to
fail.In the particular case of preconditions for constructing
data I would say that constructor validation is usually
sufficient for my purposes, at least. I routinely use it in
Haskell (abstract newtypes) and it seems to work fine,
especially when you couple it with general validation (REST
services).EDIT: I should mention: What I personally find very
attractive about fully dependently typed languages is the fact
that you, the programmer, get to choose exactly how much you
want to prove. Want to work with a JSONValue which can be any
old JSON? Sure, you can do that! Want to work with a JSONValue
which must contain (at least) keys 'foo' and 'bar' which map to
data of type 'Foo' and 'Bar' at all times? Sure you can do that
too!(JSON may not be the best example, but I hope you get the
meaning.)Of course type inference suffers a bit, error messages
probably too. However, I don't see that as a huge problem.
YMMV.
pjmlp - 3 hours ago
I guess the main reason is the lack of mapping to actual world
cases.Usually you have tutorials about peano numbers and such,
instead of how do I show a set of data records, straight out of a
PostgreSQL database.So far, Type-Driven Development with Idris,
seems to be the best book for real world cases, and even it might
a bit over the top for the common CRUD developer.
pron - 3 hours ago
One of the core issues is that arbitrary dependent types require
writing difficult formal proofs [0]. Xavier Leroy, who, as the
main author of CompCert -- probably the only non-trivial program
proven end-to-end [1] with the use of a proof assistant based on
dependent types -- has often said that even he, a world expert,
had found this so laborious that he views manual proofs as viable
only to small programs (and CompCert, while non-trivial, is
certainly a small program). Short of that, the question is
exactly how useful it is to prove only those properties that are
easy to prove. Another issue is that there may be better
alternatives (which could be combined with restricted, "easy",
forms of dependent types. Contract systems (like Java's JML), are
as expressive as dependent types, but separate specification from
verification, meaning, they allow you to state program properties
formally, and then use either (expensive) formal proofs or weaker
forms of verification (like generated tests) to verify them.
Proofs are both extremely costly and are very rarely a
requirement; their added confidence beyond other, weaker, but far
cheaper, forms of verification is very rarely worth their
tremendous cost.[0]: See this example of a simplified Quicksort
in Idris: https://github.com/bmsherman/blog/wiki/Quicksort-in-
Idris[1]: End-to-end verification is ensuring that global
correctness properties (e.g., "the database is always
consistent", or "no user ever gets access to another's data") are
preserved all the way to source code or even to machine-code.
Most "formally verified" software, at least in industry, isn't
verified end-to-end. Rather, either global properties are
verified not against the code but a high-level description of the
software or only local properties are verified at the code level,
or both (but with a gap).
poizan42 - 53 minutes ago
Have you seen CakeML[0]? It is verified that the semantics of
the x86 machine code it outputs in the end are the same as the
SML code it takes as input, up to out-of-memory errors.[0]:
https://cakeml.org/
moomin - 3 hours ago
How to use dependent types in practice is still a research topic.
There?s some interesting problems with having a Turing powerful
type system. Like type equality doesn?t work the same way.
Verdex_3 - 3 hours ago
I think it's two things. The first one is that most people are
not comfortable changing up the programming paradigm that they
are most familiar with. They're familiar with adding some if-
statements to verify at run time. They're not familiar with
using the type system to prove the properties of their data.
Proving non-trivial properties can be hard (and in some cases
impossible), so it makes sense that few people are jumping at the
opportunity to learn everything from the ground up all over again
when the alternative is to just use an if-statement.The second
thing is that sometimes you're going to get dynamic data at run
time and you're going to have to use a run time property checker
anyways. So for example if you needed to parse a bunch of text
sent in by the user. You can still use dependent types to
offload as much as possible to the type system, but at some point
you'll have to deal with receiving user data. And if we go back
to my first point, people are uncomfortable with using new
things. They won't have a good idea of when to put what into the
type system and what into the run time, so the default will be to
just do it all at run time.Liquid types look kind of promising,
but I'm not sure if it's still an active research area. The
stuff rust is doing with affine types is also promising. It's
not dependent, but you can make a bunch of very nifty compile
time checked apis. Finally, ML style types are slowly becoming
more familiar in general in the industry. Once everyone is fully
familiar with type parameters, they'll start to ask about kinds
and values in types. However, it may take a while.
lomnakkus - 3 hours ago
> You can still use dependent types to offload as much as
possible to the type system, but at some point you'll have to
deal with receiving user data.Of course you have do deal with
receiving data at run time, but I think very few people
appreciate that it's actually possible to do the "input
verification" at one specific point in your program and then
have a "proven" safe input and then never have to do any
validation/verification again. This even goes for things like
"give me a vector of integers between 1 to 3 of size exactly 9
as input". Of course you still have to handle the invalid cases
in that one specific place, but that's no different from how
you'd ideally do validation anyway. That stuff already should
be in a single place, and dependent types make that utterly
obvious at compile time :).(I think I may actually be agreeing
with what you're saying, but I though it worth expounding on
what dependent types can do for you.)Btw, AFAIUI Liquid Types,
at least as far as LiquidHaskell goes, is still a thing, though
it's definitely quite "researchy" and who knows whether it'll
become 'mainstream'.Liquid Types also seem to be somewhat
orthogonal to dependent types since they usually just rely on
an external solver that works by "magic" (SMT) and which has
built-in knowledge of e.g. arithmetic whereas most attempts at
dependent types seem to want to avoid building in any of that
knowledge in favor of induction + a more general "tactics" or
"elaboration" type solving where the programmer guides the
solver along. (Idris is an example of the latter, I think.)
pron - 1 hours ago
> it's actually possible to do the "input verification" at
one specific point in your program and then have a "proven"
safe input and then never have to do any
validation/verification again.This can be done with
"tainting" types, which are simple intersection types (even
Java has them as one of its new pluggable type systems [1]),
and doesn't require dependent types.[1]:
https://checkerframework.org/manual/#tainting-checker
jwdunne - 1 hours ago
Your last point is interesting - I believe Idris and/or Agda
in fact go out of their way to show a natural number type
defined using induction as a key example. I remember being
blown away by that.Coming from dynamic languages towards
appreciation of static types, this sort of thing is
inspiring. I've been looking forward to writing some
practical work in those languages when children and time give
the chance.
hinkley - 1 hours ago
There really does seem to be a Last Mile problem with a lot of
these systems for formalism.You're going to get garbage data
from paying customers. With a good information architecture
you'll have 'lines' in the system and all of the data that goes
past a certain point is guaranteed to be sanitized, at which
point all this formalism helps you avoid really bad
mistakes.What ends up happening most times is a set of
proprietary bespoke thunking layers that progressively sanitize
the data until its usable. Everybody writes their own and they
either suck or took lots of effort to get right. Or, the
entire system is full of uncertainty and reads like a Choose
Your Own Adventure as written by a squirrel.Maybe there's a
space there between the user and the formal type system that
needs a set of transformation tools. Like htmltidy, but
without the html.
skybrian - 4 hours ago
Lack of compelling tutorials for beginners? The examples I've
seen are either solving toy problems in a complicated way or not
understandable at all.
HumanDrivenDev - 4 hours ago
What's a toy problem in this case? Anything where you can
replace even part of a dynamic contract seems like a win.
platz - 3 hours ago
> twblalock 468 days ago [-]> The Idris example seems to need
further explanation: >> In Idris, we can say "the add
function takes two integers and returns an integer, but its
first argument must be smaller than its second argument":>>
add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} ->
Nat>> add x y = x + y>That's all well and good, if you know
the values of x and y at compile time. Consider a program
that reads x and y from STDIN. The user could provide an x
that is equal to or larger than y (or could provide only one
value, or values that are not numbers). I see no way to deal
with that except to throw a runtime error. Is that what would
happen?In the case where the values are read from the
external environment, first you would have to compare them
before calling the function.The comparator would return
either a proof that x <= y, or x > y.the proof ensures that
at compile time you cannot mess this up.in other words, You
have to perform the check at runtime, but the results for the
check are enforced via a proof that is ensured to be correct
at compile time.Connecting the proof the to the type system
at compile time is the magic of dependent types> 18 points by
platz 468 days ago [-]here's a full code example in Idris:
import Data.String -- takes two integers, and a proof
that x < y, and yields an integer add : (x :
Integer) -> (y : Integer) -> (prf : x < y =
True) -> -- require a proof that that x < y Integer
add x y prf = x + y main : IO () main = do sx
<- getLine -- read string from input sy <- getLine --
read string from input let Just x = parseInteger sx --
assuming int parse is ok, else error let Just y =
parseInteger sy -- assuming int parse is ok, else error
case decEq (x < y) True of -- decEq constructs a proof if x
< y is True Yes prf => print (add x y prf) No
=> putStrLn "no prf, x is not less than y" lets say I mess
up the sign of the comparison on the case line and write
decEq (x > y) instead... then I'd get a type error When
checking argument prf to function Main.add: Type mismatch
between x > y = True (Type of prf)
and x < y = True (Expected type) there's no
way to construct the prf value artificially, or sneak in
different parameters that are unrelated to the prf value.it's
either a compile error or it's valid.c.f.
https://news.ycombinator.com/item?id=12349384
skybrian - 3 hours ago
It may or may not be a win. If the constraint being validated
is trivial and the complexity added is substantial then maybe
it's not a win?A common example for demonstrating dependent
types seems to validating the length of a list, but they
don't show how it's useful in a problem where validating the
length is important (for example to prevent security
issues).Also, a good example would be performing a
calculation on external input data (from user input, a file,
or a network connection), rather than on a constant, and
showing how invalid input is handled.
dwohnitmok - 2 hours ago
If you're talking about the restricted/smart constructor approach
that I think is exemplified by this library, i.e. the approach
that hides the construction of a type behind a mandatory
verification function, it's a pattern that's been around for a
long time. OOP constructors are perhaps the best known example,
although they often through an exception on invalid input rather
than represent this at the type level with an Option/Maybe type
or Either type.I personally think that this approach (with the
type level tracking) is indeed woefully underused (and is an
easy-to-understand-and-implement win in most codebases) and would
usually chalk it up to lack of familiarity in the community.That
being said, there are definitely domains that can make this
approach (without the help of general dependent types) cumbersome
to the point that I would consider scrapping it. One example of
this is doing statistics and collection types. Most statistical
operations operate over a collection of values. This is easily
supplied by a standard collection type such as List or Vector. A
significant chunk require a non-empty collection of values (think
aggregate statistics such as mean). A type such as NonEmptyList
or NonEmptyVector comes in handy here along with the
corresponding verification function. A smaller chunk require
collections of at least two elements (e.g. a sample estimate of
standard deviation). Now you need a separate AtLeast2ElemList
with a separate verification function. Estimates of higher
statistical moments that are used with ever-decreasing frequency
the higher you go up require at least 3, 4, etc. elements in a
collection. Now you need an AtLeast3ElemList, an
AtLeast4ElemList, etc. set of types with all their corresponding
distinct verification functions. That's a lot of pain for little
gain. In practice I usually personally handle this with covering
90% of cases with a collection and NonEmptyCollection type and
then in those other cases have the aggregation function take a
general collection and return an optional value rather than
enforce GreaterThanN-ness at the input type. This is where
dependent types could step in and have a single function
AtLeastNElems with a single corresponding verification function
and cut down on this madness.Which brings me to general dependent
types. If you're talking about general dependent types, the
answer is the ergonomics are hard and best practices are still
nascent. Coq, Agda, ATS, and Idris are all examples of
dependently typed languages, but none of them are close to
mainstream. Part of that is because they just haven't had close
to the resources devoted to them that mainstream languages have
had. Part of that is because some aspects of using them are still
somewhat open programmer UX problems. Because you can write any
sort of total function in a type signature you have to worry
about a couple of things.1. Compile times. You're doing complex
stuff with complex types. It will always be true that you can
create pathologically long compile times (this is true in most
type systems far weaker than dependent types). What will make or
break your type system is if the pathological cases come up
regularly in normal code (sad case) or if a user has to create
some really messed up code in order to hit those cases (happy
case). This is even harder to get right if you give users a ton
of rope when it comes to types.2. If you're not careful not only
can you verify, you must verify. It is often the case in a
dependently typed language that you will use part of a type to
provide a value used in a computation. E.g. a type of
GreaterThanX might embed a comparison function that then gets
used later in the computation of a value. Most languages have
some notion of a "BelieveMeThisIsTrue" value which can be of any
type and blows up at runtime if it actually ends up getting
called. This functions as an escape hatch that lets you
circumvent any type checks because you can always submit this
value and have everything type check. You might hope that
whenever the burden of verification gets too high, you can just
submit "BelieveMeThisIsTrue" if you're sure that some property is
true and omit any necessary checks. However, if you've embedded a
value-level computation that you are planning to use later on in
the type, then "BelieveMeThisIsTrue" will blow up your whole
program, e.g. when you go to use the comparison function you've
embedded in your GreaterThanX type. This means that you gotta
make sure that verification is pretty easy or that it's pretty
hard to end up forcing mandatory verification with absolutely no
escape hatch. Both are hard problems.Dependent types are
extremely exciting, but finding the best way of using them in
practice is still something of a research problem.So far the
nicest applications of them have been in restricted domains where
you don't get the full power of dependent types, which limits
their expressiveness, but makes it a lot easier to get the
programmer ergonomics right See projects such as LiquidHaskell
(https://ucsd-progsys.github.io/liquidhaskell-blog/) or Cryptol
(https://cryptol.net/).
DonaldFisk - 3 hours ago
Ada has always had integer ranges. Dependent types include
integer ranges, but integer ranges aren't necessarily dependent
types.Dependent types do a bit more than that. Supposing you
want to put a random pixel on a canvas C, but you don't know C's
dimensions at compile time (perhaps because the user can change
them at run time). The numerical value of the pixel's x and y
coordinate ranges aren't known at compile time, but it is known
that they're not negative and less than C.width and C.height.
C = new Canvas ... // random(n) returns an integer of
type range(0, n) x = random(C.width) // x has type range(0,
C.width) y = random(C.height) // y has type range(0,
C.height) // drawPoint(w, j, i) takes args of type canvas,
range(0, w.width), range(0, w.height). drawPoint(C, x, y) //
The types of x and y depend on the run-time value of C. Using
dependent typing would enable us to remove another source of
bugs.
HumanDrivenDev - 2 hours ago
> Ada has always had integer ranges. Dependent types include
integer ranges, but integer ranges aren't necessarily dependent
types.I was under the impression that Adas integer ranges were
checked at runtime, like contracts. Is that not the case?
DonaldFisk - 2 hours ago
That's correct, it's a run-time check. According to
https://en.wikibooks.org/wiki/Ada_Programming/Types/range
A range is a signed integer value which ranges from a
First to a last Last. It is defined as range First
.. Last When a value is assigned to an object with such
a range constraint, the value is checked for validity
and Constraint_Error exception is raised when the value
is not within First to Last.
HumanDrivenDev - 2 hours ago
Right. I said dependent types would allow you to statically
declare integer ranges, as is my understanding.It's
interesting to me how Ada takes the approach of blending
types and run time contracts. In most languages it would be
two different things - a function that takes int, then a
contract or assert that handles the value.
mjn - 1 hours ago
Some Lisp implementations (like SBCL) also do that
blending, although they can statically catch some integer
range violations too. For example if you declare a
variable to be an (integer 50 100) and try to assign a
literal 30 to it, you'll get a compile-time warning in
SBCL: ; Constant 30 conflicts with its asserted type
(INTEGER 50 100). But in many other cases it'll generate
a runtime check instead, because the static type checking
is basically best-effort.
DonaldFisk - 2 hours ago
They allow you to statically declare integer ranges (as
well as other things), even if their bounds are unknown
until run-time, and without requiring a run-time type
check.
xyz-x - 6 hours ago
Nice lib and library tests!
Animats - 4 hours ago
It's like subclassing, but it has to be called something else to be
cool. Objects are so last-cen.
joeblubaugh - 4 hours ago
It's significantly different than subclassing. The wikipedia
examples are instructive: a dependent type is a type whose
definition depends on a value. A "pair of integers" is a type.
A "pair of integers where the second is greater than the
first" is a dependent type because of the dependence on the
value.