HN Gopher Feed (2017-10-17) - page 1 of 10
79 points by trevorriles
http://simonmar.github.io/posts/2017-10-17-hotswapping-haskell.html
GarvielLoken - 2 hours ago
We are going to be back in the '70s/'80s of running live-image
programming systems in no time. History repeats itself. Mark my words.
pjmlp - 1 hour ago
That is what web apps are all about actually.
teraflop - 3 hours ago
I don't want to knock the technical achievement here -- it's a cool
hack -- but I'm really surprised that it was deemed to be the best
choice for a production system.

In the first place, "we can't compile our code on every change
because it takes too long" is a really awful situation to be in. Are
developers not building and testing their changes before deploying
them? Can Facebook not afford a continuous integration system that
can run builds in parallel? It sounds like this problem is only
happening because the application is a giant monolith, but for some
reason splitting it up would slow down development even more... I'm
not sure I buy that reasoning.

The article says that "Haskell's strict type system means we're able
to confidently push new code knowing that we can't crash the server",
which is a real stretch. In addition to all of the usual ways a
computation can diverge, this hot-swapping system adds a whole new
variety of failure modes. The article talks about how the code needs
to be carefully audited to prevent memory leaks, but it doesn't even
mention the weird things that can happen when mutable state is
preserved across code modifications. Debugging is a pain when your
data structures can get into states that aren't reachable with any
single version of the code. (This is a well-known issue in Linux
kernel live-patching, for instance.)
jjnoakes - 3 hours ago
I agree. Fun read and cool hack, but it definitely feels like
they are stretching to justify the more fun of the two options
(spend time on this or spend time fixing the root cause).
cies - 3 hours ago
Glad you guys both can make a better trade-off than the
engineers that actually have their hands on the problem. /s

You're reading a blog post; you do not know all they have
tried, nor the various intricacies they're dealing with.
cakoose - 1 hour ago
Yeah, my initial reaction was "I can see how these design
decisions might make sense, but the blog post is horrible."

These kinds of designs typically emerge over a long and windy
history and, for someone who was part of that process, it's
difficult to coherently describe the final state to an
outsider. Good textbook authors have this skill. Most tech
blog authors do not. (I think that part of the problem is
that people don't respect just how difficult it actually is.)

My guess: restarting a large fleet of processes is a pain.
The rollout will typically be throttled to avoid connection
churn, among other things. For risky code changes, you
probably want a slow rollout anyway, but if you're just
tweaking abuse-detection rules (almost just a config change),
it's nice to have your changes take effect more quickly.
Dynamic loading seems like one reasonable way to achieve
that goal.

Tangent: people, please stop making analogies to mechanical
engineering feats that are WAY more difficult than what you
did. People have been loading shared libraries forever; it's
like adding an AUX port, not swapping out the engine. It's
not even in the same league as Ksplice or as the JVM's
dynamic loading/deoptimization.
jjnoakes - 2 hours ago
They explained their justification; if they don't want random
people on random forums disagreeing with their justification
because it wasn't complete enough, they are free to make it
more complete.
teraflop - 2 hours ago
You're right, I don't know all the intricacies of their
system. That's why I said "I'm surprised" rather than "this
is a bad design decision". It doesn't mean I can't point out
potential pitfalls that I think the blog post glosses over.
jimbokun - 37 minutes ago
"In the first place, 'we can't compile our code on every change
because it takes too long' is a really awful situation to be in."

Isn't this exactly the problem Go was invented to solve?
kornish - 8 minutes ago
It was one of them. However, given the other writing/talks
Facebook has put out about their usage of Haskell and Haxl, Go
is probably not a good fit for their use case due to language
expressivity concerns (not declarative enough, not enough type
safety, not syntactically flexible enough for writing DSLs).
tazjin - 7 minutes ago
It "solves" it by not doing any of the things that you'd expect
a modern language's compiler to do.

In my opinion, the time wasted debugging Go issues that could
have been statically prevented is better spent waiting for a
slightly longer compile cycle to finish.
goialoq - 36 minutes ago
JVM hotswapping is ages old, but it's usually used only in
testing, not production.
forkerenok - 1 hour ago
I remember at Standard Chartered they have a Haskell monolith
project of a few million LoC and they are relying on incremental
building. I wonder why that wasn't an option for Facebook.
From podcast: http://www.haskellcast.com/episode/002-don-stewart-
hood_syntax - 22 minutes ago
I'm not 100% sure, but they do use a custom compiler. That
might explain the difference.
JonCoens - 33 minutes ago
I should have emphasized the speed of deployment being a first-
order concern more. We certainly can (and do) build our code for
every change, but not at the speed that we want to be updating.

We use a monorepo for all of the benefits it has, and deploying
fast business-logic updates this way helps mitigate one of its
downsides (particularly when you've maximally parallelized the
build). I've found https://danluu.com/monorepo/ to give a quick
overview of how chopping up the repo would have separate
downsides.

The section about "Sticky Shared Objects" speaks directly to
mutable state across code modifications, just with a
Terribledactyl - 1 hour ago
I think the main benefit is the middle point. It sounds like they
have programs with huge memory footprints and (I'm guessing)
caches that take a while to warm up. This lets them avoid that.
Fraud detection is probably time sensitive and slow responses
goialoq - 29 minutes ago
They could transfer the cache data from one (old) server
instance to another (new) one.
elihu - 9 minutes ago
It kind of sounds like they're running into some limitations of
GHC: it tends to take a long time to compile stuff, and it tends
to generate some very big binaries. For most applications, those
aren't major problems, but in their use case (hundreds of
thousands of lines of code deployed to many servers) it is an
issue, so they're working around it. That allows them to keep
working in the language they prefer and are productive in, which
is great.

Improving GHC compile times and reducing the binary size would be
better, but presumably a lot of work has already gone into those
problems, and if it were easy someone would have done it by now.
As for myself, I really like using Haskell and I'm glad whenever
I hear about it being used in industry.
wyager - 2 hours ago
While I agree a slow build indicates a problem with their build
infra, Haskell's purity and type system do rule out issues with
mutable state (presumably there isn't any in the hot-swapped
module) and invalid states (the type system prevents invalid
states from being constructed, given the way they have a fixed
hot-cold API).

The article describes the hot-swapped module as containing
frequently changing business logic, which sounds like it's
something they can probably do via an interface with well-
constrained or no mutability.
rdtsc - 3 hours ago
Hotswapping, just like security, is one of those things that is
hard to bolt on later, unless it is built deeply into the very
core of the language/runtime.

Erlang (and Elixir) define hotswapping very well. It is a
standard way to upgrade code in production in some places. And
even with it being well defined, it is still very hard and there
are enough corner cases to handle. But when used correctly, it is
really magical and can achieve nice properties.

Besides just upgrading code, hotswapping (at least in Erlang) can
be used for debugging -- you can update the running code with
extra log statements to catch sneaky corner cases. Maybe it is a
customer setup that is very hard to replicate. Or you can use it
for local development: as you edit code, the module gets
auto-reloaded (with a helper).

It can also be used to deliver hot fixes. Say the fix is simple
and the customer cannot wait for a full release to be built; you
can update their system on the spot to tide them over. Not ideal,
but I've seen it save the day many times.
andy_ppp - 41 minutes ago
Couldn't agree more. When the article said "Starting and tearing
down millions of heavy processes a day would create undue churn
on other infrastructure" I just thought, yes, I bet you'd
struggle to create an architecture so monolithic in Erlang or
Elixir. Just one of the many benefits, of course... add on the
number of processes you can create on one machine while maintaining
crusso - 1 hour ago
> Or you can use it for local development, as you edit code

This is a huge feature for me in my Elixir development. I mostly
use Elixir for some server code that manages many connections to
external network entities. It would be a huge hassle to bring
down my server application every time I want to make a change.

With Elixir (yeah, Erlang), I can normally recompile the module
I'm working on and deploy it in the running server. Not only is
it a good way to constantly observe Erlang hot-swapping in action
on my dev machine, it's a huge time saver.
tekacs - 3 hours ago
People might find this interesting to compare and contrast to
(certainly I noticed parallels with) Netflix's 'serverless'
platform, discussed yesterday:
https://medium.com/netflix-techblog/developer-experience-les...

Edit: As a sibling commenter notes, this is most eminently doable
with e.g. Common Lisp and BEAM (Erlang/Elixir), but more folks
are (publicly) attempting this in other environments now (I've
experimented with a number of approaches to this over the last
few years, so I'm trying to keep score -- would love to see any
comments on other attempts below).

Quote: "At the core of the redesign is a Dynamic Scripting
Platform which provides us the ability to inject code into a
running Java application at any time. This means we can alter the
behavior of the application without a full scale deployment."
cies - 3 hours ago
> Common Lisp and BEAM (Erlang/Elixir)

Clojure as well, right?
tekacs - 3 hours ago
Yup, having strong namespaces in general makes this easier I
think, and in Clojure, tools like devcards (CLJS) and Ring reload
(CLJ) support this workflow.
Mikeb85 - 2 hours ago
Anything JVM based. Also Smalltalk.
aaron-lebo - 3 hours ago
It's basically this:

https://news.ycombinator.com/item?id=8804381

http://nullprogram.com/blog/2014/12/23/

You'd be surprised how many languages can do this. Though it's
hard to beat Lisp (and Erlang), where it is the default.
lallysingh - 1 hour ago
Really, anyone who bothers reading the dlopen(3) manpage:
unloading code is straightforward. The trick is in getting the
code called from existing code.
pjmlp - 1 hour ago
Loading is straightforward, unloading depends.