Creating a spec for destructuring

A while back David Nolen had a thoughtful post about using spec as a tool for thought, which included an exploration of creating a spec for clojure.core/let.

The latest Clojure alpha actually includes a spec for let that covers destructuring and I thought it might be interesting to walk through the details of how it is implemented.

I'll pick up approximately where David left off. A typical let looks like this:

(let [a 1
      b 2]
  (+ a b))

We can define an initial spec for clojure.core/let by splitting it into bindings and body:

(require '[clojure.spec :as s]
         '[clojure.spec.gen :as gen])

(s/fdef let
  :args (s/cat :bindings ::bindings
               :body (s/* any?)))

We then need to more fully define bindings as a vector of individual bindings. Each binding is made of a binding-form and an init-expr that computes the value of the local binding:

(s/def ::bindings (s/and vector? (s/* ::binding)))
(s/def ::binding (s/cat :binding ::binding-form 
                        :init-expr any?))

The expressions can be anything so we leave those as any?. The binding-form is where things get interesting. Let's first allow for binding-form to be just simple (no namespace) symbols. That's enough to create something to work with.

;; WORK IN PROGRESS
(s/def ::binding-form simple-symbol?)

Now that we have a full spec, we can actually try a few things. Let's try an example of conforming our bindings.

(s/conform ::bindings '[a 1, b 2])
;;=> [{:binding a, :init-expr 1} {:binding b, :init-expr 2}]

Looks good! We get back a vector of binding maps broken into the binding and the initial expression.
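
As a quick check of what this first version accepts and rejects, here is a minimal sketch using s/valid?. It assumes the namespace clojure.spec.alpha, which is the name spec has in released Clojure 1.9+ (the alpha build used in this post called it clojure.spec); the three specs are repeated so the snippet stands alone.

```clojure
(require '[clojure.spec.alpha :as s])

;; Same work-in-progress specs as above, repeated so this runs on its own.
(s/def ::binding-form simple-symbol?)
(s/def ::binding (s/cat :binding ::binding-form
                        :init-expr any?))
(s/def ::bindings (s/and vector? (s/* ::binding)))

(s/valid? ::bindings '[a 1, b 2]) ;;=> true
(s/valid? ::bindings '[a 1, b])   ;;=> false, b has no init-expr
(s/valid? ::bindings '(a 1))      ;;=> false, not a vector
(s/valid? ::bindings '[a/b 1])    ;;=> false, a/b is a qualified symbol
```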

Now we need to expand our spec to include sequential destructuring and map destructuring.

Sequential destructuring

Sequential destructuring binds a series of symbols to the corresponding elements in the bound value. Optionally, the symbols may be followed by a variadic argument (using &) and/or an alias for the overall sequence (using :as).

Some examples:

;; Sequential destructuring examples:
[a b]
[a b :as s]
[a b & r]
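
For reference, this is how those three forms behave in an ordinary let, with no spec involved. The bound values are arbitrary sample data chosen for illustration.

```clojure
;; Positional binding: extra elements are ignored.
(let [[a b] [1 2 3]]
  [a b])
;;=> [1 2]

;; :as binds the whole value as well.
(let [[a b :as s] [1 2 3]]
  [a b s])
;;=> [1 2 [1 2 3]]

;; & binds the remaining elements as a seq.
(let [[a b & r] [1 2 3 4]]
  [a b r])
;;=> [1 2 (3 4)]
```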

To describe a sequential spec we use the spec regex operators:

;; WORK IN PROGRESS
(s/def ::seq-binding-form
  (s/cat :elems (s/* simple-symbol?)
         :rest  (s/? (s/cat :amp #{'&} :form simple-symbol?))
         :as    (s/? (s/cat :as #{:as} :sym simple-symbol?))))

Let's try it out:

(s/conform ::seq-binding-form '[a b])
;;=> {:elems [a b]}
(s/conform ::seq-binding-form '[a b :as s])
;;=> {:elems [a b], :as {:as :as, :sym s}}
(s/conform ::seq-binding-form '[a b & r])
;;=> {:elems [a b & r]}

Hang on a sec, what happened in the last example? The elems snagged & r as well because & is a symbol. We need to redefine our notion of what a binding symbol is to exclude the symbol &, which is special in the language of destructuring:

;; WORK IN PROGRESS
(s/def ::local-name (s/and simple-symbol? #(not= '& %)))
(s/def ::seq-binding-form
  (s/cat :elems (s/* ::local-name)
         :rest  (s/? (s/cat :amp #{'&} :form ::local-name))
         :as    (s/? (s/cat :as #{:as} :sym ::local-name))))

(s/conform ::seq-binding-form '[a b & r :as s])
;;=> {:elems [a b], :rest {:amp &, :form r}, :as {:as :as, :sym s}}

That's better. But it turns out I've not really been spec'ing the full truth of sequential destructuring. Each of the ::elems can itself be sequentially destructured, and even the rest arg can be destructured.

We need to back up to the beginning and reconsider the definition of ::binding-form to add the possibility of either a ::local-name (our improved simple symbol) or a sequential destructuring form. (We'll add map later.)

(s/def ::local-name (s/and simple-symbol? #(not= '& %)))

;; WORK IN PROGRESS (still missing ::map-binding-form)
(s/def ::binding-form
  (s/or :sym ::local-name
        :seq ::seq-binding-form))

(s/def ::seq-binding-form
  (s/cat :elems (s/* ::binding-form)
         :rest  (s/? (s/cat :amp #{'&} :form ::binding-form))
         :as    (s/? (s/cat :as #{:as} :sym ::local-name))))

Now ::binding-form is a recursive specification. Binding-forms are either symbols or sequential forms, which may themselves contain binding-forms. The registry provides naming indirection which makes this possible.

Let's try our prior example again and see how things have changed.

(s/conform ::seq-binding-form '[a b & r :as s])
;;=> {:elems [[:sym a] [:sym b]], :rest {:amp &, :form [:sym r]}, :as {:as :as, :sym s}}

Our conformed result is a bit more verbose as it now indicates for each binding form what kind of binding it is. While this is more verbose to read, it's also easier to process. Here's how a recursive binding form example looks:

(s/conform ::seq-binding-form '[a [b & c] [d :as e]])
;;=> {:elems [[:sym a]
;;            [:seq {:elems [[:sym b]], :rest {:amp &, :form [:sym c]}}]
;;            [:seq {:elems [[:sym d]], :as {:as :as, :sym e}}]]}

Finally we are ready to look at map destructuring.

Map destructuring

Map destructuring has a number of entry forms that can be used interchangeably:

  • <binding-form> key - for binding either a local name with (get m key) or recursively destructuring
  • :keys [key ...] - for binding locals with the same name as each key to the value retrieved from the map using the key as a keyword. In addition the specified keys can be either symbols or keywords and simple or qualified. In all cases, the local that gets bound is a short symbol and the value is looked up as a keyword.
  • :<ns>/keys [key ...] - same as :keys, but where ns is used as the namespace for every key
  • :syms [sym ...] - for binding locals with the same name as each sym to the value retrieved from the map using sym, which may be either simple or qualified.
  • :<ns>/syms [sym ...] - same as :syms, but where ns is used as the namespace for every symbol.
  • :strs [str ...] - for binding locals with the same name as each str to the value retrieved from the map using str as a string. Each str is written as a symbol, which must be simple.
  • :or {sym expr} - for providing default values for any missing local that would have been bound based on other entries. The keys should always be simple symbols (the same as the bound locals) and the exprs are any expression.
  • :as sym - binds the entire map to a local named sym.
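
To make the entry forms concrete, here is a small sketch that exercises each of them in a plain let. The map m and its keys are made-up sample data, and the :<ns>/keys form assumes Clojure 1.9 or later.

```clojure
;; Sample map with a keyword key, a qualified keyword key,
;; a symbol key, and a string key.
(def m {:x 1, :geo/lat 52, 'y 2, "z" 3})

(let [{x1 :x} m] x1)                    ;;=> 1     <binding-form> key
(let [{:keys [x]} m] x)                 ;;=> 1     :keys
(let [{:keys [geo/lat]} m] lat)         ;;=> 52    qualified key inside :keys
(let [{:geo/keys [lat]} m] lat)         ;;=> 52    :<ns>/keys
(let [{:syms [y]} m] y)                 ;;=> 2     :syms
(let [{:strs [z]} m] z)                 ;;=> 3     :strs
(let [{:keys [w] :or {w 0}} m] w)       ;;=> 0     :or supplies a default
(let [{:keys [x] :as all} m] (= all m)) ;;=> true  :as binds the whole map
```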

There is a lot of functionality packed into map binding forms and in fact there are really three different map specs combined into this single map. We call this a "hybrid" map spec.

The first part describes just the fixed well-known attributes in a typical s/keys spec:

(s/def ::keys (s/coll-of ident? :kind vector?))
(s/def ::syms (s/coll-of symbol? :kind vector?))
(s/def ::strs (s/coll-of simple-symbol? :kind vector?))
(s/def ::or (s/map-of simple-symbol? any?))
(s/def ::as ::local-name)

(s/def ::map-special-binding
  (s/keys :opt-un [::as ::or ::keys ::syms ::strs]))

The second part describes the basic binding form specs (examples like {n :name} ), although the left hand side here can further destructure.

(s/def ::map-binding (s/tuple ::binding-form any?))

And finally we need to handle the new functionality for namespaced key or symbol sets (like :<ns>/keys or :<ns>/syms) which we'll describe here as a map entry tuple:

(s/def ::ns-keys
  (s/tuple
    (s/and qualified-keyword? #(-> % name #{"keys" "syms"}))
    (s/coll-of simple-symbol? :kind vector?)))

Then we can put all of these together into the ::map-binding-form by combining them as an s/merge of the well-known attributes and a description of the possible tuple forms:

;; collection of tuple forms
(s/def ::map-bindings
  (s/every (s/or :mb ::map-binding
                 :nsk ::ns-keys
                 :msb (s/tuple #{:as :or :keys :syms :strs} any?)) :into {}))

(s/def ::map-binding-form (s/merge ::map-bindings ::map-special-binding))

And finally we need to go back and define our parent spec to include map bindings:

(s/def ::binding-form
  (s/or :sym ::local-name
        :seq ::seq-binding-form
        :map ::map-binding-form))

And that's it! Here's an example binding form that shows several features of destructuring:

(s/conform ::binding-form
  '[[x1 y1 :as p1]
    [x2 y2 :as p2]
    {:keys [color weight]
     :or {color :black weight :bold}
     :as opts}])
;;=> [:seq {:elems [[:seq {:elems [[:sym x1] [:sym y1]], :as {:as :as, :sym p1}}]
;;                  [:seq {:elems [[:sym x2] [:sym y2]], :as {:as :as, :sym p2}}]
;;                  [:map {:keys [color weight]
;;                         :or {color :black, weight :bold}
;;                         :as opts}]]}]

Now that we have a spec for destructuring, we can reuse it anywhere destructuring is allowed - in fn, defn, for, etc. We could even leverage it to implement destructuring itself. Rather than recursively parsing the binding form, we could simply conform it to receive a more regular structure described in terms of the parts we've defined in the spec.
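
As a rough illustration of working from the conformed structure instead of the raw form, the sketch below collects the local names that a binding form would introduce. It is not how Clojure implements destructuring. The helper bound-locals is a hypothetical name, only the symbol and sequential cases are handled (map forms are left out to keep it short), and the namespace clojure.spec.alpha is assumed (the alpha build used in this post called it clojure.spec).

```clojure
(require '[clojure.spec.alpha :as s])

;; The symbol + sequential subset of the specs from this post.
(s/def ::local-name (s/and simple-symbol? #(not= '& %)))

(s/def ::binding-form
  (s/or :sym ::local-name
        :seq ::seq-binding-form))

(s/def ::seq-binding-form
  (s/cat :elems (s/* ::binding-form)
         :rest  (s/? (s/cat :amp #{'&} :form ::binding-form))
         :as    (s/? (s/cat :as #{:as} :sym ::local-name))))

;; Hypothetical helper: walk a conformed ::binding-form, a [tag value]
;; pair produced by s/or, and return the locals it binds, in order.
(defn bound-locals [[tag v]]
  (case tag
    :sym [v]
    :seq (vec (concat (mapcat bound-locals (:elems v))
                      (when-let [r (:rest v)] (bound-locals (:form r)))
                      (when-let [a (:as v)] [(:sym a)])))))

(bound-locals (s/conform ::binding-form '[a [b & c] :as all]))
;;=> [a b c all]
```

Because s/or tags every nested form, the walker only needs a case on the tag; it never has to inspect the shape of the original vector.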

The Next Five Years of ClojureScript

I delivered a talk at the well-attended ClojuTRE conference in Tampere, Finland this past September titled "The Next Five Years of ClojureScript". The ClojureScript community is growing at a healthy clip and many recent adopters are unaware that the ClojureScript development effort is so mature. I decided it was time to highlight how far the project has come and celebrate the incredible work of the community. While it's certainly true that Cognitect has led and continues to lead core development, it's the community that has collectively delivered on the promise of the project by filling in so many important details. And of course, outside of core development there's been an unbelievable amount of broader open source activity to ensure ClojureScript is able to achieve the reach Rich Hickey talked about five years ago. Whether web browser, iOS, or Android - the ClojureScript community is bringing the simplicity of Clojure where we need it most.

The talk ends with some thoughts about ClojureScript looking ahead into the next five years. In many ways ClojureScript has been and continues to be ahead of the JavaScript mainstream with respect to best practices. Concepts which are only starting to break into the mainstream, such as immutability, single atom application state, and agile UI development via robust hot-code reloading, are old news to ClojureScript users. And thanks to the underappreciated Google Closure compiler, ClojureScript offers features like dead code elimination and precise code splitting that popular JavaScript tooling is unlikely to achieve anytime in the near future.

Still, despite some of these continuing issues, the JavaScript ecosystem offers many riches, and looking at 2017 we'll be focusing on deeper integration with the various JavaScript module formats. As with Clojure and Java, a core ClojureScript value proposition is a simpler programming model that allows users to frictionlessly integrate those solutions from a vast ecosystem that precisely fit their needs.

Works on My Machine: Self Healing Code with clojure.spec

Works On My Machine is the place where Cognitects reflect on tech, culture, and the work we do. The views expressed on Works On My Machine are those of the author and don’t necessarily reflect the official position of Cognitect.

How can we make code smarter? One of the ways is to be more resilient to errors. Wouldn't it be great if a program could recover from an error and heal itself? This code would be able to rise above the mistakes of its humble programmer and make itself better.

The prospect of self-healing code has been heavily researched and long sought after. In this post, we will take a look at some of the key ingredients from research papers. Then, drawing inspiration from one of them, we will attempt an experiment in Clojure using clojure.spec.

Self Healing Code Ingredients

The paper Towards Design for Self-healing outlines a few main ingredients that we will need.

  • Failure Detection - This one is pretty straightforward. We need to detect the problem in order to fix it.
  • Fault Diagnosis - Once the failure has been detected, we need to be able to figure out exactly what the problem was so that we can find a solution.
  • Fault Healing - This involves the act of finding a solution and fixing the problem.
  • Validation - Some sort of testing that the solution does indeed solve the problem.

With our general requirements in hand, let's turn to another paper for some inspiration for a process to actually achieve our self healing goal.

Self Healing with Horizontal Donor Code Transfer

MIT developed a system called CodePhage, inspired by horizontal gene transfer in the biological world, where genetic material moves between different organisms. It uses a "horizontal code transfer system" that fixes software errors by transferring correct code from a set of donor applications.

This is super cool. Could we do something like this in Clojure?

Clojure itself has the fundamental ability with macros to let the code modify itself. The programs can make programs! That is a key building block but clojure.spec is something new and has many other advantages that we can use.

  • clojure.spec gives code the ability to describe itself. With it we can describe the data the functions take as input and output in a concise and composable way.
  • clojure.spec gives us the ability to share these specifications with other code in the global registry.
  • clojure.spec gives us the ability to generate data from the specifications, so we can make example data that fits the function's description.

With the help of clojure.spec, we have all that we need to design and implement a self-healing code experiment.

Self Healing Clojure Experiment

We'll start with a simple problem.

Imagine a programmer has to write a small report program. It will be a function called report that is made up of three helper functions. It takes in a list of earnings and outputs a string summary of the average.

(defn report [earnings]
  (-> earnings
      (clean-bad-data)
      (calc-average)
      (display-report)))

The problem is that our programmer has made an error in the calc-average function. A divide by zero error will be triggered on a specific input.

Our goal will be to use clojure.spec to find a matching replacement function from a set of donor candidates.

Then replace the bad calc-average function with a better one, and heal the report function for future calls.

The Setup

Let's start with the report code. Throughout the code examples I will be using clojure.spec to describe the function and its data. If you haven't yet looked at it, I encourage you to check out the spec Guide.

The first helper function is called clean-bad-data. It takes in a vector of anything and filters out only those elements that are numbers.

(defn clean-bad-data [earnings]
  (filter number? earnings))

Let's create a couple of specs to help us describe it. The first, earnings, describes the function's parameter vector (hence the s/cat), which holds a single argument: a collection of anything.

(s/def ::earnings (s/cat :elements (s/coll-of any?)))

The next spec for the output of the function we will call cleaned-earnings. It is going to have a custom generator for the purposes of this experiment, which will constrain the generator to just returning the value [[1 2 3 4 5]] as its example data[^1].

(s/def ::cleaned-earnings (s/with-gen
                            (s/cat :clean-elements (s/coll-of number?))
                            #(gen/return [[1 2 3 4 5]])))

An example of running the function is:

(clean-bad-data [1 2 "cat" 3])
;=>(1 2 3)

If we call spec's exercise on it, it will return the custom sample data from the generator.

(s/exercise ::cleaned-earnings 1)
;=> ([[[1 2 3 4 5]] {:clean-elements [1 2 3 4 5]}])

Now we can spec the function itself with s/fdef. It takes the earnings spec for the args and the cleaned-earnings spec for the return value.

(s/fdef clean-bad-data
        :args ::earnings
        :ret ::cleaned-earnings)

We will do the same for the calc-average function, which has the flaw vital to our experiment: if we pass it an empty vector for the earnings, the count will be zero and the division will throw a divide by zero error at runtime.

(defn calc-average [earnings]
  (/ (apply + earnings) (count earnings)))

(s/def ::average number?)

(s/fdef calc-average
    :args ::cleaned-earnings
    :ret ::average)

Finally, we will create the display-report function and finish by spec'ing the report function itself.

(s/def ::report-format string?)

(defn display-report [avg]
  (str "The average is " avg))

(s/fdef display-report
        :args (s/cat :elements ::average)
        :ret ::report-format)

(defn report [earnings]
  (-> earnings
      (clean-bad-data)
      (calc-average)
      (display-report)))

(s/fdef report
        :args ::earnings
        :ret ::report-format)

Giving a test drive:

(report [1 2 3 4 5])
;=> "The average is 3"

And the fatal flaw:

(report [])
;=>  EXCEPTION! Divide by zero

Now we have our problem set up. Next we need our donor candidates.

The Donor Candidates

We will keep them in a separate namespace. There will be a number of them, each with its own function spec. Some of them will not be a match for our spec at all. Those bad ones include:

  • bad-calc-average It returns the first number in the list and doesn't calc the average at all.
  • bad-calc-average2 It computes the average correctly but returns the result as a string. It won't match the spec of our calc-average function.
  • adder It takes a number and adds 5 to it. It also won't match the spec of calc-average.

There is a matching function called better-calc-average that matches the spec of our calc-average function and has the additional check for divide by zero.

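(s/def ::numbers (s/cat :elements (s/coll-of number?)))
(s/def ::result number?)

(defn better-calc-average [earnings]
  (if (empty? earnings)
    0
    (/ (apply + earnings) (count earnings))))

;; register the function spec so the healer can find this candidate
(s/fdef better-calc-average
        :args ::numbers
        :ret ::result)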

This is the one that we will want to use to replace our broken one.

We have the problem. We have the donor candidates. All we need is the self-healing code to detect the problem, select and validate the right replacement function, and replace it.

The Self Healing Process

Our process is going to go like this:

  • Try the report function and catch any exceptions.
  • If we get an exception, look through the stack trace and find the failing function name.
  • Retrieve the failing function's spec from the spec registry
  • Look for potential replacement matches in the donor candidates
    • Check the original function's and the donor's :args spec and make sure that they are both valid for the failing input
    • Call the donor with the failing input and check that its result is valid against both the original function's and the donor's :ret spec
    • Call spec's exercise on the original function's :args spec to get a seed value. Check that the candidate function returns the same result for the seed value as the original function.
  • If a donor match is found, then redefine the failing function as the new function. Then call the top level report form again, this time using the healed function.
  • Return the result!

(ns self-healing.healing
  (:require [clojure.spec :as s]
            [clojure.string :as string]))

(defn get-spec-data [spec-symb]
  (let [[_ _ args _ ret _ fn] (s/form spec-symb)]
       {:args args
        :ret ret
        :fn fn}))

(defn failing-function-name [e]
  (as-> (.getStackTrace e) ?
    (map #(.getClassName %) ?)
    (filter #(string/starts-with? % "self_healing.core") ?)
    (first ?)
    (string/split ? #"\$")
    (last ?)
    (string/replace ? #"_" "-")
    (str *ns* "/" ?)))

(defn spec-inputs-match? [args1 args2 input]
  (println "****Comparing args" args1 args2 "with input" input)
  (and (s/valid? args1 input)
       (s/valid? args2 input)))

(defn- try-fn [f input]
  (try (apply f input) (catch Exception e :failed)))

(defn spec-return-match? [fname c-fspec orig-fspec failing-input candidate]
  (let [rcandidate (resolve candidate)
        orig-fn (resolve (symbol fname))
        result-new (try-fn rcandidate failing-input)
        [[seed]] (s/exercise (:args orig-fspec) 1)
        ;; "old" is the original function, "new" is the candidate
        result-old-seed (try-fn orig-fn seed)
        result-new-seed (try-fn rcandidate seed)]
    (println "****Comparing seed " seed "with new function")
    (println "****Result: old" result-old-seed "new" result-new-seed)
    (and (not= :failed result-new)
         (s/valid? (:ret c-fspec) result-new)
         (s/valid? (:ret orig-fspec) result-new)
         (= result-old-seed result-new-seed))))

(defn spec-matching? [fname orig-fspec failing-input candidate]
  (println "----------")
  (println "**Looking at candidate " candidate)
  (let [c-fspec (get-spec-data candidate)]
    (and (spec-inputs-match? (:args c-fspec) (:args orig-fspec) failing-input)
         (spec-return-match? fname c-fspec orig-fspec  failing-input candidate))))

(defn find-spec-candidate-match [fname fspec-data failing-input]
  (let [candidates (->> (s/registry)
                        keys
                        (filter #(string/starts-with? (namespace %) "self-healing.candidates"))
                        (filter symbol?))]
    (println "Checking candidates " candidates)
    (some #(if (spec-matching? fname fspec-data failing-input %) %) (shuffle candidates))))


(defn self-heal [e input orig-form]
  (let [fname (failing-function-name e)
        _ (println "ERROR in function" fname (.getMessage e) "-- looking for replacement")
        fspec-data (get-spec-data (symbol fname))
        _ (println "Retrieving spec information for function " fspec-data)
        match (find-spec-candidate-match fname fspec-data [input])]
    (if match
      (do
        (println "Found a matching candidate replacement for failing function" fname " for input" input)
        (println "Replacing with candidate match" match)
        (println "----------")
        (eval `(def ~(symbol fname) ~match))
        (println "Calling function again")
        (let [new-result (eval orig-form)]
          (println "Healed function result is:" new-result)
          new-result))
      (println "No suitable replacement for failing function "  fname " with input " input ":("))))

(defmacro with-healing [body]
  (let [params (second body)]
    `(try ~body
          (catch Exception e# (self-heal e# ~params '~body)))))
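
The stack-trace step in failing-function-name relies on how Clojure names the JVM class compiled for each function: the namespace and function name are joined with $, and dashes become underscores. The sketch below isolates just that string handling. The helper class-name->fn-name is a hypothetical name introduced for illustration and the class names are sample inputs; clojure.repl/demunge is a built-in that may be a sturdier starting point.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: recover a function name from a munged class name,
;; using the same steps as failing-function-name above.
(defn class-name->fn-name [class-name]
  (-> class-name
      (string/split #"\$")      ; "ns$fn" -> ["ns" "fn"]
      last                      ; keep the final segment
      (string/replace "_" "-"))) ; undo dash munging

(class-name->fn-name "self_healing.core$calc_average")
;;=> "calc-average"

;; Caveat: an anonymous fn nested inside calc-average gets its own class,
;; and taking the last segment then names the inner fn instead.
(class-name->fn-name "self_healing.core$calc_average$fn__123")
;;=> "fn--123"
```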

What are we waiting for? Let's try it out.

Running the Experiment

First we call the report function with a non-empty vector.

(healing/with-healing (report [1 2 3 4 5 "a" "b"]))
;=>"The average is 3"

Now, the big test.

(healing/with-healing (report []))
; ERROR in function self-healing.core/calc-average Divide by zero -- looking for replacement
; Retrieving spec information for function  {:args :self-healing.core/cleaned-earnings, :ret :self-healing.core/average, :fn nil}
; Checking candidates  (self-healing.candidates/better-calc-average self-healing.candidates/adder self-healing.candidates/bad-calc-average self-healing.candidates/bad-calc-average2)
; ----------
; **Looking at candidate  self-healing.candidates/better-calc-average
; ****Comparing args :self-healing.candidates/numbers :self-healing.core/cleaned-earnings with input [[]]
; ****Comparing seed  [[1 2 3 4 5]] with new function
; ****Result: old 3 new 3
; Found a matching candidate replacement for failing function self-healing.core/calc-average  for input []
; Replacing with candidate match self-healing.candidates/better-calc-average
; ----------
; Calling function again
; Healed function result is: The average is 0
;=>"The average is 0"

Since the function is now healed we can call it again and it won't have the same issue.

(healing/with-healing (report []))
;=>"The average is 0"

It worked!

Taking a step back, let's take a look at the bigger picture.

Summary

The self healing experiment we did was intentionally very simple. We didn't include any validation on the :fn component of the spec, which would give us yet another layer of compatibility checking. We also only checked one seed value from the spec's exercise generator. If we wanted to, we could have checked 10 or 100 values to ensure the replacement function's compatibility. Finally (as mentioned in the footnote), we neglected to use any of spec's built-in check functionality, which would have identified the divide by zero error before it happened.
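
Checking more than one seed could look roughly like the sketch below. The name agrees-on-seeds? is hypothetical, try-fn mirrors the helper from the healing namespace, and the seeds are passed in directly so the snippet runs without a generator; in the experiment they would come from something like (map first (s/exercise (:args orig-fspec) 100)). Seeds on which the original function itself fails are skipped, which is a design choice made for this sketch, not something the experiment does.

```clojure
;; Same idea as try-fn in the healing namespace.
(defn try-fn [f args]
  (try (apply f args) (catch Exception _ :failed)))

;; Hypothetical helper: true when the candidate returns the same result
;; as the original on every seed the original can handle.
(defn agrees-on-seeds? [orig-fn candidate-fn seeds]
  (every? (fn [args]
            (let [expected (try-fn orig-fn args)]
              (or (= :failed expected)
                  (= expected (try-fn candidate-fn args)))))
          seeds))

(defn calc-average [xs] (/ (apply + xs) (count xs)))
(defn better-calc-average [xs]
  (if (empty? xs) 0 (/ (apply + xs) (count xs))))
(defn bad-calc-average [xs] (first xs))

;; Each seed is an argument vector, like the values s/exercise returns.
(agrees-on-seeds? calc-average better-calc-average [[[1 2 3]] [[]] [[10 20]]])
;;=> true
(agrees-on-seeds? calc-average bad-calc-average [[[1 2 3]] [[10 20]]])
;;=> false
```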

Despite being just a simple experiment, I think that it proves that clojure.spec adds another dimension to how we can solve problems in self-healing and other AI areas. In fact, I think we have just scratched the surface on all sorts of new and exciting ways of looking at the world.

For further exploration, there is a talk from EuroClojure about this, as well as about using clojure.spec with Genetic Programming.

[^1]: The reason for this is that if the programmer in our made-up example didn't have the custom generator and ran spec's check function, it would have reported the divide by zero error and we would have found the problem. Just like in the movies, where if the protagonist had just done x there would be no crisis that would require them to do something heroic.

2016 State of Clojure Community Survey Now Open

It's time for the annual State of Clojure Community survey!

If you are a user of Clojure, ClojureScript, or ClojureCLR, we are greatly interested in your responses to the following survey:

State of Clojure 2016

The survey contains four pages:

  1. General questions applicable to any user of Clojure, ClojureScript, or ClojureCLR
  2. Questions specific to the JVM Clojure (skip if not applicable)
  3. Questions specific to ClojureScript (skip if not applicable)
  4. Final comments

The survey will close December 23rd. We will release all of the data and our analysis in January. We are greatly appreciative of your input!

A Major Datomic Update

The latest release of Datomic includes some additive new features to enable more architectural flexibility for our customers, especially those building microservices platforms and projects.  With the advent of the new Client API, users have much more choice when it comes to their deployment topology.  I am also very pleased to announce the new simplified pricing model: Starter for explorers, Pro for production use, and Enterprise for customized licensing/support.  Customers at each level will now have access to identical features, including unrestricted Peer counts per Transactor.  For more, see the official announcement.

Works on My Machine: How We Work: Distributed

Working for a distributed company -- Cognitect is scattered across much of the United States and Europe -- does have its ups and downs. I love not having to commute. But I miss hanging out with my coworkers, live and in person. I love that my office is just upstairs, in that spare bedroom. But sometimes I wish I could put more distance between my job and the rest of my life.  I love that the Internet lets me talk to just about anyone, anywhere. And sometimes I wish I could throw my computer, complete with its bogged-down network connection, out the window.

Working for a distributed company also means that I get asked "the question" a fair bit. Actually the question is really a family of questions: "What's it like?" is a common variation. So is "Isn't it hard to get things done?" Then there is "What skills do I need to work remotely?" and, of course "How do I talk my boss -- or potential boss -- into this?"

Interactive Development with Clojure.spec

clojure.spec provides seamless integration with clojure.test.check's generators. Write a spec, get a functioning generator, and you can use that generator in a REPL as you're developing, or in a generative test.

To explore this, we'll use clojure.spec to specify a scoring function for Codebreaker, a game based on an old game called Bulls and Cows, a predecessor to the board game Mastermind. You might recognize this exercise if you've read The RSpec Book; however, this will be a bit different.

Works On My Machine: Understanding Var Bindings and Roots

When designing applications and systems it can be important to understand the inner workings of certain aspects of the language being used by developers. One area of Clojure that is traditionally rather opaque and poorly understood is the inner workings of Vars, and how they interact with the Clojure Language. I recently encountered some behavior that seemed puzzling: