Developing the Language of the Domain

2U, a company that provides the technology, services and data architecture to transform its clients into digital universities, asked for Cognitect’s help making their operations more efficient. They had a deep client-onboarding pipeline, constrained by the speed of content-creation, and they wanted to eliminate unnecessary bottlenecks in their software systems.

Iteration Zero

During our Iteration Zero, we worked with 2U engineers to sketch out a high-level view of their software architecture, with a particular eye toward information flows. As this diagram grew to cover most of a conference-room wall, it became clear that the architecture had a lot of moving parts and information silos, not surprising for a successful startup after several years of rapid growth.

The key insight of the architecture review was that 2U relied on people to move data between software systems — mostly content, but also configuration, access credentials, and design artifacts. Streamlining these processes meant not only building software systems that could communicate directly with one another, but also incorporating human work-products into an automated workflow.

The Language of the Domain

Working with a mixed team of developers and operations specialists at 2U, we set out to develop a model of the business domain. It started with a simple question: What types of entities — what kinds of stuff — do you work with? Programmers may recognize this as a classic object-oriented design exercise, but we weren’t limited to object-oriented nomenclature or a complex modeling language such as UML. Instead, we took advantage of our primary tools, Clojure and Datomic, to let the customer lead us toward a model that was meaningful to their business.

As we collected answers to these questions, we could quickly encode them in Clojure’s Lisp-like syntax:

(entity University
  "An institution of higher learning"
  (attr :university/shortname string identity #"[a-z]{2,4}"
    "Unique, human-readable abbreviation for this university")
  (attr :university/full-name string non-blank unique
    "Fully-expanded name of the university for public display"))

(entity Program
  "Something offered by a school or college culminating in
  a degree or certificate."
  (attr :program/full-name string non-blank
    "Fully-expanded name of the program for public display"))

(relationship offers
  "Every Program is owned by a single University"
  [Program :program/university University]
  [University :university/programs Program many])

Although this looks like Clojure code, it’s just data. Clojure’s edn reader and clojure.spec made it easy to parse and load this data into a Datomic database. Then it was a simple matter of programming to transform it into a variety of representations.

The very first artifact we deployed at 2U was an internal web app to explore the domain model, including a graph visualization of entities and relationships:

To gather feedback, we printed a poster-sized version of the same diagram, hung it in a corridor next to a stack of post-its, and asked everyone in the company to suggest improvements.

As we collected feedback, we learned that almost every team had a unique perspective on the business. Some of the most commonly-used terms had widely-divergent definitions. It was not exactly a blind-men-and-elephant scenario: None of the definitions was actually incorrect, but neither were any of them complete.

The fable of the blind men and the elephant: Each perceives only one part, on which he bases his (incorrect) conception of the whole animal. By contrast, one team's understanding of a domain concept may be entirely correct for that team's work, yet still inadequate to represent that concept for the business as a whole.

The fable of the blind men and the elephant: Each perceives only one part, on which he bases his (incorrect) conception of the whole animal. By contrast, one team's understanding of a domain concept may be entirely correct for that team's work, yet still inadequate to represent that concept for the business as a whole.

Many of our collaborators told us this was the most valuable part of the process for them, as it expanded their understanding of the business and their role in it. We were applying agile software development practices to a fundamentally cognitive task. The goal was not to produce software but to better understand the domain, helping 2U to learn about itself. The software behind the model and visualizations was merely a means to that end.

A Foundation in Data

Even while the model was still evolving, Datomic’s flexible approach to schema enabled us to start work on a database of domain knowledge. Since Datomic schemas are themselves expressed as data, we could automate the process of keeping the database in sync with the domain model.

;; Sample Datomic schema elements generated from domain model
[{:db/ident       :university/shortname
  :db/doc         "Unique abbreviation for this university"
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one
  :db/unique      :db.unique/identity
  :db/index       true}

 {:db/ident       :program/university
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one}]

When data is the foundation, repetitive programming tasks can often be replaced by higher-leverage meta-programming. With Datomic, Clojure, and clojure.spec, we took an abstract model and generated a database. Adding Pedestal into the mix, we generated web-service interfaces for managing information about entities in the database. Using ClojureScript, Om, and React, we deployed a series of “mini apps,” rich interactive forms tailored to particular business roles, all sharing the common back-end.

Since 2U was already using Slack for inter-team communication, we integrated a Slack “bot” into various team channels, posting notifications and reminders with direct links into the associated web apps. All of this information flowed back into a dashboard view to help project managers track work-in-progress across the organization.

Lessons Learned

When we started, our stakeholders at 2U had hoped to arrive at a “canonical” data model for a new system which would be the “source of truth” for the rest of the organization. We set out to iterate towards a canonical model on which everyone could agree.

Of course, building consensus takes time. We were fortunate to be paired with an energetic 2U project manager willing to spend long hours asking questions and collecting feedback on our evolving data model. Even so, it took months to arrive at something we felt confident in. Fortunately, the flexibility of Clojure and Datomic allowed us to keep our models up-to-date as we learned about the business domain, even when that required revisiting some of our earliest decisions.

In the end, we realized there wasn’t any one “canonical” model. The various definitions of core business concepts were all equally valid and equally necessary. This led us to our final data model, in which the core entities were represented not as single objects but as faceted collections of views. Each team, each software system could shine a spotlight on one face of the entity, and the sum of those views made up the complete picture. The new services we developed with 2U allowed each team to continue working with the model that best served them while also bridging the communications gap between formerly-siloed systems.

Ultimately, we learned that “canonical” and “truth” are asymptotic targets. In any organization, teams tend to view the rest of the business through the lens of the system they interact with. A diversity of models is inevitable as each team adapts their understanding of the domain to fit the task at hand. For human beings, it’s a necessary cognitive shortcut. For software (and the people who write it) it’s a call to action: to adapt to reality rather than trying to reshape it.