Dependency Inversion with Elixir Protocols
I’ve always liked the Dependency Inversion Principle (DIP). Once understood, it helps us decouple code in surprising ways.
The definition of DIP (p. 6) has two statements:
A. High level modules should not depend upon low level modules. Both should depend upon abstractions.
B. Abstractions should not depend upon details. Details should depend upon abstractions.
Let’s make those definitions clearer by looking at an example.
Inverting a dependency
Suppose we have two modules: A and B. If module A calls a function of B, then we can say A depends on B. We can represent that dependency with an arrow:
A -> B
A is the high-level module, and B is the low-level details module.
Here are a few concrete examples of how that could look:
# A calls B
defmodule A do
  def do_work do
    B.do_some_work()
  end
end

# A uses B
defmodule A do
  use B
  ...
end

# A imports B's functions
defmodule A do
  import B
  ...
end
So, A’s behavior depends not only on its own internals but also on B’s behavior. Anytime we change B, we have to consider its effects on A.
And it doesn’t stop there. Dependencies can be transitive:
A -> B -> C
In that case, changes in C affect B, which in turn affects A.
When our modules are interconnected, a change in one module can have ripple effects throughout our codebase. That makes our low-level modules more difficult to change and our high-level modules more likely to break unexpectedly.
And that’s what the dependency inversion principle tries to solve. DIP reduces coupling between modules by introducing an abstraction, which keeps changes in low-level modules from rippling throughout the codebase.
Going back to our A -> B example, DIP suggests introducing an abstraction X that both modules depend upon, therefore inverting the dependencies:
A -> X <- B
Now, both A and B depend on an abstraction X, and X must be independent of the details of B.
With that in mind, let’s look at an example with Elixir protocols.
Dependency Inversion through Protocols
A protocol specifies an API that should be defined by its implementations.
A protocol defines an abstraction (the API) that high-level and low-level modules depend on (part A of DIP’s definition).
- The high-level modules depend on the abstraction by calling the protocol’s functions instead of directly calling the concrete implementations.
- The low-level modules depend on the abstraction by implementing the protocol’s API for their data structures.
Finally, the protocol knows nothing about the low-level modules or their data structures, and thus, the abstraction is independent of the details (part B of DIP’s definition).
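As a minimal sketch of that arrangement, consider the code below. The names Describable, Report, and Invoice are hypothetical and chosen only for illustration; they do not come from any library. The protocol plays the role of the abstraction, Report is the high-level module, and Invoice is the low-level detail:

```elixir
# The abstraction: it names an API and knows nothing about who implements it.
# (Describable, Invoice, and Report are hypothetical names for this sketch.)
defprotocol Describable do
  @doc "Returns a one-line description of the given data."
  def describe(data)
end

# Low-level detail: a module that owns its own data structure.
defmodule Invoice do
  defstruct [:number, :total]
end

# The detail depends on the abstraction by implementing it.
defimpl Describable, for: Invoice do
  def describe(%Invoice{number: number, total: total}) do
    "Invoice ##{number} for #{total}"
  end
end

# High-level module: it calls the protocol and never mentions Invoice.
defmodule Report do
  def lines(items), do: Enum.map(items, &Describable.describe/1)
end
```

Calling Report.lines/1 with a list containing an Invoice whose number is 7 and total is 120 returns ["Invoice #7 for 120"]. Note that Report would keep working unchanged if another struct implemented Describable later.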
Working with collections
Let’s walk through a hypothetical example where we don’t have Elixir’s Enum, List, or Map modules.
Suppose that in our work, we need to iterate over lists. We might define a List.map/2 function that works as follows:
[1, 2, 3, 4] |> List.map(fn i -> transform_data(i) end)
Over time, we perform more operations on lists, so we create several more functions:
List.count/1
List.filter/2
List.any?/1
List.all?/1
...
After some more time, we get new feature requests that require us to perform the same types of operations with maps.
“No problem,” we say. We create a generic Enumerator module that depends on List and Map, and we use guard clauses to choose which implementation to use:
defmodule Enumerator do
  def map(list, fun) when is_list(list), do: List.map(list, fun)
  def map(map, fun) when is_map(map), do: Map.map(map, fun)

  def count(list) when is_list(list), do: List.count(list)
  def count(map) when is_map(map), do: Map.count(map)
  ...
end
That code could work. But our Enumerator module now depends on both List and Map:
Enumerator -> List
Enumerator -> Map
Therefore, a change to either module could change Enumerator’s behavior and break modules that depend on Enumerator.
How could we decouple our modules?
Depend on a protocol
Instead of depending on the concrete modules, we can invert dependencies by having Enumerator, List, and Map depend on an Enumerable protocol:
Enumerator -> Enumerable <- List
Enumerator -> Enumerable <- Map
Enumerable can define a protocol API (the abstraction) for interacting with collections.
defprotocol Enumerable do
  def map(enumerable, fun)
  def count(enumerable)
  ...
end
Enumerator will depend on the abstraction. The protocol remains ignorant (in the good sense) of any details. And List and Map only need to implement the protocol’s API for their own data structures.
defimpl Enumerable, for: List do
  def map(list, fun), do: List.map(list, fun)
  def count(list), do: List.count(list)
  ...
end

defimpl Enumerable, for: Map do
  def map(map, fun), do: Map.map(map, fun)
  def count(map), do: Map.count(map)
  ...
end
Now, changes to List and Map do not require a change in Enumerator. We have decoupled our code and made our program more resilient!
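To make the inverted version concrete, here is a runnable sketch of what Enumerator could look like once it delegates to the protocol. It rests on a few assumptions: the names MyEnumerable and MyEnumerator are hypothetical and carry a "My" prefix only to avoid clashing with Elixir’s real Enumerable protocol, and the implementations lean on Erlang’s :lists and :maps modules plus Kernel functions to stand in for the hypothetical List and Map functions from the example.

```elixir
# Hypothetical protocol standing in for the article's Enumerable.
defprotocol MyEnumerable do
  def map(enumerable, fun)
  def count(enumerable)
end

# Low-level detail: how lists are traversed and counted.
defimpl MyEnumerable, for: List do
  def map(list, fun), do: :lists.map(fun, list)
  def count(list), do: length(list)
end

# Low-level detail: each map entry is passed to fun as a {key, value} tuple.
defimpl MyEnumerable, for: Map do
  def map(map, fun), do: :lists.map(fun, :maps.to_list(map))
  def count(map), do: map_size(map)
end

# High-level module: no guard clauses and no mention of List or Map.
defmodule MyEnumerator do
  def map(enumerable, fun), do: MyEnumerable.map(enumerable, fun)
  def count(enumerable), do: MyEnumerable.count(enumerable)
end

MyEnumerator.map([1, 2, 3], fn i -> i * 2 end)
# => [2, 4, 6]
MyEnumerator.count(%{a: 1, b: 2})
# => 2
```

The guard clauses from the earlier version are gone: choosing the right implementation is now the protocol’s job, so the high-level module has nothing to update when an implementation changes.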
Adding a new collection
We can see just how well our code adapts to change by noting what happens when we add another collection, say ranges.
To allow ranges to behave as collections with the rest of our code, we do not need to change any modules that call Enumerator, Enumerator itself, or Enumerable. All we have to do is implement the protocol for ranges:
defimpl Enumerable, for: Range do
  def map(range, fun), do: range |> Range.to_list() |> List.map(fun)
  def count(range), do: range |> Range.to_list() |> List.count()
  ...
end
Voilà! The only code we added was the Enumerable implementation for Range; no existing module had to change.
Of course, this example is a bit contrived. It is very similar to what Elixir’s built-in Enum module does with the Enumerable protocol. So, let’s look at a couple of other examples in the wild.
In the wild: Phoenix.Param
Phoenix ships with some really nice path helpers. For example:
iex> MyAppWeb.Router.Helpers.user_path(conn, :edit, @user)
"/users/234/edit"
Interestingly, we’re not passing the user ID (234) as the third argument into user_path/3. Instead, we’re passing the whole @user struct.
How does Phoenix know to get the :id from the struct?
The answer is the Phoenix.Param protocol, which converts data structures into URL parameters.
If Elixir didn’t have protocols, the router helpers would have to somehow depend on our schemas to generate URLs. Maybe we couldn’t create path helpers that infer the :id field at all!
But with the protocol, the router code doesn’t have to depend on our schema code. Instead, the router helpers and our schemas both depend on an abstraction: Phoenix.Param.
MyAppWeb.Router.Helpers -> Phoenix.Param <- MyApp.User
You can use that protocol to change how Phoenix renders URL paths. For example, suppose our user URLs require a handle instead of an id. We can simply define the protocol’s implementation for our User struct:
defimpl Phoenix.Param, for: MyApp.User do
  def to_param(%{handle: handle}) do
    handle
  end
end
Now, user_path(conn, :edit, @user) would give us /users/germsvel/edit – all without having to change MyAppWeb.Router.Helpers!
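To see why the router side never needs to know about our schema, it can help to imagine how a path helper might be written. The sketch below is not Phoenix’s source code. ParamSketch is a hypothetical stand-in for Phoenix.Param, PathSketch.user_path/2 is a made-up, simplified helper (it takes no conn), and SketchUser is an invented struct:

```elixir
# Hypothetical stand-in for Phoenix.Param; not Phoenix's actual implementation.
defprotocol ParamSketch do
  def to_param(term)
end

# Plain integers and strings can be used as URL parameters directly.
defimpl ParamSketch, for: Integer do
  def to_param(int), do: Integer.to_string(int)
end

defimpl ParamSketch, for: BitString do
  def to_param(bin) when is_binary(bin), do: bin
end

# Application-side detail: an invented struct that prefers its handle in URLs.
defmodule SketchUser do
  defstruct [:id, :handle]
end

defimpl ParamSketch, for: SketchUser do
  def to_param(%{handle: handle}), do: handle
end

# Router-side code: it only calls the protocol and never mentions SketchUser.
defmodule PathSketch do
  def user_path(action, param) do
    "/users/" <> ParamSketch.to_param(param) <> "/" <> Atom.to_string(action)
  end
end
```

With these definitions, PathSketch.user_path(:edit, 234) returns "/users/234/edit", and passing a SketchUser whose handle is "germsvel" returns "/users/germsvel/edit". The helper is identical in both cases; only the protocol implementation decides what ends up in the URL.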
In the wild: Bamboo.Formatter
We can see another example in Bamboo, an email library. The to, from, cc, and bcc fields require a two-tuple: {name, email}.
Bamboo could have required library users to pass the two-tuple, but that feels clunky. It’s much easier to pass an entire struct:
Bamboo.Email.new_email(from: user)
To do that without protocols, Bamboo.Email would have to somehow depend on MyApp.User.
Instead, Bamboo uses the Bamboo.Formatter protocol as an abstraction to turn data structures into two-tuples:
Bamboo.Email -> Bamboo.Formatter <- MyApp.User
Thus, people using Bamboo can define their own implementations of the Bamboo.Formatter.format_email_address/2 API. For example, they could set the name to be the user’s full name – all without having to change any of Bamboo’s emailing code:
defimpl Bamboo.Formatter, for: MyApp.User do
  def format_email_address(user, _opts) do
    fullname = "#{user.first_name} #{user.last_name}"
    {fullname, user.email}
  end
end
Protocols for the win
There are many other examples of protocols in the wild. But I hope this post gives you a glimpse of the power protocols have to make our systems more resilient to change.