As I see it, encapsulation in the broad sense is not about keeping data together with the code that operates on it. That is more specifically called data encapsulation, and it is a very restrictive and inexpressive model of privacy adopted by OO languages like C++ and Java.
Rather, encapsulation is more generally about the ability to define a set of invariants and protect them with privacy mechanisms. The purpose is that downstream code should be allowed to assume that all of the relevant functions are basically black boxes that uphold these invariants.
So the user of a library may see:
    mod stack {
        /// A stack of even numbers.
        pub struct EvenStack(...);

        impl EvenStack {
            pub fn new() -> EvenStack;
        }

        /// Place a number at the end of the stack if it is even,
        /// otherwise do nothing. Returns `true` when the item is
        /// successfully added.
        pub fn try_push(stack: &mut EvenStack, x: i64) -> bool;

        /// Remove the number at the end of the stack.
        /// The result, if present, will always be even.
        pub fn pop(stack: &mut EvenStack) -> Option<i64>;
    }
and should be able to trust the documentation and have reasonable expectations about the behavior of these functions, without knowing their implementation.
Meanwhile, the author sees:
    mod stack {
        // The inner Vec is a private field: only code in `mod stack` can touch it.
        pub struct EvenStack(Vec<i64>);

        impl EvenStack {
            pub fn new() -> EvenStack {
                EvenStack(vec![])
            }
        }

        pub fn try_push(stack: &mut EvenStack, x: i64) -> bool {
            if x % 2 == 0 {
                stack.0.push(x);
                true
            } else {
                false
            }
        }

        pub fn pop(stack: &mut EvenStack) -> Option<i64> {
            stack.0.pop()
        }
    }
and ought to be able to argue that the documented invariants are properly upheld without needing to know anything about the implementation of lower-level things like Vec::push, Vec::pop, or a % b, because those things in turn already have documented behavior that is encapsulated somewhere in the standard library or the compiler.
The "boundary" of encapsulation is how far we need to look in order to see all code that is capable of violating the invariants. Here, all usage of stack.0 is clearly confined to mod stack, and the implementation of these functions protects the invariants, so we may treat everything as black boxes once we are beyond that boundary.
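To make the boundary concrete, here is a minimal sketch of the same module next to some downstream code. The drain_all function is hypothetical (it is not part of the example above); it only stands in for "code outside mod stack", which can reach the data solely through the public functions.

```rust
mod stack {
    /// A stack of even numbers. The inner `Vec` is private to `mod stack`.
    pub struct EvenStack(Vec<i64>);

    impl EvenStack {
        pub fn new() -> EvenStack {
            EvenStack(Vec::new())
        }
    }

    /// Pushes `x` only if it is even; returns whether it was pushed.
    pub fn try_push(stack: &mut EvenStack, x: i64) -> bool {
        if x % 2 == 0 {
            stack.0.push(x);
            true
        } else {
            false
        }
    }

    /// Removes the last number; the result, if present, is even.
    pub fn pop(stack: &mut EvenStack) -> Option<i64> {
        stack.0.pop()
    }
}

/// Hypothetical downstream code, outside the encapsulation boundary.
/// It may rely on the documented invariant without reading `mod stack`.
fn drain_all(s: &mut stack::EvenStack) -> Vec<i64> {
    // s.0.push(1); // does not compile: field `0` of `EvenStack` is private
    let mut out = Vec::new();
    while let Some(x) = stack::pop(s) {
        out.push(x);
    }
    out
}
```

Since the commented-out line is rejected by the compiler, checking the invariant only requires reading the three functions inside mod stack.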
Suppose we instead had something like:
    mod stack {
        // ... contents from before ...

        pub(super) fn raw_push(stack: &mut EvenStack, x: i64) {
            stack.0.push(x);
        }
    }
Now that we've added this pub(super) function that is capable of violating the invariant that numbers are even, the boundary of encapsulation has grown to include whatever module contains mod stack. Perhaps you might see something like:
    mod containing_module {
        mod stack {
            ...
        }

        pub fn foo(stack: &mut stack::EvenStack) {
            // Bypasses the evenness check, then pops the odd number again.
            stack::raw_push(stack, 1);
            assert_eq!(stack::pop(stack), Some(1));
        }
    }
In this code, we can see that stack::pop is called and returns a value that is not even, despite its (user-facing) documentation clearly saying otherwise! That's because, thanks to the existence and visibility of raw_push, code inside containing_module is now inside the encapsulation boundary of that invariant, and thus cannot treat the functions inside mod stack merely as black boxes. (Code outside of containing_module can still view them as black boxes, because we can easily verify that containing_module itself upholds the invariant.)
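A compilable sketch of this situation might look as follows. Two things here are assumptions made for illustration: the inner module is declared pub mod so that outside code can name the public API, and push_and_pop_odd and observe_from_outside are hypothetical names (the former plays the role of foo, but returns the popped value instead of asserting on it).

```rust
mod containing_module {
    // Assumption: `pub mod` so that code outside can name the public API.
    pub mod stack {
        /// A stack of even numbers.
        pub struct EvenStack(Vec<i64>);

        impl EvenStack {
            pub fn new() -> EvenStack {
                EvenStack(Vec::new())
            }
        }

        pub fn try_push(stack: &mut EvenStack, x: i64) -> bool {
            if x % 2 == 0 {
                stack.0.push(x);
                true
            } else {
                false
            }
        }

        pub fn pop(stack: &mut EvenStack) -> Option<i64> {
            stack.0.pop()
        }

        /// Skips the evenness check. Visible in `containing_module` only.
        pub(super) fn raw_push(stack: &mut EvenStack, x: i64) {
            stack.0.push(x);
        }
    }

    /// Inside the boundary: pushes an odd number and pops it again,
    /// so the invariant holds once more by the time this returns.
    pub fn push_and_pop_odd(stack: &mut stack::EvenStack) -> Option<i64> {
        stack::raw_push(stack, 1);
        stack::pop(stack)
    }
}

/// Hypothetical code outside `containing_module`.
fn observe_from_outside() -> (Option<i64>, Option<i64>) {
    let mut s = containing_module::stack::EvenStack::new();
    containing_module::stack::try_push(&mut s, 2);
    // containing_module::stack::raw_push(&mut s, 1);
    // ^ does not compile: `raw_push` is private outside `containing_module`
    let inside = containing_module::push_and_pop_odd(&mut s);
    let outside = containing_module::stack::pop(&mut s);
    (inside, outside)
}
```

The odd value is only ever observed by push_and_pop_odd, which lives inside the boundary. The pop performed directly by the outside function still yields an even number, which is why outside code can keep treating the API as a black box after auditing containing_module.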
You will notice that this definition of encapsulation is descriptive rather than prescriptive. I.e. it is not a principle to be followed, but rather, a question to be answered: Is invariant Y encapsulated in module X?
The actual principle to be followed here is that modules should ideally be as small as necessary, and expose as few public items as they need to, so that this question becomes easier to answer! Generally, at the very least one should be able to assume that the crate is the encapsulation boundary of all invariants documented inside that crate, and if it isn't, that should be considered a bug.
(There are exceptional circumstances, especially involving macros, where the boundary of encapsulation must lie beyond a crate; that's why #[doc(hidden)] exists!)
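As a rough sketch of the macro case: an exported macro expands inside the caller's crate, so any helper it calls has to be pub, even if that helper can break the invariant. The even_stack! macro and the __raw_push helper below are hypothetical names invented for this illustration; the sketch assumes the module sits at the crate root so that $crate::stack resolves.

```rust
pub mod stack {
    /// A stack of even numbers.
    pub struct EvenStack(Vec<i64>);

    impl EvenStack {
        pub fn new() -> EvenStack {
            EvenStack(Vec::new())
        }
    }

    pub fn pop(stack: &mut EvenStack) -> Option<i64> {
        stack.0.pop()
    }

    /// Not part of the documented API: pushes without checking evenness.
    /// It must be `pub` because `even_stack!` expands in the caller's
    /// crate, so it is merely hidden from the generated documentation.
    #[doc(hidden)]
    pub fn __raw_push(stack: &mut EvenStack, x: i64) {
        stack.0.push(x);
    }
}

/// Hypothetical constructor macro: builds an `EvenStack` from a list of
/// numbers and panics if any of them is odd.
#[macro_export]
macro_rules! even_stack {
    ($($x:expr),* $(,)?) => {{
        #[allow(unused_mut)]
        let mut s = $crate::stack::EvenStack::new();
        $(
            // Evaluate each argument once, check it, then push it.
            let x: i64 = $x;
            assert!(x % 2 == 0, "even_stack! only accepts even numbers");
            $crate::stack::__raw_push(&mut s, x);
        )*
        s
    }};
}
```

Here the evenness check lives in the macro body, which is compiled in every downstream crate, and nothing but convention stops those crates from calling __raw_push directly. The invariant is therefore only encapsulated if downstream code respects the #[doc(hidden)] marker.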