Is AI the Answer to Unknown Unknowns?
…as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don't know we don't know.
Funny to start an article about understanding codebases with a 22-year-old quote from a former US Secretary of Defense, but here we are.
Rumsfeld was pilloried at the time, winning a Foot in Mouth Award for his “bizarre remarks.” But time has vindicated Rumsfeld. Unknown unknowns were never a wild, unthinking comment; they are a core epistemological concept.
If you don’t acknowledge unknown unknowns, do you even know, bro?
The Rumsfeld Matrix and the Johari Window
Let’s step back even further than 2002 and the Iraq War to sunkissed fifties California and UCLA. Two psychologists, Joseph Luft and Harrington Ingham, try to help people understand themselves and how they appear to others.
They devise this matrix to illustrate the interplay between self-awareness and external perception, which they call the Johari Window:
The Johari Window consists of four quadrants:
- Arena (known to self and others): This represents information about yourself that you know and others also know.
- Façade (known to self, unknown to others): This includes things you know about yourself but keep hidden from others.
- Blind Spot (unknown to self, known to others): This represents aspects of yourself that others can see but you are unaware of.
- Unknown (unknown to self and others): This quadrant represents information, behaviors, or traits that neither you nor others are aware of.
What is the thinking here? The aim is to map what we know against what we don't. By recognizing these areas of knowledge and ignorance, the Johari Window encourages us to increase self-awareness, seek feedback, practice self-disclosure, and embrace personal development. The framework gives us a vocabulary for the limits of self-knowledge.
The term unknown unknowns sprang from here and worked its way through government agencies and businesses before Rumsfeld made it famous. The “Rumsfeld Matrix” version looks like this:
What are some examples of each? Known knowns are simple. They're the facts and skills we're consciously aware of possessing. For a software developer, this might include knowledge of a language, familiarity with frameworks, or understanding of specific algorithms.
Known unknowns are also fairly straightforward. They’re the areas where we're aware of our lack of knowledge. You might know you don't understand a particular part of the codebase or know you need to learn Deno to keep up with the cool kids.
Unknown knowns are a little more esoteric. These are things we know implicitly but aren't consciously aware of knowing. For developers, this could be intuitive skills like problem-solving developed through experience or unconscious best practices internalized over time. In some ways, this is the knowledge most in need of elicitation, since it includes the biases intrinsic to your thinking that could take you off track. But it is also the quadrant least relevant to our conversation.
And then there are unknown unknowns. These are the concepts we don't even know exist, and that's where the fun begins.
Understanding at Work
The reference to the Johari Window above isn’t just to set the scene for where Rumsfeld’s utterings came from. It also shows what is required to learn. The Johari Window is about knowledge of self, but the concepts are the same for knowledge of X, where X is any complex system–like, say, a codebase.
When we approach a new codebase, we're essentially trying to expand our “Arena”–the known knowns. We're aiming to shrink our “Façade” by sharing what we know, and reduce our “Blind Spot” by learning from others.
This allows us to follow these arrows:
We’re still left with this unknown unknown problem. Without assistance, we can chip away at this, but only passively. For instance, you can ask for feedback on an issue and through that discover a new approach, but actively exploring the space without prior knowledge of what you’re missing is genuinely difficult.
Without assistance, it can also appear as if this quadrant doesn’t exist. After all, if we don’t know we don’t know, maybe we think we know all there is to know. This leads us to another psychologist duo, this time from nineties New York state: David Dunning and Justin Kruger.
The Dunning-Kruger effect describes how people with limited knowledge or skill in an area tend to overestimate their own competence. Their lives exist entirely here:
The problem is this leads to the Peak of Mount Stupid:
To “know,” you must move off this peak, into the valley of despair (when you finally encounter known unknowns and feel stupid), and then up the slope of enlightenment, by learning about your unknown unknowns, to Guru status. We need to take this journey:
How?
Finding Unknowns With AI
Let’s move this conversation back to codebases. What are some possible unknown unknowns for engineers in a complex codebase?
- Dependencies. Reliance on numerous third-party libraries and frameworks you're unaware of.
- Vulnerabilities. CVE (Common Vulnerabilities and Exposures) issues hidden deep within your supply chain.
- Side effects. Unintended modifications to state or data structures by innocuous functions.
- Edge cases. Unusual inputs that your error handling doesn't account for.
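To make the "side effects" item concrete, here is a minimal sketch of an innocuous-looking helper that quietly changes its caller's data. The function names and the sample list are hypothetical, invented purely for illustration.

```python
def normalize_tags(tags):
    """Looks like a pure helper, but sorts the caller's list in place."""
    tags.sort()  # hidden side effect: mutates the argument
    return [tag.lower() for tag in tags]


def normalize_tags_pure(tags):
    """Same result, but works on a sorted copy and leaves the input alone."""
    return [tag.lower() for tag in sorted(tags)]


original = ["Zeta", "alpha", "Beta"]
result = normalize_tags(original)

# The caller's list has been reordered, even though nothing at the call
# site suggests that would happen.
print(original)
print(result)
```

Nothing in the name or signature of `normalize_tags` warns you about the mutation. Unless you read the body, or something points it out to you, it stays an unknown unknown until a bug surfaces somewhere else.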
There are also bigger unknowns in a large enough codebase. You might be completely unaware that an API or a function exists. You might not know about entire modules or microservices that handle background tasks. You might be oblivious to custom build tools or deployment scripts.
Not knowing these details is inevitable. The vanilla way to deal with this is through knowledge sharing and documentation. New developers will be onboarded to codebases. Old developers will be sent docs.
But static information can only take you so far. First, static == stale. As soon as the next commit is added to a repo, the README is wrong. And this is one of those problems that can be described as “slowly, then all at once.” You end up with detrimental documentation.
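As a rough illustration of how documentation drifts, the sketch below compares function names mentioned in a README against the functions the code actually defines. Everything here is an assumption made for the example: the `stale_references` helper, the convention that docs mention functions as `name()` in backticks, and the sample README and source strings.

```python
import ast
import re


def stale_references(readme: str, source: str) -> set:
    """Return names written as `name()` in the README that the source no longer defines."""
    documented = set(re.findall(r"`(\w+)\(\)`", readme))
    defined = {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return documented - defined


# Hypothetical README text, written before a rename.
README = "Call `load_config()` first, then `start_server()`."

# Hypothetical current source: load_config was renamed to load_settings.
SOURCE = '''
def load_settings():
    return {}

def start_server():
    pass
'''

stale = stale_references(README, SOURCE)
print(stale)
```

A check like this only catches renamed or deleted functions. It says nothing about behavior that changed while the name stayed the same, which is where stale docs do the most damage.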
Second, static documentation fails to capture the interconnected nature and dynamic behavior of complex systems. It presents information in a linear, isolated manner, making it difficult for developers to understand the full context and implications of code changes. This limitation can lead to a fragmented understanding of the codebase, where developers might grasp individual components but struggle to see how they interact in the larger system.
Dynamic exploration is necessary to uncover the true depth of a complex system. This is where AI steps in, offering a way to actively probe the codebase, identifying connections to, and implications of, unknown unknowns.
The impact of AI in uncovering unknown unknowns cannot be overstated. Without AI, developers find themselves in a situation akin to following a GPS without seeing the overall map, or fixing symptoms without diagnosing the underlying disease. AI has an omniscient view, with no unknown unknowns, so developers can use it as a guide to the entire landscape of their codebase and make informed decisions.
Consider how a senior developer might use AI to learn about an unfamiliar part of their organization's codebase.
First, AI can construct a comprehensive knowledge base of the code. It analyzes every line, function, and dependency, creating a thorough understanding of the entire system. You aren’t reliant on human developers to document the code effectively. AI can produce explanations at various levels of abstraction, from high-level overviews of the entire codebase to detailed descriptions of individual functions. The documentation is continuously updated, ensuring it always reflects the current state of the code.
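To give a feel for the raw material such a knowledge base might start from, here is a minimal sketch that indexes the functions, classes, and imports in a piece of Python source using the standard library's `ast` module. This is not how Glide or any particular AI tool works; `index_source` and the sample source are hypothetical, and a real system would layer far more on top (cross-file references, call graphs, generated explanations).

```python
import ast


def index_source(source: str) -> dict:
    """Return the functions, classes, and imported modules found in `source`."""
    index = {"functions": [], "classes": [], "imports": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            index["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            index["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            index["imports"].append(node.module)
    # Deduplicate and sort so the summary is stable between runs.
    return {kind: sorted(set(names)) for kind, names in index.items()}


# Hypothetical module contents, used only to exercise the sketch.
SAMPLE = '''
import os
from collections import OrderedDict

class Cache:
    def get(self, key):
        return None

def rebuild_index(path):
    return os.listdir(path)
'''

summary = index_source(SAMPLE)
print(summary)
```

Even this crude index can surface things a newcomer did not know to look for, such as a `rebuild_index` function nobody mentioned during onboarding.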
The dev can explore the project at different levels–codebase, system, or file. Unlike brute-force code reading or relying on potentially outdated documentation, AI enables dynamic exploration of the codebase. It provides an up-to-date, comprehensive view that can quickly pay off in a developer's learning journey.
AI then serves as an interactive partner. Developers can engage in conversations to explore the codebase at whatever level of abstraction they need and learn what they don’t know. AI can guide the developer down the path, moving unknown unknowns into the known unknown quadrant, and then help the developer learn and move the problem to a known known.
We’re building this with Glide, an AI tool that accelerates the process of converting unknown unknowns into known information. While it doesn't replace human insight and creativity, it significantly enhances our ability to navigate and understand complex codebases.
If you want to try Glide within your codebase, you can reach out for a demo, and we’ll show you the unknown unknowns of your code.