Collaborative Policymaking Using Human-centered AI. Part Two.



Part One Summary

In part one of this article, I propose that many organizations are largely guessing about how to obtain better results, and that they fail to leverage scientific thinking (evidence) in the quest for more effective and efficient outcomes. I maintain that an increase in the application of scientific thinking to public policy and social science problems is being hindered by the perception that science is synonymous with randomized controlled trials, and that RCTs are often impractical in fast-paced environments. I also touch on how modes of governance are evolving. The business-centric New Public Management of the 80s and 90s is giving way, with Europe in the lead, to the natural evolution of a collaborative governance style (New Public Governance) that emphasizes trust, social capital and cooperation as core features. But while these ideas (using evidence and engaging in collaborative policymaking) are gaining ground in the literature, and in more forward-thinking countries and organizations, there are still many organizations around the world that could, but currently do not, benefit from engaging in deep evidence-informed policymaking underpinned by open collaboration and deliberative dialogue. I further maintain that barriers to rapid progress in these areas are linked to a critical shortage of pivotal tools and methodologies available to practitioners, decision-makers and stakeholders. In part two, below, I describe the tools that PolicySpark has developed to overcome these barriers, and how these tools work.


The way policies and programs are designed has a pervasive impact on results and decision-making. Think of the old cliché about building a house on a shaky foundation. We know intuitively that this is not a great idea because it sets us up for difficulty down the road. It involves a lack of foresight. The same is true for policies and programs. When they are built on guesswork instead of pooled wisdom and the best available evidence, there’s an increased likelihood that sub-par results will follow, along with persistently poor outcomes over time. In addition to a lack of appropriate use of evidence, policy and program interventions produce weak results, or fail, because stakeholders often don’t believe that their voices have been heard. People are reluctant to accept policies which have been imposed on them when they are not provided with an opportunity to have a meaningful say in how those policies are designed and implemented.

In this article I describe a fresh approach to addressing these fundamental issues. Leveraging modern data science and human-centered algorithms, PolicySpark has developed an easy-to-understand, visual and intuitive approach to bridging and reconciling the differing policy, research, and stakeholder perspectives that are invariably found around all intervention systems. As I explain in the following sections, this is a crucial missing tool set that unifies evidence-based modeling, stakeholder voice and scientific thinking into a single analytical platform specifically designed to find out “what works.” These tools have important applications in program design, strategic review, implementation, ongoing management, performance measurement, and evaluation/assessment.

Using Evidence in Design and Decision-making

What can we do to significantly improve how we design and implement policies and programs without making it too complicated and without putting in too much time and effort? As I’ve already alluded to, one important thing we can do is to increase our use of evidence. There’s an ever-widening consensus that to be effective and efficient, interventions must be informed by good evidence. National governments (Canada, the United Kingdom, the United States, the European Union, Sweden, Norway and many others), multilateral organizations (the United Nations, OECD, World Bank Group), non-governmental and donor-based organizations, major philanthropic foundations, and shareholder-based interests in the private sector, all increasingly point to the need to use more evidence in design and decision-making.

What does it mean to use evidence? There’s no rigid definition for evidence-informed decision-making, but the Public Health Agency of Canada provides one description of how evidence might be used in the public health field: “evidence-informed decision-making is the process of distilling and disseminating the best available evidence from research, practice and experience and using that evidence to inform and improve public health policy and practice. Put simply, it means finding, using and sharing what works in public health.” In this article, I’ll describe a practical approach to doing exactly this using groundbreaking data science and engagement tools that can be applied not only to public health problems, but to a wide variety of complex social and business problems.

The Benefits of Using Evidence

Why are people interested in using evidence for intervention design and decision-making? What are the benefits? Actually, it seems that there are a number of highly significant benefits. At the top of the list is that interventions that use evidence are more likely to achieve desired outcomes. It makes a lot of sense that minimizing guesswork and seat-of-the-pants thinking will increase the chances of getting good results. Another good reason to use evidence is that focusing on doing things that are supported by the evidence increases the likelihood that resources are not spent on activities that have little or no impact. We might think of this as intelligent resource allocation. Yet another compelling reason to use evidence is that it builds trust, buy-in and communication with stakeholders and benefactors. This is especially true when stakeholders are invited to participate directly in the evidence-gathering process (i.e., their knowledge is a valued primary source of evidence). Increased use of evidence also leads to enhanced perceptions of institutional/organizational credibility and accountability. From a management perspective, the correct use of evidence strengthens a number of important activities, including program design, strategic review, implementation, ongoing management, performance measurement, and evaluation/assessment.

Types of Evidence

Before going on to look at the unique tools we’ve developed to gather and use pooled evidence, let’s examine some of the different types of evidence that may be available to managers and decision-makers. The most obvious type is research evidence. This can take the form of published scientific literature, which provides insight into what has been studied and learned in the sphere of activity that is being targeted by the intervention. A second type of evidence can come directly from people who have experience working with or living in the intervention system. We might call these people knowledge holders. There will inevitably be a number of different groups of knowledge holders involved in any given system, all possessing different lived perspectives on how the system works. While there’s little doubt that good scientific information provides us with a primary source of credible evidence, the lived experiences of stakeholders can constitute highly insightful experiential evidence. And as I mentioned earlier, interventions are more likely to succeed when stakeholders are provided with an opportunity to have a meaningful say in how policies are designed and implemented.

To illustrate the value of experiential evidence, let’s consider the problem of diabetes in Indigenous communities. Western, scientific views of treating diabetes generally focus on body weight and physical activity, and emphasize individual lifestyle factors such as diet, exercise and weight control. But when Giles et al. (2007) set out to ask Indigenous people about their perceptions of what causes diabetes, they found that other key factors were important – factors that are absent from Western medical models. The Indigenous people who were interviewed pointed to concepts such as relationship to the land, relationships to others, and relationships to the sacred as being crucial to their state of health. What this tells us is that different kinds of evidence may be vital to drawing a complete picture of what drives outcomes. Based on what was found in this study, it seems likely that a government or other intervention aimed at reducing diabetes in Indigenous settings that relies solely on a Western scientific mindset would be ineffective because it would be missing important spiritual and cultural factors that mediate Indigenous health outcomes. These factors only surfaced through a deliberate and systematic attempt to draw out stakeholder views and experiences. Granted, not all stakeholder views will be objective in a scientific sense, but these perspectives capture valuable insights into parts of reality that are not readily encoded by reductionist thinking. Further, stakeholder perspectives are reflections of how people think and what they believe, and since social interventions are largely concerned with influencing how people behave in order to realize desired outcomes, gaining detailed insight into thinking and beliefs can provide program designers and decision-makers with crucial strategic information that can be used, along with other evidence, to make better-informed decisions about where to intervene and why.

A third type of evidence, expert opinion, can be seen as a hybrid between research evidence and experiential evidence. Expert opinion can be gathered in person (or virtually) from researchers and other professionals who have direct contact with the intervention context. Finally, evidence may be extracted from existing data sets. This includes big data, be they from government or the private sector, or open data, which are often well-structured data sets that are available for use by the public at large. As will become clear below, evidence from virtually any source that expresses how factors in a system affect one another can be used to gain key insights into how program systems work.

Bringing Evidence Together Using a Common Language

A significant challenge to using evidence effectively involves finding a practical method to systematically gather disparate sources of information (e.g., research evidence or evidence consisting of lived stakeholder experiences) in ways that allow them to be combined synergistically to triangulate clearer views of reality. One possible approach to this problem is to identify a unifying conceptual structure that links different kinds of evidence through a common language. One such structure or language is causality. Causality simply means characterizing and describing how one thing affects another. We can use this to bridge different sources of evidence, even if they look superficially dissimilar, because every source of evidence will contain within it a set of (hypothetical) cause and effect structures. Characterizing evidence through this lens not only permits comparisons among evidence types, it also provides a powerful foundation upon which to understand which particular factors in a system are driving outcomes.

Using causality as a central integrating construct to merge evidence in this way aligns well with the emphasis that has been placed on causality by experts in the field of program design and evaluation. Coryn et al. (2011) state that the core construct of a program is causation, whereby a theory or model describes the cause-and-effect sequence through which actions are presumed to produce long-term outcomes or benefits. In their textbook on program theory, Funnell and Rogers (2011) state that, in essence, a program theory is an explicit theory or model of how an intervention, such as a project, program, strategy, initiative, or policy, contributes to a chain of intermediate results and finally to the intended or observed outcomes. Carol Weiss, a pioneer in the exploration of program theory, stated in 1997 that program theories are popular largely because of their potential to explain how interventions work (Rogers and Weiss, 2007). These ideas and assertions are still very much with us today. As was noted in a recent special issue of the Canadian Journal of Program Evaluation centered on modern program theory, “interest in describing and understanding the underlying logic of social programs is pervasive and persistent.” (Whynot et al., 2019).

Better Tools

At the heart of all this is the idea that uncovering the details of how things work, causally, can provide us with a powerful way to design and plan a program. But while there’s been an apparent longstanding recognition that examining causation holds great potential for deepening our understanding, and despite many attempts to use models to do this, tools and approaches have remained seriously limited. Why? This is likely because modeling techniques have either been overly simplistic or too complicated, expensive and unwieldy. Logic models, for example, while used widely, are often far too simplistic. They are also frequently cobbled together quickly and in the absence of any real evidence. On the other hand, models that may come closer to accounting for real-world complexity, for example structural equation modeling or multivariate analysis, are often difficult for people to grasp easily and require extensive and expensive data input. Similarly, and as I mentioned in part one of this article, experiments that take the form of randomized controlled trials, while sometimes used very effectively, are also difficult and expensive to set up, implement and interpret. What seems to be missing so far, in terms of realizing the full potential for models and theories to significantly catalyze better outcomes, is an intuitive, rapid and inexpensive, but nonetheless rigorous way to systematically gather and visualize evidence, and to use that evidence to make sense of complex intervention structures.

What would better tools look like? For starters, they would be easy to understand, easy to apply, and wouldn’t take too much time and effort to use. Practical tools would take advantage of what evidence has to offer, hitting the right balance between intuitiveness and rigor, without drastically changing how planning and assessment are done. To be effective, this kind of tool set would provide the means to systematically gather, compare and combine different forms of evidence, leverage powerful analytical tools to understand what the evidence tells us, and outline clear ways to test what we come up with so that we can be confident that the resulting intervention blueprints are reasonably accurate.

Conceptually, the best way to do all this would be to follow the same approach that scientists use to investigate phenomena in many fields: (1) muster an intelligent, informed guess about how something might work (generate hypotheses), (2) collect precise information about how the system behaves in the real world (observation and measurement), and (3) compare the original guess to the observations. If you do this, and the model turns out to be reasonably good at predicting what happens ‘out there’, then this model will provide a powerful way to plan and adjust the intervention at hand. This is essentially a road map to formally applying the basic scientific method to the problem of understanding the inner workings of policies, programs and other complex social interventions. As I explain below, what’s new and powerful about PolicySpark’s approach is that we leverage customized technology to untangle real-world complexity to generate deep, rich hypotheses about how interventions work based on multiple sources of evidence, including evidence derived from deliberative stakeholder dialogue.  

Leveraging Strategic Intelligence

PolicySpark’s main purpose is to fill the methodological gaps described above by developing, deploying and refining accessible, easy-to-use tools to produce the strategic intelligence needed to find out what works. The cornerstone of our approach is an intuitive, hands-on method that uses research evidence and stakeholder evidence (collected in participatory fora) to build visual models that are easy to interpret. The models help identify what we call system “leverage points.” These can be thought of as the small number of key factors that exert the highest levels of influence on chosen outcomes. This information can be used to highlight the best places to intervene in program systems and to construct full program theories. All of this can be used to engage in more efficient design, planning, implementation and assessment. Underlying our approach are powerful, purpose-built data science algorithms that keep humans in control, which I describe below. Similar analytical tools are driving innovation in many areas, and we feel the time has come to apply these techniques to improving policies, programs and other interventions.

How the Tools Work

To provide a clearer idea of what these tools look like and how they work, I’ve summarized our process in the following series of steps. What I describe here is a general approach that can be used to investigate and better understand almost any complex social or business problem. 

1. Identify key outcomes and sources of evidence.

The first step is to get a clear idea of the problem to be investigated. This is done by identifying one or two top-level system outcomes. Once this is clear, we identify all sources of available evidence that describe the problem. Typical evidence sources include research literature, stakeholder knowledge, expert opinion, and available existing datasets (e.g., big data/open data). Evidence need not be restricted to these sources though, and any source of evidence that describes cause and effect can be used. 

2. Build evidence-based models.

Once all the sources of evidence are lined up, we then build a model for each source. A series of models can be built by the various groups of people around the intervention system, including policy and decision-makers, identified stakeholder groups, and experts. Research evidence, for example existing evidence extracted from academic literature, is incorporated into a separate model by PolicySpark personnel, sometimes collaboratively with project proponents. To construct these models, we’ve developed a technique for mapping out all the important causes and effects in a system. We map and connect all the factors together using an intuitive, visual approach that anyone can use and understand. The connections in the models are weighted by attributes such as strength, difficulty and time, a customized approach we’ve developed specifically to better understand complex social and business problems. Our hands-on, participatory modeling approach to capturing stakeholder knowledge builds enthusiasm, interest and trust among stakeholders. This is because people are able to meaningfully and directly participate in developing the programs of which they are a part. Using a variety of evidence helps to build strong ideas (hypotheses) about how systems work. The modeling approach that we use is based on a branch of mathematics called graph theory. Graph theory algorithms power many of the artificial intelligence technologies and business applications that are rapidly expanding in the modern world.
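To make the idea of a weighted cause-and-effect model more concrete, here is a minimal sketch in Python. It is not PolicySpark’s platform: the function name merge_models, the factor names, and every weight are hypothetical and chosen only for illustration. It assumes each evidence source can be written as a list of (cause, effect, strength) links, with strength between -1 and 1, and that links reported by several sources are combined by simple averaging. Only the strength weight is shown; difficulty and time could be carried the same way.

```python
from collections import defaultdict

# Hypothetical evidence, one edge list per source.
# Each edge is (cause, effect, strength), with strength in [-1, 1].
research_model = [
    ("physical activity", "diabetes risk", -0.6),
    ("diet quality", "diabetes risk", -0.5),
]
stakeholder_model = [
    ("relationship to the land", "diet quality", 0.7),
    ("relationship to the land", "physical activity", 0.5),
    ("diet quality", "diabetes risk", -0.7),
]


def merge_models(*models):
    """Combine per-source edge lists into one weighted directed graph.

    The result maps cause -> {effect: strength}. When several sources
    report the same link, their strengths are averaged (an assumption
    made for this sketch, not a documented method).
    """
    collected = defaultdict(list)
    for model in models:
        for cause, effect, strength in model:
            collected[(cause, effect)].append(strength)

    graph = defaultdict(dict)
    for (cause, effect), strengths in collected.items():
        graph[cause][effect] = sum(strengths) / len(strengths)
    return dict(graph)


combined = merge_models(research_model, stakeholder_model)
for cause, effects in combined.items():
    for effect, strength in effects.items():
        print(f"{cause} -> {effect}: {strength:+.2f}")
```

Averaging is only one way to reconcile sources. Keeping the per-source models separate and comparing them side by side is equally valid, and arguably closer to the deliberative spirit described above.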

3. Make sense of the models.

The raw models that we build with the evidence reflect a fair amount of the complexity that we intuitively know is out there. Really, these models look like tangled spaghetti before we process them. This is actually good, because it tells us that we’ve been thorough in terms of capturing many of the things that could be affecting outcomes. The heart of our approach is to boil the models down to remove the noise so that we can see what really matters. This is where the data science and algorithms come in. Running the original complicated-looking (but information-dense) models through our algorithmic platform enables us to obtain results that point to a small set of factors (hypothetical causes) that have the largest influences on the outcomes of interest. We call these the system “leverage points.” The leverage points represent truly valuable strategic intelligence. They are essentially very good, evidence-based ideas about where to intervene in the system in order to produce the desired effects. Once we have the leverage points, we can then elaborate on the involved results pathways, identify required behaviour change, and determine more efficiently what should go into design, planning, implementation and assessment. We can also use the models to carry out simulation, which is the ability to ask “what-if” questions. By manipulating factors in the model, it’s possible to see where the other factors will go, and where the system is predicted to go overall. This can be a powerful way for managers and decision-makers to visualize different hypothetical management scenarios. As mentioned above, it’s important to remember that although our platform uses an algorithmic approach to untangle complexity, we do this in a way that keeps humans in control. We see artificial intelligence as a technology that should augment human decision-making, not replace it. Our tools do not make autonomous decisions. Instead, they are built to assist us as we attempt to untangle the causal factors driving complex systems. One of the terms we sometimes use to describe our tools – a term that might be thought of as a refinement on the concept of artificial intelligence – is extended intelligence.
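One simple way to picture how leverage points and “what-if” questions might be computed from a weighted causal graph is sketched below. This is an illustrative stand-in, not the algorithm PolicySpark uses, which the article does not specify. The sketch assumes influence along a pathway is the product of its link strengths, that a factor’s total influence on an outcome is the sum over all simple pathways, and that effects combine linearly. The graph, the factor names, and the functions path_influence, leverage_points and what_if are all hypothetical.

```python
# Hypothetical model: cause -> {effect: strength}.
graph = {
    "community engagement": {"program uptake": 0.8, "trust": 0.6},
    "trust": {"program uptake": 0.5},
    "funding": {"staffing": 0.7},
    "staffing": {"program uptake": 0.2},
    "program uptake": {"health outcome": 0.9},
}


def path_influence(graph, source, target, seen=None):
    """Sum, over all simple paths source -> target, the product of strengths."""
    if source == target:
        return 1.0
    seen = (seen or set()) | {source}
    total = 0.0
    for nxt, weight in graph.get(source, {}).items():
        if nxt not in seen:  # skip nodes already on this path (avoids cycles)
            total += weight * path_influence(graph, nxt, target, seen)
    return total


def all_factors(graph):
    """Every factor that appears as a cause or as an effect."""
    return set(graph) | {e for effects in graph.values() for e in effects}


def leverage_points(graph, outcome, top=3):
    """Rank factors by the absolute size of their total influence on outcome."""
    scores = {
        f: path_influence(graph, f, outcome)
        for f in all_factors(graph)
        if f != outcome
    }
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


def what_if(graph, factor, delta):
    """Predicted change in every other factor if `factor` shifts by `delta`."""
    return {
        other: delta * path_influence(graph, factor, other)
        for other in all_factors(graph)
        if other != factor
    }


top_three = leverage_points(graph, "health outcome")
scenario = what_if(graph, "community engagement", 0.5)
print(top_three)
print(scenario)
```

In this toy model, community engagement ranks first because it reaches the outcome through two pathways. The ranking is presented to people as a hypothesis to discuss and test, which is consistent with keeping humans in control of the decision.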

4. Measure outcomes.

In part one, and earlier in this part of the article, I outlined the importance of taking a scientific approach to figuring out what works. In the step above, we essentially use evidence to take the first step along this path, which is to muster a high-quality guess at what’s driving results. We make sense of complexity by identifying system leverage points and then hypothesizing that these leverage points will affect outcomes in particular ways. Measuring outcomes is like taking the second step in the scientific method, which involves collecting information about how the system behaves in the real world (observation and measurement). For the kinds of interventions that we’re talking about here, measurement and observation are usually carried out through the use of indicators. Performance measurement and indicator work sometimes gets confusing and unnecessarily expensive. This is because measurement is often not properly (or not at all) tied to an underlying model. People get confused about what to measure and why and end up measuring a lot of things that don’t need to be measured (and missing important things that should be measured). This comes back to poor or absent underlying models. With the evidence-based, leverage points approach, we are able to generate strong foundational models, which can and should be used to ground performance measurement activity. This makes performance measurement much more rational, straightforward and efficient. The purpose of indicators, which is often misunderstood, is to gather the information needed to test the strength of the original ideas about how the intervention might work (hypotheses). Having good evidence-based models permits us to define a performance measurement framework that identifies and precisely measures a small, but highly strategic set of indicators that will yield the key information necessary (no more and no less) to test the verity of the original set of hypotheses.
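As a rough illustration of tying measurement to an underlying model, the sketch below derives a candidate indicator set from a causal graph: only the factors lying on a pathway from a chosen leverage point to the outcome are kept. This is one plausible reading of the idea, not a documented PolicySpark procedure. The graph, the factor names, and the functions nodes_on_paths and indicator_set are hypothetical.

```python
# Hypothetical model: cause -> {effect: strength}.
graph = {
    "community engagement": {"program uptake": 0.8, "trust": 0.6},
    "trust": {"program uptake": 0.5},
    "funding": {"staffing": 0.7},
    "staffing": {"program uptake": 0.2},
    "program uptake": {"health outcome": 0.9},
}


def nodes_on_paths(graph, start, outcome, path=None):
    """Return every factor lying on some simple path from start to outcome."""
    path = (path or []) + [start]
    if start == outcome:
        return set(path)
    found = set()
    for nxt in graph.get(start, {}):
        if nxt not in path:  # do not revisit factors already on this path
            found |= nodes_on_paths(graph, nxt, outcome, path)
    return found


def indicator_set(graph, leverage_points, outcome):
    """Factors worth measuring: those on pathways from leverage points to outcome."""
    needed = set()
    for point in leverage_points:
        needed |= nodes_on_paths(graph, point, outcome)
    return needed


indicators = indicator_set(graph, ["community engagement"], "health outcome")
print(sorted(indicators))
```

Here funding and staffing drop out of the measurement plan because they do not sit on the pathways being tested, which is the “no more and no less” principle in miniature.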

5. Test the models.

Which brings us to the final step. After creating a good model, which also supports the creation of a theory of change comprising specific results pathways and using these to define exactly what should be measured, we’re in a good position to implement (or adjust) the intervention and begin collecting performance information. It’s at this stage that it becomes important to devise a concrete way to test how well things are working. We need to do this because without proper testing, we don’t actually know how good the original model is. The real measure of a model is how well it predicts what happens. By using evidence to create the model in the first place, we definitely set ourselves up to win, because it’s more likely that a good evidence-based model will make accurate predictions compared to a model that’s based on guesswork. But this isn’t quite enough. We still need to test the model to see how strong it really is. The prize that we’re reaching for is a model that does a good job of predicting the key outputs and activities that will produce the desired results. Such a model is extremely valuable. It can guide everything from design to resource allocation to implementation, helping to ensure that time, money and effort are not misspent doing unnecessary and ineffective things. An evidence-based model that stands up to scrutiny functions like a guiding compass for management and decision-making. In keeping with the scientific approach to understanding interventions and taking full advantage of the strong foundational models that we develop using the evidence-technology union, we are able to leverage testing to its fullest extent by not only defining and measuring the right indicators, but by designing and implementing appropriate scientific experiments to probe the strength of the models. The detailed causal hypotheses (foundational models) that we generate using our unique approach make such experimentation not only possible, but highly accessible, practical and inexpensive. In part one of this article, I speculated that one important factor standing in the way of more rational (scientific) policymaking is the perception that doing science on complex interventions necessarily involves the conduct of randomized controlled trials (RCTs), that RCTs are hard, expensive, and impractical, and that this poses a serious barrier for many organizations. Our approach circumvents this barrier by taking a different tack. By using evidence to design deep intervention architectures, we’re able to generate rich sets of competing causal hypotheses that can be tested using on-the-fly performance information, which can be collected in a highly targeted and efficient fashion. This, combined with appropriate, simplified experimental designs (e.g., ‘N-of-1’ experiments), enables us to provide our clients with heretofore unobtainable insight into how their interventions work, both quickly and inexpensively.
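The comparison of competing hypotheses against performance information can be sketched in a few lines. The example below assumes each hypothesis has been turned into predicted changes for a handful of indicators, and scores each one by its mean absolute error against what was observed. The hypothesis names, indicator names, all numbers, and the functions mean_abs_error and rank_hypotheses are invented for illustration. A real test would also need an appropriate experimental design, which this sketch does not address.

```python
def mean_abs_error(predicted, observed):
    """Average absolute gap between predicted and observed indicator changes."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no indicators in common")
    return sum(abs(predicted[k] - observed[k]) for k in shared) / len(shared)


def rank_hypotheses(hypotheses, observed):
    """Order hypotheses from best to worst fit (smallest error first)."""
    scores = {
        name: mean_abs_error(prediction, observed)
        for name, prediction in hypotheses.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


# Hypothetical predicted changes under two competing causal hypotheses.
hypotheses = {
    "engagement-led": {"program uptake": 0.55, "health outcome": 0.495},
    "funding-led": {"program uptake": 0.10, "health outcome": 0.09},
}

# Hypothetical observed indicator changes after implementation.
observed = {"program uptake": 0.50, "health outcome": 0.40}

ranking = rank_hypotheses(hypotheses, observed)
for name, error in ranking:
    print(f"{name}: mean absolute error {error:.3f}")
```

A lower error means the hypothesis did a better job of predicting what happened. With so few indicators the result is suggestive rather than conclusive, which is why repeated measurement matters.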

Empowering People and Organizations

I would portray what I’ve laid out in this article as a crucial missing tool set, grounded in fresh methodological thinking. The main purpose of the tool set is to synergistically integrate and operationalize key ideas that are already recognized as useful in the pursuit of understanding interventions, and to do so in a way that merges the ideas into a product that is more than the sum of its parts. The ideas – building good evidence-based models, leveraging stakeholder knowledge, and using scientific thinking to find out what works – are certainly not new. In fact, there is growing demand to implement these ideas. What is new is the creation of a clear, practical way to bring these ideas together into a unified analytical platform, and specifically, into a platform that is rooted in modern data science, which as we know, is proliferating rapidly in multiple fields and spheres of activity.

Getting a better idea of how results come about has always been important. Whether the resources that are being applied to solve problems come from our own pockets (taxes), from the coffers of collective, multilateral entities, from private donors, or in the case of the private sector, from shareholders, it seems necessary, even ethical, to do our best to spend those resources wisely and responsibly. In many cases, the quality of people’s lives depends on how effectively policies and programs shape the conditions required for the emergence of positive change. These points become even more relevant as the resource base shrinks under the pressure of the global pandemic, and under other unpredictable stressors that will no doubt arise in the future. To support positive change, my vision for these tools, which I believe outlines a long-overdue shift in how to elicit better results, is to empower people and organizations by helping them produce the precise strategic intelligence that they need to design and implement better interventions. We are dedicated to this vision, and we continue to work with interested people, organizations and institutions to improve outcomes in specific contexts, and to promote the development of new communities of practice centered on participatory, evidence-based problem-solving.


Coryn, C. L., Noakes, L. A., Westine, C. D., & Schröter, D. C. (2011). A systematic review of theory-driven evaluation practice from 1990 to 2009. American Journal of Evaluation, 32(2), 199–226.

Funnell, S. C., & Rogers, P. J. (2011). Purposeful Program Theory: Effective Use of Theories of Change and Logic Models. John Wiley & Sons.

Giles, B. G., Findlay, C. S., Haas, G., LaFrance, B., Laughing, W., & Pembleton, S. (2007). Integrating conventional science and aboriginal perspectives on diabetes using fuzzy cognitive maps. Social Science & Medicine, 64(3), 562–576.

Rogers, P. J., & Weiss, C. H. (2007). Theory-based evaluation: Reflections ten years on: Theory-based evaluation: Past, present, and future. New Directions for Evaluation, 2007(114), 63–81.

Whynot, J., Lemire, S., & Montague, S. (2019). The Current Landscape of Program Theorizing. Canadian Journal of Program Evaluation, 33(3).
