Measuring the impact of service design in a world of public sector management metrics has always been tricky. Social outcomes take a long time to be realised.
Proxy ‘output’ measures often tell us what is happening rather than why it is happening, and can drive perverse behaviours. Problems that service design addresses sit within complex systems, and it is often difficult to isolate a specific intervention from changing elements or innovations surrounding it. There is a growing recognition that services are never fully (re)designed and need constant evaluation to evolve and improve. Impact measurement needs to reflect this. Rather than something measured at the end of the project, it needs to be iterative and become part of continuous service development. And rather than relying on purely quantitative data, it needs to become more experience and practice-led, with frontline staff and service users empowered and supported to use it to make continuous improvements in the service they deliver, or their own behaviours.
In 2017, at the Measured Summit in New York, experts and students came together to discuss how to measure the impact of design. At the follow-up event in London, six months later, we were still discussing the basics: why are we measuring impact, and for whom? Traditionally, in the non-design world (of our clients), impact has been measured through evaluation, to prove something has worked (or to argue it will). As designers, we also measure impact formatively, to reflect on and improve the service we are designing, as well as the process for designing it. Table 1 shows that expanding the groups of people who use these two approaches can lead us to different forms and functions of impact measurement.
The challenges around evaluating design are well documented.
Social outcomes often take a long time to appear, particularly where the approach is preventative. For example, you might only see the impact of increasing financial self-management on homelessness figures five or ten years later.
As a result of the previous challenge, organisations have created output or proxy metrics to measure more immediate impact, which can be useful. For example, primary school assessments of childhood obesity give a more current indicator of early intervention initiatives that will reduce Type II diabetes in the longer term. However, these tend to be either an existing interaction measure (e.g. GP visits or waiting times) or an easily quantifiable measurement (e.g. an annual weight measurement). They capture what is happening, rather than why it is happening and how people feel about it, and that is where much of the value of service design is delivered. Metrics are quite often driven by what the organisation deems important, rather than what users value (e.g. train passenger surveys about punctuality and price, rather than anxiety or stress experienced during the journey). These practices can also lead to perverse service behaviours. In the UK, the police’s ‘Offences Brought to Justice’ target drove the police to target young people (so-called ‘low-hanging fruit’) rather than more dangerous criminals, in order to drive up their impact measurements.
Social challenges are part of bigger systems; specific interventions do not take place in isolation. For example, a service-level intervention to prevent homelessness will be affected by rent price increases or changes to the benefit or welfare system. Place-based approaches to health will be affected by the particular physical (buildings, transport, environmental) and human (communities, services, politics) elements in that place. Some new types of innovation are actively encouraging this complexity. ‘Combinatorial innovation’, as it is being trialled in the UK through the NHS test-bed programme, deliberately tests a number of technological innovations at the same time, or alongside other new approaches, which means setting a control is impossible. Traditional evaluation frameworks are scientifically grounded with control groups and a small number of quantitative variables. But these function less well in the messier world of social challenges, where it is difficult to dissect the effect of a specific change from other parts of the system that are swirling around it.
In a constantly evolving world, a service re-design is never complete and impact measurement is part of delivering a constantly-improving service.
This calls into question not only the usefulness but indeed the validity of traditional, evaluative, end-of-test impact measurement alone. As a civil servant, I have been guilty of writing “we are going to pilot [...] with a view to rolling it out nationally” in various strategies. It means that pilot programmes are set up to succeed rather than being allowed to fail, even if the experiment proves not to work. There are examples of expensive trials that were set on a course to succeed by their political masters, despite evidence to the contrary. [1]
Instead, I would argue for an iterative, experience and practice-led approach.
By iterative, I mean plotting a series of proxy measures that give a sense of how you are moving towards outcomes (and using this data to pivot throughout). Theories of change are useful in helping to think through how an action results in an outcome. By mapping the causal links and assumptions, one can also identify a wide variety of metrics and indicators that can track interim progress. Colleagues from the agency Nile and the Royal Bank of Scotland, presenting their redesign of the Scottish £5 note at the SDN conference in London in 2016, explained how they measured the ripple effects of design throughout the process as well as the large splash at the end. These measures can take many forms: output data, surveys, responses to cultural probes, feedback and customer insight. Digital data provides a world of new possibilities, allowing people to track activity and behaviour as it happens and follow how people are using digital services (e.g. trend research and social media listening), as well as providing real-time forums for posing questions and surveys. For example, Uscreates developed a ‘Children’s Centre in a Box’ for the UK’s Children’s Society, messaging parents each week to find out how they were using the activities provided within the Box, and using this to assess which ones were most valuable and which needed improvement.
What is interesting about this data is that it is much more qualitative, behavioural and experiential than traditional proxy measures. By experience and practice-led, I mean both valuing qualitative feedback and insight from users, as well as the tacit knowledge and opinions of frontline staff on whether an idea is working or not. There is a debate on the value of ‘intuition’ in professional decision making. It is clearly resisted in the scientific world of reason. However, where it is based on cognitive experience and is combined with other metrics, it provides another valuable, sense-making measure.
Supporting frontline staff to have a more central role in assessing whether something is working or not offers wider value for how services and policy are continuously improved. Getting user feedback is fairly standard for service designers. But if we are to promote the use of practice-led judgments of value, we also need to support frontline staff to widen the cognitive sources on which they are basing their assessments, including listening to users. We also need to help people make sense of unstructured qualitative data. In the short term this can be done by supporting them to code and assess it, and in the longer term by using AI to quantify sentiment from video and text responses, freeing up staff to create ideas for improvements.
Iterative, experience and practice-led measurement therefore implies a different type of culture and mindset around how we continually develop and improve services. Our work with organisations wishing to move to a more preventative and early intervention approach has highlighted a need to move from a culture where frontline staff are gatekeepers of a service at the time of crisis - following process and activity/time targets - to one where frontline staff are problem-solvers, doing what they feel is best to achieve the right outcome for the person. In the latter world, frontline staff have control over their service, using impact data and their own trusted feedback to see how it is working and being empowered to change and improve it. In a constantly evolving world, a service re-design is never complete and impact measurement is part of delivering a constantly-improving service.
But it is not that easy to roll back decades of public sector management and create a mindset that the RSA calls ‘think like a system, act like an entrepreneur’. [2] Leaders need to promote a reflective and learning culture so frontline staff can actively look for feedback from users, be trusted to give it themselves, and make improvements to the service they deliver. Service design, through its involvement of staff in the design process, and the artefacts it creates (e.g. a problem-solving conversation guide rather than tick-box forms), is a powerful vehicle for the culture change.
And what of users? They are also included in the audience in Table 1. As well as developing and improving services, iterative impact measurement can be part of their delivery. A preventative approach also requires greater self-awareness and resilience-building within users. Impact measurement should not be seen as a one-way stream, with organisations sucking up data and making decisions. Rather, gathering feedback and data and relaying it back to users (on its own, as a self-quantified visualisation or with additional tailored advice) can be part of the service offer itself. The act of recording data about your health can prompt you to adopt healthier living behaviours. Research and implementation are intertwined through apps like ‘mappify’, which pulse-checks people’s health, or ‘Colour in City’, which used digital technology to collect people’s experiences of their city, prompting behaviour change.
The idea that a service is static - is designed, evaluated and stays the same - looks increasingly out of date. Services, both their digital and face-to-face components, need to change and evolve with the systems that surround them. As well as designing the service, designers need to upskill frontline staff to lead this iterative change. Perhaps more important than summative evaluation (proving that something works) is formative evaluation (learning what works and improving what doesn’t). In traditional frameworks, evaluation comes at the end of the process. Instead, we need to see impact measurement as part of the delivery of the service itself, with frontline staff and users looking at the variety of iterative measures, reflecting on how well things are working, and shifting their behaviour and making changes if they are not.
References
[1] Vohnsen, N H (2011) Absurdity and the sensible decision
[2] Burbridge, I (2017) The System Entrepreneur
This article is part of Touchpoint Vol. 9 No. 2 - Measuring Impact and Value.