
Modernizing Government Statistics for the 21st Century

1 July 2022

Anil Arora has worked at Statistics Canada for more than 25 years, leading significant programs and transformations. He was appointed chief statistician of Canada in September of 2016. He has also served in policy and regulatory roles in the government of Canada at Natural Resources Canada and Health Canada. Arora has led substantive international initiatives, working with the United Nations and OECD, and received numerous awards for leadership.

The demands for information have shifted dramatically in recent years. Understanding what is happening—on average at the national, provincial, or city level—is nice but no longer sufficient. Understanding the flows of goods, services, and labor requires a higher level of detail. Understanding the differential impacts of health, economic, or labor changes on subpopulations is essential to social cohesion and to our societal and economic well-being. In order to develop policies that support those who are most vulnerable—those who are disproportionately affected by the rising cost of living—we require access to far more detailed and timely information.

The data ecosystem has expanded enormously in recent years, and the responsible use of new data sources is allowing us to see sooner and act faster. Data needs to be collected and processed quickly if it is to be useful. It’s no longer enough to conduct surveys and look in the rear-view mirror at what happened. Citizens expect near-real-time information and predictive models to help decision-makers plan our future. Integrating data is the next step. That enables us to better understand the complex interactions between societal, economic, and environmental factors, and then to design policies that have greater impact. It helps us to redefine how we see and react to phenomena, to better understand root causes, and to be more aware of both intended and unintended consequences.

Currently, 40 percent of Statistics Canada’s programs are based in whole or in part on data available from administrative and alternative sources. Satellite data, regulatory data, and point-of-sale scanner data are just a few of the types drawn from the wells of public and private sector data holdings alike. This is data we have evaluated, processed, and deemed fit for purpose. It is subject to the same statistical rigor; scientific method; and ethical, privacy, and disclosure controls as all the information in the agency’s care.

Surveys of citizens and businesses are at the core of how we collect information. But our agency has also been incorporating administrative and regulatory data from other government entities for more than a century now. It’s not a new development. 

The reason Statistics Canada can responsibly integrate data from different sources is that we have invested in remaining current and relevant as a national statistical office. We are committed to staying ahead of changes in society and technology by experimenting, learning, adapting, and partnering.

Several years ago, our agency embarked on an employee-led journey to modernize our operations. That journey got us to a place where we were able to respond to the urgent demands for data during the pandemic. We pushed ourselves to become more user-centric.

To make our workforce more curious, able to take intelligent risks, and better connected, we scaled up our statistical capacity and infrastructure and implemented a significant set of new tools and processes for integrating data responsibly from multiple sources. We have also strengthened our governance system with external expert committees on ethics, trust, and privacy.

All this work has enabled us to play a leadership role in a competitive data market. Through it all, we kept our focus on answering the questions society puts to us. I believe that’s the real value proposition of a national statistical office. It’s not just about putting more data out there. It’s trying to make sense of what’s happening in society and showing how different parts of it are intricately connected.

If we don’t do that, someone else will. The questions are only getting more sophisticated—and our answers more complex. For example, immigration used to be about counting how many people a country brings in. We now need to understand so much more: the skills they bring with them, their family structures, and how to integrate all that productively into society. Issues like these require us to find and weave together different sources of information on situations that are interconnected and evolving, in order to spin them into actionable insights for decision-makers. The traditional methods won’t always get us there. That’s ultimately why we’re incorporating more alternative data. 

We are also working toward acquiring and disseminating more disaggregated data, as they play an important role in understanding the lived experiences of various population groups, specifically those who are marginalized or have been less visible in our data historically.

Our federal government recognizes the importance of this work and recently made a major investment in our agency’s Disaggregated Data Action Plan.

In my talk for CNSTAT [Committee on National Statistics] in May, I gave several examples of how we’re combining data from multiple sources in our programs, in ways that might help level the playing field for society. Here I have space for one, housing, but urge you to view my talk for the examples on health, emergency and recovery benefits data, and health and equity. 

The agency’s Canadian Housing Statistics Program has been in development for five years and will grow as new data sources become available. This ambitious program links individual, “micro” data on properties with broader data on homeowner characteristics to provide a comprehensive portrait of Canada’s housing market. Remarkably, that portrait is drawn almost exclusively from administrative data. We conduct a number of high-quality surveys on housing in many programs at the agency, but some critical questions cannot be answered this way.

For example, we would love to know the yearly trends on housing supply in rural regions. But it can be operationally prohibitive to survey these sparsely populated areas. We also want to know the extent of demand coming from those residing outside our country—where and what kinds of properties they own. But it’s not feasible to send surveys all over California, China, and the Middle East to ask people there if they own a residential property in Canada. 

Similarly, it would be difficult to design a sampling strategy to collect information on properties left undeveloped. To address housing supply issues, we need to go a step further and produce information on the availability and value of vacant lands across Canada. Often, it is just too costly or complex to provide answers to important questions through traditional survey means. 

So, our housing program has developed a partnering model to acquire administrative data that helps fill these information gaps, ethically and efficiently. The data we acquire range from municipal property assessment files to land registry files to immigration and tax records. They come to us in disparate forms from an array of municipal, provincial, federal, and private sources.

They are all run through our agency’s ethical frameworks and quality management tools to ensure they meet our high standards.

Then, we combine, triangulate, and analyze the data from different angles. And faster than you might think, we produce comprehensive, harmonized, and granular data and insights on the characteristics of Canada’s housing stock and owners. 
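For readers who want a concrete picture of what combining sources can look like, here is a minimal sketch in Python. It is not the agency's method: the records, field names, and the helper functions `link_records` and `median_value_by_group` are all hypothetical, invented purely for illustration. It links made-up property assessment records to made-up owner records on a shared identifier, then summarizes values by group.

```python
from collections import defaultdict
from statistics import median

# Hypothetical, made-up records; field names are illustrative assumptions.
properties = [
    {"property_id": "P1", "region": "urban", "assessed_value": 650_000},
    {"property_id": "P2", "region": "rural", "assessed_value": 220_000},
    {"property_id": "P3", "region": "urban", "assessed_value": 480_000},
    {"property_id": "P4", "region": "rural", "assessed_value": 150_000},
]
owners = [
    {"property_id": "P1", "resident_of_canada": True},
    {"property_id": "P2", "resident_of_canada": True},
    {"property_id": "P3", "resident_of_canada": False},
    # No owner record for P4: linked sources rarely cover every unit.
]


def link_records(properties, owners):
    """Attach owner attributes to each property record by property_id.

    Properties without a matching owner record are kept, with the
    owner attribute set to None (a left join).
    """
    owners_by_id = {o["property_id"]: o for o in owners}
    linked = []
    for prop in properties:
        owner = owners_by_id.get(prop["property_id"])
        row = dict(prop)  # copy so the input records are not modified
        row["resident_of_canada"] = (
            owner["resident_of_canada"] if owner else None
        )
        linked.append(row)
    return linked


def median_value_by_group(linked, key):
    """Median assessed value for each distinct value of `key`."""
    groups = defaultdict(list)
    for row in linked:
        groups[row[key]].append(row["assessed_value"])
    return {group: median(values) for group, values in groups.items()}


linked = link_records(properties, owners)
by_region = median_value_by_group(linked, "region")
by_residency = median_value_by_group(linked, "resident_of_canada")
```

Real linkage work involves far more than this, including de-identification, inexact matching, and quality checks, but the sketch shows why a shared key across sources makes questions such as nonresident ownership answerable without a survey.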

The generosity of our partners and the ingenuity of those running this program have helped liberate the agency. We can now go beyond answering basic questions like, “How many houses do Canadians have?” We can dig into the data and explore, “What does housing mean, at this point in time in the Canadian social, economic, and political landscape?” We can also go further and provide unique data on first-time homebuyers and the homeownership journeys of new Canadians, as well as on hot-button issues like housing investors and satellite families. 

Thus, when policymakers come to us wondering about housing price increases, supply constraints, concentration and inequalities of homeownership, and ownership by nonresidents of Canada, we can give them high-quality, timely answers thanks to administrative data. We can tell them how immigrant homeowners, despite earning lower incomes than Canadian-born owners, buy more expensive properties in certain provinces. We can also show how that relates to their desire to own newer homes in urban settings, where they have broader access to services and job opportunities. That access increases their social connections, and homeownership lets them accumulate wealth in a form that studies show is more important to them than other types of assets, such as registered pension plans.

Housing is the most important asset owned by many Canadians and one of the most critical and complex issues of our time. It deserves and demands this kind of thorough exploration. Our outputs will also help policymakers explore housing affordability and equity as they grapple with fulfilling Canada’s National Housing Strategy Act, which recognizes adequate housing as a fundamental human right affirmed in international law. 

This national housing strategy was formulated in 2016. It’s a $40 billion plan that aims to ensure all Canadians have affordable housing that meets their needs by 2030. Against that ticking clock, researchers, community groups, policymakers, and journalists all come to Statistics Canada to understand the complex dynamics of our housing markets and track the country’s progress toward meeting this goal. 

The core data from our Housing Statistics Program will be made available to more than 4,000 municipalities across Canada by the end of 2022. As well, researchers will be able to access de-identified microdata files through our research data centers. We are also exploring generating synthetic housing data through algorithms to allow researchers a more granular understanding while protecting Canadians’ privacy. 

We have not yet exploited to their fullest extent the alternative data available to us. New methods may need to be introduced as we get a better understanding of the challenges we are facing, as has been done for traditional surveys over the years. One new challenge is that, in this digital and data-driven world, information is increasingly being monetized. That’s happening even in the public sector, as government organizations seek to defray the costs of their operations. In these cases, we must take into consideration the business models of these organizations so as to coexist and not compete with the products they produce. At the same time, we remain committed to fulfilling our mandate as a national statistical office to produce data that serves the public good. I would argue that the utility of this data exceeds the nominal revenue it generates. 

It is important that NSOs [national statistical offices] assert the unique value proposition of publicly owned data so that we are not constrained in acquiring and developing these data sources for the public good. They are vital to informing government policy and programs—and guiding decision-making right down to the individual citizen level.

Conclusion

Combining data sources has led to a renaissance among national statistical offices. The richer array of available data sources has increased our power to provide more timely and relevant information to policy- and decision-makers. It’s also come with new challenges and opportunities, which I elaborate upon in my oral comments. In short, we’re now delving into ethics, privacy, quality, equity, trust, and transparency like never before. We’re working to make our governance mechanisms more robust. We’re bringing in other disciplines, such as cognitive research, and putting more emphasis on them than before. We’re augmenting our statistical toolkit. We’re expanding our partnerships and should continue to do so on an international scale. This is the future.
 
We also have to stay current. We have to keep asking ourselves whether our instruments reflect what’s happening in society now. Just because an indicator has been around for 100 years doesn’t mean it’ll survive the next 10 years. Look at how quickly the Consumer Price Index is evolving. We need to up our game, have equitable systems, bring in multiple dimensions, and put more weight on them than we have before. That’s how we’ll remain definitive in an environment where everyone thinks their version of the truth is the reality.

Editor’s Note: This piece was adapted from a May 16 keynote presentation Anil Arora gave at a National Academies workshop. View his slides and a video of his presentation, along with those of the other presenters.


2 Comments »

  • Jane Gentleman said:

    I am a former StatsCan employee. I left in 1999 to work at the U.S. Nat’l Center for Health Statistics, and I retired in 2014. I was very impressed to read the Chief Statistician’s speech. But will somebody please edit his future reports to be consistent about using “data” as being either singular or plural? That word is really plural (the singular being “datum”), but fewer and fewer writers and speakers are using it that way nowadays. The Chief Statistician’s speech used it both ways, often, which detracted from its professionalism and would have distracted some readers/listeners.

  • Pedro Silva said:

    Excellent speech / article. The challenges faced in a developed country like Canada are compounded by more difficult environments faced by most NSOs in the developing world: imperfect legal support; underfunding; poorer governance; insufficient statistical capacity. But it is reassuring to see Statistics Canada’s leadership in showing the way. However, I believe that in the future we will see more ‘public statistics’ produced outside of government but having similar objectives. I cannot imagine how NSOs alone can meet the existing and projected future societal demands.