Unconventional Data Sources Fuel Research Innovations
- Market-research panels offer researchers access to millions of participants and are especially useful for engaging hard-to-reach groups. But they are currently used in fewer than 15% of psychological studies.
- Population-level administrative data offer an affordable and detailed source of information for longitudinal studies.
- Data from global positioning systems (GPS) can be integrated with other types of data, such as heart rate or life satisfaction, and can be analyzed with familiar statistical methods like correlations and regressions.
- Special ethical and data quality considerations may be needed when researchers use unconventional data sources.
- By engaging in interdisciplinary collaborations, researchers are more likely to be exposed to new approaches to research, including the use of unconventional data sources.
As a postdoctoral researcher studying experimental psychology at New York University (NYU) in the early 2000s, Leib Litman had no problem finding participants for his large studies on episodic memory.
“NYU is this huge place,” Litman said in an interview with the Observer. “There is an endless participant pool of undergraduates that you have access to pretty much at any time of the year, except maybe in the summer where it gets a little bit more difficult to recruit participants.”
But when he later moved on to a faculty position at a small, private college, he realized access to large numbers of students was a luxury he would no longer be afforded. This need for participants to fuel his research led Litman and a colleague in computer science, Jonathan Robinson, to create a suite of online tools that would expedite the process of identifying participants.
“I’ll never forget the first time we did a research study online and collected 500 people in a matter of an hour,” Litman said. “It was really one of those life-changing moments when I realized, you know, this is a complete revolution in science.”
Litman is now one of the cofounders and chief research officer for CloudResearch, an online research and participant recruitment platform. He is also a professor of psychology at Touro University’s Lander College.
What started as a solution to a personal research problem now serves tens of thousands of researchers at over 5,000 institutions. Litman remembers one of the first times he unveiled the project to the research community at an APS Annual Convention about a decade ago.
“It was like standing room only,” he said. “People were extremely, extremely interested because it was very clear that the problems that I was having, everybody else was having, too.”
Since then, online studies have become the norm for psychological research, and CloudResearch’s Connect is one of the major platforms that researchers turn to for participant recruitment. But CloudResearch also offers another option to find participants that Litman believes has been largely underutilized for behavioral research: market-research panels.
With its Prime Panels platform, CloudResearch aggregates over 100 million participants from 300 market-research panels, a participant pool that dwarfs the approximately 100,000 available on Connect.
Yet Litman estimates that only about 10%–15% of psychological studies turn to market-research panels for participant recruitment, though they are more common in other disciplines like political science.
In a recent article for Advances in Methods and Practices in Psychological Science, Litman and his colleagues provided a tutorial on the best practices in using market-research panels for behavioral science to help researchers decide if panels are the right approach for their studies (Moss et al., 2023).
Panels are run by market-research platforms with the goal of recruiting participants to understand consumer behavior and perceptions around a particular product. They vary in their approach, but they usually include a rewards program that incentivizes participation. Panels specialize in targeting different populations, organized by factors such as demographic segments, geographic regions, or language-specific recruitment. They also allow researchers to sample participants from most countries around the world.
“The main benefit of aggregating across multiple platforms is the ability to reach people at the kinds of scales that can’t be matched at all with any single platform,” Litman said. “Like when you’re looking for difficult-to-reach clinical participants or participants within specific cities or even ZIP code areas. For consumer research, you can find people who are using products in a very specific way.”
The challenge with using market-research panels is the lack of control over the platform, which can lead to data quality issues. Researchers do not control how much participants are paid and need to screen carefully to weed out fraudulent participation.
“There are a lot of papers that are written that just contain misinformation because they didn’t do enough to clean the data,” Litman said.
CloudResearch has combatted the issue of bad data quality by creating Sentry, a tool that automatically filters out low-quality and fraudulent responses by examining the technical and behavioral characteristics of each participant before they enter a survey. The tool takes about 20 seconds per participant and filters out about 30% of panel traffic. Even so, researcher vigilance is a must.
“The vast majority of fraud is removed through that mechanism,” Litman said. “But there’s only so much we could do, and so it is a partnership between CloudResearch and the researchers.”
Litman has seen the landscape of psychological research change drastically over the past decade, with online research revolutionizing what’s possible for social sciences, but he asserts that the ease of accessing participants brings new challenges that researchers must learn to problem-solve.
“It has to be done right, otherwise you run the risk of misinforming science and misinforming the public,” he said.
Administrative data support research on rare and long-term outcomes
Another approach that psychologists rarely use is drawing on data from administrative systems. These data are created as individuals interact with government and private systems in areas such as health care, social welfare, criminal justice, and education.
In the United States, multiple large-scale administrative systems are available for research, including birth and death records from the National Vital Statistics System, school test scores from the National Center for Education Statistics, and use of health care services from the Veterans Health Administration. Some of these data are publicly available, while access to sensitive information is restricted and governed by specific protocols that researchers must follow.
Leah Richmond-Rakerd, an assistant professor of psychology at the University of Michigan and a 2024 APS Janet Taylor Spence Award recipient, first became interested in the power of administrative data while working with epidemiologic survey data during her graduate research.
“That really helped to introduce me to the benefits of things like representative sampling and being able to work with large data sources to study associations across population subgroups or over time,” Richmond-Rakerd said in an interview with the Observer.
Richmond-Rakerd and her colleagues recently had a paper published in Current Directions in Psychological Science that describes a few distinct, and largely untapped, benefits of using population-level administrative data for psychological research.
First, data collection is expensive, especially when done over large scales or over an extended period. And for longitudinal studies, it can be difficult to ensure the sample stays consistent.
“If we’re conducting research on people over time, they may drop out of studies over time, and we may lose access to them and their information,” she said.
Conversely, administrative data can often be accessed at no cost to the researcher. And because administrative data record detailed information about the timing of specific events (the time a new medication is prescribed, for example), they can help researchers pinpoint what factors led to a specific outcome.
These data also offer the opportunity to study conditions that are rare in the population, such as schizophrenia or suicide mortality.
“Oftentimes, when researchers are interested in those kinds of things, they have to turn to more selected samples to obtain sufficient numbers of people,” Richmond-Rakerd said. “But in population-level administrative data, researchers can study those kinds of lower prevalence conditions while still working within a representative data source.”
Another unique opportunity for researchers using administrative data is to link that information to other datasets, such as those that contain residential information or large-scale environmental characteristics. For example, Richmond-Rakerd worked with colleagues at the University of Virginia, Duke University, the University of Auckland, and the University of Otago to study the link between risk for dementia and the characteristics of the neighborhoods in which individuals lived.
“We don’t yet, in the United States, have the ability to link information about people’s interactions with different types of systems at the individual level nationwide,” Richmond-Rakerd said. “Those kinds of population-level administrative data sources do, however, exist in other countries, such as the ones that my team has worked with in New Zealand and Denmark, and in other countries such as Sweden.”
Large-scale datasets come with challenges. For example, some information, including about social identities, may not be systematically or precisely measured.
“Administrative data traditionally are not collected specifically for research,” she clarified. “These data are recorded as part of the carrying out or delivery of various public services.”
GPS data can provide new insights on movement behavior
Location-based data have also been included in the recent wave of new data sources used for psychological research. Researchers have begun to experiment with ways to incorporate GPS data into research on behavior, tracking patterns of movement and locations visited.
Interdisciplinary work leads to innovative thinking
Sharon Koppman, a sociologist and associate professor at the University of California, Irvine’s Paul Merage School of Business, has seen the influence of interdisciplinary environments on innovation: In her research, she has found that the presence of inroads into other disciplines often allows for novel approaches to slip in.
Koppman and her colleague Erin Leahey, a professor of sociology at the University of Arizona, looked to their own field of sociology to investigate the factors that lead scientists to adopt unconventional methods—such as accessing data from atypical sources—in their research. In a study focused on individuals with sociology PhDs, the researchers found that participants with higher status in their careers were more likely to try unconventional methods than those with lower status. In this case, these higher-status participants were primarily men who were affiliated with top-tier universities.
“They’re more likely to innovate and also fail,” Koppman said in an interview with the Observer. “But they’ve already kind of made it, and so their failures are not really going to affect them very much.”
Koppman said researchers from some fields are more likely to try new approaches than others, which can often be influenced by how a field defines itself. If a field is beholden to a particular method, such as the ethnographic approach of anthropology, trying a new approach can feel like changing the definition of what it means to work in that field. By creating departments that include perspectives from multiple disciplines, Koppman believes institutions can help facilitate a more consistent exchange among scientists as they become familiar with new methods, data sources, and approaches to research.
GPS data can be integrated with other types of data, such as heart rate or life satisfaction, and can be analyzed with familiar statistical methods like correlations and regressions. But researchers need a specific skill set to use these data effectively.
In a 2022 tutorial paper, Sandrine Müller and colleagues describe how to manage challenges associated with these data, such as privacy considerations and how to interpret the psychological implications of movement patterns (Müller et al., 2022).
Like market-research panels, GPS data require a specific data quality process before they can be analyzed. Researchers must identify and remove inaccurate GPS records, which are not uncommon because of frequent technical issues such as lapses in satellite connectivity.
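One common way to flag inaccurate records is to drop any fix that would require an implausible travel speed to reach from the previous accepted fix. The sketch below illustrates that idea only; it is not the procedure from the tutorial cited above. The function names, the tuple layout, the sample coordinates, and the 70 meters-per-second threshold are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def drop_implausible_fixes(fixes, max_speed_mps=70.0):
    """Keep fixes reachable from the last kept fix at or below max_speed_mps.

    fixes: list of (unix_seconds, lat, lon) tuples sorted by time.
    The speed threshold is a hypothetical default, not a recommended value.
    """
    kept = []
    for t, lat, lon in fixes:
        # Discard coordinates outside the valid latitude/longitude range.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            if dt <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_m(lat0, lon0, lat, lon) / dt > max_speed_mps:
                continue  # jump too large for the elapsed time
        kept.append((t, lat, lon))
    return kept


# Made-up sample track: one large jump and one invalid latitude.
fixes = [
    (0, 40.7295, -73.9965),
    (60, 40.7300, -73.9960),
    (120, 41.7300, -73.9960),  # roughly 111 km away after 60 seconds
    (180, 40.7305, -73.9955),
    (240, 95.0, -73.9955),     # latitude out of range
]
clean = drop_implausible_fixes(fixes)
print([t for t, _, _ in clean])
```

Because each fix is compared with the last accepted one, a bad first record would cause later good records to be rejected, so a real pipeline would likely need an additional check on the starting point.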
To ensure ethical use of GPS data, researchers must give special consideration to how the data are secured and disconnected from any participant identifiers. This includes removing the coordinates of the home and work locations of participants and assigning labels to obscure exact locations.
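The masking step described above can be sketched as follows: any fix within a chosen radius of a participant's home or work location has its coordinates removed and replaced with a generic label. This is a minimal illustration, not a complete de-identification procedure. The function names, the 200-meter radius, and the sample coordinates are hypothetical.

```python
import math


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate distance in meters (equirectangular; adequate for short ranges)."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return 6_371_000 * math.hypot(dx, dy)


def mask_sensitive_places(fixes, places, radius_m=200.0):
    """Replace coordinates near a sensitive place with that place's label.

    fixes: list of (timestamp, lat, lon).
    places: dict mapping a label such as "home" to a (lat, lon) pair.
    Returns a list of (timestamp, lat_or_None, lon_or_None, label_or_None).
    """
    masked = []
    for t, lat, lon in fixes:
        label = None
        for name, (plat, plon) in places.items():
            if distance_m(lat, lon, plat, plon) <= radius_m:
                label = name
                break
        if label is None:
            masked.append((t, lat, lon, None))
        else:
            masked.append((t, None, None, label))  # coordinates are dropped
    return masked


# Made-up coordinates for illustration only.
places = {"home": (40.7295, -73.9965), "work": (40.7484, -73.9857)}
fixes = [
    (0, 40.7296, -73.9964),     # a few meters from "home"
    (600, 40.7400, -73.9900),   # in transit
    (1200, 40.7485, -73.9858),  # a few meters from "work"
]
masked = mask_sensitive_places(fixes, places)
for row in masked:
    print(row)
```

Masking of this kind reduces exposure of exact locations but does not by itself guarantee anonymity, since the remaining trajectory may still be identifying.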
Richmond-Rakerd also emphasized the unique ethical considerations of relying on administrative data. She stressed the importance of using responsible research practices when using these data, such as developing research questions and hypotheses before engaging with datasets.
“It’s important to keep in mind with administrative data that you’re often working with very, very large-scale data resources, and so most associations will be statistically significant,” she said, adding that it can be helpful to focus more on effect size than significance.
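A short calculation shows why significance becomes uninformative at this scale. The sketch below holds a correlation fixed at a hypothetical r = 0.01 and varies only the sample size, using the standard t statistic for a Pearson correlation and a normal approximation for the p value (an assumption that is reasonable only for large samples). The function name and the chosen values are illustrative.

```python
import math


def correlation_test(r, n):
    """Return (t statistic, approximate two-sided p value, r squared).

    The p value uses the normal approximation to the t distribution,
    which is reasonable only when n is large.
    """
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p, r ** 2


# Same tiny effect, three hypothetical sample sizes.
for n in (100, 10_000, 1_000_000):
    t, p, r2 = correlation_test(0.01, n)
    print(f"n={n:>9,}  t={t:6.2f}  p={p:.3g}  variance explained={r2:.2%}")
```

The same correlation, which explains 0.01% of the variance, is not significant at conventional thresholds with 100 or 10,000 cases but is overwhelmingly significant with 1,000,000, which is why the effect size is the more informative quantity in population-level data.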
As researchers continue to learn how to most effectively use unconventional data sources, they share lessons learned with those in their own fields, and also with collaborating researchers from other fields. Richmond-Rakerd anticipates that use of administrative datasets will become more common as psychologists collaborate with researchers in fields like economics and health, where they are more commonly used, as well as those outside of the United States.
“More interdisciplinary collaboration isn’t just beneficial for bringing in new theoretical or methodological perspectives, but also opens up opportunities for psychologists to gain more experience and training in working with these kinds of data resources,” Richmond-Rakerd said.
References
Koppman, S., & Leahey, E. (2019). Who moves to the methodological edge? Factors that encourage scientists to use unconventional methods. Research Policy, 48(9), Article 103807. https://doi.org/10.1016/j.respol.2019.103807
Moss, A. J., Hauser, D. J., Rosenzweig, C., Jaffe, S., Robinson, J., & Litman, L. (2023). Using market-research panels for behavioral science: An overview and tutorial. Advances in Methods and Practices in Psychological Science, 6(2). https://doi.org/10.1177/25152459221140388
Müller, S. R., Bayer, J. B., Ross, M. Q., Mount, J., Stachl, C., Harari, G. M., Chang, Y.-J., & Le, H. T. K. (2022). Analyzing GPS data for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 5(2). https://doi.org/10.1177/25152459221082680
Richmond-Rakerd, L. S., Dent, K. R., Andersen, S. H., D’Souza, S., & Milne, B. J. (2024). Population-level administrative data: A resource to advance psychological science. Current Directions in Psychological Science. https://doi.org/10.1177/09637214241275570