At the 2016 American Evaluation Association conference, I chaired a session on benefits and challenges with ICTs in Equity-Focused Evaluation. The session frame came from a 2016 paper on the same topic. Panelists Kecia Bertermann from Girl Effect, and Herschel Sanders from RTI added fascinating insights on the methodological challenges to consider when using ICTs for evaluation purposes and discussant Michael Bamberger closed out with critical points based on his 50+ years doing evaluations.
ICTs include a host of technology-based tools, applications, services, and platforms that are overtaking the world. We can think of them in three key areas: technological devices, social media/internet platforms and digital data.
An equity focus evaluation implies ensuring space for the voices of excluded groups and avoiding the traditional top-down approach. It requires:
- Identifying vulnerable groups
- Opening up space for them to make their voices heard through channels that are culturally responsive, accessible and safe
- Ensuring their views are communicated to decision makers
It is believed that ICTs, especially mobile phones, can help with inclusion in the implementation of development and humanitarian programming. Mobile phones are also held up as devices that can allow evaluators to reach isolated or marginalized groups and individuals who are not usually engaged in research and evaluation. Often, however, mobiles only overcome geographic inclusion. Evaluators need to think harder when it comes to other types of exclusion – such as that related to disability, gender, age, political status or views, ethnicity, literacy, or economic status – and we need to consider how these various types of exclusions can combine to exacerbate marginalization (e.g., “intersectionality”).
We are seeing increasing use of ICTs in evaluation of programs aimed at improving equity. Yet these tools also create new challenges. The way we design evaluations and how we apply ICT tools can make all the difference between including new voices and feedback loops or reinforcing existing exclusions or even creating new gaps and exclusions.
Some of the concerns with the use of ICTs in equity- based evaluation include:
- Are we falling victim to ‘elite capture’ — only hearing from higher educated, comparatively wealthy men, for example? How does that bias our information? How can we offset that bias or triangulate with other data and multi-methods rather than depending only on one tool-based method?
- Are we relying too heavily on things that we can count or multiple-choice responses because that’s what most of these new ICT tools allow?
- Are we spending all of our time on a device rather than in communities engaging with people and seeking to understand what’s happening there in person?
- Is reliance on mobile devices or self-reporting through mobile surveys causing us to miss contextual clues that might help us better interpret the data?
- Are we falling into the trap of fallacy in numbers – in other words, imagining that because lots of people are saying something, that it’s true for everyone, everywhere?
- Do digital tools require a costly, up-front investment that some organizations are not able to make?
- How do fear and resistance to using digital tools impact on data gathering?
- What kinds of organizational change processes are needed amongst staff or community members to address this?
- What new skills and capacities are needed?
- How are researchers and evaluators managing informed consent considering the new challenges to privacy that come with digital data? (Also see: Rethinking Consent in the Digital Age)?
- Are evaluators and non-profit organizations equipped to keep data safe?
- Is it possible to anonymize data in the era of big data given the capacity to cross data sets and re-identify people?
- What new risks might we be creating for community members? To local enumerators? To ourselves as evaluators? (See: Developing and Operationalizing Responsible Data Policies)
Evaluation of Girl Effect’s online platform for girls
Kecia walked us through how Girl Effect has designed an evaluation of an online platform and applications for girls. She spoke of how the online platform itself brings constraints because it only works on feature phones and smart phones, and for this reason it was decided to work with 14-16 year old urban girls in megacities who have access to these types of devices yet still experience multiple vulnerabilities such as gender-based violence and sexual violence, early pregnancy, low levels of school completion, poor health services and lack of reliable health information, and/or low self-esteem and self-confidence.
The big questions for this program include:
- Is the content reaching the girls that Girl Effect set out to reach?
- Is the content on the platform contributing to change?
Because the girl users are on the platform, Girl Effect can use features such as polls and surveys for self-reported change. However, because the girls are under 18, there are privacy and security concerns that sometimes limit the extent to which the organization feels comfortable tracking user behavior. In addition, the type of phones that the girls are using and the fact that they may be borrowing others’ phones to access the site adds another level of challenges. This means that Girl Effect must think very carefully about the kind of data that can be gleaned from the site itself, and how valid it is.
The organization is using a knowledge, attitudes and practices (KAP) framework and exploring ways that KAP can be measured through some of the exciting data capture options that come with an online platform. However it’s hard to know if offline behavior is actually shifting, making it important to also gather information that helps read into the self-reported behavior data.
Girl Effect is complementing traditional KAP indicators with web analytics (unique users, repeat visitors, dwell times, bounce rates, ways that users arrive to the site) with push-surveys that go out to users and polls that appear after an article (“Was this information helpful? Was it new to you? Did it change your perceptions? Are you planning to do something different based on this information?”) Proxy indicators are also being developed to help interpret the data. For example, does an increase in frequency of commenting on the site by a particular user have a link with greater self-esteem or self-efficacy?
However, there is only so much that can be gleaned from an online platform when it comes to behavior change, so the organization is complementing the online information with traditional, in-person, qualitative data gathering. The site is helpful there, however, for recruiting users for focus groups and in-depth interviews. Girl Effect wants to explore KAP and online platforms, yet also wants to be careful about making assumptions and using proxy indicators, so the traditional methods are incorporated into the evaluation as a way of triangulating the data. The evaluation approach is a careful balance of security considerations, attention to proxy indicators, digital data and traditional offline methods.
Using SMS surveys for evaluation: Who do they reach?
Herschel took us through a study conducted by RTI (Sanders, Lau, Lombaard, Baker, Eyerman, Thalji) in partnership with TNS about the use of SMS surveys for evaluation. She noted that the rapid growth of mobile phones, particularly in African countries, opens up new possibilities for data collection. There has been an explosion of SMS surveys for national, population-based surveys.
Like most ICT-enabled MERL methods, use of SMS for general population surveys brings both promise:
- High mobile penetration in many African countries means we can theoretically reach a large segment of the population.
- These surveys are much faster and less expensive than traditional face-to- face surveys.
- SMS surveys work on virtually any GSM phone.
- SMS offers the promise of reach. We can reach a large and geographically dispersed population, including some areas that are excluded from FTF surveys because of security concerns.
- Coverage: We cannot include illiterate people or those without access to a mobile phone. Also, some sample frames may not include the entire population with mobile phones.
- Non-response: Response rates are expected to be low for a variety of reasons, including limited network connectivity or electricity; if two or people share a phone, we may not reach all people associated with that phone; people may feel a lack of confidence with technology. These factors might affect certain sub-groups differently, so we might underrepresent the poor, rural areas, or women.
- Quality of measurement. We only have 160 CHARACTERS for both the question AND THE RESPONSE OPTIONS. Further, an interviewer is not present to clarify any questions.
RTI’s research aimed to answer the question: How representative are general population SMS surveys and are there ways to improve representativeness?
Three core questions were explored via SMS invitations sent in Kenya, Ghana, Nigeria and Uganda:
- Does the sample frame match the target population?
- Does non-response have an impact on representativeness?
- Can we improve quality of data by optimizing SMS designs?
One striking finding was the extent to which response rates may vary by country, Hershel said. In some cases this was affected by agreements in place in each country. Some required a stronger opt-in process. In Kenya and Uganda, where a higher percentage of users had already gone through an opt-in process and had already participated in SMS-based surveys, there was a higher rate of response.
These response rates, especially in Ghana and Nigeria, are noticeably low, and the impact of the low response rates in Nigeria and Ghana is evident in the data. In Nigeria, where researchers compared the SMS survey results against the face-to-face data, there was a clear skew away from older females, towards those with a higher level of education and who are full-time employed.
Additionally, 14% of the face-to-face sample, filtered on mobile users, had a post-secondary education, whereas in the SMS data this figure is 60%.
Additionally, Compared to face-to-face data, SMS respondents were:
- More likely to have more than 1 SIM card
- Less likely to share a SIM card
- More likely to be aware of and use the Internet.
This sketches a portrait of a more technological savvy respondent in the SMS surveys, said Herschel.
The team also explored incentives and found that a higher incentive had no meaningful impact, but adding reminders to the design of the SMS survey process helped achieve a wider slice of the sample and a more diverse profile.
Response order effects were explored along with issues related to questionnaire designers trying to pack as much as possible onto the screen rather than asking yes/no questions. Hershel highlighted that that when multiple-choice options were given, 76% of SMS survey respondents only gave 1 response compared to 12% for the face-to-face data.
Lastly, the research found no meaningful difference in response rate between a survey with 8 questions and one with 16 questions, she said. This may go against common convention which dictates that “the shorter, the better” for an SMS survey. There was no observable break off rate based on survey length, giving confidence that longer surveys may be possible via SMS than initially thought.
Hershel noted that some conclusions can be drawn:
- SMS excels for rapid response (e.g., Ebola)
- SMS surveys have substantial non-response errors
- SMS surveys overrepresent
These errors mean SMS cannot replace face-to-face surveys … yet. However, we can optimize SMS survey design now by:
- Using reminders during data collection
- Be aware of response order effects. So we need to randomize substantive response options to avoid bias.
- Not using “select all that apply” questions. It’s ok to have longer surveys.
However, she also noted that the landscape is rapidly changing and so future research may shed light on changing reactions as familiarity with SMS and greater access grow.
Summarizing the opportunities and challenges with ICTs in Equity-Focused Evaluation
Finally we heard some considerations from Michael, who said that people often get so excited about possibilities for ICT in monitoring, evaluation, research and learning that they neglect to address the challenges. He applauded Girl Effect and RTI for their careful thinking about the strengths and weaknesses in the methods they are using. “It’s very unusual to see the type of rigor shown in these two examples,” he said.
Michael commented that a clear message from both presenters and from other literature and experiences is the need for mixed methods. Some things can be done on a phone, but not all things. “When the data collection is remote, you can’t observe the context. For example, if it’s a teenage girl answering the voice or SMS survey, is the mother-in-law sitting there listening or watching? What are the contextual clues you are missing out on? In a face-to-face context an evaluator can see if someone is telling the girl how to respond.”
Additionally,“no survey framework will cover everyone,” he said. “There may be children who are not registered on the school attendance list that is being used to identify survey respondents. What about immigrants who are hiding from sight out of fear and not registered by the government?” He cautioned evaluators to not forget about folks in the community who are totally missed out and skipped over, and how the use of new technology could make that problem even greater.
Another point Michael raised is that communicating through technology channels creates a different behavior dynamic. One is not better than the other, but evaluators need to be aware that they are different. “Everyone with teenagers knows that the kind of things we communicate online are very different than what we communicate in a face-to-face situation,” he said. “There is a style of how we communicate. You might be more frank and honest on an online platform. Or you may see other differences in just your own behavior dynamics on how you communicate via different kinds of tools,” he said.
He noted that a range of issues has been raised in connection to ICTs in evaluation, but that it’s been rare to see priority given to evaluation rigor. The study Herschel presented was one example of a focus on rigor and issues of bias, but people often get so excited that they forget to think about this. “Who has access.? Are people sharing phones? What are the gender dynamics? Is a husband restricting what a woman is doing on the phone? There’s a range of selection bias issues that are ignored,” he said.
Quantitative bias and mono-methods are another issue in ICT-focused evaluation. The tool choice will determine what an evaluator can ask and that in turn affects the quality of responses. This leads to issues with construct validity. If you are trying to measure complex ideas like girls’ empowerment and you reduce this to a proxy, there can often be a large jump in interpretation. This doesn’t happen only when using mobile phones for evaluation data collection purposes but there are certain areas that may be exacerbated when the phone is the tool. So evaluators need to better understand behavior dynamics and how they related to the technical constraints of a particular digital or mobile platform.
The aspect of information dissemination is another one worth raising, said Michael. “What are the dynamics? When we incorporate new tools, we tend to assume there is just one-step between the information sharer and receiver, yet there is plenty of literature that shows this is normally at least 2 steps. Often people don’t get information directly, but rather they share and talk with someone else who helps them verify and interpret the information they get on a mobile phone. There are gatekeepers who control or interpret, and evaluators need to better understand those dynamics. Social network analysis can help with that sometimes – looking at who communicates with whom? Who is part of the main infuencer hub? Who is marginalized? This could be exciting to explore more.”
Lastly, Michael reiterated the importance of mixed methods and needing to combine online information and communications with face-to-face methods and to be very aware of invisible groups. “Before you do an SMS survey, you may need to go out to the community to explain that this survey will be coming,” he said. “This might be necessary to encourage people to even receive the survey, to pay attention or to answer it.” The case studies in the paper “The Role of New ICTs in Equity-Focused Evaluation: Opportunities and Challenges” explore some of these aspects in good detail.