Chatbot as a Telehealth Intervention Strategy in the COVID-19 Pandemic: Lessons Learned from an Action Research Approach

The COVID-19 pandemic and the need for social distancing have created a demand for new and innovative solutions in healthcare systems worldwide. One of the strategies that have been implemented is the use of chatbots, which can help provide reliable health information and keep people from seeking assistance in healthcare centers and being unnecessarily exposed to the virus. In this context, although a large number of chatbots have been implemented worldwide, little has been discussed about the process and challenges of developing and implementing this technology. This paper reports on an action research study that designed a novel chatbot as a prompt response to the COVID-19 pandemic. The chatbot is intended to be a first layer of interaction with the public, performing triage of patients and providing information about COVID-19 on a large scale and without human contact. Our contribution is twofold: (i) we reflected on the development process and discussed lessons learned and recommendations to support a multidisciplinary development and evolution process of the chatbot; and (ii) we identified some interactive and technological features that can be used as a reference framework for this kind of technology. These contributions can be useful to other researchers and multidisciplinary teams facing similar challenges.


Introduction
The exponential growth of COVID-19 cases [1], which rapidly evolved into a global pandemic, demanded a fast and effective response from healthcare services, which included a surge of innovation and breakthroughs in technology and healthcare. Telehealth aims to provide medical support to patients at home through information and communication technologies, and has been widely explored as an effective tool for fighting the pandemic and avoiding the overload of healthcare services [2], [3].
One of the technologies that have flourished in the domain of telehealth is the chatbot [4]-[7], i.e., a computational system enabling interaction with another (human or artificial) agent via text or speech [8]. Several properties made chatbots an especially attractive tool during the pandemic: their potential to provide information 24/7, consistent response quality, a non-judgmental approach (making people feel more willing to disclose private information), remote interaction from people's homes without increasing the risk of COVID-19 transmission, and the possibility of being easily scaled to respond to the sharp increase in demand.
Chatbots were developed to support users in different aspects regarding COVID-19: information dissemination, symptoms monitoring or screening, behavior change support and mental health support [9]. A number of chatbots focusing on the pandemic were developed in different countries, not only to provide people with general information about COVID-19, but also to be able to take into consideration local specificities and recommendations (e.g. [10]- [12]).
Nowadays, developing a rule-based chatbot (i.e. based on a fixed predefined set of rules to generate responses for users' queries, without creating any new answers [8]) is not a complex task, particularly when using platforms developed for this purpose (e.g. BLiP 1 or DialogFlow 2). For instance, the literature reports the development and launch of a chatbot for screening COVID-19 symptoms in only five days [13].
The challenges associated with developing a chatbot for healthcare services arise from the need to involve a multidisciplinary team able to focus on distinct requirements that are essential for the success and reliability of the application. The chatbot must provide accurate medical information, especially considering that information about COVID-19 has been continually evolving, which means that updates and changes need to be performed frequently [9], [14] and that misinformation needs to be fought as well [15]-[17]. Additionally, information must be adapted from the medical discourse to a discourse that lay people can easily understand [9]. Finally, the system must be easy to use and generate a positive user experience, so people choose to use it [11], [13].
Even though there have been several publications about chatbots [9]- [14], [16], [18], very little has been reported on the process followed to develop chatbots [11], [13] and how to tackle the challenges imposed by a pandemic situation. This paper reports on an action research, which designed a novel chatbot as a prompt response to the COVID-19 pandemic, in the state of Minas Gerais (MG), Brazil. Action research is an approach that mixes practice and research where one can learn from acting in a situated problem or context [19]. It involves intervening in an actual situation or context and doing research that can both inform this intervention and produce critical knowledge about it [20]. Due to the emergency need of responding to COVID-19 demands, acting and researching was not only an option but actually the only option for those involved in the COVID-19 healthcare practice. In this sense, our team acted and reflected on the development process of the chatbot and discussed challenges, lessons learned and recommendations to support a multidisciplinary development and evolution process of the chatbot. We also identified some interactive and technological features that can be used as a reference framework for this kind of technology.
The remainder of this paper is organized as follows: Section 2 briefly reviews other works on chatbots for COVID-19 and their design process; Section 3 describes our action research approach; followed by our results (Section 4) and the discussion of these results (Section 5). Finally, we present the limitations (Section 6), conclusions and next steps (Section 7) of our work.

Related Work
At the time of writing this paper, over one year after COVID-19 had its first case confirmed outside China [11], several chatbots developed specifically to deal with (or that have included) COVID-19 are available. Many governments have implemented chatbots to help inform their population about COVID, taking into account not only global recommendations, but also local guidelines. Likewise, existing "off-the-shelf" chatbots aimed at medical diagnosis have also included COVID in their databases, as is the case of Symptoma, which was evaluated to show its accuracy in identifying COVID symptoms [12].
Our paper focuses on the design process integrating the multidisciplinary perspectives required for developing a COVID-19 chatbot and the ensuing challenges. It is worth noting that COVID-19 brought specific requirements to the development of chatbots, such as the urgency and constant changes of information, as scientists learned more about COVID-19 [9], [14], [16]. Thus, although there is much work related to developing chatbots in general (e.g. [21]-[25]) and even in the healthcare domain [26]-[30], in this section we will focus on how other works have dealt with or discussed these issues. Our review of the literature yielded two papers that describe the design process adopted in the development of COVID-related chatbots.
In [11], the "COVID-19 Preventable" chatbot was developed by the Emergency Operation Center (EOC) and the Department of Disease Control (DDC) in Thailand. The design team was composed of doctors and public health government officers employed by the DDC. They adopted a design process based on the Design Science Research Methodology, consisting of six steps: definition of target problem, establishment of objectives, design and development, demonstration, evaluation, and communication. The last four steps covered a two-phase development cycle, the first yielding the deployment of an initial version of the chatbot for self-screening and provision of information about COVID-related topics, and the second one providing updates and broadening the coverage of the information available. Despite the successful outcomes (the chatbot became one of the DDC's official interactive channels and was accessed by over 500,000 people in Thailand), the authors reported the absence of a multidisciplinary team as one of the project's shortcomings.
In [13], the authors described the successful development of a chatbot for daily entry screening of healthcare workers to prevent the spread of COVID-19 in the University of California, San Francisco (UCSF) Health. They adopted an agile product design process, comprising 6 tasks: identification of initial requirements; research of platform to be used; minimum viable product deployment; improvement and stabilization steps; and scaling up the number of users. The authors argue that rapid product design and deployment in the healthcare setting require a multidisciplinary team with a strong digital product management background. The authors ascribed their successful implementation to the multidisciplinarity of their team, with experience in developing similar tools. However, they have not provided details of their multidisciplinary team; nor have they explained the role each one took in the development process.
Other works reviewed report chatbots designed to meet varied demands brought about by COVID-19, but do not discuss the design process adopted. Penn Medicine [10] accounts for the development of a chatbot in view of the benefits offered: 24/7 availability, consistent answers, high scalability and interaction in multiple languages. Despite the existence of other COVID-19 "off-the-shelf" chatbots, they report having opted to create a new one to cater for local content. In shared work with specialized companies, they created a chatbot able to provide information about COVID-19, to screen patients and direct them to receive care within their institution. They describe some of the challenges and actions taken regarding the creation of appropriate, accurate and updated content for patients, as well as concerns regarding the bot's role in the screening process.
Other chatbot proposals aimed at dealing with specific local content prompted by COVID-19 have also been published. In [18], authors described a chatbot under development in Italy to support COVID information, screening and telecare targeting people in remote areas. They described the communication workflow implemented and compared it to other existing chatbots. In [31], authors also target health support to people in remote areas of the country, though not on COVID matters, but rather on other healthcare issues (i.e. prenatal care and chronic diseases) which they expected would be impacted by COVID-imposed social distancing.
Along the lines of the studies herein reviewed, the present study was motivated by the potential benefits chatbots may bring to screening support and dissemination of trustworthy health information in the context of COVID-19. In our work, we focused on a public Brazilian telehealth system, TeleCOVID-19, which took into consideration specific needs of Brazilian users, not only with regard to language, but also to specificities associated with the pandemic in the country and the associated medical services. Unlike previously published works, we provide details on the action research conducted and systematize our findings to inform research and development of multidisciplinary team working processes to develop telehealth chatbots.

Research Approach
The ultimate goal of every healthcare-oriented research for fighting COVID-19 is to save lives. A technology-based intervention can help to achieve this goal by improving the healthcare service during the pandemic. Our research started with the very practical goal of adapting and improving a public healthcare service for a university hospital network so it could cope with the conditions introduced by the pandemic. In particular, the aim was to prepare for the possibility of overload combined with the need, not only to ensure social distancing, but also to minimize the risks of virus spread for both patients and staff. However, even if one can come up with a practical technological intervention in such an unknown domain, there emerges the immediate question of how to assess its results, contributions and eventual shortcomings. When combining both an intervention in a social setting and research aimed to learn from it, action research [19], [20], [32], [33] is an approach often employed in the social sciences and in Human-Computer Interaction (HCI) [34], [35]. Although a consolidated method, there are different approaches to action research, which requires a brief overview of the method and a clarification about the approach we are using here.
Action research is an endeavor that entails both an action in an actual setting and the effort to produce scholarly knowledge [34]. It is an approach by which researchers try to change reality by acting on or in it [33], committed to both changing a social system and generating critical knowledge about it [32]. Quality and validity, both practical and scientific, are assured by the cyclic nature of action research, where multiple cycles (or spirals) of plan-act-observe-reflect are undertaken in order to refine the lessons learned. It departs from the more positivist-oriented view of researchers not interfering in the object of study and takes the opposite stance of studying a social reality precisely by trying to change it and then verifying the actual effects of such change. In this perspective, researchers can take different positions regarding the setting they study, and collaboration between researchers and participants is not only allowed but also desirable. Some approaches take a more formal separation, with researchers as external consultants joining and collaborating with a group of people (their clients) that have a certain problem to be investigated and solved, e.g. [35]. However, this perspective would not correspond to our actual context, which was characterized by a multidisciplinary team with different positions and interests in the project. According to Herr & Anderson's positionality terminology [36, Ch. 3], the position a researcher takes when conducting action research can vary from an outsider to a collaborator or an insider, thus influencing the process, the outcome and the quality of the research [36]. Insiders are those who are or become part of the organization or social group under investigation and are, therefore, more affected by the research products and outcomes. Outsiders are those experts (usually researchers) that get involved with the organization or social group under investigation mostly because of the research and can stay and observe things at a certain distance. In the following section, we provide a description of the actual context and people's roles in order to clarify the different positions each team member and group had in our action research.

Actual Context and Setting
Our research was conducted by a multidisciplinary team comprising four expertise groups (Fig. 1): (I) a telehealth center team made up of IT personnel and physicians. Affiliated to a university hospital and responsible for developing, deploying and maintaining a set of telehealth initiatives, the telehealth center has a triple purpose of training undergrad students (future physicians), supporting medical research and providing the local population with free healthcare services integrated with the municipal, state and national public healthcare network [37]; (II) a group of senior professors and junior researchers (undergrad students) from the Medical School, participating in the project as coordinators or collaborators in the telehealth center concomitantly with their school research and/or clinical care duties; (III) a group of Applied Linguistics professors and post-doctoral researchers from the Arts Faculty, who brought to the project their expertise in computational linguistics and cross-cultural translation and adaptation; (IV) a group affiliated to the Computer Science Department of the same university, comprising professors, post-doctoral researchers and undergrad students, with skills covering software development, data science and HCI.
Initially, in the action part of our action research approach, the telehealth center team (I) and the Medical School faculty (II) developed a first version of the chatbot for the population seeking public healthcare assistance. Using Kock's terminology, they were the researchers' clients, that is, the owners of a practical problem (providing public healthcare assistance during the COVID-19 crisis) asking for expert help in a certain domain, namely, computing and applied linguistics [35]. Researchers from the other two groups, applied linguists (III) and computer scientists (IV), provided advice and support for specific questions regarding technological possibilities and solutions. Throughout the process, all groups developed a stronger form of collaboration, working together in several practical and research tasks, frequently sharing and discussing the demands and issues as they emerged, and holding biweekly follow-up meetings. Therefore, groups I and II could be more precisely characterized as insiders, not clients, because they were researchers directly involved in the healthcare service provision. Analogously, using Herr & Anderson's positionality terminology [36], groups III and IV were outsiders not directly involved in providing healthcare assistance, but collaborating in designing, evolving and evaluating the technological intervention. That particular organization set the ground for an ongoing endeavor that enabled both the deployment of technology and mutual learning about and from it, as depicted in Fig. 1.

Research Problem and Questions
The demand for a quick response and the strong social component of the targeted disease posed unprecedented challenges for IT and healthcare professionals working to fight COVID-19. Pressing needs were dealt with and problems were tackled and discussed along the way. This was a relevant step because insights could be drawn from pinpointing the challenges faced and both the successful and less-successful solutions found. Our analysis thus has the two-fold aim of (i) defining the challenges faced as they emerged in a concrete scenario of an actual telehealth project, and (ii) of drawing lessons learned around the process of meeting these challenges so that our findings can be leveraged in similar projects and contexts. The overall problem this research addresses is how to develop and deploy a telehealth intervention in the context of the COVID-19 pandemic. Particularly, which strategies could most successfully be leveraged in terms of team expertise, skills, processes, and technology, and which challenges were faced and dealt with along the way. In the case of this study, we focused on a specific type of technology, namely a chatbot for providing a first level type of healthcare service without human contact and at scale. The scope and functions of the chatbot will be described in the next section, since they are products of our first action research cycle.
Besides being an intervention to solve a practical problem (action), the research part in an action research requires research questions to be posed and pursued in order for it to differ from pure technology design [38]. In this work, our research question can be summarized as follows:
• RQ1) What are the socio-technical challenges for designing and deploying a telehealth chatbot in the context of the COVID-19 pandemic?
In the context described above, this question implied answering the following sub-questions:
• RQ1-A) In the social part, how can we organize a multidisciplinary team and leverage the different available skills, backgrounds and visions from different experts?
• RQ1-B) In the technical part, what kind of technology or technical infrastructure is needed in order to support this multidisciplinary team's ultimate goal, namely, to provide a helpful piece of technology for fighting the COVID-19 pandemic?
This research was submitted to and approved by the Universidade Federal de Minas Gerais Research Ethics Committee (CAAE: 35953620.9.0000.5149). The service terms of use (agreed to by users) described which data would be collected and how it would be used for improving the chatbot and for research purposes. Researchers had to sign a Data Use Commitment Term 3 to have access to the data.

Action Research Cycle
In this section, we describe some of the situations experienced during the first cycle of our action-research, the decisions and findings of each phase of the cycle regarding the chatbot, as well as some issues and concerns associated with them and relevant to our research questions.

Planning
We identified two core functionalities our chatbot was expected to provide: supportive healthcare assistance and health education. We thus opted for a chatbot that would support the screening of suspect cases of COVID-19 and provide updated and trustworthy information for educating the population about the disease.
Additionally, a few important requirements were identified by our team from the very beginning of our work. These requirements were based on the emerging local and international health authorities' recommendations about COVID-19 4 and the clinical experience of our medical team. They served as our starting point and are summarized below:
• Information had to be accurate, both about COVID-19 and procedures to follow. In particular, with regard to how to proceed, the most accepted clinical guidelines and recommendations by the Brazilian Ministry of Health and the World Health Organization (WHO) had to be adhered to;
• Language needed to be simple and clear, so that all content could be understood by the general population. For that purpose, we had to bear in mind the varying levels of health literacy in Brazil;
• Finally, technology had to be simple to use and not become an obstacle for users. Although chatbots are already used in various contexts in Brazil, it could be expected that people would have very different degrees of experience with them.
With these requirements in mind, the telehealth center team looked for a feasible solution and started to develop partnerships with other municipal governments and public healthcare units. Based on previous relationships, it was defined that the technology would be initially deployed in the Brazilian cities of Belo Horizonte, Divinópolis and Teófilo Otoni. The project had to consider different goals and infrastructure available in each location. For instance, in Divinópolis and Teófilo Otoni, the chatbot was intended to be integrated into a novel experimental telemedicine system for remote patient care, monitoring and follow-up, which was not available at first in Belo Horizonte.
Researchers in Applied Linguistics and Computer Science gradually joined the group for advising on the project. It is important to note that the team worked under strong time and practical constraints as time was of the essence, in order to support the population, especially considering that the number of confirmed cases gradually increased in the country.
Finally, as the project evolved and the team aligned with the project goals and needs, plans were made to evaluate the project quality and results. Clinical and user evaluations that focused on different aspects of the chatbot were planned as part of the practical and research goals of the project. Our focus in this paper is on reporting and reflecting upon the process of our interdisciplinary collaboration to develop the technology.

Acting
The chatbot assistant was developed using BLiP, a platform for developing conversational applications. On the platform, conversations are designed by means of a flowchart, where each message is defined and chained with its expected responses and follow-up messages. The process of building a chatbot on BLiP is straightforward and very quick in case of simple tasks, as it provides a default standard interface and native integration with popular messaging apps. The disadvantage is that it is not based on natural language processing (NLP) 5 and the chatbot inherits many limitations of the provided standard interfaces. It should be noted that the need for deploying a 100% working solution ready and available to the general public, including potential patients in need of care, limited more advanced experiments in terms of innovative technology solutions.
The chatbot is officially referred to as TeleCOVID, whereas the conversational agent introduces herself as "ANA", a short and popular Brazilian name. Its two main purposes, screening of suspect COVID-19 cases and education of the Brazilian Portuguese speaking population, were developed as follows.
Screening of suspect COVID-19 cases. Fig. 2 shows the decision tree used for screening. Based on those directions, our chatbot was built to assign a color tag according to the severity of the reported symptoms. Patients with dyspnea or who report a fainting sensation are ranked as "red" and advised to seek emergency care as soon as possible. Patients who have fever persisting over 3 days, or returning after 48 hours, are ranked as "orange" and advised to seek care in a hospital. The ones who do not have any of the warning signs, but who have comorbidities which may increase the risk of a severe disease, are ranked as "yellow" and advised to seek a reference healthcare center or a teleconsultation service, if there is one available. The ones who do not have any warning signs or comorbidities are ranked as "green" and receive advice on how to deal with their mild disease symptoms (i.e. home isolation, rest and staying hydrated). They are also advised to call for a teleconsultation, should the need arise. With the exception of patients ranked as "red", all the other tag groups were invited to experience our chatbot's educational session.
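The color-tag rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deployed chatbot was built as a flowchart on BLiP, not in Python; the field names, function names and advice wording below are hypothetical; and only the rules stated in the text are covered, not the full decision tree of Fig. 2.

```python
from dataclasses import dataclass


@dataclass
class ScreeningAnswers:
    """Answers collected during screening (field names are hypothetical)."""
    dyspnea: bool = False
    fainting_sensation: bool = False
    fever_over_3_days: bool = False
    fever_returned_after_48h: bool = False
    has_comorbidity: bool = False


# Advice text paraphrases the description above; it is not the deployed wording.
ADVICE = {
    "red": "Seek emergency care as soon as possible.",
    "orange": "Seek care in a hospital.",
    "yellow": "Seek a reference healthcare center or a teleconsultation service, if available.",
    "green": "Stay in home isolation, rest and keep hydrated; call for a teleconsultation if needed.",
}


def color_tag(answers: ScreeningAnswers) -> str:
    """Return the severity tag, checking the most severe condition first."""
    if answers.dyspnea or answers.fainting_sensation:
        return "red"
    if answers.fever_over_3_days or answers.fever_returned_after_48h:
        return "orange"
    if answers.has_comorbidity:
        return "yellow"
    return "green"


def offers_educational_session(tag: str) -> bool:
    """Every tag group except "red" is invited to the educational session."""
    return tag != "red"
```

Checking the conditions from most to least severe means that a patient with both a warning sign and a comorbidity always receives the more urgent tag.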
The educational session was initially built based on 75 frequently asked questions according to the database from the Telehealth Center. All the answers to these selected questions were given by Brazilian health professionals based on the best available evidence. The question-answer pairs were grouped in 11 different categories: general information, transmission, symptoms, advice for suspected cases, treatment, house care, hygiene, lifestyle, mask use, pregnancy and pets. The database was later expanded to 85 question-answer pairs organized into the previous 11 topics plus an additional one for "diagnosis" (see Table 1). The functionality of the question-and-answer session depicted in Fig. 3 provides a straightforward interaction to users.
Availability and Access. TeleCOVID was made available to the public in three different ways: on WhatsApp, as well as on the official website of the telehealth center 7 to serve most cities in the state of Minas Gerais, as well as some other states in which the service is active; on the official city government application of Divinópolis 8 to be used as a mobile app accessed via a new menu option; and on the official government website of the city of Teófilo Otoni 9 . Each version was an independent application, mirroring one another though adapted in order to fulfill particular needs of each city. It should be noted that in Divinópolis and Teófilo Otoni the systems were devised to send patients automatically to an available teleconsultation system, if the user agreed to use the service and to provide his/her phone number. For the other locations, the telehealth center's website version was not automatically integrated to the public healthcare system and worked more as an assisting informative system.
Interacting with TeleCOVID. Once a start stimulus is received, the virtual assistant begins a conversation welcoming users and asking for their consent to use the service as well as showing a link to the chatbot's terms of use. If the user agrees to them, the assistant will ask for some personal information such as name, age, gender and location. The interface and interactive features were inherited from the BLiP platform. Since our strategy was to prioritize the fastest possible deployment, we opted for BLiP's default native interface, which offered a standard chatbot window (see Fig. 4). In this sense, we were only able to define the interaction in terms of the flow and messages of the conversation. The website and mobile versions are similar and offer button and link shortcuts that users can click on. This feature is not available for users interacting through WhatsApp using its standard chat interface, requiring users to type their messages. For that reason, numbers were provided whenever possible to work as pure text shortcuts, mimicking a command-line menu navigation scheme. After the user's general information is collected, the assistant asks whether the user feels sick and wants to check what s(he) is feeling; or if s(he) wants information and guidelines about COVID-19 (see Fig. 4).
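The numbered text shortcuts used on WhatsApp can be illustrated with the following minimal sketch. It is written in Python for illustration only (the actual flow was configured on BLiP), and the function names and option labels are hypothetical.

```python
from typing import Optional, Sequence


def render_menu(prompt: str, options: Sequence[str]) -> str:
    """Render options as a numbered plain-text menu for channels without buttons."""
    lines = [prompt]
    lines += [f"{i}. {label}" for i, label in enumerate(options, start=1)]
    return "\n".join(lines)


def parse_choice(message: str, options: Sequence[str]) -> Optional[str]:
    """Map a typed number (or the full label) back to an option; None if unrecognized."""
    text = message.strip()
    if text.isdecimal():
        index = int(text)
        if 1 <= index <= len(options):
            return options[index - 1]
        return None
    for label in options:
        if text.casefold() == label.casefold():
            return label
    return None


# Hypothetical labels for the two options offered after the welcome step.
OPTIONS = ["Check what I am feeling", "Information about COVID-19"]
print(render_menu("How can I help you?", OPTIONS))
```

Returning `None` for unrecognized input lets the conversation flow repeat the menu instead of guessing what the user meant.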
Screening Intent. If the user chooses the first option, our virtual assistant ANA will then ask the questions based on the decision tree depicted in Fig. 2 and associate a color tag according to the severity of the symptoms. Users in the cities of Divinópolis and Teófilo Otoni who are classified with the tags red and orange are asked by the virtual assistant to provide their phone numbers and have the option to be automatically forwarded to the respective teleconsultation system. Patients who access the telehealth center's website and receive red and orange tags are strongly advised to seek emergency care. The cases classified in the other "colors" (yellow and green) receive the information according to the decision tree described previously (see Fig. 2).
Educational Intents. If the user selects the second option at the end of the Welcoming Intent or, for all users except those tagged as a red case, after the screening process, the assistant will start the education session, asking them to choose one of the 12 available topics they are interested in. The questions related to the selected topic are then displayed to the user. For instance, if in the "General" category users choose the question "What is SARS-CoV-2?", they will receive the answer "SARS-CoV-2 is the name scientists gave to the coronavirus which is causing the current pandemic."
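The site-dependent follow-up after screening can be summarized as a small routing function. The sketch below is a Python illustration, not the deployed BLiP flow; the step names and the `"website"` site label are hypothetical identifiers introduced only for this example.

```python
# Sites where the chatbot was integrated with a teleconsultation system.
TELECONSULTATION_SITES = {"Divinópolis", "Teófilo Otoni"}


def next_step(tag: str, site: str) -> str:
    """Choose the follow-up step from the color tag and the deployment site."""
    if tag in ("red", "orange"):
        if site in TELECONSULTATION_SITES:
            # Ask for a phone number and offer automatic forwarding.
            return "ask_phone_and_offer_teleconsultation"
        # No integration available: strongly advise emergency care.
        return "advise_emergency_care"
    if tag in ("yellow", "green"):
        return "give_decision_tree_advice"
    raise ValueError(f"unknown color tag: {tag!r}")
```

Rejecting unknown tags explicitly keeps a typo in the flow configuration from silently falling through to the mildest advice.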
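One simple way to represent the topic-based question-and-answer content is a nested mapping from topic to question to answer. The Python sketch below is an assumption about a possible data layout, not the structure used on BLiP; it contains only the single example pair quoted above, and the function names are hypothetical.

```python
from typing import Dict, List, Optional

# topic -> question -> answer; only the pair quoted in the text is included.
FAQ: Dict[str, Dict[str, str]] = {
    "General": {
        "What is SARS-CoV-2?": (
            "SARS-CoV-2 is the name scientists gave to the coronavirus "
            "which is causing the current pandemic."
        ),
    },
}


def list_questions(topic: str) -> List[str]:
    """Return the questions shown to the user after a topic is chosen."""
    return list(FAQ.get(topic, {}))


def get_answer(topic: str, question: str) -> Optional[str]:
    """Return the stored answer, or None if the pair is not in the database."""
    return FAQ.get(topic, {}).get(question)
```

Keeping answers as data rather than hard-coding them in the conversation flow would let reviewers update content without touching the flow itself.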

Observing
As soon as the first version of our TeleCOVID chatbot was launched, several changes and refinements in the technology were required. In order to accommodate the demands arising from the system that was in use (e.g. fixes and minor improvements), the requests for improvements from our team, and the modifications needed as a result of external factors (e.g. new scientific findings on COVID-19 or recommendation updates from healthcare authorities), we had to develop a reflective look on our development process. In this section, we report some events that taught us what we consider to be important lessons.
Quick and dirty approach. The first consideration refers to the pressing need of deploying a working solution quickly. It not only constrained the options for more innovative and potentially better solutions but also brought about consequences that hindered the quality of the product and our work process. In particular, the need to adapt to three different sites due to the particular infrastructure and healthcare services provided was addressed by means of the deployment of three differently adapted copies of the same original chatbot. As the work progressed and changes were made to fix bugs or make minor improvements, there were situations where the three different versions behaved differently and even got out of sync. As the team tested and inspected the production version of the chatbot, sometimes problems could not be spotted in the version under inspection, but were experienced by users at a certain location. Consequently, not all problems could be properly located internally by the team, and some of them could only be located when we were analyzing our evaluation data and the chatbot conversation logs. For instance, after every interaction, users were asked to grade their experience using the chatbot. Analyzing the data of users who reported not having had a satisfactory experience, we identified a subset who had trouble providing their telephone number. The problem was that the assistant required a specific phone number format, but the format was not communicated to users in the interface. Users simply got a general "invalid phone number" message. Although it was a simple problem that could have been easily spotted during an HCI heuristic inspection or even during a regular system test, it was not found prior to the conversation log analysis. One of the reasons was that most inspections were being conducted in the telehealth center's website chatbot version, considered the first and "primary" one, which did not request users' phone numbers.
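The phone number incident suggests two complementary remedies: accept common ways of typing a number, and state the expected format in the error message. The Python sketch below illustrates both. The format actually required by the deployed assistant is not described here, so the rule used (a 2-digit area code followed by an 8- or 9-digit number, with an optional +55 country prefix) is an assumption, as are the function names and message texts.

```python
import re
from typing import Optional, Tuple

# Assumed format for illustration: 2-digit area code + 8- or 9-digit number.
PHONE_HINT = "Please type your phone number with the area code, e.g. (31) 90000-0000."


def normalize_phone(raw: str) -> Optional[str]:
    """Strip punctuation and an optional +55 prefix; return digits or None."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("55") and len(digits) in (12, 13):
        digits = digits[2:]  # drop the country code
    if len(digits) in (10, 11):
        return digits
    return None


def phone_reply(raw: str) -> Tuple[Optional[str], str]:
    """Return the normalized number and the message to send back to the user."""
    phone = normalize_phone(raw)
    if phone is None:
        # Tell the user what is expected instead of a bare "invalid" message.
        return None, f"I could not read that phone number. {PHONE_HINT}"
    return phone, "Thank you!"
```

For example, `phone_reply("31 90000 0000")` accepts the number despite the missing punctuation, while `phone_reply("12345")` returns a message that includes the expected format.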
This incident made us think about the need to have a more adaptive chatbot, where we could use a single application version that would be able to behave differently depending upon conditions. Taking into account users' location or interface used (e.g. website, mobile app, or WhatsApp) would be simple to implement in our bot. However, anything beyond this would demand a more sophisticated technology infrastructure and an interactive approach that initially we had no means of anticipating or planning and implementing in advance. Along with other technological limitations we were identifying "on the fly" and some research discussions we had, we started to aim for an "ideal" evolved technology that would meet this and other more sophisticated requirements, such as NLP capabilities and voice support. Throughout this process, the technical members of the research team experimented with different and more sophisticated chatbot technologies and several parallel experimental versions of ANA were developed. This resulted in the proposal of a reference technology framework that will be presented and discussed in the next section.
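As a rough sketch of what a single adaptive version could look like, the conversation steps can be derived from a deployment profile rather than from separate copies of the chatbot. The site names, the step names and their order, and the `DeploymentProfile` structure are all hypothetical. The only detail taken from the text is that the telehealth center's website version did not request phone numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DeploymentProfile:
    """Hypothetical per-site, per-interface settings for one chatbot."""
    site: str
    channel: str
    ask_phone_number: bool


# Illustrative profiles only; "city_a" is a placeholder site name.
PROFILES: Dict[Tuple[str, str], DeploymentProfile] = {
    ("telehealth_center", "website"): DeploymentProfile(
        "telehealth_center", "website", ask_phone_number=False
    ),
    ("city_a", "mobile_app"): DeploymentProfile(
        "city_a", "mobile_app", ask_phone_number=True
    ),
}
DEFAULT_PROFILE = DeploymentProfile("default", "website", ask_phone_number=True)


def steps_for(site: str, channel: str) -> List[str]:
    """Return the ordered conversation steps for a given site and channel."""
    profile = PROFILES.get((site, channel), DEFAULT_PROFILE)
    steps = ["greeting", "screening_session"]
    if profile.ask_phone_number:
        steps.append("ask_phone_number")
    steps += ["educational_session", "satisfaction_rating"]
    return steps
```

With this arrangement, an inspection of any one profile exercises the same code as the others, and site-specific steps such as the phone number request are visible in one place.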
Keeping up with the latest news. Another important consideration was the constant need to perform quick COVID-19 content updates, a condition that is inherent to the pandemic context. We faced several situations in which this was the case. First, as the entire world learned more about COVID-19 and how to cope with it, knowledge and information about the disease evolved and changed constantly and fast. Furthermore, COVID-related misinformation and the "infodemic" [17] were phenomena that could not be ignored and frequently demanded the inclusion of information particularly targeted at fighting the latest fake news. An immediate consequence for our technology was that we needed to keep it up to date by frequently reviewing the information, especially regarding the educational session. Besides the challenges and possible instabilities caused by frequent changes in the technology, the very process of reviewing scientific information is a complex one, as every researcher knows. In addition, the information had to be adapted linguistically so that it could be understood by the general public.
Second, as authorities struggled to manage the pandemic, their recommendations kept changing along the way, being influenced by a variety of factors such as the infection curve, location, private and public pressures, and even their political and ideological inclinations. At first, our solution to shield the project from these variations was to recommend that users seek more information on 136, the official Ministry of Health telephone hotline dedicated to COVID-19. At some point, however, the government shut down the service, and all versions needed to be updated to remove this recommendation. Several times, depending on the infection rate and healthcare service overload, the city, state and federal administrations changed the official guidelines for social distancing and lockdown, which oftentimes resulted in conflicts between them. There was a clear need for more dynamic management of information, but no infrastructure was specifically available for that purpose. Moreover, keeping information up to date is not a trivial task. It demands, at least, the constant monitoring of media and news outlets as well as the channels of the official health and administrative authorities. In our case, this was accomplished through the cooperative efforts of technologists, physicians and linguists. In other words, the need for precise, clear, up-to-date and easily accessible information demanded an intense multidisciplinary collaboration, not without challenges, as we will describe next.
The multiple challenges of multidisciplinarity. Regarding the question-answer pairs in the educational session, there were two pressing matters. First, there was the need to adapt the language of the questions and answers so that they were as clear and as direct as possible to the general population. Second, there was the need to keep the question-answer pairs up to date both in terms of content and relevance (with time, some issues were less sought, while new queries arose).
As previously stated, the initial set of 75 questions was answered by the team of medical researchers. Their answers were long, detailed and technical. This posed several problems. In terms of the technology, long answers hindered the interactive aspect of the chatbot and could discourage its use. In terms of language, long and technical answers are difficult for a significant part of the population (e.g. low literacy groups) to process and understand.
In order to solve this problem, it was decided that the team of applied linguists would revise the initial document with the question-answer pairs and adapt the language so that it would be less technical, clearer and more colloquial, resulting in a text that was more attuned to the interactive aspect of the chatbot (emulating a conversation) and the profile of its target users. It was also decided that they would perform this process of revising and adapting the language every time the question-answer pairs needed to be updated and/or expanded. The process would be as follows: the medical team would answer the questions first, then the linguists would revise and adapt the language of the question-answer pairs, which would then return to the medical team for approval. The process allowed for generating question-answer pairs that were both clear and accurate. The content of these questions and answers was maintained in a shared spreadsheet file. The decision to use this technology was made at the beginning of the project and took into consideration that it was well known and easy to use for the whole team. Nonetheless, as the number of questions increased and the updates grew more dynamic, the team felt the overhead of coordinating the changes.
Regarding the relevance of the question-answer pairs, our team of medical researchers and linguists decided to identify frequently asked questions from comments left on YouTube streaming videos in which prestigious Brazilian physicians discussed the disease. This decision gave us access to a wider range of disease-related questions and doubts and enabled us to cover more ground in terms of what people wanted to know about it. Our team of applied linguists then developed a methodology for extracting and classifying the questions obtained through the comments section on YouTube. The process of answering these questions followed the same path described in the previous paragraph.
Research and action or Action and research? Most of the situations reported above emerged as a result of the need to meet both practical and research goals, a typical action research scenario. The dual character of the project was that of serving the population and improving the healthcare service on the practical side, as well as producing knowledge in the field of healthcare informatics, or technology applied to healthcare, on the research side. On the action side, we have delivered a working solution that was used by approximately 3,500 people from April 2020 to January 2021, peaking at almost 1,000 users in December 2020, roughly following the trend of new confirmed COVID-19 cases in the state. TeleCOVID has received more than 44,000 messages, and sent more than 90,000 messages to people in the cities it serves and their surroundings.
On the research side, we have tracked situations and problems that directed or influenced the course of our actions (the telehealth intervention), as herein reported. This report consists of an observation of our own work process and generated products. In the end, it has been a reflective exercise, and next we discuss the lessons learned from our reflections.

Reflecting Upon the First Cycle
The last step in our action-research cycle requires us to review and organize what we have learned during our first cycle, what Susman & Evered called specifying learning [33]. This is a way to both organize our learning and prepare for the next cycle. Our reflections consist of a critical look at our work, that included the situations reported above, as well as others not included for the sake of space. They were collaboratively developed and refined both internally, among our team members, and externally, among other researchers from the Medical, Linguistics, and Computer Science fields each of us has had contact with. They will be discussed in the next section.

Lessons Learned
In this section we present the lessons learned through our research. To do so, we articulate and discuss the results generated over the whole action-research cycle. We have organized our lessons learned into two main contributions with regard to two critical aspects of our experience: the process required for a multidisciplinary team to build the TeleCOVID chatbot, and the basic framework in which the chatbot was built.

A Multidisciplinary Process for Building and Maintaining the TeleCOVID Chatbot
As a first overall result of our first action-research cycle we have articulated (see Fig. 5) the process we have created for working together in the development of the TeleCOVID chatbot, how the work of our multidisciplinary team is interconnectedly organized, as well as the data we collected and analyzed.
In Fig. 5, reading from right to left, first we have our team of Human-Computer Interaction (HCI) researchers who are involved in every aspect of the technological development of the chatbot, from the implementation to the evaluation of how users interact and perceive the technology. The HCI team also developed some parallel experimental versions of ANA for testing so that we could work on the improvements without disrupting the functioning of the version that was already operating and integrated with the telehealth center.
Following the assignments of the HCI team, the data scientists proceed with the collection and analysis of data. We collect data both from the chatbot log itself and from social media. From the chatbot, we collect the logs of each user's conversation, covering both the screening session diagnosis and the question-and-answer topics that were most queried by the users. From social media, we collect comments left on videos posted by Brazilian physicians about COVID-19. We are mainly interested in the questions asked in the comments section, which we analyze in search of new questions and topics to include in the chatbot's educational session. These questions will also be used as training phrases to help improve the chatbot's accuracy in recognizing variations of the same question when we include NLP capabilities in it.
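To give a concrete idea of this kind of analysis, the sketch below pulls question sentences out of free-text comments and counts how often each one recurs. It is a deliberately naive stand-in and not the methodology developed by our applied linguists, which is not detailed here. The function names and the rule of treating any sentence ending in "?" as a question are assumptions.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple


def extract_questions(comment: str) -> List[str]:
    """Return the sentences of a comment that end with a question mark."""
    sentences = re.findall(r"[^.!?\n]+[.!?]", comment)
    return [s.strip() for s in sentences if s.strip().endswith("?")]


def normalize(question: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    question = re.sub(r"[^\w\s]", "", question.lower())
    return re.sub(r"\s+", " ", question).strip()


def most_frequent_questions(
    comments: Iterable[str], top_n: int = 10
) -> List[Tuple[str, int]]:
    """Count normalized questions across all comments, most frequent first."""
    counts = Counter(
        normalize(q) for c in comments for q in extract_questions(c)
    )
    return counts.most_common(top_n)
```

Exact-match counting after normalization only groups questions with identical wording. Grouping paraphrases of the same question would require the NLP capabilities mentioned above.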
Our team of applied linguists is responsible for the pre-analysis of the data that will be used to extend and update the chatbot's educational session. Currently, this entails the analysis of the most frequent questions identified both in the chatbot and in social media and the process of selecting and updating the question-answer pairs that are part of the educational session. This content is curated by both our team of linguists and our team of medical researchers. Together, they select which pairs of questions and answers need to be updated and which new questions will be answered and included in the educational session. The process of updating the answers and creating new ones is done by our team of medical researchers in a process we call "text preparation". The updated version of the question-answer pairs, including the new ones added, is then revised by our team of linguists. This "text revision" entails a process of linguistic adaptation of the wording of questions and answers in order to simplify the language so that it is less technical and more colloquial and accessible to low literacy users. At the end of this process, the text returns to our team of medical researchers for approval. It is then uploaded to our chatbot so that it can be tested and evaluated.
The process is concluded after adjustments emerging after each test and evaluation are implemented and uploaded to the version of the chatbot integrated with the telehealth center.
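The content workflow described above can be summarized as a small state machine, which is one way of making the current stage of each question-answer pair explicit instead of tracking it informally in a shared spreadsheet. The sketch below is only an illustration under that assumption. The stage names follow the text, while `QAPair`, `advance` and the strictly linear order are hypothetical simplifications that ignore, for instance, texts being sent back for rework.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    TEXT_PREPARATION = "text preparation (medical researchers)"
    TEXT_REVISION = "text revision (linguists)"
    MEDICAL_APPROVAL = "approval (medical researchers)"
    TEST_AND_EVALUATION = "test and evaluation"
    PUBLISHED = "uploaded to the integrated chatbot"


# Assumed linear order of stages, taken from the description in the text.
NEXT_STAGE = {
    Stage.TEXT_PREPARATION: Stage.TEXT_REVISION,
    Stage.TEXT_REVISION: Stage.MEDICAL_APPROVAL,
    Stage.MEDICAL_APPROVAL: Stage.TEST_AND_EVALUATION,
    Stage.TEST_AND_EVALUATION: Stage.PUBLISHED,
}


@dataclass
class QAPair:
    question: str
    answer: str
    stage: Stage = Stage.TEXT_PREPARATION


def advance(pair: QAPair) -> QAPair:
    """Move a question-answer pair to the next stage of the workflow."""
    if pair.stage is Stage.PUBLISHED:
        raise ValueError("pair is already published")
    pair.stage = NEXT_STAGE[pair.stage]
    return pair
```

A pair created with `QAPair("question text", "answer text")` starts in text preparation and reaches the published stage after four calls to `advance`.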

A Reference Framework for the Next TeleCOVID Chatbot Versions
Our virtual assistant ANA has so far been able to relieve the healthcare system of the target locations by screening suspected COVID-19 cases and clarifying questions about the disease and the pandemic. However, the current version of the artificial agent is far from leveraging the full potential of chatbot technology. Although ANA is a messaging agent, it can only understand a limited number of users' input messages and provide a limited number of responses. Both users' inputs and responses need to be predefined due to the limitations of the platform we use and the basic way we use it, as previously discussed. Although it has served our primary purpose of providing healthcare service, its limitations have ended up generating additional burdens that a more cutting-edge technology could, potentially, eliminate or minimize.
We envision some main points that existing technology can help to address. In Fig. 6, we depict a reference framework generated as a result of our research, aimed at illustrating the technological components of a better equipped and improved TeleCOVID chatbot. On the left, we can see a component of user interfaces, in the plural, to emphasize that, ideally, we want a single chatbot application that users can access through different interfaces, such as a website, a mobile app or a standard messenger app, and using different input modes, such as text and voice (when possible). Users' messages are then first received by a conversation control component that directs the flow of the conversation. This is the role played by the BLiP platform flowchart, which has proved to be very effective in ensuring that users follow a controlled path of interaction, as is necessary for the screening session. However, for the educational session, user interaction can be more flexible and we can leverage the power of natural language processing (NLP) by artificial intelligence. Thus, a natural language engine is desirable at this point. It can act both as a means of interpreting users' questions more flexibly and of handling messages that could not be interpreted at first (e.g. by simple pattern matching) or were unexpected at a certain point in the conversation. The other components have to do with the challenges we have faced in maintaining an updated database of questions and answers. Instead of hard coding them in the chatbot configuration, a better solution is probably to work with a knowledge base, depicted at the bottom right of the picture. This knowledge base is queried by means of a search engine, where additional intelligence might be leveraged in order to address contextual or even individual variation.
For instance, changes in the authorities' recommendations over time or a piece of news in a certain region could be addressed by the use of recommendation system techniques (e.g. [41]). This leads us to the need for a database of users and contexts, where we could store contextually relevant facts and characteristic user profiles (e.g. context and user models), based on which we can provide different answers with a single chatbot version. The two databases would be fed by our interactive process as discussed before. Finally, the conversation logs component would also serve as an information source for this analytical process of curation and preparation of the databases serving the search engine.
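The interplay between the knowledge base, the search engine and the context database can be pictured with the minimal sketch below. It is not an implementation of the framework in Fig. 6: the word-overlap (Jaccard) matching is a simple stand-in for a real search or NLP engine, and the `Entry` structure, the `region` key and the placeholder answers are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Entry:
    """One knowledge-base item with optional context-specific answers."""
    question: str
    default_answer: str
    answers_by_region: Dict[str, str] = field(default_factory=dict)


def tokens(text: str) -> Set[str]:
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def search(
    kb: List[Entry],
    user_message: str,
    region: Optional[str] = None,
    min_score: float = 0.5,
) -> Optional[str]:
    """Return the best answer for a message, adapted to the user's region.

    Returns None when nothing matches well enough, so the caller can fall
    back to another strategy (e.g. a natural language engine or a human).
    """
    query = tokens(user_message)
    best, best_score = None, 0.0
    for entry in kb:
        candidate = tokens(entry.question)
        union = query | candidate
        score = len(query & candidate) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = entry, score
    if best is None or best_score < min_score:
        return None
    return best.answers_by_region.get(region, best.default_answer)


# Placeholder content only; "city_a" is a hypothetical region key.
KB = [
    Entry(
        question="Where can I get more information?",
        default_answer="Please check your health authority's official channels.",
        answers_by_region={"city_a": "Please contact the city_a health department."},
    ),
]
```

Because the answers live in data rather than in the chatbot configuration, a change in a regional recommendation becomes an update to one entry, which the content team could make without touching the conversation flow.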
It is worth noting that our framework differs in level and goal from others found in the literature, e.g. Adamopoulou & Moussiades's general chatbot architecture in [8]. Usually, those tend to be schemes for general-purpose chatbots and technologically oriented pictures. Our main goal here is to provide a socio-technical framework to support our work process, address the problems we faced, and provide a more adaptive interaction where we see it is needed, leveraging what worked well in our practice and in this application-specific chatbot. Furthermore, although we tried other chatbot platforms available in the market, we have provisionally concluded that the current one can support our needs if integration with NLP and a custom search engine is added. This is possible with a reasonable development effort and is desirable as a strategy not to disrupt the service. The external databases (the knowledge base and the user and context models) should provide a more efficient way to support the dynamic maintenance of information by the team, decoupled from the more technically sensitive core chatbot technology that only the technologists of the team can manage.

Limitations
While it could be argued that some or even most of the problems we faced might have been avoided, or that situations could have been better addressed, it is important to consider that the pandemic created a unique, novel and time-pressing situation and no "manual" or guide was available at the time. The value of the project lies precisely in that, in what we could observe and were able to learn as we reflect back on it.
We do not mean to contend that our results and findings, organized as contributions in the section above, can be generalized. Generalizability is not considered to be an appropriate criterion of quality and validity for action research; rather, transferability, that is, the ability to transfer the findings from one context to another, is a more appropriate one [34], [36]. In this sense, our findings might be useful to others facing the task of having to develop a healthcare chatbot in a similar context. However, as put by Lincoln & Guba [42] (p. 298) as cited by Herr & Anderson [36] (p. 61-62), "the burden of proof [that the findings are transferable to a particular context] lies less with the original investigator than with the person seeking to make an application elsewhere". We cannot anticipate the sites and contexts where transferability might be sought, but we have tried to provide a sufficient description of our context in order to make such similarity judgements possible, as suggested by Lincoln & Guba.
Finally, we should note that our contributions, as presented in the previous section, are rather limited because, although grounded in our first cycle experience, they have not yet been submitted to the test of practice. Action research is primarily an interactive process and its cyclic nature is critical to improve its quality [43]. At the end of a first cycle, what we learned is certainly not complete and proven knowledge. We do contend, however, that these findings are useful to us, since they will be applied in our next research cycles as the project continues.

Conclusion and Future Work
We have described the design of a novel chatbot as a prompt response to the COVID-19 pandemic, performing triage of patients and providing information about COVID-19 on a large scale and without human contact. We reflected on the development process and discussed lessons learned and recommendations to support a multidisciplinary development and evolution process of the chatbot, and we also identified some interactive and technological features that can be used as a reference framework for this kind of technology.
The next steps in our research include a formal and thorough assessment of the chatbot that will allow us to evaluate its usability and user experience. In particular, for the screening session we plan to analyze how precise the information about symptoms that the chatbot collects from users is, compared to the information these users provide in person to the doctor. The results of these evaluations will also be used for improving the chatbot and guiding the steps in the next cycles of our action research.