Amid the universal hype around the coronavirus, I decided to do at least something useful (though no less hyped). In this article I'll show how to create and deploy a Telegram Bot that answers FAQ questions using rule-based NLP methods in 2.5 hours (that is how long it took me), with COVID-19 as the example. Along the way we will use good old Python, the Telegram API, a couple of standard NLP libraries, and Docker.

Brief Preface
This article describes how to create a simple Telegram Bot that answers FAQ questions about COVID-19. The technique is extremely simple and versatile, and can be reused for any other case. I make no claim to State of the Art; I only offer a simple and effective solution that can be reused.

I assume the reader already has some experience with Python, has Python 3.X and the necessary development tools (PyCharm, VS Code) installed, and knows how to create a Bot in Telegram via BotFather, so I will skip these things.

1. Configure API
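The first thing to install is python-telegram-bot, a wrapper library for the Telegram API. The code in this article uses the Updater, Filters and use_context=True interface from versions 12 and 13 of the library; version 20 changed that interface, so pin the version when installing:

pip install "python-telegram-bot<20" --upgrade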
Next, we'll build the skeleton of our small program by defining handlers for the following Bot events:
- start - the Bot's launch command;
- help - the help command;
- message - processing of a text message;
- error - an error.
The handler signatures look like this:

def start(update, context):
    pass

def help(update, context):
    pass

def message(update, context):
    pass

def error(update, context):
    pass
Next, following the example in the library documentation, we define the main function, which registers all these handlers and starts the bot:

from telegram.ext import Updater, CommandHandler, MessageHandler, Filters

def main():
    """Start the bot."""
    # Replace "Token" with the token issued by BotFather
    updater = Updater("Token", use_context=True)
    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start))
    dp.add_handler(CommandHandler("help", help))
    dp.add_handler(MessageHandler(Filters.text, message))
    dp.add_error_handler(error)
    # Start polling and block until the process is interrupted
    updater.start_polling()
    updater.idle()

if __name__ == "__main__":
    main()
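The token does not have to be hardcoded in the Updater call. Below is a minimal sketch of reading it from an environment variable instead; the variable name BOT_TOKEN and the helper load_token are assumptions made for illustration, not part of the original project.

```python
import os


def load_token(env=os.environ, name="BOT_TOKEN"):
    """Return the bot token stored in the environment variable `name`.

    BOT_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    token = env.get(name, "").strip()
    if not token:
        # Fail early with a clear message instead of starting with a bad token
        raise RuntimeError(f"Set the {name} environment variable to your bot token")
    return token
```

With this helper, the Updater could be created as Updater(load_token(), use_context=True), and the token stays out of the source file.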
Note that there are two ways to launch a bot:
- Standard Polling - the bot periodically asks the Telegram API for new events (updater.start_polling());
- Webhook - we run our own server with an endpoint to which Telegram delivers the bot's events; this requires HTTPS.
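To make the polling option more concrete, here is a rough sketch of what a polling loop does against the Bot API's getUpdates method, written with the standard library only. It is not how python-telegram-bot implements start_polling(); the function names fetch_updates, poll_once and run are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.telegram.org/bot{token}/getUpdates"


def fetch_updates(token, offset, timeout=30):
    """Call getUpdates once and return the list of update dicts."""
    query = urllib.parse.urlencode({"offset": offset, "timeout": timeout})
    url = API_URL.format(token=token) + "?" + query
    with urllib.request.urlopen(url, timeout=timeout + 5) as response:
        return json.load(response)["result"]


def poll_once(fetch, offset, handle):
    """Fetch one batch, pass each update to `handle`, return the next offset."""
    for update in fetch(offset):
        handle(update)
        # Asking for update_id + 1 next time marks this update as processed
        offset = update["update_id"] + 1
    return offset


def run(token, handle):
    """Poll forever, handing every new update to `handle`."""
    offset = 0
    while True:
        offset = poll_once(lambda o: fetch_updates(token, o), offset, handle)
```

Keeping the network call in fetch_updates and the bookkeeping in poll_once means the offset logic can be tried with a fake fetch function and no network at all.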
As you have already noticed, we use standard Polling for simplicity.

2. Fill in the standard handlers with logic
Let's start simple and fill in the start and help handlers with canned replies (ParseMode requires import telegram). We get something like this:

def start(update, context):
    """Send a message when the command /start is issued."""
    update.message.reply_text("""
!
COVID-19.
:
- * ?*
- * ?*
- * ?*
..
!
""", parse_mode=telegram.ParseMode.MARKDOWN)

def help(update, context):
    """Send a message when the command /help is issued."""
    update.message.reply_text("""
( COVID-19).
:
- * ?*
- * ?*
- * ?*
..
!
""", parse_mode=telegram.ParseMode.MARKDOWN)
Now, when the user sends the /start or /help command, they receive the reply we defined. Note that the text is formatted as Markdown, which is why we pass parse_mode=telegram.ParseMode.MARKDOWN.
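Next, add error logging to the error handler (the logger comes from the standard logging module):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def error(update, context):
    """Log Errors caused by Updates."""
    logger.warning('Update "%s" caused error "%s"', update, context.error)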
Now let's check whether our Bot works. Copy all the code written so far into a single file, for example app.py, and add the necessary imports. Run the file and go to Telegram (do not forget to insert your Token into the code). Send the /start and /help commands and rejoice.
3. Process the message and generate a response
The first thing we need in order to answer questions is a Knowledge Base. The simplest option is a JSON file of Key-Value pairs, where the Key is the text of an expected question and the Value is the answer to it. Knowledge Base example:

{
" ?": " — . - , , , . , , . , .",
" ?": " :\n \n \n \n \n\n , .",
" ?": " :\n- ( , , )\n- ( )"
}
The algorithm for answering a question is as follows:
- Get the text of the question from the user;
- Lemmatize all the words in the user's text;
- Fuzzily compare the resulting text with every lemmatized question from the knowledge base (Levenshtein distance);
- Select the most similar question from the knowledge base;
- Send the answer to the selected question to the user.
To implement this, we need two libraries: fuzzywuzzy (for fuzzy comparison) and pymorphy2 (for lemmatization). Create a new file (the Dockerfile in section 4 copies it as reply_generator.py) and implement the algorithm described above:

import json

from fuzzywuzzy import fuzz
import pymorphy2

morph = pymorphy2.MorphAnalyzer()

# The knowledge base contains Cyrillic text, so read it explicitly as UTF-8
with open("faq.json", encoding="utf-8") as json_file:
    faq = json.load(json_file)

def classify_question(text):
    # Lemmatize the user's text, taking the first parse of each word
    text = ' '.join(morph.parse(word)[0].normal_form for word in text.split())
    questions = list(faq.keys())
    scores = list()
    for question in questions:
        norm_question = ' '.join(morph.parse(word)[0].normal_form for word in question.split())
        # token_sort_ratio ignores word order and returns a score from 0 to 100
        scores.append(fuzz.token_sort_ratio(norm_question.lower(), text.lower()))
    # Return the answer to the highest-scoring question
    answer = faq[questions[scores.index(max(scores))]]
    return answer
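The matching idea can be tried without installing anything. The sketch below is a stand-in, not the code above: difflib.SequenceMatcher from the standard library replaces fuzzywuzzy, lowercasing and punctuation stripping replace pymorphy2 lemmatization, and the English FAQ entries are placeholders invented for illustration. The min_score parameter is also an assumption; the original always returns the best match, however weak.

```python
from difflib import SequenceMatcher

# Placeholder knowledge base, invented for this sketch
FAQ = {
    "what are the symptoms": "Answer A (placeholder).",
    "how is the virus transmitted": "Answer B (placeholder).",
    "how can I protect myself": "Answer C (placeholder).",
}


def normalize(text):
    """Lowercase, drop punctuation and sort the tokens (order-insensitive)."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Return a 0-100 similarity score between two normalized texts."""
    return round(100 * SequenceMatcher(None, normalize(a), normalize(b)).ratio())


def best_match(text, faq, min_score=0):
    """Return (answer, score) for the most similar question, or (None, score)."""
    score, question = max((similarity(text, q), q) for q in faq)
    if score < min_score:
        # Hypothetical extension: refuse to answer when nothing is close enough
        return None, score
    return faq[question], score
```

For example, best_match("What are the symptoms?", FAQ) picks the first entry, because the normalized texts are identical. Setting min_score lets the bot reply with a fallback message instead of an unrelated answer.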
Before writing the message handler, we write a function that saves the conversation history to a TSV file:

def dump_data(user, question, answer):
    # One tab-separated row per exchange; names avoid shadowing the builtins id and str
    row = "{username}\t{full_name}\t{user_id}\t{question}\t{answer}\n".format(
        username=user.username,
        full_name=user.full_name,
        user_id=user.id,
        question=question,
        answer=answer,
    )
    # /data is the directory mounted as a volume in docker-compose.yml
    with open("/data/dump.tsv", "a", encoding="utf-8") as log_file:
        log_file.write(row)
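One caveat with plain string formatting: a question or answer that itself contains a tab or a newline splits the row. A possible alternative, sketched here with the standard csv module, quotes such fields automatically. The helper name append_row is hypothetical.

```python
import csv
import io


def append_row(fileobj, row):
    """Write one tab-separated row, quoting fields that contain tabs or newlines."""
    writer = csv.writer(fileobj, delimiter="\t")
    writer.writerow(row)


# Usage: an in-memory buffer stands in for the log file
buffer = io.StringIO(newline="")
append_row(buffer, ["alice", "Alice A.", "42", "first line\nsecond\tline", "answer"])

# Reading the buffer back returns the original fields intact
rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline=""), delimiter="\t"))
```

When writing to a real file, the csv documentation recommends opening it with newline="" so that line endings inside quoted fields are preserved.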
Now use the functions we wrote in the message handler for text messages:

def message(update, context):
    """Answer the user message."""
    answer = classify_question(update.message.text)
    dump_data(update.message.from_user, update.message.text, answer)
    update.message.reply_text(answer)
Voila! Now go to Telegram and try writing to the bot.
4. Configure Docker and deploy the application
As the saying goes, if you are going to do something, do it beautifully. To do things properly, we'll set up containerization with Docker Compose. For this we need to:
- Create a Dockerfile, which defines the container image and the entry point;
- Create docker-compose.yml, which can launch many containers from a single Dockerfile (not necessary in our case, but useful when you have many services);
- Create boot.sh, the script that actually launches the application.
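So, the contents of the Dockerfile:

# Base image
FROM python:3.6.6-slim
# Working directory inside the container
WORKDIR /home/alex/covid-bot
# Copy requirements.txt
COPY requirements.txt ./
# Install required libs (a failed install now fails the build instead of being masked by "exit 0")
RUN pip install --upgrade pip -r requirements.txt
# Copy the data directory
COPY data data
# Copy the application files
COPY app.py faq.json reply_generator.py boot.sh ./
# Clean up package caches and temporary files
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Make the launch script executable
RUN chmod +x boot.sh
# Entry point
ENTRYPOINT ["./boot.sh"]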
The contents of docker-compose.yml:

# docker-compose file format version
version: '2'
# Service definitions
services:
  bot:
    restart: unless-stopped
    image: covid19_rus_bot:latest
    container_name: covid19_rus_bot
    # Environment variable read by boot.sh
    environment:
      - SERVICE_TYPE=covid19_rus_bot
    # Mount the host ./data directory as /data in the container
    volumes:
      - ./data:/data
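The contents of boot.sh:

#!/bin/bash
# Quote the variable: an unquoted empty value would make [ -n ] succeed
if [ -n "$SERVICE_TYPE" ]
then
  if [ "$SERVICE_TYPE" == "covid19_rus_bot" ]
  then
    # exec replaces the shell with the Python process
    exec python app.py
  fi
else
  echo -e "SERVICE_TYPE not set\n"
fi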
Everything is ready. To start the bot, run the following commands in the project folder:

sudo docker build -t covid19_rus_bot:latest .
sudo docker-compose up
That's it, our bot is ready.

Instead of a conclusion
As expected, all the code is available in the repository.

The approach shown here can be applied to any FAQ use case: just customize the knowledge base. The knowledge base itself can also be improved by changing Key and Value into arrays, so that each pair holds an array of possible questions on one topic and an array of possible answers (for variety, pick an answer at random). Naturally, the rule-based approach does not scale very flexibly, but I am sure it can handle a knowledge base of about 500 questions.

If you have read to the end, I invite you to try my Bot here.
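The array-based knowledge base suggested in the conclusion could look roughly like the sketch below. The entry layout (the "questions" and "answers" keys), the function names, and the English placeholder texts are all assumptions made for illustration; difflib again stands in for the fuzzywuzzy and pymorphy2 pipeline used in the article.

```python
import random
from difflib import SequenceMatcher

# Each entry groups several phrasings of one question with several answers
KNOWLEDGE_BASE = [
    {
        "questions": ["what are the symptoms", "how do I know I am sick"],
        "answers": ["Placeholder answer A1.", "Placeholder answer A2."],
    },
    {
        "questions": ["how is the virus transmitted", "how does it spread"],
        "answers": ["Placeholder answer B1."],
    },
]


def score(a, b):
    """Similarity between two texts as a float from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_answer(text, knowledge_base, rng=random):
    """Find the entry whose closest phrasing matches best, then vary the answer."""
    best = max(
        knowledge_base,
        key=lambda entry: max(score(text, q) for q in entry["questions"]),
    )
    return rng.choice(best["answers"])
```

An entry is ranked by its best-matching phrasing, so adding more phrasings to a topic can only help that topic match, and random.choice provides the variety of answers.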