- Use BotFather to register your bot
- Keep the token for step 4 below
- Set the environment variable TELEGRAM_TOKEN to your bot token
- Run telegram_wrap.py
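How telegram_wrap.py reads the token is not shown here, so the following is only a minimal sketch of one way a script could pick up `TELEGRAM_TOKEN` and fail early when it is missing. The helper name `load_telegram_token` is hypothetical. The shape check relies on BotFather tokens having the form `<numeric bot id>:<secret>`.

```python
import os


def load_telegram_token(env=os.environ):
    """Return the bot token from TELEGRAM_TOKEN, or raise if it is unusable."""
    token = env.get("TELEGRAM_TOKEN", "").strip()
    if not token:
        raise RuntimeError("TELEGRAM_TOKEN is not set; set it before running the bot")
    # BotFather tokens look like "<numeric bot id>:<secret>".
    bot_id, sep, secret = token.partition(":")
    if not (sep and bot_id.isdigit() and secret):
        raise RuntimeError("TELEGRAM_TOKEN does not look like a BotFather token")
    return token
```

Calling `load_telegram_token()` at startup turns a forgotten environment variable into an immediate, readable error instead of a failed API call later on.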
The base Azure Functions image does not contain the Chromium packages needed to run Selenium WebDriver. This project builds a custom Docker image with the required libraries so that it can run as an Azure Function.
- For more details, see blog https://towardsdatascience.com/how-to-create-a-selenium-web-scraper-in-azure-functions-f156fd074503
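To confirm that an image really contains what Selenium needs, a quick check can be run inside the container. This is a rough sketch rather than part of the project: the chromedriver path is the one used in the function code in this README, while the helper name and the candidate browser executable names are assumptions that may not match your image.

```python
import os
import shutil


def missing_browser_binaries(
    chromedriver_path="/usr/local/bin/chromedriver",
    # Assumed executable names; adjust to whatever your image installs.
    browser_names=("chromium", "chromium-browser", "google-chrome"),
):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if not (os.path.isfile(chromedriver_path) and os.access(chromedriver_path, os.X_OK)):
        problems.append(f"chromedriver not found at {chromedriver_path}")
    if not any(shutil.which(name) for name in browser_names):
        problems.append("no Chromium/Chrome executable found on PATH")
    return problems
```

An empty result from `missing_browser_binaries()` suggests the image is ready for Selenium. On the unmodified base image, both problems would be expected.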
- Docker Desktop
- Azure Container Registry
- Azure CLI
- Azure Functions Core Tools version 2.x
- (optional) Visual Studio Code
Run the following commands to install Chromium, ChromeDriver, and Selenium on top of the Azure Functions base image:
$acr_id = "<<your acr>>.azurecr.io"
docker login $acr_id -u <<your username>> -p <<your password>>
docker build --tag $acr_id/selenium .
docker push $acr_id/selenium:latest
Run the following commands:
$rg = "<<your resource group name>>"
$loc = "<<your location>>"
$plan = "<<your azure function plan P1v2>>"
$stor = "<<your storage account adhering to function>>"
$fun = "<<your azure function name>>"
$acr_id = "<<your acr>>.azurecr.io"
az group create -n $rg -l $loc
az storage account create -n $stor -g $rg --sku Standard_LRS
az appservice plan create --name $plan --resource-group $rg --sku P1v2 --is-linux
az functionapp create --resource-group $rg --os-type Linux --plan $plan --deployment-container-image-name $acr_id/selenium:latest --name $fun --storage-account $stor
docker build --tag $acr_id/selenium . && docker push $acr_id/selenium:latest && az functionapp config container set -n $fun -g $rg --docker-custom-image-name $acr_id/selenium:latest
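The `$stor` placeholder has to satisfy Azure's storage account naming rules: 3 to 24 characters, lowercase letters and digits only, and globally unique. The sketch below checks the format locally before you run `az storage account create`. The helper name is hypothetical, and global uniqueness cannot be verified this way.

```python
import re

# Azure storage account names: 3-24 characters, lowercase letters and digits only.
_STORAGE_NAME = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name):
    """Check only the format of the name; uniqueness is decided by Azure."""
    return _STORAGE_NAME.fullmatch(name) is not None
```

For example, `is_valid_storage_account_name("My-Storage")` is false because of the uppercase letters and the hyphen.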
- In the Azure portal, open the Function, click Copy Function Url, and copy the URL
curl -F "url=<< Azure Function Url >>" https://api.telegram.org/bot<< Your token >>/setWebhook
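The same call can be made from Python with only the standard library. This is a sketch, not part of the project: the function names are hypothetical, and it sends the `url` parameter form-urlencoded rather than as multipart form data as `curl -F` does. The Bot API accepts both.

```python
import urllib.parse
import urllib.request


def build_set_webhook_request(token, function_url):
    """Build (but do not send) a POST request for the Bot API setWebhook method."""
    api_url = f"https://api.telegram.org/bot{token}/setWebhook"
    body = urllib.parse.urlencode({"url": function_url}).encode("utf-8")
    return urllib.request.Request(api_url, data=body, method="POST")


def set_webhook(token, function_url, timeout=10):
    """Send the request and return the raw JSON response text."""
    request = build_set_webhook_request(token, function_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `set_webhook(token, function_url)` with your bot token and the copied Function URL registers the webhook. Building the request separately makes it easy to inspect before anything is sent.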
Every time you run telegram_wrap.py, you unset the webhook, so you'll need to rerun the setWebhook command above after every local run.
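To see whether a local run has cleared the webhook, the Bot API `getWebhookInfo` method reports the currently registered URL, which is an empty string when no webhook is set. The sketch below interprets that response. `webhook_is_set` is a hypothetical helper, and fetching the response body from `https://api.telegram.org/bot<< Your token >>/getWebhookInfo` is left to the caller.

```python
import json


def webhook_is_set(get_webhook_info_body, expected_url):
    """Return True if the getWebhookInfo response reports expected_url."""
    info = json.loads(get_webhook_info_body)
    if not info.get("ok"):
        raise ValueError(f"Bot API returned an error: {info.get('description')}")
    # "url" is an empty string when no webhook is registered.
    return info["result"].get("url", "") == expected_url
```

If this returns false after a local run, the webhook needs to be registered again.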
Test the Function in the portal or in your browser. The following code in __init__.py returns all links on the web page it opens:
import azure.functions as func
from selenium import webdriver


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Flags for running Chrome headless inside a container
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')

    # chromedriver location inside the custom image
    driver = webdriver.Chrome("/usr/local/bin/chromedriver", chrome_options=chrome_options)
    driver.get('http://www.ubuntu.com/')
    # Collect every anchor element on the page
    links = driver.find_elements_by_tag_name("a")