This is a fork of the original housing scraper, adapted to the Colombian market.
This was tested with Python 3.11.
To install dependencies:
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
brew install openssl
export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CPPFLAGS="-I/usr/local/opt/openssl/include"
export PKG_CONFIG_PATH="/usr/local/opt/openssl/lib/pkgconfig"
sqlite properties.db init.db
There's a configuration.sample.yml that you can use as a template for your configuration. Copy that file to a new one in the root folder and name it configuration.yml.
You need to configure two aspects of the script: the listing providers and the notifier.
For the notifier you need to create a Telegram bot first.
Creating the bot will give you an authorization token. Save it; you'll need it later.
A bot can't start a conversation with you, so you have two options: talk to it first, which allows it to reply to you, or add it to a group. Whichever option you choose, you need to get the chat_id of either your account or the group.
After you've done either of the above, run this little script to find the chat_id (replace the placeholder with your authorization token):
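# Assumes the synchronous API of python-telegram-bot (versions before 20).
import telegram

MY_TOKEN = '<insert your telegram token>'
bot = telegram.Bot(token=MY_TOKEN)
# Print the chat id of every pending update that carries a message.
print([u.message.chat.id for u in bot.get_updates() if u.message])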
You'll see a list with one element: that's the chat_id you need to save for later. Write it down :-)
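If you'd rather not depend on the telegram package for this one-off lookup, the same idea can be sketched with only the standard library. This assumes the Bot API's getUpdates endpoint and its usual JSON response shape; the function names extract_chat_ids and fetch_chat_ids are illustrative and not part of this project.

```python
import json
import urllib.request


def extract_chat_ids(payload: dict) -> list:
    """Collect the distinct chat ids found in a getUpdates response."""
    chat_ids = []
    for update in payload.get("result", []):
        # Not every update carries a message (e.g. edited messages).
        message = update.get("message") or {}
        chat_id = message.get("chat", {}).get("id")
        if chat_id is not None and chat_id not in chat_ids:
            chat_ids.append(chat_id)
    return chat_ids


def fetch_chat_ids(token: str) -> list:
    """Call getUpdates for the given bot token and return the chat ids."""
    url = f"https://api.telegram.org/bot{token}/getUpdates"
    with urllib.request.urlopen(url, timeout=10) as response:
        return extract_chat_ids(json.load(response))
```

Calling fetch_chat_ids with your token after messaging the bot (or adding it to a group) should return the ids you can use as chat_id. Group ids are typically negative numbers.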
With the authorization token and the chat id you can now configure the notifier. Here's an example:
notifier:
  messages:
    - 'Hey, I have found new properties. Check them out:'
    - 'I hope it is lucky day today:'
  enabled: true
  chat_id: <CHAT_ID>
  token: <TOKEN>
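To make the role of each notifier field concrete, here is a minimal sketch of how a notification could be assembled from this configuration. It is not the project's actual notifier: picking one greeting at random from messages is an assumption, the helper names are hypothetical, and the request targets the Bot API's sendMessage endpoint without being sent.

```python
import random
import urllib.parse
import urllib.request


def build_notification(notifier: dict, links: list) -> str:
    """Pick one greeting from notifier['messages'] and append the links."""
    # Assumption: one of the configured messages is chosen at random.
    greeting = random.choice(notifier["messages"])
    return "\n".join([greeting, *links])


def build_send_request(notifier: dict, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{notifier['token']}/sendMessage"
    data = urllib.parse.urlencode(
        {"chat_id": notifier["chat_id"], "text": text}
    ).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")
```

Passing the prepared request to urllib.request.urlopen would deliver the message. A real notifier would presumably also skip sending when enabled is false.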
One down, one more to go. Now we need to configure the providers. For the sake of simplicity I'll include a sample, which I hope will be good enough:
providers:
  zonaprop:
    base_url: 'https://www.zonaprop.com.ar'
    sources:
      - '/departamentos-alquiler-2-habitaciones.html'
      - '/ph-alquiler-2-habitaciones.html'
  argenprop:
    base_url: 'https://www.argenprop.com'
    sources:
      - '/departamento-alquiler-pais-argentina-2-dormitorios'
      - '/ph-alquiler-pais-argentina-2-dormitorios'
  mercadolibre:
    base_url: 'https://inmuebles.mercadolibre.com.ar'
    sources:
      - '/departamentos/alquiler/2-dormitorios/'
      - '/casas/alquiler/2-dormitorios/'
  properati:
    base_url: 'https://www.properati.com.ar'
    sources:
      - '/departamento/alquiler/ambientes:2'
  inmobusqueda:
    base_url: 'https://www.inmobusqueda.com.ar'
    sources:
      - '/departamento-alquiler-la-plata-casco-urbano.html?cambientes=2.'
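Each provider pairs one base_url with a list of sources, which suggests that every source is a path appended to the base URL. The sketch below illustrates that reading; the listing_urls helper is hypothetical, and the way the real providers build their URLs may differ.

```python
def listing_urls(providers: dict) -> list:
    """Return (provider_name, full_url) pairs for every configured source."""
    urls = []
    for name, settings in providers.items():
        base = settings["base_url"].rstrip("/")
        for source in settings.get("sources", []):
            # Join with exactly one slash, whatever the config used.
            urls.append((name, base + "/" + source.lstrip("/")))
    return urls
```

For the properati entry above this yields https://www.properati.com.ar/departamento/alquiler/ambientes:2, which is handy for checking a search path in the browser before adding it to the configuration.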
If you have issues with SSL certificates, you can disable SSL validation with the attribute disable_ssl. SSL validation is enabled by default.
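As an illustration of what turning off SSL validation means in practice, here is a minimal sketch using the standard library's ssl module. How the project actually reads and applies disable_ssl is not shown in this README, so the make_ssl_context helper is an assumption.

```python
import ssl


def make_ssl_context(disable_ssl: bool = False) -> ssl.SSLContext:
    """Build a TLS context, optionally skipping certificate validation."""
    context = ssl.create_default_context()
    if disable_ssl:
        # Order matters: hostname checking must be off before CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Skipping validation exposes the scraper to tampered responses, so it is best treated as a workaround for a misbehaving site rather than a default.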
One final step: you need to initialize the database. Just run python3 setup.py and that's it. It will create a sqlite3 db file in the root folder.
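For readers curious about what such a database is for, the sketch below shows one way a tracker can remember listings it has already seen, so that only new ones trigger a notification. The table layout and helper names are hypothetical; the schema that setup.py really creates may be different.

```python
import sqlite3

# Hypothetical schema: one row per listing, unique per provider.
SCHEMA = """
CREATE TABLE IF NOT EXISTS properties (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    internal_id TEXT NOT NULL,
    provider TEXT NOT NULL,
    url TEXT NOT NULL,
    UNIQUE (internal_id, provider)
)
"""


def init_db(path: str = "properties.db") -> sqlite3.Connection:
    """Create the database file (if missing) and the properties table."""
    connection = sqlite3.connect(path)
    connection.execute(SCHEMA)
    connection.commit()
    return connection


def register_property(connection, internal_id, provider, url) -> bool:
    """Insert a listing; return True only if it was not seen before."""
    cursor = connection.execute(
        "INSERT OR IGNORE INTO properties (internal_id, provider, url) "
        "VALUES (?, ?, ?)",
        (internal_id, provider, url),
    )
    connection.commit()
    return cursor.rowcount == 1
```

Passing ":memory:" as the path gives a throwaway database, which is convenient for experimenting without touching the real file.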
You're all set. Now run python3 main.py and sit tight!
Well, perhaps testing is a big word for this. You can run a module that checks that the configured providers can properly scrape information. If they work, you should see the listings in your console.
To test: python3 -m tests
How often you run it is up to you. What I've found most useful is to run it once an hour. For that I put it in the crontab:
0 * * * * cd /<PATH_TO_PROJECT>/housing_tracker && python3 main.py >> run.log 2>&1
Docker image: https://hub.docker.com/r/pgiu/housing_scraper
- add hooks to lint on commit