API implementation for crawling the news website Der Standard.
Use it with caution. Admins will hate you for it.
Install the package directly from GitHub:

    pip install git+https://github.com/basislagerservices/dstclient
The most convenient way to access the API is through the DerStandardAPI class. This interface requires a database, in which results are stored automatically. The following example shows how the web API is used to download all postings in a live ticker.
    import asyncio

    from dstclient import DerStandardAPI, utils

    async def main():
        engine = await utils.sqlite_engine("/tmp/database.db")
        api = DerStandardAPI(engine)

        async with api.web() as web:
            # Fetch the ticker object for a given ticker ID.
            ticker = await web.get_ticker(1336696633613)

            # Walk all threads of the ticker and collect their postings.
            postings = []
            async for thread in web.get_ticker_threads(ticker):
                async for p in web.get_thread_postings(thread):
                    postings.append(p)

    asyncio.run(main())
The web API is also available without a database interface as the WebAPI class.
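For illustration, a minimal sketch of standalone use, assuming WebAPI is an async context manager that exposes the same get_* coroutines as the session returned by DerStandardAPI.web(); the exact constructor signature is an assumption:

    from dstclient import WebAPI

    async def main():
        # Assumption: WebAPI is usable directly as an async context manager.
        async with WebAPI() as web:
            ticker = await web.get_ticker(1336696633613)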
SQLAlchemy is used as the ORM for the database. All types returned by the web API are SQLAlchemy types.
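Because the returned objects are ordinary mapped classes, standard SQLAlchemy tooling applies to them. As a small sketch, sqlalchemy.inspect can list the mapped columns of the User type; the actual column names depend on the dstclient schema:

    from sqlalchemy import inspect

    from dstclient import User

    # User is a SQLAlchemy mapped class, so the runtime inspection API works.
    mapper = inspect(User)
    print([column.key for column in mapper.columns])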
The unified API can also be used to access the database: DerStandardAPI.db() returns a database session.
The following example shows how all users in the database can be retrieved. See the SQLAlchemy documentation for more details.
    from sqlalchemy import select

    from dstclient import DerStandardAPI, User, utils

    async def main():
        engine = await utils.sqlite_engine("/tmp/database.db")
        api = DerStandardAPI(engine)

        async with api.db() as s:
            # Select all User rows stored in the database.
            users = (await s.execute(select(User))).scalars().all()
By default, the returned session is restricted so that the database cannot be modified: commits are not allowed, and all changes are rolled back when the session ends. Pass the readonly=False flag if this is not desired.
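As a sketch of a writable session: readonly=False comes from the description above, while the modification itself (deleting the first user) is purely illustrative:

    from sqlalchemy import select

    from dstclient import DerStandardAPI, User, utils

    async def main():
        engine = await utils.sqlite_engine("/tmp/database.db")
        api = DerStandardAPI(engine)

        # readonly=False lifts the restriction, so commits are allowed.
        async with api.db(readonly=False) as s:
            user = (await s.execute(select(User))).scalars().first()
            if user is not None:
                await s.delete(user)  # illustrative modification only
            await s.commit()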