simonw/datasette

Refactor default views to use register_routes

Opened this issue · 10 comments

It would be much cleaner if Datasette's default views were all registered using the new register_routes() plugin hook. Could dramatically reduce the code in datasette/app.py.

The ideal fix here would be to rework my BaseView subclass mechanism to work with register_routes() so that those views don't have any special privileges above plugin-provided views.
Originally posted by @simonw in #864 (comment)

This would be a lot easier if I had extracted out the hash logic to a plugin, see:

There's a lot of complex logic in the DataView class, which handles conditionally returning content as .json or as HTML or as .csv.

That view subclasses AsgiView which is itself request-aware, so maybe I don't need to reconsider how those classes work - just figure out how to hook them up with register_routes.

The key to all of this may be the DatasetteRouter class. It deals with scope right now but if it internally dealt with request that could be enough to fix #864 by adding logic needed by the .add_message() mechanism.

datasette/datasette/app.py

Lines 904 to 938 in 0991ea7

class DatasetteRouter(AsgiRouter):
def __init__(self, datasette, routes):
self.ds = datasette
super().__init__(routes)
async def route_path(self, scope, receive, send, path):
# Strip off base_url if present before routing
base_url = self.ds.config("base_url")
if base_url != "/" and path.startswith(base_url):
path = "/" + path[len(base_url) :]
scope_modifications = {}
# Apply force_https_urls, if set
if (
self.ds.config("force_https_urls")
and scope["type"] == "http"
and scope.get("scheme") != "https"
):
scope_modifications["scheme"] = "https"
# Handle authentication
default_actor = scope.get("actor") or None
actor = None
for actor in pm.hook.actor_from_request(
datasette=self.ds, request=Request(scope, receive)
):
if callable(actor):
actor = actor()
if asyncio.iscoroutine(actor):
actor = await actor
if actor:
break
scope_modifications["actor"] = actor or default_actor
return await super().route_path(
dict(scope, **scope_modifications), receive, send, path
)

Since AsgiRouter is only used as the super-class of the DatasetteRouter class maybe I should get rid of AsgiRouter entirely - no point in having a Datasette-specific subclass of it if the parent class isn't ever used by anything else.

I could also rename it to just Router which is a nicer name than DatasetteRouter.

Maybe I could add a as_request_view method as an alternative to as_asgi:

class AsgiView:
async def dispatch_request(self, request, *args, **kwargs):
handler = getattr(self, request.method.lower(), None)
return await handler(request, *args, **kwargs)
@classmethod
def as_asgi(cls, *class_args, **class_kwargs):
async def view(scope, receive, send):
# Uses scope to create a request object, then dispatches that to
# self.get(...) or self.options(...) along with keyword arguments
# that were already tucked into scope["url_route"]["kwargs"] by
# the router, similar to how Django Channels works:
# https://channels.readthedocs.io/en/latest/topics/routing.html#urlrouter
request = Request(scope, receive)
self = view.view_class(*class_args, **class_kwargs)
response = await self.dispatch_request(
request, **scope["url_route"]["kwargs"]
)
await response.asgi_send(send)
view.view_class = cls
view.__doc__ = cls.__doc__
view.__module__ = cls.__module__
view.__name__ = cls.__name__
return view

Or I could teach the Router to spot the dispatch_request method and call it directly.

I'm going to ditch that AsgiView class too, by combining it into BaseView.

So now the problem is simpler: I need to get BaseView to a state where it can accept a shared request object and it can be used in conjunction with register_routes().

This code is interesting:

datasette/datasette/app.py

Lines 948 to 955 in 3bc2461

scope_modifications["actor"] = actor or default_actor
scope = dict(scope, **scope_modifications)
for regex, view in self.routes:
match = regex.match(path)
if match is not None:
new_scope = dict(scope, url_route={"kwargs": match.groupdict()})
try:
return await view(new_scope, receive, send)

I want to change the signature of that return await view(new_scope, receive, send) method to instead take (request, send) - so I can have a single shared request object that's created just once per HTTP request.

The problem is the scope modification: I have code that modifies the scope, but how should that impact a shared Request instance? Should its .scope be replaced with alternative scopes as it travels through the codebase?

I'm going to create the single Request() instance in the DatasetteRouter class - at the beginning of the route_path method:

datasette/datasette/app.py

Lines 905 to 925 in 3bc2461

class DatasetteRouter:
def __init__(self, datasette, routes):
self.ds = datasette
routes = routes or []
self.routes = [
# Compile any strings to regular expressions
((re.compile(pattern) if isinstance(pattern, str) else pattern), view)
for pattern, view in routes
]
async def __call__(self, scope, receive, send):
# Because we care about "foo/bar" v.s. "foo%2Fbar" we decode raw_path ourselves
path = scope["path"]
raw_path = scope.get("raw_path")
if raw_path:
path = raw_path.decode("ascii")
return await self.route_path(scope, receive, send, path)
async def route_path(self, scope, receive, send, path):
# Strip off base_url if present before routing
base_url = self.ds.config("base_url")

I've made enough progress on this to be able to solve the messages issue in #864. I may still complete this overall goal (registering internal views with register_routes()) as part of Datasette 0.45 but it would be OK if it slipped to a later release.