seanbreckenridge/browserexport

firefox/chrome: support `from_visit` field

karlicoss opened this issue · 3 comments

Might be interesting for Promnesia, although it's also possible that the utility is marginal and traversing history by timestamps is good enough.

  • firefox: moz_historyvisits.from_visit. On mobile the field is present but seems to always be NULL
  • chrome: visits.from_visit
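For reference, a minimal sketch of how the two columns could be read and resolved to a source URL within a single database. It assumes the usual table layouts (`moz_historyvisits`/`moz_places` in Firefox, `visits`/`urls` in Chrome); the function name `visits_with_source` is hypothetical and not part of browserexport.

```python
import sqlite3
from typing import Iterator, Optional, Tuple

# Assumed layout: moz_historyvisits.place_id points at moz_places.id
FIREFOX_QUERY = """
SELECT v.id, p.url, v.from_visit, fp.url
FROM moz_historyvisits AS v
JOIN moz_places AS p ON p.id = v.place_id
LEFT JOIN moz_historyvisits AS fv ON fv.id = v.from_visit
LEFT JOIN moz_places AS fp ON fp.id = fv.place_id
ORDER BY v.visit_date
"""

# Assumed layout: visits.url points at urls.id
CHROME_QUERY = """
SELECT v.id, u.url, v.from_visit, fu.url
FROM visits AS v
JOIN urls AS u ON u.id = v.url
LEFT JOIN visits AS fv ON fv.id = v.from_visit
LEFT JOIN urls AS fu ON fu.id = fv.url
ORDER BY v.visit_time
"""

Row = Tuple[int, str, Optional[int], Optional[str]]


def visits_with_source(conn: sqlite3.Connection, query: str = FIREFOX_QUERY) -> Iterator[Row]:
    """Yield (visit_id, url, from_visit, from_url) for every visit.

    from_url is None whenever from_visit is NULL, 0, or points at a
    visit that no longer exists, because of the LEFT JOINs.
    """
    yield from conn.execute(query)
```

The LEFT JOINs mean an unset `from_visit` simply yields `None` for the source URL rather than dropping the visit.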

Perhaps this belongs in the metadata, but it also means that we'd need to keep the original visit ID from the sqlite database to match the visit. Even worse, when all visits from different historic exports are merged into a single stream, the ids don't make sense anymore (in general they are 'internal' to a specific export). It would be possible to work around this if we somehow remapped the ids in browserexport itself.

Ah -- yeah, somewhat similar to #16 in complexity, as that requires 'merging' the results back into a database, which would require mangling the from_visit field somehow.

The id does become pretty useless across exports -- maybe it would be better to resolve the from_visit id to the URL instead? I had previously thought about making an ordered list of visits after merging: resolve the from_visit id to a URL within each individual export, then iterate backwards through the merged visits until you hit that URL, and connect the two. May not always be accurate, but may be useful anyway?

So you'd have something like:

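from typing import NamedTuple, Optional

Second = int  # assumed alias for a duration in seconds

class Metadata(NamedTuple):
    title: Optional[str] = None
    description: Optional[str] = None
    preview_image: Optional[str] = None
    duration: Optional[Second] = None
    # proposed: URL of the visit this one came from, resolved from from_visit
    from_url: Optional[str] = None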

Could also expose the id, but I think promnesia gets the same improvement if it's the URL
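The "iterate backwards till you hit that URL" idea could look roughly like the sketch below. The `Visit` type here is a hypothetical, simplified stand-in (not browserexport's real model), and the input is assumed to be already merged and sorted by time. Rather than literally scanning backwards for each visit, it remembers the last index at which each URL was seen, which gives the same result in a single pass.

```python
from typing import Dict, List, NamedTuple, Optional


class Visit(NamedTuple):
    # Hypothetical simplified visit, for illustration only
    url: str
    dt: int  # any sortable timestamp
    from_url: Optional[str] = None


def link_sources(merged: List[Visit]) -> List[Optional[int]]:
    """For each visit, return the index of the most recent earlier visit
    whose url equals this visit's from_url, or None if there is none."""
    last_seen: Dict[str, int] = {}  # url -> index of its latest visit so far
    links: List[Optional[int]] = []
    for i, visit in enumerate(merged):
        if visit.from_url is None:
            links.append(None)
        else:
            links.append(last_seen.get(visit.from_url))
        # Record this visit only after the lookup, so a visit never links to itself
        last_seen[visit.url] = i
    return links
```

As noted, this is a heuristic: if the same URL was visited several times, the link goes to the latest earlier visit, which may not be the one the browser actually recorded.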

Remapping the IDs in browserexport sounds like a pain... would make this less functional and require connecting lots of internal state, which is a similar reason to why I've put #16 on low prio.

Ah, the reason I was thinking of ids is because it might be interesting to trace through visits via from_visit, like a chain of visits, and that wouldn't be possible with just URL. Perhaps would make more sense to just keep a Visit reference instead (although this would mess with caching)

But tbh, it's worth just playing with the databases and checking first -- it seems that from_visit is often unset (for whatever reason), and perhaps in practice these 'chains' of visits end up too short, so it may not be worth bothering
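A quick way to check chain lengths within a single export might be something like this sketch. It assumes you have already loaded a mapping of visit id to `from_visit` from one database; the `chain` helper is hypothetical.

```python
from typing import Dict, List, Optional, Set


def chain(visit_id: int, from_visit: Dict[int, Optional[int]]) -> List[int]:
    """Follow from_visit links back from visit_id within one export.

    Returns the visit ids, newest first. Stops at an unset link (None or 0),
    at an id missing from the mapping, or if a cycle is detected.
    """
    seen: Set[int] = set()
    out: List[int] = []
    cur: Optional[int] = visit_id
    while cur in from_visit and cur not in seen:
        seen.add(cur)
        out.append(cur)
        cur = from_visit[cur]
    return out
```

Something like `max(len(chain(i, mapping)) for i in mapping)` would then show whether the chains are long enough to be interesting.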

> from_visit is often unset

I think it may depend on the visit_type (in Firefox at least). It would only be set if visit_type was maybe 1, 5 or 6? In other cases the source isn't another link.
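One way to test that guess is to count, per visit_type, how many visits have from_visit set. This is a sketch against an assumed Firefox `moz_historyvisits` table; the function name is hypothetical. For reference, in Firefox's history API 1 is TRANSITION_LINK, 5 is TRANSITION_REDIRECT_PERMANENT and 6 is TRANSITION_REDIRECT_TEMPORARY.

```python
import sqlite3
from typing import Dict, Tuple

# Treats both NULL and 0 as "unset", since either may appear depending on platform
QUERY = """
SELECT visit_type,
       COUNT(*) AS total,
       SUM(CASE WHEN from_visit IS NOT NULL AND from_visit != 0 THEN 1 ELSE 0 END) AS with_source
FROM moz_historyvisits
GROUP BY visit_type
"""


def from_visit_by_type(conn: sqlite3.Connection) -> Dict[int, Tuple[int, int]]:
    """Map visit_type -> (total visits, visits with from_visit set)."""
    return {vt: (total, with_source) for vt, total, with_source in conn.execute(QUERY)}
```

If the guess is right, the second number should be zero (or close to it) for every type other than 1, 5 and 6.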