mozilla/chronicle

Store all the things from Embedly

Closed · 1 comment

There's lots of potentially neat stuff we could use from the Embedly extraction that we currently throw away. Let's figure out how to keep it all so that we don't have to re-extract every page when we find a good use for that data.
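A minimal sketch of what "keep it all" could look like, assuming the whole extraction response is stored as one JSON blob next to the visit it came from. The table name `visit_extractions`, the helper names, and the sample payload are all hypothetical; this is not chronicle's actual schema or a real Embedly response, and SQLite stands in for whatever database the project uses.

```python
import json
import sqlite3


def save_extraction(conn, visit_id, extraction):
    """Store the entire extraction payload as JSON text, keyed by visit."""
    conn.execute(
        "INSERT OR REPLACE INTO visit_extractions (visit_id, raw_json) VALUES (?, ?)",
        (visit_id, json.dumps(extraction, sort_keys=True)),
    )


def load_extraction(conn, visit_id):
    """Return the stored payload as a dict, or None if nothing was saved."""
    row = conn.execute(
        "SELECT raw_json FROM visit_extractions WHERE visit_id = ?",
        (visit_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


# Hypothetical table: one row per visit, holding the unmodified payload.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit_extractions (visit_id TEXT PRIMARY KEY, raw_json TEXT NOT NULL)"
)

# Made-up sample payload for illustration only.
payload = {"title": "Example page", "keywords": [{"name": "demo", "score": 10}]}
save_extraction(conn, "visit-1", payload)

# Fields nobody uses today are still there when a use for them turns up.
stored = load_extraction(conn, "visit-1")
```

Storing the raw payload rather than a hand-picked set of columns means a later feature can read fields that are ignored today without re-extracting the page.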

This might also be a good time to consider a pages table so that we only store one copy of this data.
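As a rough illustration of the pages table idea, the sketch below keeps one row of extraction data per URL and has each visit point at it. The `pages` and `visits` tables, their columns, and `record_visit` are assumptions made for this example, not the project's schema. The URL is matched exactly as given, with no normalization, so two spellings of the same page would still produce two rows.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pages (
        id INTEGER PRIMARY KEY,
        url TEXT NOT NULL UNIQUE,   -- compared as-is, no normalization
        raw_json TEXT NOT NULL      -- single copy of the extraction data
    );
    CREATE TABLE visits (
        id INTEGER PRIMARY KEY,
        page_id INTEGER NOT NULL REFERENCES pages(id),
        visited_at TEXT NOT NULL
    );
    """
)


def record_visit(conn, url, visited_at, extraction):
    """Insert the page once per URL, then add a visit that references it."""
    conn.execute(
        "INSERT OR IGNORE INTO pages (url, raw_json) VALUES (?, ?)",
        (url, json.dumps(extraction)),
    )
    page_id = conn.execute(
        "SELECT id FROM pages WHERE url = ?", (url,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO visits (page_id, visited_at) VALUES (?, ?)",
        (page_id, visited_at),
    )
    return page_id


# Two visits to the same URL share a single pages row.
first = record_visit(conn, "https://example.com/post", "2015-03-01T10:00:00Z", {"title": "Post"})
second = record_visit(conn, "https://example.com/post", "2015-03-02T09:30:00Z", {"title": "Post"})
```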

Deferring the creation of a pages table because we don't have our URL normalization story sorted out yet (#228). We'll get there eventually :-P