Track subsets in larger dataset
ccstan99 commented
We have consolidated lots of smaller subsets into larger, logically grouped subsets like blogs. However, it'd still be nice to be able to pull sources from a smaller subset, which could also be used with Pinecone metadata. Consider adding a 'domain' column in MySQL, derived from the 'url', to easily find smaller subsets.
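A minimal sketch of how the 'domain' value could be derived from 'url', using only the standard library. The helper name `extract_domain` is hypothetical, and stripping a leading `www.` is an assumption about what counts as the same subset, not something decided in this issue:

```python
from urllib.parse import urlparse


def extract_domain(url):
    """Return the lowercased host of `url` without a leading 'www.', or None."""
    if not url:
        return None
    # urlparse only populates the host when a scheme or '//' is present,
    # so prefix bare values such as 'arxiv.org/abs/...' before parsing.
    if "//" not in url:
        url = "//" + url
    host = urlparse(url).hostname  # hostname is lowercased and excludes the port
    if not host:
        return None
    # Assumption: 'www.example.com' and 'example.com' belong to the same subset.
    return host[4:] if host.startswith("www.") else host
```

The result could be stored in the new column and mirrored into the Pinecone metadata, so a smaller subset can be selected by filtering on it.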
mruwnik commented
Will this be:
- the domain of the url that is displayed to the user
- the domain of the source url (if provided)
- the domain of the place where the article was first found (e.g. from the alignment newsletter)