riverqueue/river

Allow to list jobs by tags

lucas-jacques opened this issue · 1 comments

Hello, since a job's Metadata field is reserved for internal use by River, it would be convenient to be able to list jobs filtered by tags.

Proposed implementation

We can add a field tags to river.JobListParams

type JobListParams struct {
	after            *JobListCursor
	kinds            []string
	metadataFragment string
	overrodeState    bool
	paginationCount  int32
	queues           []string
	sortField        JobListOrderByField
	sortOrder        SortOrder
	states           []rivertype.JobState
	tags             []string 
}

And a method WithTags

func (p *JobListParams) WithTags(tags []string) *JobListParams {
	p.tags = tags
	return p
}

As for the implementation, I think we can add a condition to the query built in this function:

func JobList(ctx context.Context, exec riverdriver.Executor, params *JobListParams) ([]*rivertype.JobRow, error) {

Maybe something like this:

if len(params.Tags) > 0 {
    for i, tag := range params.Tags {
        writeWhereOrAnd()
        // One named argument per tag; a job must carry every requested tag.
        tagName := fmt.Sprintf("tag%d", i)
        conditionsBuilder.WriteString(fmt.Sprintf("@%s = any(tags)", tagName))
        namedArgs[tagName] = tag
    }
}

I think that would do the job. If you are interested, I can open a PR with a real implementation.

Hi @lucas-jacques, the river_job.tags column is currently not indexed in any way. We're also generally opposed to adding more indexes to River for anything but the most common use cases, because doing so can significantly impact throughput on high volume queues. As such, this is probably not something we'll add to River itself because we don't want to add features we can't support well on most/all installations, or which limit River's scalability unnecessarily.

That being said, one of the great things about a system like River is that you are totally in control of your own schema and are free to add whatever indexes you like if you deem them worthwhile. You can also write a simple SQL query to list jobs however you'd like.
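For anyone landing here, a custom list query along these lines might look roughly like the sketch below. It is not part of River's API: tagQuery and listJobIDsByTags are hypothetical helper names, and the sketch assumes a database/sql connection to Postgres through a driver that accepts $N placeholders. Without an index on river_job.tags this will scan the table, which is the trade-off described above.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// tagQuery (hypothetical helper) builds a SELECT over river_job that
// matches only jobs carrying every tag in tags. It returns the SQL text
// and the positional arguments in placeholder order.
func tagQuery(tags []string, limit int) (string, []any) {
	var sb strings.Builder
	sb.WriteString("SELECT id FROM river_job")

	args := make([]any, 0, len(tags)+1)
	for i, tag := range tags {
		if i == 0 {
			sb.WriteString(" WHERE ")
		} else {
			sb.WriteString(" AND ")
		}
		args = append(args, tag)
		fmt.Fprintf(&sb, "$%d = ANY(tags)", len(args))
	}

	args = append(args, limit)
	fmt.Fprintf(&sb, " ORDER BY id LIMIT $%d", len(args))
	return sb.String(), args
}

// listJobIDsByTags (hypothetical helper) runs the query and collects job IDs.
func listJobIDsByTags(ctx context.Context, db *sql.DB, tags []string, limit int) ([]int64, error) {
	query, args := tagQuery(tags, limit)

	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []int64
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}
```

Calling listJobIDsByTags(ctx, db, []string{"billing", "urgent"}, 10) would return up to ten IDs of jobs tagged with both values.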

I hope that helps and that you understand why this isn't something we'd like to add to the core system. Cheers! ✌️

Edit:

I also want to add that, as discussed in #165, I think we would like to improve the way metadata is handled on insert, to avoid conflicts with other higher-level systems we're planning that will use the metadata field.

Finally, args and metadata both benefit from a GIN index on their jsonb payloads. While it might not be the answer you're looking for, instead of tags you could make use of an array field within your job args to take advantage of the existing index. This may still require a custom list query but wouldn't require index changes.
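To illustrate the args-based approach, here is a minimal sketch. The taggedArgs struct, its "tags" JSON key, the "tagged_job" kind, and the argsTagsFragment helper are all assumptions made up for this example. The idea is to store the tags inside the job args and filter with the jsonb containment operator @>, which an index on args can serve.

```go
package main

import "encoding/json"

// taggedArgs is a hypothetical job args type that carries its own tags
// inside the JSON payload stored in river_job.args.
type taggedArgs struct {
	CustomerID int64    `json:"customer_id"`
	Tags       []string `json:"tags"`
}

// Kind identifies the job type; "tagged_job" is an illustrative name.
func (taggedArgs) Kind() string { return "tagged_job" }

// listByArgsTagsSQL filters on kind and on jsonb containment. With
// $2 = {"tags":["a","b"]}, a row matches when its args "tags" array
// contains both "a" and "b".
const listByArgsTagsSQL = `SELECT id FROM river_job WHERE kind = $1 AND args @> $2::jsonb ORDER BY id`

// argsTagsFragment (hypothetical helper) encodes the containment fragment
// passed as $2. A nil slice is normalized to an empty array so the
// fragment is {"tags":[]} rather than {"tags":null}.
func argsTagsFragment(tags []string) (string, error) {
	if tags == nil {
		tags = []string{}
	}
	b, err := json.Marshal(map[string][]string{"tags": tags})
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

The query would then be run with the job kind as $1 and the fragment from argsTagsFragment as $2, using whatever database client the application already has.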