RobThree/MongoRepository

Return only specific fields from repository query?

mellodev opened this issue · 2 comments

I have a collection of documents that have a few simple primitive fields (id, name, status) but also a very large array on each. As an optimization to reduce query time, I would like to return only the lightweight fields when querying the collection, similar to how "select col1,col2,col3 from tbl" works in SQL, as opposed to the "select * from tbl" that I feel mongo/mongorepository is doing.

Example:
Assume document model { id: 'string', name: 'string', bigarray: [{lots:'data'}]}

For a specific query, I only want to return models with the id and name properties, to save bandwidth and query time by not returning the large payload of the bigarray property. I tried the following, but it feels like it's returning the entire document before creating the sparse model.

Repository.SearchFor(c=>c.id=="1234").Select(d=>new model(){ id=d.id, name=d.name}).ToList();

Here's my IRepository SearchFor implementation:

public IQueryable<Model> SearchFor(System.Linq.Expressions.Expression<Func<Model, bool>> predicate)
{
    return Repository.Where(predicate);
}

Is it possible to return only specific fields from collection documents to create sparse models? I know about BsonIgnore, but that doesn't suit my needs: I need the bigarray property persisted, since other queries need to return the full document and rely on that field.

As a longer-term fix I'm in the process of refactoring the schema to move the "bigarray" property into its own collection and link it with a relationship id, but I'm looking for a fix for the interim.

Thanks!

Is it possible to return only specific fields from collection documents to create sparse models?

Not at the "MongoRepository" level. At least, not in the way you want.

Assume the following program:

using MongoRepository;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var small_repo = new MongoRepository<SmallObject>();
        var large_repo = new MongoRepository<LargeObject>();
        var values = Enumerable.Range(0, 10000).ToArray();

        small_repo.Add(Enumerable.Range(0, 100).Select(i => new SmallObject { Name = "so" + i }));

        large_repo.Add(Enumerable.Range(0, 100).Select(i => new LargeObject { Name = "bo" + i, Values = values }));
    }
}

[CollectionName("myobjects")]
public class SmallObject : Entity
{
    public string Name { get; set; }
}

[CollectionName("myobjects")]
public class LargeObject : Entity
{
    public string Name { get; set; }
    public int[] Values { get; set; }
}

This would create a single collection of small and large objects. You could retrieve a large object as a small object because it happens to have the same fields:

var result = small_repo.Where(o => o.Name == "bo65").Single();

However, as profiling shows, the entire object is returned and sent 'over the wire':

{
  "op" : "query",
  "ns" : "TestObjects.myobjects",
  "query" : {
    "Name" : "bo65"
  },
  //...
  "nscanned" : 200,
  "nscannedObjects" : 200,
  //...
  "nreturned" : 1,
  "responseLength" : 98960,
  //...
}

(See the responseLength key)

This makes sense, since we have no way of specifying which fields of the document to return; all MongoRepository has/knows are the entities. You could bypass MongoRepository, use the underlying C# driver for MongoDB directly, and apply a projection there.
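
A minimal sketch of such a projection, reusing the SmallObject and LargeObject classes from the program above. It assumes the legacy 1.x C# driver (the one with MongoCollection, Query and Fields builders) and assumes the repository exposes the driver's collection through a Collection property; verify both against the versions you actually use. The class name ProjectionExample is made up for illustration.

```csharp
using System.Linq;
using MongoDB.Driver;
using MongoDB.Driver.Builders;
using MongoRepository;

class ProjectionExample
{
    static void Main()
    {
        var large_repo = new MongoRepository<LargeObject>();

        // Assumption: Collection returns the driver's MongoCollection<LargeObject>.
        var sparse = large_repo.Collection
            // Deserialize the matching documents as SmallObject instead of LargeObject.
            .FindAs<SmallObject>(Query.EQ("Name", "bo65"))
            // Ask the server for the Name field only (_id is included by default),
            // so the Values array should not be sent over the wire.
            .SetFields(Fields.Include("Name"))
            .ToList();
    }
}
```

If this behaves as assumed, profiling the same query again should show a much smaller responseLength, because the projection is applied on the server rather than after the full document has been fetched.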

Thank you, that's what I needed to know!