hbutani/spark-druid-olap

Avoid Druid Broker Bottleneck

jpullokkaran opened this issue · 2 comments

Eliminate the broker as a bottleneck in cases where a large amount of data needs to be pulled out of Druid for subsequent processing in Spark. One possible solution is to query the Historical nodes directly.

For example:
SELECT c_name,
       bal,
       sales_prospects_amount
FROM   (SELECT c_name,
               Sum(c_acctbal) AS bal
        FROM   orderlineitempartsupplier
        GROUP  BY c_name
        HAVING Sum(c_acctbal) > 1000) r1
       JOIN (SELECT cname,
                    Sum(sales_prospects_amount) AS sales_prospects_amount
             FROM   sales_leads
             GROUP  BY cname) r2
         ON r1.c_name = r2.cname
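
To illustrate what bypassing the broker implies for a query like the one above, here is a minimal sketch in plain Python. It is not the spark-druid-olap implementation: the node names, the sample rows, and the helpers `partial_sums`, `merge_partials`, and `balances_over` are all hypothetical. It only shows that when each Historical returns a partial aggregate, the caller (Spark, in this proposal) has to merge the partials itself and apply the HAVING filter after the merge.

```python
from collections import defaultdict

# Hypothetical stand-in for the rows served by each Historical node.
# In a real cluster each node would serve a subset of the segments.
HISTORICAL_SEGMENTS = {
    "historical-1": [("alice", 700.0), ("bob", 300.0)],
    "historical-2": [("alice", 600.0), ("carol", 1500.0)],
}


def partial_sums(rows):
    """Sum c_acctbal per c_name for the rows held by one node."""
    sums = defaultdict(float)
    for c_name, c_acctbal in rows:
        sums[c_name] += c_acctbal
    return dict(sums)


def merge_partials(partials):
    """Combine per-node partial sums; this is the work a broker would otherwise do."""
    totals = defaultdict(float)
    for partial in partials:
        for c_name, bal in partial.items():
            totals[c_name] += bal
    return dict(totals)


def balances_over(threshold):
    """Emulate GROUP BY c_name HAVING Sum(c_acctbal) > threshold across nodes."""
    partials = [partial_sums(rows) for rows in HISTORICAL_SEGMENTS.values()]
    totals = merge_partials(partials)
    # The HAVING filter must run after the merge: no single node sees the full sum.
    return {name: bal for name, bal in totals.items() if bal > threshold}


if __name__ == "__main__":
    print(balances_over(1000))
```

In this sample data, "alice" passes the threshold only once her two partial sums are combined, which is why the filter cannot be pushed down to the individual nodes.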

+1

Fixed in aae401f.
Set queryHistoricalServer = true;
see the examples in HistoricalServerTest.