sql - Fetch big tree without overhead in left tables
My question is a more theoretical one: it is about why RDBMS servers/drivers return data the way they do, not about how to find the correct result set, nor how to find it fast. I'm pretty familiar with SQL, but there is one thing that has always annoyed my sense of economy.
Consider the following "class" graph:
a { field1, ..., field9, b_items = [ b1, ..., bn ] }
b { field1, ..., field6, c_items = [ c1, ..., cm ] }
c { field1, field2 }
We have a few a objects, each a object has many b objects, and each b object has lots of c objects, so count(a) < count(b) << count(c).
Now we use an RDBMS to store it, because relations are cool and optimizers are smart, so it can do virtually anything in milliseconds, provided there is a good plan and an index set.
I'll skip the table creation code, it should be obvious, and go straight to the select:
select *
from a
left join b on b.a_id = a.id
left join c on c.b_id = b.id
where whatever
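(Roughly, the skipped schema would be something like the sketch below, with foreign-key indexes so the joins stay cheap; column types are just placeholders:)

create table a (id serial primary key, field1 text, /* ... */ field9 text);
create table b (id serial primary key, a_id int references a(id), field1 text, /* ... */ field6 text);
create table c (id serial primary key, b_id int references b(id), field1 text, field2 text);

-- indexes on the join keys, so the left joins don't need full scans
create index on b (a_id);
create index on c (b_id);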
The database server returns a result set that combines the columns of all tables, joined into a sort-of tree:
a.f1 .... a.f9   b.f1 .... b.f6   c.f1   c.f2
---------------------------------------------------
1     1     1     1     1     1     1      1
1     1     1     1     1     1     2      2
1     1     1     1     1     1     3      3
... more rows ...
1     1     1     1     1     1     999    999
↓
1     1     1     2     2     2     1      1
1     1     1     2     2     2     2      2
... more rows ...
1     1     1     2     2     2     999    999
... lots of rows ...
1     1     1     99    99    99    999    999
↓
2     2     2     ...                          -- oh there it is, a[2]
...
5     5     5     null  null  null  null   null  -- a[5] has no b_items
...
9     9     9     ...
The problem is, if a has many columns with text, json, or other heavy data, they are duplicated thousands of times to match each product row of the a+b+c join. Why don't SQL servers at least not send me the same {a,b}-rows after the first one in a join group? Ideally, I'd like to see a result like:
[
  { <a-fields>,
    b = [
      { <b-fields>,
        c = [ { <c-fields> }, ... more c rows ]
      },
      ... more b rows
    ]
  },
  ... more rows
]
which pretty much resembles what I need in memory on the client side. I know I can make more queries to fetch less data, e.g. via a.id in (ids...) (roughly as sketched below) or a stored proc returning nulls on the parasite rows, but isn't the relational model intended for one-shot access? Round trips are heavy, and the planner only guesses. And real data graphs are more than 3 levels deep (consider 5-10). Why not do it in a single pass, without excessive traffic?
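The multi-query workaround I mean would look something like this, one round trip per level, with the ids collected on the client between queries (column names follow the select above):

select * from a where whatever;
select * from b where b.a_id in (/* ids of the a rows just fetched */);
select * from c where c.b_id in (/* ids of the b rows just fetched */);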
I'm fine with duplicate cells in the a and b columns, because there aren't that many of them. Maybe I'm missing something mainstream, SQL-ish and non-hacky that Google has been hiding from me for many years.
Thanks!
The way to avoid duplicated data transfer is to use aggregate functions such as string_agg() or array_agg(). You can also aggregate data using jsonb functions, so you can get a single json object instead of tabular data, for example:
select jsonb_agg(taba)
from (
    select to_jsonb(taba) || jsonb_build_object('tabb', jsonb_agg(tabb)) taba
    from taba
    left join (
        select to_jsonb(tabb) || jsonb_build_object('tabc', jsonb_agg(to_jsonb(tabc))) tabb
        from tabb
        join tabc on tabc.bid = tabb.id
        group by tabb.id
        ) tabb
    on (tabb->>'aid')::int = taba.id
    group by taba.id
    ) taba
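If a full json document is more than you need, the same idea works one level at a time with plain array_agg(), collapsing the c rows into one array per b row. This is just a sketch; it assumes id is the primary key of each table and that tabc has a field1 column:

-- each b row (and therefore each a row) is transferred only once per group;
-- Postgres allows selecting tabb.aid here because tabb.id is the primary key
select tabb.id as b_id, tabb.aid as a_id, array_agg(tabc.field1) as c_field1s
from tabb
join tabc on tabc.bid = tabb.id
group by tabb.id;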