azure data lake - U-SQL Parallel reading from SQL Table -


I have a scenario where I am ingesting data from an MS SQL DB into Azure Data Lake using U-SQL. The table is quite big, with over 16 million records (and soon many more). I tried a simple query of the form:

    SELECT a, b, c FROM dbo.mytable;

I realized, however, that only 1 vertex is used to read from the table.


My question is: is there any way to leverage parallelism while reading from the SQL table?

I don't believe parallelism for external data sources is supported yet in U-SQL (although I'm happy to be corrected). If you feel this is an important missing feature, you can create a request and vote for it here:

https://feedback.azure.com/forums/327234-data-lake

As a workaround, you could manually parallelise your queries, depending on the columns available in your data source, e.g. by date:

    // External query working
    USE DATABASE youradladb;

    // Create the external query for year 2016
    @results2016 =
        SELECT *
        FROM EXTERNAL yoursqldbdatasource EXECUTE
            @"SELECT * FROM dbo.yourbigtable WITH (NOLOCK) WHERE yourdatecol BETWEEN '1 Jan 2016' AND '31 Dec 2016'";

    // Create the external query for year 2017
    @results2017 =
        SELECT *
        FROM EXTERNAL yoursqldbdatasource EXECUTE
            @"SELECT * FROM dbo.yourbigtable WITH (NOLOCK) WHERE yourdatecol BETWEEN '1 Jan 2017' AND '31 Dec 2017'";

    // Output 2016 results
    OUTPUT @results2016
    TO "/output/bigtable/results2016.csv"
    USING Outputters.Csv();

    // Output 2017 results
    OUTPUT @results2017
    TO "/output/bigtable/results2017.csv"
    USING Outputters.Csv();

Now, I have created a different issue by breaking the data up into multiple files. However, you can then read these back using filesets, which will also parallelise, e.g.:

    @input =
        EXTRACT
                ... // column list
        FROM "/output/bigtable/results{year}.csv"
        USING Extractors.Csv();

I would ask why you are choosing to move such a large table into the lake, given that ADLA and U-SQL offer you the ability to query the data where it lives. Can you explain further?

