{"id":2898,"date":"2016-04-23T05:50:09","date_gmt":"2016-04-23T05:50:09","guid":{"rendered":"https:\/\/www.wapshere.com\/missmiis\/?p=2898"},"modified":"2016-04-23T05:50:09","modified_gmt":"2016-04-23T05:50:09","slug":"powershell-split-a-large-csv-and-process-in-multiple-jobs","status":"publish","type":"post","link":"https:\/\/www.wapshere.com\/missmiis\/powershell-split-a-large-csv-and-process-in-multiple-jobs","title":{"rendered":"PowerShell: Split a large CSV and process in multiple Jobs"},"content":{"rendered":"<p>I had a large CSV of data to be loaded\u00c2\u00a0in through the FIM Service. Single threading this operation could have taken (literally) days, so I decided to have a go at multi-threading it, and here&#8217;s the skeleton script.<!--more--><\/p>\n<pre>$csv = import-csv &lt;my file&gt;\r\n$JobRows=1000 ## The number of CSV rows to process in each Job\r\n$NumJobs = [math]::Ceiling($csv.count \/ $JobRows)\r\n\r\nfor ($i=0; $i -lt $NumJobs; $i++)\r\n{\r\n    [int]$StartRow = ($i * $JobRows)\r\n    [int]$EndRow=(($i+1) * $JobRows - 1)\r\n    write-host (\"Rows {0} to {1}\" -f $StartRow.ToString(),$EndRow.ToString())\r\n\r\n    Start-Job -ArgumentList (,$csv[$StartRow..$EndRow]) -ScriptBlock { \r\n        PARAM ($CSVRows)\r\n\r\n        foreach ($row in $CSVRows)\r\n        {\r\n            &lt;... do something ...&gt;\r\n        }\r\n    }\r\n}\r\n<\/pre>\n<p>A couple of points when using this with the FIM or MIM service:<\/p>\n<ul>\n<li>The FIMAutomation snapin and any function scripts you want to use have to be loaded inside the Start-Job script block, after\u00c2\u00a0the &#8220;PARAM and before the &#8220;foreach&#8221;.<\/li>\n<li>It&#8217;s hard to see what is going on inside each Job. Writing out a log file inside the script block\u00c2\u00a0does slow it down but helps with keeping track of progress. I used a timestamped file name to save having to pass extra parameters into the script block:\n<ul>\n<li>$LogFile = &#8220;C:\\Temp\\{0}.txt&#8221; -f (get-date -format &#8220;s&#8221;).Replace(&#8220;-&#8220;,&#8221;&#8221;).Replace(&#8220;:&#8221;,&#8221;&#8221;).Replace(&#8221; &#8220;,&#8221;&#8221;)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>And some general notes:<\/p>\n<ul>\n<li>It was necessary to pass the array as &#8220;(,$csv[$StartRow..$EndRow])&#8221;\u00c2\u00a0with the leading &#8220;(,&#8221; and closing &#8220;)&#8221;. I&#8217;m not sure what was happening when I just passed &#8220;$csv[$StartRow..$EndRow]&#8221; but it definitely\u00c2\u00a0wasn&#8217;t working. Apparently doing it this way forces it to be\u00c2\u00a0passed to the script block as an array.<\/li>\n<li>You can check the status of the jobs using Get-Job.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I had a large CSV of data to be loaded\u00c2\u00a0in through the FIM Service. Single threading this operation could have taken (literally) days, so I decided to have a go at multi-threading it, and here&#8217;s the skeleton script.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":[]},"categories":[23,50],"tags":[],"class_list":["post-2898","post","type-post","status-publish","format-standard","hentry","category-powershell","category-scripting"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pkp1o-KK","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/posts\/2898","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/comments?post=2898"}],"version-history":[{"count":2,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/posts\/2898\/revisions"}],"predecessor-version":[{"id":2900,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/posts\/2898\/revisions\/2900"}],"wp:attachment":[{"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/media?parent=2898"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/categories?post=2898"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wapshere.com\/missmiis\/wp-json\/wp\/v2\/tags?post=2898"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}