New Script? Be Prepared!

While writing some new scripts recently, I ran into a few issues that resulted in troubleshooting, brainstorming, and re-writing of the original code. This article passes along some of the ‘lessons learned’:

(A) Large Data Sets – Scripts that work in small environments will often fail or run very slowly in a larger environment.  Additional considerations, coding and workarounds may be needed when designing and writing scripts for larger environments.

Examples

Progress bar: If you cannot quantify the entire data set up front, using Write-Progress might be a waste of time. Once the value fed to Write-Progress climbs past 100 percent (say, to 200%), PowerShell begins throwing parameter validation errors. The script is still running in the background, but red error messages will appear for each call.

I have run into this specifically when using paging with Office 365 connections. For message tracing, paging means we can examine 5,000 messages at a time, up to a total of 200 pages. The problem is that unless you know the total number of pages in advance, you cannot set a reliable max value for your counter, and the percentage you calculate will push Write-Progress past its limit. Similarly, when pulling information via EWS, page-size limits (around 1,000 items for some operations and 10,000 for others) make calculating totals just as unreliable.

The lesson here: if you can calculate totals, and thus percentages, Write-Progress will be your friend. Otherwise, the cmdlet will generate a lot of errors.
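Below is a minimal sketch of how to guard against this when paging message traces. The date range, activity text, and 200-page cap are illustrative assumptions; the key point is clamping the value passed to -PercentComplete so it never exceeds 100.

[sourcecode language="powershell"]
# A minimal sketch: page through message traces with a clamped progress bar.
# Write-Progress throws once -PercentComplete goes above 100, so cap the value.
$MaxPages = 200   # assumed cap on message trace pages
$Page = 1
do {
    $Percent = [math]::Min([int](($Page / $MaxPages) * 100), 100)
    Write-Progress -Activity "Message Trace" -Status "Page $Page" -PercentComplete $Percent
    $Messages = Get-MessageTrace -StartDate (Get-Date).AddDays(-7) -EndDate (Get-Date) -PageSize 5000 -Page $Page
    # ... process $Messages here ...
    $Page++
} while ($Messages)
Write-Progress -Activity "Message Trace" -Completed
[/sourcecode]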

Office 365 Throttling: This one is hard to quantify because Microsoft will throttle large data sets, and there is no set limit or value that triggers the throttling. The best you can do is avoid querying a large number of objects at once, perhaps by building in pauses between batches. Even that is not guaranteed to prevent Microsoft’s throttling. In other words, be prepared for when it happens and plan to deal with those delays if the data is urgent.

Examples of Throttling:

PowerShell:
[screenshot: throttling message in a PowerShell session]

Admin Console (yes, the Admin Console can also be throttled):
[screenshot: throttling message in the Office 365 Admin Console]
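One mitigation is to work in batches and pause between them. Here is a minimal sketch under assumptions: the batch size and sleep interval are arbitrary placeholders, and nothing here guarantees the session will not be throttled anyway.

[sourcecode language="powershell"]
# A minimal sketch: batch per-mailbox queries and pause between batches.
$Mailboxes = Get-Mailbox -ResultSize Unlimited
$BatchSize = 100   # arbitrary; tune for your tenant
for ($i = 0; $i -lt $Mailboxes.Count; $i += $BatchSize) {
    $End = [math]::Min($i + $BatchSize - 1, $Mailboxes.Count - 1)
    foreach ($Mbx in $Mailboxes[$i..$End]) {
        Get-MailboxStatistics -Identity $Mbx.Identity
    }
    Start-Sleep -Seconds 30   # arbitrary pause; does not guarantee you avoid throttling
}
[/sourcecode]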

Filter vs. Where: Use -Filter whenever possible and use Where-Object only as a last resort. Filtering on the server side returns only the matching objects, which makes a real difference when querying large data sets.
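For example, assuming we want mailboxes whose display name starts with ‘User’ (the property and value are purely illustrative):

[sourcecode language="powershell"]
# Server-side: Exchange evaluates the filter and returns only the matches.
Get-Mailbox -Filter "DisplayName -like 'User*'" -ResultSize Unlimited

# Client-side: every mailbox comes across the wire first, then is filtered locally.
Get-Mailbox -ResultSize Unlimited | Where-Object { $_.DisplayName -like 'User*' }
[/sourcecode]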

(B) Files ingested for processing were incorrect – when working with a set of files for data processing in PowerShell, make sure the files you pull in are exactly the ones you intend. What does this mean?

For example, suppose one part of a script exports a series of CSV files to a given path and another part dumps its own CSV files to the same location. If the script then ingests all CSV files, it picks up everything instead of the particular subset that needs to be compared.
For example:
A PowerShell script exports these CSV files:
[sourcecode language="powershell"]
User1-MailItems.csv
User2-MailItems.csv
User1-Senders.csv
User2-Senders.csv
[/sourcecode]
As you can tell by the naming, there are two sets of data. If we were to accidentally tell PowerShell to grab all *.csv files, we would have too many for our comparison. Refine your file search to exactly the ones you need: a query like ‘*-MailItems.csv’ or ‘*-Senders.csv’ would create two sets of data with two CSV files in each.
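In code, that refinement might look like this (the export path is a hypothetical example):

[sourcecode language="powershell"]
# Filter by the file name suffix instead of grabbing every CSV in the folder.
$MailItemFiles = Get-ChildItem -Path 'C:\Scripts\Exports' -Filter '*-MailItems.csv'
$SenderFiles   = Get-ChildItem -Path 'C:\Scripts\Exports' -Filter '*-Senders.csv'
[/sourcecode]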
(C) Unique isn’t always Unique: PowerShell has several ways to ensure uniqueness in a particular data set, and they do not all behave the same way. I won’t dig too far into this, as there is a great article already out there that explains it:

https://blogs.technet.microsoft.com/heyscriptingguy/2012/01/15/use-powershell-to-choose-unique-objects-from-a-sorted-list/

So make sure you choose the correct ‘unique’ for the job.
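A minimal sketch of the behavioral differences, using a throwaway list:

[sourcecode language="powershell"]
$Names = 'alpha','Alpha','beta','alpha'

# Sort-Object -Unique is case-insensitive by default: returns alpha and beta only.
$Names | Sort-Object -Unique

# Select-Object -Unique needs no sorted input but is case-sensitive:
# returns alpha, Alpha, and beta.
$Names | Select-Object -Unique

# Get-Unique is case-sensitive and removes only *adjacent* duplicates,
# which is why the input must be sorted first.
$Names | Sort-Object -CaseSensitive | Get-Unique
[/sourcecode]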
(D) Break up your Script: When troubleshooting an issue, pull chunks of code out and validate that they work when not inside the larger script. This could include function code or code run from within a loop.

Example
Function 1 creates reference files for Function 2
Function 2 fails.

An easy way to troubleshoot the second function is to copy the files created by Function 1 into a working directory (i.e. C:\Scripts\Working). Then take the PowerShell code out of Function 2, add some additional error checking (Try/Catch) and reporting (Write-Host or Write-Verbose) depending on what Function 2 does, and re-run that code against the copied files. Once it is fixed, place the code back into the original function shell (Function 2).
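A minimal sketch of that standalone harness, assuming the working directory above and CSV reference files (both illustrative):

[sourcecode language="powershell"]
# Run the body of Function 2 on its own against the copied reference files,
# with Try/Catch and verbose reporting added for troubleshooting.
foreach ($File in Get-ChildItem -Path 'C:\Scripts\Working' -Filter '*.csv') {
    try {
        $Data = Import-Csv -Path $File.FullName -ErrorAction Stop
        # ... the code pulled out of Function 2 goes here ...
        Write-Verbose "Processed $($File.Name): $($Data.Count) rows" -Verbose
    } catch {
        Write-Host "Failed on $($File.Name): $_" -ForegroundColor Red
    }
}
[/sourcecode]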
