Garbage In Garbage Out

The Problem

When working with data in PowerShell, the well-known idiom “Garbage In, Garbage Out” applies quite aptly. If the source data for any part of your script is bad, inaccurate or constructed incorrectly, it can have bad consequences for the rest of the script. This is a well understood concept, but I think it bears repeating: we all need to be aware of the format of the data that is pulled and then reused in other parts of a script. This data could be stored in a variable, an array or an external file. Confirming the validity of the data is something that should be done before a script goes into QA or Production environments.

So what do we test and how do we test?

When constructing a script, especially one that may process dozens, hundreds or thousands of objects, the first step is to test the code on one object. Consider this your Beta testing. Running a script against a single object is a good way to validate that the data used to modify it, or queried from it, is correct and usable for the rest of the script. Once this data has been verified to not be garbage, preventing the garbage-out symptom, move the object count up a few notches with each test iteration until all of the objects to be queried or modified are processed successfully.
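As a minimal sketch of this approach, assuming Exchange mailboxes are the objects in question (the mailbox name and the commented-out scaling steps are placeholders for illustration):
[sourcecode language=”powershell”]
# Iteration 1: run against a single, known object
$TestSet = Get-Mailbox Damian

# Later iterations: widen the scope a few notches at a time
# $TestSet = Get-Mailbox -ResultSize Unlimited | Select-Object -First 10
# $TestSet = Get-Mailbox -ResultSize Unlimited

foreach ($Mailbox in $TestSet) {
    # Verify the data pulled from each object before acting on it
    Write-Host "Mailbox = $($Mailbox.PrimarySmtpAddress)"
}
[/sourcecode]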

** Note ** This is also why logging changes to text files for later examination can be a good roll-back enabler. Having the previous settings on record may allow you to reverse a change that caused an issue.
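A minimal sketch of that kind of logging, assuming a hypothetical log file and a mailbox property picked purely for illustration:
[sourcecode language=”powershell”]
# Record the current value before changing it, so the change can be reversed later
$Mailbox = Get-Mailbox Damian
$Before = $Mailbox.RetentionPolicy
"$($Mailbox.Alias),RetentionPolicy,$Before" | Add-Content -Path .\ChangeLog.txt

# ...then make the change (the policy name is a placeholder)...
Set-Mailbox -Identity $Mailbox.Alias -RetentionPolicy "New Retention Policy"
[/sourcecode]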

Testing would include displaying values with Write-Host, outputting entire variables, or even piping a variable to FL (Format-List). FL isn’t needed that often, but every once in a while it may reveal more information. Here are some examples of displaying the contents of a variable:
[sourcecode language=”powershell”]
Write-host "Mailbox = $Mailbox"
$Malbox
$Mailbox |FL
[/sourcecode]
Another issue you may run into is finding that an attribute of an object is not a single value but a multi-valued property. This can lead to issues if a script does not treat the variable correctly. Using filters or If statements can also muddy the waters and pull the wrong values. For example, -Like can be misused if you enter the wrong pattern when trying to reduce results down to an expected set of values. If the filter is scoped too tightly or too loosely, the wrong data will then be used going forward. Crafting filters should take time, and the results should be verified before moving forward with the script.
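As an illustration of scoping, here is a sketch using made-up mailbox aliases; the patterns show how a filter that is too loose pulls in more objects than intended:
[sourcecode language=”powershell”]
# Too loose: matches Sales, SalesOps and SalesArchive mailboxes
$Mailboxes = Get-Mailbox -Filter {Alias -like 'Sales*'}

# Scoped to the intended set: only the Sales-* mailboxes
$Mailboxes = Get-Mailbox -Filter {Alias -like 'Sales-*'}

# Verify what the filter actually returned before moving on
$Mailboxes | Select-Object Alias, PrimarySmtpAddress
[/sourcecode]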

Examples of Testing and Validation

OK, enough with the theory; we now need to move into the world of the practical. For this section of the blog post we’ll cover some real-world examples and how to handle the data to make sure it is what we need it to be.

SMTP Addresses

Since I do a lot of PowerShell with Exchange, a favorite example of mine is the Primary SMTP Address. This value can be stored as a multi-valued attribute on mailbox / user objects in Active Directory. If we were to perform the following, the results may not be as intended:
[sourcecode language=”powershell”]
$WindowsEmail = (Get-User Damian).WindowsEmailAddress
$PrimarySMTP = (Get-Mailbox Damian).PrimarySMTPAddress
[/sourcecode]
Both of these variables now contain a multi-valued attribute.
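One quick way to confirm this is to check the underlying type of the stored value. The specific type name in the comment below is an assumption based on an on-premises Exchange environment, so your output may differ:
[sourcecode language=”powershell”]
# Inspect the underlying .NET type of the stored value
$PrimarySMTP.GetType().FullName
# Assumed example output: Microsoft.Exchange.Data.SmtpAddress (not System.String)
[/sourcecode]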

As you can see, the value is not a simple string. This could cause issues with other cmdlets or manipulations you have planned later in the script. A workaround for this is to simply cast the value with ‘[String]’.

With the cast in place, the variables now display as plain strings. Here are the PowerShell cmdlets used:

[sourcecode language=”powershell”]
$WindowsEmail = [String](Get-User Damian).WindowsEmailAddress
$PrimarySMTP = [String](Get-Mailbox Damian).PrimarySMTPAddress
[/sourcecode]
Splitting Values

In some cases, we may not need the entire value that is stored in a variable or an array. This may require us to split the value at a certain character and then select just a section of the overall value. What you may find is that sometimes the split will work, and sometimes a different method must be used in order to move forward with the correct value in the script:

Example 1
Splitting an SMTP address to pull just the domain:
[sourcecode language=”powershell”]
$FullSMTP = $SMTPAddress.Split('@')
[/sourcecode]
Example 2
Splitting at the ‘@’ character specified by its character code:
[sourcecode language=”powershell”]
# [char]0x0040 is the '@' character
$FullSMTP = $SMTPAddress.Split([char]0x0040)
[/sourcecode]
However, these may not work, and an error like the one below may occur:



One possible fix is to change the variable from a multi-valued variable into a string variable and THEN use Split. The reason is that Split cannot be used against certain types of stored values.
[sourcecode language=”powershell”]
# Cast the multi-valued attribute to a plain string first
$Address = [String]$SMTPAddress
# Either form of Split now works; both produce the same result
$AddressArray = $Address.Split('@')
$AddressArray = $Address.Split([char]0x0040)
[/sourcecode]
Now we have no errors.
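From here, picking out just the section we need, the domain portion in this case, is a matter of indexing into the array that Split returned:
[sourcecode language=”powershell”]
# Element 0 is the alias portion, element 1 is the domain portion
$Domain = $AddressArray[1]
Write-Host "Domain = $Domain"
[/sourcecode]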

CSV Files (or other property dumps)

Another data handling technique is to export properties to CSV files. These CSV files could potentially be used for documentation, or for feeding a later section of a script that performs a secondary action based on the data stored in the file. If your CSV file is corrupted, the bad data could then cause problems later on in the script and produce bad results.

When exporting certain values and attributes, if an attribute has a comma in it, it could corrupt the way that the value is either exported to a CSV or read back into PowerShell. One solution is to pick a delimiter character that should not be present in the data, ‘/’ for example.

Export:
[sourcecode language=”powershell”]
Export-Csv -Delimiter '/'
[/sourcecode]
Import:
[sourcecode language=”powershell”]
Import-Csv -Delimiter '/'
[/sourcecode]
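Put together, a full round trip might look like the sketch below; the property list and file path are placeholders chosen for illustration:
[sourcecode language=”powershell”]
# Export mailbox properties using '/' as the delimiter instead of a comma
Get-Mailbox -ResultSize Unlimited |
    Select-Object DisplayName, Alias, PrimarySmtpAddress |
    Export-Csv -Path .\Mailboxes.csv -Delimiter '/' -NoTypeInformation

# Import the file later with the same delimiter so the columns line up correctly
$Mailboxes = Import-Csv -Path .\Mailboxes.csv -Delimiter '/'
[/sourcecode]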
Now we have data to work with and can proceed.

Conclusions

Yes, the age-old adage of Garbage In / Garbage Out is still valid with modern computing and scripting, and in our case, PowerShell scripting. A good skill to have is being mindful of what the expected results of a script should be, which helps you validate the changes made so that you do not end up with the issues above. Also remember that this article is not a comprehensive explanation of every scenario where PowerShell, or you, could pull incorrect data. Validate, test and check before moving forward with any data gathered from PowerShell queries.
