tech-comments: WSH Tips for Information Management

I was recently asked for some "Tips" on something, which got me thinking about messages that I had posted to various forums and newsgroups. Out of that came this idea to put some of them up as a blog. I will begin this and add some useful links and the beginning of a general discussion of the "Tips". As I gather code examples and new links I will try to update and extend the discussion. It's only a blog so I don't want to get too wordy and detailed about each item. I'll try to add links to code examples and demonstrations of usage so the blog entry will not get too far out of control.

The issue of gathering and formatting information in Windows Script Host (WSH) technologies has come up in various ways over the last couple of weeks. Along with this I have been playing with the new Microsoft PowerShell scripting environment, which has a well-formalized concept of formatted output. Since I have always promoted OOP concepts for scripting, this sounded great to me. The questions I was fielding prompted me to see what could be done to help admins understand the various techniques available for collecting complex information sets and formatting the output after collection. Here are some of the elements that can make collection and output easier while allowing us to decouple the collection code from the output formatting. All of these data collection objects have counterparts in PowerShell that work in much the same way.

Collecting Data
· File
· Array
· Dictionary
· Recordset
· Dataset
· XML
Formatting Output
· Text
· Excel
· XML
· HTML

File: Using a file to collect information is very easy, but it is useful only if the information has a flat schema such as a list or a table. Lists can be read and written with files using the Scripting.FileSystemObject (FSO) or the WScript standard streams StdIn, StdOut and StdErr. For many scripts this is all that is required, and the code can be written pretty much inline. WScript.Echo can also produce file output if a file redirector is added to the command line when the script is launched under cscript.exe. Most scripters have managed to learn this technique and some of the FSO and redirector techniques.

The problem with file IO is that it is limited to "flat" output unless we want to end up writing complicated code to manage a more complex information schema.
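As a rough illustration of the flat-list case, here is a minimal sketch that writes a short list with the FSO and reads it back. The file name and the sample server names are made up for the example, and I am assuming the temp folder is writable.

```vbscript
Option Explicit

Const ForReading = 1
Const TemporaryFolder = 2

Dim oFSO, oFile, sPath

Set oFSO = CreateObject("Scripting.FileSystemObject")

' Hypothetical output location: a file in the user's temp folder.
sPath = oFSO.BuildPath(oFSO.GetSpecialFolder(TemporaryFolder).Path, "wsh-tips-list.txt")

' Write a flat list, one item per line (True = overwrite an existing file).
Set oFile = oFSO.CreateTextFile(sPath, True)
oFile.WriteLine "server01"
oFile.WriteLine "server02"
oFile.Close

' Read the list back and echo each line.
Set oFile = oFSO.OpenTextFile(sPath, ForReading)
Do Until oFile.AtEndOfStream
    WScript.Echo oFile.ReadLine
Loop
oFile.Close
```

Run under cscript.exe, the echoed lines can be sent to another file with an ordinary redirector on the command line, for example `cscript //nologo list.vbs > out.txt` (the script name is only a placeholder).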

Array: The VBScript Array type is good and can be enlisted into complex hierarchical data arrangements. Array elements can store other arrays and objects. Most scripters become skilled with arrays quickly, up to a point. When the arrays are more complex or contain more than two dimensions, the code complexity becomes a limiting factor. The plus of arrays is that the data can be loaded into an array and then formatted for output in multiple ways. This lends the array collection method to easy "encapsulation" in a function or class. The output function or class can be changed as needed, perhaps with multiple output formats selectable by the operator, without having to disturb the data collection code. Arrays are a good aid in creating "modularity" in script.
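To make the "collect once, format separately" idea concrete, here is a small sketch with an array of arrays. The function names CollectData and FormatAsText and the sample values are my own inventions for the example.

```vbscript
Option Explicit

' Collection step: returns an array of records, each itself an array
' of (name, size). The values are made-up sample data.
Function CollectData()
    Dim aRows(1)
    aRows(0) = Array("report.doc", 1200)
    aRows(1) = Array("notes.txt", 300)
    CollectData = aRows
End Function

' Formatting step: knows nothing about how the data was gathered.
Function FormatAsText(aRows)
    Dim i, sOut
    sOut = ""
    For i = 0 To UBound(aRows)
        sOut = sOut & aRows(i)(0) & vbTab & aRows(i)(1) & vbCrLf
    Next
    FormatAsText = sOut
End Function

WScript.Echo FormatAsText(CollectData())
```

A second formatter (CSV, HTML and so on) could be added beside FormatAsText without touching CollectData at all.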

The array is a powerful tool within its limits, but it lacks a flexible indexing system. The next technique (Tip) overcomes this limitation in many ways.

Collecting Data

Tip #1 Use Dictionaries to collect your data.

Dictionary: The Scripting.Dictionary object is an example of a "keyed" collection. Items are stored in the dictionary as key:item pairs. The key can be of almost any type but is usually a string or a number. The key is used to retrieve the item stored at the key's location. Keys can be used to store contact information using the contact's phone number or name, but not both, because a Dictionary has a single index. For multiple indexes you need to use either arrays or Recordsets. Dictionaries can store complex objects at the key location. The items stored can be of different types, but the keys should always be of the same data type, such as String, Integer or Double. Since VBScript wraps every value in a Variant, mixed key types can cause problems (the number 1 and the string "1" are different keys), so careful code design is required if you use anything other than strings as keys.

The Dictionary object is more usable than an array because it is easier to manage the information stored in a Dictionary object. We can add a new item at any time and not be concerned about expanding the storage and we can remove an item at any time without having to shrink the storage. This overcomes one of the big headaches of using arrays.

The power and flexibility of the Dictionary object becomes even more apparent when we use a Dictionary to store other Dictionaries. We can collect a number of related items and save them with keys that we know, like "name", "address" and "phoneno". We can then save this dictionary in a master dictionary using the name as its key.
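A minimal sketch of the dictionary-in-a-dictionary idea follows. The contact values are invented; only the Scripting.Dictionary methods Add, Exists, Remove and Count are used.

```vbscript
Option Explicit

Dim oMaster, oContact

Set oMaster = CreateObject("Scripting.Dictionary")

' Build one contact as its own dictionary (sample values are invented).
Set oContact = CreateObject("Scripting.Dictionary")
oContact.Add "name", "Pat Example"
oContact.Add "address", "1 Main Street"
oContact.Add "phoneno", "555-0100"

' Store the contact in the master dictionary, keyed by name.
oMaster.Add oContact("name"), oContact

' Check for the key first; reading a missing key would silently add it.
If oMaster.Exists("Pat Example") Then
    WScript.Echo oMaster("Pat Example")("phoneno")
End If

' Items can be removed at any time without resizing anything.
oMaster.Remove "Pat Example"
WScript.Echo "Contacts stored: " & oMaster.Count
```

The Exists check is worth the extra line because simply reading an unknown key through the Item property creates an empty entry for it.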

Example:

We need to gather certain user information from Active Directory. We want to be able to format this information in multiple ways depending on the output requested. We know we need the output as text and in Excel but we could be asked to provide the output as a web page or HTML file.

We will enumerate the users in the domain and pass each user's Active Directory path to a function along with our master dictionary. The function will create a Dictionary object and add the items we need to it. Finally, the new dictionary will be added to our master dictionary with the user's sAMAccountName as the key.

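     Function AddUserToDict( aDSPath, oMasterDict )
         Dim oUser, oUDict
         ' Bind to the user object at the given Active Directory path.
         Set oUser = GetObject( aDSPath )
         ' Collect the items we need in a per-user dictionary.
         ' (With the LDAP provider these attributes are usually named
         ' streetAddress and l; adjust the property names to your provider.)
         Set oUDict = CreateObject("Scripting.Dictionary")
         oUDict.Add "Address", oUser.Address
         oUDict.Add "City", oUser.City
         ' Store the per-user dictionary under the unique account name.
         oMasterDict.Add oUser.sAMAccountName, oUDict
     End Function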

The example uses the sAMAccountName as the key because it is guaranteed to be a unique value within the domain. The user name and other elements of the user object may not be unique. We could also use the GUID of the user or the Kerberos name, which are also unique. The exact approach to choosing the key will depend on the type and source of the information being stored in the Dictionary.
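Once the master dictionary is filled, the output code only has to walk it. Here is a sketch of a text formatter, assuming the "Address" and "City" keys from the example; the function name FormatUsersAsText and the stand-in data are hypothetical, used so the sketch runs without a domain.

```vbscript
Option Explicit

' Text formatter: one tab-delimited line per user.
Function FormatUsersAsText(oMasterDict)
    Dim sKey, oUDict, sOut
    sOut = ""
    For Each sKey In oMasterDict.Keys
        Set oUDict = oMasterDict(sKey)
        sOut = sOut & sKey & vbTab & oUDict("Address") & vbTab & oUDict("City") & vbCrLf
    Next
    FormatUsersAsText = sOut
End Function

' Stand-in data in place of a real Active Directory query.
Dim oMaster, oU
Set oMaster = CreateObject("Scripting.Dictionary")
Set oU = CreateObject("Scripting.Dictionary")
oU.Add "Address", "1 Main Street"
oU.Add "City", "Springfield"
oMaster.Add "jdoe", oU

WScript.Echo FormatUsersAsText(oMaster)
```

An Excel or HTML formatter would take the same oMasterDict argument, so the collection function never needs to change when a new output format is requested.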

Tip #2 For Complex Typed Data Use a Recordset Object

The Recordset object and its associates are not used much in scripting for storing complex data, but they are available and can be designed to store data in very usable ways. For instance, we can maintain the exact type of the data in the Recordset, so a number remains a number under most circumstances. The other great advantage of Recordsets is that they can be sorted in complex ways with relative efficiency. We can add to and delete from a Recordset. We can apply multiple indexes to a Recordset, and we can update a Recordset using SQL or ADO.

Recordsets can be "persisted". This means that we can save a Recordset to permanent storage like a file. One form of storage that is very useful when our data needs to be formatted in multiple ways is an XML file.
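Here is a sketch of a disconnected ADO Recordset that is built in memory, sorted, and then persisted as XML. The field names, sample rows and output file name are assumptions for illustration; the constants are the standard ADO values, declared by hand because VBScript does not load the ADO type library.

```vbscript
Option Explicit

Const adVarChar = 200
Const adInteger = 3
Const adUseClient = 3
Const adPersistXML = 1
Const TemporaryFolder = 2

Dim oRS, oFSO, sPath

Set oRS = CreateObject("ADODB.Recordset")
oRS.CursorLocation = adUseClient      ' a client cursor is needed for Sort

' Define typed fields before opening; no database connection is used.
oRS.Fields.Append "Name", adVarChar, 64
oRS.Fields.Append "LogonCount", adInteger
oRS.Open

oRS.AddNew
oRS("Name") = "jdoe"
oRS("LogonCount") = 12
oRS.Update

oRS.AddNew
oRS("Name") = "asmith"
oRS("LogonCount") = 40
oRS.Update

' Sort on the numeric field; the values stay numbers, not strings.
oRS.Sort = "LogonCount DESC"
oRS.MoveFirst
Do Until oRS.EOF
    WScript.Echo oRS("Name") & vbTab & oRS("LogonCount")
    oRS.MoveNext
Loop

' Persist as XML. Save fails if the file already exists, so remove it first.
Set oFSO = CreateObject("Scripting.FileSystemObject")
sPath = oFSO.BuildPath(oFSO.GetSpecialFolder(TemporaryFolder).Path, "users.xml")
If oFSO.FileExists(sPath) Then oFSO.DeleteFile sPath
oRS.Save sPath, adPersistXML
oRS.Close
```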

Tip #3 For Very Complex Relational Data Use a DataSet Class (System.Data)

(Note: to use the DataSet Class you need to set up .NET Interop to make the .NET Classes available under Windows Script Host)

Tip #4 For Complex Hierarchical Data Use An XML File

Using hand-built XML to create output
Using the XMLDOM to create XML output
Using the Recordset object to create XML output
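For the XMLDOM approach, a minimal sketch might look like this. The element and attribute names ("users", "user", "id", "city") and the sample values are invented for the example.

```vbscript
Option Explicit

Dim oXML, oRoot, oUser, oCity

Set oXML = CreateObject("Microsoft.XMLDOM")

' Root element of the document.
Set oRoot = oXML.createElement("users")
oXML.appendChild oRoot

' One child element per user, identified by an attribute.
Set oUser = oXML.createElement("user")
oUser.setAttribute "id", "jdoe"
oRoot.appendChild oUser

Set oCity = oXML.createElement("city")
oCity.text = "Springfield"
oUser.appendChild oCity

' Show the markup; oXML.save with a file path would write it to disk instead.
WScript.Echo oXML.xml
```

Letting the DOM build the markup avoids the escaping and nesting mistakes that creep into hand-built XML strings.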

Formatting Output

Tip #5 Design Output to Handle Logical Steps

Output code designed to handle all of the logical steps of producing a document lends itself to format changes more easily than code that simply writes out the structure you want in the current document.

A good document output code structure should have all of the following and possibly others.

StartOfDocument
EndOfDocument
DocumentElement
Document Page Header
Document Page Footer
(The page header and footer can be blank functions, but by including them in the flow we save on restructuring later.)

If we are using the XMLDOM or the HTML DOM, the above are implied in the structure of the DOM and the XSL that will convert it.
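For plain-text output, where no DOM supplies the structure, the steps can be sketched as a small writer class. The class name TextWriter, the method signatures and the sample data are my own assumptions; the point is only that every step exists, even when it returns nothing yet.

```vbscript
Option Explicit

' Plain-text writer; an HTML or Excel writer would expose the same methods.
Class TextWriter
    Public Function StartOfDocument(sTitle)
        StartOfDocument = sTitle & vbCrLf & String(Len(sTitle), "=") & vbCrLf
    End Function
    Public Function PageHeader()
        PageHeader = ""          ' blank for now, but kept in the flow
    End Function
    Public Function DocumentElement(sName, sValue)
        DocumentElement = sName & ": " & sValue & vbCrLf
    End Function
    Public Function PageFooter()
        PageFooter = ""          ' blank for now, but kept in the flow
    End Function
    Public Function EndOfDocument()
        EndOfDocument = "-- end --" & vbCrLf
    End Function
End Class

' Drives any writer through the same logical steps.
Function BuildDocument(oWriter, oData)
    Dim sKey, sOut
    sOut = oWriter.StartOfDocument("Users") & oWriter.PageHeader()
    For Each sKey In oData.Keys
        sOut = sOut & oWriter.DocumentElement(sKey, oData(sKey))
    Next
    BuildDocument = sOut & oWriter.PageFooter() & oWriter.EndOfDocument()
End Function

Dim oData, oWriter
Set oData = CreateObject("Scripting.Dictionary")
oData.Add "jdoe", "Springfield"
Set oWriter = New TextWriter
WScript.Echo BuildDocument(oWriter, oData)
```

Swapping the output format then means writing another class with the same five methods and passing it to BuildDocument.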



Friday, June 09, 2006