23 February, 2014

Parse OU location from DistinguishedName - AD

This post is just a breadcrumb of PowerShell bits. I've got some scripts which run regularly and have to analyse 100 000+ AD objects. They can take hours to run, so every bit of code that makes one iteration of the loop a couple of milliseconds quicker can pay significant dividends when running against that many objects.

As I was looking through my 3-year-old code, I noticed an ugly solution (we all do these things, don't we?). I needed to get the OU location of each object, so I decided to take the DistinguishedName attribute and drop the name of the object from the beginning of the string, which leaves me with the full LDAP-formatted path of the object. (I could have taken the CanonicalName attribute in reverse order and replaced each '/' with 'cn=', 'dc=' or 'ou=', but then I would have to look up each of those elements to figure out whether they are OUs, containers, etc.)

Let's take an example: the distinguishedName of an object is "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com", so the LDAP path of the object can be determined by dropping everything up to and including the first comma, which leaves us with: "OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com".

First attempt - original code in my script

Easy: let's split the string on commas, put the elements into an array, drop the first element, then join the elements into a string again (now without the cn=objectname piece):
 $distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"  
 $arrDN = New-Object System.Collections.ArrayList  
 $tmparr = $distinguishedName.Split(",")  
 $tmparr | %{[void]$arrDN.add($_)}  
 $arrDN.RemoveAt(0)  
 $accLocation = [string]::join(",",$arrDN)  
 $accLocation  

This will take 96.5 milliseconds on my machine.
96 milliseconds, fair enough, it's quicker than me doing this on paper.
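
The post does not say how the timings were taken. One common way to get comparable numbers is Measure-Command, so here is a minimal sketch under that assumption. The $runs variable and the averaging loop are illustrative additions, not part of the original script, and the figures will differ from machine to machine.

```powershell
$distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"

# Assumption: averaging over many runs smooths out one-off startup costs.
$runs = 1000

$elapsed = Measure-Command {
    for ($i = 0; $i -lt $runs; $i++) {
        # Same steps as the first attempt: split, copy element by element,
        # drop the first element, join the rest.
        $arrDN = New-Object System.Collections.ArrayList
        $distinguishedName.Split(",") | ForEach-Object { [void]$arrDN.Add($_) }
        $arrDN.RemoveAt(0)
        $accLocation = [string]::Join(",", $arrDN)
    }
}

# Average cost of one iteration in milliseconds.
$perRunMs = $elapsed.TotalMilliseconds / $runs
$perRunMs
```

Swapping the body of the loop for one of the other attempts gives a like-for-like comparison on the same machine.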

Second attempt

Let's get rid of the ForEach-Object (%) when adding elements to $arrDN and use the .AddRange method of the ArrayList instead. This adds all elements in one go instead of going through them one by one:
 $distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"  
 $arrDN = New-Object System.Collections.ArrayList  
 $tmparr = $distinguishedName.Split(",")  
 [void]$arrDN.addrange($tmparr)  
 $arrDN.RemoveAt(0)  
 $accLocation = [string]::join(",",$arrDN)  
 $accLocation  


25 milliseconds, not bad, 4 times quicker.
 

Third attempt

To see if it can be even quicker, we need to think outside the box: is there a simpler solution than working with arrays, one that drops the unneeded first bit of the string in a single step?
It's not obvious in PowerShell because the -replace operator replaces every match and has no option to stop after the first occurrence. What we can do is anchor the pattern to the start of the string and make it match any run of non-comma characters followed by a comma. That makes sure only the leading "cn=computername," piece is dropped and we end up with the full LDAP path of the object:
 $distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"  
 $accLocation = $distinguishedName -creplace "^[^,]*,",""  
 $accLocation  

Explanation for the regex pattern:
  • ^       start of the string
  • [^,]*   match zero or more non-comma characters
  • ,       match a comma character
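
To see why the ^ anchor matters, here is a small sketch comparing the same pattern with and without it. The last line shows a .NET alternative, the Regex.Replace overload that takes a replacement count, which is not what the original script uses. The variable names are illustrative only.

```powershell
$distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"

# Without the anchor, -replace removes every "non-commas followed by a comma"
# run, so only the last component survives.
$unanchored = $distinguishedName -replace "[^,]*,", ""

# With the anchor, the pattern can only match at the start of the string,
# so just the leading component is removed.
$anchored = $distinguishedName -replace "^[^,]*,", ""

# Alternative (not from the original post): limit the number of replacements
# to 1 through the .NET Regex.Replace(input, replacement, count) overload.
$firstOnly = ([regex]"[^,]*,").Replace($distinguishedName, "", 1)

$unanchored
$anchored
$firstOnly
```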
 
0.4669 milliseconds!
200 times quicker than the first solution! With 100 000 objects, originally it takes 160 minutes (obviously in real life it will be less because of caching...etc.) and with the 3rd solution it should take a bit less than a minute. Maybe it can be quicker with some better trick, but I'm not greedy, I've shaved off ~2.5 hours runtime, it's good enough for me... for today...
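
The projections above are simple multiplication of the per-object timings. A quick sketch of the arithmetic, using only the figures quoted in this post (the variable names are made up for illustration):

```powershell
$objectCount = 100000
$msFirst     = 96.5     # first attempt, milliseconds per object
$msThird     = 0.4669   # third attempt, milliseconds per object

# Total runtime of the first attempt, in minutes (about 160.8).
$minutesFirst = $objectCount * $msFirst / 1000 / 60

# Total runtime of the third attempt, in seconds (about 46.7).
$secondsThird = $objectCount * $msThird / 1000

# Speed-up factor between the two (a little over 200).
$speedup = $msFirst / $msThird

$minutesFirst
$secondsThird
$speedup
```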


5 comments:

  1. Hi, this is the best solution I have seen for this common problem, nice and clean. Any reason for case sensitive replace ?

    1. I've been staring at it but I don't remember why I did that. I was probably playing with different regex patterns before getting to this solution and left it there. Good catch!

  2. What if the common name of your object contained a comma?
    CN=Doe, John,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com

    1. You are correct, life is not simple, so what I usually have in real life is a much longer script.

      What I do in this case is post-processing of the data. I run through all DNs with the above-mentioned script and then check whether the first 3 chars of the result match \w\w= (so character, character and then an =). If not, I drop all characters again up to the first comma.
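
      The longer script is not shown, so the following is only a sketch of the post-processing as described in this reply. The function name Get-ParentPath is hypothetical, and the loop (rather than a single check) plus the guard against strings without a comma are assumptions added for safety.

```powershell
# Hypothetical helper; the name and structure are not from the original script.
function Get-ParentPath {
    param([string]$DistinguishedName)

    # First pass: drop everything up to and including the first comma.
    $location = $DistinguishedName -replace "^[^,]*,", ""

    # Post-processing: if the result does not start with two word characters
    # followed by "=", the cut landed inside the name, so drop up to the next
    # comma again. The Contains check prevents looping on a string with no comma.
    while ($location -notmatch "^\w\w=" -and $location.Contains(",")) {
        $location = $location -replace "^[^,]*,", ""
    }

    $location
}

$result = Get-ParentPath "CN=Doe, John,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"
$result
```

      The check assumes every parent component starts with a two-letter type such as OU, DC or CN, so a name that itself contains something like ", AB=" would still fool it.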

  3. Hi, I was Googling around as to why Split-Path can't parse DNs and I came across your post. PowerShell's regex implementation actually does support matching up to the first occurrence of a character. I find the following much clearer than your method:

    -replace "^(.*?,)"

    In plain English, it means match starting at the beginning of the string, all characters, but only repeat matching all characters long enough to arrive at a comma. The question mark in this case makes the asterisk "lazy."
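
    A short sketch comparing the lazy pattern from this comment with the negated character class from the post, on the post's sample DN. The variable names are illustrative; for this input both patterns strip the same leading component.

```powershell
$distinguishedName = "CN=DroidServer,OU=ChalmunsCantina,OU=MosEisley,DC=tatooine,DC=com"

# Lazy quantifier: take as few characters as possible before the first comma.
$lazy = $distinguishedName -replace "^(.*?,)", ""

# Negated character class: take non-comma characters up to the first comma.
$negated = $distinguishedName -replace "^[^,]*,", ""

$lazy
$negated
$same = ($lazy -eq $negated)
$same
```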
