15 March, 2013

Enumerate eventlog: NETLOGON errors of broken secure channel - AD

If you have a big Active Directory, you will always have noise in the event logs of your domain controllers, which is not what you want because you might miss what really matters... you know, you can't see the wood for the trees.
Sometimes the noisiest source is the NETLOGON service: whenever a machine that has forgotten its password, has a broken secure channel, or doesn't have an account in AD tries to connect to a DC, the Netlogon service throws an error to the System log. Don't ask me why it's an error; in my view it should be a warning (tops), as it's not really an error of NETLOGON. Moreover, if I were MSFT, I would have made it an optional event turned on/off via the registry, similar to the NTDS diagnostics events under HKLM\SYSTEM\CurrentControlSet\services\NTDS\Diagnostics.

Anyway, let's not dwell on it but try to do something about it. If you are a conscientious AD guy (and why wouldn't you be, we all are conscientious when it's about work ;) ) you want to make things right. First step: let's identify the machines which have a broken secure channel. It shouldn't be difficult, as the machine name is part of the event message. However, with 100+ domain controllers across 30+ sites, reading through the event log on a regular basis is not an option. I need a script!

The script which you'll see at the bottom of the article is capable of:
  • Enumerating the Netlogon events from a DC and parsing the error message and the client name from the event description
  • Working against one DC or the list of DCs in a specified site

Some interesting facts about the script. It uses the Get-WinEvent cmdlet. If you use it remotely, it can be quite slow. For example, let's list all events with event ID 5805 from the System event log:
Get-WinEvent -ea SilentlyContinue -ComputerName c3poDC -LogName System | where{$_.id -eq 5805}

It takes a bit more than 30 seconds.

Obviously, the biggest issue is that it pulls every event from the remote log and only then filters on the event ID locally. Let's try a trick: a hash table. The cmdlet's help says it accepts filters in hash table format (-FilterHashtable), so the filter is applied when the log is queried rather than afterwards. Excellent, let's try this then:
Get-WinEvent -ea SilentlyContinue -ComputerName c3poDC -FilterHashtable @{LogName = "System"; id=5805}

Hmmm... not bad, 4 seconds, now we are talking.
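
The two timings can be reproduced with Measure-Command. Here is a minimal sketch; the helper name Get-QuerySeconds is made up for illustration and is not part of the script below:

```powershell
# Hypothetical helper (an assumption, not part of the original script):
# runs a script block and returns the elapsed time in seconds.
function Get-QuerySeconds {
    param([scriptblock] $Query)
    (Measure-Command -Expression $Query).TotalSeconds
}
```

Against the example DC used in this article, the comparison would then look roughly like this (your numbers will differ with log size and network latency):

    Get-QuerySeconds { Get-WinEvent -ea SilentlyContinue -ComputerName c3poDC -LogName System | where{$_.id -eq 5805} }
    Get-QuerySeconds { Get-WinEvent -ea SilentlyContinue -ComputerName c3poDC -FilterHashtable @{LogName = "System"; id=5805} }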

We can just dress up the script a bit:
  • take an integer which determines how many days we want to go back in the log (makes the query even quicker):
    $after = (Get-Date).adddays(-$lastday)
    Get-WinEvent -ea SilentlyContinue -ComputerName $srv -FilterHashtable @{LogName = "System"; StartTime = $after; id=5805}
  • take the name of the DC we want to query
  • take a site name, enumerate the DCs in the site, and then run through them:
    $dclist = Get-ADDomainController -filter * | where{$_.site -ieq $site} | %{$_.name}
  • parse the client name from the event message (the token right before " failed"):
    $obj.Computer = [regex]::Match($_.message, "(\S+) failed").Groups[1].Value
  • make sure we pick up each client name only once, so we end up with a unique list of clients with secure channel issues:
    if($computerList -inotcontains $obj.Computer){
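
The parsing and de-duplication steps can be tried without touching a DC. The following is only a sketch: the helper name Get-ClientName and the sample messages are made up (they imitate the shape of a 5805 message, they are not real log data), and the pattern captures any run of non-space characters before " failed", which is stricter than a plain \w+ match so that names containing hyphens or starting with a digit come back whole.

```powershell
# Hypothetical helper: returns the token right before " failed" in the event
# message, or an empty string if the message doesn't have that shape.
function Get-ClientName {
    param([string] $Message)
    [regex]::Match($Message, "(\S+) failed").Groups[1].Value
}

# Made-up sample messages in the shape of event 5805 (assumption, not real data)
$messages = @(
    "The session setup from the computer WKS-0042 failed to authenticate. The following error occurred: `nAccess is denied.",
    "The session setup from the computer 2K8-TEST01 failed to authenticate. The following error occurred: `nAccess is denied.",
    "The session setup from the computer wks-0042 failed to authenticate. The following error occurred: `nAccess is denied."
)

$computerList = @()
foreach($m in $messages){
    $name = Get-ClientName $m
    # -inotcontains is case-insensitive, so WKS-0042 and wks-0042 count as one client
    if($computerList -inotcontains $name){
        $computerList += $name
    }
}
$computerList
```

Because the array is rebuilt on every +=, this is fine for a few hundred clients; for much larger lists a hash table keyed on the name would scale better.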
The full script with comments:
 param(      [string] $dcname = "",  
           [string] $site = "",  
           [int] $lastday = 2)  
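 # if -dcname is not specified, but -site is, let's get the list of DCs from that site
 if(!$dcname -and $site){
      Import-Module activedirectory
      $dclist = @(Get-ADDomainController -filter * | where{$_.site -ieq $site} | %{$_.name})
 }
 elseif($dcname){
      $dclist = @($dcname)
 }
 else{
      # neither -dcname nor -site was given: keep the list empty so the check below reports it
      $dclist = @()
 }
 # show which DCs will be queried (Write-Host keeps this out of the script's output)
 Write-Host "DCs to query: $($dclist -join ', ')"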
 # generate the start date for the eventlog query  
 $after = (Get-Date).adddays(-$lastday)  
 $objColl = $computerList = @()  
 if($dclist.length -gt 0){  
      foreach($srv in $dclist){  
           # get the netlogon 5805 events from the eventlog generated after the given date  
           Get-WinEvent -ea SilentlyContinue -ComputerName $srv -FilterHashtable @{LogName = "System"; StartTime = $after; id=5805} | %{  
                $obj = "" | select DC,Computer,message,Date  
                $obj.DC = $srv  
                # parse the computer name (the token right before " failed") from the event message
                $obj.Computer = [regex]::Match($_.message, "(\S+) failed").Groups[1].Value
                # if we haven't recorded an alert about the given computer yet, then record it
                if($computerList -inotcontains $obj.Computer){  
                     $obj.message = $_.message.Split("`n")[1]  
                     $obj.Date = $_.TimeCreated  
                     # add the computername to an array where we can check if we have picked up an event on the given computer already  
                     $computerList += $obj.Computer  
                     $objColl += $obj  
                }  
           }  
      }  
 }  
 else{  
      Write-Host -ForegroundColor "red" "No DC or Site specified."  
 }  
 $objColl  
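
Once the script has returned its objects, they can be sorted and saved like any other PowerShell output. A small sketch, using made-up stand-in objects with the same four properties (DC, Computer, message, Date) instead of a real query; the CSV file name is only an example:

```powershell
# Made-up stand-ins for the objects the script emits (assumption, not real data)
$objColl = @(
    New-Object psobject -Property @{ DC = "c3poDC"; Computer = "WKS-0042";   message = "Access is denied."; Date = (Get-Date).AddHours(-30) }
    New-Object psobject -Property @{ DC = "c3poDC"; Computer = "2K8-TEST01"; message = "Access is denied."; Date = (Get-Date).AddHours(-2) }
)

# Newest event first, with a fixed column order
$report = $objColl | Sort-Object Date -Descending | Select-Object Date, DC, Computer, message

# Save the report as CSV in the temp folder (example path)
$csv = Join-Path ([System.IO.Path]::GetTempPath()) "netlogon-5805.csv"
$report | Export-Csv -Path $csv -NoTypeInformation
```

The resulting file gives you a list of machines to chase, one row per client, which is easier to hand over than raw event log entries.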


2 comments:

  1. I've read a few of your blog posts and I like what you are writing about.

    But I believe there's room for improvement :-) Take a look at this Powershell Best Practices article, especially #7: http://blogs.technet.com/b/heyscriptingguy/archive/2012/06/18/the-top-ten-powershell-best-practices-for-it-pros.aspx

    According to http://technet.microsoft.com/en-us/library/ee617217.aspx Get-ADDomainController supports a -SiteName and -DomainName switch which could be used to get rid of "where{$_.site -ieq $site}".

    Also I think (without having access to an AD lab right now) that "%{$_.name}" could be replaced with "select name -expand".

    Keep up the good work.

  2. -SiteName is a good catch. I rewrote this one from another function which was only for Windows 2003 based AD, so no AD module, only ADSI stuff. So yes, that would make the query quicker especially if you have many sites and many domain controllers.

    With select -expand, well, it doesn't always work unfortunately, like with WMI objects. (it also wasn't available in PS 1.0, so %{$_.something} is a bad habit of mine ;)
