
I don't work with RegEx often, but when I do, I use tools like PowerRegex from Sapien, RegExr, the TechNet help for about_Regular_Expressions, or RegExlib.com. To be honest, most of the time I try to avoid it and look for a solution the "PowerShell way" before reaching for RegEx.
Problem
So here is what he asked me: out of the following string "OU=MTL1,OU=CORP,DC=FX,DC=LAB" (which is a Distinguished Name), he wanted to get the name "MTL1" (the site code for Montreal).
Solutions
I came up with two solutions:
The PowerShell way
("OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ",")[0].substring(3)
Using RegEx
("OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ',*..=')[1]
Note: Please leave a comment if you know a better way, I would be curious to learn more.
Solutions proposed by readers
Jay
'OU=MTL1,OU=CORP,DC=FX,DC=LAB' -match '(?<=(^OU=))\w*(?=(,))'
$matches[0]
Robert Westerlund
"OU=MTL1,OU=CORP,DC=FX,DC=LAB" -match "^OU=(?<MTL1>[^,]*)"
$matches["MTL1"]
Steps to solution
First, let's check the methods and properties available using Get-Member
PS C:\> "OU=MTL1,OU=CORP,DC=FX,DC=LAB" | get-member

   TypeName: System.String

Name             MemberType            Definition
----             ----------            ----------
Clone            Method                System.Object Clone(), System.Object ICloneable.Clone()
CompareTo        Method                int CompareTo(System.Object value), int CompareTo(string ...
Contains         Method                bool Contains(string value)
CopyTo           Method                void CopyTo(int sourceIndex, char[] destination, int dest...
EndsWith         Method                bool EndsWith(string value), bool EndsWith(string value, ...
Equals           Method                bool Equals(System.Object obj), bool Equals(string value)...
GetEnumerator    Method                System.CharEnumerator GetEnumerator(), System.Collections...
GetHashCode      Method                int GetHashCode()
GetType          Method                type GetType()
GetTypeCode      Method                System.TypeCode GetTypeCode(), System.TypeCode IConvertib...
IndexOf          Method                int IndexOf(char value), int IndexOf(char value, int star...
IndexOfAny       Method                int IndexOfAny(char[] anyOf), int IndexOfAny(char[] anyOf...
Insert           Method                string Insert(int startIndex, string value)
IsNormalized     Method                bool IsNormalized(), bool IsNormalized(System.Text.Normal...
LastIndexOf      Method                int LastIndexOf(char value), int LastIndexOf(char value, ...
LastIndexOfAny   Method                int LastIndexOfAny(char[] anyOf), int LastIndexOfAny(char...
Normalize        Method                string Normalize(), string Normalize(System.Text.Normaliz...
PadLeft          Method                string PadLeft(int totalWidth), string PadLeft(int totalW...
PadRight         Method                string PadRight(int totalWidth), string PadRight(int tota...
Remove           Method                string Remove(int startIndex, int count), string Remove(i...
Replace          Method                string Replace(char oldChar, char newChar), string Replac...
Split            Method                string[] Split(Params char[] separator), string[] Split(c...
StartsWith       Method                bool StartsWith(string value), bool StartsWith(string val...
Substring        Method                string Substring(int startIndex), string Substring(int st...
ToBoolean        Method                bool IConvertible.ToBoolean(System.IFormatProvider provider)
ToByte           Method                byte IConvertible.ToByte(System.IFormatProvider provider)
ToChar           Method                char IConvertible.ToChar(System.IFormatProvider provider)
ToCharArray      Method                char[] ToCharArray(), char[] ToCharArray(int startIndex, ...
ToDateTime       Method                datetime IConvertible.ToDateTime(System.IFormatProvider p...
ToDecimal        Method                decimal IConvertible.ToDecimal(System.IFormatProvider pro...
ToDouble         Method                double IConvertible.ToDouble(System.IFormatProvider provi...
ToInt16          Method                int16 IConvertible.ToInt16(System.IFormatProvider provider)
ToInt32          Method                int IConvertible.ToInt32(System.IFormatProvider provider)
ToInt64          Method                long IConvertible.ToInt64(System.IFormatProvider provider)
ToLower          Method                string ToLower(), string ToLower(cultureinfo culture)
ToLowerInvariant Method                string ToLowerInvariant()
ToSByte          Method                sbyte IConvertible.ToSByte(System.IFormatProvider provider)
ToSingle         Method                float IConvertible.ToSingle(System.IFormatProvider provider)
ToString         Method                string ToString(), string ToString(System.IFormatProvider...
ToType           Method                System.Object IConvertible.ToType(type conversionType, Sy...
ToUInt16         Method                uint16 IConvertible.ToUInt16(System.IFormatProvider provi...
ToUInt32         Method                uint32 IConvertible.ToUInt32(System.IFormatProvider provi...
ToUInt64         Method                uint64 IConvertible.ToUInt64(System.IFormatProvider provi...
ToUpper          Method                string ToUpper(), string ToUpper(cultureinfo culture)
ToUpperInvariant Method                string ToUpperInvariant()
Trim             Method                string Trim(Params char[] trimChars), string Trim()
TrimEnd          Method                string TrimEnd(Params char[] trimChars)
TrimStart        Method                string TrimStart(Params char[] trimChars)
Chars            ParameterizedProperty char Chars(int index) {get;}
Length           Property              int Length {get;}
So how can I get "MTL1"? Notice how the elements are separated by a comma ','
Let's try to split them; there is a Split() method!
("OU=MTL1,OU=CORP,DC=FX,DC=LAB").split(',')
OU=MTL1
OU=CORP
DC=FX
DC=LAB
Awesome... but instead of the Split() method, let's use the PowerShell -Split Operator.
PS C:\> "OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ','
OU=MTL1
OU=CORP
DC=FX
DC=LAB
Now, out of the 4 items, we want to select the first one. The index [0] will do it.
PS C:\> ("OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ',')[0]
OU=MTL1
Finally, we can use the Substring() method to select the piece of text we want.
The first letter of the site code comes right after the = sign, so it sits at index 3 (Substring() counts from zero: O is 0, U is 1, = is 2).
PS C:\> ("OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ',')[0].substring(3)
MTL1
Voila!
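The steps above can be wrapped in a small function. This is only a sketch: the name Get-FirstDNValue is hypothetical (it is not part of the original solution), and it assumes the distinguished name contains no escaped commas. Instead of hard-coding Substring(3), it looks for the first '=' so that prefixes of any length work.

```powershell
# Hypothetical helper (name invented for illustration): returns the value of
# the first component of a distinguished name, e.g. "MTL1" for "OU=MTL1,...".
# Assumption: the DN contains no escaped commas ("\,").
function Get-FirstDNValue {
    param(
        [Parameter(Mandatory = $true)]
        [string]$DistinguishedName
    )

    # Take the first comma-separated component, e.g. "OU=MTL1"
    $first = ($DistinguishedName -split ',')[0]

    # Keep everything after the first '=' rather than assuming a 2-letter prefix
    $first.Substring($first.IndexOf('=') + 1)
}

Get-FirstDNValue -DistinguishedName 'OU=MTL1,OU=CORP,DC=FX,DC=LAB'   # MTL1
Get-FirstDNValue -DistinguishedName 'CN=Bob,OU=CORP,DC=FX,DC=LAB'    # Bob
```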
Using Regex
RegEx is a sequence of characters that forms a search pattern, mainly used for pattern matching against strings (for example, validating an email format). RegEx lets you search on positioning, character matching, number of matches, grouping, either/or matching, and backreferences. Note that you can also use RegEx to replace substrings or split strings.
In my solution I used the following pattern:
The first part ,* matches zero or more occurrences of the preceding element (here, the comma).
The second part ..= matches any two characters followed by '='.
A period matches one instance of any character.
PS C:\> ("OU=MTL1,OU=CORP,DC=FX,DC=LAB" -split ',*..=')[1]
MTL1
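To see why the index is [1] and not [0], it helps to list everything the split returns. The sketch below uses a variable name of my own ($dn) and simply prints each element with its index. Note that the pattern assumes every attribute name is exactly two characters long (OU, DC, CN).

```powershell
$dn = 'OU=MTL1,OU=CORP,DC=FX,DC=LAB'
$parts = $dn -split ',*..='

# Print every element with its index. Index 0 is an empty string because the
# input itself starts with a separator ("OU=").
for ($i = 0; $i -lt $parts.Count; $i++) {
    '{0}: "{1}"' -f $i, $parts[$i]
}

# 0: ""
# 1: "MTL1"
# 2: "CORP"
# 3: "FX"
# 4: "LAB"
```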
More information
TechNet - about_Operators
TechNet - about_Comparison_Operators
TechNet - about_split
TechNet - about_join
PowerShell Cookbook - Appendix B - Regular Expression Reference
MSDN - Regular Expression Language - Quick Reference (Thanks Jay)
Scripting Guys Blog - How Can I Create a Phone Directory from Files with Varying Text Formats?
Scripting Guys Blog - How Can I Convert a Tab-Delimited File to a Comma-Separated Value File?
Scripting Guys Blog - How Can I See Which Packets Are Being Dropped by Windows Firewall?
Scripting Guys Blog - Use PowerShell Regular Expressions to Format Numbers
Scripting Guys Blog - Articles about RegEx
Thanks for Reading! If you have any questions, leave a comment or send me an email at fxcat@lazywinadmin.com.
I invite you to follow me on Twitter @lazywinadm / Google+ / LinkedIn.
Good stuff! Here are some functions for processing ldap paths you may like. I hid them in a recent script release (but never actually used them, like a hidden little gift).
Function Get-TreeFromLDAPPath
{
    # $Output = [System.Web.HttpUtility]::HtmlDecode(($a | ConvertTo-Html))
    [CmdletBinding()]
    Param
    (
        [Parameter(HelpMessage="LDAP path.")]
        [string]
        $LDAPPath,
        [Parameter(HelpMessage="Determines the depth a tree node is indented")]
        [int]
        $IndentDepth=1,
        [Parameter(HelpMessage="Optional character to use for each newly indented node.")]
        [char]
        $IndentChar = 3,
        [Parameter(HelpMessage="Don't remove the ldap node type (ie. DC=)")]
        [Switch]
        $KeepNodeType
    )
    # Named groups: LDAPType is the part before '=', LDAPName the part after
    $regex = [regex]'(?<LDAPType>^.+)\=(?<LDAPName>.+$)'
    $ldaparr = $LDAPPath -split ','
    $ADPartCount = $ldaparr.count
    $spacer = ''
    $output = ''
    # Walk the path from the last component (the root) to the first (the leaf)
    for ($index = ($ADPartCount); $index -gt 0; $index--)
    {
        $node = $ldaparr[($index-1)]
        if (-not $KeepNodeType)
        {
            if ($node -match $regex)
            {
                $node = $matches['LDAPName']
            }
        }
        if ($index -eq ($ADPartCount))
        {
            # The root node gets no indent character
            $line = ''
        }
        else
        {
            $line = $IndentChar
            $spacer = $spacer + (' ' * $IndentDepth)
            # This fixes an offset issue
            if ($index -lt ($ADPartCount - 1))
            {
                $spacer = $spacer + ' '
            }
        }
        $line = $spacer + $line + $node + "`n"
        $output = $Output+$line
    }
    [string]$output
}
Function Get-ObjectFromLDAPPath
{
    [CmdletBinding()]
    Param
    (
        [Parameter(HelpMessage="LDAP path.")]
        [string]
        $LDAPPath,
        [Parameter(HelpMessage="Translate the naming attribute (CN, OU, DC) to its full name")]
        [switch]
        $TranslateNamingAttribute
    )
    $output = @()
    $ldaparr = $LDAPPath -split ','
    # Named groups: LDAPType is the part before '=', LDAPName the part after
    $regex = [regex]'(?<LDAPType>^.+)\=(?<LDAPName>.+$)'
    $position = 0
    $ldaparr | %{
        if ($_ -match $regex)
        {
            if ($TranslateNamingAttribute)
            {
                switch ($matches['LDAPType'])
                {
                    'CN' {$_ldaptype = "Common Name"}
                    'OU' {$_ldaptype = "Organizational Unit"}
                    'DC' {$_ldaptype = "Domain Component"}
                    default {$_ldaptype = $matches['LDAPType']}
                }
            }
            else
            {
                $_ldaptype = $matches['LDAPType']
            }
            # One object per path component, numbered from the leaf (0) upward
            $objprop = @{
                LDAPType = $_ldaptype
                LDAPName = $matches['LDAPName']
                Position = $position
            }
            $output += New-Object psobject -Property $objprop
            $position++
        }
    }
    Write-Output -InputObject $output
}
Thanks Zachary! I will check it out
Another way of using Regex here is to use these two lines:
'OU=MTL1,OU=CORP,DC=FX,DC=LAB' -match '(?<=(^OU=))(?=(,))'
$matches[0]
This uses the regex concepts of lookahead and lookbehind which are covered fairly well in this article:
http://blogs.technet.com/b/heyscriptingguy/archive/2011/03/03/use-powershell-regular-expressions-to-format-numbers.aspx
This matches the string that comes after the pattern '^OU='(the caret '^' is the beginning of line metacharacter) and before the pattern ','. The -match operator returns a boolean and stores the actual matches in the $matches variable.
Splitting on ',*..=' requires you to know where the substring is in the string in order to choose the correct index from the array. If that's the case, the Substring method is going to be more straightforward than regex. If you're not sure where in the string the pattern is, you're better off using regex.
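As a rough illustration of the two -match styles discussed in this post, the sketch below puts the lookaround version next to a named capture group. The group name SiteCode and the variable $dn are hypothetical, chosen only for this example. Testing the boolean result first matters because $Matches is not cleared when a match fails, so it can still hold the result of an earlier successful match.

```powershell
$dn = 'OU=MTL1,OU=CORP,DC=FX,DC=LAB'

# Lookbehind/lookahead: "OU=" and "," are required but not part of the match,
# so the match itself is only the text in between.
if ($dn -match '(?<=^OU=)\w*(?=,)') {
    $viaLookaround = $Matches[0]
}

# Named capture group: the whole match is "OU=MTL1", the group holds "MTL1".
# "SiteCode" is a hypothetical group name used for illustration.
if ($dn -match '^OU=(?<SiteCode>[^,]*)') {
    $viaGroup = $Matches['SiteCode']
}

$viaLookaround   # MTL1
$viaGroup        # MTL1
```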
Thanks Jay. Weird, I'm getting an error with
'OU=MTL1,OU=CORP,DC=FX,DC=LAB' -match '(?<=(^OU=))(?=(,))'
$matches[0]
The first line returns $false.
Thanks again for the information, really useful! Very much appreciated.
My apologies, the first line should be:
'OU=MTL1,OU=CORP,DC=FX,DC=LAB' -match '(?<=(^OU=))\w*(?=(,))'
Nice!! Thanks Jay :-)
Another useful link on Regular Expressions in .NET:
Regular Expression Language - Quick Reference
http://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
Thanks Jay, useful!
I added the link to the post.
Here is another way, except that with this method you can select column 1 or column 2 at the same time side by side, take a look:
$var1 = "OU=MTL1,OU=CORP,DC=FX,DC=LAB"
This gives column 1
$var1 | % {"{1}" -f($_ -split ',*..=')}
MTL1
Now you want columns 1 and 3:
$var1 | % {"{1} {3}" -f($_ -split ',*..=')}
MTL1 FX
If you want a separator of some kind, a dash "-" in this case:
$var1 | % {"{3} - {1}" -f($_ -split ',*..=')}
FX - MTL1
Enjoy!
Awesome! Thanks Luis, Great info!
It's actually faster with your line
PS C:\Users\Francois-Xavier> Measure-Command { ($var1 -split ',*..=')[1]}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 1
Ticks : 16121
TotalDays : 1.86585648148148E-08
TotalHours : 4.47805555555556E-07
TotalMinutes : 2.68683333333333E-05
TotalSeconds : 0.0016121
TotalMilliseconds : 1.6121
PS C:\Users\Francois-Xavier> Measure-Command { $var1 | % {"{1}" -f($_ -split ',*..=')}}
Days : 0
Hours : 0
Minutes : 0
Seconds : 0
Milliseconds : 0
Ticks : 5494
TotalDays : 6.3587962962963E-09
TotalHours : 1.52611111111111E-07
TotalMinutes : 9.15666666666667E-06
TotalSeconds : 0.0005494
TotalMilliseconds : 0.5494
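A single Measure-Command run is very short here and may be influenced by one-time costs, so the two numbers above are hard to compare. One way to get a steadier picture, sketched below, is to repeat each expression many times and compare the totals. The iteration count and the variable names $iterations, $direct and $formatted are my own choices for illustration.

```powershell
$var1 = 'OU=MTL1,OU=CORP,DC=FX,DC=LAB'
$iterations = 10000   # arbitrary count chosen for illustration

# Time the plain index lookup, repeated $iterations times
$direct = Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        $null = ($var1 -split ',*..=')[1]
    }
}

# Time the pipeline + format operator version, repeated the same number of times
$formatted = Measure-Command {
    for ($i = 0; $i -lt $iterations; $i++) {
        $null = $var1 | ForEach-Object { '{1}' -f ($_ -split ',*..=') }
    }
}

'Index lookup:    {0:N1} ms' -f $direct.TotalMilliseconds
'Format operator: {0:N1} ms' -f $formatted.TotalMilliseconds
```

Results will vary from machine to machine and between PowerShell versions, so treat the totals as a relative comparison only.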
I added a couple of links at the bottom of the post. The Scripting Guy Ed Wilson and Lee Holmes wrote some nice articles about the subject. Hope this helps!
Thanks, very nice post