I thought this was an interesting exercise even if some tools would probably do a better job ($$$). Also we shouldn't take those results too seriously since some protocols like Ajax or HTML5 talk a lot to the servers, keep refreshing pages even if you are not actively working on them.
In this post, I will talk about the last part of this process and how I combine all those files to get a real monthly top domains.
Example of daily report
Here is a sample of daily CSV file result after the parsing has done its job:
So at the end of the month I have a bunch of CSV files that were generated during the months and I need to know how many time each domains were accessed.
One-Liner Magic!
To calculate the Sum of each domains for each months I can simply use the following One-Liner:
Get-ChildItem -Path .\*.csv | # Get each CSV files ForEach-Object -Process { Import-Csv -Path $PSItem.FullName # Import CSV data } | Group-Object -Property Name | # Group per Domain Name Select-Object -Unique -Property Name, @{ Label = "Sum"; Expression = { # Sum all the counts for each domain ($PSItem.group | Measure-Object -Property Count -sum).Sum } } | Sort-Object -Property Sum -Descending | Out-GridView -Title "Top Domains"
Step by Step
Get all the CSV File
Now we import the data inside each file. I added a Sort-Object so you can see that we have duplicates (since we have the same values coming from different files).
Using Group-Object by Domain Name. This will group also all the different "counts" for each domain
Finally we use Select-Object to show a unique Domain Name and Sum each counts present in the "Group" property from the Group-Object Cmdlet (see above).
Thanks for reading! If you have any questions, leave a comment or send me an email at fxcat@lazywinadmin.com. I invite you to follow me on Twitter @lazywinadm / Google+ / LinkedIn. You can also follow the LazyWinAdmin Blog on Facebook Page and Google+ Page.
No comments:
Post a Comment