Use PowerShell’s Invoke-WebRequest to retrieve and parse an RSS or Atom feed.
Status monitoring feeds aplenty
There are a lot of third-party services that provide status updates through RSS/Atom feeds. Statuspage is a popular service that other companies use to provide this type of status feed, as well as a status page for web viewing.
In my case, I wanted to monitor one of these status feeds with PowerShell so I could send email alerts and display status information within SolarWinds Orion. Here is how I used the Atom feed from DigitalOcean’s Statuspage.
Invoke-WebRequest: PowerShell’s way to process the web
To pull down and work with DigitalOcean’s Statuspage Atom feed, we will use the cmdlet Invoke-WebRequest. Per docs.microsoft.com, Invoke-WebRequest “gets content from a web page on the Internet.”
Introduced in PowerShell 3.0, the cmdlet is described there as follows: “The Invoke-WebRequest cmdlet sends HTTP, HTTPS, FTP, and FILE requests to a web page or web service. It parses the response and returns collections of forms, links, images, and other significant HTML elements.”
Get the Atom Feed URL
First, we’ll go to https://status.digitalocean.com and get the Atom feed’s URL. Why the Atom feed as opposed to RSS? Ultimately it’s a personal preference. When comparing Atom to RSS, I felt that Atom was the better standard (and now let the flame war ensue).
Right-click the Atom Feed URL and copy the link address.
For reference, it is https://status.digitalocean.com/history.atom. If you open it in a browser, it will look something like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
      <id>tag:status.digitalocean.com,2005:/history</id>
      <link rel="alternate" type="text/html" href="https://status.digitalocean.com"/>
      <link rel="self" type="application/atom+xml" href="https://status.digitalocean.com/history.atom"/>
      <title>DigitalOcean Status - Incident History</title>
      <updated>2017-11-29T08:03:18Z</updated>
      <author>
        <name>DigitalOcean</name>
      </author>
      <entry>
        <id>tag:status.digitalocean.com,2005:Incident/1517334</id>
        <published>2017-11-27T23:22:27Z</published>
        <updated>2017-11-27T23:22:27Z</updated>
        <link rel="alternate" type="text/html" href="https://status.digitalocean.com/incidents/l5g9gxc5yjdw"/>
        <title>Delayed Ticket Creation via Contact Form</title>
        <content type="html"><p><small>Nov 27, 23:22 UTC</small><br><strong>Update</strong> - We have temporarily disabled the contact form while we work to resolve the issues causing delays with ticket creation. Until the issue is fixed, please email contact@digitalocean.com or submit a ticket directly through the control panel if you need assistance from our support team. We appreciate your patience, and apologize for any inconveniences.</p><p><small>Nov 27, 18:40 UTC</small><br><strong>Investigating</strong> - We are investigating an issue where messages submitted through our contact form are experiencing delays creating tickets in our support center. This issue may lead to a delay in response time from our support team. As we work to correct this issue, we recommend that users in need of support log into the control panel to submit tickets, or email us at contact@digitalocean.com.</p></content>
      </entry>
      ...
Return Invoke-WebRequest result to $Response variable
Using this URL, we will now request the content from the URL and store the response in a PowerShell variable for further processing.
    # Get Atom Feed
    $Response = Invoke-WebRequest -Uri "https://status.digitalocean.com/history.atom" -UseBasicParsing -ContentType "application/xml"

    If ($Response.StatusCode -ne "200") {
        # Feed failed to respond.
        Write-Host "Message: $($Response.StatusCode) $($Response.StatusDescription)"
    }
Note that we used the argument -UseBasicParsing. In Windows PowerShell (5.1 and earlier), the Invoke-WebRequest cmdlet uses Internet Explorer to parse the response by default. If the computer we are running the script on (and this script is intended to run on a server) doesn’t have Internet Explorer installed, this argument allows us to still return the content. In PowerShell 6 and later, basic parsing is the only mode and the parameter has no effect.
I also add the argument -ContentType "application/xml", which specifies the content type of the request. This isn’t required; if you ran the command above without it, it would still return the same content, but I feel it is more complete and makes the intent easier to understand.
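One thing to keep in mind with the status check above: as far as I know, Invoke-WebRequest raises a terminating error for most failures (connection problems, and HTTP 4xx/5xx responses), so the StatusCode comparison may never be reached in those cases. Below is a minimal sketch of wrapping the request in try/catch. The function name Get-FeedContent is hypothetical and is not part of the original script.

```powershell
# Hypothetical helper (an assumption for illustration, not from the original script).
# Returns the feed body as a string, or $null if the request fails.
function Get-FeedContent {
    param(
        [Parameter(Mandatory)]
        [string] $Uri
    )

    try {
        # -ErrorAction Stop makes sure any failure lands in the catch block.
        $Response = Invoke-WebRequest -Uri $Uri -UseBasicParsing -ErrorAction Stop
    }
    catch {
        Write-Warning "Feed request failed: $($_.Exception.Message)"
        return $null
    }

    # Covers responses that did not throw but are still not a plain 200 OK.
    if ($Response.StatusCode -ne 200) {
        Write-Warning "Unexpected status: $($Response.StatusCode) $($Response.StatusDescription)"
        return $null
    }

    return $Response.Content
}
```

A caller could then use `$Content = Get-FeedContent -Uri "https://status.digitalocean.com/history.atom"` and skip the parsing step when `$Content` is `$null`.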
Exploring $Response object
Let’s take a look at $Response. You’ll see we get properties StatusCode and StatusDescription which tell us the response was OK. You then see Content contains XML, which is the same XML that we see when we navigate directly to the Atom feed URL. Because it is all XML, we can easily parse it.
    PS C:\Users\aaron> $Response

    StatusCode        : 200
    StatusDescription : OK
    Content           : <?xml version="1.0" encoding="UTF-8"?>
                        <feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
                        <id>tag:status.digitalocean.com,2005:/history</id>
                        <link rel="alternate" type="text/html" href="h...
    RawContent        : HTTP/1.1 200 OK
                        Transfer-Encoding: chunked
                        Connection: keep-alive
                        X-XSS-Protection: 1; mode=block
                        X-Content-Type-Options: nosniff
                        X-StatusPage-Version: d3af947
                        Vary: Accept,Accept-Encoding,Fastl...
    Forms             :
    Headers           : {[Transfer-Encoding, chunked], [Connection, keep-alive], [X-XSS-Protection, 1; mode=block], [X-Content-Type-Options, nosniff]...}
    Images            : {}
    InputFields       : {}
    Links             : {}
    ParsedHtml        :
    RawContentLength  : 41256
Parsing feed entries from XML
Now we will parse the XML to identify the entries. We cast $Response.Content as XML and store it in $FeedXml. We then iterate through the entries we find within the XML content and, in this example, add anything updated within the last 24 hours to an array. We use PSCustomObject to create an object that stores the information from each entry within the array.
    $FeedXml = [xml]$Response.Content
    $Entries = @()
    $Now = Get-Date

    # Extract recent entries (currently set for updated within the last 24 hours)
    ForEach ($Entry in $FeedXml.feed.entry) {
        If (($Now - [datetime]$Entry.updated).TotalHours -le 24) {
            $Entries += [PSCustomObject] @{
                'Id'      = "status.digitalocean.com - " + ($Entry.id).Remove(0, 24)
                'Updated' = [datetime]$Entry.updated
                'Title'   = $Entry.title
                'Content' = $Entry.content.'#text'
            }
        }
    }
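If you want to try the parsing logic without hitting the network, a small inline sample works well. The sketch below assumes a made-up two-entry feed (the status.example.com IDs, titles, and content are invented for illustration) and uses a Where-Object pipeline as an alternative to the ForEach/If combination; it is one possible way to write it, not the script's actual code.

```powershell
# Build one timestamp inside the 24-hour window; the second entry uses a fixed old date.
$Stamp = (Get-Date).ToUniversalTime().AddHours(-1).ToString(
    "yyyy-MM-dd'T'HH:mm:ss'Z'", [cultureinfo]::InvariantCulture)

# Made-up sample shaped like an Atom feed (all values are illustrative only).
$SampleXml = @"
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample Status - Incident History</title>
  <entry>
    <id>tag:status.example.com,2005:Incident/2</id>
    <updated>$Stamp</updated>
    <title>Recent incident</title>
    <content type="html">&lt;p&gt;Investigating&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>tag:status.example.com,2005:Incident/1</id>
    <updated>2017-11-27T23:22:27Z</updated>
    <title>Old incident</title>
    <content type="html">&lt;p&gt;Resolved&lt;/p&gt;</content>
  </entry>
</feed>
"@

$FeedXml = [xml]$SampleXml
$Cutoff  = (Get-Date).AddHours(-24)

# @() keeps the result an array even when zero or one entries match.
$Recent = @(
    $FeedXml.feed.entry |
        Where-Object { [datetime]$_.updated -ge $Cutoff } |
        ForEach-Object {
            [PSCustomObject]@{
                Id      = $_.id
                Updated = [datetime]$_.updated
                Title   = $_.title
                Content = $_.content.'#text'
            }
        }
)

$Recent | Format-List
```

Because the HTML inside the content element is escaped in the XML, the '#text' property returns it as a plain string that can later be used as an HTML email body.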
Sending email alert for recent feed entries
The reference script was intended to check the feed every five minutes and send an email alert for each entry updated within the last five minutes. We iterate through the array again to identify those recent items and send an email using Send-MailMessage. (NOTE: There is an opportunity for optimization here, since this could be incorporated into the initial ForEach iteration.)
    # Send email notifications for entries updated in the last 5 minutes.
    ForEach ($Entry in $Entries) {
        If (($Now - [datetime]$Entry.updated).TotalMinutes -le 5) {
            $Params = @{
                'Body'       = $Entry.Content
                'BodyAsHtml' = $true
                'From'       = "alert@example.com"
                'SmtpServer' = "smtp.example.com"
                'Subject'    = $Entry.Id + " - " + $Entry.Title
                'To'         = "alerts@example.com"
            }

            # Send notifications
            Send-MailMessage @Params
        }
    }
Here is an example snippet of an email alert generated from this feed:
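One way to make the five-minute filter easy to check without an SMTP server is to separate building the message parameters from sending them. The sketch below assumes entry objects with the Id, Updated, Title, and Content properties created earlier; the function name New-AlertMailParams and its WindowMinutes and Now parameters are hypothetical and not part of the original script.

```powershell
# Hypothetical helper (an assumption for illustration): returns one splatting
# hashtable per entry updated within the alert window. It sends nothing itself.
function New-AlertMailParams {
    param(
        [Parameter(Mandatory)]
        [AllowEmptyCollection()]
        [object[]] $Entries,

        [int] $WindowMinutes = 5,

        [datetime] $Now = (Get-Date)
    )

    foreach ($Entry in $Entries) {
        if (($Now - $Entry.Updated).TotalMinutes -le $WindowMinutes) {
            @{
                Body       = $Entry.Content
                BodyAsHtml = $true
                From       = "alert@example.com"
                SmtpServer = "smtp.example.com"
                Subject    = "$($Entry.Id) - $($Entry.Title)"
                To         = "alerts@example.com"
            }
        }
    }
}

# Made-up sample entries: one updated 2 minutes ago, one 2 hours ago.
$Now = Get-Date
$SampleEntries = @(
    [PSCustomObject]@{ Id = "sample - 2"; Updated = $Now.AddMinutes(-2); Title = "Fresh"; Content = "<p>New</p>" }
    [PSCustomObject]@{ Id = "sample - 1"; Updated = $Now.AddHours(-2);   Title = "Stale"; Content = "<p>Old</p>" }
)

$Pending = @(New-AlertMailParams -Entries $SampleEntries -Now $Now)
```

Each hashtable in `$Pending` can then be splatted in a loop, for example `foreach ($P in $Pending) { Send-MailMessage @P }`, once the From, To, and SmtpServer placeholders are replaced with real values.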
Full SolarWinds component monitor for Atom feed
In this post we covered the parts of the SolarWinds component monitor that relate to requesting an Atom feed and parsing it. For the full component monitor script, refer to the GitHub Gist link in the Reference section below. There are plenty of opportunities for optimization, but in its current form it provides the basic capabilities that we sought.
Reference
- Invoke-WebRequest | docs.microsoft.com
- DigitalOcean.Status.AtomFeed.Poll.ps1 | gist.github.com
JeffJ says
I feel obligated to point out that -useBasicParsing was added in PowerShell 5 as it has burned me in the past.
Aaron Rothstein says
Noted Jeff, thanks for pointing that out.
George Jenkins says
Thanks for this post on how to read Atom RSS feeds. Here’s the code I made using your page plus StackOverflow to remove hex characters from XML section in the content tag:
    $kurzweil = Invoke-WebRequest http://www.kurzweilai.net/feed/atom
    # Removing characters that are not valid in XML from the content.
    # \uD800-\uDFFF is kept so that surrogate pairs (code points above U+FFFF) survive.
    [xml]$content = $kurzweil.Content -replace "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\uD800-\uDFFF]", ""
    [array]$entries = $content.feed.entry
Then parse from there. Thanks again.
Hayden says
Most of the time I don’t make comments on websites, but I’d like to say that this article really forced me to do so. Really nice post!