I’m trying to get air quality index data from the airnow.gov website.
This one isn’t so simple, though. My output contains everything up to the second description element, which holds the only data I really need: the Air Quality Index. It’s as if the script skips the remaining data.
Any help would be appreciated!
My Code:
import system
import xml.dom.minidom
url = "http://feeds.enviroflash.info/rss/realtime/133.xml"
response = system.net.httpGet(url)
dom = xml.dom.minidom.parseString(response)
for tag in dom.getElementsByTagName("*"):
	print tag.firstChild.data
DATA:
<rss version="2.0">
<channel>
<title>San Francisco, CA - Current Air Quality</title>
<link>http://www.airnow.gov/</link>
<description>EnviroFlash RSS Feed</description>
<language>en-us</language>
<webMaster>
airnowdmc@sonomatech.com (AIRNow Data Management Center)
</webMaster>
<pubDate>Thu, 12 Oct 2017 08:45:10 PDT</pubDate>
<item>
<title>San Francisco, CA - Current Air Quality</title>
<link>
http://feeds.enviroflash.info/rss/realtime/133.xml?id=AC9AF12B-02F4-5A9E-BD504999C6EF606E
</link>
<description>
<!--  Format data output  -->
 <div xmlns="http://www.w3.org/1999/xhtml"> <table style="width: 350px;">    
 <tr> <td> <br> </td> </tr> <tr> <td valign="top">
 <div><b>Location:</b> San Francisco, CA</div><br /> <div> <b>Current
 Air Quality:</b> 10/12/17 8:00 AM PDT<br /><br /> <div> Unhealthy -
 156 AQI - Particle Pollution (2.5 microns)<br /> <br /> Good - 1 AQI -
 Ozone<br /> <br /> </div> </div> <div><b>Agency:</b> San Francisco Bay
 Area AQMD </div><br /> <div><i>Last Update: Thu, 12 Oct 2017 08:45:10
 PDT</i></div> </td> </tr> </table> </div>
</description>
</item>
</channel>
</rss>
My OUTPUT:
San Francisco, CA - Current Air Quality
http://www.airnow.gov/
EnviroFlash RSS Feed
en-us
airnowdmc@sonomatech.com (AIRNow Data Management Center)
Thu, 12 Oct 2017 08:45:10 PDT
San Francisco, CA - Current Air Quality
http://feeds.enviroflash.info/rss/realtime/133.xml?id=AC9AF12B-02F4-5A9E-BD504999C6EF606E
             
Simplest change:
for tag in dom.getElementsByTagName("*"):
	print tag.lastChild.data
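tag.firstChild only returns the first child node of each element. In the item’s description, the comment <!-- Format data output --> seems to be counted as a node ahead of the payload, so firstChild never reaches the text you want and nothing useful is printed. The lastChild of the description element has the full description value.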
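To see which node firstChild actually returns, a small sketch like the one below can help. The sample XML is a trimmed-down stand-in, and its layout (a comment followed by a CDATA block) is an assumption about how the feed delivers the description. `node_text` is a hypothetical helper, not part of minidom. It joins every text and CDATA child, so the result does not depend on whether the payload happens to be the first or the last node.

```python
import xml.dom.minidom
from xml.dom import Node

# Hypothetical, trimmed-down sample; the real feed's exact layout is an assumption.
SAMPLE = (
    '<rss version="2.0"><channel>'
    "<description>EnviroFlash RSS Feed</description>"
    "<item><description><!-- Format data output -->"
    "<![CDATA[<div>Unhealthy - 156 AQI - Particle Pollution (2.5 microns)</div>]]>"
    "</description></item>"
    "</channel></rss>"
)


def node_text(element):
    """Join every text and CDATA child, skipping comments and child elements."""
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in element.childNodes if c.nodeType in wanted)


dom = xml.dom.minidom.parseString(SAMPLE)
item = dom.getElementsByTagName("item")[0]
description = item.getElementsByTagName("description")[0]

# In this sample the first child is the comment, not the payload.
first_is_comment = description.firstChild.nodeType == Node.COMMENT_NODE
payload = node_text(description).strip()
```

Restricting the search to the item element also avoids picking up the channel-level description.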
             
Here’s another approach:
import re
import system
import xml.etree.ElementTree as ET
url = "http://feeds.enviroflash.info/rss/realtime/133.xml"
response = system.net.httpGet(url)
# get the inner HTML description from the XML
root = ET.fromstring(response)
inner = root.find('.//item/description')
inner_html = inner.text
# probably fragile regex to get the AQI values
aqi_particle = re.search(r"(\d+) AQI - Particle Pollution", inner_html).group(1)
aqi_ozone = re.search(r"(\d+) AQI - Ozone", inner_html).group(1)
print "AQI (Particle): ", aqi_particle
print "AQI (Ozone): ", aqi_ozone
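If the two hard-coded searches feel too fragile, one possible variation is to collect every "category - number AQI - pollutant" reading in a single pass. This is only a sketch: `SAMPLE_HTML` stands in for inner.text and is copied from the readings shown in the question, while `READING` and `parse_aqi` are hypothetical names. It assumes each reading keeps that three-part layout and ends at the next tag.

```python
import re

# Stand-in for inner.text, copied from the readings shown in the question.
SAMPLE_HTML = (
    "<div> Unhealthy - 156 AQI - Particle Pollution (2.5 microns)<br /> <br /> "
    "Good - 1 AQI - Ozone<br /> <br /> </div>"
)

# category - number AQI - pollutant, where the pollutant runs up to the next tag
READING = re.compile(r"([A-Za-z ]+?)\s+-\s+(\d+)\s+AQI\s+-\s+([^<]+)")


def parse_aqi(inner_html):
    """Return {pollutant: (category, aqi)} for every reading found."""
    readings = {}
    for category, value, pollutant in READING.findall(inner_html):
        readings[pollutant.strip()] = (category.strip(), int(value))
    return readings


readings = parse_aqi(SAMPLE_HTML)
# .get() returns None instead of raising if a pollutant is missing from the feed.
particle = readings.get("Particle Pollution (2.5 microns)")
```

Because a missing pollutant simply produces no dictionary entry, the script does not fail with an AttributeError the way `re.search(...).group(1)` does when nothing matches.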
             
That helped, thanks! Is there any way to write a gateway script that feeds the value to a tag, in case I want an alarm email sent out?
I tried the gateway script below, but it doesn’t work. It runs in the script editor, but when I make it a gateway script it throws IOError: connect timed out.
import re
import system
import xml.etree.ElementTree as ET

# the URL to pull data from
url = "http://feeds.enviroflash.info/rss/realtime/134.xml"

response = system.net.httpGet(url)

# get the inner HTML description from the XML
root = ET.fromstring(response)
inner = root.find('.//item/description')
inner_html = inner.text

# probably fragile regex to get the AQI value
aqi_particle = re.search(r"(\d+) AQI - Particle Pollution", inner_html).group(1)
currentData = aqi_particle

# push data to tag
system.tag.write("Global Tags/Air Quality Index", currentData)
             
That approach worked. I didn’t realize comments were being considered, thanks!
The output of that still contains the <div> markup, so I went with Kevin’s approach and sent the value to a window so the guys can see when it’s too smoky outside (wildfires up north).
Thanks again!
             
This script should work; I think the server just doesn’t have access to the internet.
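If the gateway’s connection turns out to be unreliable rather than absent, a guard along these lines might keep a timeout from killing the script. It is a sketch under assumptions: `update_aqi` is a hypothetical helper, and the fetch and write functions are passed in so the logic can be tried outside Ignition. It also assumes the timeout surfaces as an IOError, as in the error message above.

```python
import re
import xml.etree.ElementTree as ET


def update_aqi(fetch, write, url, tag_path):
    """Fetch the feed, extract the particle AQI, and write it to a tag.

    fetch(url) must return the feed as a string and write(tag_path, value)
    must store the value. Returns the integer written, or None on failure.
    """
    try:
        response = fetch(url)
    except IOError:
        # e.g. "connect timed out" when the gateway cannot reach the feed
        return None
    root = ET.fromstring(response)
    inner = root.find(".//item/description")
    if inner is None or inner.text is None:
        return None
    match = re.search(r"(\d+) AQI - Particle Pollution", inner.text)
    if match is None:
        return None
    value = int(match.group(1))
    write(tag_path, value)
    return value
```

In a gateway script the call would presumably look like update_aqi(system.net.httpGet, system.tag.write, url, "Global Tags/Air Quality Index"). Returning None on failure leaves the tag at its last good value instead of raising.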