Hi guys, I was thinking I may want to access a project’s own gateway web page to get some info that may not be accessible otherwise (from the client). I’m trying to use system.net.httpGet and supply a url that needs authentication to get to, plus a username and password that should have access to it, but the response I get is just the source for the sign in page. Is there something else I need to do here to login to the gateway web page?
You would likely need to jump through some hoops using Python’s httplib.
When you first hit the site, you need to look at the response headers and pull out your assigned session ID.
Then you’d need to make a POST, with your auth credentials and that session ID, to the endpoint that processes logins.
Then you could make a request for the page you want to scrape, as long as you still include that session ID and the session remains valid.
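Those three steps could look roughly like the sketch below. It is a sketch under assumptions, not a supported login method: it assumes the session cookie is named JSESSIONID, and extract_session_id, fetch_with_login, login_path, page_path and form_body are hypothetical names you would have to fill in from the sign-in page's form. The gateway may well need more than this (extra form fields, a new session ID after login), so treat it as a starting point. The import fallback is only there so the same code also runs outside Jython 2.

```python
import re

try:                       # Python 2 / Jython
    import httplib
except ImportError:        # Python 3
    import http.client as httplib

# Assumption: the session cookie is called JSESSIONID.
SESSION_RE = re.compile(r"JSESSIONID=([^;,\s]+)")


def extract_session_id(set_cookie):
    """Return the session ID from a Set-Cookie header value, or None."""
    if not set_cookie:
        return None
    found = SESSION_RE.search(set_cookie)
    return found.group(1) if found else None


def fetch_with_login(host, port, login_path, page_path, form_body):
    """Get a session, POST the credentials, then GET the page."""
    conn = httplib.HTTPConnection(host, port)
    try:
        # 1) First hit: the server assigns a session ID via Set-Cookie.
        conn.request("GET", page_path)
        res = conn.getresponse()
        res.read()  # drain the body before sending the next request
        session_id = extract_session_id(res.getheader("set-cookie"))
        if session_id is None:
            return None
        cookie = "JSESSIONID=%s" % session_id

        # 2) POST the url-encoded credentials along with the session ID.
        conn.request("POST", login_path, form_body, {
            "Cookie": cookie,
            "Content-Type": "application/x-www-form-urlencoded",
        })
        res = conn.getresponse()
        res.read()

        # 3) Ask for the page again, still sending the same session ID.
        conn.request("GET", page_path, headers={"Cookie": cookie})
        res = conn.getresponse()
        return res.status, res.read()
    finally:
        conn.close()
```

A 302 status from the last step most likely means the login was not accepted and you are being sent back to the sign-in page.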
This is something that we have eventual plans to improve - exactly how is up in the air, but the rough idea was some kind of API token table (in the gateway configure page) where you can add/remove random tokens, then use those as authentication against endpoints.
Consider having a version/overload of httpGet() that only goes to the gateway, and automatically inherits the client session (cookie). Or an isolated child session, identified by the main session's cookie. Gateway endpoints already enforce various role requirements, so it would leverage existing features. This would be especially helpful for generic webdev endpoints.
I think I’ve got the gist of what you’re saying: I get the cookie after trying once, then use that cookie when POSTing my username/password, then use GET again. I got another ‘Found’ … but no response with any data. Here is my test code. I’ve cleaned it up a bit but it’s pretty rough right now. It uses some parameters for the sensitive bits, which have been left out of what I’m posting here. This was also written in the script console, so it assumes variables (like cookies) carry over between runs.
import httplib, urllib
conn = httplib.HTTPConnection(url, port)
params = urllib.urlencode({'id13_hf_0': '', 'username': username, 'password': password, 'button': "Sign In"})
# `cookie` carries over from a previous run in the script console
if cookie is not None:
    cookie = cookie.split(";")[0]
else:
    cookie = None
#print cookie
headers = {
    'Cookie': cookie,
    'Content-Type': "application/x-www-form-urlencoded",
    'Connection': "keep-alive",
    'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8"
}
#print headers
conn.request("POST","/main/web/signin?2-1.IFormSubmitListener-sign~in~form",params,headers)
res = conn.getresponse()
print res.status, res.reason
resRead = res.read()
hdrs = res.getheaders()
cookieNew = res.getheader('set-cookie')
cookie = cookieNew if cookieNew is not None else cookie
print cookie
print "\n".join(["%s: %s" % r for r in hdrs])
print "----"
headers = {
    'Cookie': cookie,
    'Content-Type': "application/x-www-form-urlencoded",
    'Connection': "keep-alive",
    'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    # the valid spelling is "gzip, deflate", but with that the body may come back compressed
    'Accept-Encoding': "gzip-deflate",
    'Accept-Language': "en-US, en;q=0.9",
    'DNT': "1",
    'Host': "%s:%s" % (url, port),
    'Upgrade-Insecure-Requests': "1",
    'User-Agent': "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36"
}
params = urllib.urlencode({'3': ''})
conn.request("GET", "/main/web/status/?3", params, headers)
res = conn.getresponse()
print res.status, res.reason
resRead = res.read()
hdrs = res.getheaders()
print "\n".join(["%s: %s" % r for r in hdrs])
print resRead
conn.close()
You’re on the right track, but there could be some extra trickery going on behind the scenes that makes this unviable (and obviously unsupported). The 302 response is a redirect: it tells the browser to go to another page.
I tested your code and the submission seems to work but I never actually get access - the 302 redirects back to the sign in page as if the login failed. If I hard-code an already logged in session ID from my browser I can get the pages without any extra hoops which makes me think there’s some magic to posting through the login form that is still missing.
Thanks for that, I’ve continued to try other methods as much as I can with no luck.
Anyone else have an idea for this? Any IA people around and able to shed some light on what we’re missing here?
Edit: New problem, sorta. I think I was able to get in by POSTing the username & parameters to the form action URL that’s provided in the source of the sign-in page. But now, the source I get for something like the log viewer page just doesn’t match what I get when I look via the browser. It’s missing… well, the good stuff. Seems like some script takes care of this later in the process in a normal browser, and I’m not sure if we can easily replicate it.
That's because it's all generated dynamically by React. You'll have to monitor the calls a browser makes to the gateway to figure out how to retrieve what you want. If that's even possible. Consider adding the React tools to your browser.
If you're specifically looking for logs information, then you can use the internal endpoint (rather than trying to parse the frontend/React output):
http://<gateway>/main/data/status/logs?page=0&pageSize=100&minLevel=ALL&queryText=
Should be fairly self explanatory. This will return a JSON object that you can then freely parse/manipulate.
All of the status pages (to my knowledge, anyways) work through some internal endpoint that returns plain JSON, so using the browser's network tracing tools is your best bet to get programmatically useful data out of these pages.
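As a rough illustration, building that logs URL and decoding the response could look like the sketch below. build_logs_url and parse_body are hypothetical helper names, the gateway argument is assumed to be host:port, and the structure of the returned JSON is not shown because it isn't described here. Inspect what comes back before relying on any particular key.

```python
import json

try:                       # Python 2 / Jython
    from urllib import urlencode
except ImportError:        # Python 3
    from urllib.parse import urlencode


def build_logs_url(gateway, page=0, page_size=100, min_level="ALL", query_text=""):
    """Build the logs endpoint URL; `gateway` is host:port."""
    # A list of pairs keeps the parameters in a fixed order.
    query = urlencode([
        ("page", page),
        ("pageSize", page_size),
        ("minLevel", min_level),
        ("queryText", query_text),
    ])
    return "http://%s/main/data/status/logs?%s" % (gateway, query)


def parse_body(body):
    """Decode the JSON text returned by the endpoint into Python objects."""
    return json.loads(body)
```

In a gateway or client script you would fetch the URL with something like system.net.httpGet and pass the returned string to parse_body.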
Fantastic, even better than what I was hoping for. As far as I can tell so far, all the ones I’m looking for have the json retrieval and it works great through scripting. Thanks!
I’m revisiting this project and have another problem - using system.net.httpGet() from the gateway seems to have a timeout error. I have a UDT with a boolean memory tag, when that tag goes from low to high it runs a shared script that will ideally use httpGet to login to a given gateway URL and get the JSON information as discussed above. In particular, I’m trying to access the gateway the script is on, so nothing should be too odd.
def signin(urlP):
    url = "http://%s:8088" % urlP
    webURL = "/main/web"
    username = 'admin'
    password = 'password'
    # checkSignin returns "" when already signed in, else the sign-in form's action path
    signinTest = shared.HTTP.checkSignin(url)
    retStr = []
    if signinTest == "":
        retStr.append("signed in")
    else:
        retStr.append("not signed in: %s" % signinTest)
        retStr.append("trying signin now")
        signinResponse = system.net.httpPost(url + webURL + signinTest, {'username': username, 'password': password})
        retStr.append('Signin Response:\n%s' % signinResponse)
        newTest = shared.HTTP.checkSignin(url)
        if newTest == "":
            retStr.append("signin successful")
        else:
            retStr.append("signin not successful")
    return retStr
def checkSignin(url, ext=None):
    import re
    ext = ext if ext is not None else "/main/web/status/"
    firstGet = system.net.httpGet(url = url + ext)
    # a sign-in page contains a form whose action starts with "."; capture the rest of that path
    patternPre = ".*?action=\"\.(.*?)\""
    pattern = re.compile(patternPre, re.DOTALL)
    match = pattern.match(firstGet)
    if match:
        subText = match.group(1)
        return str(subText)
    else:
        return ""
This isn’t the most optimized code, but on the ‘firstGet’ assignment in checkSignin I have a timeout error. The same code I use to call signin() works just fine when in the script console, so I think something about the gateway scope makes it different, but I’m not sure why.
It’s likely an infinite loop because you don’t have a user session. You need to create a cookie first.
There aren’t really any opportunities there for an infinite loop - it goes into checkSignin when assigning signinTest, and then has the error there. And, after all this I set another memory tag to the retStr returned value. I just wrapped that assignment causing trouble in a try/except so the retStr finally returned - no duplicated entries at all.
As for the user session - that’s what this is trying to test. checkSignin tries to access a page you need to log in to get to, and if that acts like you’re already logged in it will return true else false, THEN it’ll actually log in. So it should be fine if there is no user session just yet; that will be made later.
Edit: Actually I’m not sure if I’ve changed something or just misread the error before, but now it’s saying ‘Server redirected too many times (20)’. So, an infinite loop after all, just not one caused by the script directly.
Edit 2: Got it working! For now at least. Used httplib to get the cookie (as far as I can tell the system.net functions don’t give the headers at all). Passed that in as a headerValue in the httpGet in checkSignin and it’s working well.
The getCookie code:
def getCookie(url):
    import re, httplib
    # HTTPConnection expects host[:port], so strip the scheme
    conn = httplib.HTTPConnection(url.replace("http://", ""))
    conn.request("GET", "/main/web/home")
    r1 = conn.getresponse()
    cookieStr = [value for (name, value) in r1.getheaders() if name == 'set-cookie']
    conn.close()
    cookies = cookieStr[0] if len(cookieStr) > 0 else ''
    # search() rather than match(), so JSESSIONID need not be the first cookie
    match = re.search("JSESSIONID=(?P<cookie>[^;]+)", cookies)
    # None when the response set no session cookie
    return str(match.group('cookie')) if match else None
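For anyone wondering how the cookie then gets passed along, here is a minimal sketch of the "headerValue" part. session_headers and get_with_session are hypothetical helpers, and the cookie name JSESSIONID is taken from getCookie above. The HTTP function is passed in as an argument so the same code can be tried outside Ignition; in a real script you would hand it system.net.httpGet.

```python
def session_headers(session_id):
    """Build the header dict that carries the session cookie."""
    return {"Cookie": "JSESSIONID=%s" % session_id}


def get_with_session(http_get, url, session_id):
    """Call http_get(url, headerValues=...) with the session cookie attached.

    http_get is expected to accept a headerValues keyword argument,
    as system.net.httpGet does.
    """
    return http_get(url, headerValues=session_headers(session_id))
```

Inside checkSignin that would be something along the lines of get_with_session(system.net.httpGet, url + ext, getCookie(url)).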
Just a heads up, this won’t work for 8. In getCookie, change the GET URL to
"/web/signin"
Nice!
Do you have a working final example? I tried to make it work but missed the checkSignin (shared.HTTP.checkSignin(url)) function.
I am trying to get the RAM/uptime of clients from this post and don’t know how to get the JSON for /main/data/vision/status/sessions.
Thanks
checkSignin is a function that can be found at the bottom of Michael’s snippet here