Unable to obtain desired response from .aspx login page using Python and the Requests module

Keywords: login, response, aspx, Python, module, requests      Last updated: 2023-09-26

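I have been trying to log in to an .aspx site (https://web.iress.com.au/html/LogonForm.aspx, for source / initial cookie reference) that submits its form using the JavaScript function __doPostBack(eventTarget, eventArgument). My knowledge of JavaScript is very limited, so that is a best guess.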

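My current understanding of HTTP requests is that, in the context of forms, they are mostly POST requests. I used Chrome to sniff the request headers and form data that are sent when no credentials are entered (left blank for security reasons). They are as follows: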

Remote Address:##BLANKEDOUT##
Request URL:https://web.iress.com.au/html/logon.aspx
Request Method:POST
Status Code:302 Found
**Request Headers**
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Content-Length:585
Content-Type:application/x-www-form-urlencoded
Cookie:ASP.NET_SessionId=##SESSION ID STRING##
Host:web.iress.com.au
Origin:https://web.iress.com.au
Pragma:no-cache
Referer:https://web.iress.com.au/html/LogonForm.aspx
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/##ADDRESS## Safari/537.36
**Form Data**
__EVENTTARGET:
__EVENTARGUMENT:
__VIEWSTATE:##VIEWSTATE STRING##
__VIEWSTATEGENERATOR:##VIEWSTATEGENERATOR KEY##
__PREVIOUSPAGE:##PREVIOUSPAGE STRING##
__EVENTVALIDATION:##STRING##
fu:LogonForm.aspx
su:Default.aspx
un: # Would be my username if I had typed it in
pw: # Would be my password
ImageButton1.x:45 # These two values change depending on where I click the submit button
ImageButton1.y:13

This is the code I used when trying to log in:

from requests import session
payload = {
    '__EVENTTARGET'             : '',
    '__EVENTARGUMENT'           : '',
    '__VIEWSTATE'               : '##STRING FOUND FROM CHROME SNIFF##',
    '__VIEWSTATEGENERATOR'      : '##STRING FOUND FROM CHROME SNIFF##',
    '__PREVIOUSPAGE'            : '##STRING FOUND FROM CHROME SNIFF##',
    '__EVENTVALIDATION'         : '##STRING FOUND FROM CHROME SNIFF##',
    'fu'                        : 'LogonForm.aspx',
    'su'                        : 'Default.aspx',
    'un'                        : 'myuser@company',
    'pw'                        : 'mypassword',
    'ImageButton1.x'            : '0', 
    'ImageButton1.y'            : '0' 
    }
requestheaders = {
    'Accept'                    :      'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Encoding'           : 'gzip,deflate',
    'Accept-Language'           : 'en-US,en;q=0.8',
    'Cache-Control'             : 'no-cache',
    'Connection'                : 'keep-alive',
    'Content-Type'              : 'application/x-www-form-urlencoded',
    'Host'                      : 'web.iress.com.au',
    'Origin'                    : 'https://web.iress.com.au',
    'Cookie'                    : '',
    'Pragma'                    : 'no-cache',
    'Referer'                   : 'https://web.iress.com.au/html/LogonForm.aspx',   
    'User-Agent'                : 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/##ADDRESS AS ABOVE## Safari/537.36'
    }
with session() as sesh:
    LOGINURL = 'https://web.iress.com.au/html/LogonForm.aspx'
    sesh.get(LOGINURL) #Get request to get the session ID cookie
    sessionID = sesh.cookies['ASP.NET_SessionId'] #Grab session ID value
    sessionIDname = 'ASP.NET_SessionId='
    sessionIDheader = str(sessionIDname + sessionID) #Prepare session ID header
    requestheaders['Cookie'] = sessionIDheader # Add session ID header to requestheaders dictionary

    response = sesh.post('https://web.iress.com.au/html/LogonForm.aspx', data=payload,  headers=requestheaders)

    print(response.headers)
    print(response.content)

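All I seem to get back as the response is the source of the login page itself (https://web.iress.com.au/html/LogonForm.aspx), that is, its content and its headers. I am also not sure whether this has anything to do with the __VARIABLES; they do not seem to change, with __PREVIOUSPAGE being the exception. Would I possibly need to scrape those __VARIABLES in order to send them with the request?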

You are posting to the wrong URL. Your own captured data shows that the form posts to https://web.iress.com.au/html/logon.aspx, but your code posts to /html/LogonForm.aspx instead.

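Note that the session object handles cookies for you, so don't set the Cookie header yourself. You should also avoid setting the Host, Origin and Content-Type headers, while the Cache-Control, Accept* and Pragma headers make no difference to how this works.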
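
Putting both points together, a minimal sketch might look like the one below. It is not tested against the real site and rests on several assumptions: the helper names (HiddenFieldParser, extract_hidden_fields, build_payload, login) are made up for illustration, the __VIEWSTATE-style values are assumed to be ordinary hidden inputs on LogonForm.aspx, and the fu / su defaults and the Referer header are simply copied from the Chrome capture in the question in case the server expects them. The hidden fields are read fresh from the page on every run rather than pasted in from Chrome, since __PREVIOUSPAGE was reported to change.

```python
from html.parser import HTMLParser

# Both URLs are taken from the question: the page that serves the form,
# and the "Request URL" Chrome showed for the actual POST.
LOGIN_PAGE_URL = 'https://web.iress.com.au/html/LogonForm.aspx'
POST_URL = 'https://web.iress.com.au/html/logon.aspx'


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get('type') or '').lower() == 'hidden'
        if is_hidden and attributes.get('name'):
            self.fields[attributes['name']] = attributes.get('value') or ''


def extract_hidden_fields(html):
    """Return a dict of the hidden form fields found in the given HTML."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def build_payload(html, username, password):
    """Combine the page's hidden fields with the credentials."""
    payload = extract_hidden_fields(html)
    # Assumption: fall back to the values seen in the Chrome capture
    # if the page does not supply these fields as hidden inputs.
    payload.setdefault('fu', 'LogonForm.aspx')
    payload.setdefault('su', 'Default.aspx')
    payload.update({
        'un': username,
        'pw': password,
        'ImageButton1.x': '0',  # click coordinates on the image button
        'ImageButton1.y': '0',
    })
    return payload


def login(session, username, password):
    """Log in using a requests.Session (or any object with get/post)."""
    # The GET stores ASP.NET_SessionId in session.cookies, so no
    # Cookie header has to be built by hand.
    page = session.get(LOGIN_PAGE_URL)
    payload = build_payload(page.text, username, password)
    # Assumption: Referer is kept only in case the server checks it.
    return session.post(POST_URL, data=payload,
                        headers={'Referer': LOGIN_PAGE_URL})


# Usage (performs real network requests):
#   import requests
#   with requests.Session() as sesh:
#       response = login(sesh, 'myuser@company', 'mypassword')
#       print(response.status_code, response.url)
```

Because the session keeps the cookie jar, the same sesh object can then be used for further requests once the login has succeeded. If the response is still the login page, comparing response.url and response.history with what Chrome shows would be a reasonable next check.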