Scraping Javascript using Selenium via Python

Updated: 2023-09-26

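I am trying to scrape JavaScript-rendered data from a website. As a challenge to myself, I am currently trying to scrape the number of followers from this website. Here is my code so far: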

import os
from selenium import webdriver
import time
chromedriver = "/Users/INSERT USERNAME/Desktop/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
driver.get("http://freelegalconsultancy.blogspot.co.uk/")
time.sleep(5)
title = driver.find_element_by_class_name
print title

As you can see, I have a chromedriver file on my desktop. When I run the code, I get the following result:

<bound method WebDriver.find_element_by_class_name of <selenium.webdriver.chrome.webdriver.WebDriver (session="dd9e5d3f429bc2810c30ebe7067e4e22")>>

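I tried iterating over this with a for loop, but it raised an error. Does anyone know how I can get the JavaScript data and ultimately the number of followers?

Edit:

So, as requested, I have changed my code to: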

import os
from selenium import webdriver
import time
chromedriver = "/Users/INSERT USERNAME/Desktop/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
driver.get("http://freelegalconsultancy.blogspot.co.uk/")
time.sleep(5)
title = driver.find_element_by_class_name("member-title")
print title

But now I get this error:

Traceback (most recent call last):
  File "C:\Users\INSERT USERNAME\Desktop\blogger_v.1.py", line 11, in <module>
    title = driver.find_element_by_class_name("member-title")
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 413, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 752, in find_element
    'value': value})['value']
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"member-title"}
  (Session info: chrome=53.0.2785.143)
  (Driver info: chromedriver=2.24.417431 (9aea000394714d2fbb20850021f6204f2256b9cf),platform=Windows NT 6.1.7601 SP1 x86_64)

Any ideas?

Edit:

So I changed my code to:
import os
from selenium import webdriver
import time
chromedriver = "/Users/INSERT USERNAME/Desktop/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
driver.get("http://freelegalconsultancy.blogspot.co.uk/")
time.sleep(5)
title = driver.find_element_by_class_name("item-title")
print title

I get this result:

<selenium.webdriver.remote.webelement.WebElement (session="5fe8fb966edd26fdf808da07f99d4109", element="0.9924860218635834-1")>

How can I print all of the JavaScript? Is that possible?
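
Printing a WebElement only shows its repr, as in the result above. Here is a minimal sketch of reading its content instead. It assumes Python 3 and Selenium 4, where `find_element(By.CLASS_NAME, ...)` takes the place of `find_element_by_class_name`, and it swaps the fixed `time.sleep(5)` for an explicit wait. The helper names `summarize_element` and `fetch_item_title` are made up for illustration.

```python
def summarize_element(element):
    """Return the visible text and inner HTML of a WebElement-like object."""
    return {
        "text": element.text,
        "inner_html": element.get_attribute("innerHTML"),
    }


def fetch_item_title(url, class_name="item-title", timeout=10):
    """Open url in Chrome and summarize the first element with class_name."""
    # Imported here so summarize_element can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: Selenium 4.6+ is installed and can locate a Chrome driver itself.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until the element is present instead of sleeping a fixed time.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, class_name))
        )
        return summarize_element(element)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling `fetch_item_title("http://freelegalconsultancy.blogspot.co.uk/")` should then return a dict with the element's text and inner HTML. If you want the whole rendered page rather than one element, `driver.page_source` gives it as a string.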

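You need to call the method and pass the class name you are looking for as an argument. Without the parentheses, `title` is just a reference to the method itself, which is why you see `<bound method ...>` printed.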

title = driver.find_element_by_class_name("TheNameOfTheClass")
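
To see the difference without opening a browser, here is a small sketch using a hypothetical stand-in class. `StandInDriver` is not part of Selenium; it only mimics the method name from the question.

```python
class StandInDriver:
    """Hypothetical stand-in for a WebDriver; it does not open a browser."""

    def find_element_by_class_name(self, name):
        # A real driver would search the page; this just echoes the name.
        return "element with class '{}'".format(name)


driver = StandInDriver()

# No parentheses: this stores the method itself, not a result.
method = driver.find_element_by_class_name
print(method)  # prints a "<bound method ...>" repr, like in the question

# With parentheses and an argument: the method runs and returns a value.
element = driver.find_element_by_class_name("TheNameOfTheClass")
print(element)
```

The first `print` shows a bound method, which is what the original code produced. The second shows the value returned by actually calling it.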