Trying to Get Selenium to Download Data Based on JavaScript…I think

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP



Trying to Get Selenium to Download Data Based on JavaScript…I think



I am trying to download data from the following URL.



https://www.nissanusa.com/dealer-locator.html



I came up with this, but it doen't actually grab any of the data.


import urllib.request
from bs4 import BeautifulSoup

url = "https://www.nissanusa.com/dealer-locator.html"
text = urllib.request.urlopen(url).read()
soup = BeautifulSoup(text)

data = soup.findAll('div',attrs='class':'dealer-info')
for div in data:
links = div.findAll('a')
for a in links:
print(a['href'])



I've done this a couple times before, and it has always worked in the past. I'm guessing the data is dynamically generated by JavaScript, based on the filters that a user selects, but I don't know for sure. I've read that Selenium can be used to automate a web browser, but I have never used it, and I'm not really sure where to start. Ultimately, I am trying to get the data in this format, in the image below. Either printed in the Console Window, or downloaded to a CSV, would be fine.



enter image description here



Finally, how the heck does the site get the data? Whether I enter New York City or San Francisco, the map and the data set changes relative to the filter that is applied, but the URL does not change at all. Thanks in advance.





Have you tried passing any text e.g. New York City or San Francisco to the search box? Which data from the dealer-info are you looking for?
– New contributor
Aug 10 at 4:43


New York City


San Francisco




1 Answer
1



Use selenium to open/navigate to the page, then pass the page source to BeautifulSoup.


from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from bs4 import BeautifulSoup

browser = webdriver.Chrome()
wait = WebDriverWait(browser, 10)

url = 'https://www.nissanusa.com/dealer-locator.html'
browser.get(url)

time.sleep(10) // wait page open complete

html = browser.page_source
soup = BeautifulSoup(html, "html.parser")

data = soup.findAll('div',attrs='class':'dealer-info')
for div in data:
links = div.findAll('a')
for a in links:
print(a['href'])





That looks pretty much right, but when I run it I get this: 'AttributeError: 'bytes' object has no attribute 'findAll''
– ryguy72
Aug 10 at 13:43






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

make 2 or more post in bootsrap

Store custom data using WC_Cart add_to_cart() method in Woocommerce 3

Firebase Auth - with Email and Password - Check user already registered