Trying to Get Selenium to Download Data Based on JavaScript…I think

I am trying to download data from the following URL.

https://www.nissanusa.com/dealer-locator.html

I came up with this, but it doen't actually grab any of the data.

import urllib.request from bs4 import BeautifulSoup url = "https://www.nissanusa.com/dealer-locator.html" text = urllib.request.urlopen(url).read() soup = BeautifulSoup(text) data = soup.findAll('div',attrs='class':'dealer-info') for div in data: links = div.findAll('a') for a in links: print(a['href'])

I've done this a couple times before, and it has always worked in the past. I'm guessing the data is dynamically generated by JavaScript, based on the filters that a user selects, but I don't know for sure. I've read that Selenium can be used to automate a web browser, but I have never used it, and I'm not really sure where to start. Ultimately, I am trying to get the data in this format, in the image below. Either printed in the Console Window, or downloaded to a CSV, would be fine.

enter image description here

Finally, how the heck does the site get the data? Whether I enter New York City or San Francisco, the map and the data set changes relative to the filter that is applied, but the URL does not change at all. Thanks in advance.

Have you tried passing any text e.g. New York City or San Francisco to the search box? Which data from the dealer-info are you looking for?
– New contributor
Aug 10 at 4:43

New York City

San Francisco

1 Answer
1

Use selenium to open/navigate to the page, then pass the page source to BeautifulSoup.

from selenium import webdriver from selenium.common.exceptions import TimeoutException from selenium.webdriver.common.by import By from selenium.webdriver.support import expected_conditions as EC from selenium.webdriver.support.wait import WebDriverWait from bs4 import BeautifulSoup browser = webdriver.Chrome() wait = WebDriverWait(browser, 10) url = 'https://www.nissanusa.com/dealer-locator.html' browser.get(url) time.sleep(10) // wait page open complete html = browser.page_source soup = BeautifulSoup(html, "html.parser") data = soup.findAll('div',attrs='class':'dealer-info') for div in data: links = div.findAll('a') for a in links: print(a['href'])

That looks pretty much right, but when I run it I get this: 'AttributeError: 'bytes' object has no attribute 'findAll''
– ryguy72
Aug 10 at 13:43

By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

搜尋此網誌

Sfyjdyy

Trying to Get Selenium to Download Data Based on JavaScript…I think

Trying to Get Selenium to Download Data Based on JavaScript…I think

1 Answer
1

Popular posts from this blog

make 2 or more post in bootsrap

Store custom data using WC_Cart add_to_cart() method in Woocommerce 3

React Native Navigation and navigating to another Screen problem

Trying to Get Selenium to Download Data Based on JavaScript…I think

Trying to Get Selenium to Download Data Based on JavaScript…I think

1 Answer 1

Popular posts from this blog

make 2 or more post in bootsrap

Store custom data using WC_Cart add_to_cart() method in Woocommerce 3

React Native Navigation and navigating to another Screen problem

1 Answer
1