Selenium is unable to extract the page source and returns an empty HTML body




Here is my Python code:


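import pandas as pd
import pandas_datareader.data as web
import bs4 as bs
import urllib.request as ul

# Assumed source of `style`; the original snippet omits this import.
from matplotlib import style
from selenium import webdriver

style.use('ggplot')
# Raw string, so that "\b" in the path is not read as a backspace escape.
driver = webdriver.PhantomJS(executable_path=r'C:\Phantomjs\bin\phantomjs.exe')


def getBondRate():
    # driver.delete_all_cookies()
    url = "https://www.marketwatch.com/investing/index/tnx?countrycode=xx"

    driver.get(url)
    # An implicit wait only affects element lookups; it does not delay page_source.
    driver.implicitly_wait(10)
    html = driver.page_source
    return html


bondRate = getBondRate()
print(bondRate)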



A few days ago this was reading the page from MarketWatch without any problem. Now it returns nothing inside the body tag. Is Selenium not loading the page?




2 Answers



Do you need the HTML tags as well? If not, you can retrieve just the visible text of the body element. Here is how I would do it in Java:


String src=driver.findElement(By.tagName("body")).getText();
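
For readers working in Python like the asker, the same check can be sketched without a browser. The hypothetical helper `body_text` below uses only the standard library's `html.parser` to pull the visible text out of a `page_source` string, so an empty result corresponds to the empty body described in the question. The helper name and the sample strings are illustrative assumptions, not part of the original code.

```python
from html.parser import HTMLParser


class _BodyText(HTMLParser):
    """Collects text found inside <body>, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._in_body = False
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that sits inside <body> and outside scripts.
        if self._in_body and not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def body_text(page_source):
    """Return the visible body text of an HTML string ('' if the body is empty)."""
    parser = _BodyText()
    parser.feed(page_source)
    parser.close()
    return " ".join(parser.chunks)


# Illustrative inputs (made up for this sketch, not real MarketWatch markup).
print(repr(body_text("<html><head></head><body></body></html>")))
print(repr(body_text("<html><body><h1>TNX</h1><p>2.95</p></body></html>")))
```

With a live driver, the closest counterpart of the Java line should be `driver.find_element(By.TAG_NAME, "body").text` in Selenium 4, or `driver.find_element_by_tag_name("body").text` in older releases such as the ones that still ship PhantomJS support.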



For the URL https://www.marketwatch.com/investing/index/tnx?countrycode=xx the behavior you are observing is expected.



I took your code and, with a small tweak, tried to extract the page_source with both PhantomJS and ChromeDriver. With either WebDriver variant, the WebDriver fingerprint is detected and a Fingerprinting error is raised, as follows:



Error details:


Failed to load resource: the server responded with a status of 404 (Not Found)
kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1 Fingerprinting error
name: Error
message: Error issuing AJAX request (status code: 404)
stack: Error: Error issuing AJAX request (status code: 404)
at XMLHttpRequest.N.a.onreadystatechange (https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/kpf.js?url=/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint&token=058cbc6a-f8b8-f175-ca68-8c2e0fd6a4e3:1:1884)
DevTools failed to parse SourceMap: https://www.marketwatch.com/149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint/script/fingerprint.js.map
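
To check for this condition from Python rather than by eye in DevTools, one option is to scan the browser console entries for the marker string. The sketch below assumes entries shaped like the dictionaries that Selenium's `driver.get_log("browser")` returns with ChromeDriver (each carrying a `message` key), which may require browser logging to be enabled in the driver options. The helper name `fingerprint_errors` is hypothetical, and the sample entries are abbreviated from the log above.

```python
def fingerprint_errors(entries, marker="Fingerprinting error"):
    """Return the messages of console entries that mention the marker."""
    return [
        entry.get("message", "")
        for entry in entries
        if marker in entry.get("message", "")
    ]


# Abbreviated stand-ins for console entries; a real run would pass
# the list obtained from the driver instead.
sample_entries = [
    {"level": "SEVERE",
     "message": "Failed to load resource: the server responded with a status of 404 (Not Found)"},
    {"level": "SEVERE",
     "message": "kpf.js Fingerprinting error"},
]

hits = fingerprint_errors(sample_entries)
if hits:
    print("Page appears to be blocked by fingerprinting:", len(hits), "entry")
```

A non-empty result suggests that the empty body comes from the site's bot detection rather than from a page that simply has not finished loading.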



DevTools snapshot: [screenshot of the console showing the Fingerprinting error]


References:

Browser Automation with Selenium: Fingerprints, recognizability and traceability?
Can a website detect when you are using selenium with chromedriver?
Selenium Webdriver is detectable





Thank you. How do I overcome this issue if I still need to access the data?
– JAGS8386
Aug 9 at 15:23





@JAGS8386 There are multiple ways. You can compile the WebDriver binary (i.e. the chromedriver binary) with a few tweaks, or use a proxy. I have updated the answer and added some more references.
– New contributor
Aug 9 at 15:25






@JAGS8386 Glad to help you out. If my answer has addressed your question, please accept it by clicking the hollow check mark beside the answer, just below the downvote arrow, so that the check mark turns green.
– New contributor
Aug 9 at 15:26





