Extracting href from beautifulsoup: why None?

Clash Royale CLAN TAG#URR8PPP
Extracting href from beautifulsoup: why None?
I'm parsing a site using BeautifulSoup4. The code:
for link in soup.find_all("div", "class": "fl nav_left_2j"):
for item in link.find_all("li"):
print(item)
Gets me:
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1.htm" title="点击查看 北京市 更多领导信息">北京市</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince18.htm" title="点击查看 天津市 更多领导信息">天津市</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince35.htm" title="点击查看 河北省 更多领导信息">河北省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince219.htm" title="点击查看 山西省 更多领导信息">山西省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince351.htm" title="点击查看 内蒙古自治区 更多领导信息">内蒙古自治区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince465.htm" title="点击查看 辽宁省 更多领导信息">辽宁省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince580.htm" title="点击查看 吉林省 更多领导信息">吉林省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince650.htm" title="点击查看 黑龙江省 更多领导信息">黑龙江省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince792.htm" title="点击查看 上海市 更多领导信息">上海市</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince810.htm" title="点击查看 江苏省 更多领导信息">江苏省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince926.htm" title="点击查看 浙江省 更多领导信息">浙江省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1028.htm" title="点击查看 安徽省 更多领导信息">安徽省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1150.htm" title="点击查看 福建省 更多领导信息">福建省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1245.htm" title="点击查看 江西省 更多领导信息">江西省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1357.htm" title="点击查看 山东省 更多领导信息">山东省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1515.htm" title="点击查看 河南省 更多领导信息">河南省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1692.htm" title="点击查看 湖北省 更多领导信息">湖北省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1809.htm" title="点击查看 湖南省 更多领导信息">湖南省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince1946.htm" title="点击查看 广东省 更多领导信息">广东省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2089.htm" title="点击查看 广西壮族自治区 更多领导信息">广西壮族自治区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2213.htm" title="点击查看 海南省 更多领导信息">海南省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2240.htm" title="点击查看 重庆市 更多领导信息">重庆市</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2279.htm" title="点击查看 四川省 更多领导信息">四川省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2482.htm" title="点击查看 贵州省 更多领导信息">贵州省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2580.htm" title="点击查看 云南省 更多领导信息">云南省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2726.htm" title="点击查看 西藏自治区 更多领导信息">西藏自治区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2807.htm" title="点击查看 陕西省 更多领导信息">陕西省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince2925.htm" title="点击查看 甘肃省 更多领导信息">甘肃省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince3026.htm" title="点击查看 青海省 更多领导信息">青海省</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince3078.htm" title="点击查看 宁夏回族自治区 更多领导信息">宁夏回族自治区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince3106.htm" title="点击查看 新疆维吾尔自治区 更多领导信息">新疆维吾尔自治区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince3220.htm" title="点击查看 香港特别行政区 更多领导信息">香港特别行政区</a></li>
*****
<li><a href="/web/20171221213907/http://ldzl.people.com.cn:80/dfzlk/front/personProvince3221.htm" title="点击查看 澳门特别行政区 更多领导信息">澳门特别行政区</a></li>
*****
<li><a>台湾省</a></li>
*****
Which is great!
Now I want to extract the href (and eventually the title, but I'm assuming if I solve for one I solve for both). According to the docs, I should be able to add:
print(item.get('href'))
and get all of the links. Unfortunately, all I get is "None". But there are hrefs there! I can see them! I'd prefer not to have to build a bunch of regexes if there is built in functionality to do this correctly. What mindbogglingly obvious thing am I doing wrong?
thanks!
item[0].get('href')
<li>
href
3 Answers
3
In item there is <li> stored. You have to extract <a> first. You can do it by sth like: item.find('a').get('href').
item
<li>
<a>
item.find('a').get('href')
Thank you! I now feel less crazy.
– mweinberg
Aug 10 at 14:28
for link in item.find_all('a'):
print(link.get('href'))
try this because getting the item from find_all will only return data with tags and moreover for extracting links, one needs to find <a> tag directly in div and extract href
<a>
div
href
for item in link.find_all("li"):
print(item.a["href"])
This throws error when
href is not present into a tag– utks009
Aug 10 at 14:15
href
Yes that's one of the very rarest case because as far as I know it is not even safe to use
anchor tag (<a>) without href. So mostly href exists, may be empty but it would be there.– Upasana Mittal
Aug 10 at 14:26
anchor tag
<a>
href
href
Actually, an
<a> without href is fairly common to define link targets: <a name="..."/>.– Robert
Aug 10 at 16:35
<a>
href
<a name="..."/>
Please provide more details on why this code works. What is the rationale of your approach? Why does it work where the OP's approach failed?
– Greenstick
Aug 10 at 17:26
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Try
item[0].get('href')(<li>themselves have nohrefin your data).– Kirill Bulygin
Aug 10 at 14:05