beautifulsoup

2

Solved

Creating an XML document with BeautifulSoup

In all the examples and tutorials I have seen of BeautifulSoup, an HTML/XML document is passed and a soup object is returned which can then be used to modify the document. However, how can I use Be...

python xml beautifulsoup

Calculation asked 30/4, 2013 at 17:18
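
For the question above, a minimal sketch of building a document from scratch: start from an empty soup and attach tags created with `new_tag`. The element names `catalog` and `book`, the `id` value, and the text are invented for illustration. `html.parser` is used here only to avoid an extra dependency; an XML-specific parser (which needs lxml installed) may suit real XML output better.

```python
from bs4 import BeautifulSoup

# Start from an empty document instead of parsing existing markup.
soup = BeautifulSoup("", "html.parser")

# Hypothetical element names, chosen only for this sketch.
root = soup.new_tag("catalog")
soup.append(root)

# Keyword arguments to new_tag become attributes on the new element.
book = soup.new_tag("book", id="bk101")
book.string = "Example title"
root.append(book)

xml_text = str(soup)
print(xml_text)
```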

4

Solved

Scraping Wikipedia tables with Python selectively

I am having trouble sorting a wiki table and hope someone who has done it before can give me advice. From the List_of_current_heads_of_state_and_government I need countries (works with the code below) ...

python-3.x web-scraping beautifulsoup wikipedia

Heulandite asked 15/5, 2018 at 16:56

4

Solved

Beautiful Soup - Get all text, but preserve link html?

I have to process a large archive of extremely messy HTML full of extraneous tables, spans and inline styles into markdown. I am trying to use Beautiful Soup to accomplish this task, and my goal i...

python html parsing beautifulsoup

Foliate asked 26/8, 2018 at 12:30

3

Web Scraping Risk Factors from 10-K EDGAR

Has anyone tried to extract the individual risk factors from the Risk Factors section (i.e. Item 1A) of a company's EDGAR 10-K filings using BeautifulSoup or any other web scrap...

html python-3.x regex web-scraping beautifulsoup

Philosophy asked 17/6, 2020 at 13:31

6

Solved

BeautifulSoup: just get inside of a tag, no matter how many enclosing tags there are

I'm trying to scrape all the inner html from the <p> elements in a web page using BeautifulSoup. There are internal tags, but I don't care, I just want to get the internal text. For example,...

python beautifulsoup

Penney asked 2/6, 2010 at 10:58
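
One possible approach to the question above, sketched under the assumption that only the visible text matters: `get_text()` returns all text nested inside a tag, however many inner tags there are. The sample markup is made up for this example.

```python
from bs4 import BeautifulSoup

# Sample markup invented for this sketch.
html = "<p>Red <b>green <i>blue</i></b> yellow</p><p>Plain</p>"
soup = BeautifulSoup(html, "html.parser")

# get_text() concatenates every text node below the tag, ignoring nested tags.
paragraph_texts = [p.get_text() for p in soup.find_all("p")]
print(paragraph_texts)
```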

2

Suppress warning of url in beautifulsoup

I'm using Beautiful Soup 4 to parse some html-formatted text, scraped from the Internet. Sometimes this text is simply the link to some website. A fact that BS4 is very cross about: UserWarning: "...

python beautifulsoup

Icosahedron asked 16/3, 2016 at 15:13

2

Solved

How to insert an element before the closing body tag using Beautiful Soup?

What is the most efficient way to insert an element as the last one in the <body> of an HTML page?

python beautifulsoup

Wallache asked 1/12, 2010 at 0:58
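
A short sketch for the question above: calling `append` on `soup.body` places a new element just before the closing `</body>` tag. The `div` with `id="footer"` is a hypothetical element used only for illustration, and no claim is made here about which approach is the most efficient.

```python
from bs4 import BeautifulSoup

# Sample page invented for this sketch.
html = "<html><body><p>First</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Build the new element, then append it as the last child of <body>.
footer = soup.new_tag("div", id="footer")
footer.string = "Added last"
soup.body.append(footer)

result = str(soup)
print(result)
```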

4

Solved

Parsing web page in python using Beautiful Soup

I am having trouble getting data from the website. The website source is here: view-source:http://release24.pl/wpis/23714/%22La+mer+a+boire%22+%282011%29+FRENCH.DVDRip.XviD-AYMO there...

python beautifulsoup urllib

Yogh asked 27/6, 2012 at 20:48

20

Solved

How to find elements by class

I'm having trouble parsing HTML elements with "class" attribute using Beautifulsoup. The code looks like this soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div[...

python html web-scraping beautifulsoup

Ouzel asked 18/2, 2011 at 11:58
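
A hedged sketch for the question above: Beautiful Soup 4 accepts a `class_` keyword in `find_all`, which avoids indexing the `class` attribute on every tag by hand. The class name `row` and the markup are assumptions, since the excerpt is cut off before the original class name appears.

```python
from bs4 import BeautifulSoup

# Sample markup invented for this sketch; "row" is a hypothetical class name.
html = """
<div class="row highlight">first</div>
<div class="row">second</div>
<div id="other">third</div>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ matches any tag whose class list contains the given value.
rows = soup.find_all("div", class_="row")
row_texts = [div.get_text() for div in rows]

# .get() returns None for tags without a class attribute instead of raising KeyError.
classes = [div.get("class") for div in soup.find_all("div")]
print(row_texts, classes)
```

Using `.get("class")` rather than `div["class"]` keeps a loop over all `div` elements from failing on tags that have no `class` attribute.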

12

Solved

UnicodeEncodeError: 'charmap' codec can't encode characters

I'm trying to scrape a website, but it gives me an error. I'm using the following code: import urllib.request from bs4 import BeautifulSoup get = urllib.request.urlopen("https://www.website.c...

python beautifulsoup file-io urllib

Trost asked 23/11, 2014 at 18:47

2

Solved

python/beautifulsoup to find all <a href> with specific anchor text

I am trying to use beautiful soup to parse html and find all href with a specific anchor tag <a href="http://example.com">TEXT</a> <a href="http://example.com/link"...

python html beautifulsoup filtering

Monolayer asked 5/11, 2012 at 21:30
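
A minimal sketch for the question above, assuming an exact match on the link text is wanted: pass `string=` to `find_all` and read each match's `href`. The second link's text (`OTHER`) is invented, because the excerpt is truncated at that point. Older Beautiful Soup 4 releases spell this argument `text=`.

```python
from bs4 import BeautifulSoup

# First link is from the excerpt; the second link's text is an assumption.
html = (
    '<a href="http://example.com">TEXT</a>'
    '<a href="http://example.com/link">OTHER</a>'
)
soup = BeautifulSoup(html, "html.parser")

# string= keeps only <a> tags whose text is exactly "TEXT".
matches = soup.find_all("a", string="TEXT")
hrefs = [a["href"] for a in matches]
print(hrefs)
```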

2

Solved

Find HTML attribute values using BeautifulSoup

I know what I'm trying to do is simple but it's causing me grief. I'd like to pull data from HTML using BeautifulSoup. To do that I need to properly use the .find() function. Here's the HTML I'm worki...

python html css beautifulsoup

Pigpen asked 16/12, 2015 at 0:09

16

Solved

Cannot import Beautiful Soup

I am trying to use BeautifulSoup, and despite using the import statement: from bs4 import BeautifulSoup I am getting the error: ImportError: cannot import name BeautifulSoup import bs4 does not...

python beautifulsoup

Hereditable asked 27/4, 2015 at 22:57

2

Solved

SEC EDGAR 13F source HTTPError: HTTP Error 403: Forbidden

Please help, SEC EDGAR used to work flawlessly until now. It gives HTTPError: HTTP Error 403: Forbidden import pandas as pd tables = pd.read_html("https://www.sec.gov/Archives/edgar/data/15416...

pandas dataframe parsing beautifulsoup http-status-code-403

Grantgranta asked 15/12, 2021 at 18:47

3

FutureWarning: Passing literal html to 'read_html' is deprecated and will be removed in a future version

I'm getting this error when wrapping a soup element in a str. I'm trying to parse a table with pandas. I'm getting the correct output but also this warning: "FutureWarning: Passing literal ht...

python pandas beautifulsoup

Avra asked 2/1, 2024 at 19:24

2

BeautifulSoup not working cannot import name 'BeautifulSoup' from partially initialized module 'bs4' [duplicate]

I keep getting an error when trying to import Beautiful Soup: from bs4 import BeautifulSoup ImportError: cannot import name 'BeautifulSoup' from partially initialized module 'bs4' (most likely ...

python beautifulsoup importerror

Teter asked 8/12, 2019 at 16:06

2

Beautifulsoup - How to open images and download them

I am looking to grab the full size product images from here. My thinking was: follow the image link, download the picture, go back, repeat for n+1 pictures. I know how to open the image thumbnails ...

python beautifulsoup

Heavyweight asked 28/8, 2013 at 20:43

5

Solved

urllib.request.urlopen(url) with Authentication

I've been playing with beautiful soup and parsing web pages for a few days. I have been using a line of code which has been my saviour in all the scripts that I write. The line of code is : r = r...

python python-3.x url beautifulsoup request

Welch asked 29/5, 2017 at 10:07

27

Solved

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org [duplicate]

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem: from urllib.request import urlopen from bs4 import BeautifulSoup import re pages = set...

python web-scraping beautifulsoup scrapy ssl-certificate

Popular asked 8/5, 2018 at 14:32

2

Solved

difference between lxml and html5lib in the context of beautifulsoup

Is there a difference between the capabilities of lxml and html5lib parsers in the context of beautifulsoup? I am trying to learn to use BS4 and using the following code construct -- ret = requests...

python beautifulsoup lxml html5lib

Saraband asked 3/9, 2013 at 0:44

1

Async HTML Parse with Beautifulsoup4 in Python

I'm making a Python web scraper script. I should do this using asyncio. So for async HTTP requests I use AioHTTP. It's OK, but when I'm trying to make a non-blocking app (await), the beautifulsoup4 w...

python asynchronous beautifulsoup

Dehumanize asked 4/7, 2019 at 7:32
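
The excerpt above is cut off, so the exact problem is not visible. As a sketch under the assumption that synchronous parsing is holding up the event loop, the parsing call can be handed to a worker thread with `asyncio.to_thread` (Python 3.9+). The helper `extract_title` and the sample pages are hypothetical, and the HTTP fetching is left out.

```python
import asyncio

from bs4 import BeautifulSoup


def extract_title(html):
    """Blocking parse step; hypothetical helper for this sketch."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string if soup.title else None


async def main():
    # Stand-ins for page bodies that would normally come from an async HTTP client.
    pages = [
        "<html><head><title>One</title></head></html>",
        "<html><head><title>Two</title></head></html>",
    ]
    # Each parse runs in a worker thread; gather returns results in input order.
    return await asyncio.gather(
        *(asyncio.to_thread(extract_title, page) for page in pages)
    )


titles = asyncio.run(main())
print(titles)
```

Parsing is CPU-bound, so threads mainly keep the event loop responsive rather than making the parsing itself faster.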

4

Scraping Bandcamp fan collections via POST

I've been trying to scrape Bandcamp fan pages to get a list of the albums they have purchased and I'm having trouble efficiently doing it. I wrote something with Selenium but it's mildly slow so I'...

python web-scraping beautifulsoup

Homegrown asked 18/10, 2020 at 21:36

9

Solved

How to handle IncompleteRead: in python

I am trying to fetch some data from a website. However, it returns an IncompleteRead error. The data I am trying to get is a huge set of nested links. I did some research online and found that this might...

python python-2.7 web-scraping beautifulsoup mechanize

Kessiah asked 21/1, 2013 at 15:45

4

Solved

Python correct encoding of Website (Beautiful Soup)

I am trying to load an HTML page and output the text. Even though I am getting the webpage correctly, BeautifulSoup somehow destroys the encoding. Source: # -*- coding: utf-8 -*- import requests...

python encoding utf-8 beautifulsoup mojibake

Ardath asked 25/4, 2016 at 6:25

22

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.FeatureNotFound: Could...

python python-2.7 beautifulsoup lxml

Sutra asked 25/6, 2014 at 0:12
