beautifulsoup Questions

2

Solved

In all the examples and tutorials I have seen of BeautifulSoup, an HTML/XML document is passed and a soup object is returned which can then be used to modify the document. However, how can I use Be...
Calculation asked 30/4, 2013 at 17:18

4

Solved

I'm having trouble sorting a wiki table and hope someone who has done it before can give me advice. From the List_of_current_heads_of_state_and_government I need countries (works with the code below) ...
Heulandite asked 15/5, 2018 at 16:56

4

Solved

I have to process a large archive of extremely messy HTML full of extraneous tables, spans and inline styles into markdown. I am trying to use Beautiful Soup to accomplish this task, and my goal i...
Foliate asked 26/8, 2018 at 12:30

3

Has anyone tried to extract the individual risk factors from the Risk Factors section (i.e. Item 1A) of a company's EDGAR 10-K filings using BeautifulSoup or any other web scrap...
Philosophy asked 17/6, 2020 at 13:31

6

Solved

I'm trying to scrape all the inner html from the <p> elements in a web page using BeautifulSoup. There are internal tags, but I don't care, I just want to get the internal text. For example,...
Penney asked 2/6, 2010 at 10:58
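
For reference, a minimal sketch of one way to do this with Beautiful Soup 4 and Python's built-in `html.parser`. The sample HTML is invented for illustration, since the question's own example is cut off; `get_text()` returns the text of a tag with any nested tags flattened away.

```python
from bs4 import BeautifulSoup

# Hypothetical input: <p> elements that contain nested inline tags.
html = "<p>Red <b>green</b> blue</p><p>Plain text</p>"

soup = BeautifulSoup(html, "html.parser")

# get_text() concatenates all text inside each <p>, ignoring inner markup.
paragraph_texts = [p.get_text() for p in soup.find_all("p")]

print(paragraph_texts)  # ['Red green blue', 'Plain text']
```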

2

I'm using Beautiful Soup 4 to parse some html-formatted text, scraped from the Internet. Sometimes this text is simply the link to some website. A fact that BS4 is very cross about: UserWarning: "...
Icosahedron asked 16/3, 2016 at 15:13

2

Solved

What is the most efficient way to insert an element as the last one in the <body> of an HTML page?
Wallache asked 1/12, 2010 at 0:58
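
For reference, a minimal sketch of one possible approach in Beautiful Soup 4, not necessarily the most efficient one. The helper name `append_to_body` and the sample markup are assumptions made for illustration: `soup.new_tag()` creates the element and `Tag.append()` adds it as the last child of `<body>`.

```python
from bs4 import BeautifulSoup


def append_to_body(html: str, tag_name: str, text: str) -> str:
    """Return html with a new element appended as the last child of <body>."""
    soup = BeautifulSoup(html, "html.parser")
    if soup.body is None:
        raise ValueError("document has no <body> element")
    new_tag = soup.new_tag(tag_name)  # create the element inside this soup
    new_tag.string = text             # set its text content
    soup.body.append(new_tag)         # append() places it after existing children
    return str(soup)


# Hypothetical document used only to show the effect.
result = append_to_body("<html><body><p>first</p></body></html>", "div", "last")
print(result)
```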

4

Solved

I'm having some trouble getting data from the website. The website source is here: view-source:http://release24.pl/wpis/23714/%22La+mer+a+boire%22+%282011%29+FRENCH.DVDRip.XviD-AYMO there...
Yogh asked 27/6, 2012 at 20:48

20

Solved

I'm having trouble parsing HTML elements with "class" attribute using Beautifulsoup. The code looks like this soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div[...
Ouzel asked 18/2, 2011 at 11:58
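
The excerpt is cut off, so the exact failure is not shown. A common stumbling block with this pattern is that indexing `div["class"]` raises `KeyError` for any `<div>` that has no `class` attribute. The sketch below, using invented markup and a hypothetical class name `row`, shows two ways around that in Beautiful Soup 4: filtering with `class_` and reading the attribute with `Tag.get()`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <div> has no class attribute at all.
html = """
<div class="row highlight">A</div>
<div>B</div>
<div class="row">C</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any tag whose class list contains "row";
# tags without a class attribute are simply skipped.
rows = soup.find_all("div", class_="row")
row_texts = [div.get_text() for div in rows]

# Tag.get() returns None for a missing attribute instead of raising KeyError.
classes = [div.get("class") for div in soup.find_all("div")]

print(row_texts)  # ['A', 'C']
print(classes)    # [['row', 'highlight'], None, ['row']]
```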

12

Solved

I'm trying to scrape a website, but it gives me an error. I'm using the following code: import urllib.request from bs4 import BeautifulSoup get = urllib.request.urlopen("https://www.website.c...
Trost asked 23/11, 2014 at 18:47

2

Solved

I am trying to use beautiful soup to parse html and find all href with a specific anchor tag <a href="http://example.com">TEXT</a> <a href="http://example.com/link"...
Monolayer asked 5/11, 2012 at 21:30

2

Solved

I know what I'm trying to do is simple but it's causing me grief. I'd like to pull data from HTML using BeautifulSoup. To do that I need to properly use the .find() function. Here's the HTML I'm worki...
Pigpen asked 16/12, 2015 at 0:09

16

Solved

I am trying to use BeautifulSoup, and despite using the import statement: from bs4 import BeautifulSoup I am getting the error: ImportError: cannot import name BeautifulSoup import bs4 does not...
Hereditable asked 27/4, 2015 at 22:57

2

Solved

Please help. SEC EDGAR used to work flawlessly until now. It gives HTTPError: HTTP Error 403: Forbidden import pandas as pd tables = pd.read_html("https://www.sec.gov/Archives/edgar/data/15416...
Grantgranta asked 15/12, 2021 at 18:47

3

I'm getting this error when wrapping a soup element in a str. I'm trying to parse a table with pandas. I'm getting the correct output but also this warning: "FutureWarning: Passing literal ht...
Avra asked 2/1, 2024 at 19:24

2

I keep getting an error when trying to import Beautiful Soup: from bs4 import BeautifulSoup ImportError: cannot import name 'BeautifulSoup' from partially initialized module 'bs4' (most likely ...
Teter asked 8/12, 2019 at 16:06

2

I am looking to grab the full size product images from here. My thinking was: follow the image link, download the picture, go back, repeat for n+1 pictures. I know how to open the image thumbnails ...
Heavyweight asked 28/8, 2013 at 20:43

5

Solved

I've been playing with beautiful soup and parsing web pages for a few days. I have been using a line of code which has been my saviour in all the scripts that I write. The line of code is: r = r...
Welch asked 29/5, 2017 at 10:07

27

Solved

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem: from urllib.request import urlopen from bs4 import BeautifulSoup import re pages = set...
Popular asked 8/5, 2018 at 14:32

2

Solved

Is there a difference between the capabilities of lxml and html5lib parsers in the context of beautifulsoup? I am trying to learn to use BS4 and using the following code construct -- ret = requests...
Saraband asked 3/9, 2013 at 0:44

1

I'm making a Python web scraper script. I should do this using asyncio, so for async HTTP requests I use AioHTTP. That works, but when I'm trying to make a non-blocking app (await), the beautifulsoup4 w...
Dehumanize asked 4/7, 2019 at 7:32

4

I've been trying to scrape Bandcamp fan pages to get a list of the albums they have purchased and I'm having trouble doing it efficiently. I wrote something with Selenium but it's mildly slow so I'...
Homegrown asked 18/10, 2020 at 21:36

9

Solved

I am trying to fetch some data from a website. However, it returns an incomplete read. The data I am trying to get is a huge set of nested links. I did some research online and found that this might...
Kessiah asked 21/1, 2013 at 15:45

4

Solved

I am trying to load an HTML page and output the text. Even though I am getting the webpage correctly, BeautifulSoup somehow destroys the encoding. Source: # -*- coding: utf-8 -*- import requests...
Ardath asked 25/4, 2016 at 6:25

22

... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.FeatureNotFound: Could...
Sutra asked 25/6, 2014 at 0:12
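
`bs4.FeatureNotFound` is raised when the requested parser (here `"lxml"`) is not installed in the running Python environment. Installing lxml is the usual remedy; as an alternative, a script can fall back to the standard library parser. The sketch below is one way to do that, and the helper name `make_soup` is an assumption made for illustration.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(html: str) -> BeautifulSoup:
    """Parse with lxml when it is installed, otherwise use the built-in parser."""
    try:
        return BeautifulSoup(html, "lxml")
    except FeatureNotFound:
        # lxml is missing; html.parser ships with Python itself.
        return BeautifulSoup(html, "html.parser")


soup = make_soup("<p>hello</p>")
print(soup.p.get_text())  # hello
```

The two parsers can build slightly different trees for malformed markup, so results are not guaranteed to be identical across the fallback.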
