Web scraping using BeautifulSoup

Hi friends, today I started using BeautifulSoup to try to scrape a website.

To install BeautifulSoup on Fedora, use the command:

yum -y install python-BeautifulSoup.noarch

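Use the following code to get the contents of a site:
import urllib2
from BeautifulSoup import BeautifulSoup
# urlopen needs the full URL, including the http:// scheme
url = 'http://manualian.blogspot.com'
# open the page; source is a file-like response object
source = urllib2.urlopen(url)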

This code opens the site and gives back a file-like object. Calling source.read() then returns the content that was fetched, as one string.
urllib2 is a library in the Python 2 standard library for opening URLs.
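
Here is a small sketch of the same steps wrapped in two helper functions, in case it is useful. The names with_scheme and fetch are my own for illustration and are not part of urllib2 or BeautifulSoup. I am assuming that a bare host name should default to http, and I added a fallback import so the sketch also runs on Python 3, where urllib2 became urllib.request.

```python
try:
    from urllib2 import urlopen          # Python 2, as used in this post
except ImportError:
    from urllib.request import urlopen   # assumption: Python 3 fallback


def with_scheme(url, default='http'):
    """Hypothetical helper: prefix a scheme if the URL has none."""
    if '://' in url:
        return url
    return default + '://' + url


def fetch(url):
    """Hypothetical helper: open the URL, read the body once, then close."""
    source = urlopen(with_scheme(url))
    try:
        # read() returns the whole body; on Python 3 this is bytes
        return source.read()
    finally:
        source.close()


print(with_scheme('manualian.blogspot.com'))  # http://manualian.blogspot.com
```

With these in place, something like html = fetch('manualian.blogspot.com') should return the page source, which can then be handed to BeautifulSoup for parsing.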
