Web scraping means extracting data of any kind from a website. Collecting that data by hand quickly becomes impractical, so we usually automate the task with a programming language such as Python.

In this example we will use Beautiful Soup, a Python library for parsing HTML, together with the requests library for downloading pages. So let's get started.

First, you need to install the required packages on your system. Copy the commands below and run them in your command prompt (or terminal). First install requests: pip install requests
Then install Beautiful Soup: pip install beautifulsoup4 (the package is imported in code as bs4)
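
To confirm that both installs worked, a quick check like the following can help. It is only a sketch: it imports the two packages and prints the version strings they report, so an ImportError here would mean one of the installs did not succeed.

```python
import bs4
import requests

# Both packages expose a __version__ string; printing it confirms the import works
print("requests", requests.__version__)
print("beautifulsoup4", bs4.__version__)
```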

Once the installation is complete, let's try an example:

import requests

# Download the page; r.content holds the raw response body as bytes
r = requests.get("http://example.com/")
c = r.content
print(c)
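
The request above assumes the download always succeeds. As a hedged sketch of a slightly more defensive version, the helper below adds a timeout and an HTTP status check. The name fetch_html is hypothetical (it is not part of requests), and the 10 second timeout is an arbitrary assumption.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text, raising on HTTP errors."""
    # timeout stops the call from hanging forever if the server never answers
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    # .text is the decoded string; .content would be the raw bytes
    return response.text
```

Calling fetch_html("http://example.com/") would return the page source as a string, which can be passed to BeautifulSoup in the same way as r.content.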

Printing r.content shows the HTML source of "example.com" as raw bytes, which is hard to read. To get a structured view of the page, parse it with BeautifulSoup as follows:

import requests
from bs4 import BeautifulSoup

r = requests.get("http://example.com/")
c = r.content
# Parse the raw HTML with Python's built-in html.parser
soup = BeautifulSoup(c, "html.parser")
print(soup)
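
If you want the parsed page printed with one tag per line and indentation, Beautiful Soup also offers prettify(). The sketch below uses a small made-up HTML string (SAMPLE_HTML is an assumption standing in for a downloaded page) so that it runs without a network connection.

```python
from bs4 import BeautifulSoup

# SAMPLE_HTML is a made-up stand-in for the content of a downloaded page
SAMPLE_HTML = (
    "<html><body><div><h1>Example Domain</h1>"
    "<p>Some text.</p></div></body></html>"
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# prettify() returns the document as a string with each tag on its own line
pretty = soup.prettify()
print(pretty)
```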

Now you can access the HTML tags of the page. For example, find_all("div") returns a list of every div tag it finds:

import requests
from bs4 import BeautifulSoup

r = requests.get("https://example.com/")
c = r.content
soup = BeautifulSoup(c, "html.parser")
# find_all returns a list containing every matching tag
s = soup.find_all("div")
print(s)

Your output should look something like this (the exact text served by example.com may differ):

[<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents. You may use this domain in examples without prior coordination or asking for permission.</p>
<p><a href="http://www.iana.org/domains/example">More information …</a></p>
</div>]
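
Once you have the tags, you will usually want the text and links inside them rather than the raw markup. Here is a minimal sketch of that step. It works on a shortened inline copy of the output above (HTML is an assumption, so no network is needed), and summarize_div is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Shortened inline copy of the div shown above, so the sketch runs offline
HTML = """<div>
<h1>Example Domain</h1>
<p>This domain is established to be used for illustrative examples in documents.</p>
<p><a href="http://www.iana.org/domains/example">More information</a></p>
</div>"""


def summarize_div(div):
    """Collect the heading text, paragraph texts, and link targets of one div."""
    heading = div.find("h1")
    return {
        # find() returns None when the tag is missing, so guard against that
        "title": heading.get_text(strip=True) if heading else None,
        "paragraphs": [p.get_text(strip=True) for p in div.find_all("p")],
        # href=True keeps only <a> tags that actually carry an href attribute
        "links": [a["href"] for a in div.find_all("a", href=True)],
    }


soup = BeautifulSoup(HTML, "html.parser")
results = [summarize_div(div) for div in soup.find_all("div")]
print(results)
```

The same function should work on the soup built from a live page, since find_all("div") returns the same kind of tag objects there.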

 
