User:Francg/expub/thesis/prototype: Difference between revisions

From XPUB & Lens-Based wiki

Revision as of 00:18, 5 October 2017



Extracting URLs from any website

Run Python (I did this from a virtual environment) and enter the following:
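# Python 3; requires the requests and beautifulsoup4 packages
from bs4 import BeautifulSoup
import requests
# On Python 2, use raw_input() here instead of input()
url = input("Enter a website to extract the URLs from: ")
r = requests.get("http://" + url)
data = r.text
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(data, "html.parser")
# Print the href of every <a> tag (None if the tag has no href attribute)
for link in soup.find_all('a'):
    print(link.get('href'))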
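
The loop above prints the raw href values, which can be None (an <a> tag without an href), relative paths such as /about, or non-web links such as mailto: addresses. Below is a minimal sketch of one way to tidy them up, assuming the values have been collected into a list instead of printed. The function name absolute_links and the example.com addresses are made up for illustration and are not part of the original notes; it uses only the standard library and Python 3.

```python
from urllib.parse import urljoin, urlparse


def absolute_links(base_url, hrefs):
    """Resolve raw href values against base_url and drop unusable ones.

    hrefs is a list of strings (or None), e.g. the values returned by
    link.get('href') for each <a> tag on the page.
    """
    seen = set()
    result = []
    for href in hrefs:
        if not href:
            # None (tag had no href) or an empty string
            continue
        full = urljoin(base_url, href)  # turns relative paths into full URLs
        if urlparse(full).scheme not in ("http", "https"):
            # skip mailto:, javascript: and similar links
            continue
        if full not in seen:  # keep each URL once, in first-seen order
            seen.add(full)
            result.append(full)
    return result


if __name__ == "__main__":
    # Hypothetical href values, as they might come out of the loop above
    sample = ["/about", None, "contact.html", "mailto:me@example.com", "/about"]
    for link in absolute_links("http://example.com/index.html", sample):
        print(link)
```

With the script above, this could be called as absolute_links("http://" + url, [link.get('href') for link in soup.find_all('a')]).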