User:Francg/expub/thesis/prototype: Difference between revisions

From XPUB & Lens-Based wiki

Revision as of 12:54, 5 October 2017


Prototype

Extracting data (in this case I scrape only URLs / web links) from: https://www.reddit.com/



Run Python (I ran it from a virtual environment):
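from bs4 import BeautifulSoup
import requests

# Ask for the part of the address that follows the base URL
# (Python 3: input() replaces Python 2's raw_input())
url = input("https://www.reddit.com/: ")
r = requests.get("https://www.reddit.com/" + url)
data = r.text
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(data, "html.parser")
# Print the href attribute of every <a> tag (None if the tag has no href)
for link in soup.find_all('a'):
    print(link.get('href'))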
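
The same link-extraction idea can be tried offline, without contacting Reddit. The sketch below is only an illustration and swaps BeautifulSoup for the standard library's html.parser so that it needs nothing installed; the names LinkCollector, extract_links and sample_html are hypothetical and are not part of the prototype above. It also skips anchors that have no href and turns relative links into full URLs, which the prototype does not do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors that have no href attribute
            # Resolve relative links such as "/r/python/" against the base URL
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample markup, not real Reddit output
sample_html = """
<a href="/r/python/">Python</a>
<a name="top">no href here</a>
<a href="https://example.com/page">External</a>
"""

for link in extract_links(sample_html, "https://www.reddit.com/"):
    print(link)
```

With the sample markup above, the loop should print the two links that carry an href, each as a full URL. To use it on a live page, the text returned by requests.get(...).text could be passed in place of sample_html.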


Image: Bs4-test-reddit1