User:Francg/expub/thesis/prototype: Difference between revisions
Revision as of 00:52, 5 October 2017
Prototype
Extracting data (in this case I scrape URLs / web links only) from: https://www.reddit.com/
Run Python (I did it from a virtual environment):
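from bs4 import BeautifulSoup
import requests
# Ask for the path to append to the base URL (Python 2; use input() on Python 3)
url = raw_input("https://www.reddit.com/: ")
r = requests.get("https://www.reddit.com/" + url)
data = r.text
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(data, "html.parser")
# Print the href attribute of every <a> tag on the page
for link in soup.find_all('a'):
    print(link.get('href'))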
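The loop above prints every href as it appears in the page, so the output can include None (anchors without an href) and relative paths such as /r/... A minimal sketch of one way to tidy that up follows. It is an assumption, not part of the prototype: it is written for Python 3, it swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages, and the names LinkCollector, extract_links and BASE_URL are hypothetical. It works on an HTML string, so the text returned by requests could be passed in the same way as the sample.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed base URL, taken from the prototype above.
BASE_URL = "https://www.reddit.com/"


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors that have no href attribute
            # Turn relative paths into absolute URLs; absolute ones pass through.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url=BASE_URL):
    """Return a list of absolute link URLs found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Small hand-written sample instead of a live request.
sample = (
    '<a href="/r/python/">Python</a> '
    '<a name="top">no href</a> '
    '<a href="https://example.com/x">external</a>'
)
for link in extract_links(sample):
    print(link)
```

Keeping the parsing in a function that takes a string makes it possible to try the link extraction on saved or hand-written HTML without contacting the site each time.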