python - How to extract the <span> tag contents using Beautiful Soup?

July 15, 2013

I'm trying to extract the content of a span tag from the Google Translate website. The translated result is in the element with id="result_box". When I try to print its contents, I get None.


    import requests
    from bs4 import BeautifulSoup

    # Download the page as served; no JavaScript is run at this point
    r = requests.get("https://translate.google.co.in/?rlz=1c1chzl_enin729in729&um=1&ie=utf-8&hl=en&client=tw-ob#en/fr/good%20morning")

    soup = BeautifulSoup(r.content, "lxml")
    # Look up the element that should hold the translation
    spanner = soup.find(id="result_box")

    result = spanner.text

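requests doesn't execute JavaScript, so the translation that the page fills in with a script never appears in the HTML it downloads. Use Selenium with PhantomJS for headless browsing instead: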

    from bs4 import BeautifulSoup
    from selenium import webdriver

    url = "https://translate.google.co.in/?rlz=1c1chzl_enin729in729&um=1&ie=utf-8&hl=en&client=tw-ob#en/fr/good%20morning"

    # PhantomJS support was removed in Selenium 4, so this call needs an older Selenium release
    browser = webdriver.PhantomJS()
    browser.get(url)
    # page_source holds the DOM after the page's scripts have run
    html = browser.page_source
    browser.quit()

    soup = BeautifulSoup(html, 'lxml')
    spanner = soup.find(id="result_box")
    result = spanner.text

This gives the expected result:

    >>> result
    'bonjour'
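
The None in the question is what find() returns when nothing in the parsed HTML matches. A minimal sketch of guarding against that case follows. The helper name text_by_id and the two HTML strings are made up for illustration (they stand in for a page before and after its scripts have run, not for Google Translate's real markup), and the built-in "html.parser" is used in place of lxml only so the snippet has no extra dependency:

```python
from bs4 import BeautifulSoup


def text_by_id(html, element_id):
    """Return the stripped text of the element with this id, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id=element_id)
    if element is None:
        # find() returns None when no element matches
        return None
    return element.get_text(strip=True)


# Hypothetical markup: the page as downloaded, and the same page after scripts ran
served_html = '<div id="app"></div>'
rendered_html = '<div id="app"><span id="result_box">bonjour</span></div>'

print(text_by_id(served_html, "result_box"))    # None
print(text_by_id(rendered_html, "result_box"))  # bonjour
```

Checking for None before reading .text makes it clear whether the element is missing from the HTML that was fetched or the lookup itself is wrong.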
