DEV Community

Gcobani Mkontwana
Gcobani Mkontwana

Posted on

Python scripts fails to copy the website and translating text into Hindi on my Terminal

Hi team

I need some help with my script, the script does two things, first scraps the webiste. In this instance i am using selenium to scrap the site and this works. But when i tried to translate what the content of the site(title course and name) From english to Hindi using data frame its failes. See below exceptions and stuck since this morning;

                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

File "C:\Users\Zux\PycharmProjects\ClassCentral\ClassCentral\Lib\site-packages\googletrans\client.py", line 182, in translate
data = self._translate(text, dest, src, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Zux\PycharmProjects\ClassCentral\ClassCentral\Lib\site-packages\googletrans\client.py", line 78, in _translate
token = self.token_acquirer.do(text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Zux\PycharmProjects\ClassCentral\ClassCentral\Lib\site-packages\googletrans\gtoken.py", line 194, in do
self._update()
File "C:\Users\Zux\PycharmProjects\ClassCentral\ClassCentral\Lib\site-packages\googletrans\gtoken.py", line 62, in _update
code = self.RE_TKK.search(r.text).group(1).replace('var ', '')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'group'

// code to implement scraping the website using selenium webdrive and pandas.
`from googletrans import Translator
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
import pandas as pd
import urllib.request as urllib2
import time

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument("window-size=1920,1080")
driver = webdriver.Chrome(options=chrome_options)
url = "https://www.classcentral.com/collection/top-free-online-courses"
driver.get(url)

translator = Translator()

try:
while True:
# wait until button is clickable
WebDriverWait(driver, 1).until(
expected_conditions.element_to_be_clickable((By.XPATH, "//button[@data-name='LOAD_MORE']"))
).click()
time.sleep(0.5)
except Exception as e:
pass

all_courses = driver.find_element(by=By.CLASS_NAME, value='catalog-grid__results')
courses = all_courses.find_elements(by=By.CSS_SELECTOR, value='[class="color-charcoal course-name"]')

df = pd.DataFrame([[course.text, course.get_attribute('href')] for course in courses],
columns=['Title (eng)', 'Link'])

df['Title (hin)'] = df['Title (eng)'].apply(lambda x: translator.translate(x, dest='hi').text)

print(df)`

I ❤️ building dashboards for my customers

I ❤️ building dashboards for my customers

Said nobody, ever. Embeddable's dashboard toolkit is built to save dev time. It loads fast, looks native and doesn't suck like an embedded BI tool.

Get early access

Top comments (0)

Heroku

Tired of jumping between terminals, dashboards, and code?

Check out this demo showcasing how tools like Cursor can connect to Heroku through the MCP, letting you trigger actions like deployments, scaling, or provisioning—all without leaving your editor.

Learn More

👋 Kindness is contagious

Explore this insightful write-up embraced by the inclusive DEV Community. Tech enthusiasts of all skill levels can contribute insights and expand our shared knowledge.

Spreading a simple "thank you" uplifts creators—let them know your thoughts in the discussion below!

At DEV, collaborative learning fuels growth and forges stronger connections. If this piece resonated with you, a brief note of thanks goes a long way.

Okay