About the Role
Write and maintain data pipelines
Design and develop web crawling and scraping solutions with a focus on performance and accuracy
Build scalable tools that automate web crawling, scraping, and data aggregation to populate databases
Maintain the full-stack application that runs the web crawlers
Design and build web crawlers to scrape data and URLs.
2-5 years of relevant experience developing web scraping and crawling solutions in Python and setting up automated scheduling of the scraping spiders.
Knowledge of scraping frameworks and libraries such as Scrapy, Beautiful Soup, HTQL, jsoup, and Web-Harvest.
Strong knowledge of regular expressions, HTML, CSS selectors, the DOM, and XPath
Proficient in Git
Python tech stack (libraries: Requests, pandas, SciPy, scikit-learn)
Proficiency with cloud environments such as AWS or GCP, experience working with APIs in an API-driven environment, and experience deploying web crawlers.