Python Developer Needed: Advanced Web Scraping (Directory + External Email Discovery)

Remote Full-time
We are looking for an experienced Python developer (or web automation expert) to build a scraper for a public speaker directory.

The Goal: We need to extract approximately 10,000 profiles into a clean CSV/Google Sheets database.

The Challenge (3-Step Logic): The email addresses are NOT listed on the directory itself. The script must perform a "Deep Scrape":

Step 1 (Directory): Scrape each profile on the main platform (our website) to get: Name, Topics, Location, Profile URL, and the link to the speaker's personal website.
Step 2 (External Enrichment): The script must visit the personal website of each speaker.
Step 3 (Email Extraction): On the personal website, the script must crawl for the email address.

Note: The websites are in German. The script needs to look for keywords like "Impressum" (Legal Notice), "Kontakt" (Contact), or "Datenschutzerklärung" (Privacy Policy) to find the page where the email is listed. It needs to handle simple regex extraction and common obfuscations (e.g., info [at] domain).

Deliverables:

The Dataset: A CSV/Google Sheet containing:
- Name
- Topics
- City/Country
- Personal Website URL
- Extracted Email Address (if found)

The Source Code: A well-documented Python script (e.g., Scrapy, Selenium, Playwright) so we can run it again in the future.

Requirements:
- Proven experience with Python (Scrapy/BeautifulSoup) or headless browsers (Selenium/Playwright).
- Experience in scraping data from multiple different domain structures (since every personal website looks different).
- Ability to handle potential anti-bot measures (IP rotation/delays) to scrape respectfully and avoid blocking.
- Bonus: Experience with German websites (understanding the structure of "Impressum" pages).
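
For orientation, the external enrichment and email extraction steps described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a required design. The function names (`find_contact_pages`, `deobfuscate`, `extract_emails`), the keyword priority order, and the set of obfuscation patterns are assumptions made for the example. Fetching pages, delays, IP rotation, and the directory scrape itself are left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Keywords from the posting. The priority order is an assumption:
# "Impressum" pages are tried first, then "Kontakt", then "Datenschutz...".
KEYWORDS = ("impressum", "kontakt", "datenschutz")

# Bracketed obfuscations such as "[at]", "(at)", "{at}", "[dot]", "(punkt)".
# This list is illustrative, not exhaustive.
AT_PATTERN = re.compile(r"\s*[\[({]\s*(?:at|ät)\s*[\])}]\s*", re.IGNORECASE)
DOT_PATTERN = re.compile(r"\s*[\[({]\s*(?:dot|punkt)\s*[\])}]\s*", re.IGNORECASE)
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_contact_pages(html, base_url):
    """Return absolute URLs of links whose href or text contains a keyword,
    ordered by keyword priority and de-duplicated."""
    parser = LinkCollector()
    parser.feed(html)
    ranked = []
    for href, text in parser.links:
        haystack = f"{href} {text}".lower()
        for priority, keyword in enumerate(KEYWORDS):
            if keyword in haystack:
                ranked.append((priority, urljoin(base_url, href)))
                break
    ranked.sort(key=lambda item: item[0])  # stable: page order kept within a priority
    urls = []
    for _, url in ranked:
        if url not in urls:
            urls.append(url)
    return urls


def deobfuscate(text):
    """Rewrite bracketed 'at' / 'dot' markers into '@' and '.'."""
    return DOT_PATTERN.sub(".", AT_PATTERN.sub("@", text))


def extract_emails(text):
    """Return unique, lowercased email addresses in order of appearance."""
    emails = []
    for match in EMAIL_PATTERN.findall(deobfuscate(text)):
        email = match.lower()
        if email not in emails:
            emails.append(email)
    return emails
```

In a full script, `find_contact_pages` would be called on the speaker's home page, each returned URL fetched in turn, and `extract_emails` applied to the page source until an address is found. Running the extraction on raw HTML rather than visible text also picks up addresses inside `mailto:` links.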