Start here

At getData, you can learn the Python you need to jump-start your research project: writing your first crawler, scraping data from web pages, extracting data from complicated files, and beyond.

Here is a roadmap of topics, ordered to follow the steps of a research project.

  1. Prepare
    1. Getting started
      1. Crawler, scraper, parser – what’s the difference?
    2. What data do I need?
    3. Identifying data sources
    4. Creating a data dictionary
    5. Time planning
  2. Crawling data
    1. Crawling 101: Building your first data crawler with Python
    2. Crawling 102: Crawling data from APIs
    3. Crawling 103: Crawling dynamic web data with Selenium
    4. Advanced topics in crawling
      1. Parallelization
      2. Proxies
      3. Scheduling
    5. Quick tips: crawling
      1. Friendly crawling
      2. Crawling graphs
      3. How to avoid getting blocked
  3. Parsing data
    1. Parsing 101: Extracting data from web pages with Beautiful Soup
  4. Cleaning data
    1. Quick tips: integrating
      1. Combining CSV files