Course curriculum
-
1
Introduction
- Why Data Scraping
- Applications of Data Scraping
- Introduction of Instructor
- Introduction to Course, Scraping, Tools
- Projects Overview
- GitHub and OneDrive Links to get the Course Materials
-
2
Requests
- Introduction to Python Requests
- Hands on with Requests
- Extracting Quotes Manually
- Quiz(Extracting Authors)
- Solution(Extracting Authors)
- Pagination
- Quiz(Extracting Author and Quotes)
- Solution 01(Extracting Author and Quotes)
- Solution 02(Extracting Author and Quotes)
- Ajax Requests
- Ajax Requests for Cricinfo
- Ajax Requests Pagination
- Quiz(Extracting Top Stats from Cricinfo)
- Solution 01(Extracting Top Stats from Cricinfo)
- Solution 02(Extracting Top Stats from Cricinfo)
-
3
Beautiful Soup 4 (BS4)
- Introduction to BS4
- Quiz(Difference between Requests and BS4)
- Solution(Difference between Requests and BS4)
- Hands on with BS4
- Extracting Data from Tree
- Extracting Quotes from the Website
- Quiz(Extracting Author Names)
- Solution(Extracting Author Names)
- Attributes of Tags in BS4
- Multi Valued Attributes of Tags in BS4
- Scraping Movie Names from IMDB
- Quiz(Getting the Ratings, Year, Name of the Movie)
- Solution 01(Getting the Ratings, Year, Name of the Movie)
- Solution 02(Getting the Ratings, Year, Name of the Movie)
- Scraping Time, Genre and Release Date from IMDB 01
- Scraping Time, Genre and Release Date from IMDB 02
- Combining Two Requests Data for IMDB
- Movies Recommender System (Creating Movie Url)
- Movies Recommender System (Creating Director Url)
- Movies Recommender System using BS4 (Getting Top 4 Movies)
- Movies Recommender System using BS4 (Merge All Requests Together)
-
4
CSS Selectors
- Introduction to CSS Selectors
- CSS Selectors Hands-on(Tags)
- Quiz(Tags)
- Solution(Tags)
- CSS Selectors Hands-on(Descendants, Id, Class)
- Quiz(Descendants)
- Solution(Descendants)
- Quiz(ID)
- Solution(ID)
- Quiz(Class)
- Solution(Class)
- CSS Selectors Hands-on(Nested Tags, ID Tags, Class Tags)
- Quiz(Class with Tag)
- Solution(Class with Tag)
- CSS Selectors Hands-on(Comma Separator, Universal Selectors)
- Quiz(Combining Two Selectors)
- Solution(Combining Two Selectors)
- CSS Selectors Hands-on(Sibling Notations and Direct Child)
- Quiz(Adjacent Sibling)
- Solution(Adjacent Sibling)
- Quiz(General Sibling)
- Solution(General Sibling)
- CSS Selectors Hands-on(Child Selectors)
- Quiz(First Child)
- Solution(First Child)
- Quiz(Only Child)
- Solution(Only Child)
- Quiz(Last Child)
- Solution(Last Child)
- CSS Selectors Hands-on(Negations, Attributes)
- Quiz(Negation)
- Solution(Negation)
- CSS Selectors Hands-on (Attributes, Attribute Values)
- Quiz(Attributes Values)
- Solution(Attributes Values)
- CSS Selectors Hands-on (Attribute Wild Card Values)
- Quiz(Attributes Wild Card)
- Solution(Attributes Wild Card)
-
5
Scrapy
- Introduction to Scrapy
- Comparison of Scrapy and Requests
- Scrapy at a glance documentation
- Getting Started with Scrapy
- Running Documentation Spider 1
- Running Documentation Spider 2
- Writing a Spider from Scratch
- Understanding the Response(url, status)
- Understanding the Response(headers)
- Understanding the Response(values in headers)
- Understanding the Response(body)
- Understanding the Response(request)
- Understanding the Response(meta)
- Understanding the Response(flags, certificate, ip_address, copy)
- Understanding the Response(replace, urljoin, follow, follow_all)
- Response CSS and Scrapy Shell
- Extracting quotes
- Understanding nested selectors
- Extracting the Author and Quotes
- Checking for next page
- Checking for next page in Spider
- Checking for next page URL
- Scraping quotes from next pages
- Exporting extracted data
- Quiz(Get The Tags)
- Solution(Get The Tags)
- Next Website
- CSS Selectors for Movie Names and URLs
- Combined CSS Selectors for Movie Names and URLs
- Sending Request to the Film Info Page
- Merge Data from Two Callbacks
- Extracting Movie Duration and Genres
- Exporting the Extracted Data
- Quiz(Extracting the Year)
- Solution(Extracting the Year)
- Getting Director Name and Url
- Getting Top Four Movies of Directors
- Extracting Data
- Extracting Data Anomaly (CSS Selector)
- Extracting Data Anomaly (dont_filter Flag)
-
6
Scrapy Project
- Hugoboss Website for Scraping
- Understanding Site Structure
- Writing CSS Selectors for Listings
- Listings in Scrapy Shell
- Sending Request to Listings Urls
- Extracting Product Urls from the Listings
- Sending Requests to Products of the Listings
- Writing CSS for getting the product Info
- Getting the bigger images of the product
- Checking Next Page Url
- Adding Pagination to Spider and Running it
- Output of the Spider
-
7
Selenium
- Introduction To Selenium
- Getting Started with Selenium
- Configuring the Webdriver
- Extracting Quotes
- Extracting Quotes and Author Names
- Quiz(Extracting Quotes)
- Solution(Extracting Quotes)
- Clicking on Button
- Pagination and Extracting Data
- Exception Handling for Unavailable Element
- Navigating the Website for Login
- Quiz(Log in and Extract Quote)
- Solution(Log in and Extract Quote)
-
8
Project Selenium
- Overview of Project
- Closing the Cookie Button
- Setting the Language for Translation
- Sending the Text for Translation
- Downloading the Translation
- Reading Data from File for Translation