- Browse through different sites and pick one to scrape. Check the "Project Ideas" section for inspiration.
- Identify the information you'd like to scrape from the site. Decide the format of the output CSV file.
- Summarize your project idea and outline your strategy in a Jupyter notebook. Use the "New" button above.
- I'm going to scrape the following category pages:
- Best Selling books: https://gufhtugu.com/product-category/best-selling/
- SciFi books: https://gufhtugu.com/product-category/scifi/
- Books under 750pkr: https://gufhtugu.com/product-category/books-under-750pkr/
- Mix books: https://gufhtugu.com/product-category/mix/
- Children books: https://gufhtugu.com/product-category/main-children-books/
- Islamic books: https://gufhtugu.com/product-category/islamic-books/
- Urdu Literature books: https://gufhtugu.com/product-category/urdu-literature-2/
- Urdu science books: https://gufhtugu.com/product-category/urdu-science/
- Zinda Kitaben books: https://gufhtugu.com/product-category/zinda-kitabein/
- Uncategorized books: https://gufhtugu.com/product-category/uncategorized/
- Technical books: https://gufhtugu.com/product-category/technical/
- Upcoming books: https://gufhtugu.com/product-category/upcoming-books/
- Booksets books: https://gufhtugu.com/product-category/book-sets/
- Classics books: https://gufhtugu.com/product-category/classics/
- English Literature books: https://gufhtugu.com/product-category/english-literature/
- Health books: https://gufhtugu.com/product-category/health/
- Gufhtugu Publications books: https://gufhtugu.com/product-category/gufhtugu-publications/
- I'll get a list of all books. For each book, I'll get the book title, the book's selling category, and the book URL.
- For each category, we'll get all of the books listed on its Gufhtugu Publications web page.
- For each category, we'll create a CSV file in the following format:
- book_title, book_selling_cat, book_url
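The outline above can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not the final implementation: the helper names (`category_url`, `BookLinkParser`, `parse_books`, `write_books_csv`) are made up for this sketch, and the markup it looks for (a link with class `woocommerce-LoopProduct-link` wrapping an `<h2>` title) is an assumption about the page structure that should be checked against the real page source. The download step is left out, so the sample HTML below is a hypothetical stand-in for a fetched category page.

```python
import csv
import io
from html.parser import HTMLParser

BASE_URL = "https://gufhtugu.com/product-category/"


def category_url(slug):
    """Build a category page URL from its slug, e.g. "scifi"."""
    return f"{BASE_URL}{slug}/"


class BookLinkParser(HTMLParser):
    """Collect (title, url) pairs from product links.

    ASSUMPTION: each book is an <a> tag whose class list contains
    "woocommerce-LoopProduct-link", wrapping an <h2> that holds the title.
    """

    def __init__(self):
        super().__init__()
        self.books = []            # collected (title, url) pairs
        self._current_url = None   # href of the product link we are inside
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "woocommerce-LoopProduct-link" in classes:
            self._current_url = attrs.get("href")
        elif tag == "h2" and self._current_url is not None:
            self._in_title = True
            self._title_parts = []

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_title:
            title = "".join(self._title_parts).strip()
            self.books.append((title, self._current_url))
            self._in_title = False
        elif tag == "a":
            self._current_url = None


def parse_books(html, category):
    """Return one dict per book found in the given category page HTML."""
    parser = BookLinkParser()
    parser.feed(html)
    parser.close()
    return [
        {"book_title": title, "book_selling_cat": category, "book_url": url}
        for title, url in parser.books
    ]


def write_books_csv(rows, file_obj):
    """Write rows to an open text file using the planned column order."""
    fieldnames = ["book_title", "book_selling_cat", "book_url"]
    writer = csv.DictWriter(file_obj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample markup standing in for a downloaded category page.
SAMPLE_HTML = """
<ul class="products">
  <li class="product">
    <a href="https://example.com/book-one/" class="woocommerce-LoopProduct-link">
      <h2 class="woocommerce-loop-product__title">Book One</h2>
    </a>
  </li>
  <li class="product">
    <a href="https://example.com/book-two/" class="woocommerce-LoopProduct-link">
      <h2 class="woocommerce-loop-product__title">Book Two</h2>
    </a>
  </li>
</ul>
"""

rows = parse_books(SAMPLE_HTML, "SciFi")
buffer = io.StringIO()  # an in-memory file; a real run would open a .csv file
write_books_csv(rows, buffer)
print(buffer.getvalue())
```

In a real run, the HTML for each category would come from downloading `category_url(slug)`, and `write_books_csv` would be called with a file opened via `open(path, "w", newline="", encoding="utf-8")`. A library such as BeautifulSoup would likely make the parsing step shorter; the standard-library parser is used here only to keep the sketch self-contained.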