error in quick scraper file
Closed this issue · 4 comments
Traceback (most recent call last):
  File "/Users/SiddharthRaja/Documents/Moneycontrol/Sentiment-analysis-of-financial-news-data-master/code/quick_scraper.py", line 7, in <module>
    from .title_scrape import *
ModuleNotFoundError: No module named '__main__.title_scrape'; '__main__' is not a package
Also, it would be helpful if you could specify the order in which the files need to be executed to test-run the full project.
PS: Thumbs up, good work.
@rajkotraja Hi, thank you for your appreciation. The project is still under development, so more errors may turn up until the development cycle is complete. To fix the current error, replace line 7 in quick_scraper.py with the following:
import scrape_with_bs4
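For context, here is a rough sketch of why the original line fails: a relative import such as `from .title_scrape import *` only resolves when the file is loaded as part of a package, not when it is run directly as a script. The example builds a throwaway package in a temporary directory and runs the same file both ways. The package name `demo_pkg` and the file contents are made up for illustration and are not taken from this project.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Hypothetical package layout, standing in for the project's code directory.
    pkg = Path(root) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "title_scrape.py").write_text("TITLE = 'ok'\n")
    (pkg / "quick_scraper.py").write_text(
        "from .title_scrape import *\nprint(TITLE)\n"
    )

    # Run as a plain script: there is no parent package, so the relative import fails.
    as_script = subprocess.run(
        [sys.executable, str(pkg / "quick_scraper.py")],
        capture_output=True, text=True,
    )

    # Run as a module inside the package: the relative import resolves.
    as_module = subprocess.run(
        [sys.executable, "-m", "demo_pkg.quick_scraper"],
        cwd=root, capture_output=True, text=True,
    )

print("script exit code:", as_script.returncode)
print("module output:", as_module.stdout.strip())
```

On Python 3.6 the direct run reports the ModuleNotFoundError shown above; newer versions typically word it as "attempted relative import with no known parent package".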
You also need to set up a Scrapy project directory in order to use quick_scraper. To do this, create a new Scrapy project with the following command:
scrapy startproject tutorial
Then place the quick_scraper and scrape_with_bs4 files under the spiders directory.
To know more about how to do this, you can check the following link
After that, first scrape the URLs using the archive_scraper file. Then run the quick_scraper file with the following command:
scrapy crawl yolo
I will update the repository with the complete Scrapy setup directory after some time.
Thank you for the quick response.
Did as you suggested.
scrapy crawl yolo
Traceback (most recent call last):
  File "/Users/SiddharthRaja/miniconda3/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/cmdline.py", line 149, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/crawler.py", line 249, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/crawler.py", line 137, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/crawler.py", line 336, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/Users/SiddharthRaja/miniconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/SiddharthRaja/Documents/Moneycontrol/Sentiment-analysis-of-financial-news-data-master/code/tutorial/tutorial/spiders/quick_scraper.py", line 9, in <module>
    class ContentSpider(scrapy.Spider):
  File "/Users/SiddharthRaja/Documents/Moneycontrol/Sentiment-analysis-of-financial-news-data-master/code/tutorial/tutorial/spiders/quick_scraper.py", line 24, in ContentSpider
    NEWS={'reuters.com':sc_reuters,'thehindu.com':sc_thehindu,'economictimes.indiatimes':sc_econt,
NameError: name 'sc_reuters' is not defined
I am now getting the error above.
Also, if you are already preparing a complete guide, I will wait. Let me know if there is a quick fix for this error.
Thank you.
@rajkotraja Currently, this is buggy. I am refactoring the code. It will take some time before it is ready to be used without any hassle.
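For anyone hitting the same NameError in the meantime, one likely cause (an assumption, since the contents of scrape_with_bs4 are not shown in this thread) is that `import scrape_with_bs4` binds only the module name, so a bare `sc_reuters` is not visible inside quick_scraper.py. The sketch below uses a stand-in module built in memory, with a made-up `sc_reuters` body, to show the difference between the unqualified and the qualified lookup.

```python
import sys
import types

# Stand-in for scrape_with_bs4.py; the function body is hypothetical.
stub = types.ModuleType("scrape_with_bs4")
exec("def sc_reuters(url):\n    return 'reuters:' + url\n", stub.__dict__)
sys.modules["scrape_with_bs4"] = stub

import scrape_with_bs4  # binds only the name `scrape_with_bs4`

# The unqualified name is not defined in this namespace.
try:
    NEWS = {"reuters.com": sc_reuters}
    unqualified_ok = True
except NameError:
    unqualified_ok = False

# Qualifying the name with the module works.
NEWS = {"reuters.com": scrape_with_bs4.sc_reuters}

print(unqualified_ok)
print(NEWS["reuters.com"]("example"))
```

If this is indeed the cause, either qualifying each scraper function with the module name as above or importing the names explicitly (`from scrape_with_bs4 import sc_reuters`) should get past line 24. Whether the rest of the spider then runs is a separate question.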