tasos-py/Search-Engines-Scraper

Hi. Can I get results from the News section of the Google engine? Second question: can I store the results of each page in a JSON file? Thanks.

fazialnjd opened this issue · 2 comments

By default, this code returns results from the "All" section of Google, right?
How can I get results from the News section?
from search_engines import Google

engine = Google()
results = engine.search("my query")
links = results.links()

print(links)

About your first question - yes, it returns all results. The functionality for specific results (News, Images, etc) is not implemented, so you won't be able to use it, unfortunately. I may implement it in the future, but currently I don't have the time.

About your second question - no, I'm afraid you can't do that either. Pagination info is not stored in the SearchResults object; all results are stored in one big list. However, SearchEngine._collect_results() is responsible for collecting each page's results, so you could override or patch this method to achieve that.
If you want to save all results, there are two ways. You can either use the SearchEngine.output() method, or get the raw results from SearchResults.results() and save them manually. For example:

import json

from search_engines import Google

engine = Google()
results = engine.search("my query")

# Option 1: create a JSON file using `engine.output()`
engine.output('json', '/path/to/file')

# Option 2: dump the raw results from `results.results()` yourself
with open('/path/to/file.json', 'w') as f:
    json.dump(results.results(), f)
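
If you just need one file per "page" and don't want to patch `_collect_results()`, a rough workaround is to split the flat list into fixed-size chunks and write each chunk to its own JSON file. This is only a sketch: `save_in_chunks` is a hypothetical helper, not part of the library, and it assumes every page holds the same number of results, which may not be true for real search pages.

```python
import json
from pathlib import Path


def save_in_chunks(items, chunk_size, out_dir):
    """Write `items` to numbered JSON files, `chunk_size` items per file.

    Hypothetical helper (not part of search_engines). It approximates
    pages by assuming each one contains exactly `chunk_size` results.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for start in range(0, len(items), chunk_size):
        page_number = start // chunk_size + 1
        path = out_dir / f"page_{page_number}.json"
        with open(path, "w", encoding="utf-8") as f:
            json.dump(items[start:start + chunk_size], f,
                      ensure_ascii=False, indent=2)
        paths.append(path)
    return paths
```

With the `results` object from the example above, you could call it as `save_in_chunks(results.results(), 10, '/path/to/dir')`. The chunk size of 10 is a guess at the page size. If you need the real page boundaries, overriding `_collect_results()` is still the way to go.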