maldevel/EmailHarvester

Controlling which search engines are used or not?

Closed this issue · 2 comments

Hi again,

[I think I forgot to say that earlier: Thanks for the great job!]

From what I understand of the source code, I think the argument parsing accepts a list of search engines to exclude with "-r", but "-e" accepts only a single search engine to include.

Wouldn't it be more flexible if the user could pass a list to both "-e" and "-r", with the tool then using the "-e" list minus the "-r" list?
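To make the proposal concrete, here is a minimal sketch of how both options could take comma-separated lists. It is not the project's actual parser: the `AVAILABLE` list, the `csv_list` helper and the `select_engines` function are names made up for illustration, and only the `-e`/`-r` flags and the "all" keyword come from the existing tool.

```python
import argparse

# Hypothetical engine names, for illustration only.
AVAILABLE = ["google", "bing", "yahoo", "linkedin"]


def csv_list(value):
    """Split a comma-separated option value into a list of clean names."""
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def select_engines(argv, available=AVAILABLE):
    """Return the engines named by -e, minus those named by -r."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-e", "--engines", type=csv_list, default=["all"])
    parser.add_argument("-r", "--exclude", type=csv_list, default=[])
    args = parser.parse_args(argv)

    # "all" expands to every known engine; otherwise use the given list.
    included = list(available) if "all" in args.engines else args.engines
    # Drop repeated names while keeping the order the user typed.
    included = list(dict.fromkeys(included))

    # Reject names that do not correspond to a known engine.
    unknown = [n for n in included + args.exclude if n not in available]
    if unknown:
        parser.error("unknown search engine(s): " + ", ".join(unknown))

    return [name for name in included if name not in args.exclude]
```

With this sketch, `-e google,bing,yahoo -r bing` would select google and yahoo, and `-r linkedin` alone would select everything except linkedin.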

Another option would be to use regexes to select search engines. But that might be less "deterministic": two people running the same command line with two different lists of plugins could end up with different selections.
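A rough sketch of the regex variant, to show where the non-determinism comes from. The `match_engines` helper and the engine names are hypothetical; the point is only that the result depends on which plugins happen to be installed.

```python
import re


def match_engines(pattern, available):
    """Return the names in `available` that fully match `pattern`, sorted."""
    regex = re.compile(pattern)
    return sorted(name for name in available if regex.fullmatch(name))


# Two hypothetical installations with different plugin lists.
install_a = ["google", "bing", "yahoo"]
install_b = ["google", "googleplus", "bing", "yahoo"]

# The same pattern selects different engines on each installation.
selected_a = match_engines(r"google.*", install_a)
selected_b = match_engines(r"google.*", install_b)
```

Here `selected_a` contains only google, while `selected_b` also picks up googleplus, even though the command line was identical.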

Another idea might be to group the search engines by theme. I've noticed that there are currently three groups: search engines, meta-search engines and social networks.
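Grouping could look something like the sketch below. Everything in it is an assumption for illustration: the `GROUPS` table, the group labels, the engine names and the "@group" syntax are not part of the current tool.

```python
# Hypothetical grouping; the engine names and group labels are examples only.
GROUPS = {
    "search": ["google", "bing", "yahoo"],
    "meta": ["dogpile"],
    "social": ["linkedin", "twitter"],
}


def expand_groups(names, groups=GROUPS):
    """Expand '@group' tokens into their member engines, preserving order."""
    expanded = []
    for name in names:
        if name.startswith("@"):
            members = groups[name[1:]]  # raises KeyError for an unknown group
        else:
            members = [name]
        # Skip engines that were already added by an earlier token.
        for member in members:
            if member not in expanded:
                expanded.append(member)
    return expanded
```

A value such as `-e @social,google` would then expand to the social-network engines followed by google, and the same expansion could be applied to the "-r" list.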

You can take a look at my fork, this feature has been implemented in it.

I've made a pull request, which has not been accepted or rejected yet.

Hi,

I've modified the code myself as follows to allow -e to accept several values and to enable both -e & -r to be used together:

    all_emails = []
    # Engines named with -r (comma-separated) are removed from the selection.
    excluded = set()
    if args.exclude:
        excluded = set(args.exclude.split(','))
    # -e accepts either "all" or a comma-separated list of engine names.
    if args.engines == "all":
        print(green("[+] Searching everywhere"))
        included = set(plugins)
    else:
        included = set(args.engines.split(','))
    # Final selection: the -e list minus the -r list, sorted for stable output.
    final_engines = sorted(included.difference(excluded))
    print(green("[+] Using engine list: " + ", ".join(final_engines)))
    for search_engine in final_engines:
        all_emails += plugins[search_engine]['search'](domain, limit)
    all_emails = unique(all_emails)