Parallel find_packages_allowing_duplicates?
mikepurvis opened this issue · 4 comments
For large workspaces (600+ packages), the serialized parsing of package.xml files in find_packages_allowing_duplicates can start to be non-trivial (catkin_pkg/src/catkin_pkg/packages.py, lines 109 to 110 at a4cb118).
Without changing the interface of the function, would we consider allowing this work to be spread over multiple threads or processes, possibly triggered by a threshold on the number of packages?
Absolutely! The caller shouldn't care how the requested information is being gathered. If that loop can be parallelized, that would be great.
A naive threading implementation is slower than the simple loop, so the work isn't IO-bound (the GIL keeps the XML parsing from running concurrently across threads). I get ~1.5s with the current implementation and <0.5s running it with a multiprocessing map. I'll send a PR shortly.
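For reference, a rough sketch of how such a comparison could be reproduced (the workspace path and pool size below are placeholders, not the actual setup measured above; multiprocessing.dummy provides a thread-backed Pool with the same API as the process-backed one):

```python
# Rough benchmark sketch, not the actual measurement: times the serial loop
# against a thread pool and a process pool for parsing package.xml files.
import os
import time
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

from catkin_pkg.package import parse_package
from catkin_pkg.packages import find_package_paths


def timed(label, fn):
    start = time.time()
    fn()
    print('%s: %.2fs' % (label, time.time() - start))


if __name__ == '__main__':
    basepath = '/path/to/workspace/src'  # placeholder for a 600+ package workspace
    paths = [os.path.join(basepath, p) for p in find_package_paths(basepath)]

    timed('serial', lambda: [parse_package(p) for p in paths])
    with ThreadPool(4) as pool:
        timed('threads', lambda: pool.map(parse_package, paths))
    with Pool(4) as pool:
        timed('processes', lambda: pool.map(parse_package, paths))
```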
The simplest implementation is like so:
```python
# Inside find_packages_allowing_duplicates(); needs `import multiprocessing` and `import os`.
package_paths = find_package_paths(basepath, exclude_paths=exclude_paths, exclude_subspaces=exclude_subspaces)
# find_package_paths returns paths relative to basepath, so join before parsing.
parsed_packages = multiprocessing.Pool(4).map(parse_package, [os.path.join(basepath, p) for p in package_paths])
return dict(zip(package_paths, parsed_packages))
```
However, preserving the behaviour of the warnings argument requires passing something extra through the map and then manually aggregating the results in the parent process, which unfortunately necessitates some additional wrapping.
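A minimal sketch of that wrapping might look like the following (illustrative only, not the implementation merged in #171; the helper name _parse_with_warnings is made up, and the pool size is arbitrary):

```python
import multiprocessing
import os

from catkin_pkg.package import parse_package
from catkin_pkg.packages import find_package_paths


def _parse_with_warnings(path):
    # Hypothetical helper; must live at module level so it can be pickled for the pool.
    # Each worker collects its own warnings and ships them back to the parent.
    local_warnings = []
    package = parse_package(path, warnings=local_warnings)
    return package, local_warnings


def find_packages_allowing_duplicates(basepath, exclude_paths=None, exclude_subspaces=False, warnings=None):
    package_paths = find_package_paths(basepath, exclude_paths=exclude_paths, exclude_subspaces=exclude_subspaces)
    xml_paths = [os.path.join(basepath, path) for path in package_paths]
    # A package-count threshold (as floated above) could decide whether to use a pool at all.
    with multiprocessing.Pool(4) as pool:
        results = pool.map(_parse_with_warnings, xml_paths)
    packages = {}
    for path, (package, local_warnings) in zip(package_paths, results):
        packages[path] = package
        if warnings is not None:
            warnings.extend(local_warnings)
    return packages
```

Returning the per-package warning lists from the workers is what lets the parent replay them into the caller-supplied list, since state mutated in a child process isn't visible to the parent.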
Addressed by #171.