coin-or/Cbc

CbcMain not thread safe

spoorendonk opened this issue · 18 comments

I am trying to run some MIP problems in parallel using the CbcMain0/CbcMain1 pattern.
I am on https://github.com/coin-or/Cbc/tree/stable/2.10
Problem is both on Linux and Windows.

My code snippet is

    std::vector<int> probIndices(10);
    std::iota(probIndices.begin(), probIndices.end(), 0);
    std::for_each(std::execution::par,
                  std::begin(probIndices),
                  std::end(probIndices),
                  [&](int idx) {
                    OsiClpSolverInterface clp;
                      clp.readLp("folio10_7.lp");
                      clp.initialSolve();
                      clp.resolve();
                      CbcModel cbcModel(clp);
                      CbcMain0(cbcModel);
                     
                      const char *argv[20];
                      int argc = 0;
                      std::string cbcExe = "cbc";
                      std::string cbcSolve = "-solve";
                      std::string cbcQuit = "-quit";
                      std::string cbcLog = "-log";
                      std::string cbcLogSet = "3";
                      std::string cbcGap = "-ratio";
                      std::string cbcGapSet = "0.05";
                      std::string cbcTime = "-seconds";
                      std::string cbcTimeSet = "1000";
                      std::string cbcCutoff = "-cutoff";
                      std::string cbcCutoffSet = "1000000";
                      argv[argc++] = cbcExe.c_str();
                      argv[argc++] = cbcLog.c_str();
                      argv[argc++] = cbcLogSet.c_str();
                      argv[argc++] = cbcGap.c_str();
                      argv[argc++] = cbcGapSet.c_str();
                      argv[argc++] = cbcTime.c_str();
                      argv[argc++] = cbcTimeSet.c_str();
                      argv[argc++] = cbcCutoff.c_str();
                      argv[argc++] = cbcCutoffSet.c_str();
                      argv[argc++] = cbcSolve.c_str();
                      argv[argc++] = cbcQuit.c_str();
                      CbcMain1(argc, argv, cbcModel);
                      if (cbcModel.status() == -1)
                      {
                          throw std::domain_error("CBC fail");
                      }
                  });
    return 0;

It fails when parsing the commandline parameters (I think) The output is like

...
Welcome to the CBC MILP Solver 
Version: 2.10 
Build Date: Sep  1 2020 

command line - cbc -log 3 -ratio 0.05 -seconds 1000 -cutoff 1000000 -solve -quit (default strategy 1)
logLevel was changed from 1 to 3
ratioGap was changed from 0 to 0.05
String of -log is illegal for double parameter seconds value remains -1
No match for 3 - ? for list of commands
Welcome to the CBC MILP Solver 
Version: 2.10 
Build Date: Sep  1 2020 

command line - cbc -log 3 -ratio 0.05 -seconds 1000 -cutoff 1000000 -solve -quit (default strategy 1)
ratioGap was changed from 0.05 to 0.05
String of -cutoff is illegal for double parameter seconds value remains No match for 1000 - ? for list of commands-1

No match for 1000000 - ? for list of commands

Running single threaded is no problem, that is, std::execution::seq.

I was not able to readily track this down in the CbcMain1 function, it is a rather complicated function.
Anyone experienced this before?

It's not clear to me, but did you build with --enable-cbc-parallel? That's probably necessary to get a (probably) thread-safe version of the code. Otherwise, there are some global variables and things that mess with thread safety. I'm in the process of trying to eliminate all the global variables now to deal with this issue in the current development version.

Yes --enable-cbc-parallel is on.

In the DIp code (stable/0.95) I found

https://github.com/coin-or/Dip/blob/b8f05414090729e28187126f3be96564ca3a7f60/Dip/src/DecompModel.cpp#L398-L409

and underneath it uses CbcMain for single threaded. I guess the same problem was encountered here. It works with the cbc.branchAndBound() in parallel. But that is not the recommended (most efficient) way as far as I understand?

Hmm, I don't think we did encounter issues the last time we experimented with it, but it's been a while. Since I'm digging around in the code right now anyway, I'll see if I can see any fix for this. Yes, the results will likely not be as good when calling cbc.branchAndBound() directly.

Cool. That would be super helpful

@jjhforrest Ah, yes, any reason not to change CBC_THREAD_SAFE to CBC_THREAD so that we get thread safe code by default?

Excellent. It seems to solve my problem.

When I build static libs are there other defines I should remember to include when linking coin into my executable?
Is it documented somewhere?

@jjhforrest Ah, yes, any reason not to change CBC_THREAD_SAFE to CBC_THREAD so that we get thread safe code by default?

Even if one disabled multithreading for Cbc you want Cbc to be thread-safe - always and everytime.
That is, I don't understand why there is an ifdef at all.

The stuff inside CBC_THREAD_SAFE is a quick hack. I'm trying to clean this up so that we'll always have thread safety, as @svigerske said.

When I build static libs are there other defines I should remember to include when linking coin into my executable?
Is it documented somewhere?

I am adding

            "CBC_THREAD",
            "COIN_HAS_COINUTILS",
            "COIN_HAS_OSI",
            "COIN_HAS_CLP",
            "COIN_HAS_CGL",
            "COIN_HAS_CBC",

as defines for now when linking against Cbc. Is there anything else I should add?

If you're using the autotools, it shouldn't be necessary to define any of those symbols. The COIN_HAS_XXX variables should be defined automatically in CbcConfig.h (or a header included by it). CBC_THREAD is also auto-defined if you are building with --enable-cbc-parallel. There are some symbols you can define to turn on experimental stuff or that enable fixes for some edge case, but generally speaking, the symbol definition is taken care of by configure when necessary.

My CbcConfig.h only has

/* src/config_cbc.h.  Generated by configure.  */
/* src/config_cbc.h.in. */

#ifndef CBC_VERSION

/* Version number of project */
#define CBC_VERSION "2.10"

/* Major Version number of project */
#define CBC_VERSION_MAJOR 2

/* Minor Version number of project */
#define CBC_VERSION_MINOR 10

/* Release Version number of project */
#define CBC_VERSION_RELEASE 9999

#endif

When I search the headers of dist/include I see nothing like #define COIN_HAS_CLP 1 which is added to the build/Cbc/2.10/src/config.h by the configure script (and not copied into CbcConfig.h?).

There is a lot of "#ifdef COIN_HAS_CLP" in other headers in dist/include. Maybe my c++ build skills are not super tight, but I am wondering if I link with the libs in dist/lib and include headers dist/include could I run into a problem if I do not explicitly do -DCOIN_HAS_CLP=1?

I am not using autotools but cmake when linking coin to my own code.

It's a sign for unclean code if the Cbc API header have ifdef's like these. If that is in parts of headers that are relevant for a user of Cbc, then that can be a bug, too. The user is certainly not expected to guess a number of COIN_HAS_XYZ to be defined manually. This won't be fixed in 2.10 anymore, though, and @tkralphs is doing some cleanup based on master that may also reduce the appearance of such ifdef's in installed header files (until someone puts them back in again).

Excellent. It seems to solve my problem.

So the -DCBC_THREAD_SAFE it worked in debug mode not in release (!?). Can't figure out what the difference is.
However I look at

Cbc/Cbc/src/CbcSolver.cpp

Lines 1351 to 1354 in b127c88

#ifdef CBC_THREAD_SAFE
// Initialize argument
int whichArgument = 1;
#endif

and then the usage in e.g.

Cbc/Cbc/src/CbcSolver.cpp

Lines 1193 to 1204 in b127c88

static std::string
CoinReadGetCommand(int &whichArgument, int argc, const char *argv[])
{
std::string field;
if (whichArgument < argc)
field = argv[whichArgument++];
else
field = "quit";
if (field[0] == '-')
field = field.substr(1);
return field;
}

This is not thread safe is it? Anyone can increment whichArgument?

Embarrassing.

Of course it was my compile definitions that got screwed up. -DCBC_THRED_SAFE works ok.

I can confirm, that adding -DCBC_THREAD_SAFE to a coinbrew install avoids race conditions between separate instances (with separate problems) of calling the C API that run at the same time. For anybody looking for the exact invocation, here's what worked for me:

coinbrew build Cbc@master ADD_CXXFLAGS=-DCBC_THREAD_SAFE --no-prompt

And I guess as the cleanup in the refactor branch removes the #ifdef CBC_THREAD_SAFE, this will also take care of the thread safety in Cbc?

The cleanup in the refactor branch should take care of thread safety, yes. There will be no need to build with CBC_THREAD_SAFE anymore.