HewlettPackard/swarm-learning

Results vary between hosts when there is data disparity between them.

VuDuc09 opened this issue · 4 comments

Issue description

  • issue description: I tried training an SQLi CNN model, but when the hosts have different data (different data or a different number of samples), each host produces different results. When I run the MNIST example with data imbalanced between hosts, it runs perfectly. I made no adjustments except for the Python model file and the data. Has anyone encountered this and can give me some advice?
  • occurrence - consistent or rare: consistent
  • commands used for starting containers: same as in the example
  • docker logs [APLS, SPIRE, SN, SL, SWCI]:

Swarm Learning Version: 2.0.0

OS and ML Platform

  • details of host OS: Ubuntu 20.04
  • details of ML platform used: CNN
  • details of Swarm Learning cluster (Number of machines, SL nodes, SN nodes): 2 hosts, 2 SN, 2 SL

Quick Checklist: Respond [Yes/No]

  • APLS server web GUI shows available Licenses? Yes
  • If Multiple systems are used, can each system access every other system? Yes
  • Is Password-less SSH configuration setup for all the systems? No
  • If GPU or other protected resources are used, does the account have sufficient privileges to access and use them? Yes
  • Is the user id a member of the docker group? Yes

Additional notes

  • Are you running documented example without any modification? Yes

Please, what account and password do I need to log in and pull the Swarm Learning image?
Message:
Unable to find image 'hub.myenterpriselicense.hpe.com/hpe/swarm-learning/sn:2.0.0' locally
2.0.0: Pulling from hpe/swarm-learning/sn

Hi @VuDuc09, the final Swarm Learning model will be the same across the hosts. If you have used different datasets on the hosts, the inference results could be different. Let us know if you still have any concerns.
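
One way to check the claim that the final model is identical on every host is to fingerprint the saved weights on each host and compare the digests. The following is only a minimal sketch: `weights_fingerprint` and the sample weight lists are hypothetical and not part of Swarm Learning. With Keras, one might build the input by flattening each array returned by `model.get_weights()`, but that step is an assumption about your setup.

```python
import hashlib
import struct


def weights_fingerprint(layers):
    """Return a SHA-256 hex digest over a list of flat weight lists."""
    digest = hashlib.sha256()
    for layer in layers:
        # Hash the layer length too, so [[1.0], [2.0]] and [[1.0, 2.0]] differ.
        digest.update(struct.pack("<Q", len(layer)))
        for value in layer:
            digest.update(struct.pack("<d", float(value)))
    return digest.hexdigest()


# Hypothetical weights, as they might be dumped on each host after training.
host1_weights = [[0.25, -1.5, 3.0], [0.125]]
host2_weights = [[0.25, -1.5, 3.0], [0.125]]
host2_local = [[0.25, -1.5, 3.0], [0.126]]  # e.g. a model that was never merged

same_model = weights_fingerprint(host1_weights) == weights_fingerprint(host2_weights)
diverged = weights_fingerprint(host1_weights) != weights_fingerprint(host2_local)
```

If the digests match, the hosts hold the same model and any remaining difference comes from the evaluation data or the evaluation code. If they differ, the hosts are not evaluating the same model, which suggests the saved weights did not come from the final merge.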

When I test with the MNIST dataset example, I split the dataset by a ratio between the 2 hosts and I get the same result on both. Besides that, in both cases (the MNIST dataset with the example model, and my dataset with my model) the test data is the same on the two hosts; only the training data is different.
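
Since the test data is the same on both hosts, the difference can be quantified by comparing the per-sample predictions from each host directly. This is a small sketch; `disagreement_rate` and the sample prediction lists are hypothetical and only illustrate the comparison.

```python
def disagreement_rate(preds_a, preds_b):
    """Fraction of test samples on which two hosts predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both hosts must predict on the same test set")
    if not preds_a:
        return 0.0
    differing = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return differing / len(preds_a)


# Hypothetical predicted labels for the same five test samples on each host.
host1_preds = [0, 1, 1, 0, 1]
host2_preds = [0, 1, 0, 0, 1]

rate = disagreement_rate(host1_preds, host2_preds)  # 1 of 5 samples differ
```

A rate of zero on identical test data would be consistent with both hosts using the same final model. A non-zero rate means the models being evaluated are not the same.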

Hi @VuDuc09, we could not completely understand the use case that you mentioned. As this is not exactly an issue with the Swarm Learning core components, we can continue this in the discussion.