felixfung/InfoFlow

"java.lang.OutOfMemoryError: Java heap space"


Hi Felix,

Hope you are well! I read your paper and wanted to give InfoFlow a try on a big graph I am working with (1.3M nodes, 30M edges). In Infomap this graph fits in under 60 GB of memory, but no matter how much memory I allocate to the virtual machine, I keep getting memory errors. I have no familiarity with Scala or Spark. Is there something obvious I'm missing?

I adapted my setup from your citation network demo notebook; my settings and the error log are attached below.

Thank you for your time!
-Bernie

infoflow_config = {
    "Graph": '/home/jupyter/data/user_streaming_20220401_20220401/gcs/graph/global_edge_weights_20220401_20220401.5_plus.net',
    "spark configs": {
        "Master": "local[*]",
        "num executors": "1",
        "executor cores": "4",
        "driver memory": "100G",
        "executor memory": "100G",
    },
}

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/06/15 01:04:50 INFO SparkContext: Running Spark version 2.1.1
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/spark/jars/hadoop-auth-2.7.3.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
22/06/15 01:04:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/06/15 01:04:50 INFO SecurityManager: Changing view acls to: jupyter
22/06/15 01:04:50 INFO SecurityManager: Changing modify acls to: jupyter
22/06/15 01:04:50 INFO SecurityManager: Changing view acls groups to: 
22/06/15 01:04:50 INFO SecurityManager: Changing modify acls groups to: 
22/06/15 01:04:50 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(jupyter); groups with view permissions: Set(); users  with modify permissions: Set(jupyter); groups with modify permissions: Set()
22/06/15 01:04:50 INFO Utils: Successfully started service 'sparkDriver' on port 45831.
22/06/15 01:04:50 INFO SparkEnv: Registering MapOutputTracker
22/06/15 01:04:50 INFO SparkEnv: Registering BlockManagerMaster
22/06/15 01:04:50 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/06/15 01:04:50 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/06/15 01:04:50 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-39e90c5b-da52-412e-a094-d41905e2c216
22/06/15 01:04:50 INFO MemoryStore: MemoryStore started with capacity 434.4 MB
22/06/15 01:04:50 INFO SparkEnv: Registering OutputCommitCoordinator
22/06/15 01:04:50 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/06/15 01:04:50 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.174.15.236:4040/
22/06/15 01:04:51 INFO SparkContext: Added JAR file:/home/jupyter/src/third_party/infoflow/target/scala-2.11/infoflow_2.11-1.1.1.jar at spark://10.174.15.236:45831/jars/infoflow_2.11-1.1.1.jar with timestamp 1655255091016
22/06/15 01:04:51 INFO Executor: Starting executor ID driver on host localhost
22/06/15 01:04:51 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38401.
22/06/15 01:04:51 INFO NettyBlockTransferService: Server created on 10.174.15.236:38401
22/06/15 01:04:51 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/06/15 01:04:51 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.174.15.236, 38401, None)
22/06/15 01:04:51 INFO BlockManagerMasterEndpoint: Registering block manager 10.174.15.236:38401 with 434.4 MB RAM, BlockManagerId(driver, 10.174.15.236, 38401, None)
22/06/15 01:04:51 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.174.15.236, 38401, None)
22/06/15 01:04:51 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.174.15.236, 38401, None)
CPU times: user 20.3 ms, sys: 4.81 ms, total: 25.1 ms
Wall time: 1min 49s
Exception in thread "netty-rpc-env-timeout" Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.addConditionWaiter(AbstractQueuedSynchronizer.java:1896)
	at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2077)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
	at java.base/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.regex.Matcher.<init>(Matcher.java:248)
	at java.base/java.util.regex.Pattern.matcher(Pattern.java:1133)
	at scala.util.matching.Regex.unapplySeq(Regex.scala:246)
	at PajekReader$$anonfun$apply$2.apply(pajekreader.scala:63)
	at PajekReader$$anonfun$apply$2.apply(pajekreader.scala:47)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at PajekReader$.apply(pajekreader.scala:47)
	at GraphReader$.apply(graphreader.scala:15)
	at InfoFlowMain$.readGraph(main.scala:107)
	at InfoFlowMain$.main(main.scala:38)
	at InfoFlowMain.main(main.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Hi Bernie,

Thank you for your interest in this project. If I had to guess, the Java environment inside Jupyter is getting a low heap space allocation, which is causing the crash. What happens if you run this in a native Spark environment?
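
For what it is worth, the log seems consistent with that guess: with Spark's defaults (300 MB of reserved memory and spark.memory.fraction = 0.6), the reported "MemoryStore started with capacity 434.4 MB" is exactly what a 1 GB driver heap, i.e. Spark's default driver memory, would give, which suggests the "driver memory": "100G" entry never reached the JVM. A rough back-of-the-envelope check (plain Python using Spark's default constants; this is not InfoFlow code):

# Rough check, assuming Spark defaults:
# MemoryStore capacity ~= (driver heap - 300 MB reserved) * spark.memory.fraction
RESERVED_MB = 300        # Spark's reserved system memory
MEMORY_FRACTION = 0.6    # default spark.memory.fraction

def memorystore_capacity_mb(heap_mb):
    """Approximate unified-memory capacity that Spark logs at startup."""
    return (heap_mb - RESERVED_MB) * MEMORY_FRACTION

print(f"{memorystore_capacity_mb(1024):.1f} MB")        # ~434.4 MB -> the 1 GB default heap
print(f"{memorystore_capacity_mb(100 * 1024):.1f} MB")  # ~61260 MB (~60 GB) -> what 100G should show

If that is what is happening, the heap has to be set when the driver JVM is launched, for example by passing --driver-memory 100G to spark-submit or setting spark.driver.memory in spark-defaults.conf; setting it from inside an already-running Jupyter session comes too late to resize the JVM.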

Hi Felix,
I have a similar issue when running InfoFlow on a big graph. I also tried running the code in a native Spark environment (calling spark-submit directly) and got the same error message as in Bernie's post. The "driver memory" and "executor memory" values in the config file do not seem to set the Java heap space. Do you have any idea how to solve this problem?
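To make it concrete, the kind of launch I would expect to be needed, if the heap really has to be fixed at JVM startup, is sketched below. This is only a sketch, written as a small Python launcher so the arguments are explicit (the same flags apply to a plain spark-submit call): the jar name is taken from Bernie's log, the config path is a placeholder, and I am assuming InfoFlow reads its config file from the first program argument.

import subprocess

# Sketch only: pass the heap size on the spark-submit command line so it is
# applied when the driver JVM starts; memory values read from a config file
# after startup cannot resize an already-running JVM in local/client mode.
cmd = [
    "spark-submit",
    "--class", "InfoFlowMain",                    # main class, as in the stack trace above
    "--master", "local[*]",
    "--driver-memory", "100G",                    # fixes the driver JVM heap at launch
    "target/scala-2.11/infoflow_2.11-1.1.1.jar",  # jar name as in Bernie's log
    "config.json",                                # placeholder path to the InfoFlow config
]
subprocess.run(cmd, check=True)
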
Many thanks for your time.
Yifan