Some algorithms treat a discrete dataset as continuous, resulting in an error.
yasu-sh opened this issue · 13 comments
Dear @jdramsey,
The other day I noticed an error when I ran a typical pipeline to explain the workflow to my colleagues.
Some algorithms raise an error even when the dataset is generated as a Bayes net with 3 categories in the Tetrad Simulation box, i.e. with the default settings.
OS: Windows 10 Pro 22H2, locale = japanese
Java: JDK 21 (Oracle)
PC: CPU i7-7820HQ / 32GB HP note PC
Reproduction steps:
- Start the Tetrad GUI.
- A blank session, untitled1.tet, opens.
- From the toolbar, select [pipeline] - [simulate, search then compare].
- Double-click the Simulation1 box.
- Change no parameters; just press the [Simulate] button.
- Press [OK] in the simulate dataset dialog.
- Press [Done] in the Simulation1 window.
- Double-click the Search1 box.
- Select the PC algorithm. Leave all other options and parameters unchanged.
- Press [Set Parameters].
- Press [Run Search & Generate Graph], again without changing any PC parameters.
- An error message appears: [Stopped with error: Not a continuous data set]
- The user is confused: the dataset appears to be discrete, and the dialog gives no indication of why it was treated as continuous.
I tried some algorithms, not all.
Success: FAS, FCI, GFCI, IMaGES, RFCI
Fail: BOSS, CPC, FGES, FOFC, GRaSP, PC
Some users may have run into this and found workarounds.
Unfortunately I have not. Could somebody tell me a workaround?
Error message when using the PC algorithm:
java.lang.IllegalArgumentException: Not a continuous data set.
at edu.cmu.tetrad.data.CovarianceMatrix.<init>(CovarianceMatrix.java:90)
at edu.cmu.tetrad.data.CovarianceMatrix.<init>(CovarianceMatrix.java:85)
at edu.cmu.tetrad.data.SimpleDataLoader.getCovarianceMatrix(SimpleDataLoader.java:384)
at edu.cmu.tetrad.search.score.SemBicScore.getCovarianceMatrix(SemBicScore.java:240)
at edu.cmu.tetrad.search.score.SemBicScore.<init>(SemBicScore.java:133)
at edu.cmu.tetrad.search.score.SemBicScorer.scoreDag(SemBicScorer.java:49)
at edu.cmu.tetrad.search.score.SemBicScorer.scoreDag(SemBicScorer.java:30)
at edu.cmu.tetrad.algcomparison.statistic.BicEst.getValue(BicEst.java:41)
at edu.cmu.tetrad.search.utils.LogUtilsSearch.stampWithBic(LogUtilsSearch.java:188)
at edu.cmu.tetrad.algcomparison.algorithm.oracle.cpdag.Pc.search(Pc.java:112)
at edu.cmu.tetradapp.model.GeneralAlgorithmRunner.lambda$execute$1(GeneralAlgorithmRunner.java:391)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at edu.cmu.tetradapp.model.GeneralAlgorithmRunner.execute(GeneralAlgorithmRunner.java:366)
at edu.cmu.tetradapp.editor.GeneralAlgorithmEditor$1MyWatchedProcess.watch(GeneralAlgorithmEditor.java:183)
at edu.cmu.tetradapp.util.WatchedProcess.lambda$startLongRunningThread$0(WatchedProcess.java:62)
at java.base/java.lang.Thread.run(Thread.java:1583)
Oh, I know what the problem is there. I fixed it for someone else for the Python version. Here's what happened. I thought it might be a good idea to "stamp" algorithm results with their BIC score. However, the method I used to do this assumed the data was continuous. This is where the exception is coming from.
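For readers following along, here is a minimal sketch of the kind of guard that would avoid this failure. It is not the actual Tetrad fix: the class, the `Dataset` and `VarType` types, and both methods are hypothetical stand-ins written only for illustration, and the score value is a placeholder. It mirrors the call chain in the stack trace above, where a BIC "stamp" step calls a Gaussian scorer that rejects non-continuous data.

```java
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical illustration only; none of these names are Tetrad's API.
public class BicStampSketch {

    enum VarType { CONTINUOUS, DISCRETE }

    /** Stand-in for a dataset; only the column types matter for this sketch. */
    static final class Dataset {
        private final List<VarType> columnTypes;

        Dataset(List<VarType> columnTypes) {
            this.columnTypes = columnTypes;
        }

        boolean isContinuous() {
            return columnTypes.stream().allMatch(t -> t == VarType.CONTINUOUS);
        }
    }

    /**
     * Stand-in for a Gaussian (SEM) BIC scorer. Like the covariance-matrix
     * constructor in the trace, it refuses data that is not continuous.
     */
    static double gaussianBic(Dataset data) {
        if (!data.isContinuous()) {
            throw new IllegalArgumentException("Not a continuous data set.");
        }
        return 0.0; // placeholder; a real scorer would compute this from the data
    }

    /**
     * Guarded stamp: only attempts the Gaussian BIC when the data is
     * continuous, and otherwise returns an empty result instead of throwing.
     */
    static OptionalDouble stampWithBic(Dataset data) {
        if (!data.isContinuous()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(gaussianBic(data));
    }

    public static void main(String[] args) {
        Dataset discrete = new Dataset(List.of(VarType.DISCRETE, VarType.DISCRETE));
        Dataset continuous = new Dataset(List.of(VarType.CONTINUOUS, VarType.CONTINUOUS));

        // The discrete dataset is skipped rather than causing the search to fail.
        System.out.println("discrete stamped: " + stampWithBic(discrete).isPresent());
        System.out.println("continuous stamped: " + stampWithBic(continuous).isPresent());
    }
}
```

With a guard like this, a search on discrete data would simply return its graph without a BIC stamp, which matches the behaviour users expect from the algorithms listed under "Fail" above.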
I'll try to post a revision in the next couple of days.
Thanks.
@cg09 Thanks for your comment.
Running 7.6.1 on my Mac. No idea what Java version, no problem.
I tried Java 11 (Adoptium Temurin JRE 11) and Oracle JDK 17.
The error occurs with both. I am also using Tetrad 7.6.1, the latest release.
This suggests the error occurs only on Windows.
I will fix it. It's fixed in py-tetrad; I'll need to make a note of all the changes I've made since then and post another version. (I suppose I could post a version with just that one change.)
This will be fixed in the upcoming release, which will come out in a few days.
Good news! I'll be waiting for the release.
Let me check FOFC--that shouldn't be stamping a BIC score, but let me check. Maybe it's a different problem...
Well I did see one bug for FOFC--if you make a random MIM in the Graph box, it doesn't show the edges among the latents, though if you close the graph box and re-open it, the edges are shown. Hmm...
@jdramsey I successfully reproduced it as you described.
- Place a Simulation box.
- Double-click the Simulation box.
- Select Random One Factor MIM under Type of Graph.
- Press Simulate.
- Close the dialog by pressing Done.
- Place a Graph box.
- Double-click the Graph box.
- Select Graph, then press OK.
- The edges between latent nodes disappear.
- Close the dialog by pressing Done.
- Double-click the Graph box.
- The edges that were missing at step 9 appear this time.
- If you select 'Directed Acyclic Graph' or 'Structural Equation Model Graph' instead, this symptom does not occur.
I feel like closing this issue. Let me set up on this.