Steven Shack
2013-02-12 04:39:33 UTC
I've got a naive Bayes classifier workflow in the KnowledgeFlow portion of Weka.
I can read small training and test sets in from a Postgres database fine, where small means up to 40K rows for training and
30K rows for testing.
When I try reading in my large test set (400K rows), I get a Java ArrayIndexOutOfBoundsException and the workflow
stops. However, I have no clue what is generating this error within my workflow, or whether it's caused by bad data being
read in from the database (and if so, which column and row?).
Given that there's no indication where the problem lies, is there any way I can narrow this down a bit? I've
attached my workflow below. It works nicely for smaller datasets.
-- Java error.
Notifying data listeners (ClassAssigner)
Notifying data listeners (ClassAssigner)
In accept data set
Notifying listeners (training set maker)
java.lang.ArrayIndexOutOfBoundsException
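One generic way I could try to narrow this down myself, assuming I can rerun the build against an arbitrary row prefix (e.g. via a `LIMIT` clause in the database query), is to bisect the prefix size until I find the first size that crashes. This is only a sketch of the bisection idea, not Weka-specific code; `runs_ok` stands in for whatever actually runs the workflow on the first `k` rows:

```python
def first_failing_prefix(n_rows, runs_ok, lo=0):
    """Binary-search the smallest prefix size in (lo, n_rows] for which
    runs_ok(prefix_size) is False, assuming failures are monotone
    (once a prefix fails, every larger prefix fails too)."""
    hi = n_rows
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if runs_ok(mid):
            lo = mid      # a prefix of size mid still works
        else:
            hi = mid      # the failure is already present by size mid
    return hi

# Toy stand-in: pretend row 123456 (0-based) is the bad one, so any
# prefix that includes it "crashes".
bad_row = 123456
ok = lambda k: k <= bad_row
print(first_failing_prefix(400_000, ok))  # 123457
```

With roughly log2(400K) ≈ 19 runs this would point at a single row, which could then be inspected in the database directly. It only helps if the error really is data-dependent and monotone, of course; if it's a size/memory issue inside Weka, every sufficiently large prefix will fail instead.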