Monday, April 11, 2011

Weka: "Train and test set incompatible"

Scenario: A training and a test set, both .csv files. Weka does not like them and deems them incompatible.
Search engine magic fail: According to Google, the attribute names are out of order. If you, like me, spent hours staring at the two .csv files side by side and saw no difference in order, spelling, spaces, etc., do not fret like I did!

SOLUTION:
  1. Open both training and test set files in Weka and save as .arff files.
  2. Open them with a plain text editor (Notepad, Notepad++, etc.).
  3. Check the nominal attributes (i.e. the non-numeric ones), whose possible values Weka has listed in the header in this format: {..}.
  4. Compare each nominal attribute's value list in the training file against the same attribute in the test file. Weka lists the values in the order they first appear in the data, so the two files can declare the same values in a different order. Rearrange the lists so that each attribute has an identical value list in both files. It should now run without throwing an "incompatible" error.
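
If eyeballing the two headers gets tedious, the comparison in steps 3 and 4 can be scripted. The following is only a minimal sketch, not part of Weka: the function names are made up for illustration, and it assumes simple headers where attribute names contain no spaces or quotes and nominal values contain no commas.

```python
import re

# Matches header lines such as: @attribute outlook {sunny,overcast,rainy}
# Assumption: unquoted attribute names, no commas inside values.
NOMINAL = re.compile(r"^@attribute\s+(\S+)\s+\{(.*)\}\s*$", re.IGNORECASE)


def nominal_attributes(arff_text):
    """Return {attribute name: list of declared values} from an ARFF header."""
    attrs = {}
    for line in arff_text.splitlines():
        line = line.strip()
        if line.lower().startswith("@data"):
            break  # the header ends where the data section starts
        match = NOMINAL.match(line)
        if match:
            values = [v.strip() for v in match.group(2).split(",")]
            attrs[match.group(1)] = values
    return attrs


def header_mismatches(train_text, test_text):
    """Return {name: (train values, test values)} for attributes that differ."""
    train = nominal_attributes(train_text)
    test = nominal_attributes(test_text)
    return {
        name: (train.get(name), test.get(name))
        for name in sorted(set(train) | set(test))
        if train.get(name) != test.get(name)
    }
```

To use it, read both saved files as text (for example with open("train.arff").read() and open("test.arff").read(), where the file names are placeholders) and pass them to header_mismatches. Every attribute it reports is one whose {..} list needs to be made identical in the two files; an empty result means the nominal value lists already match.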

2 comments:

  1. Thanks. Very helpful comment. have wasted 1 hour with this. you saved from wasting more :)

    1. You're welcome, I wasted some time with it too, so I'm glad to help others :)