My less than 0.05% share of success :)

Hmm, it seems my book's royalties per ebook will be lower this quarter too, because of the #Packt2k celebration, but it might be a good opportunity for others to grab some books on other topics too. :)
Recently I got the royalty report for 2013 Q3. It was interesting to see how many paper books were sold compared to ebooks: the ratio was approximately 2:3 (paper vs. ebook). I was expecting a lower ratio.


Deeper integration of RapidMiner

It seems the next milestone is coming for the RapidMiner integration for KNIME. Check the following screenshot:
As you can see, some of the RapidMiner nodes (those that only take and produce ExampleSets) are generated and used in a KNIME workflow. The descriptions are not good yet and have to be improved, but it is possible to set the parameters and specify the roles in the configuration, and the icons are used when they are available. There are still things to do, but I think this is a promising step forward. :)


RapidMiner within KNIME with editing

Using RapidMiner from KNIME with the RapidMiner editor was not an easy task. It will require sending patches to RapidMiner to make it easier to embed in other GUI applications (with an OSGi class loader, for example).
I think a lot of things are already working:

  • can load/edit processes
  • can use data properties from KNIME in the editor (so you get automatic validation on parameters), the metadata from RapidMiner is available as KNIME table specifications
  • can use data from other RapidMiner related sources
Highlighting through RapidMiner processes might not be possible, but maybe I am just missing a good idea. Some testing and support for more exotic data types would not hurt. I am not yet satisfied with the startup time and the user interface, so the latter will change, and I will check whether the startup time can be improved (maybe not). The interesting part is the usage of RapidMiner plugins. It would be a great feature if those were available within this node too, so I'll have to check that as well. Workflow variables/parameters might be made available in RapidMiner too.



First on Gild in the following skills:

I am still missing a good score for C (424), C#.NET (176), and SQL (173). Maybe answering more questions and starting to solve code puzzles will help. Anyway, this is not too bad for now. :)


RapidMiner within KNIME

Finally this is working without (known) problems:

So it handles nominal (String) values (in the first setup within RapidMiner the w values were cluster-0 and cluster-1), and we can add/remove columns (the id column is added) and add/remove/generate rows within RapidMiner.
The possible ways to improve:
  • Add views from RapidMiner,
  • Add a configuration dialog to the KNIME node using the RapidMiner UI (with correct input setup),
  • Multiple input/output ports (easy),
  • Documentation,
  • Keep only the necessary amount of data in memory (for the input table),
  • On configuration compute the result column types within KNIME.
I think this would be really cool, as RapidMiner offers some methods (like data validation, Fourier transformation, ...) that are not available within KNIME, and this node brings those options to KNIME users.
The problem is the licence of RapidMiner: AGPL or commercial. I think I have to ask for a commercial licence.
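To illustrate the "compute the result column types on configuration" idea from the list above, here is a minimal sketch in Python. It is not the real KNIME or RapidMiner API: the enums `RmValueType` and `KnimeColumnType`, the mapping, and the function `result_spec` are all hypothetical names made up for this example. The assumption is that the RapidMiner metadata gives a value type per attribute, and each one is translated to a KNIME column type without running the process.

```python
from enum import Enum


class RmValueType(Enum):
    """Hypothetical stand-in for RapidMiner attribute value types."""
    NOMINAL = "nominal"
    INTEGER = "integer"
    REAL = "real"
    DATE_TIME = "date_time"


class KnimeColumnType(Enum):
    """Hypothetical stand-in for KNIME column types."""
    STRING = "String"
    INT = "Int"
    DOUBLE = "Double"


# Assumed mapping; nominal values become String columns, as in the post.
_TYPE_MAP = {
    RmValueType.NOMINAL: KnimeColumnType.STRING,
    RmValueType.INTEGER: KnimeColumnType.INT,
    RmValueType.REAL: KnimeColumnType.DOUBLE,
}


def result_spec(attributes):
    """Translate (name, RmValueType) pairs to (name, KnimeColumnType) pairs.

    Value types without a known mapping fall back to String, which is an
    assumption of this sketch, not documented behaviour of either tool.
    """
    return [
        (name, _TYPE_MAP.get(value_type, KnimeColumnType.STRING))
        for name, value_type in attributes
    ]


if __name__ == "__main__":
    metadata = [("id", RmValueType.INTEGER), ("w", RmValueType.NOMINAL)]
    for name, column_type in result_spec(metadata):
        print(name, column_type.value)
```

The point of doing this at configuration time would be that downstream KNIME nodes can be configured before the RapidMiner process is actually executed.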



Well, at least I did not submit a wrong solution to the 500-point problem. The sad part is that I did not finish it in time. (At first I misread the problem statement and was trying to compute the number of configurations where all balls are caught. So bad.)
Maybe next time it will be better.

Wrong idea: ManagedCompiler

Well, it looks like I seldom make the right choice in this project. The ManagedCompiler was a really bad idea to use in the MSBuild task:
  • it allows using only a single version of Scala
  • the dlls used cannot be freed
  • it cannot be deployed unless the dlls are merged into one dll (not too feasible), or the dlls are signed (not the case currently)
I was aware of only the first problem, and I thought it would not be a huge problem to deploy a new version of the Scala MSBuild task for each version of Scala, but the other problems made it a really wrong idea. (I hoped that the ManagedCompiler would make the error reporting easier and the compilation faster. Well, this will not happen soon.)