A little over a week ago I went to the Twin Cities User Group presentation on Machine Learning 101 by Mark Kalal at Intertech. The presentations are member-driven, so they can be prototypes or one-offs, but it was enjoyable even if a bit basic (for me) in spots.
He introduced me to Nand Kishor, a big data guy who is prolific and writes over at House of Bots.
This diagram by Nand was particularly interesting. I've seen similar breakdowns, but this one is easily digestible for someone not in the data science space. Unfortunately, it makes me realize how much I float in that top level of the chart and seldom get to go deep sea diving.
His machine learning work focused on Weka, the toolkit out of the University of Waikato in New Zealand. It fits well with a presentation at Code Freeze a few days ago, where one of the presenters pointed out that you should visualize the big data you're working with at every step to understand how it's changing, what's being highlighted, and what's outside the norm. In this case it was small sets of data, and the tool is crude but effective. For example, after about 40 megs the data requires the command line, so there are limitations to what you can assess. I'd like to try it out with some of our data to see if there are questions that can be intuited out of our already aggregated data.
I don't have his presentation because I think I wrote the link down incorrectly (rather than just taking a picture, which tags me as old, I think), but I reached out to see if there's a link so I can include a little more of his low-level work here.