The Tech Field Day site for Data Field Day 1 has my picture on it, so I guess it’s official: I’m going to Data Field Day 1 in San Francisco, from 13 to 15 May 2015.
As this is the first Data Field Day, I’m not sure what to expect. Each Tech Field Day event has a different feel to it, while still having the same underlying approach that Stephen and the crew bring to all their events. Partly this is due to the different topic, and partly the different group of attendees.
We don’t have a full list of sponsors yet, either, so let’s just extrapolate from insufficient data, which is always fun.
We have Cloudera, so there’ll be plenty of Hadoop.
We also have SanDisk, which is a big company with many divisions, so this is likely to be the enterprise tech or flash division.
I’ve done some refresher research into the field, and I now have a great question to take with me to the event:
What even is the field?
Is it cloud? Is it analysis of large data sets? Is it the complexity of the data set? Is it everything? Does anyone even know?
Cloudera had a link to a Gartner magic quadrant thing on… let me look this up again… “Gartner’s 2015 Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics”.
This appears to be a split between traditional “data warehouses” and newer MapReduce-type things like Hadoop and friends. The “leaders” quadrant contains companies whose products I have plenty of experience with (Oracle, Teradata, IBM, SAP), and then there’s a bunch of challengers I know less about: Cloudera (hence their link to the MQ), MapR, Amazon Web Services (?) and 1010data. AWS is a bit of an odd inclusion there, but I can guess why AWS wants to be included. SAS are notable for their complete absence from the MQ. Bit of a significant oversight there, in my opinion.
But hey, Gartner gonna Gartner.
I hope our sponsors can help clear this up for me, because they’ll need to explain what market they’re targeting for me to understand whatever message they’ll be trying to get out during DFD1.
Cloud Data Analysis
If you sit down and think about what cloud-based data analysis is, it’s pretty much exactly like booking time on a mainframe. OK, your job submissions are now in Java instead of JCL, and you can do it over a broadband connection instead of posting a bunch of punch cards, but how is this not basically the same thing? This is progress?
Yes, I know there has been a bunch of progress in CPU speeds and the amount of data that’s addressable, and there are some really cool new techniques for doing statistical analysis at scale, but so much of that is hidden behind all this “big data+cloud=magic” marketing blah that I just switch off completely.
I love me some stats, but I want it to have practical, real-world examples of neat stuff that isn’t just “we can do linear regressions faster” or “we have all this data now! Isn’t that exciting?” Yawn. No.
Show me something cool with Bayesian Networks! Show me that machine learning can predict when the next tech bubble will pop based on how often executives use the word “disruptive”. Show me an app that could tell me which briefings to skip because the content is going to be non-stop buzzwords. Talk about value!
Except keynotes. Snarking at keynotes is the only reason people follow me on Twitter. :)
More To Come
I’ll add some update blogs as more information is released. It looks like there are some secret companies that I won’t be blogging about before the event (unless they become publicly known beforehand), but I know nothing yet.
Also, I’m considering extending my trip if I can line up a bunch of briefings and meetings. If you’d like to catch up, get in touch with me by email or Twitter and let’s see what we can organise. Before the event is a better idea than straight after, because from past experience, my brain is full after a TFD event.