Data Mining and Knowledge Discovery Questions

Where did you get your dataset? How large is it?
Was cleaning the data time-consuming? Can you explain why?
Why was your error rate so low when you used this algorithm?
Did you do any cross-validation?
What portion of the dataset was used for training, and what portion for testing accuracy?
Why was one algorithm faster than another?
How many items were in your sample data stream?
Why did one algorithm use more memory than another?
Can you show a graph of the ratio of memory usage to data size?
Why didn't you use any deterministic algorithms?
Did you measure the discounted cumulative gain (DCG)?
Can you explain the idea behind the formulas for location filtering and activity filtering?
What factors affect accuracy?
What benefit does machine learning provide for your users?
What benefit does collaborative filtering provide?
How do you generate recommendations?
How do you know what users want?
How did you transfer PLACELAB into Java?
What do the X and Y axes represent?
If new data keeps coming in, how can you incorporate it?
What differences would you expect if you used a different dataset?
Would the technical part change with a different type of dataset?
How do you use GPS?
Can you explain more about JSON?
What's unique about JSON?
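
For the question above about discounted cumulative gain, here is a minimal sketch of one common textbook form, DCG = sum over ranks i of rel_i / log2(i + 1), together with its normalized variant (NDCG). The function names and the sample relevance scores are illustrative assumptions and do not come from any particular project.

```python
import math

def dcg(relevances):
    """DCG of a ranked list; relevances[0] is the graded relevance of the top result."""
    # Each relevance is discounted by log2(rank + 1), with ranks starting at 1.
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering of the same scores."""
    ideal = dcg(sorted(relevances, reverse=True))
    # Convention assumed here: a list with no relevant items scores 0.0.
    return dcg(relevances) / ideal if ideal > 0 else 0.0

if __name__ == "__main__":
    ranked = [3, 2, 3, 0, 1, 2]  # hypothetical graded relevance of six recommendations
    print(round(dcg(ranked), 3))
    print(round(ndcg(ranked), 3))
```

Other formulations exist (for example, one that uses 2^rel - 1 in the numerator), so a presenter should state which one was measured.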