Data Mining and Knowledge Discovery Questions

Below are some questions related to data mining and/or knowledge discovery that may be asked during or after your oral presentation.

  • Where did you get your dataset? How large is it?
  • Was cleaning the data time-consuming? Can you explain why?
  • Why was your error rate so low when you used this algorithm?
  • Did you do any cross-validation?
  • What portion of the dataset was used for training, and what portion for testing accuracy?
  • Why was one algorithm faster than another?
  • How many items were in your sample data stream?
  • Why did one algorithm use more memory than another?
  • Can you show a graph of the ratio of memory usage to data size?
  • Why didn't you use any deterministic algorithms?
  • Did you measure the discounted cumulative gain (DCG)?
  • Can you explain the idea behind the formulas for location filtering and activity filtering?
  • What factors affect accuracy?
  • What benefit does machine learning provide for your users?
  • What benefit does collaborative filtering provide?
  • How do you generate recommendations?
  • How do you know what users want?
  • How did you transfer PLACELAB into Java?
  • What do the X and Y axes represent?
  • If new data keeps coming in, how can you incorporate it?
  • What differences would you expect if you used a different dataset?
  • Would the technical part change with a different type of dataset?
  • How do you use GPS?
  • Can you explain more about JSON?
  • What's unique about JSON?
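
As preparation for the DCG question in the list above, here is a minimal sketch of discounted cumulative gain and its normalized form. It assumes the common formulation in which the item at rank i (counting from 1) contributes rel_i / log2(i + 1); other variants exist, so check which one your project uses. The function names and the sample relevance scores are hypothetical and only serve as illustration.

```python
import math


def dcg(relevances, k=None):
    """DCG of graded relevance scores, listed in ranked order (top result first)."""
    if k is not None:
        relevances = relevances[:k]
    # The item at 1-based rank i is discounted by log2(i + 1).
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the same scores in ideal (descending) order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance judgments for six recommended items, in ranked order.
ranked = [3, 2, 3, 0, 1, 2]
print(f"DCG  = {dcg(ranked):.3f}")
print(f"nDCG = {ndcg(ranked):.3f}")
```

Note that this sketch builds the ideal ordering from the same list of scores, so the normalized value only reflects how well those particular items were ordered, not whether better items were missed entirely.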

Copyright HKUST CSE Dept. 2018