NSF I-Corps Kickoff in Atlanta

One of the big projects I’m involved in is the CAM2 image analysis team in Purdue University’s ECE Department. Headed by Professor Lu, CAM2 seeks to bring together the vast array of public cameras across the world into a platform where users can easily select and run image analyses on the accumulated datasets.

The I-Corps program accepts roughly 20 teams into each cohort (I believe there are 3 per year) to investigate whether their research technology is commercially viable. On our team, the Principal Investigator is Professor Lu, the Entrepreneurial Lead is Kyle McNulty, and I am serving as the Business Mentor. Although I am young relative to other business mentors, I think that will be an advantage when it comes to relating to and working with Kyle and Dr. Lu, as well as working in a more agile business model development environment such as the one the Business Model Canvas requires.

Over the next 6 weeks we will use the $50,000 I-Corps grant to travel and interview as many potential “customers” / users as we can for our hypothesized application: using image analysis to help retail stores understand and react to the behavior and intent of their customers. So far we have conducted store-level interviews with Nike, Microsoft, Bloomingdale’s, and Banana Republic, to name a few, and we are looking to move up the corporate decision-making ladder in the coming weeks.

With a great team, I’m sure we will be able to thoroughly evaluate this market segment using the Business Model Canvas, and hopefully we’ll have a continue or don’t-continue answer grounded in solid market evaluation at the end of it all. Either way, I’m going to try to write about our progress each week, if only for my own records and sanity.

The Finished Product: Web Crawler + Search Engine

After 6 long weeks of work, I finally wrapped up my CS 390 Web Crawler + Search Engine project. Although the course is only worth 1 credit hour, I must have spent at least 30 hours a week working on it. The best part: it was easy. After sitting through so much theory and so many mundane labs and projects, I could finally apply all the different things I’ve learned throughout my coursework.

Of the things I set out to accomplish, I achieved almost all of them, save a few database optimization features which I simply ran out of time to implement. My Web Crawler managed to fetch and parse roughly 800 URLs per minute, mainly through the (painful) implementation of threading and breadth-first URL grabbing. The search engine itself used Java Servlets to perform the computations, with Ajax requests fetching the necessary information.
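As a rough illustration of how breadth-first URL grabbing and threading fit together, here is a minimal sketch. It is not the project's actual code: the class name `BfsCrawler`, the `crawl` method, and the `fetchLinks` function (a stand-in for "download a page and parse out its links") are all hypothetical names chosen for this example. Each level of the crawl is fetched in parallel on a thread pool, while the list of already-seen URLs is updated on a single coordinating thread so it needs no locking.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class BfsCrawler {
    /**
     * Crawls breadth-first from seed and returns the visited URLs in order.
     * fetchLinks is a hypothetical stand-in for fetching and parsing a page;
     * a real crawler would do the HTTP request and HTML parsing there.
     */
    static List<String> crawl(String seed, int maxPages, int threads,
                              Function<String, List<String>> fetchLinks) {
        Set<String> seen = new HashSet<>();       // URLs already queued or visited
        List<String> visited = new ArrayList<>(); // result, in breadth-first order
        List<String> frontier = new ArrayList<>();
        frontier.add(seed);
        seen.add(seed);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            while (!frontier.isEmpty() && visited.size() < maxPages) {
                // Do not fetch more pages than the remaining budget allows.
                int budget = maxPages - visited.size();
                if (frontier.size() > budget) {
                    frontier = new ArrayList<>(frontier.subList(0, budget));
                }

                // Fetch every page of the current level in parallel.
                List<Future<List<String>>> pending = new ArrayList<>();
                for (String url : frontier) {
                    Callable<List<String>> task = () -> fetchLinks.apply(url);
                    pending.add(pool.submit(task));
                }
                visited.addAll(frontier);

                // Collect results on this thread; unseen links form the next level.
                List<String> next = new ArrayList<>();
                for (Future<List<String>> result : pending) {
                    for (String link : result.get()) {
                        if (seen.add(link)) {
                            next.add(link);
                        }
                    }
                }
                frontier = next;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("crawl failed", e);
        } finally {
            pool.shutdown();
        }
        return visited;
    }
}
```

Passing the fetch step in as a function keeps the traversal logic testable without touching the network; for instance, `fetchLinks` can simply look URLs up in an in-memory map of links.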

The three servlets I created were an autocomplete servlet for the user input, a URL list generator that ranks results for the user query with a simplified version of PageRank, and a special person servlet that returns information about Professors. Ranking the results was probably the most difficult part of the project, as it took a few iterations to get it right. For example, in addition to incrementing a page’s rank when another page pointed at it, I also checked whether the user query appeared in the title of the webpage, and artificially boosted the rank of faculty pages to ensure they landed towards the top for a Professor name search. This is very difficult to balance, and some additional tweaks are probably needed. The search query suggestions and results were calculated in real time, the former using a Mealy machine automaton to predict the user input, complete with state transitions and all. I captured a quick video of the result and uploaded it to Instagram. I will try to host this project on GitHub or a personal site, and will continue to work on improving the querying speed as well as expanding the special cases.
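To make the ranking idea concrete, here is a small sketch of that kind of scoring: one point per inbound link, plus a bonus when the query appears in the page title, plus a bonus for faculty pages. It is an illustration under assumptions, not the project's code. The class `SimpleRanker`, the `Page` type, and the bonus values `TITLE_BONUS` and `FACULTY_BONUS` are hypothetical, and applying the faculty bonus to every query is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class SimpleRanker {
    // Hypothetical weights; balancing these is the hard part.
    static final int TITLE_BONUS = 5;
    static final int FACULTY_BONUS = 10;

    /** Minimal page record for the example (hypothetical). */
    static final class Page {
        final String url;
        final String title;
        final boolean faculty;

        Page(String url, String title, boolean faculty) {
            this.url = url;
            this.title = title;
            this.faculty = faculty;
        }
    }

    /** Counts how many links point at each URL, ignoring self-links. */
    static Map<String, Integer> inboundCounts(Map<String, List<String>> links) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : links.entrySet()) {
            for (String target : entry.getValue()) {
                if (!target.equals(entry.getKey())) {
                    counts.merge(target, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Inbound links, plus bonuses for a title match and for faculty pages. */
    static int score(Page page, String query, Map<String, Integer> inbound) {
        int score = inbound.getOrDefault(page.url, 0);
        String title = page.title.toLowerCase(Locale.ROOT);
        if (title.contains(query.toLowerCase(Locale.ROOT))) {
            score += TITLE_BONUS;
        }
        if (page.faculty) {
            score += FACULTY_BONUS;
        }
        return score;
    }

    /** Returns page URLs ordered from highest to lowest score. */
    static List<String> rank(List<Page> pages, String query,
                             Map<String, List<String>> links) {
        Map<String, Integer> inbound = inboundCounts(links);
        List<Page> sorted = new ArrayList<>(pages);
        sorted.sort(Comparator.comparingInt(
                (Page p) -> score(p, query, inbound)).reversed());
        List<String> urls = new ArrayList<>();
        for (Page p : sorted) {
            urls.add(p.url);
        }
        return urls;
    }
}
```

With weights like these, a faculty page with no inbound links can still outrank a heavily linked page whose title does not match the query, which is exactly the trade-off that takes iteration to tune.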

All in all, I suggest that anyone and everyone at Purdue who wants to take a Computer Science course try to take one with Professor Rodriguez-Rivera, as his projects are tough but relevant in the real world, and they get you thinking and working at the fringes of your skillset. I especially recommend CS 390, which is offered in C++, Java, and Python for 8 weeks each.