Research Projects for Summer 2016

This link is to a page of opportunities outside the CS department...

Please apply by Jan. 31, 2016 via the Apply link at left...

Project titles/advisors

A title/advisor overview of the projects described on this page and in our application:

  • Computational Biology: from Darwin to DP   [Hadas/Wu]
  • What Makes You Different from Me, and What Does It Mean about Evolution?   [Wu]
  • Computer Science Teaching Tips   [Lewis]
  • Dream Journal   [Lewis]
  • Summer start-up   [Lewis]
  • Distributed Computational Geometry Algorithms   [Amelang]
  • DreamWorks' OpenVDB algorithms on Manycore Architectures   [Amelang]
  • Trace Repository   [Kuenning]
  • Dynamically Optimizing Disk Layouts   [Kuenning]
  • Text Simplification   [Medero]
  • Active Transportation   [Medero]
  • Query Processing with the Crowd   [Trushkowsky]
  • PaWPal: Productivity and Wellness Pal   [Boerkoel]
  • Robot Brunch   [Boerkoel]
  • Summer Staff   [CS Staff]
  • Intelligent Music Software   [Keller]
  • Robot Mapping   [Dodds]
  • MyCS: Middle-years Computer Science   [Dodds]
  • Interactive Physics Diagrams   [Wiedermann]
  • Vice-chancellor of Fun positions   [Wiedermann/Dodds]

Project Descriptions

Computational Biology: From Darwin to Dynamic Programming

Darwin posited in "On the Origin of Species" that two species - such as flowers and the bees that pollinate them - might evolve in tandem for the mutual benefit of both species. Darwin's speculations on "coevolution" of species have been validated by recent advances in computational biology. In particular, our group and others have developed algorithmic methods - many of them using dynamic programming - for understanding the most likely scenarios by which a pair of species coevolved.

These same methods can be used for the related problem of understanding the relationships between species and their genes. In some cases, species acquire genes by "transfers" from other species. Genes can also duplicate, after which one copy can independently evolve a new function. In short, genes and species have relationships that look - in many ways - like the relationships between pairs of species such as flowers and bees.

This research seeks to develop new techniques and software tools to help biologists better understand the relationships between groups, such as in the coevolution of species or the evolution of genes and species. The work involves several different activities, including designing and analyzing algorithms, implementing new software tools, and visualizing large datasets, among others. Students will work on the activities that best match their interests and background.

There are at least five positions available this summer. A few positions will be available to first-year students who have taken CS 60. Other positions will be available to students who have taken "Algorithms" (CS 140/Math 168). MathBio 118 is also good background for this project. Directed by Prof. Libeskind-Hadas in collaboration with Prof. Wu.

Computer Science Teaching Tips

Computer Science Teaching Tips is an NSF-sponsored project for disseminating effective computer science teaching practices. Summer research activities will include some subset of the following tasks: (1) interviewing CS educators about their CS teaching, (2) writing tips for the project website, (3) transferring our body of tips from the website to other resource libraries for CS educators, (4) reading CS education research papers, (5) improving the appearance and infrastructure of the existing website, (6) developing interactive visualizations for standardized test data, (7) integrating tips from the website into an existing CS curriculum, (8) thinking about the teaching and learning of CS, (9) working with CS teachers, (10) developing the social media presence of the project, and (11) some other stuff.

As you can see, there isn't a finalized set of tasks we'll do. Summer researchers will help steer the project to achieve the greatest impact for the international CS education community! We might not know exactly what we'll do - but we'll have a LOT of fun!!! Only CS5 (or equivalent) experience is required.

In your essay, please indicate whether you'd be interested in working 10 weeks (~May 23 - July 29) or 8 weeks (after summer math ~ June 6 - July 29). If you have other scheduling constraints this summer please mention this in your essay, as well. Also please indicate if you're more interested in programming or contributing to the educational content of the project.

(Directed by Prof. Lewis.)

Dream Journal

My dad is a psychologist working at the Department of Veterans Affairs (VA) in Fresno, CA. He works primarily with veterans who have post-traumatic stress disorder (PTSD), and many of these veterans have disrupted sleep from nightmares. Recent research on nightmares documents a promising new practice for eliminating nightmares. The new practice involves nightmare sufferers maintaining a dream journal and adding a new ending to each nightmare.

This practice could easily scale to people in and out of treatment programs if integrated into an iOS app. Students in my software development class (CS121) worked with my dad to develop a prototype. You'd work in a pair with another student to (1) learn to develop iOS apps and (2) make improvements to the existing prototype so we can pitch the project to the VA.

In your essay, please indicate whether you'd be interested in working 10 weeks (~May 23 - July 29) or 8 weeks (after summer math ~ June 6 - July 29). If you have other scheduling constraints this summer please mention this in your essay, as well.

(Directed by Prof. Lewis.)

Summer start-up

Want to work in a group of 3-4 students to create a start-up? You'll come up with an idea, develop a prototype, develop a pitch, and present your start-up to the HMC Entrepreneurial Network. You need to be totally committed to working in a group and learning new skills (e.g., programming, business, communication).

In your essay, please indicate whether you'd be interested in working 10 weeks (~May 23 - July 29) or 8 weeks (after summer math ~ June 6 - July 29). If you have other scheduling constraints this summer please mention this in your essay, as well.

(Directed by Prof. Lewis.)

DreamWorks' OpenVDB algorithms on Manycore Architectures, or "constructing clouds on countless cores"

My summer project topics aren't yet finalized, but we'll almost definitely have one about performing algorithms on sparse data structures on the GPU for DreamWorks. The DreamWorks project is exploring the use of OpenVDB on GPUs and other manycore architectures. The core data structure (a VDB) in OpenVDB is a tree that contains voxels organized hierarchically into cells. Many algorithms construct VDBs serially. Building trees with low-core-count CPU parallelism is tricky, but building trees with high-core-count GPU parallelism is even trickier. In this project, we'll work on formulating and implementing algorithms of interest to animation (such as the closest point transform) such that they work well on massively-parallel manycore architectures.

(Directed by Prof. Amelang.)

Distributed Computational Geometry Algorithms, or "dashing distances on distributed data"

My summer project topics aren't yet finalized, but we'll probably have a project sponsored by CD-adapco in which we'll work on some computational geometry problems. CD-adapco creates software used throughout the government and private sectors to do fluid simulations. At the core of many simulation techniques lie many challenges of computational geometry, such as mesh operations, mesh refinement, and spatial queries. However, performing these geometric algorithms on a single computer is easy mode; it is far more difficult to perform them on supercomputers where the data is distributed across thousands of computers. In this project, we'll investigate one or more of the following: spatial queries on distributed data (finding the closest object, overlap tests, etc.); the inside/outside problem for unstructured points (given a mesh and a set of points, classify each point as inside or outside the mesh); mesh refinement; or something similar.
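
To make the inside/outside problem concrete, here is a minimal single-machine sketch in two dimensions, using the standard even-odd ray-casting test against a closed polygon. It is only an illustration under simplifying assumptions: the project itself concerns meshes and data distributed across many computers, and the names is_inside and classify_points are made up for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def is_inside(point: Point, polygon: List[Point]) -> bool:
    """Even-odd ray casting: shoot a ray from `point` in the +x direction
    and count how many polygon edges it crosses. An odd count means the
    point is inside. Points exactly on the boundary are not handled
    specially in this sketch."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside


def classify_points(points: List[Point], polygon: List[Point]) -> List[str]:
    """Label each point as 'inside' or 'outside' the polygon."""
    return ["inside" if is_inside(p, polygon) else "outside" for p in points]
```

Each point is tested independently, so the classification parallelizes trivially over points; the hard part in the distributed setting is that no single computer holds the whole mesh.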

(Directed by Prof. Amelang.)

Trace Repository

The SNIA Trace Repository contains several terabytes of data collected by observing the behavior of real file systems. Harvey Mudd is responsible for the management and enhancement of this repository. Students will develop tools related to traces; help write standards; locate, convert, and post new traces; and integrate tools contributed by researchers. (Directed by Prof. Kuenning.)

Dynamically Optimizing Disk Layouts

Disk drives are the slowest component of a computer system. But they can be made faster if data is laid out properly. The problem is that "properly" depends on how the information is accessed, which depends on what programs do. This project investigates ways to observe program behavior and then rearrange things on the fly to optimize performance.

(Directed by Prof. Kuenning.)

Text Simplification

In the U.S. alone, millions of adults don't read well enough to carry out daily activities that many of us take for granted. Prof. Medero's text simplification project wants to make more text accessible to more people. We do that by automatically making hard texts easier to read, without (significantly!) changing their meaning. This summer, we'll work on two aspects of the project:

  • Adding new functionality to TextScroll, an iOS application we developed last summer that lets users interact with the text they are reading through their device's accelerometer.
  • Developing an automatic simplification system based on machine translation techniques.

(Directed by Prof. Medero.)

Computing for Active Transportation

There was a time when almost half of K-8 students walked or rode their bikes to school, but today only 13% of students do. Research shows that students who participate in one of these forms of active transportation do better in school. At the same time, increasing the number of students who walk or bike to school would decrease the car traffic in front of schools, resulting in improved traffic safety, better air quality, and lower transportation costs for parents. This summer, we'll work on two aspects of the project:

  • Modifying and finalizing the design for a low-cost, open-source air quality sensor that we built last summer, and creating a custom interface to that tool. The interface will let us use the sensor with elementary and middle-school students to design and run science experiments related to air quality.
  • Taking feedback from parents, teachers, and principals into account to develop a tool for organizing groups of students to walk to school together, building off of algorithmic analysis that we did last summer.

(Directed by Prof. Medero.)

What Makes You Different from Me, and What Does It Mean about Evolution?

Evolution is responsible for the immense biological diversity of our planet; however, despite its central role as the most fundamental property of life, the process of evolution remains poorly understood, and current models have typically been unable to span the diversity of scales at which evolution can act.

The goal of this project is to develop computational models and tools to enable better insight into this evolutionary process. In particular, we will be leveraging massive collections of genomic data to look at how differences among individuals within the same species can be used to improve our inferences.

This project incorporates knowledge from a variety of fields, including machine learning and algorithms, mathematical modeling and statistics, and evolutionary biology -- students interested in learning more about any of these fields are encouraged to apply. There are projects available for both intro and advanced students, and we will work together to find one tailored to your background and interest. (Advised by Prof. Wu)

Query Processing with the Crowd: Dynamic Filter

We want to be able to issue a query to a data processing system asking it to filter a set of items, such as restaurants, such that we retain only those items that satisfy a set of conditions (e.g., has fresh-made guacamole). These conditions are called "predicates" in the database systems community. Evaluating a predicate takes time/resources because the relevant information about each item is not available in the database. We hope to leverage the crowd to help find and interpret the relevant information to determine which items belong in the final query result.

In our scenario, predicates are not necessarily equally difficult to evaluate. Several factors could lead to differing processing costs, where "cost" may involve time and money spent on a crowd platform. These factors include possible subjectivity and/or ambiguity of the predicate. Furthermore, evaluating a particular predicate might have a different cost for different items. With ad-hoc queries, these costs will be unknown at the time the query executes.

Given that we can eliminate an item from the query result once it fails one of the predicates, the lowest overall cost to evaluate the query is achieved by determining the cheapest and most restrictive (i.e., highly "selective") predicates to evaluate first. Since we do not know the predicates' cost or selectivity a priori, we need to learn and adapt to observations of these metrics as the query is running by iteratively deciding which (item, predicate) pair should be processed next.
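
As a rough illustration of this idea (not the project's algorithm), the sketch below keeps running estimates of each predicate's cost and failure rate and, for the current item, always asks next about the predicate with the lowest estimated cost per eliminated item. Everything here is an assumption made for the example: the names PredicateStats, adaptive_filter, and simulated_crowd are hypothetical, the "crowd" is a deterministic stand-in, and items are processed one at a time rather than by choosing freely among all (item, predicate) pairs.

```python
from typing import Callable, Dict, List, Tuple


class PredicateStats:
    """Running observations for one predicate."""

    def __init__(self) -> None:
        self.evaluations = 0
        self.failures = 0
        self.total_cost = 0.0

    def record(self, passed: bool, cost: float) -> None:
        self.evaluations += 1
        self.total_cost += cost
        if not passed:
            self.failures += 1

    def rank(self) -> float:
        """Estimated cost per eliminated item; lower is better.
        Unseen predicates get rank 0 so that each is tried at least once."""
        if self.evaluations == 0:
            return 0.0
        avg_cost = self.total_cost / self.evaluations
        # Smoothed failure rate, so it is never exactly zero
        fail_rate = (self.failures + 1) / (self.evaluations + 2)
        return avg_cost / fail_rate


def adaptive_filter(
    items: List[int],
    predicates: List[str],
    ask: Callable[[int, str], Tuple[bool, float]],
) -> Tuple[List[int], float]:
    """Return the items that satisfy every predicate, plus the total cost paid.
    `ask(item, predicate)` returns (passed, cost) for one evaluation."""
    stats: Dict[str, PredicateStats] = {p: PredicateStats() for p in predicates}
    kept: List[int] = []
    total_cost = 0.0
    for item in items:
        remaining = list(predicates)
        passed_all = True
        while remaining:
            # Pick the predicate that currently looks cheapest per elimination
            pred = min(remaining, key=lambda p: stats[p].rank())
            remaining.remove(pred)
            passed, cost = ask(item, pred)
            stats[pred].record(passed, cost)
            total_cost += cost
            if not passed:
                passed_all = False
                break  # item is eliminated; skip its other predicates
        if passed_all:
            kept.append(item)
    return kept, total_cost


def simulated_crowd(item: int, predicate: str) -> Tuple[bool, float]:
    """Hypothetical stand-in for a crowd platform, with made-up costs."""
    if predicate == "cheap_strict":
        return item % 4 == 0, 1.0
    return item % 2 == 0, 5.0  # "costly_loose"
```

Calling adaptive_filter(list(range(8)), ["cheap_strict", "costly_loose"], simulated_crowd) filters eight integer "items"; after the first few observations the cheap, highly selective predicate is asked first, so the expensive one is skipped for most eliminated items.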

The goal of this project is to devise an adaptive algorithm to evaluate such a query, building off work done by students last summer!

(Advised by Prof. Trushkowsky)

PaWPal: Productivity and Wellness Pal

College students often struggle to balance their work with personal wellness. In part, this occurs because students work when they are unable to focus. One approach to increasing both productivity and wellness is to help users achieve flow. Flow is a state in which people feel focused, motivated, and fully immersed in their activity, resulting in feelings of satisfaction and even joy.

The Productivity and Wellness Pal (PaWPal) is an Android-based application that seeks to make users aware of their efficacy at various tasks as well as which courses of action are likely to lead to immersive experiences. This summer, we'll be asking the following questions:

  • Can we use the lab's newest robot, Jibo, to better motivate PaWPal users?
  • Can we implement our current PaWPal design so that iOS users can benefit from it?
  • Can we use machine learning to build a model of users' efficacy and predict when they will be most likely to experience flow?

Skills utilized: iOS (Swift) development, interaction design, machine learning

Please visit the HEATlab website for more information.

(Directed by Prof. Boerkoel.)

Robot Brunch

Temporal plans exist to provide robust directives that robots can follow to accomplish their goals, while also coordinating when these activities should occur. In general, we want temporal plans that are adaptable to events that are beyond the direct control of agents; e.g., a robot may experience slippage or sensor failures. To do this, we must answer two questions: (1) how and when do new or unexpected events arise in practice?, and (2) how "good" is the temporal plan at adapting to unexpected events that might otherwise invalidate the plan?

In previous summers, we explored the usefulness of a new metric called robustness, which assesses the likelihood that a multi-robot plan succeeds. We showed that robustness is a better measure of multi-robot plan quality and that we can generate plans that optimize for robustness.

This summer we ask: what changes if these robots are coordinating with humans? The goals for this summer include:

Skills utilized: artificial intelligence, interaction design, ROS development

Please visit the HEATlab website for more information.

(Directed by Prof. Boerkoel.)

Summer Staff

Summer staff is a small group of students who help maintain and improve our CS department's computational infrastructure: software, hardware, and usage patterns/policies. No previous systems-administration experience is required: you will learn about the systems. It's fun, vital for the department, and an ideal chance to expand your systems knowledge -- join us! (Directed by the CS staff and its advisor, to be named.)

Intelligent Music Software

In prior work in the Intelligent Music Software project, we prototyped a Deep Belief Network (layers of Restricted Boltzmann Machines) that proved capable of learning to improvise jazz melody lines over a chord sequence. While initial results were encouraging, the learning process was not considered efficient enough at the time to merit inclusion in our Impro-Visor software. Secondly, although these networks are probabilistic, and thus capable of nuance, the amount of nuance observed did not seem competitive with our grammar-based approach. Since then, significant strides have been made in Deep Learning and it seems appropriate to revisit our previous model and to research other deep learning models in an effort to achieve an approach that overcomes previous problems. This is therefore the research planned for this summer's HMC REU in the Intelligent Music Software project.

The ideal participant will have experience with neural networks and machine learning, as well as software development using Java. Knowledge of how improvisation works is useful, but not totally essential, since we expect the network to learn based on minimal wired-in musical knowledge. Experience with AI programming in Lisp, Prolog, or Python is a plus.

(Directed by Prof. Keller.)

RCs to Robots

This project will include several hands-on conversions of remote-control vehicles into autonomous robots. We will start by following in the footsteps of others who have converted RC devices into autonomous ones, looking for ways to make that process straightforward, accessible, and low-cost. From there, we will add sensing (mostly wireless cameras) in order to make the resulting platforms as capable as possible. Most of the sensing effort will be computer vision: experimenting with flexible and efficient software to interpret the incoming images. Because RC vehicles and wireless cameras are so inexpensive, we hope that this project will make it possible for more institutions and groups to take on sophisticated robotics challenges, even without substantial resources. Join in!

(Directed by Prof. Dodds.)

MyCS: Middle-years Computer Science

This project seeks to increase the skillset and mindset of computing in precollege-age students -- in particular, at the middle-school level. This summer we will be expanding our middle-school target audience to include both an elementary version and a high-school version in which CS5 is used as a foundation for the new AP CS Principles course. Join us to help develop K-12 CS curriculum and present two week-long summer workshops to small groups of elementary-, middle-, and high-school teachers! This project can support 8 or so students. (Directed by Prof. Dodds.)

Interactive Physics Diagrams

Interactive diagrams are animated diagrams that students can interact with using a computational device such as a computer or tablet. For example, an interactive physics diagram with pulleys and weights might allow the student to move the pulleys and vary the weights, while observing the physical simulation that ensues. Such animated diagrams allow students not only to visualize concepts in fields like physics, chemistry, biology and math, but also to test their understanding through interactive exercises.

While they show a huge amount of promise, interactive diagrams are also really hard to make. Developing a good interactive diagram, along with associated interactive exercises, requires a lot of expertise with programming technologies like JavaScript, HTML5, and server-side databases. This project will make it easier for teachers, who have the domain knowledge, to directly create interactive physics simulations.

Funding is available for two students (one from HMC and one from off-campus). The project involves several aspects: front-end design / development (including user-oriented design and website development), programming-language design and implementation, physics content knowledge, and programming in Scala. It's not necessary to have experience or interest in all of them, but ideal candidates have experience in at least one and interest in others. (Directed by Prof. Wiedermann.)

Vice-chancellor of fun position

If you are applying to an HMC summer position and would enjoy working as the social coordinator for the program and for all of HMC CS summer research, then this vice-chancellor of fun position will be of interest. It supports your on-campus housing for the summer, adds a $1000 stipend to your existing summer stipend, and puts you in charge of procuring summer foodstuff and organizing activities (or encouraging others to do so!). You need to have a driver's license, a willingness and ability to drive the large HMC van, and enthusiasm for making things happen! (2 spots available) (Directed by Profs. Wiedermann and Dodds.)