CS 8395-04: Special Topics - Intelligent Surgical Robots
Project 2 Due: Nov 10th, 17th, Dec 8th
The goal of this project is to implement a neural network that models the latent space learned by reconstructing robot kinematics from endoscope images obtained during robot-assisted procedures. You will work in pairs for this project, and only one student in each pair needs to submit the project to Brightspace.
You have a lot of freedom in where you take this project. Some ideas for improving the latent space representation results are below. Please note that you’ll have a full month for this project, so the scope should match the time frame.
- Learn from both left and right frames simultaneously and enforce consistency between them
- Use a more up-to-date network that accounts for temporal information (such as recurrent neural networks and their variations, or transformers)
- Change the network structure so that both kinematics and images are used as input. Then, use a different task as the self-supervised training loss (such as contrastive/triplet loss)
- Cluster in full latent space (rather than the 2D projection made by UMAP) and compare results
- Train a supervised network for skill and action segmentation and compare results
- Develop a semi-supervised network that can learn from both labeled and unlabeled data
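As a rough illustration of the contrastive/triplet loss idea above, here is a minimal sketch in plain Python. It assumes the common hinge form of the triplet loss, max(0, d(a, p) - d(a, n) + margin), with Euclidean distance. The function names, the margin value, and the toy 2-D embeddings are all hypothetical and chosen for illustration. In your project the embeddings would come from your network, and you would compute the loss in your deep learning framework rather than by hand.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, for illustration only.
image_embedding = [0.0, 0.0]      # anchor: embedding of an endoscope frame
matching_kinematics = [0.0, 1.0]  # positive: kinematics from the same time step
other_kinematics = [3.0, 4.0]     # negative: kinematics from a different time step

# Positive is close (distance 1) and negative is far (distance 5): no penalty.
well_separated = triplet_loss(image_embedding, matching_kinematics, other_kinematics)

# Swapping the roles puts the "positive" far away, so the loss is large.
poorly_separated = triplet_loss(image_embedding, other_kinematics, matching_kinematics)

print(well_separated, poorly_separated)
```

How you pick positives and negatives (for example, same time step versus a different gesture) is a design decision, and one worth justifying in your write-up.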
You also have the option of exploring another temporal modeling problem. If you decide to do this, please either send me an email about your plan or talk to me after class so I can check that your project is appropriate in both scope and theme. You will still need to follow the outlined submission steps if you go this route (submit a one-page report in week 1, your preliminary code in week 2, and the final report in week 3). You may need to spend more of part 1 describing the problem.
Literature review - proj2 part1
Scan the literature on the topic of instrument segmentation. You can take a look at the works that cite the JIGSAWS dataset as a starting point (https://scholar.google.com/scholar?cites=2606190352282624469&as_sdt=5,43&sciodt=0,43&hl=en). After you have an understanding of the existing work, use that knowledge to develop a plan for what you will implement for action modeling. Write a one-page report on your plan and how it relates to the existing literature.
Implement your plan - proj2 part2
Spend the next week implementing your plan. Submit your code as a checkpoint by the end of the week. It’s OK if it doesn’t work fully yet. You’ll continue to train and improve it over the following weeks.
Final report - proj2 part3
Continue training your network. Once you’re satisfied with the results, write up a final report with the results you’ve obtained. Your final report should be at most 5 pages. Your final discussion should include:
- Tables and/or graphs showing your results
- Discussion of whether your results matched your expectations
- Comparison of this algorithm to homework 3
- How you could improve your project if you had more time
Prepare a presentation describing your method and results. Presentations will take place Dec 6th and 8th. Please sign up for a slot here (as a group): https://docs.google.com/spreadsheets/d/1de70lS4-45LoM2688-55-x4HNwxnHCJjbd7Doz7wR_M/edit?usp=sharing. Your presentation should be around 15 min.
Depending on which of the above you choose, you may not necessarily get better performance (especially if you choose to experiment with different network architectures). That’s OK! The point of this class is to discuss real-life constraints that make it challenging to apply cutting-edge machine learning techniques to surgical robotics data.
Instead, the focus of the evaluation will be on whether you’ve done something interesting and whether it has clinical value. This will be judged based on your write-up. You can use readings from this class and additional readings as sources for why something would be clinically useful.