A Brief Review of Computer Vision and Pattern Recognition (CVPR) 2017

Mean paper name: “Deep … in the wild”

With what must be one of the best locations for a computer vision conference ever, CVPR 2017 was always going to be one of the best. Even allowing for that, this conference was notably better organized than most of its predecessors. This was despite a massive increase in attendance - over 37% more than CVPR 2016 in Las Vegas (see here for the slides describing this year’s stats) - and a 40% increase in paper submissions. Notable changes this year included moving from a two-track to a three-track conference. Even so, one of the days had a free afternoon, and the organizers noted that this was at least partially because reviewers had not nominated enough orals this year to fill the schedule.


Of the few papers that were orals, I was impressed with the following:

  • “Unsupervised Learning of Depth and Ego-Motion from Video”: This work is simply impressive (full disclosure: one of the co-authors is my former PhD supervisor). It learns depth purely from unlabelled video of cars driving around urban scenes. The weaknesses of the method seemed to be that it learned a bias in the data not to expect any close depth in the center of the frame (caused by correctly cautious drivers leaving a gap between themselves and the car in front), and a lack of good performance on non-urban scenes. In both cases the authors claimed that these were solvable with more data. However, I believe there simply aren’t enough features outside of urban canyons for this method to succeed outside of urban environments.
  • “Learning From Simulated and Unsupervised Images”: Although the work is interesting, and potentially very useful, this oral stood out for another reason. It was explicitly an oral by Apple researchers, marking a significant shift for a company that only a few years ago wouldn’t even let its CVPR-attending employees admit their affiliation, never mind publish its research. Hopefully this newfound openness continues.
  • “Densely Connected Convolutional Networks”: This work presented some very impressive results in a domain I’m particularly interested in. I’m surprised such a network can be more computationally efficient while reducing error significantly. I will be trying to repeat these results for sure.
  • “Global Optimality in Neural Network Training”: With little theoretical progress in understanding the optimization of deep networks, I like this work because it makes relatively few assumptions compared to many theoretical analyses, and yet makes relatively large claims. Whether it can be put into practice, however, remains to be seen.
  • “YOLO9000: Better, Faster, Stronger”: This oral surprisingly succeeded despite (because of?) the ridiculous title and the theme of the presentation (Daft Punk’s “Harder, Better, Faster, Stronger”). The presentation was engaging, and the results very impressive. A live demo of their real-time object detection system not only worked, but seemed to go beyond even the authors’ expectations when it identified most of the audience in view along with the planned foreground objects.

Industrial Presence

As with every year, the industrial presence at CVPR grew again this year. The number of companies increased, but this was mostly due to the large number of new startups with a booth rather than to the larger companies. If anything, the larger US companies had a slightly reduced presence overall, especially compared with NIPS. The notable exceptions were Apple, which organized three separate events (two ‘mixers’ and one technical session), and Nvidia, whose CEO attended the conference and made a major new GPU announcement. This year there were also many more Asian companies with a presence, both Chinese (Tencent, and too many others to list) and Korean (Naver, Samsung).
