Archive

Posts tagged ‘Computer Vision’

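Computer Vision and Augmented Reality at the World Expo

September 6, 2010 - 10 comments

Last weekend I finally made it to the World Expo. Years ago I queued overnight at Shanghai South Railway Station for train tickets during the Spring Festival travel rush, and once again I was stunned by the queuing stamina of the hardworking and brave Chinese people. Being overweight, I can't stand for long, so I only visited a few of the less popular pavilions. Here is what I saw at the Expo that relates to computer vision and augmented reality.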

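Along the River During the Qingming Festival

The Along the River During the Qingming Festival display in the China Pavilion is superb. The scroll unfolds slowly along the wall of a long corridor; by my rough estimate it uses at least 15 projectors. Lights shimmer on the water, people come and go in the streets, and the inns are loud with voices. For a moment I felt like Chen Jingchou in Xuan-Yuan Sword 3: The Scar of Sky (轩辕剑三之天之痕), about to look for a general store and buy some Tianxian Yulu (天仙玉露)...

Read more…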

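Computer Vision Conference Schedule

June 15, 2010 - No comments

Posting this to share, and as a note to self. http://iris.usc.edu/Information/Iris-Conferences.html

The table below will go out of date sooner or later, so it is better to follow the link above for current information.

In the table, P stands for paper deadline.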
computer vision conferences time table 2010

CVPR2010

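June 13, 2010 - No comments

Does CVPR count as the World Cup of the CV community, or perhaps the WWDC of Apple fans? This year's CVPR is in San Francisco. Today is the first day of CVPR 2010, and the last day is the 18th. This year's CVPR has 21 workshops, 8 tutorial sessions, and 33 demos.

Is anyone there in person? Please send some first-hand material. SOS.

Strangely, I can't reach the official website.

Attached: cvpr2010 paper online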

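Stanford's Computer Vision Course

May 18, 2010 - 2 comments

A CV course from Stanford, recommended by xiaochao.

Link: http://vision.stanford.edu/teaching/cs223b/

Is the instructor Fei-Fei Li?

It includes a set of videos that I looked for a year ago without success: Computer Vision, Fact & Fiction. The original source is UCSD.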

OpenCV 2.1 Released

April 10, 2010 - 1 comment

Source:

http://sourceforge.net/mailarchive/forum.php?thread_name=s2y619b2d671004051902n71c46746m5dc42581ca1d4a79@mail.gmail.com&forum_name=opencvlibrary-devel

Here are the newly added features from the changelog:

>>> New functionality, features:
- cxcore, , cvaux:
* GrabCut (http://en.wikipedia.org/wiki/) image segmentation algorithm has been implemented.
See /samples/c/grabcut.cpp
* new improved version of one-way descriptor is added. See opencv/samples/c/one_way_sample.cpp
* modified version of H. Hirschmuller semi-global stereo matching algorithm that we call SGBM
(semi-global block matching) has been created. It is much faster than Kolmogorov’s graph
cuts-based algorithm and yet it’s usually better than the block matching StereoBM algorithm.
See opencv/samples/c/stereo_matching.cpp.
* existing StereoBM stereo correspondence algorithm by K. Konolige was noticeably improved:
added the optional left-right consistency check and speckle filtering,
improved performance (by ~20%).
* User can now control the image areas visible after the stereo rectification
(see the extended stereoRectify/cvStereoRectify), and also limit the region
where the disparity is computed (see CvStereoBMState::roi1, roi2; getValidDisparityROI).
* Mixture-of-Gaussian based algorithm has been rewritten for better performance
and better accuracy. Alternative C++ interface BackgroundSubtractor has been provided,
along with the possibility to use the trained background model to segment the foreground
without updating the model. See opencv/samples/c/bgfg_segm.cpp.
A few more links:

The packages are available at SourceForge (
https://sourceforge.net/projects/opencvlibrary/files/).
The detailed ChangeLog is here:
https://code.ros.org/svn/opencv/trunk/opencv/doc/ChangeLog.htm.
The installation guide is here:
http://opencv.willowgarage.com/wiki/InstallGuide

NocturnalVision: Image Enhancement in Dark Environments

April 3, 2010 - 4 comments

The comparison of a frame recorded with a video camera at night shows the original frame at left, the amplified signal at the center, and the image with NocturnalVision's noise reduction applied at right.

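Camera noise in dark environments is truly unbearable. NocturnalVision, a small Swedish startup, hopes to use computer vision to improve camera image quality in the dark, reducing noise by combining information within each frame and across frames. Neither the press coverage nor the company website discloses the algorithm's details; they only mention some biologically inspired background.

However, the algorithm appears to be very slow, and it has an inherent drawback: because it relies on inter-frame information, the output image may lag by a few frames.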
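To make the latency point concrete, here is a minimal sketch of intra-frame plus inter-frame denoising. It is not NocturnalVision's algorithm (which is not public); the 3x3 mean filter, the centered temporal window, and the names `box_blur` and `TemporalDenoiser` are all assumptions chosen for illustration. Because the window is centered, the denoised frame for time t can only be produced once frame t + radius has arrived, which is exactly the delay described above. The sketch also ignores motion, which a real system would have to handle.

```python
from collections import deque


def box_blur(frame):
    """Intra-frame step: 3x3 mean filter over a 2-D list of pixel values."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += frame[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


class TemporalDenoiser:
    """Inter-frame step: average each pixel over a centered window of frames."""

    def __init__(self, radius=2):
        self.radius = radius  # output lags the input by this many frames
        self.window = deque(maxlen=2 * radius + 1)

    def push(self, frame):
        """Feed one frame; return a denoised frame, or None while the window fills."""
        self.window.append(box_blur(frame))
        if len(self.window) < self.window.maxlen:
            return None
        n = len(self.window)
        h, w = len(frame), len(frame[0])
        return [[sum(f[y][x] for f in self.window) / n for x in range(w)]
                for y in range(h)]
```

With `radius=2`, the first four calls to `push` return `None`, and every later output corresponds to the frame that arrived two calls earlier.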

“This algorithm is well-suited for parallel programming on a or on other parallel architectures,” Malm said. With newer hardware, it’s likely the algorithm could work in real time–in other words, for video cameras that shoot at the ordinary rate of 30 frames per second.

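Source

A Book Recommendation

April 1, 2010 - 6 comments

I recently came across a book so new that it has not been published yet: Computer Vision: Algorithms and Applications, by Richard Szeliski of Microsoft Research.

The content is remarkably up to date. Highly recommended.

Link: http://research.microsoft.com/en-us/um/people/szeliski/Book/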

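In Search of Lost Time: SenseCam Helps People Recover Lost Memories

March 22, 2010 - 1 comment

Wearing a portable camera, audio recorder, and GPS to record everything that happens around you is no longer a novelty. How to effectively mine, browse, and summarize this mass of data, however, is still a fresh topic worth digging into.

SenseCam is Microsoft's device of this kind, with a camera, light sensor, infrared sensor, accelerometer, and so on. Researchers are currently working on how to organize the captured images and other data effectively, to help people with memory difficulties understand what actually happened in the past, so they don't have to cover themselves in tattoos like the poor guy in Memento.

Below is an excerpt describing the related research; I have no time to translate it.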

To find the best memory cues for Mr. Reznick’s experiences, the researchers — Anind K. Dey, a computer science professor at Carnegie Mellon University, and Matthew Lee, a graduate student — considered the types of images that had proved the most effective in previous SenseCam studies.

They soon realized that the capriciousness of memory made answers elusive. For one subject, a donkey in the background of a barnyard photo brought back a flood of recollections. For another, an otherwise unremarkable landscape reminded the subject of a snowfall that had not been expected.

Still, the researchers came up with some broad rules for identifying and retrieving images likely to serve as memory triggers. For a people-based experience like a family reunion, the system selects photographs in which faces are clearly discernible; for a location-based experience like a visit to a museum, it uses geographical positions provided by GPS and accelerometer data to judge what images might be most salient — for example, when a subject might be hovering at one spot, like in front of a painting.

Research groups elsewhere are experimenting with other techniques to summarize and make use of SenseCam data. Alan Smeaton and colleagues at Dublin City University in Ireland are comparing images to categorize them by activity — shopping, for example — so the system can put together a visual summary of the day. At the University of Toronto, a group led by Ronald M. Baecker is investigating the usefulness of complementing SenseCam images with an audio narrative created by a loved one.

Once the system selects some photos from the hundreds taken, the caregiver winnows down the candidates, adding cues like audio from the voice recorder, verbal narration and brief text captions. The final product is a multimedia slide show on a tablet computer that allows the patient to dig deeper into highlighted parts of some images by tapping on the screen. The first tap plays audio, the second shows captions.

“The design is intended to give the patient the ability to engage actively with the experience instead of simply flipping through some pictures,” said Mr. Lee, the graduate student. Testing the system with the Reznicks and two other couples, he and Dr. Dey found that it helped patients recall events more vividly and with greater confidence than when they simply went through all of the images.

Other SenseCam studies — also financed by — have produced encouraging results, but plans to market the device as a memory aid have not been announced.
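The location-based rule in the excerpt (favor images taken while the subject hovers at one spot) can be sketched roughly as follows. This is only an illustration under my own assumptions, not the researchers' method: positions are assumed to be already converted to a local frame in metres, and the function names, the 5 m radius, and the 60 s minimum stay are hypothetical.

```python
import math


def find_dwells(fixes, radius_m=5.0, min_seconds=60.0):
    """Return (start_time, end_time) spans where the wearer stays near one spot.

    fixes: list of (t_seconds, x_m, y_m) tuples sorted by time.
    """
    dwells = []
    i = 0
    while i < len(fixes):
        t0, x0, y0 = fixes[i]
        j = i
        # Extend the span while the next fix stays within radius_m of the first one.
        while (j + 1 < len(fixes)
               and math.hypot(fixes[j + 1][1] - x0, fixes[j + 1][2] - y0) <= radius_m):
            j += 1
        if fixes[j][0] - t0 >= min_seconds:
            dwells.append((t0, fixes[j][0]))
            i = j + 1
        else:
            i += 1
    return dwells


def photos_in_dwells(photo_times, dwells):
    """Keep only photos whose timestamp falls inside some dwell span."""
    return [t for t in photo_times if any(a <= t <= b for a, b in dwells)]
```

Photos returned by `photos_in_dwells` would then be candidates for the caregiver to review, for example the shots taken while standing in front of a painting.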

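Media source

No Time to Update, So Here's a Link Roundup

March 10, 2010 - 1 comment

Too little time: I can't write a separate post for everything fun and valuable, so here is a link roundup instead:

  • An update from BlogOV discussing video search in intelligent surveillance. The user needs it describes are worth studying.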
  • What do users really want? Based on direct user feedback, they want to be able to search for specific people, vehicles or events across the enterprise video system. They want to use what they learn from one search to refine the next. They want to search by example—designating a specific vehicle or person of interest to flag in a database search that might be narrowed by a specific geographic region or time period. They want to create searches and visualize search results on an intuitive geo-interface. And, they don’t want to just be limited to the video archives. Forensic search results need to then become the parameters for real-time rules to find exactly where that white cargo van of interest is right now.

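  • Who says computer vision is useless? Video search is powerful, even if it is used here to detect porn... This is the legendary porn detector.
  • Reportedly, given an hour and a half, it can scan 70,000 images on a 500 GB hard drive and then pick out the images of interest using face analysis, skin-tone analysis, body-part analysis, and other algorithms. The vendor guarantees an error rate of only 1%.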

  • Does FourSquare count as geo-based augmented reality? Here is a tutorial that walks you through using it on a BlackBerry. Too bad my phone is a Nokia 1200.
    foursquare_blackberry
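  • People Hopper is a new experimental product from Google. It is a bit like face morphing, but every intermediate step is a real user photo, with no graphics involved at all.
  • Also: the Google Research blog has lots of new things and is worth a look.

Tracking Flies with Computer Vision

March 3, 2010 - No comments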

Dickinson Lab

Ctrax: The Caltech Multiple Walking Fly Tracker

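When biologists want to study the behavior of small organisms such as fruit flies, traditional Fabre-style observation by eye becomes impossible. Caltech uses computer vision to track the position and orientation of every fly over a fairly long period, which gives biologists a source for large volumes of observation data.

Ctrax is open source and comes with Matlab tools, so you can catch a few flies this summer and try it out...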

Ctrax is an open-source, freely available, machine vision program for estimating the positions and orientations of many walking flies, maintaining their individual identities over long periods of time. It was designed to allow high-throughput, quantitative analysis of behavior in freely moving flies. Our primary goal in this project is to provide quantitative behavior analysis tools to the neuroethology community, thus we’ve endeavored to make the system adaptable to other lab’s setups. We have assessed the quality of the results for our setup, and found that it can maintain fly identities indefinitely with minimal supervision, and on average for 1.5 fly-hours automatically.

To minimize the number of identity errors made during tracking, we provide the FixErrors Matlab GUI that identifies suspicious sequences of frames and allows a user to correct any tracking errors. We also distribute the BehavioralMicroarray Matlab Toolbox for defining and detecting a broad palette of individual and social behaviors. This software inputs the trajectories output by Ctrax and computes descriptive statistics of the behavior of each individual fly. We provide software for three proof-of-concept experiments to show the potential of the Ctrax software and our behavior detectors.

This software is described in the article High-Throughput Ethomics in Large Groups of Drosophila.
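For a feel of what "position and orientation" means for a single fly, here is a small sketch that fits an ellipse axis to one foreground blob using image moments. This is a common textbook approach and only my assumption for illustration; it is not taken from the Ctrax source, and the function name `blob_pose` is hypothetical. Keeping identities across frames, which is the hard part Ctrax addresses, is not covered here.

```python
import math


def blob_pose(mask):
    """Return (cx, cy, theta) for the foreground pixels of a binary mask.

    mask: 2-D list where truthy values mark the blob.
    theta: major-axis angle in radians, in image coordinates (y points down).
    The angle is ambiguous by 180 degrees: head and tail are not distinguished.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("mask has no foreground pixels")
    n = len(pts)
    # Centroid gives the position estimate.
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second-order central moments give the orientation of the major axis.
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, theta
```

Running this on every segmented blob in every frame would yield per-frame poses; linking them into trajectories needs a separate matching step.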