Archive
Kmeans based indexing and Asymmetric Distance Computation for ANN search (Binary Local Feature): part1
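Inspired by Herve Jegou's "Hamming Embedding and Weak Geometric consistency for large-scale image search" and "Product quantization for nearest neighbor search", this post applies Kmeans clustering, inverted files, and Asymmetric Distance Computation to nearest-neighbor search over binary local features.
Main idea:
Use Kmeans to build a coarse index over the features.
Compress each feature according to per-cluster statistics.
At query time, compute the distance between indexed features and the query feature asymmetrically.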
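Algorithm:
Training:
- Cluster the features to be indexed with Kmeans to obtain K centers. When clustering binary features, a cluster center is updated as follows: for each bit, count how often 1 and 0 occur at that bit among all features assigned to the cluster, and take the more frequent value.
- For each cluster, count the frequency of 1 and 0 at each bit position among all features assigned to it, and pick the top M bits whose frequency of 1 (or 0) is closest to 50%. (The closer to 50%, the higher the entropy.)
Training yields two sets of data:
- K feature cluster centers.
- For each cluster center, a set of "M bit-position identifiers". These identifiers define how an original feature is compressed. (The rest of this post calls them the projection vector.)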
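The two training steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the author's implementation: features are lists of 0/1 integers, the initial centers are supplied by the caller, a 50/50 tie in the majority vote resolves to 0, and the function names (`train`, `select_bits`, `compress`, and so on) are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def assign(features, centers):
    """Group features by nearest center (ties go to the lower index)."""
    clusters = [[] for _ in centers]
    for f in features:
        k = min(range(len(centers)), key=lambda j: hamming(f, centers[j]))
        clusters[k].append(f)
    return clusters


def majority_center(members):
    """Per-bit majority vote; a 50/50 tie yields 0 (assumed tie rule)."""
    n = len(members)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*members)]


def select_bits(members, m):
    """Indices of the m bits whose frequency of 1 is closest to 50%."""
    n = len(members)
    freq = [sum(bits) / n for bits in zip(*members)]
    ranked = sorted(range(len(freq)), key=lambda i: (abs(freq[i] - 0.5), i))
    return sorted(ranked[:m])


def train(features, initial_centers, m, iterations=10):
    """Return (centers, projections) after a fixed number of Kmeans rounds."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        clusters = assign(features, centers)
        # An empty cluster keeps its previous center.
        centers = [majority_center(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    clusters = assign(features, centers)
    # An empty cluster falls back to the first m bits (assumed fallback).
    projections = [select_bits(c, m) if c else list(range(m))
                   for c in clusters]
    return centers, projections


def compress(feature, projection):
    """Keep only the bits named by a cluster's projection vector."""
    return [feature[i] for i in projection]
```

A feature would then be stored as its cluster index plus `compress(feature, projections[cluster])`, so bits that are nearly constant within a cluster are dropped and only the M most informative ones are kept.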
CV Dazzle

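I previously covered an anti-face-detection project; its author, Adam Harvey, has since taken it further and come up with CV Dazzle. The earliest version ignored aesthetics and aimed purely at defeating the face detector (OpenCV based). This time Adam Harvey is trying to make it... look better?
[Hiring] Huawei Multimedia Technology Lab, Hangzhou Branch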
Job Function:
Researcher responsible for developing algorithms and prototype of research projects related to video and audio. The candidate will work with the multimedia technology lab in Hangzhou or Shenzhen.
Skills/Experience
Must hold a Bachelor's, Master's, or PhD degree in Electrical Engineering or Computer Engineering, with 2+ years of experience in the design, development, and integration of multimedia algorithms.
Strong knowledge of computational camera, computational photography, 3D display and video/audio signal processing.
Knowledge of video standards (H.264, MPEG4, VP6/VP8) or of graphics standards such as OpenGL is a definite plus.
Knowledge of audio standards such as AMR, AMR-WB, G.711, G.719 and OpenAL.
Knowledge of optical lens design is a plus.
Strong C, C++, and MATLAB development skills are required.
Experience in initiating research proposals, conducting the research activities, and developing deliverable software with a GUI is desired.
Should have good analytical ability and problem-solving skills, and be a self-starter.
Should work well within a matrix organization and be able to influence and collaborate with team members all over the world.
vibe

ViBe – a powerful technique for background detection and subtraction in video sequences
Executive summary
Description
ViBe is a powerful pixel-based technique that detects the background in video sequences. Many experiments have shown that it performs better than the state-of-the-art techniques known in the scientific literature. In addition the computational load is lower than simple background techniques implemented in commercial products. ViBe is the perfect solution for both software and hardware implementations.
Code and program for Windows and Linux
- A program for Windows and Linux. Download a zip archive [10 MB - updated on May 19, 2011] to use ViBe on Windows (or under Wine in Linux). Details on this page.
  The program allows you to: (1) save the result for your own images, (2) change the few parameters of ViBe to experiment with, and (3) reproduce our results.
- Linux: link a C/C++ object file to your own code. We provide the object (compiled) code of ViBe for non-commercial applications. Under Linux, download the 32-bit zip or compressed tar file, or the 64-bit zip or compressed tar file. Details on this page.
A PCL implementation of KinectFusion
The folks at WillowGarage have struck again, implementing KinectFusion, the system that dazzled at this year's ISMAR.
The preliminary source code is currently available in our SVN repository’s trunk in the CUDA/KinFu module. Since this code is still unreleased and under active development, we won’t be able to provide support via our forums yet; however, advanced users are free to check out the code and give it a try. Be advised that this code relies heavily on the NVidia CUDA development libraries for GPU optimizations and will require a compatible GPU for best results.
Moving forward, we continue to refine and improve the system, and we are hoping to improve upon the original algorithm in order to model larger scale environments in the near future. We are targeting a stable release date to coincide with the upcoming PCL 2.0 release next year. (Please note there is no planned release in the 1.x branch.)
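Learning ImageMagick 5: Color channel operations
A pair of dual operations: separating color channels and combining color channels.
Separating color channels: use the -channel and -separate options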
convert ..\SampleImages\Lena.png -channel R -separate Lena_R.png
convert ..\SampleImages\Lena.png -channel G -separate Lena_G.png
convert ..\SampleImages\Lena.png -channel B -separate Lena_B.png
Another way to separate: extract all three channels at once.
convert ..\SampleImages\Lena.png -separate Lena_RGB_%d.png
Combining color channels: use the -combine option
convert Lena_R.png Lena_G.png Lena_B.png -combine Lena2.png
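To see why the two operations are duals, here is a conceptual model in plain Python. It is not how ImageMagick is implemented; it assumes an image is just a flat list of `(r, g, b)` tuples, and the helper names `separate` and `combine` are hypothetical.

```python
def separate(pixels):
    """Split a list of (r, g, b) tuples into three single-channel lists."""
    red = [p[0] for p in pixels]
    green = [p[1] for p in pixels]
    blue = [p[2] for p in pixels]
    return red, green, blue


def combine(red, green, blue):
    """Zip three single-channel lists back into (r, g, b) tuples."""
    return list(zip(red, green, blue))
```

In this model, separating and then combining returns the original pixels unchanged, which is the same round trip the commands above perform from Lena.png to Lena2.png.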
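[Repost] OpenCV performance tests on the iOS and x86 platforms
A benchmark write-up from Lao Yang.
I haven't updated the blog in a long time; lately I've been busy job hunting, and so far things look pretty good. I feel I basically have nothing to do with computer vision anymore: of the many companies I interviewed with, only one does vision, and the rest are plain coding jobs.
This semester I took an easy computer architecture course, and for the final project I benchmarked OpenCV on x86 and ARM. My laptop's CPU is an Intel Core i7 620M; for the iOS tests I used an iPod Touch, which has the same CPU as the iPhone 4, the Apple A4. I had no chance to test the Apple A5 in the latest iPhone 4s and iPad2, because neither can be fully jailbroken. x86 being faster than ARM is a given, but this benchmark shows by how much.
Build: on x86 I compiled for 64-bit. Since the Apple A4 supports both ARM v6 and v7, I built a separate static library for each.
Tests: I varied the data type (8/16/32-bit integers, 32/64-bit floats), the input matrix size (4*4/8*8/…/256*256/512*512), and the operation (addition, multiplication, transpose, inversion, SVD), plus a set of image-processing comparisons.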
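The shape of such a sweep (data type × matrix size × operation) might look like the following sketch. It is an illustration only and does not use OpenCV: the matrices are nested Python lists, Python's `int` and `float` stand in for the fixed-width types, only three of the operations are covered, and every name here is hypothetical.

```python
import time


def make_matrix(n, value):
    """Build an n x n matrix filled with a single value."""
    return [[value] * n for _ in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def multiply(a, b):
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Each operation takes two operands so the loop below can treat them uniformly.
OPERATIONS = {
    "add": add,
    "multiply": multiply,
    "transpose": lambda a, b: transpose(a),
}


def benchmark(sizes, dtypes, repeats=3):
    """Return {(dtype, size, operation): best wall-clock seconds}."""
    results = {}
    for dtype_name, cast in dtypes.items():
        for n in sizes:
            a = make_matrix(n, cast(2))
            b = make_matrix(n, cast(3))
            for op_name, op in OPERATIONS.items():
                best = float("inf")
                for _ in range(repeats):
                    start = time.perf_counter()
                    op(a, b)
                    best = min(best, time.perf_counter() - start)
                results[(dtype_name, n, op_name)] = best
    return results
```

Taking the best of several repeats is one common way to reduce noise from other processes; running the same sweep on each platform then gives directly comparable per-cell timings.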
Here are some of the comparison results:
