
[–]glitch83 5 points6 points ago

Separate flame war - Anyone else over the Andrew Ng and Peter Norvig worship? Can we start calling them fanboys yet? There are plenty of other AI contributors in the world who deserve credit. Does this have to do with the fact that their student base is now the size of a small city?

[–]spiderling 2 points3 points ago

It seems to me like a subject-object issue. The manner in which we as humans define objects is somewhat subjective (what is a shoe but cloth and rubber, and does it become a hat if I wear it on my head?).

Heidegger has some interesting things to say about object identity (I think he calls it thinginess) in Building Dwelling Thinking, if I recall correctly.

[–]moscheles[S] 1 point2 points ago

Dear spiderling, I appreciate your reply here. I do believe the way English speakers divide up the world around them is very much dependent on linguistic conventions handed down through culture and language use by other speakers. Having said that, I see your reply has been thumbed down, while the hordes of Machine Learning fanatics are now actively engaged in thumbing each other's posts up. I guess what I would say to them is that Peter Norvig was interviewed on camera and he himself said that the Object Recognition Problem is a central plaguing issue that he would like to see solved. And I believe the other AI researcher sitting to his left concurred with that.

[–]spiderling 1 point2 points ago

Ironically, after the thumb-down I thought I might be thinking of the wrong sort of "object," and that perhaps you meant object as in "object-based programming." Though I'm considering going back to school to study computer science, my interest in AI up to this point has been purely philosophical. Thanks for the affirmation.

[–]marshallp 0 points1 point ago

Peter Norvig was talking about it because he is actually working on it and has made progress as we speak, not because he hopes someone out there will hear him and do it.

In fact there are rumors that google internally has a pretty good object recognition system which they haven't exposed yet because it is too costly to release for free (requires too much computation on servers), or is not good enough yet (might be 80% accurate but not the 95% they are looking for etc).

The ai researcher next to him is a fool (eric horvitz) who sat around as microsoft's lead ai person the past two decades while google leapfrogged him. Microsoft's successful ai products, bing and kinect were done without him. He was a contributor to the disastrous clippy and is basically still working on that, 15 years later, with nothing practical to show for it.

[–]zarawesome 0 points1 point ago

For the purposes of this problem, an object is its visible shape. If a machine identifies a tennis shoe, a sculpture of the same shoe in clay, and a picture of a shoe as "shoes", it's good enough for research work.

[–]autoencoder 2 points3 points ago

Could be me that is biased, but I think this is the universal solution.

[–]moscheles[S] 4 points5 points ago

Modern robots do not yet perceive objects, if by the word "objects" we mean "what a thing is". Robots at this time (early 2012) only perceive patches of color. They can be made to follow patches of color by blindly following geometric algorithms on depth. Even today, expensive robots in academic laboratories can cost 300 thousand dollars. Even they have problems where their localization drifts because a calibration was slightly off. The researchers often have to back the robot up and restart it to regain its localization. The reason is that these expensive robots are still merely puppets following a geometric algorithm. If these robots actually perceived objects, they could "self-calibrate" and self-localize without any outside help. They would know when they were seeing their own grasper hand, and they would recognize an object in partial occlusion and turned at an odd angle. Today they do not do this. They cannot do this.

The English word "chair" is a category. The determining factor of this category is a thing's use to human beings, who perform a sitting action on top of things. The category "Chair" is not the collection of invariant features extracted from thousands of photographs of chairs.

When I hand a human being a collection of, say, 80 photographs of various chairs, they realize they are all chairs not because they share some common visual feature. The long-term memory of the human has INSTANTIATED all these rather different-looking objects into the same "Use Category" through the years. There is no common visual feature to be extracted, because it is simply not there. And a supercomputer the size of a skyscraper would never find it. It's not there to be found.

There is a problem with objects such as "glass of water", because water is not an object but a substance. I can take one glass of water and split it to create ten glasses. You AI guys and Machine Learning fanatics propose that a "rock" is equal to the collection of visual features of rocks. However, I could grind a rock into sand and then argue that it was "sand all along." There is no clear way to prove me wrong in this instance. This might seem strange to you only because you are an English speaker with the cultural conventions of English. English speakers have no problem with referring to ice as "frozen water" and saying that it was "really water all along". Why then can I not say a rock is merely frozen sand?

Coming to me and claiming "machine learning already solved this and go read this paper" is not what I will count as a thorough and serious discussion of the matter.

[–]marshallp 0 points1 point ago

sounds like a lot of wishy-washyness.

What you are talking about is really data: you need to collect enough data to have robust recognition (and this includes features as well as examples).

There are real object recognition benchmarks, and serious people are working on this. Some datasets: ImageNet from Stanford, 80 Million Tiny Images and SUN from MIT. Object recognition for a small number of categories, about 5-100, is "solved", i.e., you can get a computer to learn it in a few minutes or hours. For real-world use, humans can recognize between 30,000 and 100,000 different types of objects, and progress is being made on that front. Man-made objects (houses, cars, airplanes, tables, etc.) are much easier than natural objects (animals and plants), but progress is being made on those too. (My own prediction is that probably this year object recognition will be "solved" in the sense of matching human performance; afterwards it will outperform humans.)

[–]moscheles[S] 0 points1 point ago

You refer to machine vision performance on still-image "data sets". That is not the Object Recognition Problem.

[–]marshallp 0 points1 point ago

That is what is meant by object recognition in the computer vision and machine learning community. Maybe you're talking about some philosophical sense of the term, but that's not how you phrased it in your original post (get robots to see).

[–]rmnature 0 points1 point ago

You're right about all this. Object recognition in the sense you're talking about is a frustratingly hard problem.

[–]autoencoder 0 points1 point ago

Though I have very limited experience (only the sparse autoencoder exercise from UFLDL), I believe we basically solved perception.

It's only a matter of time and processing power until we can create robots that can gather data and learn for themselves. Hence your claim of "they cannot do this" is, honestly, ridiculous.

And as for these nomenclature issues you present, I think a system with a greater computing power will deal with them the same way humans do - by learning abstractions.

For each extra layer of features, you can represent more and more abstract concepts. An existent modification of this algorithm can represent sentences internally, with arbitrary and variable depth.

Also, according to this neat and reference-abundant video, all we know is derived from evidence. That includes all your petty nomenclature technicalities. A deep learning system will be able to figure them out.

And you can prove me wrong, if we wait a little while (2015? 2020?) for computers to become cheaper.

Don't be afraid of AI, because it's coming, whether you like it or not, and I don't think anything short of a global cataclysm could stop it. Even if this technology isn't the right one, I am sure people will find their way around it, in the worst case by simulating the brain, because the brain is just a physical system.

All I wonder now is whether such neural networks could be considered sentient. :D

[–]moscheles[S] 0 points1 point ago

Having admitted that you have "very limited experience" in this topic, I will say the following,

If you think you have a solution to this problem, please send an email to Peter Norvig. He is tenured, he is published, and he has plausible access to millions of dollars in R&D money. If you think you are ready to formulate the solution, go tell him, not me. I'm here on reddit merely looking for links to what professionals are saying about it.

[–]marshallp -2 points-1 points ago*

Actually, Peter Norvig agrees with the sparse coding "formulation". (He referred to it a few months ago because some of his google colleagues and andrew ng published a paper in which they used that to get the best performance on identifying house numbers on google street view images).

edit: forgot to add, sparse coding has also given the best results on general object recognition this year (i.e., it is the best method). In previous years, hand-coded feature engineering and supervised learning methods (SVMs) gave the best results. This is a hugely significant breakthrough because no human engineering is involved: you simply let a simple algorithm run, and so this can be scaled up to get results as accurate as you want or can afford.

[–]autoencoder 0 points1 point ago

Wow, I swear I didn't see your message. Are we the same person?

[–]marshallp 1 point2 points ago

Haha, no, but a general consensus is building about the importance of sparse coding. In fact, in signal processing this happened a few years ago, where it is known as compressed sensing. Terence Tao did some mathematical thing (that I don't understand), but it was significant because it moved the field of linear algebra forward, which hadn't had anything new happen since the early 19th century.

[–][deleted] 0 points1 point ago

Humm, hmm. What does your last sentence mean? What was the last advancement in linear algebra in the early 19th century?

[–]marshallp 0 points1 point ago

I mentioned that I don't actually understand it. I saw a video on YouTube of a talk that Terence Tao gave to laypeople about compressed sensing (I watched mainly because he might well be the smartest person alive on earth), and he mentioned that it's an advancement on linear algebra, a subject which was thought to have been fully explored in the 19th century.

[–][deleted] 0 points1 point ago

There is probably a misunderstanding, there were a lot of results in linear algebra during the 20th century. In fact, the definition of a linear space was given by Peano around 1890, so most of the modern formulation is fairly recent.

Tao's work is more closely related to harmonic analysis I think.

[–]autoencoder 0 points1 point ago

I think he knows about it. I don't think I need to prove anything. I also think he'd agree with it - he's a fan of using statistics in natural language processing.

Again, this algorithm is here, but it needs refinement and a lot more computing power to be practical.

But I'll mail him about his opinion. What if he does answer strangers on the internet? Thanks for the suggestion!

Also, both of them, he and Andrew Ng, have really close ties with Google. I think Google might actually be using these kinds of things for their "visually similar" image search.

[–]marshallp 0 points1 point ago

also see his publications page http://www.cs.stanford.edu/people/ang/papers.php

Basically, sparse coding is what can allow "learning" of anything, whether it be through autoencoding neural networks, sparse PCA, or K-SVD. Closely related are Fourier transforms, filter banks, wavelets, and random features.
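
For anyone who wants something concrete, here is a rough sketch of the sparse coding idea: describe a signal as a combination of a few "atoms" from a dictionary, with most coefficients forced to exactly zero. It uses ISTA (iterative soft-thresholding) on a fixed dictionary, which is only one of several ways to do this and is not the method from any particular paper mentioned in this thread. The names `sparse_code`, `atoms` and `lam` are made up for illustration, and the dictionary is given by hand rather than learned.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; anything within [-t, t] becomes exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, atoms, lam=0.1, iters=200):
    """ISTA for: minimize 0.5*||x - sum_j a[j]*atoms[j]||^2 + lam*sum_j |a[j]|.

    x     : the signal, a list of floats
    atoms : hypothetical dictionary, a list of lists, each the same length as x
    lam   : sparsity penalty (assumed value; larger means more zeros)
    """
    n = len(x)
    # The squared Frobenius norm of the dictionary is an upper bound on the
    # gradient's Lipschitz constant, so this step size is conservative but safe.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    a = [0.0] * len(atoms)
    for _ in range(iters):
        # Current reconstruction and its residual against the signal.
        recon = [sum(a[j] * atoms[j][i] for j in range(len(atoms))) for i in range(n)]
        resid = [recon[i] - x[i] for i in range(n)]
        # Gradient of the squared-error term with respect to each coefficient.
        grad = [sum(atom[i] * resid[i] for i in range(n)) for atom in atoms]
        # Gradient step followed by shrinkage, which is what produces exact zeros.
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(len(atoms))]
    return a


# Toy example: three axis-aligned atoms and a signal with one tiny component.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [1.0, 0.05, -0.5]
code = sparse_code(x, atoms, lam=0.1)
```

In this toy case the 0.05 component falls below the penalty and its coefficient is set to exactly zero, while the two large components survive slightly shrunk. Real systems also learn the dictionary from data, which this sketch leaves out.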

[–]glitch83 1 point2 points ago

I firmly believe that object recognition will require AI to come back together before it is "solved" - Learning, Planning, Common-sense, Vision, Language, etc. I think the reason it's a "problem" is that it gets at the fact that there aren't many people who are thinking big enough yet.

Honest challenge to the field - read some literature in a tangential AI field and get some inspiration. Time to bring it all back together :-P

[–]moscheles[S] 1 point2 points ago

Good. These are the days in which I think the philosophers should come across campus to the AI department and toss cold water on the people there.

[–]meshugga 1 point2 points ago

Absolutely. The predominant stance seems to be, any problem that can't be tackled by statistics doesn't exist or isn't formulated correctly.

I believe, cognitive vision is more a problem of combinatorics than statistics, and desperately needs context/pragmatics, reasoning and a concept of time and modality. And given the same information, it should make the same mistakes as the human brain please.

I'm pretty sure AI would be much farther along if the AI gurus also took courses in linguistics and psychology. There are a lot of reasons to believe that the brain actually doesn't identify objects through vision only.