New visual search engine


    PC Pro: News: Visual search engine is photographer's best friend

    Pretty cool technology, I'm just curious how long a basic search is going to take to complete for something like this.

    #2
    I wish they gave us an example picture/video of the work in progress!


      #3
      Originally posted by Trooper110
      PC Pro: News: Visual search engine is photographer's best friend

      Pretty cool technology, I'm just curious how long a basic search is going to take to complete for something like this.


      You do understand that when you click "search" in a search engine, what it searches is a database and not the internet itself? Basically, search engines have robots, crawlers, or spiders (basically scripts) that continually crawl the net and snarf up various bits of info into a database. When you submit a search, you actually search that database. So in essence this search should take no longer than any other type of search, if they've done the backend DB stuff correctly. My guess is they have some type of algorithm for hashing all of the various image formats. Then when you submit a pic, it runs it through the same algorithm and does a hash compare. If the compare is within a certain tolerance, it's listed as a match. Of course, putting a robots.txt like this at the root of the web server

      User-agent: *
      Disallow: /


      defeats all search engines, or at least every crawler that honors robots.txt. I had to do that for my gallery site because latency went up whenever crawlers hit my in-home picture server.
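
The effect of that two-line robots.txt can be checked with Python's standard `urllib.robotparser` module. This is only a minimal sketch: the bot name and the example.com URL are placeholders, not anything from the thread. Keep in mind that robots.txt is advisory, so it only stops crawlers that choose to obey it.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt quoted in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URL are hypothetical placeholders.
    print(is_allowed("ExampleBot", "http://example.com/gallery/pic.jpg"))
```

Because the rule applies to every user agent and covers the whole site root, a compliant crawler is refused for any path.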



        #4
        The problem is that it's supposed to search the pixels in each picture. They're going to have to have some kind of incredible database and search algorithm to do that. It's supposed to be able to find cropped or lightly modified images as well. That's going to take some massive computing power.


          #5
          Originally posted by Trooper110
          The problem is that it's supposed to search the pixels in each picture. They're going to have to have some kind of incredible database and search algorithm to do that. It's supposed to be able to find cropped or lightly modified images as well. That's going to take some massive computing power.


          Don't think so. Like I said, it's going to hash each image that it comes across and store that hash in a DB. When you upload an image, it hashes that as well, then searches the DB for a hash collision or something close. My guess is the hard part is tweaking the hashing algorithm. I bet it works by looking at pixel-to-pixel relationships and builds off of that, i.e. this one is green and is next to a black one and a red one, only it does this over the entire pic, reducing the pic down to a hash, or signature. I don't see another practical way for them to do it.
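
The "collision or something close" lookup can be sketched as a Hamming-distance search over stored signatures. This is a guess at the general idea, not TinEye's method: the 16-bit signatures, the URLs, and the tolerance value below are all made up for illustration, and a real index would not scan every entry linearly.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def find_matches(index: dict[str, int], query_sig: int, tolerance: int) -> list[tuple[str, int]]:
    """Return (url, distance) pairs within the tolerance, closest first."""
    hits = []
    for url, sig in index.items():
        distance = hamming_distance(sig, query_sig)
        if distance <= tolerance:
            hits.append((url, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical index: URL -> 16-bit signature (invented values).
INDEX = {
    "http://example.com/original.jpg": 0b1011_0110_1100_0011,
    "http://example.com/retouched.jpg": 0b1011_0110_1100_0111,
    "http://example.com/unrelated.jpg": 0b0100_1001_0011_1100,
}

if __name__ == "__main__":
    query = 0b1011_0110_1100_0011
    for url, distance in find_matches(INDEX, query, tolerance=2):
        print(distance, url)
```

The tolerance is the knob the post is talking about: too low and retouched copies are missed, too high and unrelated images start to match.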


          For an example of this, check out this free open source project:

          DuMP3 - duplicate/similar file finder - Home

          DuMP3 (derived from Duplicate MP3) is a Java program to find any duplicate or similar file.
          It finds files by calculating a fingerprint based on the image, audio or text data for each file and then comparing the fingerprints. It does not compare filenames or even ID3 tags (even though plugin classes could be written that perform these operations). Calculated fingerprints can be stored in a MySQL database so that they do not have to be calculated again.



            #6
            The TinEye search engine, developed by Canadian company Idee, allows users to search by uploading a picture rather than typing in a keyword. It then conducts a pixel by pixel search across the internet, flagging up all instances of that image even if it's been cropped, merged or digitally altered in some way.

            "TinEye does for images what Google does for text," says Leila Boujnane, the CEO of TinEye. "We are not limited by words, Google can only find an image if a particular search word is in proximity to it. We have the ability on a large scale to tell somebody where one of their images has appeared and how it's being used."

            And the technology is not dependent on the quality of the input image, according to Boujnane: "Anything you would consider a preview image or low resolution image would work. I can take a photograph of a picture in the Louvre with my mobile and upload it to TinEye and it would dump me on the page of that Wikipedia page related to that painting."
            I guess you could do that with hashes, but I'm not sure how you would.


              #7
              Originally posted by Trooper110
              I guess you could do that with hashes, but I'm not sure how you would.

              It's all data in every image file format. You just need some way to read in data for all of the file formats and process that data stream in a similar way. In essence, though, every format ends up displaying (outputting) a picture, so they all have some of the same elements. So you apply a fuzzy type of logic when creating the hash, based on how the pixels relate to each other. BTW, a hash is:

              A hash value (also called a "digest" or a "checksum") is a kind of "signature" for a stream of data that represents the contents.
              Do the same thing to any uploaded image and then search the DB for similar signatures.
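
To see why the hash has to be "fuzzy", compare it with an ordinary checksum. The sketch below uses SHA-256 from Python's `hashlib` purely as an illustration (nobody in the thread says which hash is involved), and the byte string is a stand-in for pixel data. Changing a single byte produces a completely different digest, so exact digests can only find byte-identical files.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count positions where two equal-length hex digests differ."""
    return sum(x != y for x, y in zip(a, b))


# Stand-in for raw pixel data (hypothetical, not a real image).
ORIGINAL = bytes(range(256))
# Same data with only the first byte changed from 0 to 1.
TWEAKED = bytes([1]) + ORIGINAL[1:]

if __name__ == "__main__":
    d1, d2 = digest(ORIGINAL), digest(TWEAKED)
    print(d1)
    print(d2)
    print("hex characters that differ:", differing_hex_chars(d1, d2))
```

A signature meant for similarity search needs the opposite property: small changes to the picture should cause small changes to the signature.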

              The three main time/processing-intensive areas are:
              1. Crawling the net
              2. Doing the hash generation on each image found
              3. Database search

              Crawling the net and DB searching are pretty mature technologies. It'll be hard to F those up. Also on the plus side, crawling and hashing images happen continuously in the background, just like Google is doing right now. The scripts are constantly crawling the net and updating the DB. The only lag time is when a user submits a search, so it will have to:

              1. Upload the image
              2. Generate the signature on the image
              3. Search the DB for similar signatures

              With broadband, uploads are not an issue. Custom-made hardware that does a gazillion hashing operations is commonplace nowadays, so hashing shouldn't be a prob. That leaves DB searching and, as I said, this is pretty mature technology. I'm surprised someone hasn't done this sooner, to be honest.



                #8
                I'm still curious as to how they'll search for digitally modified pictures. I can understand finding cropped or such pictures, you'd still have part of the hash, but if it's been modified enough the hash isn't going to match up that well, especially if it becomes part of a larger modified file. Guess we'll see how things come out.


                  #9
                  Originally posted by Trooper110
                  I'm still curious as to how they'll search for digitally modified pictures. I can understand finding cropped or such pictures, you'd still have part of the hash, but if it's been modified enough the hash isn't going to match up that well, especially if it becomes part of a larger modified file. Guess we'll see how things come out.

                  If the hash sig is similar enough, it scores as a hit. Don't think of the hash as taking the numerical values of each pixel and summing them. Think of it as taking the numerical difference between each pixel and its adjacent pixels and building the hash off of that. If the hash function was good enough, it could find pictures that had the colors inverted or changed over the entire pic in the same manner. For example, lightening, darkening, sharpening, histogram changes, and various color filters would all still score as a hit.
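
The adjacent-pixel idea can be sketched as a "difference hash": one bit per pair of horizontal neighbours, set when the right pixel is brighter than the left. This is an assumption about how such a signature could work, not TinEye's published algorithm. The tiny grayscale grid below is invented, and practical versions usually shrink the image to a small grayscale grid first.

```python
def difference_signature(pixels: list[list[int]]) -> int:
    """Build a signature with one bit per horizontally adjacent pixel pair."""
    signature = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness rises from left to right, else 0.
            signature = (signature << 1) | (1 if right > left else 0)
    return signature


def brighten(pixels: list[list[int]], amount: int) -> list[list[int]]:
    """Add a constant to every pixel (no clipping, to keep the sketch simple)."""
    return [[value + amount for value in row] for row in pixels]


# Hypothetical 3x4 grayscale image, values 0-255.
SAMPLE = [
    [10, 50, 30, 200],
    [90, 20, 60, 61],
    [5, 5, 100, 0],
]

if __name__ == "__main__":
    before = difference_signature(SAMPLE)
    after = difference_signature(brighten(SAMPLE, 40))
    print(format(before, "09b"), format(after, "09b"), before == after)
```

Uniform lightening or darkening leaves every left/right comparison unchanged, so the signature stays the same. Edits that clip values or change local structure flip some bits, which is where a distance tolerance comes in.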



                    #10
                    Heh, I rock. I understood the product just by hearing about it. Here is info from TinEye's own website:

                    How does TinEye work?
                    Every day TinEye's spiders crawl the web for additional images. Using sophisticated pattern recognition algorithms, TinEye creates a unique and compact digital signature or 'fingerprint' for each one and adds it to the index.

                    When you want to find out where an image is being used on the web, you submit it to TinEye. The attributes of the image are analyzed instantly, and its fingerprint is compared to the fingerprint of every single image in the TinEye search index. The result? A detailed list of any websites using that image, worldwide.

                    Use TinEye to find out where and how an image appears on the web, even if it has been cropped or heavily modified.



                      #11
                      I'd still like to see how they're doing the searches on the "heavily modified" images :P


                        #12
                        Check out their beta, it works pretty well.
