New visual search engine



    PC Pro: News: Visual search engine is photographer's best friend

    Pretty cool technology, I'm just curious how long a basic search is going to take to complete for something like this.

    #2
    I wish they gave us an example picture/video of the work in progress!


      #3
      Originally posted by Trooper110
      PC Pro: News: Visual search engine is photographer's best friend

      Pretty cool technology, I'm just curious how long a basic search is going to take to complete for something like this.


      You do understand that when you click "search" in a search engine, what that search engine is searching is a database and not the internet? Basically, search engines have robots, crawlers, or spiders (basically scripts) that continually crawl the net and snarf up various bits of info into a database. When you submit a search, you actually search that database. So in essence this search should take no longer than any other type of search, if they've done the backend DB stuff correctly. My guess is they have some type of algorithm for hashing all of the various image formats. Then when you submit a pic, it runs the pic through the same algorithm and does a hash compare. If the compare is within a certain tolerance, it's listed as a match. Of course, putting a robots.txt like this at the root of the web server

      User-agent: *
      Disallow: /


      defeats all search engines (at least the ones whose crawlers honor robots.txt). I had to do that for my gallery site because latency increased whenever crawlers hit my in-home picture server.
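For anyone who wants to check what that robots.txt does, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleBot` and the `example.com` URL are made up for illustration, and this only models a crawler that chooses to obey robots.txt; nothing forces a crawler to do so.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post above: every user agent, everything disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-compliant crawler may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical crawler name and URL, used only for this example.
blocked = is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/gallery/pic.jpg")
print("allowed" if blocked else "blocked")
```

With an empty robots.txt the same call reports the URL as allowed, so the two lines in the file are what keep compliant crawlers out.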



        #4
        The problem is that it's supposed to search the pixels in each picture. They're going to have to have some kind of incredible database and search algorithm to do that. It's supposed to be able to find cropped or lightly modified images as well. That's going to take some massive computing power.


          #5
          Originally posted by Trooper110
          The problem is that it's supposed to search the pixels in each picture. They're going to have to have some kind of incredible database and search algorithm to do that. It's supposed to be able to find cropped or lightly modified images as well. That's going to take some massive computing power.


          Don't think so. Like I said, it's going to hash each image that it comes across, then store that hash in a DB. When you upload an image, it hashes that as well. It then searches the DB for a hash collision or something close. My guess is the hard part is tweaking the hashing algorithm. I bet it works by looking at pixel-to-pixel relationships and builds off of that, i.e., this pixel is green and is next to a black one and a red one. Only it does this over the entire pic, reducing the pic down to a hash... or signature. This is the only way I can see them doing it.


          For an example of this, check out this free open source project:

          DuMP3 - duplicate/similar file finder - Home

          DuMP3 (derived from Duplicate MP3) is a Java program to find any duplicate or similar file.
          It finds files by calculating a fingerprint based on the image, audio or text data for each file and then comparing the fingerprints. It does not compare filenames or even ID3 tags (even though plugin classes could be written that perform these operations). Calculated fingerprints can be stored in a MySQL database so that they do not have to be calculated again.
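The "store the signatures, then look for something close" idea from this post can be sketched roughly as below. This is only an illustration under assumptions: the signatures are made-up 16-bit integers, the index is a plain dict rather than a real database, and "close" is taken to mean a small Hamming distance (number of differing bits). Neither TinEye nor DuMP3 is claimed to work this way.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def find_similar(index: dict[str, int], query: int, tolerance: int) -> list[tuple[str, int]]:
    """Return (url, distance) pairs within the tolerance, closest first."""
    hits = [(url, hamming(sig, query)) for url, sig in index.items()]
    return sorted((h for h in hits if h[1] <= tolerance), key=lambda h: h[1])


# Hypothetical precomputed signatures; URLs are placeholders.
index = {
    "http://example.com/a.jpg": 0b1011001110001111,
    "http://example.com/b.jpg": 0b1011001110001101,  # 1 bit away from a.jpg
    "http://example.com/c.jpg": 0b0100110001110000,  # every bit differs from a.jpg
}
query = 0b1011001110001111  # signature of the "uploaded" image

print(find_similar(index, query, tolerance=2))
```

Raising the tolerance finds more heavily altered copies but also lets more unrelated images through, which is presumably where the tuning effort goes.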



            #6
            The TinEye search engine, developed by Canadian company Idee, allows users to search by uploading a picture rather than typing in a keyword. It then conducts a pixel by pixel search across the internet, flagging up all instances of that image even if it's been cropped, merged or digitally altered in some way.

            "TinEye does for images what Google does for text," says Leila Boujnane, the CEO of TinEye. "We are not limited by words, Google can only find an image if a particular search word is in proximity to it. We have the ability on a large scale to tell somebody where one of their images has appeared and how it's being used."

            And the technology is not dependent on the quality of the input image, according to Boujnane: "Anything you would consider a preview image or low resolution image would work. I can take a photograph of a picture in the Louvre with my mobile and upload it to TinEye and it would dump me on the page of that Wikipedia page related to that painting."
            I guess you could do that with hashes, but I'm not sure how you would.


              #7
              Originally posted by Trooper110
              I guess you could do that with hashes, but I'm not sure how you would.

              It's all data in every image file format. You just need some way to read in the data for all of the file formats and process that data stream in a similar way. In essence, the formats are all going to display (output) a picture, so they all have some of the same elements. So you apply a fuzzy type of logic when creating the hash, based on how the pixels relate to each other. BTW, a hash is:

              A hash value (also called a "digest" or a "checksum") is a kind of "signature" for a stream of data that represents the contents.
              Do the same thing to any uploaded image and then search the DB for similar signatures.
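One caveat worth showing: an ordinary checksum of the raw bytes is not "fuzzy" at all. In the sketch below (the byte values are invented stand-ins for pixel data), changing a single byte by one gives a completely different SHA-256 digest, which is why the signature would have to be built from pixel relationships rather than from a standard digest.

```python
import hashlib

# Invented stand-ins for raw pixel data; the second differs in one byte by 1.
original = bytes([10, 20, 30, 40, 50, 60, 70, 80])
tweaked = bytes([10, 20, 30, 40, 50, 60, 70, 81])


def digest(data: bytes) -> str:
    """Ordinary cryptographic checksum of a byte stream."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count positions where two equal-length hex digests disagree."""
    return sum(x != y for x, y in zip(a, b))


print(digest(original))
print(digest(tweaked))
print("hex characters that differ:", differing_hex_chars(digest(original), digest(tweaked)))
```

Identical input always gives an identical digest, so this kind of hash is fine for finding exact duplicates, just not near-duplicates.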

              The three main time/processing-intensive areas are:
              1. Crawling the net
              2. Doing the hash generation on each image found
              3. Database search

              Crawling the net and DB searching are pretty mature technologies. It'll be hard to F those up. Another plus is that crawling and hashing images happen constantly in the background, just like Google is doing right now: the scripts are always crawling the net and updating the DB. The only lag time is when a user submits a search, at which point it has to:

              1. Upload the image
              2. Generate the signature on the image
              3. Search DB for similar signatures
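The three query-time steps above can be sketched end to end like this. Everything here is an assumption made for illustration: images are tiny 4x4 grayscale grids instead of real files, the signature is a simple "average hash" (one bit per pixel, set when the pixel is brighter than the image mean), and the DB is an in-memory SQLite table with a Hamming-distance function registered through `sqlite3`'s `create_function`. It is not a description of how TinEye does it.

```python
import sqlite3


def average_hash(pixels: list[list[int]]) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    sig = 0
    for p in flat:
        sig = (sig << 1) | (1 if p > mean else 0)
    return sig


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def build_index(images: dict[str, list[list[int]]]) -> sqlite3.Connection:
    """Background step: hash every crawled image and store the signature."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("hamming", 2, hamming)
    conn.execute("CREATE TABLE signatures (url TEXT PRIMARY KEY, sig INTEGER)")
    conn.executemany(
        "INSERT INTO signatures VALUES (?, ?)",
        [(url, average_hash(px)) for url, px in images.items()],
    )
    return conn


def search(conn: sqlite3.Connection, uploaded: list[list[int]], tolerance: int = 2) -> list[str]:
    """Query step: hash the uploaded image, then look for nearby signatures."""
    sig = average_hash(uploaded)
    rows = conn.execute(
        "SELECT url FROM signatures WHERE hamming(sig, ?) <= ? "
        "ORDER BY hamming(sig, ?), url",
        (sig, tolerance, sig),
    )
    return [row[0] for row in rows]


# Made-up "crawled" images (placeholder URLs).
sunset = [[200] * 4, [200] * 4, [30] * 4, [30] * 4]
checker = [[200, 30, 200, 30], [30, 200, 30, 200]] * 2
conn = build_index({
    "http://example.com/sunset.jpg": sunset,
    "http://example.com/checker.jpg": checker,
})

# "Uploaded" image: a darker copy of sunset with one pixel altered.
upload = [[180] * 4, [180, 180, 180, 20], [20] * 4, [20] * 4]
print(search(conn, upload))
```

The scan here still compares the query against every row; a real index over billions of images would need something smarter than a linear pass, which is outside what this sketch tries to show.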

              With broadband, uploads are not an issue. Custom-made hardware that does a gazillion hashing functions is commonplace nowadays, so hashing shouldn't be a problem. That leaves DB searching and, as I said, this is pretty mature technology. I'm surprised someone hasn't done this sooner, to be honest.



                #8
                I'm still curious as to how they'll search for digitally modified pictures. I can understand finding cropped or such pictures, you'd still have part of the hash, but if it's been modified enough the hash isn't going to match up that well, especially if it becomes part of a larger modified file. Guess we'll see how things come out.


                  #9
                  Originally posted by Trooper110
                  I'm still curious as to how they'll search for digitally modified pictures. I can understand finding cropped or such pictures, you'd still have part of the hash, but if it's been modified enough the hash isn't going to match up that well, especially if it becomes part of a larger modified file. Guess we'll see how things come out.

                  If the hash sig is similar enough, it scores as a hit. Don't think of the hash as taking the numerical value of each pixel and summing them. Think of it as taking the numerical difference between each pixel and its adjacent pixels and building the hash off of that. If the hash function were good enough, it could find pictures that had the colors inverted or changed in the same manner over the entire pic. For example, lightening, darkening, sharpening, histogram changes, and various color filters would all still score as a hit.
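A hash built from adjacent-pixel differences is commonly called a difference hash (dHash). Here is a minimal sketch on a made-up 3x4 grayscale grid; it is a guess at the general idea in this post, not TinEye's actual algorithm. Each bit records only whether a pixel is darker than its right-hand neighbour, so brightening every pixel by the same amount leaves the signature unchanged (assuming no values clip at 255).

```python
def difference_hash(pixels: list[list[int]]) -> int:
    """One bit per horizontal neighbour pair: 1 if left pixel < right pixel."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left < right else 0)
    return bits


# Made-up grayscale values (0-255), small enough to check by hand.
image = [
    [10, 50, 40, 90],
    [80, 20, 60, 60],
    [5, 15, 25, 35],
]
brighter = [[p + 30 for p in row] for row in image]   # uniform lightening
inverted = [[255 - p for p in row] for row in image]  # colors inverted

print(format(difference_hash(image), "09b"))
print(format(difference_hash(brighter), "09b"))
print(format(difference_hash(inverted), "09b"))
```

Inverting the image flips the direction of every comparison, so the inverted signature is close to the bitwise complement of the original rather than equal to it. A matcher that wanted to catch inverted copies would have to test against the complement as well.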



                    #10
                    Heh, I rock. I understood the product just by hearing about it. Here is info from TinEye's own website:

                    How does TinEye work?
                    Every day TinEye's spiders crawl the web for additional images. Using sophisticated pattern recognition algorithms, TinEye creates a unique and compact digital signature or 'fingerprint' for each one and adds it to the index.

                    When you want to find out where an image is being used on the web, you submit it to TinEye. The attributes of the image are analyzed instantly, and its fingerprint is compared to the fingerprint of every single image in the TinEye search index. The result? A detailed list of any websites using that image, worldwide.

                    Use TinEye to find out where and how an image appears on the web, even if it has been cropped or heavily modified.



                      #11
                      I'd still like to see how they're doing the searches on the "heavily modified" images :P


                        #12
                        Check out their beta; it works pretty well.
