 Benchathlon
query categories  Henning Müller
 Oct 19, 2001 02:48 PDT 
Dear Benchathletes,

Here is a list of possible query categories that we could have for the
Benchathlon. I sent this list a while ago, but I was told that Topica
had problems with attachments, so I am sending it again as plain text.

Please comment on which other categories you can imagine for evaluating
retrieval systems. This is important for creating a general benchmarking
harness.
I think for this year's benchmark we should concentrate on presenting
a framework for query by example and for the evaluation of relevance
feedback in the participating systems.

None of the performance measures is fixed yet. I think we should start
out with a larger number of performance measures and then compare them
to find out which measures carry different information about a system.
Any suggestions for other performance measures?

Henning

--
1.) Looking for a specific image
1.1) Checking whether the exact same image is in the database
     goal: How fast can a system find this out?
     measures: Response time for a correct answer
               Accuracy of the reply, number of correct answers
1.2) Checking whether the query image is part of an image in the
     database
     goal: How quickly and how accurately does a system find part of
           an image, where accuracy might be more important than time
     measures: Response time for a correct answer
               Accuracy of the reply, positions of the relevant images
1.3) Checking whether a geometrically altered image is part of an image
     in the database
     goal: How quickly and how accurately does a system find part of
           an image, where accuracy might be more important than time
     measures: Response time for a correct answer
               Accuracy of the reply, positions of the relevant images
1.4) Checking whether a compressed version of an image is in the
     database (i.e. strong JPEG compression)
     goal: How quickly and how accurately does a system find a
           compressed image
     measures: Response time for a correct answer
               Accuracy of the reply, positions of the relevant images
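
To make the two measures in category 1 concrete, here is a minimal
sketch of how a harness might record response time and the position of
the target image. It assumes a system can be wrapped as a function that
takes a query and returns a ranked list of image ids; the names
timed_query and position_of are hypothetical, not part of any agreed
Benchathlon interface.

```python
import time


def timed_query(search, query):
    """Run search(query) and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = search(query)
    elapsed = time.perf_counter() - start
    return results, elapsed


def position_of(results, target_id):
    """Return the 1-based position of target_id in results, or None."""
    for pos, item in enumerate(results, start=1):
        if item == target_id:
            return pos
    return None


if __name__ == "__main__":
    # Stand-in for a real system: always returns the same ranked list.
    def toy_search(query):
        return ["img7", "img3", "img9"]

    results, elapsed = timed_query(toy_search, "img3")
    print("position:", position_of(results, "img3"))
    print("response time (s):", elapsed)
```

Wall-clock time measured this way includes everything inside the
wrapped function, so network and interface overhead would have to be
accounted for separately if systems are queried remotely.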

2.) Looking for a number of similar images
2.1) Query by example with known ground truth
     goal: Find images that are relevant for a certain query image
     measures: normalized average rank (see BIRDS-I) as a leading
                 indicator
               precision/recall graph
               precision and recall at certain important cutoff points
               rank of the first relevant image other than the query
                 image
               average rank
               primary recall
2.2) Evaluation of positive and/or negative feedback
     goal: How well can the results be improved with feedback, and in
           how many feedback steps?
     measures: Possibly the measure of secondary recall etc. proposed
                 by C. Leung
               Can the same measures be used as for the first query
                 step, to allow a comparison between the two?
2.3) How well can a system adapt the output for the same starting image
     but with different ground truth sets
     goal: How well can the system adapt the output to the needs of
           different users?
     measures: the same measures as before but with different relevance
                 sets
               Can we get different relevance sets from the ground
                 truth?
               Can we use the same measures as stated before and
                 average them over the different relevance sets?
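
As a rough illustration of the rank-based measures listed under 2.1,
the sketch below computes precision and recall at a cutoff, the ranks
of the relevant images, and one possible form of a normalized average
rank. Assumptions: the ranked list covers the whole collection with the
query image already removed, ranks are 1-based, and the normalization
shown (0 for a perfect ranking) is only one plausible definition; the
BIRDS-I definition may differ. All function names are hypothetical.

```python
def precision_recall_at(ranked, relevant, cutoff):
    """Precision and recall over the first `cutoff` results."""
    hits = sum(1 for item in ranked[:cutoff] if item in relevant)
    return hits / cutoff, hits / len(relevant)


def relevant_ranks(ranked, relevant):
    """1-based ranks at which the relevant images appear."""
    return [i for i, item in enumerate(ranked, start=1) if item in relevant]


def normalized_average_rank(ranked, relevant):
    """Sum of relevant ranks minus the best possible sum, scaled by
    collection size and number of relevant images (assumed form)."""
    ranks = relevant_ranks(ranked, relevant)
    n, n_rel = len(ranked), len(relevant)
    best = n_rel * (n_rel + 1) / 2  # rank sum of a perfect ranking
    return (sum(ranks) - best) / (n * n_rel)


if __name__ == "__main__":
    ranked = ["a", "x", "b", "y", "z", "w"]   # toy system output
    relevant = {"a", "b"}                     # toy ground truth
    ranks = relevant_ranks(ranked, relevant)
    print("P/R at 3:", precision_recall_at(ranked, relevant, 3))
    print("first relevant rank:", ranks[0])
    print("average rank:", sum(ranks) / len(ranks))
    print("normalized average rank:",
          normalized_average_rank(ranked, relevant))
```

For 2.3, the same functions could be called once per relevance set and
the results averaged, which is one way to answer the last question
above.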

3.) Target search (also called image browsing); the image searched for
    is not given as an input
3.1) How quickly can an image be found while browsing
     goal: Find an image as quickly as possible
     measure: Number of images that have to be viewed before the
              correct one is found

4.) Application
4.1) Inserting an image into the database
     goal: time it takes to insert an image
     measures: time
4.2) Inserting an image into the database and finding a known image
     similar to this one
     goal: time it takes to insert an image and how accurate the
           response is
     measures: time and accuracy

5.) Looking for a sketch of an image (incomplete information)
5.1) How well can a sketch of an image be found?
     goal: speed and accuracy of the reply
     measures: time and accuracy

6.) Tests where two systems are explicitly compared
    see the article by A. Dimai at Visual 99

7.) Tests for special application areas such as trademarks or medical
    imaging
    Are the measures really different, or can the same measures be used
    as before, just applied to a different set of ground truth and
    images?

8.) Measure the scalability of a CBIR system
8.1) Scalability with respect to a large collection size
     (10,000; 100,000; 1,000,000 images)
     goal: Measure the time it takes with collections of different
           sizes, to be able to extrapolate the response time to even
           larger collection sizes
     measures: change in time with respect to the collection size
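
One way the estimate for larger collections in 8.1 might be obtained is
to fit a simple model to the measured times. The sketch below assumes
the response time follows a power law t = a * n^b and fits it by least
squares in log-log space; that model choice, the function names, and
the timings in the example are all illustrative placeholders, not
measurements from any system.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(a) + b * log(n).

    Returns (a, b) for the model t = a * n**b.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_time(a, b, size):
    """Predicted response time for a collection of `size` images."""
    return a * size ** b


if __name__ == "__main__":
    sizes = [10_000, 100_000, 1_000_000]
    times = [0.2, 2.0, 20.0]  # placeholder timings in seconds
    a, b = fit_power_law(sizes, times)
    print("exponent b:", b)
    print("predicted time for 10,000,000 images:",
          predict_time(a, b, 10_000_000))
```

The fitted exponent b is itself a compact scalability measure: a value
near 1 suggests roughly linear growth with collection size, while a
value well below 1 suggests effective indexing. With only three
collection sizes the fit is weak, so any prediction should be treated
with caution.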

9.) Evaluation of CBIR interfaces
    goal: Find the most efficient user interface for a certain task
    measures:   
--
     ----------------------------------------------------------
     Henning Mueller, Computer Vision Group
     Computer Science Department, University of Geneva
     24, rue du General Dufour, CH-1211 Geneva 4, SWITZERLAND
     Phone : +41(22)705 7633; fax: +41(22)705 7780
     Henning.-@cui.unige.ch
     ----------------------------------------------------------
	