Anonymous
>>499783 It indexes everything -- there's a link to a reconstruction of the threads on the image page to demonstrate this.
The current search mechanism is a fulltext search across the post comment and subject fields (fulltext meaning that every word of at least three characters is indexed, except for a list of stopwords like 'the', 'from', etc.).
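To make the indexing rule concrete, here is a minimal sketch of that kind of fulltext index. It is not the site's actual code: the stopword list, the post field names (`id`, `subject`, `comment`) and both function names are assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Assumed stopword list; the real list is not given in the post.
STOPWORDS = {"the", "from", "and", "for", "with", "that", "this"}
MIN_WORD_LEN = 3  # words shorter than this are not indexed


def tokenize(text):
    """Return lowercase words of at least MIN_WORD_LEN characters, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOPWORDS]


def build_index(posts):
    """Map each indexed word to the set of ids of the posts containing it."""
    index = defaultdict(set)
    for post in posts:
        # Both the subject and the comment field are indexed.
        for word in tokenize(post["subject"] + " " + post["comment"]):
            index[word].add(post["id"])
    return index
```

Short words and stopwords simply never make it into the index, so searching for them returns nothing.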
When you make a search, it pulls all posts which match the query, plus all posts in any thread whose OP matches the query. From this set of posts, it grabs all of the images and presents those as the results.
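A rough sketch of that lookup, under some assumptions: posts are plain dicts with hypothetical keys `thread_id`, `is_op`, `text` and `image`, and "matches the query" is taken to mean that every query word appears in the post text. The real matching is done by the fulltext index and may behave differently.

```python
def search_images(query, posts):
    """Return images from matching posts and from threads whose OP matches."""
    words = query.lower().split()
    if not words:
        return []

    def matches(post):
        # Assumption: a post matches when it contains every query word.
        post_words = set(post["text"].lower().split())
        return all(w in post_words for w in words)

    # Threads whose opening post (OP) matches the query.
    matching_threads = {p["thread_id"] for p in posts if p["is_op"] and matches(p)}

    images = []
    for p in posts:
        if matches(p) or p["thread_id"] in matching_threads:
            # Skip posts without an image and avoid listing an image twice.
            if p["image"] is not None and p["image"] not in images:
                images.append(p["image"])
    return images
```

Note that a textless image post still shows up if it sits in a thread whose OP matches, but not if the thread's OP is unrelated to the query.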
As you've noted, this is a bit of a hassle for posts which have an image but no text. There are a couple of features I haven't gotten around to re-implementing yet that somewhat mitigate this -- the first and most obvious is to include the original filename in the text search. This is useful because every image (excepting the 20k legacy images pulled from an older database) has one. Additionally, people (on /w/ and /wg/, at least) tend to organize their personal collections by filename, so it'll likely be relevant to the image contents.
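For illustration, folding the filename into the searchable text could look something like the sketch below. The splitting rule (break on anything that is not a letter or digit, drop the extension, keep words of three or more characters) and both function names are assumptions, not the planned implementation.

```python
import os
import re


def filename_terms(original_filename):
    """Split a filename into lowercase search terms, ignoring the extension."""
    stem, _ext = os.path.splitext(original_filename)
    parts = re.split(r"[^a-z0-9]+", stem.lower())
    # Same minimum length as the fulltext index: three characters.
    return [w for w in parts if len(w) >= 3]


def searchable_text(comment, original_filename):
    """Combine the post comment with the terms taken from the filename."""
    return " ".join([comment] + filename_terms(original_filename)).strip()
```

With something like this, an image posted with no comment but a descriptive filename still ends up with a few words to search on.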
The other "feature" is twofold -- the image display page (which currently shows the image thumbnail and all the posts it was posted in) will have a small set of "related images" at the bottom. These images are chosen using various bits of metadata, both scraped information (i.e., post contents) and stuff which can be extracted from the image itself (dominant colors, fuzzy analysis to find detexts, etc). So even if there isn't a comment associated with an image, I might be able to pull out enough metadata to make it properly searchable (or at least, some approximation of that).
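One possible way to pick related images from that kind of metadata is to treat each image's metadata as a set of tags and rank other images by overlap. The sketch below uses Jaccard similarity, which is my own stand-in and not necessarily the scoring that will be used; the tag format, the function names and the `limit` parameter are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two tag sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_images(target, metadata, limit=4):
    """metadata maps an image id to its set of tags (words, colors, etc)."""
    scores = [
        (jaccard(metadata[target], tags), image_id)
        for image_id, tags in metadata.items()
        if image_id != target
    ]
    # Drop images with nothing in common, then rank by score (ties by id).
    scores = [(s, i) for s, i in scores if s > 0]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _score, image_id in scores[:limit]]
```

Because the tags can come from the image itself (e.g. a dominant color), an image with no comment at all can still be related to others.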