Plan was to use the categories, yes. Like I've said, our whole tagging strategy will revolve around those... I'd love to hear the sorting strategies but I am not sure how much I trust Pytherworld's fallbacks at its stage of development and testing.
Yeah, I'd also rather fix my own bugs than someone else's...
You should know though that dynamic image selection based on tags has always been the core development effort in Pytherworld. The current system is the fourth or fifth major iteration/rewrite. It's not untested any more. I do concede that heavier usage will surely bring many solvable problems to the surface, but I'm confident that the basic approach will work. I have used this code with a database of about 15000 images and 180000 tags.
Now, let me take you through the process of selecting an image for a task in Pytherworld in order to explain the sorting strategy at work.
Selecting an image for a task in Pytherworld
Each image search starts with the complete database of images. Steps 1-3 are filtering steps that drastically reduce the size of the database. Steps 4-6 are concerned with sorting the remaining images according to how well they match the ingame situation.
1) In the first step, most images are excluded from the database to narrow down the search. This is done by removing all images that do not have a main character with the same race and hair color as the girl performing the task we want to find an image for.
2) Tasks may specify tags that must be on the image displaying the task. These are called required tags. All images that do not carry all required tags are removed from the database.
3) Tasks may also specify tags that must not be on the image displaying the task. These are called excluded tags. All images that carry any excluded tag are removed from the database.
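To make the three filtering steps concrete, here is a minimal sketch in Python. It is not Pytherworld's actual code: the `Image` class, its field names and the `filter_images` function are made up for illustration, and a real database would use queries rather than a list comprehension.

```python
from dataclasses import dataclass


@dataclass
class Image:
    # Hypothetical record; the field names are assumptions, not Pytherworld's schema.
    name: str
    race: str
    hair_color: str
    tags: frozenset = frozenset()


def filter_images(images, race, hair_color, required=(), excluded=()):
    """Apply filtering steps 1-3 and return the images that survive."""
    required = set(required)
    excluded = set(excluded)
    return [
        img for img in images
        # Step 1: main character must match the girl's race and hair color.
        if img.race == race and img.hair_color == hair_color
        # Step 2: the image must carry ALL required tags.
        and required <= img.tags
        # Step 3: the image must not carry ANY excluded tag.
        and not (excluded & img.tags)
    ]


images = [
    Image("a.png", "elf", "blonde", frozenset({"cooking", "indoors"})),
    Image("b.png", "elf", "blonde", frozenset({"cooking", "messy"})),
    Image("c.png", "elf", "black", frozenset({"cooking", "indoors"})),
    Image("d.png", "elf", "blonde", frozenset({"indoors"})),
]
remaining = filter_images(
    images, race="elf", hair_color="blonde",
    required={"cooking"}, excluded={"messy"},
)
print([img.name for img in remaining])
```

Only `a.png` passes here: `b.png` carries an excluded tag, `c.png` has the wrong hair color, and `d.png` lacks the required tag.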
Now our goal is to sort all remaining images according to how well they represent the task we want to visualize.
4) Our first consideration is the looks of the girl performing the task. We will create a dictionary that contains all tags that can be used to describe the looks of a character and assign each tag a score between -500 and +500, depending on how similar that tag is to the looks of the girl. Example:
Our girl has light blonde hair. The tag "lightblonde hair" will get +500. "darkblonde hair" will get +400. "black hair" will get -500 and so on.
5) At this point, the image ranking rules specified by the task, the scoring tags, come into play. Every task may provide a dictionary of tags and scores which reflects that some tags may be nice to have but are not essential (positive score), while others should probably not be on the task image, but are no dealbreakers either (negative score).
The scoring tags are combined with the dictionary from 4) to create the scoringdict. Using this dict, we can calculate a score from the tags on each image and therefore rank all images according to how well they reflect our task and girl.
6) Based on the ranking performed in 5), we would like to select an image that reflects our task well. The best match is the image with the highest score, but always choosing the same image is boring. We don't know how many other good images there are, so we want a selection of images with high scores to choose from. To that end, we take the scoringdict and calculate the maximal score that can theoretically be achieved in this ranking. Then we reduce the maximal score by a user-defined percentage (5 to 25% are useful values) and get a threshold score that each image must reach to get into the final selection. If no image reaches the threshold, we know our search did not yield any good matches and can take countermeasures or just lower the threshold. Another possible route is to start a search for a profile image, which is the fallback strategy used in Pytherworld.
7) From our final selection of nicely matching images we simply take the ten with the highest score and randomly pick one.
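Steps 4-7 can be sketched roughly like this in Python. Again, this is only an illustration under assumptions, not the real implementation: all function names are invented, overlapping tags are combined by simply adding their scores, and the "theoretical maximum" is taken as the sum of all positive scores (which overestimates when tags exclude each other, such as two different hair colors).

```python
import random


def build_scoring_dict(looks_scores, task_scores):
    """Steps 4+5: merge looks scores and task scoring tags (assumption: sum on overlap)."""
    combined = dict(looks_scores)
    for tag, score in task_scores.items():
        combined[tag] = combined.get(tag, 0) + score
    return combined


def score_image(tags, scoring_dict):
    """Sum the scores of all tags on an image; unknown tags count as 0."""
    return sum(scoring_dict.get(tag, 0) for tag in tags)


def max_score(scoring_dict):
    """Assumed theoretical maximum: an image carrying every positively scored tag."""
    return sum(score for score in scoring_dict.values() if score > 0)


def pick_image(images, scoring_dict, tolerance=0.2, top_n=10, rng=random):
    """Steps 6+7: keep images above the threshold, pick randomly among the best top_n.

    `images` maps image names to sets of tags. Returns None when nothing
    reaches the threshold, so the caller can run its fallback search.
    """
    threshold = max_score(scoring_dict) * (1 - tolerance)
    scored = [(score_image(tags, scoring_dict), name) for name, tags in images.items()]
    good = sorted((entry for entry in scored if entry[0] >= threshold), reverse=True)
    if not good:
        return None
    return rng.choice(good[:top_n])[1]


looks = {"lightblonde hair": 500, "black hair": -500}
task = {"smiling": 100, "outdoors": -50}
scoring_dict = build_scoring_dict(looks, task)

images = {
    "a.png": {"lightblonde hair", "smiling"},   # scores 600
    "b.png": {"lightblonde hair"},              # scores 500
    "c.png": {"black hair", "smiling"},         # scores -400
    "d.png": {"lightblonde hair", "outdoors"},  # scores 450
}
choice = pick_image(images, scoring_dict, tolerance=0.2)
print(choice)
```

With a maximum of 600 and a tolerance of 20%, the threshold is 480, so only `a.png` and `b.png` make the final selection and one of them is picked at random.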