Yes, that's certainly true. When image metadata is used for tagging, the best model, IMHO, is to compile the tags into a single file before release and ship that file with the game. Only the people doing the tagging need to download the latest version of the images. If every image pack is tagged by only one person, nobody has to download the images repeatedly. In addition, many game releases contain a complete set of their images anyway.
PyTFall's previous system of tags based on image file names has the same problem, by the way (a tag change requires redownloading the image).
With that being said, I do believe that collaboration between many people is easier with tags stored in text files (JSON, XML, INI, ...), for several reasons (file size, version control tools and editing tools, to name a few). Therefore, I intend to support text-file based tagging with Image Tagger at some point. Of course, text files also have several disadvantages. For example, any image that is renamed or moved into another folder loses its tags. If you combine a given image archive with a tag text file for an earlier/later "version" of this image archive (moving/renaming/adding/removing files changes the "version" of the image archive), some images will not find their tags because their path has changed. There is no easy way to get around this disconnection between tags and the images they describe. Part of a solution could be a checksum calculated from the image files. However, this would break as soon as the content of an image file is changed (image optimization to reduce size, background removal and so on).
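To make the checksum idea a bit more concrete, here is a rough sketch of how tags could be re-linked to renamed or moved files. Everything in it is an assumption for illustration: the function names (file_checksum, relink_tags), the layout of the tag data (relative path mapped to a checksum and a tag list) and the use of Python and SHA-256. It is not how Image Tagger or PyTFall currently work.

```python
import hashlib
from pathlib import Path

# Hypothetical list of file types treated as images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif"}


def file_checksum(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relink_tags(tag_data, archive_root):
    """Match tag entries to files by path first, then by checksum.

    tag_data (assumed layout): {relative_path: {"checksum": str, "tags": [...]}}
    Returns (relinked, orphaned). relinked is keyed by the path found on
    disk; orphaned holds entries for which no file could be found.
    """
    root = Path(archive_root)

    # Index every image on disk by its checksum (first path wins on duplicates).
    on_disk = {}
    for candidate in sorted(root.rglob("*")):
        if candidate.is_file() and candidate.suffix.lower() in IMAGE_SUFFIXES:
            relative = candidate.relative_to(root).as_posix()
            on_disk.setdefault(file_checksum(candidate), relative)

    relinked, orphaned = {}, {}
    for old_path, entry in tag_data.items():
        if (root / old_path).is_file():
            relinked[old_path] = entry            # path still valid
        elif entry["checksum"] in on_disk:
            relinked[on_disk[entry["checksum"]]] = entry  # moved or renamed
        else:
            orphaned[old_path] = entry            # deleted or content changed
    return relinked, orphaned
```

Note that an optimized or otherwise edited image ends up in "orphaned" if it was also moved, which is exactly the weakness described above.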
Adding support for text file based tagging to Image Tagger will require several steps:
1) Allow Image Tagger to export its tag memory for a particular image archive into a text file, probably a JSON file in the same format PyTFall currently uses.
At this point, tagging is still based on XMP metadata in the image files. However, Image Tagger users can create a snapshot of the tag memory in the form of a text file and distribute that file.
2) Allow Image Tagger to import text file based tag data and adapt the tags in its tag memory accordingly.
This creates the danger of destroying lots of tagging data if a user imports the wrong tag data text file and does not check the results of this import. There is also the question of how to handle images that are described in the tag data but not present in the file system (path changed or file deleted). Once this step is done, XMP metadata is no longer necessary.
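The two steps could look roughly like the sketch below. It is only an illustration under several assumptions: the tag memory is modeled as a plain dict of relative path to a set of tags, the JSON layout is a flat path-to-tag-list mapping (not necessarily the format PyTFall uses), and the names export_tags, plan_import and apply_import are made up. The import is split into a "plan" that changes nothing and a separate "apply", so a user can check the result before any tags are overwritten.

```python
import json
from pathlib import Path


def export_tags(tag_memory, json_path):
    """Step 1: write a snapshot {relative_path: sorted tag list} as JSON."""
    snapshot = {path: sorted(tags) for path, tags in tag_memory.items()}
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2, sort_keys=True)


def plan_import(tag_memory, json_path, archive_root):
    """Step 2a: compare a tag file with the current state, changing nothing."""
    with open(json_path, encoding="utf-8") as handle:
        incoming = json.load(handle)

    root = Path(archive_root)
    plan = {"changed": {}, "unchanged": [], "missing": []}
    for path, tags in incoming.items():
        if not (root / path).is_file():
            plan["missing"].append(path)          # described but not on disk
        elif set(tags) == set(tag_memory.get(path, ())):
            plan["unchanged"].append(path)
        else:
            plan["changed"][path] = sorted(tags)  # would overwrite current tags
    return plan


def apply_import(tag_memory, plan):
    """Step 2b: apply a reviewed plan; returns a new dict, input is untouched."""
    updated = {path: set(tags) for path, tags in tag_memory.items()}
    for path, tags in plan["changed"].items():
        updated[path] = set(tags)
    return updated
```

If the "missing" list is long or most entries show up under "changed", that is a strong hint that the wrong tag file was picked, and the import can be cancelled without any loss.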
Step 1) is easy and can be accomplished in about a week (depending on free time). I mean, if I get lucky, this could be done in a day, but experience shows I'm never that lucky...
Step 2) is harder due to the problems created by the disconnection between metadata and the image file introduced by abandoning XMP metadata.