User:Plantdesign
Operator: fleuhusen1
Tasks: This tool will NOT edit Wikimedia Commons. It will:
- Query Commons via the MediaWiki API to retrieve file metadata (title, author/credit, license, categories, structured data, and file URLs)
- Download only a targeted subset of files matching specific criteria (e.g., license types, plant taxa keywords/categories), storing attribution + license info alongside each file
- Respect Commons caching, pagination, and rate limits; no aggressive crawling of HTML pages
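The retrieval and filtering steps listed above could look roughly like the following. This is a minimal sketch, not the tool's actual code. The helper names (`build_params`, `iter_pages`, `to_record`, `select_records`), the license allow-list, and the injected `fetch` callable are all hypothetical. The query parameters (`generator=categorymembers`, `prop=imageinfo`, `iiprop=url|extmetadata`, `maxlag`) and the `continue` mechanism are standard MediaWiki API features.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Set

API_URL = "https://commons.wikimedia.org/w/api.php"

# Hypothetical allow-list; the real criteria would be set per run.
ALLOWED_LICENSES = {"cc0", "cc-by-4.0", "cc-by-sa-4.0"}


def build_params(category: str, limit: int = 50) -> Dict[str, str]:
    """Parameters for listing the files in one category with their metadata."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "categorymembers",
        "gcmtitle": category,
        "gcmtype": "file",
        "gcmlimit": str(limit),
        "prop": "imageinfo",
        "iiprop": "url|extmetadata",
        "maxlag": "5",  # ask the servers to refuse the request when replication lags
    }


def iter_pages(fetch: Callable[[Dict[str, str]], dict],
               params: Dict[str, str]) -> Iterator[dict]:
    """Yield pages from every batch, following the API's 'continue' block."""
    cont: Dict[str, str] = {}
    while True:
        data = fetch({**params, **cont})
        yield from data.get("query", {}).get("pages", [])
        if "continue" not in data:
            break
        cont = data["continue"]


def to_record(page: dict) -> Optional[dict]:
    """Extract the file URL plus attribution and license fields from one page."""
    infos = page.get("imageinfo") or []
    if not infos:
        return None
    meta = infos[0].get("extmetadata", {})

    def value(key: str) -> str:
        return meta.get(key, {}).get("value", "")

    return {
        "title": page.get("title", ""),
        "url": infos[0].get("url", ""),
        "author": value("Artist"),  # may contain HTML markup
        "credit": value("Credit"),
        "license": value("License"),
        "license_name": value("LicenseShortName"),
    }


def select_records(pages: Iterable[dict], allowed: Set[str]) -> Iterator[dict]:
    """Keep only files whose license identifier is on the allow-list."""
    for page in pages:
        record = to_record(page)
        if record and record["license"].lower() in allowed:
            yield record
```

In this sketch `fetch` is any callable that sends the parameters to `API_URL` (with a descriptive User-Agent) and returns the decoded JSON. Keeping the network call out of these functions allows the pagination and filtering logic to be tested offline.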
Operation: Fully automated fetching, with manual oversight (I will monitor logs and adjust if any issues arise).
When: On a schedule (e.g., a few runs per day); runs will be paused immediately if any performance concerns are reported.
Maximum edit rate: N/A (no edits). API request rate will be throttled (e.g., no more than 1 request/second; lower if advised).
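One way the request throttle could be enforced is sketched below. It assumes a single-threaded client; the `Throttle` class and its parameters are hypothetical and not part of any MediaWiki library. The one-second default mirrors the example limit stated above and can be raised if a lower rate is advised.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Block until the next request is allowed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` immediately before each API request or file download keeps the client at or below one request per `min_interval` seconds. The clock and sleep functions are injectable only so that the behaviour can be checked without real waiting.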
Language: Python (using a MediaWiki API client with Pywikibot-style API calls), running on my own infrastructure.