Commons:Bots/Requests/AuntuBot
AuntuBot (talk · contribs)
Operator: Tausheef Hassan (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought:
1. Under Template:EdictGov-Bangladesh/Gazette, Bangladeshi government gazettes are explicitly covered by a copyright exemption, as they are official publications of the Government of Bangladesh.
I have compiled a dataset of 2,695 weekly and 56,751 extraordinary gazettes, available here and here, spanning from 1968 to the present. The bot’s task will be to upload these PDFs to Commons. Once the backlog is cleared, the bot will upload new gazettes weekly.
Educational value:
- They provide authoritative, verifiable sources for the exact dates on which ordinances, acts, rules, and statutory orders are issued or enforced.
- They serve as primary evidence for establishing the creation, dissolution, or restructuring of government institutions, statutory bodies, and public organizations, supporting legal and historical accuracy.
- The gazettes contain official notifications, appointments, regulations, and policy decisions that can be used to verify and support factual statements in Wikipedia articles.
- As state-published primary sources, they are frequently cited in legal research, academic studies, journalism, and court proceedings, reinforcing their reliability and educational significance.
- Uploading these documents to Wikimedia Commons will significantly improve public access to primary legal and administrative sources, supporting education, research, and transparency. The DPP website frequently experiences downtime, and its search function is unreliable.
2. In 2018, the Bangladeshi government attempted to implement a draft National Open Education Resource Policy, under which educational resources would have been distributed under a Creative Commons license. As part of this initiative, Bangladesh Open University, the 8th largest university in the world by enrollment, has made 377 courses publicly available. These courses range from high school to master’s level and are entirely in Bengali.
All courses are licensed under a Creative Commons 4.0 International License. License data can be found here. Each course has 10 to 15 PDFs. The bot's task will be to upload these files under the proper categories.
These courses are currently not indexed by search engines and are stored on a server reachable only by IP address, http://103.103.100.12:8080/jspui/ . I personally used this content in high school, when it was scattered across various locations. Uploading will consolidate these files in one place, greatly help Bengali students, and serve Wikimedia Commons' goal of free, inclusive education for all regardless of language.
I previously wrote all the source code for User:PID-Bangladesh-UploadBot and managed its 60k uploads. Its images are used very widely across Wikimedia projects as well as by independent journalists. I have also made the Pypan tool, which has uploaded 57,256 files, 0.04% of all files on Wikimedia Commons. I plan to use the same method for these uploads.
Trial run:
(The bot account was recently created, and under Special:AbuseFilter/281 new users cannot upload PDF files, so autopatrol or confirmed rights would be appreciated.)
Automatic or manually assisted: Automatic
Edit type (e.g. Continuous, daily, one time run): One time for backlog, Once a week after backlog
Maximum edit rate (e.g. edits per minute): 12
Bot flag requested: (Y/N): Yes
Programming language(s): Python with pywikibot
Tausheef Hassan (talk) 21:11, 17 January 2026 (UTC)
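For illustration only, the requested maximum of 12 edits per minute works out to one upload every 5 seconds. The sketch below shows one way such a limit could be enforced in plain Python. The `Throttle` class and its parameter names are hypothetical, not part of the bot's actual code or of pywikibot, which has its own throttling settings.

```python
import time

MAX_EDITS_PER_MINUTE = 12  # value taken from the request form
MIN_INTERVAL = 60.0 / MAX_EDITS_PER_MINUTE  # 5.0 seconds between uploads


class Throttle:
    """Hypothetical helper: keeps consecutive wait() calls at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock    # injectable so the logic can be checked without real waiting
        self._sleep = sleep
        self._last = None      # time of the previous permitted edit, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `Throttle(MIN_INTERVAL).wait()` before each upload would hold the run at or below the stated rate.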
- @Tausheef Hassan: Would you mind explaining how you chose the bot's name? I'm a little concerned that it's so similar to my user name. Thanks. -- Auntof6 (talk) 23:12, 4 February 2026 (UTC)
- @Auntof6: My full legal name is Tausheef Hassan Auntu. Most people know me by my nickname "Auntu" (অন্তু) Tausheef Hassan (talk) 04:49, 5 February 2026 (UTC)
- @Tausheef Hassan: OK, thanks. I didn't think you would misuse it, just that it might be confusing. :) -- Auntof6 (talk) 04:56, 5 February 2026 (UTC)
- @Auntof6: I understand your concern, I can apply to change the username if you want. Tausheef Hassan (talk) 05:08, 5 February 2026 (UTC)
- @Tausheef Hassan: Not on my account. If people get confused, it can be explained. -- Auntof6 (talk) 05:39, 5 February 2026 (UTC)
- Discussion
Does it really make sense to use monthly categories for weekly publications? --EugeneZelenko (talk) 15:46, 18 January 2026 (UTC)
- @EugeneZelenko: I have not finalized the categorization structure yet. The files are auto-categorized by Module:Bangladesh Gazette, and the existing categories are only a proof of concept. I plan to finalize the categorization in discussion with the community. For now, I have changed the weekly gazettes to be categorized on a yearly basis. Tausheef Hassan (talk) 16:13, 18 January 2026 (UTC)
- Please make a new test run. EugeneZelenko (talk) 16:15, 18 January 2026 (UTC)
- @EugeneZelenko: Categorization is not performed during the upload process. Running a new test will produce results identical to previous runs. Categorization is handled entirely by the module, and any changes made to the module will affect the categorization of both existing and future files. The current test run files are now categorised yearly without any edits to the files. Tausheef Hassan (talk) 16:30, 18 January 2026 (UTC)
- But such categories could definitely be added during uploading. EugeneZelenko (talk) 16:15, 19 January 2026 (UTC)
- @EugeneZelenko: Yes, technically that could be done, but I do not think it should be.
These categories follow a recurring and consistent structure across a very large number of files. Per COM:T (“recurring messages to pages consistently”), I think such recurring structures should be implemented via templates or modules.
License templates already place files into their primary license categories on Commons. The module I am using only groups files into appropriate subcategories under the parent license category. From my experience with large-scale uploads, any recurring text or categorization logic should be handled by templates or modules. This approach avoids human error and prevents the need for tens of thousands of mass edits later. The main reason for this approach is that errors can be fixed without mass editing. I have personally encountered this problem in past uploads.
Module-based categorization provides centralized control over both existing and future files and allows flexible recategorization, refinement, or restructuring if new file types are introduced or if community consensus changes. Automatic categorization can still be overridden manually on individual files when needed.
Based on my experience with large batch uploads, recurring categorization structures should be implemented via templates or modules rather than during the upload process. This keeps bot runs deterministic and avoids embedding provisional or potentially disputed categorization decisions directly into file pages. Tausheef Hassan (talk) 04:27, 20 January 2026 (UTC)
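As a rough illustration of the centralized approach described above, the Python sketch below derives a category from a gazette's type and date using a single rule table. The real logic lives in Module:Bangladesh Gazette, not in Python. The category names, the `GRANULARITY` table, and the monthly grouping for extraordinary gazettes are assumptions made for the example.

```python
from datetime import date

# Hypothetical rule table: gazette kind -> how finely its categories are split.
# Weekly gazettes are grouped by year, as stated in the discussion;
# the "month" rule for extraordinary gazettes is an assumption.
GRANULARITY = {"weekly": "year", "extraordinary": "month"}


def category_for(kind, published, granularity=GRANULARITY):
    """Return an illustrative category name for one gazette."""
    level = granularity[kind]
    base = f"{kind.capitalize()} gazettes of Bangladesh"  # hypothetical naming scheme
    if level == "year":
        return f"{base}, {published.year}"
    return f"{base}, {published.year}-{published.month:02d}"
```

Because the rule sits in one place, changing a single entry in the table recategorizes every file, old and new, without touching any file page. That is the property the module is meant to provide.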
- @EugeneZelenko: @Krd: I would love to start uploading soon. Tausheef Hassan (talk) 02:43, 26 January 2026 (UTC)
- Under current community consensus in Commons:Deletion requests/Template:EdictGov-Bangladesh, I will only be uploading government files published before 2023-09-18. Tausheef Hassan (talk) 19:11, 14 February 2026 (UTC)
- The above mentioned test edits are now nominated for deletion. Please elaborate. Krd 15:17, 23 February 2026 (UTC)
- @Krd:
- Bangladesh transitioned from a UK-style copyright framework to a US-style fair use system with the enactment of the Copyright Act, 2023. Prior to this, from 2000 to 2023, copyright was governed by the Copyright Act, 2000, and earlier, from 1962 to 2000, by the Copyright Ordinance, 1962. Importantly, none of these laws operate retroactively.
- Initially, I relied on COM:Bangladesh and {{EdictGov-Bangladesh}} at face value. However, after the deletion proposal at Commons:Deletion requests/Template:EdictGov-Bangladesh was raised, I conducted a detailed review of both current and historical Bangladeshi copyright law. Based on that review, it is clear that COM:Bangladesh and Bangladeshi license templates are outdated and do not accurately reflect the legal framework. I have updated {{EdictGov-Bangladesh}} after the discussion.
- From my understanding, works published before 2023-09-18 fall within the scope of {{EdictGov-Bangladesh}}, and therefore can be reused under that exemption. The legal reasoning and supporting analysis are outlined in the deletion request discussion.
- For the trial run, my bot uploaded a small set of recent files from the dataset. Since these files were published after 2023, they do not fall under the exemption and therefore lack the necessary permission. This was an error in selection for testing purposes. However, the dataset still contains approximately 55,000 files published before 2023 that do meet the criteria and can be uploaded.
- In parallel, I am working on a comprehensive update of COM:Bangladesh to reflect the current and historical legal framework. My review so far suggests that:
- Buildings and physical structures are not protected by copyright in Bangladesh (i.e., they are not copyrightable subject matter).
- The term of copyright under the 1962 Ordinance was 50 years, and the 2000 Act did not introduce a retroactive term extension for works already governed by the earlier law.
- I am currently preparing a detailed proposal for the Village Pump to formalize these interpretations and update Commons guidance accordingly.
- For the purposes of this bot request, I will follow the current community consensus at Commons:Deletion requests/Template:EdictGov-Bangladesh, which indicates that government works published before 2023-09-18 are acceptable for upload. The bot will therefore be restricted to uploading only those files that clearly meet this criterion. - Tausheef Hassan Auntu (talk) 12:40, 24 February 2026 (UTC)
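The restriction described above could be applied to the dataset before any upload is attempted. The following is a minimal sketch, assuming each gazette is represented as a dictionary with a `published` date. The record layout and the `eligible` function are hypothetical and are not taken from the bot's code.

```python
from datetime import date

CUTOFF = date(2023, 9, 18)  # only works published before this date are uploaded


def eligible(records, cutoff=CUTOFF):
    """Keep records whose publication date is known and strictly before the cutoff.

    Records with no date are skipped, since they do not clearly meet the criterion.
    """
    return [
        r for r in records
        if r["published"] is not None and r["published"] < cutoff
    ]
```

A gazette dated 2023-09-17 would pass this filter, while one dated 2023-09-18 or one with an unknown date would be left out.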
- New Trial run: @Krd:
- Tausheef Hassan Auntu ✉Talk? 15:23, 26 February 2026 (UTC)
- Please summarize why these files are in project scope of Wikimedia Commons. --Krd 09:03, 14 March 2026 (UTC)
- @Krd: According to COM:SCOPE, these 60,000+ pre-2023 government gazettes fall within the project scope for allowable PDF formats. As the highest level of official state publications, they are essential texts for Bengali Wikisource. This directly aligns with the policy that "A PDF or DjVu file of a published and peer-reviewed work would be in scope on Wikisource and is therefore also in scope on Commons." Additionally, these gazettes have served as the state's official public record since 1968. Because of their immense historical and administrative significance, they perfectly fit the criterion that "The file is a scan of a document of historic or other external significance, e.g. scans of existing copyright-free or licensed books, reports, newspapers, etc." Hosting these files ensures that "The file is usable as a fixed, verifiable source document, e.g. for Wikisource or Wikibooks." These documents can provide reliable and permanent sources for Wikipedia articles and primary texts for Bengali Wikisource. ≈ MS Sakib 📩 ·📝 17:45, 14 March 2026 (UTC)
- @Krd: is there any problem with the bot request? Tausheef Hassan Auntu ✉Talk? 08:16, 31 March 2026 (UTC)
Approved. --Krd 16:08, 2 April 2026 (UTC)