Module:PIDCategoryHelper/doc


Overview

This module provides automated date management and categorization for images from the Bangladesh Press Information Department (PID). It works with the {{Source-PID}}, {{Date-PID}}, and {{PD-BDGov-PID}} templates to:

  • Automatically retrieve publication dates from a centralized database
  • Override manual dates when database dates are available
  • Add appropriate date-based categories
  • Handle historic images with special categorization
  • Validate and normalize URLs for reliable matching

Database structure

The module loads date data from yearly submodules.

Each submodule contains a Lua table mapping URLs to dates:

return {
    ["https://pressinform.gov.bd/path/to/image.jpg"] = "2024-03-15 10:30:00",
    ["https://pressinform.gov.bd/historic/photo.jpg"] = {
        date = "1971-12-16",
        historic = true
    }
}

Dates can be either:

  • Simple string: "2024-03-15 10:30:00" (regular images)
  • Table with metadata: {date = "1971-12-16", historic = true} (historic images)
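
Because an entry can take either shape, code that reads the database has to handle both. The following is a minimal sketch of one way to do that, not the module's actual code; the helper name readEntry is hypothetical.

```lua
-- Hypothetical helper (not part of the documented API): returns the date
-- string and a historic flag for one entry, whichever shape it uses.
local function readEntry(entry)
    if type(entry) == "string" then
        -- Simple string form: a regular image.
        return entry, false
    elseif type(entry) == "table" then
        -- Table form: date plus optional metadata.
        return entry.date, entry.historic == true
    end
    -- Missing or malformed entry.
    return nil, false
end

-- Sample data copied from the structure shown above.
local data = {
    ["https://pressinform.gov.bd/path/to/image.jpg"] = "2024-03-15 10:30:00",
    ["https://pressinform.gov.bd/historic/photo.jpg"] = {
        date = "1971-12-16",
        historic = true
    }
}

local date, historic = readEntry(data["https://pressinform.gov.bd/historic/photo.jpg"])
```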

Functions

checkCategory

Main categorization function called from {{PD-BDGov-PID}}.

Usage:

{{#invoke:PIDDateData|checkCategory}}

Behavior:

  • Extracts URL from {{Source-PID}} template on the page
  • Looks up date in database
  • If no {{Date-PID}} template exists but date is found:
    • Adds [[Category:PID-BD images from Month YYYY]]
    • Adds [[Category:Bangladesh photographs taken on YYYY-MM-DD]]
    • For historic images: adds [[Category:Historic images from PID-BD]] instead
  • Falls back to [[Category:Press Information Department images]] if no date found
  • Adds {{Uncategorized PID Image}} if page lacks meaningful categories
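
The category choice described above can be sketched as a small pure function. This is an illustration under assumptions rather than the module's implementation: the name categoriesFor is hypothetical, the input date is assumed to start with YYYY-MM-DD, and the {{Date-PID}} check and the {{Uncategorized PID Image}} step are left out.

```lua
local MONTHS = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
}

local FALLBACK = "[[Category:Press Information Department images]]"

-- Hypothetical helper: isoDate is a string starting with "YYYY-MM-DD" or nil,
-- historic is a boolean. Returns a list of category wikitext strings.
local function categoriesFor(isoDate, historic)
    if type(isoDate) ~= "string" then
        return { FALLBACK }
    end
    if historic then
        -- Historic images get one dedicated category instead of date categories.
        return { "[[Category:Historic images from PID-BD]]" }
    end
    local year, month, day = isoDate:match("^(%d%d%d%d)-(%d%d)-(%d%d)")
    local monthName = month and MONTHS[tonumber(month)]
    if not monthName then
        -- Unparseable date: treat it as "no date found".
        return { FALLBACK }
    end
    return {
        "[[Category:PID-BD images from " .. monthName .. " " .. year .. "]]",
        "[[Category:Bangladesh photographs taken on " .. year .. "-" .. month .. "-" .. day .. "]]"
    }
end
```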

checkOverride

Overrides manual dates with database dates when available. Called from {{Date-PID}}.

Usage:

{{#invoke:PIDDateData|checkOverride|2024-03-15}}

Parameters:

  • 1: Manual date provided to template

Returns:

  • Database date if available (normalized to ISO format)
  • Manual date if no database match found
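
The precedence rule can be illustrated with a short sketch. It assumes the URL has already been extracted and matched against the database, and it skips the ISO normalization step; the function name chooseDate and its parameters are hypothetical.

```lua
-- Hypothetical helper: prefers the database date over the manual one.
-- db maps URLs to either a date string or a table with a `date` field.
local function chooseDate(manualDate, url, db)
    local entry = db[url]
    local dbDate = type(entry) == "table" and entry.date or entry
    if type(dbDate) == "string" and dbDate ~= "" then
        return dbDate
    end
    -- No usable database match: keep what the editor supplied.
    return manualDate
end

local db = {
    ["https://pressinform.gov.bd/photo.jpg"] = "2024-03-15"
}
local result = chooseDate("2024-01-01", "https://pressinform.gov.bd/photo.jpg", db)
```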

shouldAddDateCategories

Determines whether date categories should be added (returns empty for historic images).

Usage:

{{#if:{{#invoke:PIDDateData|shouldAddDateCategories}}|add categories|skip}}

Returns:

  • "yes" for regular images
  • Empty string "" for historic images (treated as false in {{#if:}})
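
The return convention matters because {{#if:}} treats any non-empty string as true. A minimal sketch of the decision, assuming the database entry has already been looked up (the standalone function below is illustrative and takes the entry directly rather than a frame):

```lua
-- Illustrative stand-in for the documented function. Assumption: anything
-- that is not explicitly marked historic counts as a regular image.
local function shouldAddDateCategories(entry)
    if type(entry) == "table" and entry.historic then
        return ""    -- empty string is falsy for {{#if:}}
    end
    return "yes"
end
```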

processURL

Cleans and normalizes URLs for external use.

Usage:

{{#invoke:PIDDateData|processURL|url=https://example.com/image.jpg?tracking=123}}

Parameters:

  • 1 or url: URL to process

Processing:

  • Trims whitespace
  • Removes tracking parameters after file extension
  • Converts spaces to %20 (except in image filenames)
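
The three processing steps could look roughly like the sketch below. Several details are assumptions, not documented behavior: the list of image extensions, the treatment of #fragments as tracking debris, and the reading of "except in image filenames" as "leave spaces in the last path segment alone".

```lua
-- Assumed set of extensions; the real module may recognize others.
local IMAGE_EXTENSIONS = { jpg = true, jpeg = true, png = true, gif = true }

local function processURL(url)
    if type(url) ~= "string" then
        return ""
    end
    -- 1. Trim leading and trailing whitespace.
    url = url:match("^%s*(.-)%s*$")
    -- 2. Drop a ?query or #fragment that follows an image file extension.
    local base, ext = url:match("^(.-%.(%a+))[?#].*$")
    if base and IMAGE_EXTENSIONS[ext:lower()] then
        url = base
    end
    -- 3. Encode spaces in the directory part, leaving the filename untouched.
    local dir, file = url:match("^(.*/)([^/]*)$")
    if dir then
        url = dir:gsub(" ", "%%20") .. file
    end
    return url
end
```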

getData

Retrieves data from a specific year's submodule.

Usage:

{{#invoke:PIDDateData|getData|2024}}

Parameters:

  • 1 or year: Year to retrieve

getAllData

Returns all combined date data from all yearly submodules.

Usage:

{{#invoke:PIDDateData|getAllData}}
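
Combining yearly tables amounts to a guarded merge. The sketch below shows the idea with an injected loader so that it runs outside MediaWiki; on a wiki the loader would typically wrap mw.loadData. The helper name mergeYears, the loader parameter, and the list of years are hypothetical.

```lua
-- Hypothetical helper: merges the tables for the given years into one.
-- loadYear(year) must return that year's URL-to-date table or raise an error.
local function mergeYears(years, loadYear)
    local all = {}
    for _, year in ipairs(years) do
        -- pcall keeps one missing submodule from breaking the whole lookup.
        local ok, data = pcall(loadYear, year)
        if ok and type(data) == "table" then
            for url, entry in pairs(data) do
                all[url] = entry
            end
        end
    end
    return all
end

-- Stand-in loader for demonstration only.
local fakeStore = {
    [2023] = { ["https://pressinform.gov.bd/2023/a.jpg"] = "2023-07-01" },
    [2024] = { ["https://pressinform.gov.bd/2024/photo.jpg"] = "2024-03-15 14:30:00" }
}
local function fakeLoader(year)
    if not fakeStore[year] then
        error("no data for " .. tostring(year))
    end
    return fakeStore[year]
end

local combined = mergeYears({ 2022, 2023, 2024 }, fakeLoader)
```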

URL matching

The module handles various URL variations:

  • Protocol normalization: http:// and https:// are treated as equivalent
  • Domain normalization: pressinform.portal.gov.bd → pressinform.gov.bd
  • Space handling: Both space and %20 versions are checked
  • Tracking parameter removal: ?fbclid=xyz and similar are stripped
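
One way to implement this kind of tolerant matching is to generate every plausible key for a URL and try each in turn. The sketch below is an illustration, not the module's code. It assumes the portal domain is rewritten to pressinform.gov.bd and that everything from the first ? or # onward can be dropped; the helper names candidateKeys and lookup are hypothetical.

```lua
-- Hypothetical helper: lists the database keys a URL could be stored under.
local function candidateKeys(url)
    -- Strip tracking parameters and fragments.
    url = url:gsub("[?#].*$", "")
    -- Remove the protocol, then normalize the domain.
    local rest = url:gsub("^https?://", "")
    rest = rest:gsub("^pressinform%.portal%.gov%.bd", "pressinform.gov.bd")
    -- Build both the literal-space and the %20 variants.
    local spaced = rest:gsub("%%20", " ")
    local encoded = spaced:gsub(" ", "%%20")
    local keys = {}
    for _, scheme in ipairs({ "https://", "http://" }) do
        keys[#keys + 1] = scheme .. encoded
        if spaced ~= encoded then
            keys[#keys + 1] = scheme .. spaced
        end
    end
    return keys
end

-- Hypothetical helper: returns the first matching entry, or nil.
local function lookup(db, url)
    for _, key in ipairs(candidateKeys(url)) do
        if db[key] ~= nil then
            return db[key]
        end
    end
    return nil
end

local db = {
    ["https://pressinform.gov.bd/2024/my photo.jpg"] = "2024-03-15"
}
local found = lookup(db, "http://pressinform.portal.gov.bd/2024/my%20photo.jpg?fbclid=xyz")
```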

Date formats

The module accepts and normalizes multiple date formats:

  • YYYY-MM-DD HH:MM:SS am/pm
  • YYYY-MM-DD HH:MM am/pm
  • YYYY-MM-DD HH:MM:SS (24-hour)
  • YYYY-MM-DD HH:MM (24-hour)
  • YYYY-MM-DD (date only)

All dates are normalized to ISO format for consistency.
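
A normalizer covering the five formats above might look like the following sketch. The output shape is an assumption: date-only input is returned as YYYY-MM-DD and anything with a time as YYYY-MM-DD HH:MM:SS in 24-hour form. Whether the module keeps or drops the time is not stated here. The function name normalizeDate is hypothetical.

```lua
-- Hypothetical helper: returns a normalized date string, or nil if the
-- input matches none of the accepted formats.
local function normalizeDate(raw)
    if type(raw) ~= "string" then
        return nil
    end
    local s = raw:match("^%s*(.-)%s*$")
    local date, rest = s:match("^(%d%d%d%d%-%d%d%-%d%d)(.*)$")
    if not date then
        return nil
    end
    rest = rest:match("^%s*(.-)%s*$")
    if rest == "" then
        return date    -- date only
    end
    -- Try HH:MM:SS first, then HH:MM, each with an optional am/pm suffix.
    local h, m, sec, suffix = rest:match("^(%d%d?):(%d%d):(%d%d)%s*(%a*)$")
    if not h then
        h, m, suffix = rest:match("^(%d%d?):(%d%d)%s*(%a*)$")
        sec = "00"
    end
    if not h then
        return nil
    end
    h = tonumber(h)
    suffix = suffix:lower()
    if suffix == "pm" then
        if h < 12 then h = h + 12 end
    elseif suffix == "am" then
        if h == 12 then h = 0 end
    elseif suffix ~= "" then
        return nil    -- unknown suffix
    end
    return string.format("%s %02d:%s:%s", date, h, m, sec)
end
```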

Historic images

Images marked with historic = true in the database:

  • Do not receive date-based categories
  • Receive [[Category:Historic images from PID-BD]] instead
  • Are typically photographs from Bangladesh's independence period or earlier

Categories added

For regular images with dates:

  • [[Category:PID-BD images from Month YYYY]]
  • [[Category:Bangladesh photographs taken on YYYY-MM-DD]]

For historic images:

  • [[Category:Historic images from PID-BD]]

Fallback (no date found):

  • [[Category:Press Information Department images]]

Performance

  • Date data is cached after first load (via mw.loadData)
  • Yearly submodules allow incremental updates without reloading all data
  • URL normalization minimizes duplicate entries

Examples

Example 1: Regular image with database date

Page has:

{{Source-PID|url=https://pressinform.gov.bd/2024/photo.jpg}}
{{PD-BDGov-PID}}

Database has:

["https://pressinform.gov.bd/2024/photo.jpg"] = "2024-03-15 14:30:00"

Result:

  • Adds [[Category:PID-BD images from March 2024]]
  • Adds [[Category:Bangladesh photographs taken on 2024-03-15]]

Example 2: Historic image

Database has:

["https://pressinform.gov.bd/historic/1971.jpg"] = {
    date = "1971-12-16",
    historic = true
}

Result:

  • Adds [[Category:Historic images from PID-BD]]
  • Does not add date-based categories

Example 3: Manual date override

Page has:

{{Date-PID|2024-01-01}}
{{Source-PID|url=https://pressinform.gov.bd/photo.jpg}}

Database has:

["https://pressinform.gov.bd/photo.jpg"] = "2024-03-15"

Result:

  • Manual date 2024-01-01 is overridden
  • Actual date used: 2024-03-15

See also

Category:Module documentation