43 changes: 35 additions & 8 deletions .github/workflows/test.yml
@@ -1,18 +1,41 @@
name: Tests
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4.2.2
      - uses: actions/setup-python@v5.6.0
        with:
          python-version: '3.12'
      - name: Install requirements
        run: pip install flake8 pycodestyle
      - name: Check syntax
        run: flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics --exclude ckan

  test:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        include:
          - ckan-version: "2.11"
            ckan-image: "ckan/ckan-dev:2.11-py3.10"
          - ckan-version: "2.10"
            ckan-image: "ckan/ckan-dev:2.10-py3.10"
          - ckan-version: "2.9"
            ckan-image: "ckan/ckan-dev:2.9-py3.9"
      fail-fast: false

    name: CKAN ${{ matrix.ckan-version }}
    runs-on: ubuntu-24.04
    container:
      # The CKAN version tag of the Solr and Postgres containers should match
      # the one of the container the tests run on.
      # You can switch this base image with a custom image tailored to your project
      image: openknowledge/ckan-dev:2.9
      image: ${{ matrix.ckan-image }}
      options: --user root
    services:
      solr:
        image: ckan/ckan-solr:2.9
        image: ckan/ckan-solr:${{ matrix.ckan-version }}-solr9
      postgres:
        image: ckan/ckan-postgres-dev:2.9
        image: ckan/ckan-postgres-dev:${{ matrix.ckan-version }}
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
@@ -29,7 +52,7 @@ jobs:
      CKAN_REDIS_URL: redis://redis:6379/1

    steps:
      - uses: actions/checkout@v2
      - uses: actions/checkout@v4.2.2
      - name: Install requirements
        # Install any extra requirements your extension has here (dev requirements, other extensions etc)
        run: |
@@ -45,6 +68,10 @@ jobs:
          ckan -c test.ini db init
      - name: Run tests
        run: pytest --ckan-ini=test.ini --cov=ckanext.sitemap --cov-report xml:coverage.xml --disable-warnings ckanext/sitemap

      - name: Install unzip for SonarQube and cov
        run: apt-get -y install unzip curl

      - name: SonarQube Scan
        uses: sonarsource/sonarqube-scan-action@master
        with:
3 changes: 3 additions & 0 deletions .gitignore
@@ -40,3 +40,6 @@ coverage.xml

# Sphinx documentation
docs/_build/

# Generated sitemaps (default directory)
ckanext/sitemap/public/sitemap*
86 changes: 75 additions & 11 deletions README.md
@@ -1,34 +1,98 @@
[![Tests](https://github.com/OpenGov-OpenData/ckanext-sitemap/workflows/Tests/badge.svg?branch=main)](https://github.com/OpenGov-OpenData/ckanext-sitemap/actions)

# ckanext-sitemap
A CKAN extension that generates a sitemap XML file is designed to create a structured map of a CKAN instance's datasets and resources, making it easier for search engines to discover and index the available data. !

## Installation
A CKAN extension that generates sitemap XML files: a structured map of a CKAN instance's datasets and resources that makes it easier for search engines to discover and index the available data.

**TODO:** Add any additional install steps to the list below.
For example installing any non-Python dependencies or adding any required
config settings.
## Table of Contents

- [Getting Started](#getting-started)
- [Contributing](#contributing)
- [Versioning](#versioning)
- [License](#license)

## Getting Started

### Installation

To install ckanext-sitemap:

1. Activate your CKAN virtual environment, for example:

. /usr/lib/ckan/default/bin/activate
```bash
. /usr/lib/ckan/default/bin/activate
```

2. Clone the source and install it on the virtualenv
2. Clone the source and install it in the virtual environment

git clone https://github.com/OpenGov-OpenData/ckanext-sitemap.git
cd ckanext-sitemap
pip install -e .
pip install -r requirements.txt
```bash
git clone https://github.com//ckanext-sitemap.git
cd ckanext-sitemap
pip install -e .
pip install -r requirements.txt
```

3. Add `sitemap` to the `ckan.plugins` setting in your CKAN
config file (by default the config file is located at
`/etc/ckan/default/ckan.ini`).

4. Restart CKAN. For example, if you've deployed CKAN with Apache on Ubuntu:

sudo service apache2 reload
```bash
sudo service apache2 reload
```

### Configuration

You can configure this extension in the `ckan.ini` file of your CKAN instance. Set the following options according to your requirements for sitemap generation and management.

Config Option | Default Value | Description
-------------------- | ------------- | -----------
`ckanext.sitemap.directory` | [`./ckanext/sitemap/public`](./ckanext/sitemap/public/) | The directory path for storing generated sitemaps.
`ckanext.sitemap.max_items` | `5000` | Maximum number of items per sitemap file. If the total count of resources exceeds this limit, the sitemap is split into multiple files.
`ckanext.sitemap.autorenew` | `True` | If this option is enabled, the sitemaps will be automatically renewed whenever a user requests a sitemap and the existing sitemap is older than the Time-To-Live (TTL) value specified. Set this to False if you prefer a cron job to handle sitemap generation.
`ckanext.sitemap.ttl` | `8 * 3600` (8 hours) | Time-To-Live (TTL) for sitemaps. Sitemaps older than this value (in seconds) are regenerated when a user visits a sitemap route.
`ckanext.sitemap.resources` | `True` | Determines whether package resources (distributions) should be included in the sitemaps.
`ckanext.sitemap.groups` | `True` | Determines whether groups and organizations should be included in the sitemaps.
`ckanext.sitemap.language_alternatives` | `True` | Determines whether language alternatives should be included in the sitemaps.
`ckanext.sitemap.custom_uris` | `Undefined` | A list of additional sitemap URIs separated by whitespace or newlines. These URIs will be included in the sitemap generation process alongside the default CKAN URIs.
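
As an illustration of how `ckanext.sitemap.max_items` and `ckanext.sitemap.custom_uris` could interact, here is a minimal sketch. It is not the extension's actual code: `parse_custom_uris` and `chunk_urls` are hypothetical helpers, and the `example.org` URLs are made-up data. It only shows the idea of splitting a whitespace-separated value and capping each sitemap file at `max_items` entries.

```python
from typing import Iterator, List


def parse_custom_uris(raw: str) -> List[str]:
    """Split a whitespace- or newline-separated config value into URIs (hypothetical helper)."""
    return raw.split()


def chunk_urls(urls: List[str], max_items: int = 5000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `max_items` URLs, one slice per sitemap file."""
    if max_items < 1:
        raise ValueError("max_items must be positive")
    for start in range(0, len(urls), max_items):
        yield urls[start:start + max_items]


# Made-up data: 12,001 dataset URLs plus two custom URIs.
urls = [f"https://example.org/dataset/{i}" for i in range(12001)]
urls += parse_custom_uris("https://example.org/about\nhttps://example.org/contact")

files = list(chunk_urls(urls, max_items=5000))
print([len(f) for f in files])  # [5000, 5000, 2003]
```

With the default limit of `5000`, the 12,003 URLs in this sketch end up in three files.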

### Using Cron for Regular Sitemap Generation

Using cron to generate sitemaps regularly can be advantageous, especially if the sitemap generation process is time-consuming.

Ensure that the sitemap generation occurs within the time frame specified by `ckanext.sitemap.ttl`, or alternatively, set `ckanext.sitemap.autorenew` to `False` to prevent accidental triggering of sitemap generation by users.
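
The relationship between `ckanext.sitemap.ttl` and `ckanext.sitemap.autorenew` can be sketched as a simple file-age check. This is an assumption about the general logic, not the extension's actual implementation: `needs_regeneration` is a hypothetical function, and how the extension treats a missing sitemap when autorenew is off is not stated in this README.

```python
import os
import tempfile
import time
from typing import Optional


def needs_regeneration(path: str, ttl_seconds: int = 8 * 3600,
                       autorenew: bool = True,
                       now: Optional[float] = None) -> bool:
    """Return True when a request for `path` should trigger regeneration (hypothetical)."""
    if not autorenew:
        return False  # assumption: generation is left entirely to cron
    if not os.path.exists(path):
        return True  # assumption: a missing sitemap is always generated
    now = time.time() if now is None else now
    # Regenerate only when the file is older than the TTL.
    return now - os.path.getmtime(path) > ttl_seconds


with tempfile.NamedTemporaryFile() as f:
    mtime = os.path.getmtime(f.name)
    fresh = needs_regeneration(f.name, now=mtime + 3600)       # 1 hour old
    stale = needs_regeneration(f.name, now=mtime + 9 * 3600)   # 9 hours old
    cron_only = needs_regeneration(f.name, autorenew=False, now=mtime + 9 * 3600)

print(fresh, stale, cron_only)  # False True False
```

Under these assumptions, a cron job that runs more often than the TTL keeps every request on the "fresh" path, so users never trigger generation themselves.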

**Example Cron Job:**

To schedule the command to run at 2 AM, 10 AM, and 6 PM:

```bash
0 2,10,18 * * * /usr/lib/ckan/default/bin/ckan -c /etc/ckan/default/ckan.ini ckanext-sitemap generate > /dev/null 2>&1
```

## Available Commands

- `generate`

This command triggers the generation of the sitemap.

Usage:

```bash
ckanext-sitemap generate
```

## Contributing

To contribute to this documentation, create a branch or fork this repository, make
your changes and create a merge request.

## Versioning

We use [SemVer](http://semver.org/) for versioning. For the versions available, see
the tags on this repository.

## License

33 changes: 33 additions & 0 deletions ckanext/sitemap/cli.py
@@ -0,0 +1,33 @@
# -*- coding: utf-8 -*-

import click
import ckanext.sitemap.sitemap as sm

def get_commands():
    return [ckanext_sitemap]


@click.group()
def ckanext_sitemap():
    """ckanext-sitemap

    Usage:

        ckanext-sitemap generate
            - (Re)generate sitemap.
    """


@ckanext_sitemap.command()
def generate():
    """
    Command to generate sitemap.
    """
    try:
        click.echo('Starting sitemap generation..')
        sm.generate_sitemap()
        click.echo('Finished sitemap generation.')

    except Exception as e:
        # Handle exceptions that may occur during sitemap generation
        click.echo(f'Error during sitemap generation: {str(e)}', err=True)
7 changes: 5 additions & 2 deletions ckanext/sitemap/plugin.py
@@ -1,13 +1,16 @@
import ckan.plugins as plugins
import ckanext.sitemap.view as view
from ckanext.sitemap import cli


class SitemapPlugin(plugins.SingletonPlugin):
    plugins.implements(plugins.IBlueprint)
    plugins.implements(plugins.IClick)

    # IBlueprint
    def get_blueprint(self):
        return view.get_blueprints()



    # IClick
    def get_commands(self):
        return cli.get_commands()