2023-06-06 10:44:08 +02:00
|
|
|
|
---
|
|
|
|
|
lang: en
|
|
|
|
|
---
|
|
|
|
|
|
2023-04-14 09:30:57 +02:00
|
|
|
|
# SPIP Database to Markdown
|
2023-04-14 10:58:35 +02:00
|
|
|
|
|
2023-06-06 10:44:08 +02:00
|
|
|
|
`spip2md` is a litle Python app that can export a SPIP database into a plain text,
|
|
|
|
|
Markdown + YAML repository, usable with static site generators.
|
|
|
|
|
|
|
|
|
|
## Features
|
|
|
|
|
|
|
|
|
|
`spip2md` is currently able to :
|
|
|
|
|
|
|
|
|
|
- Export every section (`spip_rubriques`), with every article (`spip_articles`) they
|
|
|
|
|
contain
|
|
|
|
|
- Replace authors (`spip_auteurs`) IDs with their name (in YAML block)
|
|
|
|
|
- Generate different files for each language found in `<multi>` blocks
|
|
|
|
|
- Copy over all the attached files (`spip_documents`), with proper links
|
|
|
|
|
- Convert SPIP [Markup language](https://www.spip.net/fr_article1578.html)
|
|
|
|
|
- Convert SPIP ID-based internal links (like `<art123>`) into path-based, normal links
|
2023-04-14 11:31:07 +02:00
|
|
|
|
|
2023-06-15 17:00:59 +02:00
|
|
|
|
## Dependencies
|
|
|
|
|
|
|
|
|
|
`spip2md` needs Python version 3.9 or supperior.
|
|
|
|
|
|
|
|
|
|
`spip2md` uses three Python libraries (as defined in pyproject.toml) :
|
|
|
|
|
|
|
|
|
|
- Peewee, with a database connection for your database :
|
|
|
|
|
- pymysql (MySQL/MariaDB)
|
|
|
|
|
- PyYaml
|
2023-06-15 18:12:19 +02:00
|
|
|
|
- python-slugify (unidecode variant prefered)
|
2023-06-15 17:00:59 +02:00
|
|
|
|
|
2023-06-15 18:12:19 +02:00
|
|
|
|
## Installation
|
|
|
|
|
|
|
|
|
|
### Simple `pip` method
|
|
|
|
|
|
|
|
|
|
Install the package with `pip install spip2md` (or `python -m pip install spip2md`
|
|
|
|
|
if you don’t have pip installed).
|
|
|
|
|
|
|
|
|
|
Assuming your `$PATH` contains your `pip` install directory, you can now run
|
|
|
|
|
`spip2md` a normal command of the same name.
|
|
|
|
|
|
|
|
|
|
### Traditional method
|
2023-04-14 10:58:35 +02:00
|
|
|
|
|
2023-06-15 18:12:19 +02:00
|
|
|
|
Clone this git repo with command `git clone` and `cd` into the created directory.
|
|
|
|
|
|
|
|
|
|
Either make sure you have the dependencies installed system-wide, or create a
|
|
|
|
|
Python virtual-environment and install them inside.
|
|
|
|
|
|
|
|
|
|
You can then run `spip2md` as a Python module with command `python -m spip2md`.
|
2023-06-16 12:44:10 +02:00
|
|
|
|
Make sure to replace `spip2md` with a path to directory `spip2md` if you
|
2023-06-15 18:12:19 +02:00
|
|
|
|
didn’t `cd` into this repository’s directory.
|
|
|
|
|
|
2023-06-16 15:08:29 +02:00
|
|
|
|
## Configuration and Usage
|
2023-06-15 17:00:59 +02:00
|
|
|
|
|
2023-06-15 18:12:19 +02:00
|
|
|
|
Make sure you have access to the SPIP database you want to export on a
|
2023-06-15 17:00:59 +02:00
|
|
|
|
MySQL/MariaDB server. By default, `spip2md` expects a database named `spip` hosted on
|
|
|
|
|
`localhost`, with a user named `spip` of which password is `password`, but you can
|
|
|
|
|
totally configure this as well as other settings in the YAML config file.
|
|
|
|
|
|
|
|
|
|
If you want to copy over attached files like images, you also need access to
|
|
|
|
|
the data directory of your SPIP website, usually named `IMG`, and either rename it
|
|
|
|
|
`data` in your current working directory, or set `data_dir` setting to its path.
|
|
|
|
|
|
2023-06-16 15:08:29 +02:00
|
|
|
|
### YAML configuration file
|
|
|
|
|
|
|
|
|
|
To configure `spip2md` you can place a file named `spip2md.yml` in standard \*nix
|
|
|
|
|
configuration locations, set it with the command line argument, or run the
|
|
|
|
|
program with a `spip2md.yml` file in your working directory.
|
|
|
|
|
|
|
|
|
|
Here’s the *default configuration options* with comments explaining their meaning :
|
2023-06-06 10:44:08 +02:00
|
|
|
|
|
|
|
|
|
```yaml
|
2023-06-15 18:12:19 +02:00
|
|
|
|
db: spip # Name of the database
|
|
|
|
|
db_host: localhost # Host of the database
|
|
|
|
|
db_user: spip # The database user
|
|
|
|
|
db_pass: password # The database password
|
|
|
|
|
data_dir: data # The directory in which SPIP images & files are stored
|
|
|
|
|
|
|
|
|
|
export_languages: ["en"] # Array of languages to export, two letter lang code
|
|
|
|
|
# If set, directories will be created only for this language, according to this
|
|
|
|
|
# language’s titles. Other languages will be written along with correct url: attribute
|
|
|
|
|
storage_language: null
|
|
|
|
|
output_dir: output/ # The directory in which files will be written
|
|
|
|
|
|
|
|
|
|
prepend_h1: false # Add title of articles as Markdown h1, looks better on certain themes
|
|
|
|
|
# Prepend ID to directory slug, preventing collisions
|
|
|
|
|
# If false, a counter will be appended in case of name collision
|
|
|
|
|
prepend_id: false
|
|
|
|
|
prepend_lang: false # Prepend lang of the object to directory slug (prenvents collision)
|
|
|
|
|
export_drafts: true # Should we export drafts
|
|
|
|
|
remove_html: true # Should we clean remaining HTML blocks
|
|
|
|
|
title_max_length: 40 # Maximum length of a single filename
|
|
|
|
|
unknown_char_replacement: ?? # String to replace broken encoding that cannot be repaired
|
|
|
|
|
|
|
|
|
|
# You probably don’t want to modify the settings below
|
|
|
|
|
|
|
|
|
|
clear_log: true # Clear logfile between runs instead of appending to
|
|
|
|
|
clear_output: true # Clear output dir between runs instead of merging into
|
|
|
|
|
|
|
|
|
|
logfile: log-spip2md.log # Name of the logs file
|
|
|
|
|
loglevel: WARNING # Refer to Python’s loglevels
|
|
|
|
|
|
|
|
|
|
export_filetype: md # Filetype of exported text files
|
2023-06-06 10:44:08 +02:00
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## External links
|
2023-04-14 11:58:02 +02:00
|
|
|
|
|
|
|
|
|
- SPIP [Database structure](https://www.spip.net/fr_article713.html)
|
2023-06-06 10:44:08 +02:00
|
|
|
|
|
|
|
|
|
## TODO
|
|
|
|
|
|
2023-06-15 17:00:59 +02:00
|
|
|
|
These tables seem to contain not-as-useful information,
|
|
|
|
|
but this needs to be investicated :
|
2023-06-06 10:44:08 +02:00
|
|
|
|
|
|
|
|
|
- `spip_evenements`
|
|
|
|
|
- `spip_meta`
|
|
|
|
|
- `spip_mots`
|
|
|
|
|
- `spip_syndic_articles`
|
|
|
|
|
- `spip_mots_liens`
|
|
|
|
|
- `spip_zones_liens`
|
|
|
|
|
- `spip_groupes_mots`
|
|
|
|
|
- `spip_meslettres`
|
|
|
|
|
- `spip_messages`
|
|
|
|
|
- `spip_syndic`
|
|
|
|
|
- `spip_zones`
|
|
|
|
|
|
2023-06-15 17:00:59 +02:00
|
|
|
|
These tables seem technical, SPIP specific :
|
2023-06-06 10:46:04 +02:00
|
|
|
|
|
2023-06-06 10:44:08 +02:00
|
|
|
|
- `spip_depots`
|
|
|
|
|
- `spip_depots_plugins`
|
|
|
|
|
- `spip_jobs`
|
|
|
|
|
- `spip_ortho_cache`
|
|
|
|
|
- `spip_paquets`
|
|
|
|
|
- `spip_plugins`
|
|
|
|
|
- `spip_referers`
|
|
|
|
|
- `spip_referers_articles`
|
|
|
|
|
- `spip_types_documents`
|
|
|
|
|
- `spip_versions`
|
|
|
|
|
- `spip_versions_fragments`
|
|
|
|
|
- `spip_visites`
|
|
|
|
|
- `spip_visites_articles`
|
|
|
|
|
|
|
|
|
|
These tables are empty :
|
|
|
|
|
|
|
|
|
|
- `spip_breves`
|
|
|
|
|
- `spip_evenements_participants`
|
|
|
|
|
- `spip_forum`
|
|
|
|
|
- `spip_jobs_liens`
|
|
|
|
|
- `spip_ortho_dico`
|
|
|
|
|
- `spip_petitions`
|
|
|
|
|
- `spip_resultats`
|
|
|
|
|
- `spip_signatures`
|
|
|
|
|
- `spip_test`
|
|
|
|
|
- `spip_urls`
|