Commit Graph

252 Commits

Author SHA1 Message Date
Guilhem Fauré
1076040316 use str.replace() instead of regex when not needed 2023-05-16 11:29:22 +02:00
Guilhem Fauré
b61853a4d5 unknown characters highlighting & reporting 2023-05-16 11:19:47 +02:00
Guilhem Fauré
12db0375e7 better article text build 2023-05-16 10:01:33 +02:00
Guilhem Fauré
bc616cc7a1 started allowing to gather unknown encoding bugs 2023-05-15 17:18:36 +02:00
Guilhem Fauré
b8f99fb329 refactor project structure 2023-05-15 17:10:58 +02:00
Guilhem Fauré
8eb0d1101a add conversion ç 2023-05-11 16:20:24 +02:00
Guilhem Fauré
629594de9b remove every html tag (maybe temporarily) 2023-05-11 16:11:20 +02:00
Guilhem Fauré
b3119924a8 more strict cleaning of metadata 2023-05-11 15:17:44 +02:00
Guilhem Fauré
d8b7a1b562 style fixes 2023-05-11 14:31:08 +02:00
Guilhem Fauré
65e9f0a67b more encoding fixes, warns when unknown encoding 2023-05-11 14:22:13 +02:00
Guilhem Fauré
3e3259c564 delete lark syntax 2023-05-11 13:46:34 +02:00
Guilhem Fauré
ca4a3c1a96 lowercase meta class (pyright) + do not print title in markdown & h1 headings 2023-05-11 13:45:33 +02:00
Guilhem Fauré
3a261800a6 update licence 2023-05-11 11:47:29 +02:00
Guilhem Fauré
4141c10bfc add explanation in comments 2023-05-11 11:38:38 +02:00
Guilhem Fauré
995fee5b6a fixed most of the encoding bugs 2023-05-11 11:36:23 +02:00
Guilhem Fauré
b3fa5023c4 fix some encoding bugs with regex replace 2023-05-11 10:33:35 +02:00
Guilhem Fauré
5c78dcd753 init buggy encoding example 2023-05-11 10:23:01 +02:00
Guilhem Fauré
3b36aeb776 simplified spip->md mapping 2023-05-11 10:22:50 +02:00
Guilhem Fauré
723a7ddeea simplified architecture 2023-05-11 09:50:18 +02:00
Guilhem Fauré
5e86ed0ed5 export empty articles 2023-05-11 09:25:26 +02:00
Guilhem Fauré
a4bb234b72 added pymysql again in requirements 2023-05-10 11:17:06 +02:00
Guilhem Fauré
1541cffa10 try to encode 2023-05-10 11:13:43 +02:00
Guilhem Fauré
e4a0eb68af better cli 2023-05-10 11:03:13 +02:00
Guilhem Fauré
cf2345e43e regex replacing spip to markdown conversion 2023-05-10 11:00:27 +02:00
Guilhem Fauré
8a6026d129 try with basic regex replacing 2023-05-09 17:38:18 +02:00
Guilhem Fauré
a455c8e4a2 add pyparsing, lark not adapted to complex languages like SPIP or Markdown 2023-05-09 16:47:02 +02:00
Guilhem Fauré
8eec4033f8 paragraphs cannot start with tags 2023-05-09 16:31:34 +02:00
Guilhem Fauré
8f9775119c multiline headings 2023-05-09 15:52:18 +02:00
Guilhem Fauré
8f4fcccbdc accept " in tag options 2023-05-09 14:57:55 +02:00
Guilhem Fauré
c5c04cc645 more precise tags, added problematic tag to tests 2023-05-09 14:51:40 +02:00
Guilhem Fauré
aa046aa45c parser tests 2023-05-09 14:37:12 +02:00
Guilhem Fauré
cda96d1864 document can end with an inline tag 2023-05-09 13:14:50 +02:00
Guilhem Fauré
82c952641a ? allowed in hrefs 2023-05-09 13:11:30 +02:00
Guilhem Fauré
73927bd3cc prevent export of empty articles 2023-05-09 11:42:03 +02:00
Guilhem Fauré
079e156971 2 params paragraphs, paragraphs inside inline tags can start with block-specific startings 2023-05-09 11:36:45 +02:00
Guilhem Fauré
f7357998c9 improved output 2023-05-09 11:26:59 +02:00
Guilhem Fauré
08973616b0 parsing into normal flow 2023-05-09 11:13:47 +02:00
Guilhem Fauré
79a50d5e83 optional line break at end of list items 2023-05-09 10:59:57 +02:00
Guilhem Fauré
c9906c56cc more strict table rules, more flexible hr rules 2023-05-09 10:51:01 +02:00
Guilhem Fauré
ca0e9af4b4 remove last priority, defining text non-starting char 2023-05-09 10:34:35 +02:00
Guilhem Fauré
b14137c1fd fix closing inline tags 2023-05-09 10:15:53 +02:00
Guilhem Fauré
8fca926461 remove priorities 2023-05-09 10:10:45 +02:00
Guilhem Fauré
7d99e59c3d parametric paragraph 2023-05-09 09:28:29 +02:00
Guilhem Fauré
07e1855a70 simpler inline tags img|emb|doc 2023-05-04 11:59:55 +02:00
Guilhem Fauré
a113ba79c5 fixed block/inline tags 2023-05-04 11:54:37 +02:00
Guilhem Fauré
8cc7d3640e case insensitive tag regex 2023-05-04 11:35:46 +02:00
Guilhem Fauré
1ccc95b894 progress on supporting raw html tags 2023-05-03 17:00:15 +02:00
Guilhem Fauré
64a0deac93 footnotes & wikilinks OK 2023-05-03 16:32:36 +02:00
Guilhem Fauré
6f9ca8e3ba fix orphan/pair tags 2023-05-03 16:22:16 +02:00
Guilhem Fauré
0f510459e2 tags ok 2023-05-03 16:06:23 +02:00