Commit graph

386 commits

Author SHA1 Message Date
Noah Hellman
116fb04725 lex: remove DollarBacktick token
can now read src to check dollar before backticks, so simpler to always
use backtick token

fixes several bugs, e.g.

| $`abc` | (previously did not get closed)

`$abc$` (previously did not get closed)
2023-05-15 19:05:22 +02:00
Noah Hellman
b2d15383e7 inline: allow empty span when followed by url/label/attrs
allow e.g. images without alt text ![](url)
2023-05-10 22:12:07 +02:00
Noah Hellman
329207689b prepass: fix lookup of headings
bug caused by mistaking the return value of binary_search_by_key for the
element instead of the index, both of which were usize
2023-05-10 22:04:01 +02:00
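
A minimal sketch of the pitfall described in 329207689b, not the actual jotdown code: `binary_search_by_key` on a slice returns the index of the match, so when the stored payload is also a `usize` the two are easy to mix up without a type error. The `Heading` struct, its fields, and `find_location` are hypothetical names chosen for illustration.

```rust
/// Hypothetical heading record, sorted by `key` in the slice below.
struct Heading {
    key: usize,
    location: usize,
}

/// Look up a heading's location by key in a slice sorted by `key`.
fn find_location(headings: &[Heading], key: usize) -> Option<usize> {
    // Ok(..) holds the index into the slice, not the element itself.
    let index = headings.binary_search_by_key(&key, |h| h.key).ok()?;
    // The index must still be used to fetch the element.
    Some(headings[index].location)
}

fn main() {
    let headings = [
        Heading { key: 10, location: 300 },
        Heading { key: 20, location: 100 },
        Heading { key: 30, location: 200 },
    ];
    // Returning `index` directly would give Some(1) here instead of Some(100).
    assert_eq!(find_location(&headings, 20), Some(100));
    assert_eq!(find_location(&headings, 25), None);
}
```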
Noah Hellman
9efd4e4448 prepass: fix referenced headings with e.g. spaces
headings that did not exactly match their ids were previously not
matched with their reference
2023-05-10 22:04:01 +02:00
Noah Hellman
45c86da274 inline: fix unclosed attrs after cont/verb 2023-05-04 19:58:42 +02:00
Noah Hellman
9d9aded764 html: add doc comment to html::Renderer 2023-05-04 19:58:42 +02:00
Noah Hellman
d2a46663f1 html: fix initial newline when hr first element 2023-04-29 14:21:11 +02:00
Noah Hellman
99f4691e52 lib: emit footnotes as they are encountered
Previously, footnotes and their children events were skipped (stored in
block tree) and inline parsed at the end. Now, they are emitted by the
parser immediately and the responsibility to aggregate them has been
moved to the renderer.

resolves #31
2023-04-25 21:03:18 +02:00
Noah Hellman
c4ecd0c677 Revert "lib: add Render::render_{event, prologue, epilogue}"
This reverts commit e8503e28fd.

This imposed too many limitations on the renderer implementation. E.g.
making it impossible to store `Event<'s>`'s in the renderer struct.

Revert back to having the renderer struct separate from the implementor
of the Render trait. The implementor may instead create a renderer
struct without any restrictions.
2023-04-25 21:03:18 +02:00
Noah Hellman
8e48021f7a lib: emit LinkDefinition event
resolves #14
2023-04-25 21:03:18 +02:00
Noah Hellman
17b166867f html: derive Renderer::default
setting defaults manually mostly causes rebase conflicts
2023-04-25 21:03:18 +02:00
Noah Hellman
bdab4f021b lex: eat non special chars separately
let tight loop work as long as there are no special characters
2023-04-25 20:26:47 +02:00
Noah Hellman
3701d282ac lex: separate escaped/non-escaped 2023-04-25 20:26:47 +02:00
Noah Hellman
e877fdbde8 inline: rm resolved TODO
resolved by a846477cea
2023-04-25 18:35:13 +02:00
Noah Hellman
ab33adc799 lib: cfg guard doctests using html renderer
docs do not compile when html feature is off otherwise
2023-04-25 18:35:10 +02:00
Noah Hellman
d2d7f5d474 lib: fix url in autolink/email end event
was constant ">" instead of actual url
2023-04-10 18:50:22 +02:00
Noah Hellman
1202160a88 inline: resume attr parsing when new lines received
instead of starting over for each new line
2023-04-05 21:17:33 +02:00
Noah Hellman
50205573d0 inline: store attributes in vec 2023-04-05 21:17:33 +02:00
Noah Hellman
62d33effc4 inline: parse multiline attributes
reimplement after broken by "take str per line instead of full inline
iter" commit

also resolves #18 and #34
2023-04-05 21:17:33 +02:00
Noah Hellman
3d42820001 inline: add ControlFlow enum
needed to indicate that more input is needed

this is needed only for multiline attributes that require backtracking
2023-04-05 21:17:33 +02:00
Noah Hellman
7133de94bb inline: keep track of lines ahead
Needed for attributes, as they cannot be parsed without backtracking.
Potentially we have to parse for attributes until the end of the block
before we can know if it is invalid attributes and we should instead
parse for other inline events.
2023-04-05 21:17:33 +02:00
Noah Hellman
2bcc6122ca inline: store link cowstrs in vec
try to reduce size of Event by placing the cowstr in a shared vec, and
just keeping an index in the event itself

seems to have a significant performance benefit on benchmarks
2023-04-05 21:17:33 +02:00
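
The idea in 2bcc6122ca can be sketched roughly as follows, assuming an event enum that would otherwise carry a `Cow<str>` inline. `FatEvent`, `SlimEvent`, and `Store` are hypothetical names, not jotdown types; the sketch only shows that an index into a shared vec makes the event smaller.

```rust
use std::borrow::Cow;
use std::mem::size_of;

/// Event that carries its string inline.
#[allow(dead_code)]
enum FatEvent<'s> {
    Str,
    Link(Cow<'s, str>),
}

/// Event that only stores an index into a shared side table.
#[allow(dead_code)]
enum SlimEvent {
    Str,
    Link(u32),
}

/// Shared storage for the strings referenced by `SlimEvent::Link`.
struct Store<'s> {
    strings: Vec<Cow<'s, str>>,
}

impl<'s> Store<'s> {
    /// Push the string into the shared vec and return an event holding its index.
    fn intern(&mut self, s: Cow<'s, str>) -> SlimEvent {
        let index = self.strings.len() as u32;
        self.strings.push(s);
        SlimEvent::Link(index)
    }

    /// Resolve an event back to its string, if it refers to one.
    fn resolve(&self, ev: &SlimEvent) -> Option<&str> {
        match ev {
            SlimEvent::Link(i) => self.strings.get(*i as usize).map(|c| &**c),
            SlimEvent::Str => None,
        }
    }
}

fn main() {
    let mut store = Store { strings: Vec::new() };
    let ev = store.intern(Cow::Borrowed("https://example.com"));
    assert_eq!(store.resolve(&ev), Some("https://example.com"));
    // The indexed event is smaller than the one holding the Cow inline.
    assert!(size_of::<SlimEvent>() < size_of::<FatEvent>());
}
```

Whether the smaller event pays off depends on how often events are moved or buffered; the commit message only reports a benefit on its own benchmarks.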
Noah Hellman
a846477cea inline: parse multi-line link tags/urls
reimplement after broken by "take str per line instead of full inline
iter" commit

this also resolves #22
2023-04-05 21:17:33 +02:00
Noah Hellman
98f3fe5c7c attr: Parser overhaul
- allow reading one line at a time, values may span multiple inputs
- mv event push to Parser, allowing reuse from outside Attributes::parse
- get rid of Element, simplify
2023-04-05 21:17:33 +02:00
Noah Hellman
34e74ddc43 attr: impl valid without Parser
only State fsm is needed

try to use Parser only when attributes need to be stored
2023-04-05 21:17:33 +02:00
Noah Hellman
242a64e3b4 attr: mv wildcard State use to Parser methods
avoid potential name conflict
2023-04-05 21:17:33 +02:00
Noah Hellman
172f555272 attr: step one char at a time
make sure attr can keep track of all state so one char can be provided
at a time

this allows not restarting from beginning if we find out we need more
chars to finish parsing attributes
2023-04-05 21:17:33 +02:00
Noah Hellman
e1b12ba642 attr: rename State::{Attribute->Key}
more accurate name
2023-04-05 21:17:33 +02:00
Noah Hellman
86ee4ee520 lex: rm lex::Kind::Whitespace
Whitespace tokens do not necessarily create new events but they work as
a delimiter for words with attributes and affect some container
delimiters. Now that we can read the source from inline, we can instead
inspect for whitespace when needed.

Removing the whitespace token allows the lexer to continue a lot longer
without stopping. E.g. a typical line in a paragraph with no special
characters can turn into a single token.
2023-04-05 21:17:33 +02:00
Noah Hellman
9454a2e393 inline: apply word attribute when flushing event buf
fixes issue with e.g. `[text]({.cls})` where attributes get immediately
applied to `[text](` where link should have priority.
2023-04-05 21:17:33 +02:00
Noah Hellman
08ef15655b inline: extract merge_str_events
decrease size of inline::Parser::next, make more readable
2023-04-05 21:17:33 +02:00
Noah Hellman
1b7bb25519 inline: allow reading src 2023-04-05 21:17:33 +02:00
Noah Hellman
8382fe122f lib: derive Clone for Parser
should now be safe to do
2023-04-05 21:17:33 +02:00
Noah Hellman
3a1a3996e9 inline: take str per line instead of full inline iter
gets rid of DiscontinuousChars which is large and requires cloning on
peek

resolves #4
2023-04-05 21:17:33 +02:00
Noah Hellman
8169feb1f6 inline: impl links w/o lookahead
needed to get rid of DiscontinuousChars
2023-04-05 21:17:33 +02:00
Noah Hellman
66d821f03e inline: replace Delim by Opener
no need to have a delimiter for closer, only opener needs to be stored
in stack
2023-04-05 21:17:33 +02:00
Noah Hellman
ed5aed3759 inline: impl verbatim w/o lookahead
avoid lookahead for containers that can cross newline

needed to get rid of DiscontinuousChars iter
2023-04-05 21:17:33 +02:00
Noah Hellman
e8e551fd8b inline: separate lex, span to separate Input object
easier handling of mutable pointers, can borrow self.input instead of
whole self

can e.g. borrow mutable state while still eating new tokens
2023-04-05 21:17:33 +02:00
Noah Hellman
5d6d0e0840 inline: use lex kinds
import Sequence and use Delimiter, Symbol consistently
2023-04-05 21:17:33 +02:00
Noah Hellman
f192ea2aa6 inline: use ? to flatten methods 2023-04-05 21:17:33 +02:00
Noah Hellman
491c5f2866 inline: always push events from parse methods
in order to more conveniently allow pushing in arbitrary order

parse methods now return an Option<()> that functions kind of like a
std::ops::ControlFlow. Some(()) means the token was parsed, None means
continue parsing.
2023-04-05 21:17:33 +02:00
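
A rough sketch of the convention described in 491c5f2866, under the assumption of a much simplified parser: each parse method pushes its own events and returns `Some(())` when it consumed the token, or `None` so the next method gets a try. The `?` operator lets a method bail out early with `None`. `Parser`, `Event`, and the method names here are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Event {
    Symbol(String),
    Text(String),
}

struct Parser {
    events: Vec<Event>,
}

impl Parser {
    /// Handles `:name:` tokens. None means "not handled, keep trying".
    fn parse_symbol(&mut self, token: &str) -> Option<()> {
        // Each `?` returns None early if the delimiter is missing.
        let name = token.strip_prefix(':')?.strip_suffix(':')?;
        if name.is_empty() {
            return None;
        }
        self.events.push(Event::Symbol(name.to_string()));
        Some(())
    }

    /// Fallback that accepts any token as plain text.
    fn parse_text(&mut self, token: &str) -> Option<()> {
        self.events.push(Event::Text(token.to_string()));
        Some(())
    }

    /// Try each parse method in order until one reports Some(()).
    fn parse_token(&mut self, token: &str) {
        let _ = self
            .parse_symbol(token)
            .or_else(|| self.parse_text(token));
    }
}

fn main() {
    let mut parser = Parser { events: Vec::new() };
    for token in [":some-sym:", "abc", ":"] {
        parser.parse_token(token);
    }
    assert_eq!(
        parser.events,
        vec![
            Event::Symbol("some-sym".to_string()),
            Event::Text("abc".to_string()),
            Event::Text(":".to_string()),
        ]
    );
}
```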
Noah Hellman
9429f90307 inline: reuse event buffer between blocks
make sure not to allocate a new buffer on each block
2023-04-05 21:17:33 +02:00
Noah Hellman
1e5e56c463 only assert in debug builds
these are primarily used to detect bugs during e.g. fuzzing.

most of these asserts have negligible impact on performance, but if they
are not debug asserts it is not obvious that they don't affect
performance of release builds
2023-04-05 21:17:33 +02:00
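
As a small illustration of the choice in 1e5e56c463: `debug_assert!` is compiled out when debug assertions are disabled (the default for release builds), so the check costs nothing there while still catching bugs in debug and fuzzing builds. The `span_len` function is a hypothetical example, not taken from the codebase.

```rust
/// Length of a half-open span `start..end`.
fn span_len(start: usize, end: usize) -> usize {
    // Checked only when debug assertions are enabled.
    debug_assert!(start <= end, "span start {start} is after end {end}");
    end.saturating_sub(start)
}

fn main() {
    assert_eq!(span_len(2, 5), 3);
    assert_eq!(span_len(4, 4), 0);
}
```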
Noah Hellman
d171cbb516 inline: add quote test 2023-04-05 21:17:33 +02:00
Noah Hellman
0a501fec10 lib: add test attr_inline_consecutive{,_invalid} 2023-04-05 21:17:33 +02:00
Noah Hellman
718f2df60e lib: add test link_reference_multiline 2023-04-05 21:17:33 +02:00
Noah Hellman
66fd099af1 lib: add test attr_inline_multiline 2023-04-05 21:17:33 +02:00
Noah Hellman
722f549ffd clippy: allow blocks_in_if_conditions everywhere 2023-04-05 21:17:33 +02:00
Noah Hellman
f31398a796 attr: impl Debug for Attributes manually
show key value pairs instead of internal structure
2023-04-05 21:17:33 +02:00
Noah Hellman
10788af246 lib: add Render::{push, write}_borrowed
allow rendering iterators with borrowed events

resolves #24
2023-04-05 21:17:33 +02:00
Noah Hellman
e8503e28fd lib: add Render::render_{event, prologue, epilogue}
derive push/write automatically from these
2023-04-05 21:17:33 +02:00
Noah Hellman
e506fffed8 mv push/write examples from html to Render trait
They apply more to the Render trait now than the implementation in the
html module
2023-04-05 21:17:33 +02:00
Noah Hellman
8eafdf073b html: extract Writer::render_{event, epilogue} 2023-04-05 21:17:33 +02:00
Noah Hellman
3d1b5f2115 html: rm events/out from Writer
separate input/output from rendering state
2023-04-05 21:17:33 +02:00
Noah Hellman
f5724fcc9c html: rm FilteredEvents
no longer useful as no peeking is needed, use simple early exit instead
2023-04-05 21:17:33 +02:00
Noah Hellman
336927faef html: avoid peek of next event
Try to make rendering of each event independent.

The only case where we need to peek is when a backref link should be
added to the last paragraph within a footnote.

Before, when exiting a paragraph, we would peek and add the link before
emitting the close tag if the next event is the footnote end.

Now, the paragraph end event skips emitting a paragraph close tag if it
is within a footnote. The next event will then always close the
paragraph, and if it is a footnote end, it will add the backref link
before closing.
2023-04-05 21:17:33 +02:00
Noah Hellman
a603ea2124 lib: Add SpanLinkTag::Unresolved variant
keep the tag for unresolved links, and allow distinguishing between
`[tag][tag with empty url]` and `[tag][non-existent tag]`.

closes #26
2023-04-05 21:17:32 +02:00
Noah Hellman
05a4992d99 lib: don't prepend mailto to url in autolink Event
better to provide the original url, the event is already tagged as email

also avoids a string allocation
2023-04-05 21:17:04 +02:00
Noah Hellman
62e73100a6 bug fix: set LinkType to Email for email autolinks 2023-03-20 23:39:51 +01:00
Noah Hellman
14065177ae attr: fix name/key/value validation
match reference implementation
2023-03-17 18:57:36 +01:00
Noah Hellman
0719b2de65 block: fix class attribute parsing
match reference implementation
2023-03-17 18:57:36 +01:00
Noah Hellman
33d8215a2a html: fix invalid html for footnote inside image 2023-03-17 18:57:36 +01:00
Noah Hellman
c6022004bb html: fix alt text on nested images 2023-03-17 18:57:35 +01:00
Noah Hellman
fc374be56c html: escape img src values 2023-03-17 18:57:10 +01:00
Noah Hellman
5768b24907 html: escape quotes in img alt text 2023-03-17 18:45:20 +01:00
Noah Hellman
648a6dbef2 lib: derive Clone for Event and Container 2023-03-16 20:05:20 +01:00
kmaasrud
e3f39d4b88 feat: support escapes in attributes
Related issue: #1
2023-03-12 08:19:56 +01:00
Noah Hellman
a7f5b337a8 inline: fix attrs missing for inline verbatim
closes #15
2023-03-11 22:16:56 +01:00
Noah Hellman
418bb38f82 inline: extract ahead_container_attributes 2023-02-21 17:58:05 +01:00
Noah Hellman
c3ff064c78 make Event::is_{,container_}block public
this is only used by html renderer, may be useful for other renderers
also
2023-02-12 00:59:18 +01:00
Noah Hellman
413fecfe6a fix/allow clippy lints 2023-02-12 00:59:18 +01:00
kmaasrud
d7f2c0a819 implement Render trait for html::Renderer 2023-02-10 09:46:18 +01:00
kmaasrud
4743781cb9 add Render trait 2023-02-10 09:45:43 +01:00
kmaasrud
896c7004c4 add input and output args to CLI
This commit also adds a help text, accessible with the `--help` flag, as
well as a version text, available by using `--version`. My hope is that
this commit will make the jotdown CLI a bit friendlier to use.
2023-02-08 22:43:07 +01:00
Noah Hellman
b572790ac9 bug: fix tightness, ignore end blanklines 2023-02-07 21:51:31 +01:00
Noah Hellman
0d560901eb block: add Element::list 2023-02-07 21:49:35 +01:00
Noah Hellman
f98ebd477f bug: fix indent of footnote/list inner
when starting multiple blocks on same line, e.g. inner part of

    - - a
      - b

was

     - a
      - b

instead of

     - a
     - b
2023-02-06 23:09:48 +01:00
Noah Hellman
42360d7001 fixup! block: add MeteredBlock as intermediate struct 2023-02-06 23:09:48 +01:00
Noah Hellman
34452a282a features: add flag for html module 2023-02-05 21:59:38 +01:00
Noah Hellman
0de7776020 impl Clone, Copy on public objects 2023-02-05 20:36:49 +01:00
Noah Hellman
477eadde1c document lib API 2023-02-05 20:36:49 +01:00
Noah Hellman
5efb700c9b move atomic events to Event from Atom
An additional Atom enum seems to be more cumbersome and add little
value.

methods could potentially be used to classify events in several ways,
e.g. block vs inline, atomic vs container
2023-02-05 20:36:49 +01:00
Noah Hellman
2811493c34 html: do not emit newline in beginning 2023-02-05 20:36:49 +01:00
Noah Hellman
7a26476315 fixup! wip djot -> html 2023-02-05 20:36:49 +01:00
Noah Hellman
0420aad0a5 implement symbols
e.g. :some-sym:
2023-02-05 20:36:49 +01:00
Noah Hellman
61f0d6281e rm unused 2023-02-05 20:36:49 +01:00
Noah Hellman
cc5a196149 fixup! parse block elements 2023-02-05 20:36:49 +01:00
Noah Hellman
cc89a06964 fixup! fixup! test_parse, test_block 2023-02-05 20:36:49 +01:00
Noah Hellman
fbd8811c86 block: parse description list 2023-02-05 20:36:49 +01:00
Noah Hellman
95bf52a31e update tree 2023-02-05 20:36:49 +01:00
Noah Hellman
768699d138 optionally use btree maps instead of hash maps
btree maps are deterministic, which is useful for fuzzing. hash maps,
however, have better performance in our case
2023-02-05 20:36:49 +01:00
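
The determinism point in 768699d138 can be illustrated with the standard library alone: a `BTreeMap` always iterates in key order, while the iteration order of a `HashMap` is unspecified and, with the default hasher, may differ between runs. The helper names below are hypothetical, and how the crate actually switches between the two map types is not shown in this log.

```rust
use std::collections::{BTreeMap, HashMap};

/// Keys in iteration order of a BTreeMap: always sorted.
fn btree_keys(pairs: &[(&str, u32)]) -> Vec<String> {
    let map: BTreeMap<&str, u32> = pairs.iter().copied().collect();
    map.keys().map(|k| k.to_string()).collect()
}

/// Keys in iteration order of a HashMap: order is unspecified.
fn hash_keys(pairs: &[(&str, u32)]) -> Vec<String> {
    let map: HashMap<&str, u32> = pairs.iter().copied().collect();
    map.keys().map(|k| k.to_string()).collect()
}

fn main() {
    let pairs = [("b", 2), ("a", 1), ("c", 3)];
    // Reproducible regardless of insertion order or run.
    assert_eq!(btree_keys(&pairs), ["a", "b", "c"]);
    // Same keys, but only comparable after sorting.
    let mut hashed = hash_keys(&pairs);
    hashed.sort();
    assert_eq!(hashed, ["a", "b", "c"]);
}
```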
Noah Hellman
924d6c44ac inline: disallow '<' in autolinks
avoid hangs on long <<<<<< sequences
2023-02-05 20:36:49 +01:00
Noah Hellman
82adc631d9 allow attributes on thematic breaks 2023-02-05 20:36:49 +01:00
Noah Hellman
670763dd93 fixup! do not treat \0 as EOF 2023-02-05 20:36:49 +01:00
Noah Hellman
59450ed9ad fixup! block: split parse_block function 2023-02-05 20:36:49 +01:00
Noah Hellman
cadf49fc53 fix usage of byte vs char count 2023-02-05 20:36:49 +01:00
Noah Hellman
4cb9c07cfc fixup! block attributes 2023-02-05 20:36:49 +01:00
Noah Hellman
82e1fd74f5 fixup! block: add MeteredBlock as intermediate struct 2023-02-05 20:36:49 +01:00
Noah Hellman
59be7070de block: count indent in chars instead of bytes 2023-02-05 20:36:49 +01:00
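
A generic illustration of the byte versus char distinction behind 59be7070de and cadf49fc53, not the crate's actual indent logic: `str::len` counts UTF-8 bytes, so any multi-byte character in the leading whitespace makes a byte-based indent larger than a char-based one. The two helper functions are hypothetical, and treating U+00A0 as indentation is an assumption made only to get a multi-byte whitespace character.

```rust
/// Indent measured in UTF-8 bytes.
fn indent_bytes(line: &str) -> usize {
    line.len() - line.trim_start().len()
}

/// Indent measured in chars (Unicode scalar values).
fn indent_chars(line: &str) -> usize {
    line.chars().take_while(|c| c.is_whitespace()).count()
}

fn main() {
    // Plain ASCII spaces: both measures agree.
    assert_eq!(indent_bytes("  - a"), 2);
    assert_eq!(indent_chars("  - a"), 2);

    // U+00A0 (no-break space) is one char but two bytes in UTF-8.
    let line = "\u{a0}\u{a0}- a";
    assert_eq!(indent_bytes(line), 4);
    assert_eq!(indent_chars(line), 2);
}
```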
Noah Hellman
ca7f3c7e89 do not treat \0 as EOF
may appear in input
2023-02-05 20:36:49 +01:00