FeedPipe Item Descriptions
I am making progress on the implementation of FeedPipe. One aspect of the design that I need to update is the handling of index/feed item descriptions.
In my website software, a content metadata `description` property is used to set the description metadata in the rendered HTML page. The property is optional in general, and no description metadata is included in the HTML page in cases where the property is not set. The description is also used in RSS feeds, however, where the `description` property is required. The content metadata `description` property is therefore required for content that is included in RSS feeds (articles and blog entries).
In the Haskell Books articles, the description of a book is included in the HTML metadata, in the RSS feed, and on the book HTML page. I specify the description in the article metadata (in plain text) as well as in the article content (in Markdown). The two values are usually the same, aside from formatting. If I need to update the description, it must be done in two places.
FeedPipe is not as general as my website software, and I considered
various options for how to handle item descriptions. I discussed the
initial design of FeedPipe index/feed items in the FeedPipe (Part 3) blog
entry. The item metadata does not require a `description` property, and I wrote:

> When an item `description` property is not specified, a description is generated from the title and revisions. Users can specify a string description in cases where the generated description is not wanted.

At the time, I thought that this design would allow users to decide if they want to maintain an item `description` property or not. I pictured the following scenarios:
- Some users may not specify a `description` property in the item metadata. The description would be on the HTML page only, and the RSS feed would show the generated description.
- Users who do not care about proper formatting could specify a `description` property in the item metadata and include it in the HTML by simply using `{{ description|e }}` in the item template.
- Some users may specify a `description` property in the item metadata as well as in the item content, like I do with my website.
After making progress on the implementation, I would like to change the design. There are a few problems with the initial design:
- I do not think that a generated description is appropriate for the HTML metadata.
- The generated description acts as a CHANGELOG for the item, and I think that such information is helpful to include in RSS feeds even when a metadata `description` property is set.
In the updated design, the metadata `description` property is separate from the generated CHANGELOG. The property is still optional. As with my website software, no description metadata is included in the HTML page in cases where one is not set. When the property is set, the description in the RSS feed contains both the metadata `description` property and the generated CHANGELOG. When the property is not set, the description in the RSS feed contains just the generated CHANGELOG.
| metadata `description`? | HTML metadata | RSS item description |
|---|---|---|
| provided | description | description <> changelog |
| not provided | None | changelog |
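
The rule in the table can be sketched in a few lines of Haskell. This is only an illustration under assumptions, not the FeedPipe code: the `Item` record, its field names, and both functions are hypothetical, and `String` is used instead of `Text` to keep the example free of dependencies.

```haskell
-- Hypothetical item record; the names are illustrative only.
data Item = Item
  { itemDescription :: Maybe String -- optional metadata description
  , itemChangelog   :: String       -- generated from the title and revisions
  }

-- HTML metadata: present only when the property is set.
htmlMetaDescription :: Item -> Maybe String
htmlMetaDescription = itemDescription

-- RSS item description: the description (if any) followed by the changelog.
rssDescription :: Item -> String
rssDescription item =
  maybe changelog (<> changelog) (itemDescription item)
  where
    changelog = itemChangelog item
```

With this shape, an item without a `description` property still gets a non-empty RSS description, which satisfies the RSS requirement without forcing users to maintain the property.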
I considered allowing the metadata `description` property to be written using Markdown. The motivation is to give users the option to only write/maintain the description in one place. The Markdown can be transformed to plain text for use in the HTML metadata, and it would have correct formatting in the HTML page.
In the initial design, I decided against this feature because I did not think that it was worth the added complexity. With the redesign, however, I decided to go ahead and try it out because the RSS item description is not plain text. I implemented a `commonmarkToText` function that traverses the `Node` tree of a parsed Markdown document and formats a plain text representation of the content. Since it is only used for description metadata, it only supports content that consists of a single paragraph.
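
To make the idea of such a traversal concrete, here is a minimal sketch. It is not the actual `commonmarkToText` implementation: the `Node` type below is a simplified, hypothetical stand-in for a parser library's tree, and `nodeToText` and `inlineToText` are names invented for this example. Soft line breaks are kept as newlines here so that they can be folded in a later step.

```haskell
-- Simplified, hypothetical Markdown tree; real parser libraries expose
-- richer node types with source positions and more constructors.
data Node
  = Document [Node]
  | Paragraph [Node]
  | Heading Int [Node]
  | Text String
  | Code String
  | Emph [Node]
  | Strong [Node]
  | Link String [Node] -- URL and link text
  | SoftBreak
  deriving (Eq, Show)

-- Accept only a document that consists of a single paragraph.
nodeToText :: Node -> Either String String
nodeToText (Document [Paragraph inlines]) =
  concat <$> traverse inlineToText inlines
nodeToText _ = Left "expected a document with exactly one paragraph"

-- Drop inline formatting and keep only the text content.
inlineToText :: Node -> Either String String
inlineToText node = case node of
  Text s    -> Right s
  Code s    -> Right s
  Emph ns   -> children ns
  Strong ns -> children ns
  Link _ ns -> children ns
  SoftBreak -> Right "\n"
  _         -> Left "unsupported node in description"
  where
    children ns = concat <$> traverse inlineToText ns
```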
The implementation works great, but there is one complication: the handling of soft line breaks. The problem is that the HTML metadata should not contain newlines, so soft line breaks must be folded, but folding line breaks depends on the language of the content. Many languages such as English require a space to be inserted, while many other languages such as Japanese must not have a space inserted.
Always inserting a space is a peeve of mine, so that is definitely not an option. In addition to simply not processing Markdown, I am currently considering the following options:
- The most user-friendly option is to use the text-icu package to join lines based on the content. The API provides functions for breaking strings but not joining them, unfortunately, but I can implement the functionality myself based on the Unicode block of the two characters on either side of a line break. I am not sure if it is worth adding the dependency. I will do some tests before I decide.
- A simple solution is to expose a configuration option that allows users to choose how lines are folded. This would not work well for indexes/feeds that have items in different languages, but perhaps that is an acceptable limitation.
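
Both options can be sketched together. The code below is a rough illustration under assumptions, not a proposal for the final design: `FoldMode`, `foldSoftBreaks`, and `isCjk` are hypothetical names, the code-point ranges are a small hand-picked subset rather than what text-icu would provide, and the rule of omitting the space only when both neighbouring characters are CJK is an assumption of this sketch.

```haskell
import Data.Char (isSpace, ord)
import Data.List (dropWhileEnd)

-- Hypothetical configuration option for folding soft line breaks.
data FoldMode
  = FoldWithSpace    -- always join lines with a space
  | FoldWithoutSpace -- never insert a space
  | FoldByScript     -- decide from the characters around each break
  deriving (Eq, Show)

-- Rough check against a few Unicode blocks (not exhaustive).
isCjk :: Char -> Bool
isCjk c = any inRange ranges
  where
    n = ord c
    inRange (lo, hi) = lo <= n && n <= hi
    ranges =
      [ (0x3000, 0x303F) -- CJK Symbols and Punctuation
      , (0x3040, 0x309F) -- Hiragana
      , (0x30A0, 0x30FF) -- Katakana
      , (0x4E00, 0x9FFF) -- CJK Unified Ideographs
      , (0xFF00, 0xFFEF) -- Halfwidth and Fullwidth Forms
      ]

-- Join the lines of a paragraph into a single line.
foldSoftBreaks :: FoldMode -> String -> String
foldSoftBreaks mode text =
  case filter (not . null) (map trim (lines text)) of
    []       -> ""
    (l : ls) -> foldl (joinLines mode) l ls
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- Both arguments are non-empty because empty lines are filtered out.
joinLines :: FoldMode -> String -> String -> String
joinLines mode acc next = acc ++ separator ++ next
  where
    separator = case mode of
      FoldWithSpace    -> " "
      FoldWithoutSpace -> ""
      FoldByScript
        | bothCjk   -> ""
        | otherwise -> " "
    -- Compare the last character before the break with the first after it.
    bothCjk = isCjk (last acc) && all isCjk (take 1 next)
```

A fixed mode is trivial to implement but applies one rule to every item, while `FoldByScript` handles mixed-language feeds at the cost of maintaining the heuristic (or adding a dependency that knows the Unicode data).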
Note that the `commonmarkToHtml` function avoids this issue. It simply keeps the newlines in the output HTML, leaving it up to the web browser to render the text correctly. In my tests, Firefox correctly renders HTML with separate Japanese lines without inserting spaces.
Even if I keep this functionality, use of the metadata `description` in the HTML page is optional, of course. One may use a short metadata `description` in the HTML metadata and RSS `description` and provide a more detailed description in the item content.