Canonical links are expected to be full (absolute) URLs, not relative ones.
For this to work, the Dockerfile had to be updated: it strips the domain name
from links ("<a href..."), but the script previously also rewrote
"<link rel='canonical' .." tags.
With this change, canonical links are left alone.
These hrefs will be replaced:
echo '<a class=foo href="https://docs.docker.com/foo">hello</a>' | sed -e 's#\(<a[^>]* href="\)https://docs.docker.com/#\1/#g'
# <a class=foo href="/foo">hello</a>
echo '<a href="https://docs.docker.com/foo">hello</a>' | sed -e 's#\(<a[^>]* href="\)https://docs.docker.com/#\1/#g'
# <a href="/foo">hello</a>
But this one, for example, is left alone:
echo '<link rel="canonical" href="https://docs.docker.com/foo/bar" />' | sed -e 's#\(<a[^>]* href="\)https://docs.docker.com/#\1/#g'
# <link rel="canonical" href="https://docs.docker.com/foo/bar" />
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Commits 7e5352f1ae and e72030d2c6
added automatic generation of page titles and descriptions from the page's content
if no front-matter metadata was present.
Some pages may include characters that must be escaped before being used in an
HTML attribute or a JSON field.
This patch escapes those texts so that the generated HTML and JSON stay
valid.
Before this:
HTML meta:
<meta name="description" content="docker build: The `docker build` command builds Docker images from a Dockerfile and a " context".="" a="" build's="" context="" is="" the="" set="" of="" files="" located="" in="" specified="" `path`="" or="" `url`...."="" />
JSON meta:
<script type="application/ld+json">{"@context":"http://schema.org","@type":"WebPage","headline":"docker build","description":"docker build: The `docker build` command builds Docker images from a Dockerfile and a "context". A build's context is the set of files located in the specified `PATH` or `URL`....","url":"https://docs.docker.com/engine/reference/commandline/build/"}</script>
After this:
HTML meta:
<meta name="description" content="docker build: The `docker build` command builds Docker images from a Dockerfile and a &quot;context&quot;. A build's context is the set of files located in the specified `PATH` or `URL`...." />
JSON meta:
<script type="application/ld+json">{"@context":"http://schema.org","@type":"WebPage","headline":"docker build","description":"docker build: The `docker build` command builds Docker images from a Dockerfile and a \"context\". A build's context is the set of files located in the specified `PATH` or `URL`....","url":"https://docs.docker.com/engine/reference/commandline/build/"}</script>
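The two kinds of escaping can be sketched in shell as below. This is only an illustration, not the code from this patch: the function names `html_attr_escape` and `json_string_escape` are hypothetical, and the JSON helper only handles backslashes and double quotes (control characters such as newlines are not covered).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the code from this patch.

# Escape text for use inside a double-quoted HTML attribute.
# "&" is replaced first so the entities added afterwards are not re-escaped.
html_attr_escape() {
  printf '%s\n' "$1" |
    sed -e 's/&/\&amp;/g' -e 's/"/\&quot;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Escape text for use inside a JSON string.
# Backslashes are doubled first, then double quotes get a leading backslash.
# Control characters (newlines, tabs) are NOT handled in this sketch.
json_string_escape() {
  printf '%s\n' "$1" |
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

desc='builds Docker images from a Dockerfile and a "context".'
html_attr_escape "$desc"
json_string_escape "$desc"
```

The first call prints the text with `&quot;` in place of each double quote; the second prints it with `\"` instead, which matches the shape of the "after" output above.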
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These styles appear to be intended for use with AngularJS:
https://docs.angularjs.org/api/ng/directive/ngCloak
> The ngCloak directive is used to prevent the AngularJS html template from
> being briefly displayed by the browser in its raw (uncompiled) form while
> your application is loading. Use this directive to avoid the undesirable
> flicker effect caused by the html template display.
I don't think that's used anywhere currently, so let's remove them.
Also removing some other ng-xx classes.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This generates a description for pages that don't have one:
- for reference docs, try the "long" description
- fall back to the "short" description
- finally, fall back to taking the first 30 words from the
  page content
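The fallback order above can be sketched in shell as follows. This is an illustration only, not the code from this patch: `describe_page` and its three positional arguments (long description, short description, page content) are hypothetical, and an empty string is assumed to mean "not set".

```shell
#!/bin/sh
# Hypothetical helper for illustration; not the code from this patch.
# Usage: describe_page LONG SHORT CONTENT
# An empty argument is treated as "not available" (an assumption).
# Note: plain sh has no local variables, so these names are global.
describe_page() {
  long=$1
  short=$2
  content=$3
  if [ -n "$long" ]; then
    # 1. Prefer the "long" description.
    printf '%s\n' "$long"
  elif [ -n "$short" ]; then
    # 2. Fall back to the "short" description.
    printf '%s\n' "$short"
  else
    # 3. Fall back to the first 30 whitespace-separated words of the content.
    printf '%s\n' "$content" |
      awk '{ for (i = 1; i <= NF && n < 30; i++) { printf "%s%s", (n ? " " : ""), $i; n++ } }
           END { print "" }'
  fi
}

describe_page "" "Build an image from a Dockerfile" "ignored page content"
```

This sketch does not append an ellipsis to truncated content and does not strip markup from it; both are left out to keep the fallback order itself easy to follow.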
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This is a very hacky way to extract the page title from pages that do not have
front-matter YAML but do have an H1 header. We need to take (id-) attributes into
account, so some hacking is needed.
Note that there's also a Jekyll plugin that offers similar functionality, but
it requires additional dependencies, and we only have a few pages that need
this, so for now we're using this hack.
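To show why attributes matter here, this is a rough shell sketch of the same idea. It is not the code from this patch: `extract_h1_title` is a hypothetical helper, and it assumes the opening and closing H1 tags are on the same line.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not the code from this patch.
# Prints the text of the first <h1> found on stdin.
# "[^>]*" skips any attributes (such as id="..."), so both
# <h1>Title</h1> and <h1 id="title">Title</h1> are handled.
# Assumes <h1> and </h1> are on the same line.
extract_h1_title() {
  sed -n 's#.*<h1[^>]*>\(.*\)</h1>.*#\1#p' | head -n 1
}

printf '%s\n' '<p>intro</p>' '<h1 id="docker-build">docker build</h1>' |
  extract_h1_title
```

Any markup nested inside the header is kept as-is by this sketch, so it only suits simple, text-only headers.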
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>