Help with LibreSSL manpages

Discussion:

Stephen Gregoratto

2018-11-25 13:26:21 UTC

Hello,

I've recently been getting into (re)writing my manpages using mdoc(7),
and came across Ingo's talk about mandoc/LibreSSL [1]. In it he
mentioned that there are still some functions to document and many pages
need a couple of goes over (specifically openssl(1)).

Now I've never developed for Open/LibreSSL, and have an OK knowledge of
C, but I do have a bit of free time over Christmas and would be happy to
help out in any way. Would I need to fully grok the code before I could
write the docs?

[1] https://www.openbsd.org/papers/eurobsdcon2018-mandoc.pdf

--
Stephen Gregoratto

Ingo Schwarze

2018-11-25 16:36:16 UTC

Hi Stephen,

Post by Stephen Gregoratto
I've recently been getting into (re)writing my manpages using mdoc(7),
and came across Ingo's talk about mandoc/LibreSSL [1]. In it he
mentioned that there are still some functions to document and many
pages need a couple of goes over (specifically openssl(1)).
Now I've never developed for Open/LibreSSL,

That will make the learning curve significantly steeper, but it should
still be possible to help. You should expect to spend considerable
amounts of time learning how the features you are trying to document
work, though, in addition to the time needed to work on the text itself.

Post by Stephen Gregoratto
and have an OK knowledge of C,

That is good.

Post by Stephen Gregoratto
but I do have a bit of free time over Christmas

If you never plan to come back to the project in 2019 or later, i
have some doubts as to how much you can get done. If one or two
weeks is all the time you have to spend, most of that will be used
up learning about LibreSSL, mdoc, and about how OpenBSD development
works in general.

If you have some time now and are not yet sure whether you might
be interested to work again on it in the future but consider that
possible, just try.

Post by Stephen Gregoratto
and would be happy to help out in any way.

That's quite welcome.

Post by Stephen Gregoratto
Would I need to fully grok the code before I could write the docs?

Absolutely not. You could spend an infinite amount of time to
understand the code if you tried to understand everything.
Of course, there is nothing wrong with studying whatever interests
you.

But if your main goal is to improve the docs, you only need to look
at the code when it is unclear what a given feature does or how it
must be used, and you only need to understand enough of the code
such that you can answer these two questions.

Judging from how you describe your knowledge and experience,
i (wildly) *guess* that the main challenges might be:

* Figuring out what are gaps in the documentation and what is
intentionally undocumented (as i explained in the talk, that's
a serious problem for anyone, even for me, heck, even for jsing@).

* Getting used to the style of OpenBSD documentation (assuming i
understand correctly that you never provided significant
contributions to OpenBSD docs before). You can't learn that
style from LibreSSL docs because they are written in a different
style, in OpenSSL style. The only way to learn that style is
trial and error, discussing your patches with OpenBSD developers,
and occasionally looking at OpenBSD libc section 2 and section 3
manual pages.

* Getting used to the way OpenBSD development works. You learn
that by sending patches and listening to the feedback.

* Getting used to the mdoc(7) language. You learn that by
reading mdoc(7), sending patches, and listening to the feedback.
It's likely the least of the challenges.

Now, where should you start, both in the sense of choosing a subject
area and in the sense of choosing a working style?

Regarding the subject area / sub-library: if there is an area you
are personally interested in, start with that. Otherwise, consider
starting with something that is important (BN_*, EVP_*, RSA_*) or
modern (EC*, TLS 1.3). If you ever enter into a conversation with
jsing@ over one of your patches, ask him what is most urgent.

Possible working styles depend on what you are best at.

* Copy editing: read existing manuals, improve the style,
make it more concise, more precise, clearer, close gaps.
Also fix the order of information.
Needed almost everywhere.
Works best with a strong understanding of OpenBSD manual style.

* New pages: take an important or modern sub-library, look at the
public headers and identify undocumented functions. Check with
LibreSSL developers which ones should be documented. Write
new manuals. Needed almost everywhere.
Requires moderate experience writing function manuals from scratch.

* Hunting for bugs: compare docs and code and identify contradictions
between both: mismatches between argument types in the *.h, *.c,
and *.3 files, mismatches of return types, wrong .In lines,
typos in function names or other parts of prototypes, statements
about behaviour or usage that contradict the code.
Can be tedious and take quite some time with little visible
results, but requires less experience and has the side-effect
of building familiarity with the code and docs.
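
As a rough illustration of the "new pages" and "hunting for bugs" styles
above, here is a minimal Python sketch that compares the function names
declared in a header with the names a manual page mentions via the mdoc(7)
.Fn and .Fo macros. It is not a tool used by the project. The helper name
compare() is hypothetical, the prototype regex is deliberately crude, and the
HEADER and PAGE strings are made-up samples (including the intentional typo
"BN_subb"), not excerpts from real files.

```python
import re

# Crude prototype matcher: return type, function name, parameter list, ";".
# Assumption: it ignores macros, function pointers, and comments.
PROTO_RE = re.compile(
    r"^\s*[A-Za-z_][\w\s\*]*?[\s\*]([A-Za-z_]\w*)\s*\([^;{}]*\)\s*;",
    re.MULTILINE,
)

# Function names introduced by the mdoc(7) macros .Fn and .Fo.
MDOC_RE = re.compile(r"^\.(?:Fn|Fo)\s+(\w+)", re.MULTILINE)


def compare(header_text, mdoc_text):
    """Return (declared but not mentioned, mentioned but not declared)."""
    declared = set(PROTO_RE.findall(header_text))
    mentioned = set(MDOC_RE.findall(mdoc_text))
    return sorted(declared - mentioned), sorted(mentioned - declared)


# Made-up sample header text.
HEADER = """
int BN_add(BIGNUM *r, const BIGNUM *a, const BIGNUM *b);
int BN_sub(BIGNUM *r, const BIGNUM *a, const BIGNUM *b);
BIGNUM *BN_new(void);
"""

# Made-up sample manual page fragment with a deliberate typo in one name.
PAGE = """\
.Sh SYNOPSIS
.In openssl/bn.h
.Ft int
.Fo BN_add
.Fa "BIGNUM *r"
.Fa "const BIGNUM *a"
.Fa "const BIGNUM *b"
.Fc
.Ft int
.Fn BN_subb "BIGNUM *r" "const BIGNUM *a" "const BIGNUM *b"
"""

missing, unknown = compare(HEADER, PAGE)
print("declared but not mentioned:", missing)
print("mentioned but not declared:", unknown)
```

A name in the first list is only a candidate: whether it should be
documented at all still has to be checked with the LibreSSL developers. A
name in the second list may be a typo in the page, or simply a function
declared in a different header.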

Note that the page openssl(1) is a special case. It already had
basic copy editing by jmc@ regarding style and language, but it
could use more checking and improving of the content. Since it is
merely a testing tool, it is not the most important part; the reason
for mentioning it in the talk was that it is a part that so far had
less review than other parts, no matter the importance.

If you still don't have a clear preference how to start, one useful
approach that i'd recommend is to go through recent LibreSSL commits
in reverse chronological order (that makes sure what you do is at
least of some relevance). For each commit, figure out whether it
has user-visible consequences. If so, check whether the docs
correctly describe how the library works after the commit in this
respect. If not, send patches to fix the docs. Also, send patches
for any other bugs or gaps that you discover in the process.

Even if you want to pick a different working style, going through
recent commits can still serve to choose a subject area / sub-library:
If an area got at least a handful of non-trivial commits during the
last year, it is likely worth working on.

Good luck,
Ingo

Stephen Gregoratto

2018-11-26 10:24:25 UTC

Thanks for your response Ingo. I think I'll start with the missing
functions and go through them by order of length. I'll try and peruse
through the ports and check for any examples.

Speaking of functions: I'm trying to generate a list of each function,
the source file it's defined in and the corresponding line number,
similar to the format of `grep -n`. Is there a way to force ctags to
output in some tabular format that can be AWK'd? The -x option isn't
cutting it for me.

--
Stephen Gregoratto

Ingo Schwarze

2018-11-26 14:01:26 UTC

Hi Stephen,

Post by Stephen Gregoratto
Thanks for your response Ingo. I think I'll start with the missing
functions and go through them by order of length.

Not saying "by order of length" is impossible, but keep in mind that

* There are few section 3 manual pages documenting only a single
function. Especially in bloaty libraries like libcrypto/libssl,
it is usually better to document very closely related functions
together on one page. (Rule of thumb: avoid text duplication,
but don't make pages too long and complicated.)

* Going by length implies that you will have to learn about a new
sub-library for almost every function you tackle.

* Going by length implies that you will be sending a mix of diffs
for functions of vastly different importance; it may be harder
to get reviews for functions of lower importance and to get
those committed.

So the criterion "by length" will surely make things harder for you.
Not saying the criterion is impossible to use, just making sure you
know it has some serious downsides for you.

Post by Stephen Gregoratto
I'll try and peruse
through the ports and check for any examples.

Be aware that functions exist that are notoriously misused,
and code in ports is of widely varying quality. Large parts of
the ports tree contain code that is way below OpenBSD quality
standards, so you will learn much less from looking at ports code
than from looking at base system code, unless you know which ports
are of unusually high code quality.

Post by Stephen Gregoratto
Speaking of functions: I'm trying to generate a list of each function,

Not saying "don't do that" - but be aware that the reason why nobody
did that yet (not even when giving a public talk that included
statements about "future directions") is that it is not really
needed in the current situation to make progress. Such a list
becomes useful when we get very close to the goal of having everything
documented. That is years in the future at best. A list containing
hundreds of functions doesn't seem all that useful to me.

Post by Stephen Gregoratto
the source file it's defined in and the corresponding line number,
similar to the format of `grep -n`. Is there a way to force ctags to
output in some tabular format that can be AWK'd? The -x option isn't
cutting it for me.

I have no idea, i never used ctags(1), nor any similar tool for that
matter. I never saw a use for it. If you want to use it, figure it
out. :)
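
For the tabular listing asked about above, one possible approach is to
post-process the output of `ctags -x` rather than change what ctags prints.
The Python sketch below assumes the column layout POSIX describes for
`ctags -x` (object name, line number, file name, then the text of the line)
and assumes file names contain no whitespace. It has not been checked
against any particular ctags implementation. The function name
ctags_x_to_grep() is hypothetical, and the sample lines, including their
file names and line numbers, are invented.

```python
def ctags_x_to_grep(lines):
    """Convert `ctags -x` style lines to sorted "file:line:name" strings."""
    entries = []
    for raw in lines:
        # Split into name, line number, file; the rest is the source text.
        fields = raw.split(None, 3)
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip anything that does not look like an index line
        name, lineno, path = fields[0], int(fields[1]), fields[2]
        entries.append((path, lineno, name))
    # Sort by file, then by line number within each file.
    return [f"{path}:{lineno}:{name}" for path, lineno, name in sorted(entries)]


# Invented sample input in the assumed `ctags -x` layout.
SAMPLE = [
    "BN_sub   112 bn_add.c  BN_sub(BIGNUM *r, const BIGNUM *a, const BIGNUM *b)",
    "BN_add    67 bn_add.c  BN_add(BIGNUM *r, const BIGNUM *a, const BIGNUM *b)",
    "BN_new    80 bn_lib.c  BN_new(void)",
]

for entry in ctags_x_to_grep(SAMPLE):
    print(entry)
```

The colon-separated result resembles `grep -n` output, so it can be split
on ":" in awk(1) or any similar tool.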

There is no need to try and refute the points made in this posting,
and you can proceed as described here if you really want to. I'm
merely trying to make sure that you are not making the project
harder for yourself without even being aware that that's what you
are doing. After all, you asked for advice on how to proceed and
neither "producing a complete list of missing functions" nor
"mastering automated tools for static source code analysis" was
anywhere near my list of potential obstacles that might have to
be overcome. ;-)

Yours,
Ingo

Joel Sing

2018-11-28 14:47:15 UTC

Post by Ingo Schwarze

Post by Stephen Gregoratto
Would I need to fully grok the code before I could write the docs?

Absolutely not. You could spend an infinite amount of time to
understand the code if you tried to understand everything.
Of course, there is nothing wrong with studying whatever interests
you.

In addition to what Ingo has already said, I would add that I think it is
actually beneficial for someone who is not highly familiar with the code to
write the documentation, as it gives you a fresh set of eyes and a different
perspective. You'll question and challenge things rather than accepting them
at face value. However, you obviously need to at least be able to read the
code to generally understand what it is doing.

Furthermore, I'd suggest that you start small (fix some typos, correct some
grammatical issues) and then work up to bigger changes (rewriting poorly
written documentation or creating new man pages). This will let you become
familiar with the documentation style and review/commit process, with faster
reviews and feedback.