Listing the third-party software inside a project is a boring task. Ask any motivated developer and try to find one who will not roll their eyes when asked to do this kind of thing. We (engineers) don't like it, yet are doomed to get this request every now and then. It is not productive to repeat the same thing over and over again, so why can't someone make it simpler?
Waiting a couple of years didn't work, so it is time to roll up the sleeves and find an easier way of getting this sorted. To date, one needs to manually list each and every portion of code that is not original (e.g. libraries, icons, translations, etc.), and this list ends up either in a text file or in a spreadsheet (pick your poison).
There are ways to manage dependencies. Think of npm, Maven and similar. However, you need to be using a dependency manager, and this doesn't solve the case of non-code items: for example, when you want to list that package of icons from someone else, or list dependencies that are part of the project but not really part of the source code (e.g. servers, firewalls, etc.).
For these cases, you still need to do things manually, and it is painful. At TripleCheck, we don't like writing these lists ourselves, so we started looking into how to automate this step once and for all. Our requirements: 1) simple, 2) tool-agnostic and 3) portable.
So we leaned towards the way configuration files work, because they are plain text files that are easy for humans to read or edit and straightforward for machines to parse. We are big fans of SPDX because it permits describing third-party items in intricate detail, but a drawback of being so detailed is that sometimes we only have coarse-grained information. For example, we know that the files in a given folder belong to some person and have a specific license (maybe we even know the version), but we don't want to compute the SHA1 checksum for each and every file in that folder (either because the files might change often, or simply because the engineer cannot do it easily and quickly).
Turns out we were not alone in this kind of quest. NexB had already pioneered, in previous years, a text format specifically for this kind of task, defining the ".ABOUT" file extension to describe third-party copyrights and applicable licenses: http://www.aboutcode.org/
The text format is fairly simple. Here is an example we use ourselves:
    name: jsTree
    license_spdx: MIT
    copyright: Ivan Bozhanov
    version: 3.0.9
    spec_version: 1.0
    download_url: none
    home_url: http://jstree.com/

    # when was this ABOUT file created or last updated?
    date: 2015-09-14

    # files inside this folder and sub-folders
    about_resource: ./
Basically, it follows the SPDX license abbreviations to ensure we use a common way of talking about the same license, and you can add or omit information depending on how much is available. Pay attention to the "about_resource" field, which describes what is covered by this ABOUT file. Using "./" means all files in this folder and in its sub-folders.
One interesting point is the possibility of nesting multiple ABOUT files. For example, place one ABOUT file at the root of your project to describe the license terms generally applicable to the project, and then create specific ABOUT files for specific third-party libraries/items to describe what applies in those cases.
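To make the nesting idea concrete, here is a minimal Python sketch of how a tool could decide which ABOUT file applies to a given file. It is not taken from the NexB tooling or from our engine: the function name find_covering_about is hypothetical, and the rule that the most deeply nested ABOUT file wins over the ones higher up is our assumption for illustration.

```python
from pathlib import Path
from typing import Optional


def find_covering_about(target: Path, project_root: Path) -> Optional[Path]:
    """Return the closest ABOUT file at or above the folder of `target`.

    Assumption (not from any spec): the most deeply nested ABOUT file
    takes precedence over ABOUT files placed higher up in the tree.
    Returns None when no ABOUT file is found up to `project_root`.
    """
    project_root = project_root.resolve()
    target = target.resolve()
    folder = target if target.is_dir() else target.parent
    while True:
        # sorted() keeps the choice deterministic if a folder holds several
        candidates = sorted(
            entry for entry in folder.iterdir()
            if entry.is_file() and entry.name.upper().endswith(".ABOUT")
        )
        if candidates:
            return candidates[0]
        # Stop at the project root (or at the filesystem root as a safety net)
        if folder == project_root or folder == folder.parent:
            return None
        folder = folder.parent
```

With one ABOUT file at the project root and another inside a library folder, a file inside the library folder resolves to the library's ABOUT file, while everything else falls back to the root one.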
When done with the text file, place it in the same folder as what you want to cover. The "about_resource" field can also point to a single file, or be repeated on several lines to cover a very specific set of files.
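Since the format is just "key: value" lines plus "#" comments, reading it from a script takes only a few lines. The sketch below is a simplified reader written for this post, not the official parser: the function name parse_about is hypothetical, it ignores anything the full specification may allow beyond single-line fields, and it keeps every value in a list so that a repeated "about_resource" is not lost. The extra "about_resource" lines in the sample are made up for the example.

```python
from typing import Dict, List

# Sample content; the two file names under about_resource are made up
SAMPLE = """\
name: jsTree
license_spdx: MIT
copyright: Ivan Bozhanov
version: 3.0.9
home_url: http://jstree.com/

# a very specific set of files
about_resource: jstree.js
about_resource: jstree.min.js
"""


def parse_about(text: str) -> Dict[str, List[str]]:
    """Parse "key: value" lines into a dict mapping each key to its values."""
    fields: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URLs in the value stay intact
        key, separator, value = line.partition(":")
        if not separator:
            continue  # not a "key: value" line; ignored in this sketch
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


fields = parse_about(SAMPLE)
print(fields["license_spdx"])    # ['MIT']
print(fields["about_resource"])  # ['jstree.js', 'jstree.min.js']
```

Splitting on the first colon only is what keeps a value such as http://jstree.com/ in one piece.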
NexB made tooling available to collect ABOUT files and generate documentation. Unfortunately, this text format is not as well known as it should be. Still, it fits like a glove as an easy solution to list third-party software, so we started using it for automating the code detection.
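To give an idea of what "collect and generate" can look like, here is a rough Python stand-in. It is neither the NexB tooling nor our engine, just a sketch under simple assumptions: the helper names (read_fields, collect_components, format_listing) are hypothetical, only single-line "key: value" fields are read, and the output is a plain one-line-per-component listing.

```python
import os
from typing import Dict, List


def read_fields(path: str) -> Dict[str, str]:
    """Read "key: value" lines from one ABOUT file (first value per key)."""
    fields: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("#") or ":" not in line:
                continue  # skip comments, blank lines and anything else
            key, _, value = line.partition(":")
            fields.setdefault(key.strip(), value.strip())
    return fields


def collect_components(root: str) -> List[Dict[str, str]]:
    """Walk `root` and return the fields of every ABOUT file found."""
    components = []
    for folder, _subfolders, files in os.walk(root):
        for file_name in files:
            if file_name.upper().endswith(".ABOUT"):
                full_path = os.path.join(folder, file_name)
                fields = read_fields(full_path)
                # Remember where the information came from
                fields["about_file"] = os.path.relpath(full_path, root)
                components.append(fields)
    return sorted(components, key=lambda item: item["about_file"])


def format_listing(components: List[Dict[str, str]]) -> str:
    """Render one "name | version | license | copyright" line per component."""
    rows = []
    for item in components:
        rows.append(" | ".join([
            item.get("name", "unknown"),
            item.get("version", "unknown"),
            item.get("license_spdx", "unknown"),
            item.get("copyright", "unknown"),
        ]))
    return "\n".join(rows)
```

Calling format_listing(collect_components(".")) from the root of a project gives a listing that can be pasted wherever the spreadsheet used to go.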
Our own TripleCheck engine now recognizes .ABOUT files and adds this information automatically to the generated reports. There is even a simple web frontend for creating .ABOUT files at http://triplecheck.net/components/
From that page, you can either create your own .ABOUT files or simply browse through the collection of already created files. The backend of that web page is powered by GitHub; you can find the repository at https://github.com/dot-about/components/tree/master/samples
So, no more excuses to keep listing third-party software manually in spreadsheets.
Have fun! :-)