How to gather copyright notices using Maven License Plugin?

I use many open-source libraries in my project and I use license-maven-plugin to gather information about them. I can see all the license texts in the target directory and THIRD-PARTY-included-modules.txt as follows:

Lists of 144 third-party dependencies.
 (Apache License 2.0) Commons Codec (commons-codec:commons-codec:1.8 - http://commons.apache.org/proper/commons-codec/)
 (Apache License 2.0) Commons IO (commons-io:commons-io:1.4 - http://commons.apache.org/io/)
 (Apache License 2.0) Commons Logging (commons-logging:commons-logging:1.1.3 - http://commons.apache.org/proper/commons-logging/)
 (CDDL) JavaBeans Activation Framework (JAF) (javax.activation:activation:1.0.2 - http://java.sun.com/products/javabeans/jaf/index.jsp)
...(and 140 more lines)

However, this doesn't seem to match the legal obligations:

(from MIT license) The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

As far as I can read, I'm supposed to include notices such as:

Copyright (C) 2011, 2014, 2015 Tatsuhiro Tsujikawa

How am I supposed to gather the copyright notices that I should include in the About page?

Here is my pom.xml:

<project ...>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>license-maven-plugin</artifactId>
                <version>1.8</version>
                <configuration>
                    <!--
                        mvn clean license:add-third-party license:download-licenses
                    -->
                    <projectName>Play SQL PageObjects</projectName>
                    <licenseName>Commercial License</licenseName>
                    <organizationName>Play SQL S.A.S.U.</organizationName>
                    <inceptionYear>2015</inceptionYear>

                    <!-- Files we input into license-maven-plugin -->
                    <licenseFile>${basedir}/src/license/PLAY_SQL_LICENSE.txt</licenseFile>
                    <useMissingFile>true</useMissingFile>
                    <!-- The input file with the list of licenses, for those which can't be found automatically -->
                    <missingFile>src/license/THIRD-PARTY.properties</missingFile>
                    <!-- Same as 'missingFile' but in XML, probably -->
                    <licensesConfigFile>src/license/licenses-manual.xml</licensesConfigFile>

                    <!-- Output folder -->
                    <outputDirectory>${project.basedir}/target/classes/META-INF/licenses</outputDirectory>
                    <!-- Text with the output list of all licenses. Just contains the list of projects and websites, does not contain the copyright notices -->
                    <thirdPartyFilename>THIRD-PARTY-included-modules.txt</thirdPartyFilename>
                    <!-- XML with the output list of all licenses -->
                    <licensesOutputFile>target/classes/META-INF/licenses/licenses-generated.xml</licensesOutputFile>
                    <!-- Folder with an output dump of all license text. Usually they contain the license template (for APL2) but not the copyright notices. -->
                    <licensesOutputDirectory>target/classes/META-INF/licenses/text</licensesOutputDirectory>


                    <includedScopes>compile</includedScopes>
                    <excludedScopes>test|provided|runtime|system</excludedScopes>
                    <excludedGroups>com.playsql</excludedGroups>
                    <licenseMerges>
                        <licenseMerge>Apache License 2.0|The Apache Software License|Version 2.0,Apache License, Version 2.0|The Apache Software License, Version 2.0|Apache License, Version 2.0|Apache 2</licenseMerge>
                    </licenseMerges>
                </configuration>
            </plugin>

How am I supposed to gather the copyright notices with the license-maven-plugin (or any other tool)?

Section answered 16/9, 2015 at 10:38 Comment(0)

I've used the attribution maven plugin for this purpose. It generates an XML file that contains all the known license data for the dependencies in the same POM. You can then include this XML file in your packaged code and use it to display an "about" page.

Fabiola answered 15/3, 2017 at 12:59 Comment(0)

I do not know of any efficient copyright scanner/collector available as a Maven plugin. But you can use the ScanCode toolkit to collect copyrights: it has been designed for this and other related purposes. (Disclaimer: I am one of the devs there.)

This requires only a Python interpreter. Once it is installed, run scancode --copyright --json-pp <json output file> <your directory tree> to get the copyrights in JSON format.

Edit: It can also output CSV, YAML, SPDX and CycloneDX

It should be rather easy to wrap ScanCode in a Maven plugin. Help wanted!

See https://github.com/nexB/scancode-toolkit
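
To turn that JSON into a list for an About page, a short Python script is usually enough. The following is only a sketch: it assumes the output has a top-level "files" list whose entries carry a "copyrights" list, and that each detection stores its text under "copyright" (or "value" in older releases). Check those key names against your own output file. The function names are made up for illustration.

```python
from __future__ import annotations

import json
from collections import defaultdict


def load_scan(path: str) -> dict:
    """Read a ScanCode JSON output file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def collect_copyrights(scan: dict) -> dict[str, list[str]]:
    """Map each distinct copyright statement to the sorted paths where it was found."""
    found = defaultdict(set)
    for entry in scan.get("files", []):
        for hit in entry.get("copyrights", []):
            # Assumption: the statement text is stored under "copyright"
            # (newer releases) or "value" (older releases).
            text = hit.get("copyright") or hit.get("value")
            if text:
                found[text.strip()].add(entry.get("path", "?"))
    return {text: sorted(paths) for text, paths in sorted(found.items())}
```

Calling collect_copyrights(load_scan("scan.json")) then gives one entry per distinct statement, which you can print or write into your notice file.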

Fur answered 1/11, 2015 at 9:20 Comment(5)
Do you know of any tool which can convert the JSON output to CSV? Asking this before I start writing my own code! - Rydder
Did you get the help? - Fourfold
@Rydder ScanCode does support CSV output - Fur
@Fourfold I did not get help. If you want to write a Maven plugin with ScanCode inside, I will help you - Fur
@PhilippeOmbredanne Thank you for your reply. I was thinking of writing it, but instead I wrote a Python script and used the CycloneDX Maven SBOM plugin to generate the required data. The requirement was to auto-generate the notice file in the Eclipse OS org template format. For scanning the code, I used the DASH plugin (which uses the ScanCode tool internally), as the only requirement there was to check open-source license compliance. - Fourfold

I created a fork of the MojoHaus license-maven-plugin, discussed here: https://github.com/mojohaus/license-maven-plugin/issues/357. Be warned: it has not been tested intensively and most likely still has some small bugs, but it works for my purposes. Its advantage is speed: it is much faster than ScanCode, which brute-force scans even binary files and needs all archives to be extracted before scanning.

The plugin writes everything it can fetch into target\generated-resources\licenses.xml, including the license and notice text files.

The notices text files usually contain the copyright notices you need. The copyright notices were one of the main reasons to write the extension.
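
If you want to post-process that licenses.xml, something like the Python sketch below could list which license file belongs to which dependency. It assumes the usual licenseSummary / dependencies / dependency / licenses / license layout with name and file elements; the extra elements that extendedInfo adds are not covered, and the sample XML and function name are only illustrative.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Illustrative sample only; a real licenses.xml is written by the plugin.
SAMPLE = """<licenseSummary>
  <dependencies>
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>1.4</version>
      <licenses>
        <license>
          <name>Apache License 2.0</name>
          <file>apache license 2.0 - license-2.0.txt</file>
        </license>
      </licenses>
    </dependency>
  </dependencies>
</licenseSummary>"""


def list_license_files(xml_text: str) -> list[tuple[str, str, str]]:
    """Return (coordinates, license name, license file) for every declared license."""
    root = ET.fromstring(xml_text)
    rows = []
    for dep in root.iter("dependency"):
        coords = ":".join(
            dep.findtext(tag, default="?")
            for tag in ("groupId", "artifactId", "version")
        )
        for lic in dep.iter("license"):
            rows.append((
                coords,
                lic.findtext("name", default="?"),
                lic.findtext("file", default=""),
            ))
    return rows
```

For a real run, read the generated file into a string and pass it to list_license_files instead of SAMPLE.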

Just clone it from https://github.com/JD-CSTx/license-maven-plugin. To build and install it quickly just for testing use mvn install -DskipITs=true -DskipTests=true.

The goal is license:aggregate-download-licenses, the version is 2.1.0-SNAPSHOT, and the option to enable is extendedInfo.

It can also write an Excel file with the option writeExcelFile. Beware: Excel cells are truncated because of the 32,767-character limit per cell.

Config for your project's pom.xml:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>license-maven-plugin</artifactId>
  <version>2.1.0-SNAPSHOT</version>
  <configuration>
    <includeTransitiveDependencies>true</includeTransitiveDependencies>
    <verbose>true</verbose>
    <!-- New -->
    <extendedInfo>true</extendedInfo>
    <!-- New -->
    <writeExcelFile>true</writeExcelFile>
    ...

I would love some feedback on this.

Goldplate answered 19/11, 2019 at 8:39 Comment(1)
If all you need is scanning metadata, then you can use scancode --package to get that. - Fur

This is not strictly about license-maven-plugin, but another neat approach I took was to auto-generate the NOTICE.md file in the format used here (any format works, as the steps below show):

https://github.com/eclipse/openmcx/blob/main/NOTICE.md

Here is what I did:

  1. Use cyclonedx-maven-plugin to generate an SBOM (in JSON or XML format).
  2. Create a template from the file linked above (or any file you want to use as a template).
  3. Put placeholders in the template where you want data specific to your repository (e.g. ##third_party_licenses## for the list of third-party licenses, ##repo## for the repository name).
  4. Write a Python script that uses an XML or JSON library to parse the generated SBOM and get the values for your placeholders.
  5. Replace the placeholders in the template with those values and write the NOTICE.md file.
  6. Commit the NOTICE.md file. (Configure the git user beforehand. If you use a GitHub Action, you might not want the GitHub bot to do the push, but rather the user from the GITHUB_ACTOR environment variable.)
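
Steps 4 and 5 can be sketched roughly like this in Python. It assumes a CycloneDX JSON SBOM with a top-level "components" list, where each component may carry "licenses" entries holding either a "license" object (with "id" or "name") or an "expression". The function names are hypothetical and the placeholders are the ones from step 3.

```python
from __future__ import annotations


def license_names(component: dict) -> list[str]:
    """Return the license ids, names, or expressions declared for one component."""
    names = []
    for item in component.get("licenses", []):
        lic = item.get("license", {})
        label = lic.get("id") or lic.get("name") or item.get("expression")
        if label:
            names.append(label)
    return names or ["(no license declared)"]


def third_party_lines(bom: dict) -> str:
    """Build one sorted bullet line per component in the SBOM."""
    lines = []
    for comp in bom.get("components", []):
        coords = f'{comp.get("name", "?")} {comp.get("version", "")}'.strip()
        lines.append(f"- {coords}: {', '.join(license_names(comp))}")
    return "\n".join(sorted(lines))


def render_notice(template: str, bom: dict, repo: str) -> str:
    """Fill the ##repo## and ##third_party_licenses## placeholders."""
    return (
        template
        .replace("##repo##", repo)
        .replace("##third_party_licenses##", third_party_lines(bom))
    )
```

Load the SBOM with json.load, read the template file as text, and write the result of render_notice to NOTICE.md. Note that an SBOM carries license names, not the copyright statements themselves, so this covers the license list only.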

PS: The good thing is that you can ask ChatGPT or Meta AI to generate this Python script for you. If this needs to be done in multiple repositories, you can use a GitHub reusable workflow and composite GitHub Actions to maintain the script and template file in a single separate repository.

Fourfold answered 13/8 at 3:41 Comment(0)
