File headers and Copyright statements


#1

The issue of file headers and copyright statements comes up from time to time, so I figured it was time we standardized on some guidance for .NET Foundation projects. I thought I would kick off a proposal below, but I want everyone to weigh in with opinions, as lots of people here have a lot more experience dealing with this in different places than I do.

Why have a file header?

People (including Wikipedia) sometimes say that US copyright law no longer requires a copyright notice. Regardless, it is beneficial to make sure anyone looking at a piece of source code knows where the code came from and under what license it is being provided. A brief file header is the easiest and most conventional way to do this.

Rationale behind Guidance

  1. Keep file headers as brief as possible and standardized to make it easy to apply the right header.
  2. Minimize the number of times that a file gets touched just to update the header or license text, to avoid polluting history in Git.
  3. Minimize the chance of a file header becoming incorrect as it gets out of date over time and as code is refactored.
  4. Make sure we give proper credit to the awesome contributions given by the many passionate people involved in each project.
  5. Make sure we have a way to clearly identify the copyright holders who originally submitted a contribution into the .NET Foundation project.
  6. Make sure we have a way to give credit to the shoulders on which we stand by properly attributing other open source included in our projects, and bring that attribution front and center so people understand what open source is included in a project.

Copyright and License Notice Guidance (Proposal)

At a time convenient to the project, committers should move to a standardized file header in all non-generated source files, such as the following template for C# source code:

//  
// Copyright (c) .NET Foundation. All rights reserved.  
// Licensed under the ##LICENSENAME##. See LICENSE file in the project root for full license information.  
//  

For example:

//  
// Copyright (c) .NET Foundation. All rights reserved.  
// Licensed under the MIT License. See LICENSE file in the project root for full license information.  
//  
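A header this regular lends itself to an automated check. The following is a minimal sketch in Python, not part of the proposal: the function name `has_standard_header` and the decision to ignore blank lines and bare `//` lines are assumptions made for illustration.

```python
# Hypothetical helper; the names here are illustrative, not part of the guidance.
EXPECTED_HEADER = (
    "// Copyright (c) .NET Foundation. All rights reserved.",
    "// Licensed under the MIT License. See LICENSE file in the project root for full license information.",
)


def has_standard_header(source_text, expected=EXPECTED_HEADER):
    """Return True if the file starts with the expected header lines."""
    # Trailing whitespace is ignored, so "//  " is treated like "//".
    lines = [line.rstrip() for line in source_text.splitlines()]
    # Drop blank lines and the empty comment lines that frame the header.
    meaningful = [line for line in lines if line not in ("", "//")]
    return tuple(meaningful[:len(expected)]) == tuple(expected)
```

A check like this could run in CI so that a missing or stale header is caught in review rather than fixed in a later header-only commit.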

Projects must have a file at the root of the repository called LICENSE (or LICENSE.md or LICENSE.txt) that contains the full license statement of the project. Within that license the copyright holder should be listed as “.NET Foundation” and no year is required in the copyright notice.

Projects should have a CONTRIBUTORS (or CONTRIBUTORS.md) file at the root of their repository that lists the authors and copyright holders that have made non-trivial contributions to that project source code. The format of that file is as follows:

Contributors
--------------
This is the official list of the ##PROJECTNAME## project contributors.
Names of the original copyright holders (individuals or organizations)
should be listed with a '*' in the first column. People who have 
contributed from an organization can be listed under the organization
that actually holds the copyright for their contributions (see the 
Microsoft organization for an example). Those individuals should have
their names indented and be marked with a '-'
 
* Contoso
  - Egon Andersson
  - Zdenek Broz

* Marius Carlsson

* Microsoft
  - Sofie Hellstrom
  - Pontus Lindberg
  - Nada Lukic
  - Milun Radic
  - Frantiska Spackova
  - Kaari Suutari-Jaasko
  - Jaroslav Tomek
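Because the format is line-oriented, it is straightforward to read mechanically. As a rough sketch (the function name `parse_contributors` and the returned dictionary shape are assumptions, not part of the proposal), a parser might look like this:

```python
def parse_contributors(text):
    """Map each copyright holder ('*' lines) to the people listed under it ('-' lines)."""
    holders = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if raw.startswith("* "):
            # '*' in the first column: an original copyright holder.
            current = line[2:].strip()
            holders.setdefault(current, [])
        elif current is not None and raw[:1].isspace() and line.startswith("- "):
            # Indented '-' line: an individual contributing under that holder.
            holders[current].append(line[2:].strip())
    return holders
```

For the sample above, this would yield a key for each of Contoso, Marius Carlsson and Microsoft, with an empty list for the individual copyright holder.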

If a project contains open source code from another source, then the original location should be noted in comments within the file. If only a method or snippet from a file is used, then the beginning and end of that snippet should be clearly identified with attribution in the source so that the attribution can easily move with the code during refactors. In addition, a reference to the source project and the full license the code was provided under should be included in a NOTICES (or NOTICES.md / NOTICES.txt) file at the root of the project at the time the code is added.

Notes:

  1. There is no year provided in the copyright statement on the file nor in the copyright statement in the LICENSE file. This is a deliberate simplification to avoid unnecessary churn in the code base.
  2. When it contains non-ASCII characters, the CONTRIBUTORS file should be encoded in UTF-8 with no BOM (as opposed to CP-1252 / ANSI). The same goes for the LICENSE and NOTICES files.
  3. The CONTRIBUTORS file is a deliberate hybrid between a list of the copyright holders who have contributed to the project and the individuals actually contributing, often on behalf of an organization. The idea is that the original copyright holders can be identified, but we can also ensure all contributors get kudos for their contributions. This prevents individual file headers getting added to over time (and then accidentally getting copy/pasted or lost as code is refactored around the codebase). The major downside of this format is that an individual may be listed multiple times (for example, if they contributed as an individual and then went to work for a company that paid them to contribute to the project).
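The encoding requirement in note 2 can be checked mechanically. Below is a small sketch in Python; the function name `check_encoding` and its three return labels are made up for this example.

```python
import codecs


def check_encoding(data):
    """Classify raw file bytes as 'utf-8', 'utf-8-bom', or 'not-utf-8'."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Typically a legacy code page such as CP-1252.
        return "not-utf-8"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-bom"
    return "utf-8"
```

It would be called on the bytes of each file, for example `check_encoding(open("CONTRIBUTORS", "rb").read())`, and only the plain `'utf-8'` result satisfies the note as written.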

Inspiration

The guidance above is based on best practices seen in a number of communities, but outside of the .NET Foundation projects the following guidance was influential in coming up with our recommendations:

TO-DO

As well as getting feedback on this proposal, I want to pull together some examples for the NOTICES and LICENSE section before finalizing. Ideally I would point to current live projects as a model example, but when I started looking I found lots of inconsistencies (which is one of the reasons I started writing this forum post). I’m also digging into SPDX to see what we can do there.

If any projects want to pilot implementing something like this so we can learn from it, then feel free (but bear in mind that this guidance is an early draft and is likely to change based on feedback).

While heavily inspired by Angular, Apache and Eclipse, the guidance above is a little original, which always worries me. I’d rather have been able to point to another community and say “do it like X”, but I didn’t find one I was entirely happy with. However, if you know of examples you think are best of breed, then let me know.

Let me know what you think.


#2

Sounds clean and straightforward to me. Might be good for individual projects to quantify in their CONTRIBUTORS file what the minimum bar is for including someone’s name, whether it be based on LOC or # of commits or something like that so that they don’t have to constantly answer the question “why am I not listed” from people who contributed one or two tiny PRs.


#3

How about projects have something in their Contributing.md or Contribution section of their ReadMe / Wiki that says something like below (borrowed heavily from the CoreCLR project):

Typos are embarrassing! We will accept most PRs that fix typos and you shouldn’t even have to sign a contribution license agreement before they can be merged.

For significant pull requests fixing bugs or adding new features, you will need to sign the Contribution License Agreement before the pull request can be merged. You only need to do that once for any .NET Foundation project. The first time you submit a significant change to our project you should also make sure you have added your details to the CONTRIBUTORS file as part of your pull request (if in doubt ask).

In order to make it easier to review your PR, please focus on a given component with your fixes or on one type of typo across the entire repository. If it’s going to take >30 mins to review your PR, then we will probably ask you to chunk it up.

While I do see a few “resume populating” PRs, they are pretty rare. More often people show signs of impostor syndrome and don’t add their name to the hall of fame when they probably should. If someone goes to the length of submitting a PR that is of value and includes their name being added to the CONTRIBUTORS file, then I would personally tend to err on the side of accepting it and encouraging future contributions, rather than asking them to pull that change out of the PR before accepting it, or rejecting the PR entirely. If someone sends in a non-trivial PR and hasn’t updated the CONTRIBUTORS file, then as a committer it would be good to encourage them to do so, to show how much you appreciate their effort.


#4

Should we add some kind of user identity, like a GitHub or Twitter account, to the names? Most contributors use online names that don’t match their full names.


#5

So something like:

* Microsoft
  - Martin Woodward (@martinwoodward)

Other projects have used email addresses in the past, but then they get removed as a spam prevention measure. I wouldn’t be opposed to a recommendation that people include their GitHub profile names if the projects are hosted on GitHub.

In the past when I’ve had to refer to one of these files and contact the authors I’ve had to spelunk the Git history for a project to get their commit email addresses. It wasn’t the worst thing in the world but it is useful to have the additional pointers to be able to disambiguate someone.

Some people I’ve worked with also prefer not to use their real name in a project but stick with their online identity as a privacy measure which I can also understand and appreciate.

Martin.


#6

I just wanted to confirm you would not be opposed to it, so that’s good, contributors can decide how they want to appear. Would love to see this added in the guidelines for this file, but that’s ok if not.


#7

Also, you mention that years are not necessary in the LICENSE file. But what if we have transitioned from Outercurve to DNF? Currently we have something like this in the file.

Copyright © 2009 Outercurve Foundation
Copyright © 2014 .NET Foundation


#8

They do no harm. If you had been planning to update the license to change the year to © Copyright 2014-2015, then you might as well just pull the year out to save you updating it next year. But don’t bother to update just to remove the year if you already have © .NET Foundation.

Having the year is more correct for a full copyright notice. However, putting the year in forces us to touch the files annually and causes unnecessary churn, so it is not really worth it for source files.


#9

First of all, I appreciate the design goals chosen. Far too often I see code churn for copyright notices every year treated as an unavoidable necessity, without consideration of alternative approaches. The design above balances ownership/authorship with efficient development practices for a practical long-term approach to this problem.

Empty comment lines

I noticed in your example for C#, the sample header comment contains an empty comment line preceding and following the copyright text. Headers of this form are slightly harder to account for when using certain automation tools like StyleCop.Analyzers. I would recommend simplifying it to just the two lines containing copyright text.

File encoding

I recommend the use of UTF-8 with BOM, as described in my comments on dotnet/corefx#1470. UTF-8 files without BOM are prone to being silently treated as a different encoding by various editors (which is often also affected by the region of the user working on the file), and over time characters are likely to be saved incorrectly.

Samples

A sample NOTICES file would be valuable. Currently I’ve been including the information at the end of our LICENSE file.

Automation

It might make sense to provide a sample stylecop.json file for C# projects which enforces the recommended file header. The following file produces headers like the ones mentioned above. The license name and license file name are replaceable variables.

{
  "$schema": "https://raw.githubusercontent.com/DotNetAnalyzers/StyleCopAnalyzers/master/StyleCop.Analyzers/StyleCop.Analyzers/Settings/stylecop.schema.json",
  "settings": {
    "documentationRules": {
      "companyName": ".NET Foundation",
      "copyrightText": "Copyright (c) {companyName}. All Rights Reserved.\nLicensed under the {licenseName}. See {licenseFile} file in the project root for full license information.",
      "xmlHeader": false,
      "variables": {
        "licenseName": "MIT license",
        "licenseFile": "LICENSE"
      }
    }
  }
}
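To see what header text such a configuration describes, the placeholder substitution can be approximated in a few lines of Python. This is only a sketch of the idea, not the analyzer's actual implementation: the function name `render_header` is hypothetical, and the `// ` prefix on each line is an assumption about how a non-XML header is rendered.

```python
import json


def render_header(settings_json):
    """Expand {placeholders} in copyrightText and format it as C# line comments."""
    rules = json.loads(settings_json)["settings"]["documentationRules"]
    # User-defined variables plus the company name, which is configured separately.
    variables = dict(rules.get("variables", {}))
    variables["companyName"] = rules["companyName"]
    text = rules["copyrightText"]
    for name, value in variables.items():
        text = text.replace("{" + name + "}", value)
    # Assumption: with "xmlHeader" set to false, each line becomes a "// " comment.
    return "\n".join("// " + line for line in text.split("\n"))
```

Running this over the settings above gives a two-line header with no empty comment lines, which matches the simplification suggested earlier in this post.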

#10

File headers and copyright statements are a rather hotly debated topic. If the .NET Foundation chooses to recommend a copyright notice which does not exactly match the form recommended by the United States Copyright Office (e.g. by omitting the year), it should be prepared to be very explicit about the ramifications of this choice. From the original post above, the benefits of such a choice (in terms of overall team productivity) are clear. However, the drawbacks are left for the individual to identify and evaluate.

A fairly lengthy discussion on this specific concern can be found in DotNetAnalyzers/StyleCopAnalyzers#1661. The .NET Foundation is well-positioned to help developers make easy decisions for their projects on topics like this, but the eventual recommendation should be accompanied by clear justification.


#11

Thanks for all the feedback everyone. I’ve spoken to a lot of people about this to get their feedback and spent a lot of time reading the guidance of others (especially Apache). Here is where we are currently landing, which is significantly different from (and even simpler than) the original proposal above:

File Header and Copyright Notice Guidance (Second Draft Proposal)

At a time convenient to the project, committers should move to a standardized file header in all non-generated source files. An example file header is as follows:

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.
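Since this header is identical in every file, applying it can be scripted. The sketch below is one possible approach in Python; the function name `ensure_header` and the choice to keep an existing byte order mark in front of the header are assumptions, and it only recognizes headers written with `\n` line endings.

```python
# Illustrative only; not an official .NET Foundation tool.
HEADER = (
    "// Licensed to the .NET Foundation under one or more agreements.\n"
    "// The .NET Foundation licenses this file to you under the MIT license.\n"
    "// See the LICENSE file in the project root for more information.\n"
)


def ensure_header(source_text, header=HEADER):
    """Return source_text with the header prepended, unless it is already present."""
    bom = "\ufeff"
    # Keep a leading byte order mark (if any) ahead of the header.
    prefix = bom if source_text.startswith(bom) else ""
    body = source_text[len(prefix):]
    if body.startswith(header):
        return source_text  # already compliant; leave the file untouched
    return prefix + header + "\n" + body
```

The function is idempotent, so running it repeatedly over a repository does not touch files that already carry the header, which keeps Git history free of header-only churn.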

Note: there is no copyright notice in the file itself; however, the project’s LICENSE file must include a copyright statement. The LICENSE file (or LICENSE.md or LICENSE.txt) must contain the full license statement for the project. Within that license the copyright holder will be listed as “.NET Foundation and Contributors”. For example, where a project is licensed under the MIT license, the following would be used:

The MIT License (MIT)

Copyright (c) .NET Foundation and Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

And that’s the extent of the guidance. No CONTRIBUTORS file etc, no copyright notices in every file. Tracking of contributors will be done via source control.
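As an illustration of what tracking contributors via source control can look like in practice, the sketch below reads the author summary that `git shortlog -sne` prints (commit count, a tab, then name and email). The function names are hypothetical, and the names in the usage note are made-up sample data.

```python
import subprocess


def parse_shortlog(output):
    """Parse `git shortlog -sne` output into (commit_count, author) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line looks like "    12<TAB>Name <email>".
        count, author = line.strip().split("\t", 1)
        entries.append((int(count), author))
    return entries


def contributors_from_git(repo_path="."):
    """Run git in repo_path and return its contributors, most commits first."""
    # HEAD is passed explicitly; without a revision, shortlog reads from stdin
    # when it is not attached to a terminal.
    result = subprocess.run(
        ["git", "shortlog", "-sne", "HEAD"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_shortlog(result.stdout)
```

For example, `parse_shortlog("    12\tSofie Hellstrom <sofie@example.com>\n")` returns `[(12, "Sofie Hellstrom <sofie@example.com>")]`. Commit counts are only a rough proxy for contribution, so a list like this is a starting point rather than a replacement for human judgment.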

Why the new proposal?

The goal for the guidance was to minimize the amount of boilerplate in source files and ensure that any boilerplate content was correct and didn’t get out of date over time. The issue with a common copyright notice at the top of a file is that it quickly becomes out of date as people contribute to the project and as the code moves around during refactoring. Moving to an Apache-style file header gets around this problem by letting people know that the content is provided with some obligations and where to find information about the license. The copyright notice in the license states that the work is protected by copyright and gives details of the conditions under which anyone may use the content of the combined work that is the open source project.

Anyway, I would love to know what people think of this bare-bones style before we roll it out widely.


#12

To mitigate the disappearance of the contributors file, and still recognize contributions beyond source control, at Orchard, we like to include the list of contributors in the release notes.


#13

I think crediting the major contributors to a release is definitely a good practice so love to see that continue. We should definitely add that suggestion to the final guidance. Thanks!

