[VOTE] Daffodil into the Apache Incubator

Steve Lawrence
Hi All,

Based on the discussion on the incubator mailing list [1], I would like
to start a VOTE to bring the Daffodil project in as an Apache incubator
podling.

The ASF voting rules are described at:

https://www.apache.org/foundation/voting.html

A vote for accepting a new Apache Incubator podling is a majority vote
for which only Incubator PMC member votes are binding.

This vote will run for at least 72 hours. Please VOTE as follows:
[] +1 Accept Daffodil into the Apache Incubator
[] +0 Abstain.
[] -1 Do not accept Daffodil into the Apache Incubator because ...
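
For readers unfamiliar with how binding votes are counted, the rule above can be sketched roughly as follows. This is only an illustration: the voter names are hypothetical, and the default threshold of three binding +1 votes is an assumption based on common ASF practice, not something stated in this email.

```python
from collections import Counter

def tally(votes, binding_voters, min_binding_plus_ones=3):
    """Count binding votes and report whether the vote passes.

    votes: iterable of (voter, value) pairs, value in {+1, 0, -1}.
    binding_voters: set of voters whose votes are binding
        (here, Incubator PMC members).
    min_binding_plus_ones: assumed minimum number of binding +1 votes.
    """
    # Non-binding votes are welcome but do not affect the result.
    counts = Counter(value for voter, value in votes if voter in binding_voters)
    # Majority: enough binding +1 votes, and more +1 than -1.
    passed = counts[1] >= min_binding_plus_ones and counts[1] > counts[-1]
    return counts[1], counts[0], counts[-1], passed

# Hypothetical voters; "erin" is not in the binding set.
votes = [("alice", 1), ("bob", 1), ("carol", 1), ("dave", -1), ("erin", 1)]
binding = {"alice", "bob", "carol", "dave"}
print(tally(votes, binding))  # (3, 0, 1, True)
```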

The proposal is listed below, but you can also access it on the wiki:

https://wiki.apache.org/incubator/DaffodilProposal

Thank you,
- Steve

[1]
https://lists.apache.org/thread.html/190d73e84508d2deaa6cfde1be197cb70ca4caddfb215bc269b3e44f@%3Cgeneral.incubator.apache.org%3E



= Daffodil Proposal =

== Abstract ==

Daffodil is an implementation of the Data Format Description Language
(DFDL) used to convert between fixed format data and XML/JSON.

== Proposal ==

The Data Format Description Language (DFDL) is a specification,
developed by the Open Grid Forum, capable of describing many data
formats, including both textual and binary, scientific and numeric,
legacy and modern, commercial record-oriented, and many industry and
military standards. It defines a language that is a subset of W3C XML
schema to describe the logical format of the data, and annotations
within the schema to describe the physical representation.
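
To make the split between logical and physical description concrete, here is a minimal sketch. The schema fragment is illustrative only: a real DFDL schema also needs a dfdl:format annotation supplying many more properties, and the element names are hypothetical. The Python code does not use Daffodil; it only uses the standard library to show that the XSD type carries the logical format while the dfdl: attributes carry the physical representation.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DFDL = "http://www.ogf.org/dfdl/dfdl-1.0/"

# Illustrative fragment only, not a complete DFDL schema.
SCHEMA = f"""
<xs:schema xmlns:xs="{XS}" xmlns:dfdl="{DFDL}">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:int"
                    dfdl:lengthKind="explicit" dfdl:length="4"/>
        <xs:element name="name" type="xs:string"
                    dfdl:lengthKind="explicit" dfdl:length="10"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def describe_fields(schema_text):
    """Return (name, logical type, physical properties) per typed element."""
    root = ET.fromstring(schema_text)
    fields = []
    for el in root.iter(f"{{{XS}}}element"):
        if "type" not in el.attrib:
            continue  # skip the complex wrapper element
        # Attributes in the DFDL namespace describe the physical layout.
        physical = {
            key.split("}", 1)[1]: value
            for key, value in el.attrib.items()
            if key.startswith(f"{{{DFDL}}}")
        }
        fields.append((el.attrib["name"], el.attrib["type"], physical))
    return fields

for field in describe_fields(SCHEMA):
    print(field)
```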

Daffodil is an open source implementation of the DFDL specification that
uses these DFDL schemas to parse fixed format data into an infoset,
which is most commonly represented as either XML or JSON. This allows
the use of well-established XML or JSON technologies and libraries to
consume, inspect, and manipulate fixed format data in existing
solutions. Daffodil is also capable of the reverse by serializing or
"unparsing" an XML or JSON infoset back to the original data format.
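
The parse and unparse directions can be illustrated with a toy round trip. This sketch does not use Daffodil or DFDL: the fixed-width layout is hard-coded and hypothetical, whereas Daffodil would derive it from a DFDL schema. It is only meant to show what turning fixed format data into an XML infoset and back means.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width layout: (field name, width in characters).
# With DFDL this information would live in the schema, not in code.
LAYOUT = [("id", 4), ("name", 10)]

def parse(data):
    """Turn one fixed-width record into a small XML infoset."""
    record = ET.Element("record")
    offset = 0
    for name, width in LAYOUT:
        # Trailing padding is physical detail, so drop it from the infoset.
        ET.SubElement(record, name).text = data[offset:offset + width].rstrip()
        offset += width
    return record

def unparse(record):
    """Serialize the infoset back to the fixed-width representation."""
    return "".join(
        (record.findtext(name) or "").ljust(width)[:width]
        for name, width in LAYOUT
    )

data = "0042" + "Ada".ljust(10)
infoset = parse(data)
print(ET.tostring(infoset, encoding="unicode"))
# <record><id>0042</id><name>Ada</name></record>
print(unparse(infoset) == data)  # True
```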

== Background ==

Many different software solutions need to consume and manage data,
including data directed routing, databases, data analysis, data
cleansing, data visualizing, and more. A key aspect of such solutions is
the need to transform the data into an easily consumable format.
Usually, this means that for each unique data format, one develops a
tool that can read and extract the necessary information, often leading
to ad-hoc and data-format-specific description systems. Such systems are
often proprietary, not well tested, and incompatible, leading to vendor
lock-in, flawed software, and increased training costs. DFDL is a new
standard, with version 1.0 completed in October of 2016, that solves
these problems by providing an open way to describe many different
data formats and how to parse and unparse between the data and XML/JSON.

Two closed source implementations of DFDL currently exist. The first was
created by IBM and is now part of their IBM® Integration Bus product.
The second was created by the European Space Agency, called DFDL4S or
"DFDL for Space" targeted at the challenges of their satellite data
processing.

Around 2005, Pacific Northwest National Lab created Defuddle, built as
an open source implementation and proof of concept of the draft DFDL
specification and a test bed to feed new concepts into specification
development. Primary development of Defuddle was eventually taken over
by the National Center for Supercomputing Applications (NCSA). However,
due to evolution of the DFDL specification and architectural and
performance issues with Defuddle, around 2009, NCSA restarted the
project with the new name of Daffodil, with a goal of implementing the
complete DFDL specification. Daffodil development continued at NCSA
until around 2012, at which point development slowed due to budget
limitations. Shortly thereafter, primary development was picked up by
Tresys Technology where it continues today, with contributions from
other entities such as the Navy Research Lab, the Air Force Research
Lab, MITRE, and Booz Allen Hamilton. In February of 2015, Daffodil
version 1.0.0 was released, including support for the DFDL features
needed to parse many common file formats. Daffodil version 2.0.0 is
expected to be released in August of 2017, which will include unparse
support with one-to-one parsing feature parity.

Entities including IBM, MITRE, NATO NCI Agency, Northrop-Grumman, Quark
Security, Raytheon, and Tresys Technology have developed DFDL schemas
for many data formats from varying technology domains, including PNG,
GIF, BMP, PCAP, HL7, EDIFACT, NACHA, vCard, iCalendar, and MIL-STD-2045,
many of which are publicly available on the DFDL Schemas GitHub. There
are also a number of military-application data formats, the
specifications of which are not public, which have historically been
very difficult and expensive to process, and for which DFDL schemas have
been created or are actively in development; these include
MIL-STD-6040/USMTF ATO, MIL-STD-6017/VMF, and MIL-STD-6016/NATO STANAG
5516 (aka "Link16").

== Rationale ==

Numerous software solutions exist that consume, inspect, analyze, and
transform data, many of which can be found in the Apache Software
Foundation (ASF). In order for tools like these to consume new types of
data, custom extensions are usually required, often with high
development and testing costs. Daffodil fills a clear gap in many of
these solutions, providing a simple and low cost way to transform data
to XML or JSON, which many of these tools natively support already. With
the upcoming 2.0.0 release, the Daffodil project will have achieved a
level of functionality in both parse and unparse that, when integrated
into existing solutions, could provide for a new method to quickly
enable support for new data formats.

== Initial Goals ==

 * Relicense the existing code from the University of Illinois/NCSA Open
Source License to the Apache License version 2.0, working with Apache
Legal to ensure correctness, and with Daffodil contributors to get their
permission.
 * Move the existing codebase, documentation, bugs, and mailing lists to
the Apache-hosted infrastructure.
 * Establish a formal release process and schedule, allowing for
dependable release cycles in a manner consistent with the Apache
development process.
 * Build relationships with ASF projects to add Daffodil support where
appropriate.
 * Grow the community to establish a diversity of background and expertise.

== Current Status ==

=== Meritocracy ===

All initial committers are familiar with the principles of meritocracy.
The Daffodil project has followed the model of meritocracy in the past,
providing multiple outside entities commit access based on the quality
of their contributions. In order to grow the Daffodil user base and
development community, we are dedicated to continuing to operate
Daffodil as a meritocracy.

A key ingredient in a meritocracy of developers is open group code
review. The Daffodil project has operated in this mode throughout its
existence and this provides a forum to improve the code, verify code
quality, and educate new developers on the code base.

=== Community ===

Daffodil has a small community of users and developers. Although primary
Daffodil development is done by Tresys Technology, a handful of other
contributions have come from other entities including the Navy Research
Lab, the Air Force Research Lab, MITRE, and Booz Allen Hamilton. In
addition to developers, multiple users of Daffodil have created DFDL
schemas, including entities such as MITRE, IBM, Raytheon, Quark
Security, and Tresys Technology. The DFDL Schemas GitHub community has
been created as a place for DFDL schemas to be published. The Daffodil
project also makes use of mailing lists, HipChat, and Confluence
Questions to build a community of users and a system for support.

=== Core Developers ===

The core developers of Daffodil are employed by Tresys Technology. We
will work to grow the community among a more diverse set of developers
and industries.

=== Alignment ===

Daffodil was created as an open source project with a philosophy
consistent with The Apache Way. A strong belief in meritocracy,
community involvement in decisions, openness, and ensuring a high level
of quality in code, documentation, and testing are some of our shared
core beliefs.

Further, as mentioned in the Rationale section, Daffodil fills a gap
that exists in many ASF projects, including NiFi, Spark, Storm, Hadoop,
Tika, and others. In order for tools like these to consume new types of
data, custom extensions are usually required. Rather than create such
extensions, Daffodil provides an easy and standards-compliant way to
transform data to XML or JSON, which many of these tools already
natively support.

== Known Risks ==

=== Orphaned Products ===

The current core developers are the leading contributors in the space of
DFDL and wish to see it flourish. Though there is some risk because the
initial committers all come from the same company, a goal of entering
into incubation is to grow the development community to minimize the
risk of reliance on a single company.

=== Inexperience with Open Source ===

The Daffodil project began as an open source project and has continued
that model throughout development. This includes public bug tracking,
git revision control, automated builds and tests, and a public wiki for
documentation.

Additionally, the current core developers and initial committers all
work for a company that relies on, believes in, promotes, and has led or
contributed to many open source software projects, including SELinux
Userspace, OpenSCAP, CLIP, refpolicy, setools, RPM, and others. As such,
there is low risk related to inexperience with open source software and
processes.

=== Homogeneous Developers ===

The proposed initial committers come from a single entity, though we are
committed to growing the Daffodil development community to include a
broad group of additional committers from a wide array of industries.

=== Reliance on Salaried Developers ===

The proposed initial committers are paid by their employer to contribute
to the Daffodil project. We expect that Daffodil development will
continue with salaried developers, and are committed to growing the
community to include non-salaried developers as well.

=== Relationship with other Apache Projects ===

As mentioned in the Alignment section, Daffodil fills a clear gap in
numerous other ASF projects that consume and manage large amounts of data.

As a specific example, Daffodil developers have created a Daffodil
Apache NiFi Processor, currently in use in data transfer solutions,
which allows one to ingest non-native data into an Apache NiFi pipeline
as XML or JSON. This processor was well received by the Apache NiFi
developers, with positive comments about the concise API and how it
could handle non-native data. Daffodil developers have also successfully
prototyped integration with Apache Spark. We believe Daffodil could
provide a strong benefit to many other ASF projects that handle fixed
format data. We anticipate working closely with such ASF projects to
include Daffodil where applicable to increase their ability to support
new data formats with minimal effort.

Daffodil also depends on existing ASF projects, including Apache Commons
and Apache Xerces.

=== An Excessive Fascination with the Apache Brand ===

Although the Apache brand may certainly help to attract more
contributors, publicity is not the reason for this proposal. We believe
Daffodil could provide a great benefit to the ASF and the numerous
data-focused projects that comprise it, as described in the Rationale and
Alignment sections. We hope to grow a strong and vibrant community
centered on The Apache Way, and not dependent on a single company.

=== Documentation ===

Daffodil documentation can be found at:

 * https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL

Information about DFDL can be found at:

 * https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
 * https://www.ibm.com/support/knowledgecenter/en/SSMKHH_9.0.0/com.ibm.etools.mft.doc/df20060_.htm

Public examples of DFDL Schemas can be found at:

 * https://github.com/DFDLSchemas

== Initial Source ==

The Daffodil git repo goes back to mid-2011 with approximately 20
different contributors and feedback from many users and developers. The
core codebase is written in Scala and includes both a Scala and Java
API, along with Javadocs and Scaladocs for API usage. The initial code
will come from the git repository currently hosted by NCSA at the
University of Illinois:

https://opensource.ncsa.illinois.edu/bitbucket/projects/DFDL/repos/daffodil/

== Source and Intellectual Property Submission ==

The complete Daffodil code is licensed under the University of
Illinois/NCSA Open Source License. Much of the current codebase has been
developed by Tresys Technology, which is open to relicensing the code
under the Apache License version 2.0 and donating the source to the ASF.
Contacts at NCSA are also open to relicensing their contributions under
Apache v2. We plan to contact the other contributors and ask for
permission to relicense and donate their contributed code. Code from
contributors who decline or whom we cannot contact will be removed or
replaced. We will work closely with Apache Legal to ensure all issues
related to relicensing are acceptable.

== External Dependencies ==

We believe all current dependencies are compatible with the ASF
guidelines. Our dependency licenses come from the following license
styles: Apache v2, BSD, MIT, and ICU. The list of current Daffodil
dependencies and their licenses is documented here:

https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Dependencies+and+Licenses

== Cryptography ==

None

== Required Resources ==

=== Mailing Lists ===

 * [hidden email]
 * [hidden email]
 * [hidden email]
 * [hidden email]

=== Source Control ===

git://git.apache.org/incubator-daffodil.git

=== Issue Tracking ===

JIRA Daffodil (DFDL)

=== Initial Committers ===

 * Beth Finnegan <efinnegan at tresys dot com>
 * Dave Thompson <dthompson at tresys dot com>
 * Josh Adams <jadams at tresys dot com>
 * Mike Beckerle <mbeckerle at tresys dot com>
 * Steve Lawrence <slawrence at tresys dot com>
 * Taylor Wise <twise at tresys dot com>

=== Affiliations ===

 * Beth Finnegan (Tresys Technology)
 * Dave Thompson (Tresys Technology)
 * Josh Adams (Tresys Technology)
 * Mike Beckerle (Tresys Technology)
 * Steve Lawrence (Tresys Technology)
 * Taylor Wise (Tresys Technology)

== Sponsors ==

=== Champion ===

 * John D. Ament

=== Nominated Mentors ===

 * Dave Fisher
 * John D. Ament
 *

=== Sponsoring Entity ===

We request the Apache Incubator to sponsor this project.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: [VOTE] Daffodil into the Apache Incubator

John D. Ament-2
+1 to accept

On Thu, Aug 10, 2017 at 3:03 PM Steve Lawrence <[hidden email]>
wrote:

> Hi All,
>
> Based on the discussion on the incubator mailing list [1], I would like
> to start a VOTE to bring the Daffodil project in as an Apache incubator
> podling.
>
> The ASF voting rules are described:
>
> https://www.apache.org/foundation/voting.html
>
> A vote for accepting a new Apache Incubator podling is a majority vote
> for which only Incubator PMC member votes are binding.
>
> This vote will run for at least 72 hours. Please VOTE as follows
> [] +1 Accept Daffodil into the Apache Incubator
> [] +0 Abstain.
> [] -1 Do not accept Daffodil into the Apache Incubator because ...
>
> The proposal is listed below, but you can also access it on the wiki:
>
> https://wiki.apache.org/incubator/DaffodilProposal
>
> Thank you,
> - Steve
>
> [1]
>
> https://lists.apache.org/thread.html/190d73e84508d2deaa6cfde1be197cb70ca4caddfb215bc269b3e44f@%3Cgeneral.incubator.apache.org%3E

Re: [VOTE] Daffodil into the Apache Incubator

Pierre Smits
+1

Best regards

Pierre

On Thu, 10 Aug 2017 at 21:09 John D. Ament <[hidden email]> wrote:

> +1 to accept
--
Pierre Smits

ORRTIZ.COM <http://www.orrtiz.com>
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/

Re: [VOTE] Daffodil into the Apache Incubator

Bertrand Delacretaz
In reply to this post by Steve Lawrence
On Thu, Aug 10, 2017 at 9:03 PM, Steve Lawrence
<[hidden email]> wrote:
> ...I would like
> to start a VOTE to bring the Daffodil project in as an Apache incubator
> podling...

+1

-Bertrand

