[VOTE] Superset Proposal for Apache Incubator

Jeff Feng-2
Dear Apache Incubator Community,

We have updated the Superset proposal
<https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
Apache Incubation with an additional mentor (Luke Han - [hidden email]),
and would like to start a vote thread for acceptance into the incubator.

Our team is excited to share Superset with the Apache community and we hope
for your continued support!

Cheers,
Jeff & the Superset Team




= Superset =

== Abstract ==
Superset is an enterprise-ready web application for data exploration, data
visualization and dashboarding.

== Proposal ==
Superset is business intelligence (BI) software that helps modern
organizations visualize and interact with their data. Superset enables
users to explore data from a variety of databases, assemble beautiful
dashboards and share their findings.  Superset works neatly with all modern
SQL-speaking databases, and integrates with Druid.io to provide real-time,
interactive, blazing fast data access to large datasets.

== Background ==
Data is mission critical. To succeed in this era, organizations need to
provide low-friction, intuitive and interactive access to data. It is
paramount for knowledge workers to be capable of answering their own
questions by querying, exploring and visualizing data.

The entire business intelligence industry has pivoted from a model of
centralized top-down platforms driven by IT organizations to self-service
analytics and agile workflows by any user.  This shift unblocks centralized
service bottlenecks for creating data visualizations while also creating an
environment that is iterative and fast-moving.  This means that business
intelligence software must also be easy and delightful to use.
Self-service analytics doesn’t mean that admin and governance features are
not needed: modern BI tools provide fine-grained access controls and
auditing capabilities to understand how data is being used.  Superset is a
solution that delivers on all of these fronts.

The technology stack is also constantly morphing - vendors are struggling
to provide cheap, quick and easy solutions to access data.  Business
intelligence users are finding existing solutions lacking as these software
products either disregard or react slowly to recent game-changing
technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
React.js and IPython’s Jupyter.

== Rationale ==
Business intelligence is more relevant today than at any other point in
history.  Organizations are currently very limited in options for open
source data visualization solutions, especially solutions that are both
self-service and enterprise-ready.  Every company informing their decisions
with data needs a BI tool.

We believe that Superset will be a strong complement to existing Apache
Software Foundation technologies by offering scalable user interactions to
distributed storage and computation solutions.  Users will often find that
Superset can act as a catalyst for tooling that can visualize the byproduct
of data and computation infrastructure.

Superset has many key design elements that help fill a gap in current
solutions for organizations:
 * Easy, low-friction access to data through a simple, web-based data
exploration interface.  Composing charts and dashboards is intuitive.
Eliminating the need to write code or SQL empowers anyone to use it.
 * Access to a wide array of rich, interactive data visualization types.
 * Enterprise-ready: Integration with different authentication mechanisms
and granular permissions centered around actions and data access.
 * Real-time & fast: Superset provides real-time analytics at the speed of
thought on very large datasets when integrated with Druid.io.
 * Broad data access: Consume data out of any SQL-speaking relational
database.
 * Extensible: Can be extended to talk to many NoSQL databases like Apache
Drill, Elasticsearch, and other popular database engines.
 * Fast loading dashboards with configurable web-scale caching.
 * Plug-in framework that enables organizations to build custom analytical
applications with new UI/UX interfaces.
 * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
with more flexibility.  SQL Lab integrates with the visualization engine
seamlessly.

== Initial Goals ==
The initial goals of the Superset project are several-fold:
 * Move the existing codebase to Apache and integrate with the Apache
development process.
 * Redesign the user interface and interaction model for creating
visualizations/dashboards and connecting to data sources
 * Build robust support for security and governance of the tool including
popular authorization modules (including Apache Ranger and Apache Sentry)
and a more sophisticated permissions system
 * Grow the extensibility of the project both in terms of enhanced
connectivity to NoSQL-based data sources and creating a plug-in framework
that enables organizations to build custom analytical applications which
require a new UI/UX

== Current Status ==
By many standards, Superset is already a successful open source project. As
of March 2017, Superset is officially used in production at about a dozen
companies and has received contributions from over one hundred contributors
on GitHub, along with 1500+ forks and 12k+ stars.

Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
significant contributions, and expressed their commitment to the project.
The product is feature complete and has been viable for months. It already
serves as the main interface for consuming data at many companies of
different sizes.

While the product is usable, there’s room for improvement across the board,
starting with providing a smoother user experience around content creation,
making sure all features work out-of-the-box on more platforms and
databases, providing better user training guides and videos, having a
predictable release process, and increasing the overall quality of the
Superset releases.

=== Meritocracy ===
We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. Several companies have expressed interest in
this project, and we intend to invite additional developers to participate.
We will encourage and monitor community participation so that privileges
can be extended to those that contribute.

=== Community ===
The need for an enterprise-ready data visualization and exploration
platform in the open source community is tremendous.  While Superset is
fairly well known, recognized and used within the Druid.io community,
adoption is currently limited outside of that niche. There is a huge
opportunity to grow the community to hundreds if not thousands of
organizations, and we are hoping that embracing “the Apache way” will
accelerate the growth of our community.

We have already been active in seeking and inviting contributions, and are
planning to scale the project by investing time and building the support
structure needed to grow the community.

=== Core Developers ===
The initial committers for Superset include experienced full stack,
front-end and data engineers:
 * Maxime Beauchemin (Airbnb)
 * Alanna Scott (Airbnb)
 * Bogdan Kyryliuk (Airbnb)
 * Vera Liu (Airbnb)
 * Jeff Feng (Airbnb)
 * Ashutosh Chauhan (Hortonworks)
 * Nishant Bangarwa (Hortonworks)
 * Slim Bouguerra (Hortonworks)
 * Priyank Shah (Hortonworks)
 * Sriharsha Chintalapani (Hortonworks)
 * Daniel Dai (Hortonworks)

We realize that additional employer diversity is needed, and we will work
aggressively to recruit developers from additional companies.

=== Alignment ===
The initial committers strongly believe that a system for interactive
visualization of data will gain broader adoption as an open source,
community-driven project, where the community can contribute not only to
the core components, but also to a growing collection of connectors and
visualizations and to improving integration with all potential data sources.
Superset already integrates closely with Apache Hive, the Hive metastore,
as well as most SQL-speaking databases found in modern data ecosystems.

== Known Risks ==

=== Orphaned Products ===
Superset is a vital component for visualizing, accessing and
democratizing data at Airbnb.  At Hortonworks, Superset is also a core
component of the DataFlow product offering.  Thus, the risk of the project
being orphaned is relatively low.  The project could be at risk if Airbnb
changes its approach to democratizing data or if Hortonworks changes
its strategy in the market.  In such an event, the committers plan to
continue working on the project on their own time, though progress
will likely be slower.  We plan to mitigate this risk by recruiting
additional committers.

=== Inexperience with Open Source ===
The initial committers include veteran Apache members (committers and PPMC
members) and other developers who have varying degrees of experience with
open source projects. All have been involved with source code that has been
released under an open source license, and several also have experience
developing code with an open source development process.

=== Homogenous Developers ===
The initial committers are employed by Airbnb Inc. and Hortonworks. We are
committed to recruiting additional committers from other companies.

=== Reliance on Salaried Developers ===
It is expected that Superset development will occur on both salaried time
and on volunteer time, after hours. The majority of initial committers are
paid by their employer to contribute to this project. However, they are all
passionate about the project, and we are confident that the project will
continue even if no salaried developers contribute to the project. We are
committed to recruiting additional committers including non-salaried
developers.

=== Relationships with Other Apache Products ===
To the knowledge of the Initial Committers, there are no direct competitors
to Superset within the Apache Software Foundation.  That said, Apache
Zeppelin is an indirect competitor, but it solves a different use case.

Apache Zeppelin is a web-based notebook that enables interactive data
analytics. It enables the creation of beautiful data-driven, interactive
and collaborative documents with SQL, Scala and more.  Although a user can
create data visualizations using this project, it uses a notebook-style
user interface and is geared towards the Spark community, where
Scala and SQL co-exist.

We look forward to collaborating with those communities, as well as other
Apache communities.

=== An Excessive Fascination with the Apache Brand ===
Superset is solving two huge challenges:
 * The challenge of enabling every knowledge worker to make data-informed
decisions, particularly those who are not deeply skilled at writing SQL.
 * The challenge of visualizing huge amounts of data interactively and in
real time.

Superset was first developed as a data visualization solution for Druid.io
as a way to visualize billions of rows of data.  Since then, usage of
Superset has expanded to address data visualization use cases across
SQL-speaking data sources as well.

Our rationale for developing Superset as an Apache project is detailed in
the Rationale Section.  We believe that the Apache brand and community
process will help us attract more contributors to this project, and help
grow the footprint of the project through usage at other organizations and
within other applications.  Establishing consensus among users and
developers will result in a more valuable tool for everyone.

== Documentation ==
References to further reading material:
 * [[http://airbnb.io/superset/|Superset Documentation]]
 * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
 * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]

== Initial Source ==
The origin of the proposed code base can be found at
https://github.com/airbnb/superset.  The code base is primarily in Python.

== Source and Intellectual Property Submission Plan ==
We do not expect any complications for the submission of the Superset code
base.  Our code is already on GitHub and there is only a single code base.

== External Dependencies ==
List of Python packages, from the Python Package Index (PyPI):

 * boto3
 * celery
 * cryptography
 * flask-appbuilder
 * flask-cache
 * flask-migrate
 * flask-script
 * flask-sqlalchemy
 * flask-testing
 * humanize
 * gunicorn
 * markdown
 * pandas
 * parsedatetime
 * pydruid
 * PyHive
 * python-dateutil
 * requests
 * simplejson
 * six
 * sqlalchemy
 * sqlalchemy-utils
 * sqlparse
 * thrift
 * thrift-sasl
 * werkzeug

List of JavaScript packages, from npm:
 * autobind-decorator
 * bootstrap
 * bootstrap-datepicker
 * brace
 * brfs
 * cal-heatmap
 * classnames
 * d3
 * d3-cloud
 * d3-sankey
 * d3-scale
 * d3-tip
 * datamaps
 * datatables-bootstrap3-plugin
 * datatables.net-bs
 * font-awesome
 * gridster
 * immutability-helper
 * immutable
 * jquery
 * lodash.throttle
 * mapbox-gl
 * moment
 * moments
 * mustache
 * nvd3
 * react
 * react-ace
 * react-bootstrap
 * react-bootstrap-table
 * react-dom
 * react-draggable
 * react-gravatar
 * react-grid-layout
 * react-map-gl
 * react-redux
 * react-resizable
 * react-select
 * react-syntax-highlighter
 * reactable
 * redux
 * redux-localstorage
 * redux-thunk
 * shortid
 * style-loader
 * supercluster
 * topojson
 * victory
 * viewport-mercator-project

== Cryptography ==
The proposal does not include cryptographic code.

== Required Resources ==

=== Mailing List ===
There is a current mailing list, the Google Group “airbnb_superset”, that we
are planning on deprecating once the Apache.org lists are ready to serve our
community.

 * superset-private
 * superset-dev
 * superset-user

=== Subversion Directory ===
Git is the preferred source control system.
http://svn.apache.org/repos/asf/incubator/superset

== Git Repository ==
Git is the preferred source control system.  Based on the naming scheme,
we’re assuming https://github.com/apache/incubator-superset.

== Issue Tracking ==
JIRA Superset (SUPERSET). If possible, we’d like to use GitHub issues & PRs
to manage our project. We have heard that there are ways to keep GitHub
issues in sync with Jira, which would give us the best of both worlds. If
that is not possible, we will comply and use Jira.

== Other Resources ==
We currently use a set of GitHub-integrated services that are free to the
open source community, like Travis CI, Code Climate, Coveralls,
Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
these services as they allow us to scale contributions and optimize our
development flows. These services require some elevated rights on the
GitHub repository in order to set up or tune, and we would like the
committers to have the required rights.


== Initial Committers ==

 * Maxime Beauchemin <[hidden email]> - PPMC & Committer
 * Alanna Scott <[hidden email]> - PPMC & Committer
 * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
 * Vera Liu <[hidden email]> - Committer
 * Jeff Feng <[hidden email]> - PPMC & Committer
 * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
 * Nishant Bangarwa <[hidden email]> - PPMC & Committer
 * Slim Bouguerra <[hidden email]> - Committer
 * Priyank Shah <[hidden email]> - Committer
 * Harsha Chintalapani <[hidden email]> - Committer
 * Daniel Dai <[hidden email]> - Champion & Committer
 * Luke Han <[hidden email]> - Mentor

== Affiliations ==
The initial committers are employees of Airbnb Inc. and Hortonworks.

== Sponsors ==

=== Champion ===
Daniel Dai <[hidden email]>

=== Nominated Mentors ===
 * Ashutosh Chauhan <[hidden email]>
 * Luke Han <[hidden email]>

=== Sponsoring Entity ===
Incubator PMC

Re: [VOTE] Superset Proposal for Apache Incubator

Luke Han
+1 binding

Love to see Superset become a new incubator project.


Best Regards!
---------------------

Luke Han

On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:

> Dear Apache Incubator Community,
>
> We have updated the Superset proposal
> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
> Apache Incubation with an additional mentor (Luke Han -
> [hidden email]),
> and would like to start a vote thread for acceptance into the incubator.
>
> Our team is excited to share Superset with the Apache community and we hope
> for the your continued support!
>
> Cheers,
> Jeff & the Superset Team
>
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.
> Modern BI tools provide fine-grain access controls and auditing
> capabilities to understand how data is being used.  Superset is a solution
> that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter for instance.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong compliment to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
>  * Easy, low friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards are intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
>  * Access to a wide array of rich, interactive data visualization types.
>  * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
>  * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
>  * Broad data access: Consume data out of any SQL-speaking relational
> database.
>  * Extensible: Can be extended to talk to many noSQL databases like Apache
> Drill, Elastic Search, and other popular database engines.
>  * Fast loading dashboards with configurable web-scale caching.
>  * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
>  * Move the existing codebase to Apache and integrate with the Apache
> development process.
>  * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
>  * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
>  * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> Github, 1500+ forks, and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active at seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
>  * Maxime Beauchemin (Airbnb)
>  * Alanna Scott (Airbnb)
>  * Bogdan Kyryliuk (Airbnb)
>  * Vera Liu  (Airbnb)
>  * Jeff Feng (Airbnb)
>  * Ashutosh Chauhan (Hortonworks)
>  * Nishant Bangarwa (Hortonworks)
>  * Slim Bouguerra (Hortonworks)
>  * Priyank Shah (Hortonworks)
>  * Sriharsha Chintalapani (Hortonworks)
>  * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors,
> visualizations and improving integration a all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for both visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, thought the progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation.  That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a notebook
> style user interfaces and it is geared towards the Spark community where
> Scala and SQL co-exist
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
> The challenge of enabling every knowledge worker to make data informed
> decisions, particularly those who are not deeply skilled at writing SQL.
> The challenge of visualizing huge amounts of data interactively and in
> real-time
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across SQL
> speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
>  * [[http://airbnb.io/superset/|Superset Documentation]]
>  * [[
> https://medium.com/airbnb-engineering/caravel-airbnb-s-
> data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog
> Post:  Superset: Airbnb’s Data Exploration Platform]]
>  * [[
> https://medium.com/airbnb-engineering/superset-scaling-
> data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog
> Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> We do not expect any complications for the submission of the Superset code
> base.  Our code is already in Github and there is only a single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (Pypi):
>
>  * boto3
>  * celery
>  * cryptography
>  * flask-appbuilder
>  * flask-cache
>  * flask-migrate
>  * flask-script
>  * flask-sqlalchemy
>  * flask-testing
>  * humanize
>  * gunicorn
>  * markdown
>  * pandas
>  * parsedatetime
>  * pydruid
>  * PyHive
>  * python-dateutil
>  * requests
>  * simplejson
>  * six
>  * sqlalchemy
>  * sqlalchemy-utils
>  * sqlparse
>  * thrift
>  * thrift-sasl
>  * werkzeug
>
> List of Javascript packages, from NPM:
>  * autobind-decorator
>  * bootstrap
>  * bootstrap-datepicker
>  * brace
>  * brfs
>  * cal-heatmap
>  * classnames
>  * d3
>  * d3-cloud
>  * d3-sankey
>  * d3-scale
>  * d3-tip
>  * datamaps
>  * datatables-bootstrap3-plugin
>  * datatables.net-bs
>  * font-awesome
>  * gridster
>  * immutability-helper
>  * immutable
>  * jquery
>  * lodash.throttle
>  * mapbox-gl
>  * moment
>  * moments
>  * mustache
>  * nvd3
>  * react
>  * react-ace
>  * react-bootstrap
>  * react-bootstrap-table
>  * react-dom
>  * react-draggable
>  * react-gravatar
>  * react-grid-layout
>  * react-map-gl
>  * react-redux
>  * react-resizable
>  * react-select
>  * react-syntax-highlighter
>  * reactable
>  * redux
>  * redux-localstorage
>  * redux-thunk
>  * shortid
>  * style-loader
>  * supercluster
>  * topojson
>  * victory
>  * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list as a Google Group “airbnb_superset” that we
> are planning on deprecating as the Apache.org become ready to serve our
> community.
>
>  * superset-private
>  * superset-dev
>  * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system, we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues & PRs
> to manage our project as much as possible. It’s been said that there are
> ways to keep Github’s issues in sync with Jira, allowing us to get best of
> both worlds. If that is not possible, we will comply to using Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
>  * Maxime Beauchemin <[hidden email]> - PPMC & Committer
>  * Alanna Scott <[hidden email]> - PPMC & Committer
>  * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
>  * Vera Liu <[hidden email]> - Committer
>  * Jeff Feng <[hidden email]> - PPMC & Committer
>  * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
>  * Nishant Bangarwa <[hidden email]> - PPMC & Committer
>  * Slim Bouguerra <[hidden email]> - Committer
>  * Priyank Shah <[hidden email]> - Committer
>  * Harsha Chintalapani <[hidden email]> - Committer
>  * Daniel Dai <[hidden email]> - Champion & Committer
>  * Luke Han <[hidden email]> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <[hidden email]>
>
> === Nominated Mentors ===
>  * Ashutosh Chauhan <[hidden email]>
>  * Luke Han <[hidden email]>
>
> === Sponsoring Entity ===
> Incubator PMC
>

Re: [VOTE] Superset Proposal for Apache Incubator

Ashutosh Chauhan-2
+1 (binding)

Thanks,
Ashutosh

On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:

> +1 binding
>
> Love to see Superset to be new incubator project.
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>

Re: [VOTE] Superset Proposal for Apache Incubator

moon soo Lee
+1 (non-binding)

On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]>
wrote:

> +1 (binding)
>
> Thanks,
> Ashutosh
>
> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
>
> > +1 binding
> >
> > Love to see Superset to be new incubator project.
> >
> >
> > Best Regards!
> > ---------------------
> >
> > Luke Han
> >
> >>  * datatables.net-bs
> >>  * font-awesome
> >>  * gridster
> >>  * immutability-helper
> >>  * immutable
> >>  * jquery
> >>  * lodash.throttle
> >>  * mapbox-gl
> >>  * moment
> >>  * moments
> >>  * mustache
> >>  * nvd3
> >>  * react
> >>  * react-ace
> >>  * react-bootstrap
> >>  * react-bootstrap-table
> >>  * react-dom
> >>  * react-draggable
> >>  * react-gravatar
> >>  * react-grid-layout
> >>  * react-map-gl
> >>  * react-redux
> >>  * react-resizable
> >>  * react-select
> >>  * react-syntax-highlighter
> >>  * reactable
> >>  * redux
> >>  * redux-localstorage
> >>  * redux-thunk
> >>  * shortid
> >>  * style-loader
> >>  * supercluster
> >>  * topojson
> >>  * victory
> >>  * viewport-mercator-project
> >>
> >> == Cryptography ==
> >> The proposal does not include cryptographic code.
> >>
> >> == Required Resources ==
> >>
> >> === Mailing List ===
> >> There is a current mailing list as a Google Group “airbnb_superset” that
> >> we
> >> are planning on deprecating as the Apache.org become ready to serve our
> >> community.
> >>
> >>  * superset-private
> >>  * superset-dev
> >>  * superset-user
> >>
> >> === Subversion Directory ===
> >> Git is the preferred source control system.
> >> http://svn.apache.org/repos/asf/incubator/superset
> >>
> >> == Git Repository ==
> >> Git is the preferred source control system, we’re assuming
> >> https://github.com/apache/incubator-superset based on the naming scheme
> >>
> >> == Issue Tracking ==
> >> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues &
> >> PRs
> >> to manage our project as much as possible. It’s been said that there are
> >> ways to keep Github’s issues in sync with Jira, allowing us to get best
> of
> >> both worlds. If that is not possible, we will comply to using Jira.
> >>
> >> == Other Resources ==
> >> We currently use a set of Github integrated services that are free to
> the
> >> open source community, like Travis-ci, Code Climate, Coveralls,
> >> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
> >> using
> >> these services as they allow us to scale contributions and optimize our
> >> development flows. These services require some elevated rights on the
> >> Github repository in order to set up or tune and we would like for the
> >> committers to have the required rights.
> >>
> >>
> >> == Initial Committers ==
> >>
> >>  * Maxime Beauchemin <[hidden email]> - PPMC & Committer
> >>  * Alanna Scott <[hidden email]> - PPMC & Committer
> >>  * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> >>  * Vera Liu <[hidden email]> - Committer
> >>  * Jeff Feng <[hidden email]> - PPMC & Committer
> >>  * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
> >>  * Nishant Bangarwa <[hidden email]> - PPMC & Committer
> >>  * Slim Bouguerra <[hidden email]> - Committer
> >>  * Priyank Shah <[hidden email]> - Committer
> >>  * Harsha Chintalapani <[hidden email]> - Committer
> >>  * Daniel Dai <[hidden email]> - Champion & Committer
> >>  * Luke Han <[hidden email]> - Mentor
> >>
> >> == Affiliations ==
> >> The initial committers are employees of Airbnb Inc. and Hortonworks.
> >>
> >> == Sponsors ==
> >>
> >> === Champion ===
> >> Daniel Dai <[hidden email]>
> >>
> >> === Nominated Mentors ===
> >>  * Ashutosh Chauhan <[hidden email]>
> >>  * Luke Han <[hidden email]>
> >>
> >> === Sponsoring Entity ===
> >> Incubator PMC
> >>
> >
> >
>

Re: [VOTE] Superset Proposal for Apache Incubator

Julian Hyde-3
+1 binding

> On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
>
> +1 (non-binding)
>
> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]>
> wrote:
>
>> +1 (binding)
>>
>> Thanks,
>> Ashutosh
>>
>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
>>
>>> +1 binding
>>>
>>> Love to see Superset become a new incubator project.
>>>
>>>
>>> Best Regards!
>>> ---------------------
>>>
>>> Luke Han
>>>
>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
>>>
>>>> Dear Apache Incubator Community,
>>>>
>>>> We have updated the Superset proposal
>>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
>>>> Apache Incubation with an additional mentor (Luke Han - [hidden email]),
>>>> and would like to start a vote thread for acceptance into the incubator.
>>>>
>>>> Our team is excited to share Superset with the Apache community and we
>>>> hope for your continued support!
>>>>
>>>> Cheers,
>>>> Jeff & the Superset Team


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [VOTE] Superset Proposal for Apache Incubator

Jitendra Pandey
+1 (binding)

On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:

    +1 binding
   
    > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
    >
    > +1 (non-binding)
    >
    > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]>
    > wrote:
    >
    >> +1 (binding)
    >>
    >> Thanks,
    >> Ashutosh
    >>
    >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
    >>
    >>> +1 binding
    >>>
    >>> Love to see Superset become a new incubator project.
    >>>
    >>>
    >>> Best Regards!
    >>> ---------------------
    >>>
    >>> Luke Han
    >>>
    >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
    >>>
    >>>> Dear Apache Incubator Community,
    >>>>
    >>>> We have updated the Superset proposal
    >>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
    >>>> Apache Incubation with an additional mentor (Luke Han - [hidden email]),
    >>>> and would like to start a vote thread for acceptance into the incubator.
    >>>>
    >>>> Our team is excited to share Superset with the Apache community and we
    >>>> hope for your continued support!
    >>>>
    >>>> Cheers,
    >>>> Jeff & the Superset Team
    >>>> * Priyank Shah (Hortonworks)
    >>>> * Sriharsha Chintalapani (Hortonworks)
    >>>> * Daniel Dai (Hortonworks)
    >>>>
    >>>> We realize that additional employer diversity is needed, and we will
    >> work
    >>>> aggressively to recruit developers from additional companies.
    >>>>
    >>>> === Alignment ===
    >>>> The initial committers strongly believe that a system for interactive
    >>>> visualization of data will gain broader adoption as an open source,
    >>>> community driven project, where the community can contribute not only to
    >>>> the core components, but also to a growing collection of connectors,
    >>>> visualizations and improving integration with all potential data sources.
    >>>> Superset already integrates closely with Apache Hive, the Hive
    >> metastore,
    >>>> as well as most SQL-speaking databases found in modern data ecosystems.
    >>>>
    >>>> == Known Risks ==
    >>>>
    >>>> === Orphaned Products ===
    >>>> Superset is a vital component for visualizing, accessing and
    >>>> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
    >>>> component of the DataFlow product offering.  Thus, the risk of the
    >> project
    >>>> being orphaned is relatively low.  The project could be at risk if
    >> Airbnb
    >>>> changes their approach for democratizing data or if Hortonworks changes
    >>>> their strategy in the market.  In such an event, the committers plan to
    >>>> continue working on the project on their own time, though the progress
    >>>> will likely be slower.  We plan to mitigate this risk by recruiting
    >>>> additional committers.
    >>>>
    >>>> === Inexperience with Open Source ===
    >>>> The initial committers include veteran Apache members (committers and
    >> PPMC
    >>>> members) and other developers who have varying degrees of experience
    >> with
    >>>> open source projects. All have been involved with source code that has
    >>>> been
    >>>> released under an open source license, and several also have experience
    >>>> developing code with an open source development process.
    >>>>
    >>>> === Homogenous Developers ===
    >>>> The initial committers are employed by Airbnb Inc. and Hortonworks. We
    >> are
    >>>> committed to recruiting additional committers from other companies.
    >>>>
    >>>> === Reliance on Salaried Developers ===
    >>>> It is expected that Superset development will occur on both salaried
    >> time
    >>>> and on volunteer time, after hours. The majority of initial committers
    >> are
    >>>> paid by their employer to contribute to this project. However, they are
    >>>> all
    >>>> passionate about the project, and we are confident that the project will
    >>>> continue even if no salaried developers contribute to the project. We
    >> are
    >>>> committed to recruiting additional committers including non-salaried
    >>>> developers.
    >>>>
    >>>> === Relationships with Other Apache Products ===
    >>>> To the knowledge of the Initial Committers, there are no direct
    >>>> competitors
    >>>> to Superset within the Apache Software Foundation.  That said, Apache
    >>>> Zeppelin is an indirect competitor, but it solves a different use case.
    >>>>
    >>>> Apache Zeppelin is a web-based notebook that enables interactive data
    >>>> analytics. It enables the creation of beautiful data-driven, interactive
    >>>> and collaborative documents with SQL, Scala and more.  Although a user
    >> can
    >>>> create data visualizations using this project, it leverages a notebook
    >>>> style user interface and it is geared towards the Spark community where
    >>>> Scala and SQL co-exist.
    >>>>
    >>>> We look forward to collaborating with those communities, as well as
    >> other
    >>>> Apache communities.
    >>>>
    >>>> === An Excessive Fascination with the Apache Brand ===
    >>>> Superset is solving two huge challenges:
    >>>> * The challenge of enabling every knowledge worker to make data informed
    >>>> decisions, particularly those who are not deeply skilled at writing SQL.
    >>>> * The challenge of visualizing huge amounts of data interactively and in
    >>>> real-time.
    >>>>
    >>>> Superset was first developed as a data visualization solution for
    >> Druid.io
    >>>> as a way to visualize billions of rows of data.  Since then, usage of
    >>>> Superset has expanded to address data visualization use cases across SQL
    >>>> speaking data sources as well.
    >>>>
    >>>> Our rationale for developing Superset as an Apache project is detailed
    >> in
    >>>> the Rationale Section.  We believe that the Apache brand and community
    >>>> process will help us attract more contributors to this project, and help
    >>>> grow the footprint of the project through usage at other organizations
    >> and
    >>>> within other applications.  Establishing consensus among users and
    >>>> developers will result in a more valuable tool for everyone.
    >>>>
    >>>> == Documentation ==
    >>>> References to further reading material:
    >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
    >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
    >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
    >>>>
    >>>> == Initial Source ==
    >>>> The origin of the proposed code base can be found at
    >>>> https://github.com/airbnb/superset.  The code base is primarily in
    >>>> Python.
    >>>>
    >>>> == Source and Intellectual Property Submission Plan ==
    >>>> We do not expect any complications for the submission of the Superset
    >> code
    >>>> base.  Our code is already in Github and there is only a single code
    >> base.
    >>>>
    >>>> == External Dependencies ==
    >>>> List of Python packages, from the Python Package Index (PyPI):
    >>>>
    >>>> * boto3
    >>>> * celery
    >>>> * cryptography
    >>>> * flask-appbuilder
    >>>> * flask-cache
    >>>> * flask-migrate
    >>>> * flask-script
    >>>> * flask-sqlalchemy
    >>>> * flask-testing
    >>>> * humanize
    >>>> * gunicorn
    >>>> * markdown
    >>>> * pandas
    >>>> * parsedatetime
    >>>> * pydruid
    >>>> * PyHive
    >>>> * python-dateutil
    >>>> * requests
    >>>> * simplejson
    >>>> * six
    >>>> * sqlalchemy
    >>>> * sqlalchemy-utils
    >>>> * sqlparse
    >>>> * thrift
    >>>> * thrift-sasl
    >>>> * werkzeug
    >>>>
    >>>> List of Javascript packages, from NPM:
    >>>> * autobind-decorator
    >>>> * bootstrap
    >>>> * bootstrap-datepicker
    >>>> * brace
    >>>> * brfs
    >>>> * cal-heatmap
    >>>> * classnames
    >>>> * d3
    >>>> * d3-cloud
    >>>> * d3-sankey
    >>>> * d3-scale
    >>>> * d3-tip
    >>>> * datamaps
    >>>> * datatables-bootstrap3-plugin
    >>>> * datatables.net-bs
    >>>> * font-awesome
    >>>> * gridster
    >>>> * immutability-helper
    >>>> * immutable
    >>>> * jquery
    >>>> * lodash.throttle
    >>>> * mapbox-gl
    >>>> * moment
    >>>> * moments
    >>>> * mustache
    >>>> * nvd3
    >>>> * react
    >>>> * react-ace
    >>>> * react-bootstrap
    >>>> * react-bootstrap-table
    >>>> * react-dom
    >>>> * react-draggable
    >>>> * react-gravatar
    >>>> * react-grid-layout
    >>>> * react-map-gl
    >>>> * react-redux
    >>>> * react-resizable
    >>>> * react-select
    >>>> * react-syntax-highlighter
    >>>> * reactable
    >>>> * redux
    >>>> * redux-localstorage
    >>>> * redux-thunk
    >>>> * shortid
    >>>> * style-loader
    >>>> * supercluster
    >>>> * topojson
    >>>> * victory
    >>>> * viewport-mercator-project
    >>>>
    >>>> == Cryptography ==
    >>>> The proposal does not include cryptographic code.
    >>>>
    >>>> == Required Resources ==
    >>>>
    >>>> === Mailing List ===
    >>>> There is a current mailing list as a Google Group “airbnb_superset” that
    >>>> we
    >>>> are planning on deprecating as the Apache.org lists become ready to serve our
    >>>> community.
    >>>>
    >>>> * superset-private
    >>>> * superset-dev
    >>>> * superset-user
    >>>>
    >>>> === Subversion Directory ===
    >>>> Git is the preferred source control system.
    >>>> http://svn.apache.org/repos/asf/incubator/superset
    >>>>
    >>>> == Git Repository ==
    >>>> Git is the preferred source control system; we’re assuming
    >>>> https://github.com/apache/incubator-superset based on the naming scheme.
    >>>>
    >>>> == Issue Tracking ==
    >>>> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues &
    >>>> PRs
    >>>> to manage our project as much as possible. It’s been said that there are
    >>>> ways to keep Github’s issues in sync with Jira, allowing us to get the best
    >> of
    >>>> both worlds. If that is not possible, we will comply by using Jira.
    >>>>
    >>>> == Other Resources ==
    >>>> We currently use a set of Github integrated services that are free to
    >> the
    >>>> open source community, like Travis-ci, Code Climate, Coveralls,
    >>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
    >>>> using
    >>>> these services as they allow us to scale contributions and optimize our
    >>>> development flows. These services require some elevated rights on the
    >>>> Github repository in order to set up or tune and we would like for the
    >>>> committers to have the required rights.
    >>>>
    >>>>
    >>>> == Initial Committers ==
    >>>>
    >>>> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
    >>>> * Alanna Scott <[hidden email]> - PPMC & Committer
    >>>> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
    >>>> * Vera Liu <[hidden email]> - Committer
    >>>> * Jeff Feng <[hidden email]> - PPMC & Committer
    >>>> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
    >>>> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
    >>>> * Slim Bouguerra <[hidden email]> - Committer
    >>>> * Priyank Shah <[hidden email]> - Committer
    >>>> * Harsha Chintalapani <[hidden email]> - Committer
    >>>> * Daniel Dai <[hidden email]> - Champion & Committer
    >>>> * Luke Han <[hidden email]> - Mentor
    >>>>
    >>>> == Affiliations ==
    >>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
    >>>>
    >>>> == Sponsors ==
    >>>>
    >>>> === Champion ===
    >>>> Daniel Dai <[hidden email]>
    >>>>
    >>>> === Nominated Mentors ===
    >>>> * Ashutosh Chauhan <[hidden email]>
    >>>> * Luke Han <[hidden email]>
    >>>>
    >>>> === Sponsoring Entity ===
    >>>> Incubator PMC
    >>>>
    >>>
    >>>
    >>
   
   
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: [hidden email]
    For additional commands, e-mail: [hidden email]
   
   
   



Re: [VOTE] Superset Proposal for Apache Incubator

Joe Witt
+1 (binding)

On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
<[hidden email]> wrote:

> +1 (binding)
>
> On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
>
>     +1 binding
>
>     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
>     >
>     > +1 (non-binding)
>     >
>     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]>
>     > wrote:
>     >
>     >> +1 (binding)
>     >>
>     >> Thanks,
>     >> Ashutosh
>     >>
>     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
>     >>
>     >>> +1 binding
>     >>>
>     >>> Love to see Superset become a new incubator project.
>     >>>
>     >>>
>     >>> Best Regards!
>     >>> ---------------------
>     >>>
>     >>> Luke Han
>     >>>
>     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
>     >>>
>     >>>> Dear Apache Incubator Community,
>     >>>>
>     >>>> We have updated the Superset proposal
>     >>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
>     >>>>
>     >>>> Apache Incubation with an additional mentor (Luke Han -
>     >>>> [hidden email]),
>     >>>> and would like to start a vote thread for acceptance into the incubator.
>     >>>>
>     >>>> Our team is excited to share Superset with the Apache community and we
>     >>>> hope
>     >>>> for your continued support!
>     >>>>
>     >>>> Cheers,
>     >>>> Jeff & the Superset Team
>     >>>>
>     >>>>
>     >>>>
>     >>>>
>     >>>> = Superset =
>     >>>>
>     >>>> == Abstract ==
>     >>>> Superset is an enterprise-ready web application for data exploration,
>     >> data
>     >>>> visualization and dashboarding.
>     >>>>
>     >>>> == Proposal ==
>     >>>> Superset is business intelligence (BI) software that helps modern
>     >>>> organizations visualize and interact with their data. Superset enables
>     >>>> users to explore data from a variety of databases, assemble beautiful
>     >>>> dashboards and share their findings.  Superset works neatly with all
>     >>>> modern
>     >>>> SQL-speaking databases, and integrates with Druid.io to provide
>     >> real-time,
>     >>>> interactive, blazing fast data access to large datasets.
>     >>>>
>     >>>> == Background ==
>     >>>> Data is mission critical. To succeed in this era, organizations need to
>     >>>> provide low-friction, intuitive and interactive access to data. It is
>     >>>> paramount for knowledge workers to be capable of answering their own
>     >>>> questions by querying, exploring and visualizing data.
>     >>>>
>     >>>> The entire business intelligence industry has pivoted from a model of
>     >>>> centralized top-down platforms driven by IT organizations to
>     >> self-service
>     >>>> analytics and agile workflows by any user.  This shift unblocks
>     >>>> centralized
>     >>>> service bottlenecks for creating data visualizations while also creating
>     >>>> an
>     >>>> environment that is iterative and fast-moving.  This means that business
>     >>>> intelligence software must also be easy and delightful to use.
>     >>>> Self-service analytics doesn’t mean that admin and governance features
>     >> are
>     >>>> not needed.
>     >>>> Modern BI tools provide fine-grain access controls and auditing
>     >>>> capabilities to understand how data is being used.  Superset is a
>     >> solution
>     >>>> that delivers on all of these vectors.
>     >>>>
>     >>>> The technology stack is also constantly morphing - vendors are
>     >> struggling
>     >>>> to provide cheap, quick and easy solutions to access data.  Business
>     >>>> intelligence users are finding existing solutions lacking as these
>     >>>> software
>     >>>> products either disregard or react slowly to recent game-changing
>     >>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>     >>>> React.js and iPython’s Jupyter for instance.
>     >>>>
>     >>>> == Rationale ==
>     >>>> Business intelligence is more relevant today than at any other point in
>     >>>> history.  Organizations are currently very limited in options for open
>     >>>> source data visualization solutions, especially solutions that are both
>     >>>> self-service and enterprise-ready.  Every company informing their
>     >>>> decisions
>     >>>> with data needs a BI tool.
>     >>>>
>     >>>> We believe that Superset will be a strong complement to existing Apache
>     >>>> Software Foundation technologies by offering scalable user interactions
>     >> to
>     >>>> distributed storage and computation solutions.  Users will often find
>     >> that
>     >>>> Superset can act as a catalyst for tooling that can visualize the
>     >>>> byproduct
>     >>>> of data and computation infrastructure.
>     >>>>
>     >>>> Superset has many key design elements that help fill a gap in current
>     >>>> solutions for organizations:
>     >>>> * Easy, low friction access to data through a simple, web-based data
>     >>>> exploration interface.  Composing charts and dashboards is intuitive.
>     >>>> Eliminating the need to write code or SQL empowers anyone to use it.
>     >>>> * Access to a wide array of rich, interactive data visualization types.
>     >>>> * Enterprise-ready: Integration with different authentication
>     >> mechanisms
>     >>>> and granular permissions centered around actions and data access.
>     >>>> * Realtime & fast: Superset provides realtime analytics at the speed of
>     >>>> thought on very large datasets when integrated with Druid.io.
>     >>>> * Broad data access: Consume data out of any SQL-speaking relational
>     >>>> database.
>     >>>> * Extensible: Can be extended to talk to many noSQL databases like
>     >> Apache
>     >>>> Drill, Elastic Search, and other popular database engines.
>     >>>> * Fast loading dashboards with configurable web-scale caching.
>     >>>> * Plug-in framework that enables organizations to build custom
>     >> analytical
>     >>>> applications with new UI/UX interfaces.
>     >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>     >>>> with more flexibility.  SQL Lab integrates with the visualization engine
>     >>>> seamlessly.
>     >>>>
>     >>>> == Initial Goals ==
>     >>>> The initial goals of the Superset project are several-fold:
>     >>>> * Move the existing codebase to Apache and integrate with the Apache
>     >>>> development process.
>     >>>> * Redesign the user interface and interaction model for creating
>     >>>> visualizations/dashboards and connecting to data sources
>     >>>> * Build robust support for security and governance of the tool
>     >> including
>     >>>> popular authorization modules (including Apache Ranger and Apache
>     >> Sentry)
>     >>>> and a more sophisticated permissions system
>     >>>> * Grow the extensibility of the project both in terms of enhanced
>     >>>> connectivity to NoSQL-based data sources and creating a plug-in
>     >> framework
>     >>>> that enables organizations to build custom analytical applications which
>     >>>> require a new UI/UX
>     >>>>
>     >>>> == Current Status ==
>     >>>> By many standards, Superset is already a successful open source project.
>     >>>> As
>     >>>> of March 2017, Superset is officially used in production at about a
>     >> dozen
>     >>>> companies, has received contributions from over one hundred contributors
>     >>>> on
>     >>>> Github, 1500+ forks, and 12k+ stars.
>     >>>>
>     >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>     >>>> significant contributions, and expressed their commitment to the
>     >> project.
>     >>>> The product is feature complete and has been viable for months. It
>     >> already
>     >>>> serves as the main interface for consuming data at many companies of
>     >>>> different sizes.
>     >>>>
>     >>>> While the product is usable, there’s room for improvement across the
>     >>>> board,
>     >>>> starting with providing a smoother user experience around content
>     >>>> creation,
>     >>>> making sure all features work out-of-the-box on more platforms and
>     >>>> databases, providing better user training guides and videos, having a
>     >>>> predictable release process, and increasing the overall quality of the
>     >>>> Superset releases.
>     >>>>
>     >>>> === Meritocracy ===
>     >>>> We plan to invest in supporting a meritocracy. We will discuss the
>     >>>> requirements in an open forum. Several companies have expressed interest
>     >>>> in
>     >>>> this project, and we intend to invite additional developers to
>     >>>> participate.
>     >>>> We will encourage and monitor community participation so that privileges
>     >>>> can be extended to those that contribute.
>     >>>>
>     >>>> === Community ===
>     >>>> The need for an enterprise-ready data visualization and exploration
>     >>>> platform in the open source community is tremendous.  While Superset is
>     >>>> fairly well known, recognized and used within the Druid.io community,
>     >>>> adoption is currently limited outside of that niche. There is a huge
>     >>>> opportunity to grow the community to hundreds if not thousands of
>     >>>> organizations, and we are hoping that embracing “the Apache way” will
>     >>>> accelerate the growth of our community.
>     >>>>
>     >>>> We have already been active at seeking and inviting contributions, and
>     >> are
>     >>>> planning to scale the project by investing time and growing the support
>     >>>> structure to grow the community.
>     >>>>
>     >>>> === Core Developers ===
>     >>>> The initial committers for Superset include experienced full stack,
>     >>>> front-end and data engineers:
>     >>>> * Maxime Beauchemin (Airbnb)
>     >>>> * Alanna Scott (Airbnb)
>     >>>> * Bogdan Kyryliuk (Airbnb)
>     >>>> * Vera Liu  (Airbnb)
>     >>>> * Jeff Feng (Airbnb)
>     >>>> * Ashutosh Chauhan (Hortonworks)
>     >>>> * Nishant Bangarwa (Hortonworks)
>     >>>> * Slim Bouguerra (Hortonworks)
>     >>>> * Priyank Shah (Hortonworks)
>     >>>> * Sriharsha Chintalapani (Hortonworks)
>     >>>> * Daniel Dai (Hortonworks)
>     >>>>
>     >>>> We realize that additional employer diversity is needed, and we will
>     >> work
>     >>>> aggressively to recruit developers from additional companies.
>     >>>>
>     >>>> === Alignment ===
>     >>>> The initial committers strongly believe that a system for interactive
>     >>>> visualization of data will gain broader adoption as an open source,
>     >>>> community driven project, where the community can contribute not only to
>     >>>> the core components, but also to a growing collection of connectors,
>     >>>> visualizations and improving integration with all potential data sources.
>     >>>> Superset already integrates closely with Apache Hive, the Hive
>     >> metastore,
>     >>>> as well as most SQL-speaking databases found in modern data ecosystems.
>     >>>>
>     >>>> == Known Risks ==
>     >>>>
>     >>>> === Orphaned Products ===
>     >>>> Superset is a vital component for visualizing, accessing and
>     >>>> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
>     >>>> component of the DataFlow product offering.  Thus, the risk of the
>     >> project
>     >>>> being orphaned is relatively low.  The project could be at risk if
>     >> Airbnb
>     >>>> changes their approach for democratizing data or if Hortonworks changes
>     >>>> their strategy in the market.  In such an event, the committers plan to
>     >>>> continue working on the project on their own time, though the progress
>     >>>> will likely be slower.  We plan to mitigate this risk by recruiting
>     >>>> additional committers.
>     >>>>
>     >>>> === Inexperience with Open Source ===
>     >>>> The initial committers include veteran Apache members (committers and
>     >> PPMC
>     >>>> members) and other developers who have varying degrees of experience
>     >> with
>     >>>> open source projects. All have been involved with source code that has
>     >>>> been
>     >>>> released under an open source license, and several also have experience
>     >>>> developing code with an open source development process.
>     >>>>
>     >>>> === Homogenous Developers ===
>     >>>> The initial committers are employed by Airbnb Inc. and Hortonworks. We
>     >> are
>     >>>> committed to recruiting additional committers from other companies.
>     >>>>
>     >>>> === Reliance on Salaried Developers ===
>     >>>> It is expected that Superset development will occur on both salaried
>     >> time
>     >>>> and on volunteer time, after hours. The majority of initial committers
>     >> are
>     >>>> paid by their employer to contribute to this project. However, they are
>     >>>> all
>     >>>> passionate about the project, and we are confident that the project will
>     >>>> continue even if no salaried developers contribute to the project. We
>     >> are
>     >>>> committed to recruiting additional committers including non-salaried
>     >>>> developers.
>     >>>>
>     >>>> === Relationships with Other Apache Products ===
>     >>>> To the knowledge of the Initial Committers, there are no direct
>     >>>> competitors
>     >>>> to Superset within the Apache Software Foundation.  That said, Apache
>     >>>> Zeppelin is an indirect competitor, but it solves a different use case.
>     >>>>
>     >>>> Apache Zeppelin is a web-based notebook that enables interactive data
>     >>>> analytics. It enables the creation of beautiful data-driven, interactive
>     >>>> and collaborative documents with SQL, Scala and more.  Although a user
>     >> can
>     >>>> create data visualizations using this project, it leverages a notebook
>     >>>> style user interface and it is geared towards the Spark community where
>     >>>> Scala and SQL co-exist.
>     >>>>
>     >>>> We look forward to collaborating with those communities, as well as
>     >> other
>     >>>> Apache communities.
>     >>>>
>     >>>> === An Excessive Fascination with the Apache Brand ===
>     >>>> Superset is solving two huge challenges:
>     >>>> * The challenge of enabling every knowledge worker to make data informed
>     >>>> decisions, particularly those who are not deeply skilled at writing SQL.
>     >>>> * The challenge of visualizing huge amounts of data interactively and in
>     >>>> real-time.
>     >>>>
>     >>>> Superset was first developed as a data visualization solution for
>     >> Druid.io
>     >>>> as a way to visualize billions of rows of data.  Since then, usage of
>     >>>> Superset has expanded to address data visualization use cases across SQL
>     >>>> speaking data sources as well.
>     >>>>
>     >>>> Our rationale for developing Superset as an Apache project is detailed
>     >> in
>     >>>> the Rationale Section.  We believe that the Apache brand and community
>     >>>> process will help us attract more contributors to this project, and help
>     >>>> grow the footprint of the project through usage at other organizations
>     >> and
>     >>>> within other applications.  Establishing consensus among users and
>     >>>> developers will result in a more valuable tool for everyone.
>     >>>>
>     >>>> == Documentation ==
>     >>>> References to further reading material:
>     >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>     >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
>     >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
>     >>>>
>     >>>> == Initial Source ==
>     >>>> The origin of the proposed code base can be found at
>     >>>> https://github.com/airbnb/superset.  The code base is primarily in
>     >>>> Python.
>     >>>>
>     >>>> == Source and Intellectual Property Submission Plan ==
>     >>>> We do not expect any complications for the submission of the Superset
>     >> code
>     >>>> base.  Our code is already in Github and there is only a single code
>     >> base.
>     >>>>
>     >>>> == External Dependencies ==
>     >>>> List of Python packages, from the Python Package Index (PyPI):
>     >>>>
>     >>>> * boto3
>     >>>> * celery
>     >>>> * cryptography
>     >>>> * flask-appbuilder
>     >>>> * flask-cache
>     >>>> * flask-migrate
>     >>>> * flask-script
>     >>>> * flask-sqlalchemy
>     >>>> * flask-testing
>     >>>> * humanize
>     >>>> * gunicorn
>     >>>> * markdown
>     >>>> * pandas
>     >>>> * parsedatetime
>     >>>> * pydruid
>     >>>> * PyHive
>     >>>> * python-dateutil
>     >>>> * requests
>     >>>> * simplejson
>     >>>> * six
>     >>>> * sqlalchemy
>     >>>> * sqlalchemy-utils
>     >>>> * sqlparse
>     >>>> * thrift
>     >>>> * thrift-sasl
>     >>>> * werkzeug
>     >>>>
>     >>>> List of Javascript packages, from NPM:
>     >>>> * autobind-decorator
>     >>>> * bootstrap
>     >>>> * bootstrap-datepicker
>     >>>> * brace
>     >>>> * brfs
>     >>>> * cal-heatmap
>     >>>> * classnames
>     >>>> * d3
>     >>>> * d3-cloud
>     >>>> * d3-sankey
>     >>>> * d3-scale
>     >>>> * d3-tip
>     >>>> * datamaps
>     >>>> * datatables-bootstrap3-plugin
>     >>>> * datatables.net-bs
>     >>>> * font-awesome
>     >>>> * gridster
>     >>>> * immutability-helper
>     >>>> * immutable
>     >>>> * jquery
>     >>>> * lodash.throttle
>     >>>> * mapbox-gl
>     >>>> * moment
>     >>>> * moments
>     >>>> * mustache
>     >>>> * nvd3
>     >>>> * react
>     >>>> * react-ace
>     >>>> * react-bootstrap
>     >>>> * react-bootstrap-table
>     >>>> * react-dom
>     >>>> * react-draggable
>     >>>> * react-gravatar
>     >>>> * react-grid-layout
>     >>>> * react-map-gl
>     >>>> * react-redux
>     >>>> * react-resizable
>     >>>> * react-select
>     >>>> * react-syntax-highlighter
>     >>>> * reactable
>     >>>> * redux
>     >>>> * redux-localstorage
>     >>>> * redux-thunk
>     >>>> * shortid
>     >>>> * style-loader
>     >>>> * supercluster
>     >>>> * topojson
>     >>>> * victory
>     >>>> * viewport-mercator-project
>     >>>>
>     >>>> == Cryptography ==
>     >>>> The proposal does not include cryptographic code.
>     >>>>
>     >>>> == Required Resources ==
>     >>>>
>     >>>> === Mailing List ===
>     >>>> There is a current mailing list as a Google Group “airbnb_superset” that
>     >>>> we
>     >>>> are planning on deprecating as the Apache.org become ready to serve our
>     >>>> community.
>     >>>>
>     >>>> * superset-private
>     >>>> * superset-dev
>     >>>> * superset-user
>     >>>>
>     >>>> === Subversion Directory ===
>     >>>> Git is the preferred source control system.
>     >>>> http://svn.apache.org/repos/asf/incubator/superset
>     >>>>
>     >>>> == Git Repository ==
>     >>>> Git is the preferred source control system, we’re assuming
>     >>>> https://github.com/apache/incubator-superset based on the naming scheme
>     >>>>
>     >>>> == Issue Tracking ==
>     >>>> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues &
>     >>>> PRs
>     >>>> to manage our project as much as possible. It’s been said that there are
>     >>>> ways to keep Github’s issues in sync with Jira, allowing us to get best
>     >> of
>     >>>> both worlds. If that is not possible, we will comply to using Jira.
>     >>>>
>     >>>> == Other Resources ==
>     >>>> We currently use a set of Github integrated services that are free to
>     >> the
>     >>>> open source community, like Travis-ci, Code Climate, Coveralls,
>     >>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
>     >>>> using
>     >>>> these services as they allow us to scale contributions and optimize our
>     >>>> development flows. These services require some elevated rights on the
>     >>>> Github repository in order to set up or tune and we would like for the
>     >>>> committers to have the required rights.
>     >>>>
>     >>>>
>     >>>> == Initial Committers ==
>     >>>>
>     >>>> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
>     >>>> * Alanna Scott <[hidden email]> - PPMC & Committer
>     >>>> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
>     >>>> * Vera Liu <[hidden email]> - Committer
>     >>>> * Jeff Feng <[hidden email]> - PPMC & Committer
>     >>>> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
>     >>>> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
>     >>>> * Slim Bouguerra <[hidden email]> - Committer
>     >>>> * Priyank Shah <[hidden email]> - Committer
>     >>>> * Harsha Chintalapani <[hidden email]> - Committer
>     >>>> * Daniel Dai <[hidden email]> - Champion & Committer
>     >>>> * Luke Han <[hidden email]> - Mentor
>     >>>>
>     >>>> == Affiliations ==
>     >>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>     >>>>
>     >>>> == Sponsors ==
>     >>>>
>     >>>> === Champion ===
>     >>>> Daniel Dai <[hidden email]>
>     >>>>
>     >>>> === Nominated Mentors ===
>     >>>> * Ashutosh Chauhan <[hidden email]>
>     >>>> * Luke Han <[hidden email]>
>     >>>>
>     >>>> === Sponsoring Entity ===
>     >>>> Incubator PMC
>     >>>>
>     >>>
>     >>>
>     >>
>
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: [hidden email]
>     For additional commands, e-mail: [hidden email]
>
>
>
>

Re: [VOTE] Superset Proposal for Apache Incubator

Ted Dunning
+1 (binding)



On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:

> +1 (binding)
>
> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> <[hidden email]> wrote:
> > +1 (binding)
> >
> > On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
> >
> >     +1 binding
> >
> >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
> >     >
> >     > +1 (non-binding)
> >     >
> >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
> >     >
> >     >> +1 (binding)
> >     >>
> >     >> Thanks,
> >     >> Ashutosh
> >     >>
> >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
> >     >>
> >     >>> +1 binding
> >     >>>
> >     >>> Love to see Superset to be new incubator project.
> >     >>>
> >     >>>
> >     >>> Best Regards!
> >     >>> ---------------------
> >     >>>
> >     >>> Luke Han
> >     >>>
> >     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
> >     >>>

Re: [VOTE] Superset Proposal for Apache Incubator

P. Taylor Goetz
In reply to this post by Jeff Feng-2
There was a minor change to the proposal in the Incubator wiki, so technically what we're voting on and the wiki don't match.

I'm okay with that and don't think it should disrupt the vote. Highlighting the change in this thread should be enough.

+1 (binding)

I would recommend asking for/finding additional mentors to bring the count to >= 3.

-Taylor

> On Apr 23, 2017, at 10:53 AM, Jeff Feng <[hidden email]> wrote:
>
> Dear Apache Incubator Community,
>
> We have updated the Superset proposal
> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
> Apache Incubation with an additional mentor (Luke Han - [hidden email]),
> and would like to start a vote thread for acceptance into the incubator.
>
> Our team is excited to share Superset with the Apache community and we hope
> for the your continued support!
>
> Cheers,
> Jeff & the Superset Team
>
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.
> Modern BI tools provide fine-grain access controls and auditing
> capabilities to understand how data is being used.  Superset is a solution
> that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter for instance.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong compliment to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
> * Easy, low friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards are intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
> * Access to a wide array of rich, interactive data visualization types.
> * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
> * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
> * Broad data access: Consume data out of any SQL-speaking relational
> database.
> * Extensible: Can be extended to talk to many noSQL databases like Apache
> Drill, Elastic Search, and other popular database engines.
> * Fast loading dashboards with configurable web-scale caching.
> * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
> * Move the existing codebase to Apache and integrate with the Apache
> development process.
> * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
> * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
> * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> Github, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
> * Maxime Beauchemin (Airbnb)
> * Alanna Scott (Airbnb)
> * Bogdan Kyryliuk (Airbnb)
> * Vera Liu  (Airbnb)
> * Jeff Feng (Airbnb)
> * Ashutosh Chauhan (Hortonworks)
> * Nishant Bangarwa (Hortonworks)
> * Slim Bouguerra (Hortonworks)
> * Priyank Shah (Hortonworks)
> * Sriharsha Chintalapani (Hortonworks)
> * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors,
> visualizations and improved integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, though progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation.  That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a notebook-style
> user interface and is geared towards the Spark community, where
> Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
> * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
> * The challenge of visualizing huge amounts of data interactively and in
> real-time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across SQL
> speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
> * [[http://airbnb.io/superset/|Superset Documentation]]
> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> We do not expect any complications for the submission of the Superset code
> base.  Our code is already in Github and there is only a single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (Pypi):
>
> * boto3
> * celery
> * cryptography
> * flask-appbuilder
> * flask-cache
> * flask-migrate
> * flask-script
> * flask-sqlalchemy
> * flask-testing
> * humanize
> * gunicorn
> * markdown
> * pandas
> * parsedatetime
> * pydruid
> * PyHive
> * python-dateutil
> * requests
> * simplejson
> * six
> * sqlalchemy
> * sqlalchemy-utils
> * sqlparse
> * thrift
> * thrift-sasl
> * werkzeug
>
> List of Javascript packages, from NPM:
> * autobind-decorator
> * bootstrap
> * bootstrap-datepicker
> * brace
> * brfs
> * cal-heatmap
> * classnames
> * d3
> * d3-cloud
> * d3-sankey
> * d3-scale
> * d3-tip
> * datamaps
> * datatables-bootstrap3-plugin
> * datatables.net-bs
> * font-awesome
> * gridster
> * immutability-helper
> * immutable
> * jquery
> * lodash.throttle
> * mapbox-gl
> * moment
> * moments
> * mustache
> * nvd3
> * react
> * react-ace
> * react-bootstrap
> * react-bootstrap-table
> * react-dom
> * react-draggable
> * react-gravatar
> * react-grid-layout
> * react-map-gl
> * react-redux
> * react-resizable
> * react-select
> * react-syntax-highlighter
> * reactable
> * redux
> * redux-localstorage
> * redux-thunk
> * shortid
> * style-loader
> * supercluster
> * topojson
> * victory
> * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list, the Google Group “airbnb_superset”, that we
> are planning on deprecating once the Apache.org lists are ready to serve our
> community.
>
> * superset-private
> * superset-dev
> * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues & PRs
> to manage our project. We have heard that there are
> ways to keep Github’s issues in sync with Jira, allowing us to get the best of
> both worlds. If that is not possible, we will use Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
> * Alanna Scott <[hidden email]> - PPMC & Committer
> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> * Vera Liu <[hidden email]> - Committer
> * Jeff Feng <[hidden email]> - PPMC & Committer
> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
> * Slim Bouguerra <[hidden email]> - Committer
> * Priyank Shah <[hidden email]> - Committer
> * Harsha Chintalapani <[hidden email]> - Committer
> * Daniel Dai <[hidden email]> - Champion & Committer
> * Luke Han <[hidden email]> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <[hidden email]>
>
> === Nominated Mentors ===
> * Ashutosh Chauhan <[hidden email]>
> * Luke Han <[hidden email]>
>
> === Sponsoring Entity ===
> Incubator PMC



Re: [VOTE] Superset Proposal for Apache Incubator

Naresh Agarwal-2
In reply to this post by Jeff Feng-2
+1 (non-binding).

Thanks
Naresh Agarwal

On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:

> +1 (binding)
>
>
>
> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
>
> > +1 (binding)
> >
> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> > <[hidden email]> wrote:
> > > +1 (binding)
> > >
> > > On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
> > >
> > >     +1 binding
> > >
> > >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
> > >     >
> > >     > +1 (non-binding)
> > >     >
> > >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
> > >     >
> > >     >> +1 (binding)
> > >     >>
> > >     >> Thanks,
> > >     >> Ashutosh
> > >     >>
> > >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
> > >     >>
> > >     >>> +1 binding
> > >     >>>
> > >     >>> Love to see Superset to be new incubator project.
> > >     >>>
> > >     >>>
> > >     >>> Best Regards!
> > >     >>> ---------------------
> > >     >>>
> > >     >>> Luke Han
> > >     >>>
> > >     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
> > >     >>>
> > >     >>>> === Subversion Directory ===
> > >     >>>> Git is the preferred source control system.
> > >     >>>> http://svn.apache.org/repos/asf/incubator/superset
> > >     >>>>
> > >     >>>> == Git Repository ==
> > >     >>>> Git is the preferred source control system, we’re assuming
> > >     >>>> https://github.com/apache/incubator-superset based on the
> > naming scheme
> > >     >>>>
> > >     >>>> == Issue Tracking ==
> > >     >>>> JIRA Superset (SUPERSET). If possible, we’d like to use Github
> > issues &
> > >     >>>> PRs
> > >     >>>> to manage our project as much as possible. It’s been said that
> > there are
> > >     >>>> ways to keep Github’s issues in sync with Jira, allowing us to
> > get best
> > >     >> of
> > >     >>>> both worlds. If that is not possible, we will comply to using
> > Jira.
> > >     >>>>
> > >     >>>> == Other Resources ==
> > >     >>>> We currently use a set of Github integrated services that are
> > free to
> > >     >> the
> > >     >>>> open source community, like Travis-ci, Code Climate,
> Coveralls,
> > >     >>>> Landscape.io, Requires.io, david-dm and Gitter. We would like
> > to keep
> > >     >>>> using
> > >     >>>> these services as they allow us to scale contributions and
> > optimize our
> > >     >>>> development flows. These services require some elevated rights
> > on the
> > >     >>>> Github repository in order to set up or tune and we would like
> > for the
> > >     >>>> committers to have the required rights.
> > >     >>>>
> > >     >>>>
> > >     >>>> == Initial Committers ==
> > >     >>>>
> > >     >>>> * Maxime Beauchemin <[hidden email]> - PPMC &
> > Committer
> > >     >>>> * Alanna Scott <[hidden email]> - PPMC & Committer
> > >     >>>> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> > >     >>>> * Vera Liu <[hidden email]> - Committer
> > >     >>>> * Jeff Feng <[hidden email]> - PPMC & Committer
> > >     >>>> * Ashutosh Chauhan <[hidden email]> - Mentor &
> Committer
> > >     >>>> * Nishant Bangarwa <[hidden email]> - PPMC &
> > Committer
> > >     >>>> * Slim Bouguerra <[hidden email]> - Committer
> > >     >>>> * Priyank Shah <[hidden email]> - Committer
> > >     >>>> * Harsha Chintalapani <[hidden email]> -
> > Committer
> > >     >>>> * Daniel Dai <[hidden email]> - Champion & Committer
> > >     >>>> * Luke Han <[hidden email]> - Mentor
> > >     >>>>
> > >     >>>> == Affiliations ==
> > >     >>>> The initial committers are employees of Airbnb Inc. and
> > Hortonworks.
> > >     >>>>
> > >     >>>> == Sponsors ==
> > >     >>>>
> > >     >>>> === Champion ===
> > >     >>>> Daniel Dai <[hidden email]>
> > >     >>>>
> > >     >>>> === Nominated Mentors ===
> > >     >>>> * Ashutosh Chauhan <[hidden email]>
> > >     >>>> * Luke Han <[hidden email]>
> > >     >>>>
> > >     >>>> === Sponsoring Entity ===
> > >     >>>> Incubator PMC
> > >     >>>>

Re: [VOTE] Superset Proposal for Apache Incubator

Edward J. Yoon-2
+1 binding

On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal <[hidden email]> wrote:

> +1 (non-binding).
>
> Thanks
> Naresh Agarwal
>
> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:
>
>> +1 (binding)
>>
>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
>>
>> > +1 (binding)
>> >
>> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey <[hidden email]> wrote:
>> > > +1 (binding)
>> > >
>> > > On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
>> > >
>> > >     +1 binding
>> > >
>> > >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
>> > >     >
>> > >     > +1 (non-binding)
>> > >     >
>> > >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
>> > >     >
>> > >     >> +1 (binding)
>> > >     >>
>> > >     >> Thanks,
>> > >     >> Ashutosh
>> > >     >>
>> > >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
>> > >     >>
>> > >     >>> +1 binding
>> > >     >>>
>> > >     >>> Love to see Superset to be new incubator project.
>> > >     >>>
>> > >     >>> Best Regards!
>> > >     >>> ---------------------
>> > >     >>>
>> > >     >>> Luke Han
>> > >     >>>



--
Best Regards, Edward J. Yoon



Re: [VOTE] Superset Proposal for Apache Incubator

Jeff Feng-2
Hello everyone,

Thank you for reviewing our Superset proposal and for considering it for
the Apache Incubator.  So far, I believe we have 8 binding votes and 2
non-binding votes.

As Taylor mentioned earlier, we made a minor update to the wording in the
"Source and Intellectual Property Submission Plan" section based on a
suggestion by John Ament.  The update confirms the previously unstated
assumption that we will submit an SGA.  I have copied the updated proposal
from the wiki into this email and highlighted the new sentence in yellow.

Folks on the cc line who have already voted, please let us know if the
change impacts your vote.

Thank you all,
Jeff



= Superset =

== Abstract ==
Superset is an enterprise-ready web application for data exploration, data
visualization and dashboarding.

== Proposal ==
Superset is business intelligence (BI) software that helps modern
organizations visualize and interact with their data. Superset enables
users to explore data from a variety of databases, assemble beautiful
dashboards and share their findings.  Superset works neatly with all modern
SQL-speaking databases, and integrates with Druid.io to provide real-time,
interactive, blazing-fast data access to large datasets.

== Background ==
Data is mission critical. To succeed in this era, organizations need to
provide low-friction, intuitive and interactive access to data. It is
paramount for knowledge workers to be capable of answering their own
questions by querying, exploring and visualizing data.

The entire business intelligence industry has pivoted from a model of
centralized top-down platforms driven by IT organizations to self-service
analytics and agile workflows by any user.  This shift unblocks centralized
service bottlenecks for creating data visualizations while also creating an
environment that is iterative and fast-moving.  This means that business
intelligence software must also be easy and delightful to use.
Self-service analytics doesn’t mean that admin and governance features are
not needed.  Modern BI tools provide fine-grained access controls and auditing
capabilities to understand how data is being used.  Superset is a solution
that delivers on all of these vectors.

The technology stack is also constantly morphing - vendors are struggling
to provide cheap, quick and easy solutions to access data.  Business
intelligence users are finding existing solutions lacking as these software
products either disregard or react slowly to recent game-changing
technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
React.js and IPython’s Jupyter.

== Rationale ==
Business intelligence is more relevant today than at any other point in
history.  Organizations are currently very limited in options for open
source data visualization solutions, especially solutions that are both
self-service and enterprise-ready.  Every company informing their decisions
with data needs a BI tool.

We believe that Superset will be a strong complement to existing Apache
Software Foundation technologies by offering scalable user interactions to
distributed storage and computation solutions.  Users will often find that
Superset can act as a catalyst for tooling that can visualize the byproduct
of data and computation infrastructure.

Superset has many key design elements that help fill a gap in current
solutions for organizations:
 * Easy, low-friction access to data through a simple, web-based data
exploration interface.  Composing charts and dashboards is intuitive.
Eliminating the need to write code or SQL empowers anyone to use it.
 * Access to a wide array of rich, interactive data visualization types.
 * Enterprise-ready: Integration with different authentication mechanisms
and granular permissions centered around actions and data access.
 * Realtime & fast: Superset provides realtime analytics at the speed of
thought on very large datasets when integrated with Druid.io.
 * Broad data access: Consume data out of any SQL-speaking relational
database.
 * Extensible: Can be extended to talk to many NoSQL databases like Apache
Drill, Elasticsearch, and other popular database engines.
 * Fast loading dashboards with configurable web-scale caching.
 * Plug-in framework that enables organizations to build custom analytical
applications with new UI/UX interfaces.
 * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
with more flexibility.  SQL Lab integrates with the visualization engine
seamlessly.

== Initial Goals ==
The initial goals of the Superset project are several-fold:
 * Move the existing codebase to Apache and integrate with the Apache
development process.
 * Redesign the user interface and interaction model for creating
visualizations/dashboards and connecting to data sources
 * Build robust support for security and governance of the tool, including
popular authorization modules (such as Apache Ranger and Apache Sentry)
and a more sophisticated permissions system
 * Grow the extensibility of the project both in terms of enhanced
connectivity to NoSQL-based data sources and creating a plug-in framework
that enables organizations to build custom analytical applications which
require a new UI/UX

== Current Status ==
By many standards, Superset is already a successful open source project. As
of March 2017, Superset is officially used in production at about a dozen
companies, has received contributions from over one hundred contributors on
GitHub, and has 1500+ forks and 12k+ stars.

Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
significant contributions, and expressed their commitment to the project.
The product is feature complete and has been viable for months. It already
serves as the main interface for consuming data at many companies of
different sizes.

While the product is usable, there’s room for improvement across the board,
starting with providing a smoother user experience around content creation,
making sure all features work out-of-the-box on more platforms and
databases, providing better user training guides and videos, having a
predictable release process, and increasing the overall quality of the
Superset releases.

=== Meritocracy ===
We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. Several companies have expressed interest in
this project, and we intend to invite additional developers to participate.
We will encourage and monitor community participation so that privileges
can be extended to those that contribute.

=== Community ===
The need for an enterprise-ready data visualization and exploration
platform in the open source community is tremendous.  While Superset is
fairly well known, recognized and used within the Druid.io community,
adoption is currently limited outside of that niche. There is a huge
opportunity to grow the community to hundreds if not thousands of
organizations, and we are hoping that embracing “the Apache way” will
accelerate the growth of our community.

We have already been active in seeking and inviting contributions, and are
planning to scale the project by investing time and building out the support
structure to grow the community.

=== Core Developers ===
The initial committers for Superset include experienced full stack,
front-end and data engineers:
 * Maxime Beauchemin (Airbnb)
 * Alanna Scott (Airbnb)
 * Bogdan Kyryliuk (Airbnb)
 * Vera Liu (Airbnb)
 * Jeff Feng (Airbnb)
 * Ashutosh Chauhan (Hortonworks)
 * Nishant Bangarwa (Hortonworks)
 * Slim Bouguerra (Hortonworks)
 * Priyank Shah (Hortonworks)
 * Sriharsha Chintalapani (Hortonworks)
 * Daniel Dai (Hortonworks)

We realize that additional employer diversity is needed, and we will work
aggressively to recruit developers from additional companies.

=== Alignment ===
The initial committers strongly believe that a system for interactive
visualization of data will gain broader adoption as an open source,
community driven project, where the community can contribute not only to
the core components, but also to a growing collection of connectors,
visualizations and improved integration with all potential data sources.
Superset already integrates closely with Apache Hive, the Hive metastore,
as well as most SQL-speaking databases found in modern data ecosystems.

== Known Risks ==

=== Orphaned Products ===
Superset is a vital component for visualizing, accessing and
democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
component of the DataFlow product offering.  Thus, the risk of the project
being orphaned is relatively low.  The project could be at risk if Airbnb
changes their approach for democratizing data or if Hortonworks changes
their strategy in the market.  In such an event, the committers plan to
continue working on the project on their own time, though progress
will likely be slower.  We plan to mitigate this risk by recruiting
additional committers.

=== Inexperience with Open Source ===
The initial committers include veteran Apache members (committers and PPMC
members) and other developers who have varying degrees of experience with
open source projects. All have been involved with source code that has been
released under an open source license, and several also have experience
developing code with an open source development process.

=== Homogenous Developers ===
The initial committers are employed by Airbnb Inc. and Hortonworks. We are
committed to recruiting additional committers from other companies.

=== Reliance on Salaried Developers ===
It is expected that Superset development will occur on both salaried time
and on volunteer time, after hours. The majority of initial committers are
paid by their employer to contribute to this project. However, they are all
passionate about the project, and we are confident that the project will
continue even if no salaried developers contribute to the project. We are
committed to recruiting additional committers including non-salaried
developers.

=== Relationships with Other Apache Products ===
To the knowledge of the Initial Committers, there are no direct competitors
to Superset within the Apache Software Foundation.  That said, Apache
Zeppelin is an indirect competitor, but it solves a different use case.

Apache Zeppelin is a web-based notebook that enables interactive data
analytics. It enables the creation of beautiful data-driven, interactive
and collaborative documents with SQL, Scala and more.  Although a user can
create data visualizations using this project, it leverages a
notebook-style user interface and is geared towards the Spark community
where Scala and SQL co-exist.

We look forward to collaborating with the Zeppelin community, as well as
other Apache communities.

=== An Excessive Fascination with the Apache Brand ===
Superset is solving two huge challenges:
 * The challenge of enabling every knowledge worker to make data-informed
decisions, particularly those who are not deeply skilled at writing SQL.
 * The challenge of visualizing huge amounts of data interactively and in
real time.

Superset was first developed as a data visualization solution for Druid.io
as a way to visualize billions of rows of data.  Since then, usage of
Superset has expanded to address data visualization use cases across
SQL-speaking data sources as well.

Our rationale for developing Superset as an Apache project is detailed in
the Rationale Section.  We believe that the Apache brand and community
process will help us attract more contributors to this project, and help
grow the footprint of the project through usage at other organizations and
within other applications.  Establishing consensus among users and
developers will result in a more valuable tool for everyone.

== Documentation ==
References to further reading material:
 * [[http://airbnb.io/superset/|Superset Documentation]]
 * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
 * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]

== Initial Source ==
The origin of the proposed code base can be found at
https://github.com/airbnb/superset.  The code base is primarily in Python.

== Source and Intellectual Property Submission Plan ==
Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
incubator. We do not expect any complications for the submission of the
Superset code base.  Our code is already on GitHub and there is only a
single code base.

== External Dependencies ==
List of Python packages, from the Python Package Index (PyPI):

 * boto3
 * celery
 * cryptography
 * flask-appbuilder
 * flask-cache
 * flask-migrate
 * flask-script
 * flask-sqlalchemy
 * flask-testing
 * humanize
 * gunicorn
 * markdown
 * pandas
 * parsedatetime
 * pydruid
 * PyHive
 * python-dateutil
 * requests
 * simplejson
 * six
 * sqlalchemy
 * sqlalchemy-utils
 * sqlparse
 * thrift
 * thrift-sasl
 * werkzeug

List of JavaScript packages, from npm:
 * autobind-decorator
 * bootstrap
 * bootstrap-datepicker
 * brace
 * brfs
 * cal-heatmap
 * classnames
 * d3
 * d3-cloud
 * d3-sankey
 * d3-scale
 * d3-tip
 * datamaps
 * datatables-bootstrap3-plugin
 * datatables.net-bs
 * font-awesome
 * gridster
 * immutability-helper
 * immutable
 * jquery
 * lodash.throttle
 * mapbox-gl
 * moment
 * moments
 * mustache
 * nvd3
 * react
 * react-ace
 * react-bootstrap
 * react-bootstrap-table
 * react-dom
 * react-draggable
 * react-gravatar
 * react-grid-layout
 * react-map-gl
 * react-redux
 * react-resizable
 * react-select
 * react-syntax-highlighter
 * reactable
 * redux
 * redux-localstorage
 * redux-thunk
 * shortid
 * style-loader
 * supercluster
 * topojson
 * victory
 * viewport-mercator-project

== Cryptography ==
The proposal does not include cryptographic code.

== Required Resources ==

=== Mailing List ===
There is a current mailing list, the Google Group “airbnb_superset”, that we
plan to deprecate once the Apache.org mailing lists are ready to serve our
community.

 * superset-private
 * superset-dev
 * superset-user

=== Subversion Directory ===
Git is the preferred source control system.
http://svn.apache.org/repos/asf/incubator/superset

== Git Repository ==
Git is the preferred source control system. We are assuming
https://github.com/apache/incubator-superset, based on the naming scheme.

== Issue Tracking ==
JIRA Superset (SUPERSET). If possible, we’d like to use GitHub issues & PRs
to manage our project. It’s been said that there are ways to keep GitHub
issues in sync with JIRA, allowing us to get the best of both worlds. If
that is not possible, we will comply and use JIRA.

== Other Resources ==
We currently use a set of GitHub-integrated services that are free to the
open source community, like Travis CI, Code Climate, Coveralls,
Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
these services as they allow us to scale contributions and optimize our
development flows. These services require some elevated rights on the
GitHub repository in order to set up or tune, and we would like the
committers to have the required rights.


== Initial Committers ==

 * Maxime Beauchemin <[hidden email]> - PPMC & Committer
 * Alanna Scott <[hidden email]> - PPMC & Committer
 * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
 * Vera Liu <[hidden email]> - Committer
 * Jeff Feng <[hidden email]> - PPMC & Committer
 * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
 * Nishant Bangarwa <[hidden email]> - PPMC & Committer
 * Slim Bouguerra <[hidden email]> - Committer
 * Priyank Shah <[hidden email]> - Committer
 * Harsha Chintalapani <[hidden email]> - Committer
 * Daniel Dai <[hidden email]> - Champion & Committer
 * Luke Han <[hidden email]> - Mentor

== Affiliations ==
The initial committers are employees of Airbnb Inc. and Hortonworks.

== Sponsors ==

=== Champion ===
Daniel Dai <[hidden email]>

=== Nominated Mentors ===
 * Ashutosh Chauhan <[hidden email]>
 * Luke Han <[hidden email]>

=== Sponsoring Entity ===
Incubator PMC





On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <[hidden email]>
wrote:

> +1 binding
>
> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
> <[hidden email]> wrote:
> > +1 (non-binding).
> >
> > Thanks
> > Naresh Agarwal
> >
> > On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]>
> wrote:
> >
> >> +1 (binding)
> >>
> >>
> >>
> >> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
> >>
> >> > +1 (binding)
> >> >
> >> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> >> > <[hidden email]> wrote:
> >> > > +1 (binding)
> >> > >
> >> > > On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
> >> > >
> >> > >     +1 binding
> >> > >
> >> > >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]>
> >> > wrote:
> >> > >     >
> >> > >     > +1 (non-binding)
> >> > >     >
> >> > >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <
> >> > [hidden email]>
> >> > >     > wrote:
> >> > >     >
> >> > >     >> +1 (binding)
> >> > >     >>
> >> > >     >> Thanks,
> >> > >     >> Ashutosh
> >> > >     >>
> >> > >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]
> >
> >> > wrote:
> >> > >     >>
> >> > >     >>> +1 binding
> >> > >     >>>
> >> > >     >>> Love to see Superset to be new incubator project.
> >> > >     >>>
> >> > >     >>>
> >> > >     >>> Best Regards!
> >> > >     >>> ---------------------
> >> > >     >>>
> >> > >     >>> Luke Han
> >> > >     >>>
> >> > >     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <
> >> [hidden email]>
> >> > wrote:
> >> > >     >>>
> >> > >     >>>> Dear Apache Incubator Community,
> >> > >     >>>>
> >> > >     >>>> We have updated the Superset proposal
> >> > >     >>>> <https://wiki.apache.org/incubator/SupersetProposal>
> (copied
> >> > below) for
> >> > >     >>>>
> >> > >     >>>> Apache Incubation with an additional mentor (Luke Han -
> >> > >     >>>> [hidden email]),
> >> > >     >>>> and would like to start a vote thread for acceptance into
> the
> >> > incubator.
> >> > >     >>>>
> >> > >     >>>> Our team is excited to share Superset with the Apache
> >> community
> >> > and we
> >> > >     >>>> hope
> >> > >     >>>> for the your continued support!
> >> > >     >>>>
> >> > >     >>>> Cheers,
> >> > >     >>>> Jeff & the Superset Team
> >> > >     >>>>
> >> > >     >>>>
> >> > >     >>>>
> >> > >     >>>>
> >> > >     >>>> = Superset =
> >> > >     >>>>
> >> > >     >>>> == Abstract ==
> >> > >     >>>> Superset is an enterprise-ready web application for data
> >> > exploration,
> >> > >     >> data
> >> > >     >>>> visualization and dashboarding.
> >> > >     >>>>
> >> > >     >>>> == Proposal ==
> >> > >     >>>> Superset is business intelligence (BI) software that helps
> >> > modern
> >> > >     >>>> organizations visualize and interact with their data.
> Superset
> >> > enables
> >> > >     >>>> users explore data from a variety of databases, assemble
> >> > beautiful
> >> > >     >>>> dashboards and share their findings.  Superset works neatly
> >> > with all
> >> > >     >>>> modern
> >> > >     >>>> SQL-speaking databases, and integrates with Druid.io to
> >> provide
> >> > >     >> real-time,
> >> > >     >>>> interactive, blazing fast data access to large datasets.
> >> > >     >>>>
> >> > >     >>>> == Background ==
> >> > >     >>>> Data is mission critical. To succeed in this era,
> >> organizations
> >> > need to
> >> > >     >>>> provide low-friction, intuitive and interactive access to
> >> data.
> >> > It is
> >> > >     >>>> paramount for knowledge workers to be capable of answering
> >> > their own
> >> > >     >>>> questions by querying, exploring and visualizing data.
> >> > >     >>>>
> >> > >     >>>> The entire business intelligence industry has pivoted from
> a
> >> > model of
> >> > >     >>>> centralized top-down platforms driven by IT organizations
> to
> >> > >     >> self-service
> >> > >     >>>> analytics and agile workflows by any user.  This shift
> >> unblocks
> >> > >     >>>> centralized
> >> > >     >>>> service bottlenecks for creating data visualizations while
> >> also
> >> > creating
> >> > >     >>>> an
> >> > >     >>>> environment that is iterative and fast-moving.  This means
> >> that
> >> > business
> >> > >     >>>> intelligence software must also be easy and delightful to
> use.
> >> > >     >>>> Self-service analytics doesn’t mean that admin and
> governance
> >> > features
> >> > >     >> are
> >> > >     >>>> not needed.
> >> > >     >>>> Modern BI tools provide fine-grain access controls and
> >> auditing
> >> > >     >>>> capabilities to understand how data is being used.
> Superset
> >> is
> >> > a
> >> > >     >> solution
> >> > >     >>>> that delivers on all of these vectors.
> >> > >     >>>>
> >> > >     >>>> The technology stack is also constantly morphing - vendors
> are
> >> > >     >> struggling
> >> > >     >>>> to provide cheap, quick and easy solutions to access data.
> >> > Business
> >> > >     >>>> intelligence users are finding existing solutions lacking
> as
> >> > these
> >> > >     >>>> software
> >> > >     >>>> products either disregard or react slowly to recent
> >> > game-changing
> >> > >     >>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache
> >> > Kylin, d3.js,
> >> > >     >>>> React.js and iPython’s Jupyter for instance.
> >> > >     >>>>
> >> > >     >>>> == Rationale ==
> >> > >     >>>> Business intelligence is more relevant today than at any
> other
> >> > point in
> >> > >     >>>> history.  Organizations are currently very limited in
> options
> >> > for open
> >> > >     >>>> source data visualization solutions, especially solutions
> that
> >> > are both
> >> > >     >>>> self-service and enterprise-ready.  Every company informing
> >> > their
> >> > >     >>>> decisions
> >> > >     >>>> with data needs a BI tool.
> >> > >     >>>>
> >> > >     >>>> We believe that Superset will be a strong compliment to
> >> > existing Apache
> >> > >     >>>> Software Foundation technologies by offering scalable user
> >> > interactions
> >> > >     >> to
> >> > >     >>>> distributed storage and computation solutions.  Users will
> >> > often find
> >> > >     >> that
> >> > >     >>>> Superset can act as a catalyst for tooling that can
> visualize
> >> > the
> >> > >     >>>> byproduct
> >> > >     >>>> of data and computation infrastructure.
> >> > >     >>>>
> >> > >     >>>> Superset has many key design elements that help fill a gap
> in
> >> > current
> >> > >     >>>> solutions for organizations:
> >> > >     >>>> * Easy, low friction access to data through a simple,
> >> web-based
> >> > data
> >> > >     >>>> exploration interface.  Composing charts and dashboards are
> >> > intuitive.
> >> > >     >>>> Eliminating the need to write code or SQL empowers anyone
> to
> >> > use it.
> >> > >     >>>> * Access to a wide array of rich, interactive data
> >> > visualization types.
> >> > >     >>>> * Enterprise-ready: Integration with different
> authentication
> >> > >     >> mechanisms
> >> > >     >>>> and granular permissions centered around actions and data
> >> > access.
> >> > >     >>>> * Realtime & fast: Superset provides realtime analytics at
> the
> >> > speed of
> >> > >     >>>> thought on very large datasets when integrated with
> Druid.io.
> >> > >     >>>> * Broad data access: Consume data out of any SQL-speaking
> >> > relational
> >> > >     >>>> database.
> >> > >     >>>> * Extensible: Can be extended to talk to many noSQL
> databases
> >> > like
> >> > >     >> Apache
> >> > >     >>>> Drill, Elastic Search, and other popular database engines.
> >> > >     >>>> * Fast loading dashboards with configurable web-scale
> caching.
> >> > >     >>>> * Plug-in framework that enables organizations to build
> custom
> >> > >     >> analytical
> >> > >     >>>> applications with new UI/UX interfaces.
> >> > >     >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers
> >> > SQL-speaking users
> >> > >     >>>> with more flexibility.  SQL Lab integrates with the
> >> > visualization engine
> >> > >     >>>> seamlessly.
> >> > >     >>>>
> >> > >     >>>> == Initial Goals ==
> >> > >     >>>> The initial goals of the Superset project are several-fold:
> >> > >     >>>> * Move the existing codebase to Apache and integrate with
> the
> >> > Apache
> >> > >     >>>> development process.
> >> > >     >>>> * Redesign the user interface and interaction model for
> >> creating
> >> > >     >>>> visualizations/dashboards and connecting to data sources
> >> > >     >>>> * Build robust support for security and governance of the
> tool
> >> > >     >> including
> >> > >     >>>> popular authorization modules (including Apache Ranger and
> >> > Apache
> >> > >     >> Sentry)
> >> > >     >>>> and a more sophisticated permissions system
> >> > >     >>>> * Grow the extensibility of the project both in terms of
> >> > enhanced
> >> > >     >>>> connectivity to NoSQL-based data sources and creating a
> >> plug-in
> >> > >     >> framework
> >> > >     >>>> that enables organizations to build custom analytical
> >> > applications which
> >> > >     >>>> require a new UI/UX
> >> > >     >>>>
> >> > >     >>>> == Current Status ==
> >> > >     >>>> By many standards, Superset is already a successful open
> >> source
> >> > project.
> >> > >     >>>> As
> >> > >     >>>> of March 2017, Superset is officially used in production at
> >> > about a
> >> > >     >> dozen
> >> > >     >>>> companies, has received contributions from over one hundred
> >> > contributors
> >> > >     >>>> on
> >> > >     >>>> Github, 1500+ forks, and 12k+ stars.
> >> > >     >>>>
> >> > >     >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have
> >> made
> >> > >     >>>> significant contributions, and expressed their commitment
> to
> >> the
> >> > >     >> project.
> >> > >     >>>> The product is feature complete and has been viable for
> >> months.
> >> > It
> >> > >     >> already
> >> > >     >>>> serves as the main interface for consuming data at many
> >> > companies of
> >> > >     >>>> different sizes.
> >> > >     >>>>
> >> > >     >>>> While the product is usable, there’s room for improvement
> >> > across the
> >> > >     >>>> board,
> >> > >     >>>> starting with providing a smoother user experience around
> >> > content
> >> > >     >>>> creation,
> >> > >     >>>> making sure all features work out-of-the-box on more
> platforms
> >> > and
> >> > >     >>>> databases, providing better user training guides and
> videos,
> >> > having a
> >> > >     >>>> predictable release process, and increasing the overall
> >> quality
> >> > of the
> >> > >     >>>> Superset releases.
> >> > >     >>>>
> >> > >     >>>> === Meritocracy ===
> >> > >     >>>> We plan to invest in supporting a meritocracy. We will
> discuss
> >> > the
> >> > >     >>>> requirements in an open forum. Several companies have
> >> expressed
> >> > interest
> >> > >     >>>> in
> >> > >     >>>> this project, and we intend to invite additional
> developers to
> >> > >     >>>> participate.
> >> > >     >>>> We will encourage and monitor community participation so
> that
> >> > privileges
> >> > >     >>>> can be extended to those that contribute.
> >> > >     >>>>
> >> > >     >>>> === Community ===
> >> > >     >>>> The need for an enterprise-ready data visualization and
> >> > exploration
> >> > >     >>>> platform in the open source community is tremendous.  While
> >> > Superset is
> >> > >     >>>> fairly well known, recognized and used within the Druid.io
> >> > community,
> >> > >     >>>> adoption is currently limited outside of that niche. There
> is
> >> a
> >> > huge
> >> > >     >>>> opportunity to grow the community to hundreds if not
> thousands
> >> > of
> >> > >     >>>> organizations, and we are hoping that embracing “the Apache
> >> > way” will
> >> > >     >>>> accelerate the growth of our community.
> >> > >     >>>>
> >> > >     >>>> We have already been active at seeking and inviting
> >> > contributions, and
> >> > >     >> are
> >> > >     >>>> planning to scale the project by investing time and growing
> >> the
> >> > support
> >> > >     >>>> structure to grow the community.
> >> > >     >>>>
> >> > >     >>>> === Core Developers ===
> >> > >     >>>> The initial committers for Superset include experienced
> full
> >> > stack,
> >> > >     >>>> front-end and data engineers:
> >> > >     >>>> * Maxime Beauchemin (Airbnb)
> >> > >     >>>> * Alanna Scott (Airbnb)
> >> > >     >>>> * Bogdan Kyryliuk (Airbnb)
> >> > >     >>>> * Vera Liu  (Airbnb)
> >> > >     >>>> * Jeff Feng (Airbnb)
> >> > >     >>>> * Ashutosh Chauhan (Hortonworks)
> >> > >     >>>> * Nishant Bangarwa (Hortonworks)
> >> > >     >>>> * Slim Bouguerra (Hortonworks)
> >> > >     >>>> * Priyank Shah (Hortonworks)
> >> > >     >>>> * Sriharsha Chintalapani (Hortonworks)
> >> > >     >>>> * Daniel Dai (Hortonworks)
> >> > >     >>>>
> >> > >     >>>> We realize that additional employer diversity is needed,
> and
> >> we
> >> > will
> >> > >     >> work
> >> > >     >>>> aggressively to recruit developers from additional
> companies.
> >> > >     >>>>
> >> > >     >>>> === Alignment ===
> >> > >     >>>> The initial committers strongly believe that a system for
> >> > interactive
> >> > >     >>>> visualization of data will gain broader adoption as an open
> >> > source,
> >> > >     >>>> community driven project, where the community can
> contribute
> >> > not only to
> >> > >     >>>> the core components, but also to a growing collection of
> >> > connectors,
> >> > >     >>>> visualizations and improving integration a all potential
> data
> >> > sources.
> >> > >     >>>> Superset already integrates closely with Apache Hive, the
> Hive
> >> > >     >> metastore,
> >> > >     >>>> as well as most SQL-speaking databases found in modern data
> >> > ecosystems.
> >> > >     >>>>
> >> > >     >>>> == Known Risks ==
> >> > >     >>>>
> >> > >     >>>> === Orphaned Products ===
> >> > >     >>>> Superset is a vital component for both visualizing,
> accessing
> >> > and
> >> > >     >>>> democratizing data at Airbnb.  Also at Hortonworks,
> Superset
> >> is
> >> > a core
> --
> Best Regards, Edward J. Yoon
>

Re: [VOTE] Superset Proposal for Apache Incubator

Felix Cheung
+1 (nonbinding)

On Wed, Apr 26, 2017 at 11:13 PM Jeff Feng <[hidden email]> wrote:

> Hello everyone,
>
> Thank you for checking out our proposal on Superset and for your
> consideration for the Apache Incubator.  So far, I believe we have 8
> binding votes and 2 non-binding votes.
>
> As Taylor mentioned earlier, we made a minor update to the wording in the
> "Source and Intellectual Property Submission Plan" section based on a
> suggestion by John Ament.  The update was to help confirm the previously
> unstated assumption that we will submit an SGA.  I have copied the updated
> proposal from the wiki into this email below and highlighted the new
> sentence in yellow.
>
> Folks on the cc line who have already voted, please let us know if the
> change impacts your vote.
>
> Thank you all,
> Jeff
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing-fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.  Modern BI tools provide fine-grained access controls and
> auditing capabilities to understand how data is being used.  Superset is a
> solution that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and IPython’s Jupyter.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
>  * Easy, low-friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
>  * Access to a wide array of rich, interactive data visualization types.
>  * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
>  * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
>  * Broad data access: Consume data out of any SQL-speaking relational
> database.
>  * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elasticsearch, and other popular database engines.
>  * Fast loading dashboards with configurable web-scale caching.
>  * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
>  * Move the existing codebase to Apache and integrate with the Apache
> development process.
>  * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
>  * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
>  * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> Github, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
>  * Maxime Beauchemin (Airbnb)
>  * Alanna Scott (Airbnb)
>  * Bogdan Kyryliuk (Airbnb)
>  * Vera Liu  (Airbnb)
>  * Jeff Feng (Airbnb)
>  * Ashutosh Chauhan (Hortonworks)
>  * Nishant Bangarwa (Hortonworks)
>  * Slim Bouguerra (Hortonworks)
>  * Priyank Shah (Hortonworks)
>  * Sriharsha Chintalapani (Hortonworks)
>  * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community-driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors and
> visualizations, and to improving integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, though the progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation.  That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a notebook-style
> user interface and is geared towards the Spark community, where
> Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
>  * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
>  * The challenge of visualizing huge amounts of data interactively and in
> real time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
>  * [[http://airbnb.io/superset/|Superset Documentation]]
>  * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>  * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> incubator. We do not expect any complications for the submission of the
> Superset code base.  Our code is already in Github and there is only a
> single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (PyPI):
>
>  * boto3
>  * celery
>  * cryptography
>  * flask-appbuilder
>  * flask-cache
>  * flask-migrate
>  * flask-script
>  * flask-sqlalchemy
>  * flask-testing
>  * humanize
>  * gunicorn
>  * markdown
>  * pandas
>  * parsedatetime
>  * pydruid
>  * PyHive
>  * python-dateutil
>  * requests
>  * simplejson
>  * six
>  * sqlalchemy
>  * sqlalchemy-utils
>  * sqlparse
>  * thrift
>  * thrift-sasl
>  * werkzeug
>
> List of JavaScript packages, from npm:
>  * autobind-decorator
>  * bootstrap
>  * bootstrap-datepicker
>  * brace
>  * brfs
>  * cal-heatmap
>  * classnames
>  * d3
>  * d3-cloud
>  * d3-sankey
>  * d3-scale
>  * d3-tip
>  * datamaps
>  * datatables-bootstrap3-plugin
>  * datatables.net-bs
>  * font-awesome
>  * gridster
>  * immutability-helper
>  * immutable
>  * jquery
>  * lodash.throttle
>  * mapbox-gl
>  * moment
>  * moments
>  * mustache
>  * nvd3
>  * react
>  * react-ace
>  * react-bootstrap
>  * react-bootstrap-table
>  * react-dom
>  * react-draggable
>  * react-gravatar
>  * react-grid-layout
>  * react-map-gl
>  * react-redux
>  * react-resizable
>  * react-select
>  * react-syntax-highlighter
>  * reactable
>  * redux
>  * redux-localstorage
>  * redux-thunk
>  * shortid
>  * style-loader
>  * supercluster
>  * topojson
>  * victory
>  * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is currently a mailing list, the Google Group “airbnb_superset”, which
> we plan to deprecate once the Apache.org lists are ready to serve our
> community.
>
>  * superset-private
>  * superset-dev
>  * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
> our project as much as possible. It’s been said that there are ways to
> keep Github’s issues in sync with Jira, allowing us to get the best of
> both worlds. If that is not possible, we will comply and use Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
>  * Maxime Beauchemin <[hidden email]> - PPMC & Committer
>  * Alanna Scott <[hidden email]> - PPMC & Committer
>  * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
>  * Vera Liu <[hidden email]> - Committer
>  * Jeff Feng <[hidden email]> - PPMC & Committer
>  * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
>  * Nishant Bangarwa <[hidden email]> - PPMC & Committer
>  * Slim Bouguerra <[hidden email]> - Committer
>  * Priyank Shah <[hidden email]> - Committer
>  * Harsha Chintalapani <[hidden email]> - Committer
>  * Daniel Dai <[hidden email]> - Champion & Committer
>  * Luke Han <[hidden email]> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <[hidden email]>
>
> === Nominated Mentors ===
>  * Ashutosh Chauhan <[hidden email]>
>  * Luke Han <[hidden email]>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>
>
>
>
> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <[hidden email]>
> wrote:
>
> > +1 binding
> >
> > On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
> > <[hidden email]> wrote:
> > > +1 (non-binding).
> > >
> > > Thanks
> > > Naresh Agarwal
> > >
> > > On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:
> > >
> > >> +1 (binding)
> > >>
> > >>
> > >>
> > >> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
> > >>
> > >> > +1 (binding)
> > >> >
> > >> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> > >> > <[hidden email]> wrote:
> > >> > > +1 (binding)
> > >> > >
> > >> > > On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
> > >> > >
> > >> > >     +1 binding
> > >> > >
> > >> > >     > On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
> > >> > >     >
> > >> > >     > +1 (non-binding)
> > >> > >     >
> > >> > >     > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
> > >> > >     >
> > >> > >     >> +1 (binding)
> > >> > >     >>
> > >> > >     >> Thanks,
> > >> > >     >> Ashutosh
> > >> > >     >>
> > >> > >     >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
> > >> > >     >>
> > >> > >     >>> +1 binding
> > >> > >     >>>
> > >> > >     >>> Love to see Superset become a new incubator project.
> > >> > >     >>>
> > >> > >     >>>
> > >> > >     >>> Best Regards!
> > >> > >     >>> ---------------------
> > >> > >     >>>
> > >> > >     >>> Luke Han
> > >> > >     >>>
> > >> > >     >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
> > >> > >     >>>
> > as
> > >> > these
> > >> > >     >>>> software
> > >> > >     >>>> products either disregard or react slowly to recent
> > >> > game-changing
> > >> > >     >>>> technologies like Druid.io, PrestoDB, Apache Drill,
> Apache
> > >> > Kylin, d3.js,
> > >> > >     >>>> React.js and iPython’s Jupyter for instance.
> > >> > >     >>>>
> > >> > >     >>>> == Rationale ==
> > >> > >     >>>> Business intelligence is more relevant today than at any
> > other
> > >> > point in
> > >> > >     >>>> history.  Organizations are currently very limited in
> > options
> > >> > for open
> > >> > >     >>>> source data visualization solutions, especially solutions
> > that
> > >> > are both
> > >> > >     >>>> self-service and enterprise-ready.  Every company
> informing
> > >> > their
> > >> > >     >>>> decisions
> > >> > >     >>>> with data needs a BI tool.
> > >> > >     >>>>
> > >> > >     >>>> We believe that Superset will be a strong compliment to
> > >> > existing Apache
> > >> > >     >>>> Software Foundation technologies by offering scalable
> user
> > >> > interactions
> > >> > >     >> to
> > >> > >     >>>> distributed storage and computation solutions.  Users
> will
> > >> > often find
> > >> > >     >> that
> > >> > >     >>>> Superset can act as a catalyst for tooling that can
> > visualize
> > >> > the
> > >> > >     >>>> byproduct
> > >> > >     >>>> of data and computation infrastructure.
> > >> > >     >>>>
> > >> > >     >>>> Superset has many key design elements that help fill a
> gap
> > in
> > >> > current
> > >> > >     >>>> solutions for organizations:
> > >> > >     >>>> * Easy, low friction access to data through a simple,
> > >> web-based
> > >> > data
> > >> > >     >>>> exploration interface.  Composing charts and dashboards
> are
> > >> > intuitive.
> > >> > >     >>>> Eliminating the need to write code or SQL empowers anyone
> > to
> > >> > use it.
> > >> > >     >>>> * Access to a wide array of rich, interactive data
> > >> > visualization types.
> > >> > >     >>>> * Enterprise-ready: Integration with different
> > authentication
> > >> > >     >> mechanisms
> > >> > >     >>>> and granular permissions centered around actions and data
> > >> > access.
> > >> > >     >>>> * Realtime & fast: Superset provides realtime analytics
> at
> > the
> > >> > speed of
> > >> > >     >>>> thought on very large datasets when integrated with
> > Druid.io.
> > >> > >     >>>> * Broad data access: Consume data out of any SQL-speaking
> > >> > relational
> > >> > >     >>>> database.
> > >> > >     >>>> * Extensible: Can be extended to talk to many noSQL
> > databases
> > >> > like
> > >> > >     >> Apache
> > >> > >     >>>> Drill, Elastic Search, and other popular database
> engines.
> > >> > >     >>>> * Fast loading dashboards with configurable web-scale
> > caching.
> > >> > >     >>>> * Plug-in framework that enables organizations to build
> > custom
> > >> > >     >> analytical
> > >> > >     >>>> applications with new UI/UX interfaces.
> > >> > >     >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers
> > >> > SQL-speaking users
> > >> > >     >>>> with more flexibility.  SQL Lab integrates with the
> > >> > visualization engine
> > >> > >     >>>> seamlessly.
> > >> > >     >>>>
> > >> > >     >>>> == Initial Goals ==
> > >> > >     >>>> The initial goals of the Superset project are
> several-fold:
> > >> > >     >>>> * Move the existing codebase to Apache and integrate with
> > the
> > >> > Apache
> > >> > >     >>>> development process.
> > >> > >     >>>> * Redesign the user interface and interaction model for
> > >> creating
> > >> > >     >>>> visualizations/dashboards and connecting to data sources
> > >> > >     >>>> * Build robust support for security and governance of the
> > tool
> > >> > >     >> including
> > >> > >     >>>> popular authorization modules (including Apache Ranger
> and
> > >> > Apache
> > >> > >     >> Sentry)
> > >> > >     >>>> and a more sophisticated permissions system
> > >> > >     >>>> * Grow the extensibility of the project both in terms of
> > >> > enhanced
> > >> > >     >>>> connectivity to NoSQL-based data sources and creating a
> > >> plug-in
> > >> > >     >> framework
> > >> > >     >>>> that enables organizations to build custom analytical
> > >> > applications which
> > >> > >     >>>> require a new UI/UX
> > >> > >     >>>>
> > >> > >     >>>> == Current Status ==
> > >> > >     >>>> By many standards, Superset is already a successful open
> > >> source
> > >> > project.
> > >> > >     >>>> As
> > >> > >     >>>> of March 2017, Superset is officially used in production
> at
> > >> > about a
> > >> > >     >> dozen
> > >> > >     >>>> companies, has received contributions from over one
> hundred
> > >> > contributors
> > >> > >     >>>> on
> > >> > >     >>>> Github, 1500+ forks, and 12k+ stars.
> > >> > >     >>>>
> > >> > >     >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks
> have
> > >> made
> > >> > >     >>>> significant contributions, and expressed their commitment
> > to
> > >> the
> > >> > >     >> project.
> > >> > >     >>>> The product is feature complete and has been viable for
> > >> months.
> > >> > It
> > >> > >     >> already
> > >> > >     >>>> serves as the main interface for consuming data at many
> > >> > companies of
> > >> > >     >>>> different sizes.
> > >> > >     >>>>
> > >> > >     >>>> While the product is usable, there’s room for improvement
> > >> > across the
> > >> > >     >>>> board,
> > >> > >     >>>> starting with providing a smoother user experience around
> > >> > content
> > >> > >     >>>> creation,
> > >> > >     >>>> making sure all features work out-of-the-box on more
> > platforms
> > >> > and
> > >> > >     >>>> databases, providing better user training guides and
> > videos,
> > >> > having a
> > >> > >     >>>> predictable release process, and increasing the overall
> > >> quality
> > >> > of the
> > >> > >     >>>> Superset releases.
> > >> > >     >>>>
> > >> > >     >>>> === Meritocracy ===
> > >> > >     >>>> We plan to invest in supporting a meritocracy. We will
> > discuss
> > >> > the
> > >> > >     >>>> requirements in an open forum. Several companies have
> > >> expressed
> > >> > interest
> > >> > >     >>>> in
> > >> > >     >>>> this project, and we intend to invite additional
> > developers to
> > >> > >     >>>> participate.
> > >> > >     >>>> We will encourage and monitor community participation so
> > that
> > >> > privileges
> > >> > >     >>>> can be extended to those that contribute.
> > >> > >     >>>>
> > >> > >     >>>> === Community ===
> > >> > >     >>>> The need for an enterprise-ready data visualization and
> > >> > exploration
> > >> > >     >>>> platform in the open source community is tremendous.
> While
> > >> > Superset is
> > >> > >     >>>> fairly well known, recognized and used within the
> Druid.io
> > >> > community,
> > >> > >     >>>> adoption is currently limited outside of that niche.
> There
> > is
> > >> a
> > >> > huge
> > >> > >     >>>> opportunity to grow the community to hundreds if not
> > thousands
> > >> > of
> > >> > >     >>>> organizations, and we are hoping that embracing “the
> Apache
> > >> > way” will
> > >> > >     >>>> accelerate the growth of our community.
> > >> > >     >>>>
> > >> > >     >>>> We have already been active at seeking and inviting
> > >> > contributions, and
> > >> > >     >> are
> > >> > >     >>>> planning to scale the project by investing time and
> growing
> > >> the
> > >> > support
> > >> > >     >>>> structure to grow the community.
> > >> > >     >>>>
> > >> > >     >>>> === Core Developers ===
> > >> > >     >>>> The initial committers for Superset include experienced
> > full
> > >> > stack,
> > >> > >     >>>> front-end and data engineers:
> > >> > >     >>>> * Maxime Beauchemin (Airbnb)
> > >> > >     >>>> * Alanna Scott (Airbnb)
> > >> > >     >>>> * Bogdan Kyryliuk (Airbnb)
> > >> > >     >>>> * Vera Liu  (Airbnb)
> > >> > >     >>>> * Jeff Feng (Airbnb)
> > >> > >     >>>> * Ashutosh Chauhan (Hortonworks)
> > >> > >     >>>> * Nishant Bangarwa (Hortonworks)
> > >> > >     >>>> * Slim Bouguerra (Hortonworks)
> > >> > >     >>>> * Priyank Shah (Hortonworks)
> > >> > >     >>>> * Sriharsha Chintalapani (Hortonworks)
> > >> > >     >>>> * Daniel Dai (Hortonworks)
> > >> > >     >>>>
> > >> > >     >>>> We realize that additional employer diversity is needed,
> > and
> > >> we
> > >> > will
> > >> > >     >> work
> > >> > >     >>>> aggressively to recruit developers from additional
> > companies.
> > >> > >     >>>>
> > >> > >     >>>> === Alignment ===
> > >> > >     >>>> The initial committers strongly believe that a system for
> > >> > interactive
> > >> > >     >>>> visualization of data will gain broader adoption as an
> open
> > >> > source,
> > >> > >     >>>> community driven project, where the community can
> > contribute
> > >> > not only to
> > >> > >     >>>> the core components, but also to a growing collection of
> > >> > connectors,
> > >> > >     >>>> visualizations and improving integration a all potential
> > data
> > >> > sources.
> > >> > >     >>>> Superset already integrates closely with Apache Hive, the
> > Hive
> > >> > >     >> metastore,
> > >> > >     >>>> as well as most SQL-speaking databases found in modern
> data
> > >> > ecosystems.
> > >> > >     >>>>
> > >> > >     >>>> == Known Risks ==
> > >> > >     >>>>
> > >> > >     >>>> === Orphaned Products ===
> > >> > >     >>>> Superset is a vital component for both visualizing,
> > accessing
> > >> > and
> > >> > >     >>>> democratizing data at Airbnb.  Also at Hortonworks,
> > Superset
> > >> is
> > >> > a core
> > >> > >     >>>> component of the DataFlow product offering.  Thus, the
> > risk of
> > >> > the
> > >> > >     >> project
> > >> > >     >>>> being orphaned is relatively low.  The project could be
> at
> > >> risk
> > >> > if
> > >> > >     >> Airbnb
> > >> > >     >>>> changes their approach for democratizing data or if
> > >> Hortonworks
> > >> > changes
> > >> > >     >>>> their strategy in the market.  In such an event, the
> > >> committers
> > >> > plan to
> > >> > >     >>>> continue working on the project on their own time,
> thought
> > the
> > >> > progress
> > >> > >     >>>> will likely be slower.  We plan to mitigate this risk by
> > >> > recruiting
> > >> > >     >>>> additional committers.
> > >> > >     >>>>
> > >> > >     >>>> === Inexperience with Open Source ===
> > >> > >     >>>> The initial committers include veteran Apache members
> > >> > (committers and
> > >> > >     >> PPMC
> > >> > >     >>>> members) and other developers who have varying degrees of
> > >> > experience
> > >> > >     >> with
> > >> > >     >>>> open source projects. All have been involved with source
> > code
> > >> > that has
> > >> > >     >>>> been
> > >> > >     >>>> released under an open source license, and several also
> > have
> > >> > experience
> > >> > >     >>>> developing code with an open source development process.
> > >> > >     >>>>
> > >> > >     >>>> === Homogenous Developers ===
> > >> > >     >>>> The initial committers are employed by Airbnb Inc. and
> > >> > Hortonworks. We
> > >> > >     >> are
> > >> > >     >>>> committed to recruiting additional committers from other
> > >> > companies.
> > >> > >     >>>>
> > >> > >     >>>> === Reliance on Salaried Developers ===
> > >> > >     >>>> It is expected that Superset development will occur on
> both
> > >> > salaried
> > >> > >     >> time
> > >> > >     >>>> and on volunteer time, after hours. The majority of
> initial
> > >> > committers
> > >> > >     >> are
> > >> > >     >>>> paid by their employer to contribute to this project.
> > However,
> > >> > they are
> > >> > >     >>>> all
> > >> > >     >>>> passionate about the project, and we are confident that
> the
> > >> > project will
> > >> > >     >>>> continue even if no salaried developers contribute to the
> > >> > project. We
> > >> > >     >> are
> > >> > >     >>>> committed to recruiting additional committers including
> > >> > non-salaried
> > >> > >     >>>> developers.
> > >> > >     >>>>
> > >> > >     >>>> === Relationships with Other Apache Products ===
> > >> > >     >>>> To the knowledge of the Initial Committers, there are no
> > >> direct
> > >> > >     >>>> competitors
> > >> > >     >>>> to Superset within the Apache Software Foundation.  That
> > said,
> > >> > Apache
> > >> > >     >>>> Zeppelin is an indirect competitor, but it solves a
> > different
> > >> > use case.
> > >> > >     >>>>
> > >> > >     >>>> Apache Zeppelin is a web-based notebook that enables
> > >> > interactive data
> > >> > >     >>>> analytics. It enables the creation of beautiful
> > data-driven,
> > >> > interactive
> > >> > >     >>>> and collaborative documents with SQL, Scala and more.
> > >> Although
> > >> > a user
> > >> > >     >> can
> > >> > >     >>>> create data visualizations using this project, it
> > leverages a
> > >> > notebook
> > >> > >     >>>> style user interfaces and it is geared towards the Spark
> > >> > community where
> > >> > >     >>>> Scala and SQL co-exist
> > >> > >     >>>>
> > >> > >     >>>> We look forward to collaborating with those communities,
> as
> > >> > well as
> > >> > >     >> other
> > >> > >     >>>> Apache communities.
> > >> > >     >>>>
> > >> > >     >>>> === An Excessive Fascination with the Apache Brand ===
> > >> > >     >>>> Superset is solving two huge challenges:
> > >> > >     >>>> The challenge of enabling every knowledge worker to make
> > data
> > >> > informed
> > >> > >     >>>> decisions, particularly those who are not deeply skilled
> at
> > >> > writing SQL.
> > >> > >     >>>> The challenge of visualizing huge amounts of data
> > >> interactively
> > >> > and in
> > >> > >     >>>> real-time
> > >> > >     >>>>
> > >> > >     >>>> Superset was first developed as a data visualization
> > solution
> > >> > for
> > >> > >     >> Druid.io
> > >> > >     >>>> as a way to visualize billions of rows of data.  Since
> > then,
> > >> > usage of
> > >> > >     >>>> Superset has expanded to address data visualization use
> > cases
> > >> > across SQL
> > >> > >     >>>> speaking data sources as well.
> > >> > >     >>>>
> > >> > >     >>>> Our rationale for developing Superset as an Apache
> project
> > is
> > >> > detailed
> > >> > >     >> in
> > >> > >     >>>> the Rationale Section.  We believe that the Apache brand
> > and
> > >> > community
> > >> > >     >>>> process will help us attract more contributors to this
> > >> project,
> > >> > and help
> > >> > >     >>>> grow the footprint of the project through usage at other
> > >> > organizations
> > >> > >     >> and
> > >> > >     >>>> within other applications.  Establishing consensus among
> > users
> > >> > and
> > >> > >     >>>> developers will result in a more valuable tool for
> > everyone.
> > >> > >     >>>>
> > >> > >     >>>> == Documentation ==
> > >> > >     >>>> References to further reading material:
> > >> > >     >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> > >> > >     >>>> * [[
> > >> > >     >>>> https://medium.com/airbnb-engi
> > neering/caravel-airbnb-s-data-
> > >> > >     >>>> exploration-platform-15a72aa610e5#.npqmmbu25|Blog
> > >> > >     >>>> Post:  Superset: Airbnb’s Data Exploration Platform]]
> > >> > >     >>>> * [[
> > >> > >     >>>> https://medium.com/airbnb-engi
> > neering/superset-scaling-data-
> > >> > >     >>>> access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.
> > >> > a505zvb1t|Blog
> > >> > >     >>>> Post:  Superset: Scaling Data Access & Visual Insights at
> > >> > Airbnb]]
> > >> > >     >>>>
> > >> > >     >>>> == Initial Source ==
> > >> > >     >>>> The origin of the proposed code base can be found at
> > >> > >     >>>> https://github.com/airbnb/superset.  The code base is
> > >> > primarily in
> > >> > >     >>>> Python.
> > >> > >     >>>>
> > >> > >     >>>> == Source and Intellectual Property Submission Plan ==
> > >> > >     >>>> We do not expect any complications for the submission of
> > the
> > >> > Superset
> > >> > >     >> code
> > >> > >     >>>> base.  Our code is already in Github and there is only a
> > >> single
> > >> > code
> > >> > >     >> base.
> > >> > >     >>>>
> > >> > >     >>>> == External Dependencies ==
> > >> > >     >>>> List of Python packages, from the Python Package Index
> > (Pypi):
> > >> > >     >>>>
> > >> > >     >>>> * boto3
> > >> > >     >>>> * celery
> > >> > >     >>>> * cryptography
> > >> > >     >>>> * flask-appbuilder
> > >> > >     >>>> * flask-cache
> > >> > >     >>>> * flask-migrate
> > >> > >     >>>> * flask-script
> > >> > >     >>>> * flask-sqlalchemy
> > >> > >     >>>> * flask-testing
> > >> > >     >>>> * humanize
> > >> > >     >>>> * gunicorn
> > >> > >     >>>> * markdown
> > >> > >     >>>> * pandas
> > >> > >     >>>> * parsedatetime
> > >> > >     >>>> * pydruid
> > >> > >     >>>> * PyHive
> > >> > >     >>>> * python-dateutil
> > >> > >     >>>> * requests
> > >> > >     >>>> * simplejson
> > >> > >     >>>> * six
> > >> > >     >>>> * sqlalchemy
> > >> > >     >>>> * sqlalchemy-utils
> > >> > >     >>>> * sqlparse
> > >> > >     >>>> * thrift
> > >> > >     >>>> * thrift-sasl
> > >> > >     >>>> * werkzeug
> > >> > >     >>>>
> > >> > >     >>>> List of Javascript packages, from NPM:
> > >> > >     >>>> * autobind-decorator
> > >> > >     >>>> * bootstrap
> > >> > >     >>>> * bootstrap-datepicker
> > >> > >     >>>> * brace
> > >> > >     >>>> * brfs
> > >> > >     >>>> * cal-heatmap
> > >> > >     >>>> * classnames
> > >> > >     >>>> * d3
> > >> > >     >>>> * d3-cloud
> > >> > >     >>>> * d3-sankey
> > >> > >     >>>> * d3-scale
> > >> > >     >>>> * d3-tip
> > >> > >     >>>> * datamaps
> > >> > >     >>>> * datatables-bootstrap3-plugin
> > >> > >     >>>> * datatables.net-bs
> > >> > >     >>>> * font-awesome
> > >> > >     >>>> * gridster
> > >> > >     >>>> * immutability-helper
> > >> > >     >>>> * immutable
> > >> > >     >>>> * jquery
> > >> > >     >>>> * lodash.throttle
> > >> > >     >>>> * mapbox-gl
> > >> > >     >>>> * moment
> > >> > >     >>>> * moments
> > >> > >     >>>> * mustache
> > >> > >     >>>> * nvd3
> > >> > >     >>>> * react
> > >> > >     >>>> * react-ace
> > >> > >     >>>> * react-bootstrap
> > >> > >     >>>> * react-bootstrap-table
> > >> > >     >>>> * react-dom
> > >> > >     >>>> * react-draggable
> > >> > >     >>>> * react-gravatar
> > >> > >     >>>> * react-grid-layout
> > >> > >     >>>> * react-map-gl
> > >> > >     >>>> * react-redux
> > >> > >     >>>> * react-resizable
> > >> > >     >>>> * react-select
> > >> > >     >>>> * react-syntax-highlighter
> > >> > >     >>>> * reactable
> > >> > >     >>>> * redux
> > >> > >     >>>> * redux-localstorage
> > >> > >     >>>> * redux-thunk
> > >> > >     >>>> * shortid
> > >> > >     >>>> * style-loader
> > >> > >     >>>> * supercluster
> > >> > >     >>>> * topojson
> > >> > >     >>>> * victory
> > >> > >     >>>> * viewport-mercator-project
> > >> > >     >>>>
> > >> > >     >>>> == Cryptography ==
> > >> > >     >>>> The proposal does not include cryptographic code.
> > >> > >     >>>>
> > >> > >     >>>> == Required Resources ==
> > >> > >     >>>>
> > >> > >     >>>> === Mailing List ===
> > >> > >     >>>> There is a current mailing list as a Google Group
> > >> > “airbnb_superset” that
> > >> > >     >>>> we
> > >> > >     >>>> are planning on deprecating as the Apache.org become
> ready
> > to
> > >> > serve our
> > >> > >     >>>> community.
> > >> > >     >>>>
> > >> > >     >>>> * superset-private
> > >> > >     >>>> * superset-dev
> > >> > >     >>>> * superset-user
> > >> > >     >>>>
> > >> > >     >>>> === Subversion Directory ===
> > >> > >     >>>> Git is the preferred source control system.
> > >> > >     >>>> http://svn.apache.org/repos/asf/incubator/superset
> > >> > >     >>>>
> > >> > >     >>>> == Git Repository ==
> > >> > >     >>>> Git is the preferred source control system, we’re
> assuming
> > >> > >     >>>> https://github.com/apache/incubator-superset based on
> the
> > >> > naming scheme
> > >> > >     >>>>
> > >> > >     >>>> == Issue Tracking ==
> > >> > >     >>>> JIRA Superset (SUPERSET). If possible, we’d like to use
> > Github
> > >> > issues &
> > >> > >     >>>> PRs
> > >> > >     >>>> to manage our project as much as possible. It’s been said
> > that
> > >> > there are
> > >> > >     >>>> ways to keep Github’s issues in sync with Jira, allowing
> > us to
> > >> > get best
> > >> > >     >> of
> > >> > >     >>>> both worlds. If that is not possible, we will comply to
> > using
> > >> > Jira.
> > >> > >     >>>>
> > >> > >     >>>> == Other Resources ==
> > >> > >     >>>> We currently use a set of Github integrated services that
> > are
> > >> > free to
> > >> > >     >> the
> > >> > >     >>>> open source community, like Travis-ci, Code Climate,
> > >> Coveralls,
> > >> > >     >>>> Landscape.io, Requires.io, david-dm and Gitter. We would
> > like
> > >> > to keep
> > >> > >     >>>> using
> > >> > >     >>>> these services as they allow us to scale contributions
> and
> > >> > optimize our
> > >> > >     >>>> development flows. These services require some elevated
> > rights
> > >> > on the
> > >> > >     >>>> Github repository in order to set up or tune and we would
> > like
> > >> > for the
> > >> > >     >>>> committers to have the required rights.
> > >> > >     >>>>
> > >> > >     >>>>
> > >> > >     >>>> == Initial Committers ==
> > >> > >     >>>>
> > >> > >     >>>> * Maxime Beauchemin <[hidden email]> -
> PPMC
> > &
> > >> > Committer
> > >> > >     >>>> * Alanna Scott <[hidden email]> - PPMC &
> > Committer
> > >> > >     >>>> * Bogdan Kyryliuk <[hidden email]> - PPMC &
> > Committer
> > >> > >     >>>> * Vera Liu <[hidden email]> - Committer
> > >> > >     >>>> * Jeff Feng <[hidden email]> - PPMC & Committer
> > >> > >     >>>> * Ashutosh Chauhan <[hidden email]> - Mentor &
> > >> Committer
> > >> > >     >>>> * Nishant Bangarwa <[hidden email]> - PPMC &
> > >> > Committer
> > >> > >     >>>> * Slim Bouguerra <[hidden email]> -
> Committer
> > >> > >     >>>> * Priyank Shah <[hidden email]> - Committer
> > >> > >     >>>> * Harsha Chintalapani <[hidden email]> -
> > >> > Committer
> > >> > >     >>>> * Daniel Dai <[hidden email]> - Champion & Committer
> > >> > >     >>>> * Luke Han <[hidden email]> - Mentor
> > >> > >     >>>>
> > >> > >     >>>> == Affiliations ==
> > >> > >     >>>> The initial committers are employees of Airbnb Inc. and
> > >> > Hortonworks.
> > >> > >     >>>>
> > >> > >     >>>> == Sponsors ==
> > >> > >     >>>>
> > >> > >     >>>> === Champion ===
> > >> > >     >>>> Daniel Dai <[hidden email]>
> > >> > >     >>>>
> > >> > >     >>>> === Nominated Mentors ===
> > >> > >     >>>> * Ashutosh Chauhan <[hidden email]>
> > >> > >     >>>> * Luke Han <[hidden email]>
> > >> > >     >>>>
> > >> > >     >>>> === Sponsoring Entity ===
> > >> > >     >>>> Incubator PMC
> > >> > >     >>>>
> > >> > >     >>>
> > >> > >     >>>
> > >> > >     >>
> > --
> > Best Regards, Edward J. Yoon

Re: [VOTE] Superset Proposal for Apache Incubator

Julian Hyde-3
In reply to this post by Jeff Feng-2
Re-affirming my vote:

+1 (binding)

> On Apr 26, 2017, at 11:12 PM, Jeff Feng <[hidden email]> wrote:
>
> Hello everyone,
>
> Thank you for checking out our proposal on Superset and for your
> consideration for the Apache Incubator.  So far, I believe we have 8
> binding votes and 2 non-binding votes.
>
> As Taylor mentioned earlier, we made a minor update to the wording in the
> "Source and Intellectual Property Submission Plan" section based on a
> suggestion by John Ament.  The update was to help confirm the previously
> unstated assumption that we will submit an SGA.  I have copied the updated
> proposal from the wiki into the email below and highlighted the new sentence
> in yellow.
>
> Folks on the cc line who have already voted, please let us know if the
> change impacts your vote.
>
> Thank you all,
> Jeff
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings.  Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.  This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.  This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.
> Modern BI tools provide fine-grained access controls and auditing
> capabilities to understand how data is being used.  Superset is a solution
> that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data.  Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter for instance.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history.  Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready.  Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions.  Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
> * Easy, low friction access to data through a simple, web-based data
> exploration interface.  Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
> * Access to a wide array of rich, interactive data visualization types.
> * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
> * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
> * Broad data access: Consume data out of any SQL-speaking relational
> database.
> * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elasticsearch, and other popular database engines.
> * Fast loading dashboards with configurable web-scale caching.
> * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility.  SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
> * Move the existing codebase to Apache and integrate with the Apache
> development process.
> * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
> * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
> * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> GitHub, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous.  While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
> * Maxime Beauchemin (Airbnb)
> * Alanna Scott (Airbnb)
> * Bogdan Kyryliuk (Airbnb)
> * Vera Liu  (Airbnb)
> * Jeff Feng (Airbnb)
> * Ashutosh Chauhan (Hortonworks)
> * Nishant Bangarwa (Hortonworks)
> * Slim Bouguerra (Hortonworks)
> * Priyank Shah (Hortonworks)
> * Sriharsha Chintalapani (Hortonworks)
> * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community-driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors and
> visualizations, and to improving integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> component of the DataFlow product offering.  Thus, the risk of the project
> being orphaned is relatively low.  The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market.  In such an event, the committers plan to
> continue working on the project on their own time, though the progress
> will likely be slower.  We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation.  That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more.  Although a user can
> create data visualizations using this project, it leverages a notebook-style
> user interface and is geared towards the Spark community, where
> Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
> * The challenge of enabling every knowledge worker to make data informed
> decisions, particularly those who are not deeply skilled at writing SQL.
> * The challenge of visualizing huge amounts of data interactively and in
> real-time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data.  Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section.  We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications.  Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
> * [[http://airbnb.io/superset/|Superset Documentation]]
> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset.  The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> incubator. We do not expect any complications for the submission of the
> Superset code base.  Our code is already in Github and there is only a
> single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (PyPI):
>
> * boto3
> * celery
> * cryptography
> * flask-appbuilder
> * flask-cache
> * flask-migrate
> * flask-script
> * flask-sqlalchemy
> * flask-testing
> * humanize
> * gunicorn
> * markdown
> * pandas
> * parsedatetime
> * pydruid
> * PyHive
> * python-dateutil
> * requests
> * simplejson
> * six
> * sqlalchemy
> * sqlalchemy-utils
> * sqlparse
> * thrift
> * thrift-sasl
> * werkzeug
>
> List of JavaScript packages, from npm:
> * autobind-decorator
> * bootstrap
> * bootstrap-datepicker
> * brace
> * brfs
> * cal-heatmap
> * classnames
> * d3
> * d3-cloud
> * d3-sankey
> * d3-scale
> * d3-tip
> * datamaps
> * datatables-bootstrap3-plugin
> * datatables.net-bs
> * font-awesome
> * gridster
> * immutability-helper
> * immutable
> * jquery
> * lodash.throttle
> * mapbox-gl
> * moment
> * moments
> * mustache
> * nvd3
> * react
> * react-ace
> * react-bootstrap
> * react-bootstrap-table
> * react-dom
> * react-draggable
> * react-gravatar
> * react-grid-layout
> * react-map-gl
> * react-redux
> * react-resizable
> * react-select
> * react-syntax-highlighter
> * reactable
> * redux
> * redux-localstorage
> * redux-thunk
> * shortid
> * style-loader
> * supercluster
> * topojson
> * victory
> * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list, the Google Group “airbnb_superset”, that we
> plan to deprecate once the Apache.org lists are ready to serve our
> community.
>
> * superset-private
> * superset-dev
> * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs
> to manage our project as much as possible. It’s been said that there are
> ways to keep Github’s issues in sync with Jira, allowing us to get the best
> of both worlds. If that is not possible, we will comply and use Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
> * Alanna Scott <[hidden email]> - PPMC & Committer
> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> * Vera Liu <[hidden email]> - Committer
> * Jeff Feng <[hidden email]> - PPMC & Committer
> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
> * Slim Bouguerra <[hidden email]> - Committer
> * Priyank Shah <[hidden email]> - Committer
> * Harsha Chintalapani <[hidden email]> - Committer
> * Daniel Dai <[hidden email]> - Champion & Committer
> * Luke Han <[hidden email]> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <[hidden email]>
>
> === Nominated Mentors ===
> * Ashutosh Chauhan <[hidden email]>
> * Luke Han <[hidden email]>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>
>
>
>
> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <[hidden email]>
> wrote:
>
>> +1 binding
>>
>> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
>> <[hidden email]> wrote:
>>> +1 (non-binding).
>>>
>>> Thanks
>>> Naresh Agarwal
>>>
>>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>>
>>>>
>>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
>>>>> <[hidden email]> wrote:
>>>>>> +1 (binding)
>>>>>>
>>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
>>>>>>
>>>>>>    +1 binding
>>>>>>
>>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
>>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
>>>>>>>
>>>>>>>> +1 (binding)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ashutosh
>>>>>>>>
>>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> +1 binding
>>>>>>>>>
>>>>>>>>> Love to see Superset become a new incubator project.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards!
>>>>>>>>> ---------------------
>>>>>>>>>
>>>>>>>>> Luke Han
>>>>>>>>>
>>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>> Dear Apache Incubator Community,
>>>>>>>>>>
>>>>>>>>>> We have updated the Superset proposal
>>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal>
>> (copied
>>>>> below) for
>>>>>>>>>>
>>>>>>>>>> Apache Incubation with an additional mentor (Luke Han -
>>>>>>>>>> [hidden email]),
>>>>>>>>>> and would like to start a vote thread for acceptance into
>> the
>>>>> incubator.
>>>>>>>>>>
>>>>>>>>>> Our team is excited to share Superset with the Apache
>>>> community
>>>>> and we
>>>>>>>>>> hope
>>>>>>>>>> for the your continued support!
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Jeff & the Superset Team
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> = Superset =
>>>>>>>>>>
>>>>>>>>>> == Abstract ==
>>>>>>>>>> Superset is an enterprise-ready web application for data
>>>>> exploration,
>>>>>>>> data
>>>>>>>>>> visualization and dashboarding.
>>>>>>>>>>
>>>>>>>>>> == Proposal ==
>>>>>>>>>> Superset is business intelligence (BI) software that helps
>>>>> modern
>>>>>>>>>> organizations visualize and interact with their data.
>> Superset
>>>>> enables
>>>>>>>>>> users explore data from a variety of databases, assemble
>>>>> beautiful
>>>>>>>>>> dashboards and share their findings.  Superset works neatly
>>>>> with all
>>>>>>>>>> modern
>>>>>>>>>> SQL-speaking databases, and integrates with Druid.io to
>>>> provide
>>>>>>>> real-time,
>>>>>>>>>> interactive, blazing fast data access to large datasets.
>>>>>>>>>>
>>>>>>>>>> == Background ==
>>>>>>>>>> Data is mission critical. To succeed in this era,
>>>> organizations
>>>>> need to
>>>>>>>>>> provide low-friction, intuitive and interactive access to
>>>> data.
>>>>> It is
>>>>>>>>>> paramount for knowledge workers to be capable of answering
>>>>> their own
>>>>>>>>>> questions by querying, exploring and visualizing data.
>>>>>>>>>>
>>>>>>>>>> The entire business intelligence industry has pivoted from
>> a
>>>>> model of
>>>>>>>>>> centralized top-down platforms driven by IT organizations
>> to
>>>>>>>> self-service
>>>>>>>>>> analytics and agile workflows by any user.  This shift
>>>> unblocks
>>>>>>>>>> centralized
>>>>>>>>>> service bottlenecks for creating data visualizations while
>>>> also
>>>>> creating
>>>>>>>>>> an
>>>>>>>>>> environment that is iterative and fast-moving.  This means
>>>> that
>>>>> business
>>>>>>>>>> intelligence software must also be easy and delightful to
>> use.
>>>>>>>>>> Self-service analytics doesn’t mean that admin and
>> governance
>>>>> features
>>>>>>>> are
>>>>>>>>>> not needed.
>>>>>>>>>> Modern BI tools provide fine-grain access controls and
>>>> auditing
>>>>>>>>>> capabilities to understand how data is being used.
>> Superset
>>>> is
>>>>> a
>>>>>>>> solution
>>>>>>>>>> that delivers on all of these vectors.
>>>>>>>>>>
>>>>>>>>>> The technology stack is also constantly morphing - vendors
>> are
>>>>>>>> struggling
>>>>>>>>>> to provide cheap, quick and easy solutions to access data.
>>>>> Business
>>>>>>>>>> intelligence users are finding existing solutions lacking
>> as
>>>>> these
>>>>>>>>>> software
>>>>>>>>>> products either disregard or react slowly to recent
>>>>> game-changing
>>>>>>>>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache
>>>>> Kylin, d3.js,
>>>>>>>>>> React.js and iPython’s Jupyter for instance.
>>>>>>>>>>
>>>>>>>>>> == Rationale ==
>>>>>>>>>> Business intelligence is more relevant today than at any
>> other
>>>>> point in
>>>>>>>>>> history.  Organizations are currently very limited in
>> options
>>>>> for open
>>>>>>>>>> source data visualization solutions, especially solutions
>> that
>>>>> are both
>>>>>>>>>> self-service and enterprise-ready.  Every company informing
>>>>> their
>>>>>>>>>> decisions
>>>>>>>>>> with data needs a BI tool.
>>>>>>>>>>
>>>>>>>>>> We believe that Superset will be a strong compliment to
>>>>> existing Apache
>>>>>>>>>> Software Foundation technologies by offering scalable user
>>>>> interactions
>>>>>>>> to
>>>>>>>>>> distributed storage and computation solutions.  Users will
>>>>> often find
>>>>>>>> that
>>>>>>>>>> Superset can act as a catalyst for tooling that can
>> visualize
>>>>> the
>>>>>>>>>> byproduct
>>>>>>>>>> of data and computation infrastructure.
>>>>>>>>>>
>>>>>>>>>> Superset has many key design elements that help fill a gap
>> in
>>>>> current
>>>>>>>>>> solutions for organizations:
>>>>>>>>>> * Easy, low friction access to data through a simple,
>>>> web-based
>>>>> data
>>>>>>>>>> exploration interface.  Composing charts and dashboards are
>>>>> intuitive.
>>>>>>>>>> Eliminating the need to write code or SQL empowers anyone
>> to
>>>>> use it.
>>>>>>>>>> * Access to a wide array of rich, interactive data
>>>>> visualization types.
>>>>>>>>>> * Enterprise-ready: Integration with different
>> authentication
>>>>>>>> mechanisms
>>>>>>>>>> and granular permissions centered around actions and data
>>>>> access.
>>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at
>> the
>>>>> speed of
>>>>>>>>>> thought on very large datasets when integrated with
>> Druid.io.
>>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking
>>>>> relational
>>>>>>>>>> database.
>>>>>>>>>> * Extensible: Can be extended to talk to many noSQL
>> databases
>>>>> like
>>>>>>>> Apache
>>>>>>>>>> Drill, Elastic Search, and other popular database engines.
>>>>>>>>>> * Fast loading dashboards with configurable web-scale
>> caching.
>>>>>>>>>> * Plug-in framework that enables organizations to build
>> custom
>>>>>>>> analytical
>>>>>>>>>> applications with new UI/UX interfaces.
>>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers
>>>>> SQL-speaking users
>>>>>>>>>> with more flexibility.  SQL Lab integrates with the
>>>>> visualization engine
>>>>>>>>>> seamlessly.
>>>>>>>>>>
>>>>>>>>>> == Initial Goals ==
>>>>>>>>>> The initial goals of the Superset project are several-fold:
>>>>>>>>>> * Move the existing codebase to Apache and integrate with
>> the
>>>>> Apache
>>>>>>>>>> development process.
>>>>>>>>>> * Redesign the user interface and interaction model for
>>>> creating
>>>>>>>>>> visualizations/dashboards and connecting to data sources
>>>>>>>>>> * Build robust support for security and governance of the
>> tool
>>>>>>>> including
>>>>>>>>>> popular authorization modules (including Apache Ranger and
>>>>> Apache
>>>>>>>> Sentry)
>>>>>>>>>> and a more sophisticated permissions system
>>>>>>>>>> * Grow the extensibility of the project both in terms of
>>>>> enhanced
>>>>>>>>>> connectivity to NoSQL-based data sources and creating a
>>>> plug-in
>>>>>>>> framework
>>>>>>>>>> that enables organizations to build custom analytical
>>>>> applications which
>>>>>>>>>> require a new UI/UX
>>>>>>>>>>
>>>>>>>>>> == Current Status ==
>>>>>>>>>> By many standards, Superset is already a successful open
>>>> source
>>>>> project.
>>>>>>>>>> As
>>>>>>>>>> of March 2017, Superset is officially used in production at
>>>>> about a
>>>>>>>> dozen
>>>>>>>>>> companies, has received contributions from over one hundred
>>>>> contributors
>>>>>>>>>> on
>>>>>>>>>> Github, 1500+ forks, and 12k+ stars.
>>>>>>>>>>
>>>>>>>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have
>>>> made
>>>>>>>>>> significant contributions, and expressed their commitment
>> to
>>>> the
>>>>>>>> project.
>>>>>>>>>> The product is feature complete and has been viable for
>>>> months.
>>>>> It
>>>>>>>> already
>>>>>>>>>> serves as the main interface for consuming data at many
>>>>> companies of
>>>>>>>>>> different sizes.
>>>>>>>>>>
>>>>>>>>>> While the product is usable, there’s room for improvement
>>>>> across the
>>>>>>>>>> board,
>>>>>>>>>> starting with providing a smoother user experience around
>>>>> content
>>>>>>>>>> creation,
>>>>>>>>>> making sure all features work out-of-the-box on more
>> platforms
>>>>> and
>>>>>>>>>> databases, providing better user training guides and
>> videos,
>>>>> having a
>>>>>>>>>> predictable release process, and increasing the overall
>>>> quality
>>>>> of the
>>>>>>>>>> Superset releases.
>>>>>>>>>>
>>>>>>>>>> === Meritocracy ===
>>>>>>>>>> We plan to invest in supporting a meritocracy. We will
>> discuss
>>>>> the
>>>>>>>>>> requirements in an open forum. Several companies have
>>>> expressed
>>>>> interest
>>>>>>>>>> in
>>>>>>>>>> this project, and we intend to invite additional
>> developers to
>>>>>>>>>> participate.
>>>>>>>>>> We will encourage and monitor community participation so
>> that
>>>>> privileges
>>>>>>>>>> can be extended to those that contribute.
>>>>>>>>>>
>>>>>>>>>> === Community ===
>>>>>>>>>> The need for an enterprise-ready data visualization and
>>>>> exploration
>>>>>>>>>> platform in the open source community is tremendous.  While
>>>>> Superset is
>>>>>>>>>> fairly well known, recognized and used within the Druid.io
>>>>> community,
>>>>>>>>>> adoption is currently limited outside of that niche. There
>> is
>>>> a
>>>>> huge
>>>>>>>>>> opportunity to grow the community to hundreds if not
>> thousands
>>>>> of
>>>>>>>>>> organizations, and we are hoping that embracing “the Apache
>>>>> way” will
>>>>>>>>>> accelerate the growth of our community.
>>>>>>>>>>
>>>>>>>>>> We have already been active at seeking and inviting
>>>>> contributions, and
>>>>>>>> are
>>>>>>>>>> planning to scale the project by investing time and growing
>>>> the
>>>>> support
>>>>>>>>>> structure to grow the community.
>>>>>>>>>>
>>>>>>>>>> === Core Developers ===
>>>>>>>>>> The initial committers for Superset include experienced
>> full
>>>>> stack,
>>>>>>>>>> front-end and data engineers:
>>>>>>>>>> * Maxime Beauchemin (Airbnb)
>>>>>>>>>> * Alanna Scott (Airbnb)
>>>>>>>>>> * Bogdan Kyryliuk (Airbnb)
>>>>>>>>>> * Vera Liu  (Airbnb)
>>>>>>>>>> * Jeff Feng (Airbnb)
>>>>>>>>>> * Ashutosh Chauhan (Hortonworks)
>>>>>>>>>> * Nishant Bangarwa (Hortonworks)
>>>>>>>>>> * Slim Bouguerra (Hortonworks)
>>>>>>>>>> * Priyank Shah (Hortonworks)
>>>>>>>>>> * Sriharsha Chintalapani (Hortonworks)
>>>>>>>>>> * Daniel Dai (Hortonworks)
>>>>>>>>>>
>>>>>>>>>> We realize that additional employer diversity is needed,
>> and
>>>> we
>>>>> will
>>>>>>>> work
>>>>>>>>>> aggressively to recruit developers from additional
>> companies.
>>>>>>>>>>
>>>>>>>>>> === Alignment ===
>>>>>>>>>> The initial committers strongly believe that a system for
>>>>> interactive
>>>>>>>>>> visualization of data will gain broader adoption as an open
>>>>> source,
>>>>>>>>>> community driven project, where the community can
>> contribute
>>>>> not only to
>>>>>>>>>> the core components, but also to a growing collection of
>>>>> connectors,
>>>>>>>>>> visualizations and improving integration a all potential
>> data
>>>>> sources.
>>>>>>>>>> Superset already integrates closely with Apache Hive, the
>> Hive
>>>>>>>> metastore,
>>>>>>>>>> as well as most SQL-speaking databases found in modern data
>>>>> ecosystems.
>>>>>>>>>>
>>>>>>>>>> == Known Risks ==
>>>>>>>>>>
>>>>>>>>>> === Orphaned Products ===
>>>>>>>>>> Superset is a vital component for both visualizing,
>> accessing
>>>>> and
>>>>>>>>>> democratizing data at Airbnb.  Also at Hortonworks,
>> Superset
>>>> is
>>>>> a core
>>>>>>>>>> component of the DataFlow product offering.  Thus, the
>> risk of
>>>>> the
>>>>>>>> project
>>>>>>>>>> being orphaned is relatively low.  The project could be at
>>>> risk
>>>>> if
>>>>>>>> Airbnb
>>>>>>>>>> changes their approach for democratizing data or if
>>>> Hortonworks
>>>>> changes
>>>>>>>>>> their strategy in the market.  In such an event, the
>>>> committers
>>>>> plan to
>>>>>>>>>> continue working on the project on their own time, thought
>> the
>>>>> progress
>>>>>>>>>> will likely be slower.  We plan to mitigate this risk by
>>>>> recruiting
>>>>>>>>>> additional committers.
>>>>>>>>>>
>>>>>>>>>> === Inexperience with Open Source ===
>>>>>>>>>> The initial committers include veteran Apache members
>>>>> (committers and
>>>>>>>> PPMC
>>>>>>>>>> members) and other developers who have varying degrees of
>>>>> experience
>>>>>>>> with
>>>>>>>>>> open source projects. All have been involved with source
>> code
>>>>> that has
>>>>>>>>>> been
>>>>>>>>>> released under an open source license, and several also
>> have
>>>>> experience
>>>>>>>>>> developing code with an open source development process.
>>>>>>>>>>
>>>>>>>>>> === Homogenous Developers ===
>>>>>>>>>> The initial committers are employed by Airbnb Inc. and
>>>>> Hortonworks. We
>>>>>>>> are
>>>>>>>>>> committed to recruiting additional committers from other
>>>>> companies.
>>>>>>>>>>
>>>>>>>>>> === Reliance on Salaried Developers ===
>>>>>>>>>> It is expected that Superset development will occur on both
>>>>> salaried
>>>>>>>> time
>>>>>>>>>> and on volunteer time, after hours. The majority of initial
>>>>> committers
>>>>>>>> are
>>>>>>>>>> paid by their employer to contribute to this project.
>> However,
>>>>> they are
>>>>>>>>>> all
>>>>>>>>>> passionate about the project, and we are confident that the
>>>>> project will
>>>>>>>>>> continue even if no salaried developers contribute to the
>>>>> project. We
>>>>>>>> are
>>>>>>>>>> committed to recruiting additional committers including
>>>>> non-salaried
>>>>>>>>>> developers.
>>>>>>>>>>
>>>>>>>>>> === Relationships with Other Apache Products ===
>>>>>>>>>> To the knowledge of the Initial Committers, there are no
>>>> direct
>>>>>>>>>> competitors
>>>>>>>>>> to Superset within the Apache Software Foundation.  That
>> said,
>>>>> Apache
>>>>>>>>>> Zeppelin is an indirect competitor, but it solves a
>> different
>>>>> use case.
>>>>>>>>>>
>>>>>>>>>> Apache Zeppelin is a web-based notebook that enables
>>>>> interactive data
>>>>>>>>>> analytics. It enables the creation of beautiful
>> data-driven,
>>>>> interactive
>>>>>>>>>> and collaborative documents with SQL, Scala and more.
>>>> Although
>>>>> a user
>>>>>>>> can
>>>>>>>>>> create data visualizations using this project, it
>> leverages a
>>>>> notebook
>>>>>>>>>> style user interfaces and it is geared towards the Spark
>>>>> community where
>>>>>>>>>> Scala and SQL co-exist
>>>>>>>>>>
>>>>>>>>>> We look forward to collaborating with those communities, as
>>>>> well as
>>>>>>>> other
>>>>>>>>>> Apache communities.
>>>>>>>>>>
>>>>>>>>>> === An Excessive Fascination with the Apache Brand ===
>>>>>>>>>> Superset is solving two huge challenges:
>>>>>>>>>> The challenge of enabling every knowledge worker to make
>> data
>>>>> informed
>>>>>>>>>> decisions, particularly those who are not deeply skilled at
>>>>> writing SQL.
>>>>>>>>>> The challenge of visualizing huge amounts of data
>>>> interactively
>>>>> and in
>>>>>>>>>> real-time
>>>>>>>>>>
>>>>>>>>>> Superset was first developed as a data visualization
>> solution
>>>>> for
>>>>>>>> Druid.io
>>>>>>>>>> as a way to visualize billions of rows of data.  Since
>> then,
>>>>> usage of
>>>>>>>>>> Superset has expanded to address data visualization use
>> cases
>>>>> across SQL
>>>>>>>>>> speaking data sources as well.
>>>>>>>>>>
>>>>>>>>>> Our rationale for developing Superset as an Apache project
>> is
>>>>> detailed
>>>>>>>> in
>>>>>>>>>> the Rationale Section.  We believe that the Apache brand
>> and
>>>>> community
>>>>>>>>>> process will help us attract more contributors to this
>>>> project,
>>>>> and help
>>>>>>>>>> grow the footprint of the project through usage at other
>>>>> organizations
>>>>>>>> and
>>>>>>>>>> within other applications.  Establishing consensus among
>> users
>>>>> and
>>>>>>>>>> developers will result in a more valuable tool for
>> everyone.
>>>>>>>>>>
>>>>>>>>>> == Documentation ==
>>>>>>>>>> References to further reading material:
>>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>>>>>>>>>> * [[
>>>>>>>>>> https://medium.com/airbnb-engi
>> neering/caravel-airbnb-s-data-
>>>>>>>>>> exploration-platform-15a72aa610e5#.npqmmbu25|Blog
>>>>>>>>>> Post:  Superset: Airbnb’s Data Exploration Platform]]
>>>>>>>>>> * [[
>>>>>>>>>> https://medium.com/airbnb-engi
>> neering/superset-scaling-data-
>>>>>>>>>> access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.
>>>>> a505zvb1t|Blog
>>>>>>>>>> Post:  Superset: Scaling Data Access & Visual Insights at
>>>>> Airbnb]]
>>>>>>>>>>
>>>>>>>>>> == Initial Source ==
>>>>>>>>>> The origin of the proposed code base can be found at
>>>>>>>>>> https://github.com/airbnb/superset.  The code base is
>>>>> primarily in
>>>>>>>>>> Python.
>>>>>>>>>>
>>>>>>>>>> == Source and Intellectual Property Submission Plan ==
>>>>>>>>>> We do not expect any complications for the submission of
>> the
>>>>> Superset
>>>>>>>> code
>>>>>>>>>> base.  Our code is already in Github and there is only a
>>>> single
>>>>> code
>>>>>>>> base.
>>>>>>>>>>
>>>>>>>>>> == External Dependencies ==
>>>>>>>>>> List of Python packages, from the Python Package Index
>> (Pypi):
>>>>>>>>>>
>>>>>>>>>> * boto3
>>>>>>>>>> * celery
>>>>>>>>>> * cryptography
>>>>>>>>>> * flask-appbuilder
>>>>>>>>>> * flask-cache
>>>>>>>>>> * flask-migrate
>>>>>>>>>> * flask-script
>>>>>>>>>> * flask-sqlalchemy
>>>>>>>>>> * flask-testing
>>>>>>>>>> * humanize
>>>>>>>>>> * gunicorn
>>>>>>>>>> * markdown
>>>>>>>>>> * pandas
>>>>>>>>>> * parsedatetime
>>>>>>>>>> * pydruid
>>>>>>>>>> * PyHive
>>>>>>>>>> * python-dateutil
>>>>>>>>>> * requests
>>>>>>>>>> * simplejson
>>>>>>>>>> * six
>>>>>>>>>> * sqlalchemy
>>>>>>>>>> * sqlalchemy-utils
>>>>>>>>>> * sqlparse
>>>>>>>>>> * thrift
>>>>>>>>>> * thrift-sasl
>>>>>>>>>> * werkzeug
>>>>>>>>>>
>>>>>>>>>> List of Javascript packages, from NPM:
>>>>>>>>>> * autobind-decorator
>>>>>>>>>> * bootstrap
>>>>>>>>>> * bootstrap-datepicker
>>>>>>>>>> * brace
>>>>>>>>>> * brfs
>>>>>>>>>> * cal-heatmap
>>>>>>>>>> * classnames
>>>>>>>>>> * d3
>>>>>>>>>> * d3-cloud
>>>>>>>>>> * d3-sankey
>>>>>>>>>> * d3-scale
>>>>>>>>>> * d3-tip
>>>>>>>>>> * datamaps
>>>>>>>>>> * datatables-bootstrap3-plugin
>>>>>>>>>> * datatables.net-bs
>>>>>>>>>> * font-awesome
>>>>>>>>>> * gridster
>>>>>>>>>> * immutability-helper
>>>>>>>>>> * immutable
>>>>>>>>>> * jquery
>>>>>>>>>> * lodash.throttle
>>>>>>>>>> * mapbox-gl
>>>>>>>>>> * moment
>>>>>>>>>> * moments
>>>>>>>>>> * mustache
>>>>>>>>>> * nvd3
>>>>>>>>>> * react
>>>>>>>>>> * react-ace
>>>>>>>>>> * react-bootstrap
>>>>>>>>>> * react-bootstrap-table
>>>>>>>>>> * react-dom
>>>>>>>>>> * react-draggable
>>>>>>>>>> * react-gravatar
>>>>>>>>>> * react-grid-layout
>>>>>>>>>> * react-map-gl
>>>>>>>>>> * react-redux
>>>>>>>>>> * react-resizable
>>>>>>>>>> * react-select
>>>>>>>>>> * react-syntax-highlighter
>>>>>>>>>> * reactable
>>>>>>>>>> * redux
>>>>>>>>>> * redux-localstorage
>>>>>>>>>> * redux-thunk
>>>>>>>>>> * shortid
>>>>>>>>>> * style-loader
>>>>>>>>>> * supercluster
>>>>>>>>>> * topojson
>>>>>>>>>> * victory
>>>>>>>>>> * viewport-mercator-project
>>>>>>>>>>
>>>>>>>>>> == Cryptography ==
>>>>>>>>>> The proposal does not include cryptographic code.
>>>>>>>>>>
>>>>>>>>>> == Required Resources ==
>>>>>>>>>>
>>>>>>>>>> === Mailing List ===
>>>>>>>>>> There is a current mailing list as a Google Group
>>>>> “airbnb_superset” that
>>>>>>>>>> we
>>>>>>>>>> are planning on deprecating as the Apache.org become ready
>> to
>>>>> serve our
>>>>>>>>>> community.
>>>>>>>>>>
>>>>>>>>>> * superset-private
>>>>>>>>>> * superset-dev
>>>>>>>>>> * superset-user
>>>>>>>>>>
>>>>>>>>>> === Subversion Directory ===
>>>>>>>>>> Git is the preferred source control system.
>>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
>>>>>>>>>>
>>>>>>>>>> == Git Repository ==
>>>>>>>>>> Git is the preferred source control system; we’re assuming
>>>>>>>>>> https://github.com/apache/incubator-superset based on the naming
>>>>>>>>>> scheme.
>>>>>>>>>>
>>>>>>>>>> == Issue Tracking ==
>>>>>>>>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to
>>>>>>>>>> manage our project as much as possible. It’s been said that there
>>>>>>>>>> are ways to keep Github’s issues in sync with Jira, allowing us to
>>>>>>>>>> get the best of both worlds. If that is not possible, we will
>>>>>>>>>> comply by using Jira.
>>>>>>>>>>
>>>>>>>>>> == Other Resources ==
>>>>>>>>>> We currently use a set of Github integrated services that are free
>>>>>>>>>> to the open source community, like Travis-ci, Code Climate,
>>>>>>>>>> Coveralls, Landscape.io, Requires.io, david-dm and Gitter. We would
>>>>>>>>>> like to keep using these services as they allow us to scale
>>>>>>>>>> contributions and optimize our development flows. These services
>>>>>>>>>> require some elevated rights on the Github repository in order to
>>>>>>>>>> set up or tune and we would like for the committers to have the
>>>>>>>>>> required rights.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> == Initial Committers ==
>>>>>>>>>>
>>>>>>>>>> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
>>>>>>>>>> * Alanna Scott <[hidden email]> - PPMC & Committer
>>>>>>>>>> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
>>>>>>>>>> * Vera Liu <[hidden email]> - Committer
>>>>>>>>>> * Jeff Feng <[hidden email]> - PPMC & Committer
>>>>>>>>>> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
>>>>>>>>>> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
>>>>>>>>>> * Slim Bouguerra <[hidden email]> - Committer
>>>>>>>>>> * Priyank Shah <[hidden email]> - Committer
>>>>>>>>>> * Harsha Chintalapani <[hidden email]> - Committer
>>>>>>>>>> * Daniel Dai <[hidden email]> - Champion & Committer
>>>>>>>>>> * Luke Han <[hidden email]> - Mentor
>>>>>>>>>>
>>>>>>>>>> == Affiliations ==
>>>>>>>>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>>>>>>>>>
>>>>>>>>>> == Sponsors ==
>>>>>>>>>>
>>>>>>>>>> === Champion ===
>>>>>>>>>> Daniel Dai <[hidden email]>
>>>>>>>>>>
>>>>>>>>>> === Nominated Mentors ===
>>>>>>>>>> * Ashutosh Chauhan <[hidden email]>
>>>>>>>>>> * Luke Han <[hidden email]>
>>>>>>>>>>
>>>>>>>>>> === Sponsoring Entity ===
>>>>>>>>>> Incubator PMC
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>    ---------------------------------------------------------------------
>>>>>>    To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>>>    For additional commands, e-mail: [hidden email]
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>>
>>
>>





Re: [VOTE] Superset Proposal for Apache Incubator

Ashutosh Chauhan
Re-affirming my vote as well:

+1 (binding)

On Thu, Apr 27, 2017 at 10:45 AM, Julian Hyde <[hidden email]> wrote:

> Re-affirming my vote:
>
> +1 (binding)
>
> > On Apr 26, 2017, at 11:12 PM, Jeff Feng <[hidden email]> wrote:
> >
> > Hello everyone,
> >
> > Thank you for checking out our proposal on Superset and for your
> > consideration for the Apache Incubator.  So far, I believe we have 8
> > binding votes and 2 non-binding votes.
> >
> > As Taylor mentioned earlier, we made a minor update to the wording in the
> > "Source and Intellectual Property Submission Plan" section based on a
> > suggestion by John Ament.  The update was to help confirm the previously
> > unstated assumption that we will submit an SGA.  I have copied the updated
> > proposal from the wiki to the email below and highlighted (in yellow) the
> > new sentence in the document.
> >
> > Folks on the cc line who have already voted, please let us know if the
> > change impacts your vote.
> >
> > Thank you all,
> > Jeff
> >
> >
> >
> > = Superset =
> >
> > == Abstract ==
> > Superset is an enterprise-ready web application for data exploration, data
> > visualization and dashboarding.
> >
> > == Proposal ==
> > Superset is business intelligence (BI) software that helps modern
> > organizations visualize and interact with their data. Superset enables
> > users to explore data from a variety of databases, assemble beautiful
> > dashboards and share their findings.  Superset works neatly with all modern
> > SQL-speaking databases, and integrates with Druid.io to provide real-time,
> > interactive, blazing fast data access to large datasets.
> >
> > == Background ==
> > Data is mission critical. To succeed in this era, organizations need to
> > provide low-friction, intuitive and interactive access to data. It is
> > paramount for knowledge workers to be capable of answering their own
> > questions by querying, exploring and visualizing data.
> >
> > The entire business intelligence industry has pivoted from a model of
> > centralized top-down platforms driven by IT organizations to self-service
> > analytics and agile workflows by any user.  This shift unblocks centralized
> > service bottlenecks for creating data visualizations while also creating an
> > environment that is iterative and fast-moving.  This means that business
> > intelligence software must also be easy and delightful to use.
> > Self-service analytics doesn’t mean that admin and governance features are
> > not needed.
> > Modern BI tools provide fine-grain access controls and auditing
> > capabilities to understand how data is being used.  Superset is a solution
> > that delivers on all of these vectors.
> >
> > The technology stack is also constantly morphing - vendors are struggling
> > to provide cheap, quick and easy solutions to access data.  Business
> > intelligence users are finding existing solutions lacking as these software
> > products either disregard or react slowly to recent game-changing
> > technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> > React.js and iPython’s Jupyter.
> >
> > == Rationale ==
> > Business intelligence is more relevant today than at any other point in
> > history.  Organizations are currently very limited in options for open
> > source data visualization solutions, especially solutions that are both
> > self-service and enterprise-ready.  Every company informing their decisions
> > with data needs a BI tool.
> >
> > We believe that Superset will be a strong complement to existing Apache
> > Software Foundation technologies by offering scalable user interactions to
> > distributed storage and computation solutions.  Users will often find that
> > Superset can act as a catalyst for tooling that can visualize the byproduct
> > of data and computation infrastructure.
> >
> > Superset has many key design elements that help fill a gap in current
> > solutions for organizations:
> > * Easy, low friction access to data through a simple, web-based data
> > exploration interface.  Composing charts and dashboards is intuitive.
> > Eliminating the need to write code or SQL empowers anyone to use it.
> > * Access to a wide array of rich, interactive data visualization types.
> > * Enterprise-ready: Integration with different authentication mechanisms
> > and granular permissions centered around actions and data access.
> > * Realtime & fast: Superset provides realtime analytics at the speed of
> > thought on very large datasets when integrated with Druid.io.
> > * Broad data access: Consume data out of any SQL-speaking relational
> > database.
> > * Extensible: Can be extended to talk to many noSQL databases like Apache
> > Drill, Elastic Search, and other popular database engines.
> > * Fast loading dashboards with configurable web-scale caching.
> > * Plug-in framework that enables organizations to build custom analytical
> > applications with new UI/UX interfaces.
> > * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> > with more flexibility.  SQL Lab integrates with the visualization engine
> > seamlessly.
> >
> > == Initial Goals ==
> > The initial goals of the Superset project are several-fold:
> > * Move the existing codebase to Apache and integrate with the Apache
> > development process.
> > * Redesign the user interface and interaction model for creating
> > visualizations/dashboards and connecting to data sources
> > * Build robust support for security and governance of the tool including
> > popular authorization modules (including Apache Ranger and Apache Sentry)
> > and a more sophisticated permissions system
> > * Grow the extensibility of the project both in terms of enhanced
> > connectivity to NoSQL-based data sources and creating a plug-in framework
> > that enables organizations to build custom analytical applications which
> > require a new UI/UX
> >
> > == Current Status ==
> > By many standards, Superset is already a successful open source project. As
> > of March 2017, Superset is officially used in production at about a dozen
> > companies, has received contributions from over one hundred contributors on
> > Github, and has 1500+ forks and 12k+ stars.
> >
> > Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> > significant contributions, and expressed their commitment to the project.
> > The product is feature complete and has been viable for months. It already
> > serves as the main interface for consuming data at many companies of
> > different sizes.
> >
> > While the product is usable, there’s room for improvement across the board,
> > starting with providing a smoother user experience around content creation,
> > making sure all features work out-of-the-box on more platforms and
> > databases, providing better user training guides and videos, having a
> > predictable release process, and increasing the overall quality of the
> > Superset releases.
> >
> > === Meritocracy ===
> > We plan to invest in supporting a meritocracy. We will discuss the
> > requirements in an open forum. Several companies have expressed interest in
> > this project, and we intend to invite additional developers to participate.
> > We will encourage and monitor community participation so that privileges
> > can be extended to those that contribute.
> >
> > === Community ===
> > The need for an enterprise-ready data visualization and exploration
> > platform in the open source community is tremendous.  While Superset is
> > fairly well known, recognized and used within the Druid.io community,
> > adoption is currently limited outside of that niche. There is a huge
> > opportunity to grow the community to hundreds if not thousands of
> > organizations, and we are hoping that embracing “the Apache way” will
> > accelerate the growth of our community.
> >
> > We have already been active in seeking and inviting contributions, and are
> > planning to scale the project by investing time and growing the support
> > structure to grow the community.
> >
> > === Core Developers ===
> > The initial committers for Superset include experienced full stack,
> > front-end and data engineers:
> > * Maxime Beauchemin (Airbnb)
> > * Alanna Scott (Airbnb)
> > * Bogdan Kyryliuk (Airbnb)
> > * Vera Liu  (Airbnb)
> > * Jeff Feng (Airbnb)
> > * Ashutosh Chauhan (Hortonworks)
> > * Nishant Bangarwa (Hortonworks)
> > * Slim Bouguerra (Hortonworks)
> > * Priyank Shah (Hortonworks)
> > * Sriharsha Chintalapani (Hortonworks)
> > * Daniel Dai (Hortonworks)
> >
> > We realize that additional employer diversity is needed, and we will work
> > aggressively to recruit developers from additional companies.
> >
> > === Alignment ===
> > The initial committers strongly believe that a system for interactive
> > visualization of data will gain broader adoption as an open source,
> > community driven project, where the community can contribute not only to
> > the core components, but also to a growing collection of connectors and
> > visualizations, and to improving integration with all potential data
> > sources.
> > Superset already integrates closely with Apache Hive, the Hive metastore,
> > as well as most SQL-speaking databases found in modern data ecosystems.
> >
> > == Known Risks ==
> >
> > === Orphaned Products ===
> > Superset is a vital component for visualizing, accessing and
> > democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
> > component of the DataFlow product offering.  Thus, the risk of the project
> > being orphaned is relatively low.  The project could be at risk if Airbnb
> > changes their approach for democratizing data or if Hortonworks changes
> > their strategy in the market.  In such an event, the committers plan to
> > continue working on the project on their own time, though the progress
> > will likely be slower.  We plan to mitigate this risk by recruiting
> > additional committers.
> >
> > === Inexperience with Open Source ===
> > The initial committers include veteran Apache members (committers and PPMC
> > members) and other developers who have varying degrees of experience with
> > open source projects. All have been involved with source code that has been
> > released under an open source license, and several also have experience
> > developing code with an open source development process.
> >
> > === Homogenous Developers ===
> > The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> > committed to recruiting additional committers from other companies.
> >
> > === Reliance on Salaried Developers ===
> > It is expected that Superset development will occur on both salaried time
> > and on volunteer time, after hours. The majority of initial committers are
> > paid by their employer to contribute to this project. However, they are all
> > passionate about the project, and we are confident that the project will
> > continue even if no salaried developers contribute to the project. We are
> > committed to recruiting additional committers including non-salaried
> > developers.
> >
> > === Relationships with Other Apache Products ===
> > To the knowledge of the Initial Committers, there are no direct competitors
> > to Superset within the Apache Software Foundation.  That said, Apache
> > Zeppelin is an indirect competitor, but it solves a different use case.
> >
> > Apache Zeppelin is a web-based notebook that enables interactive data
> > analytics. It enables the creation of beautiful data-driven, interactive
> > and collaborative documents with SQL, Scala and more.  Although a user can
> > create data visualizations using this project, it leverages a
> > notebook-style user interface and it is geared towards the Spark community
> > where Scala and SQL co-exist.
> >
> > We look forward to collaborating with those communities, as well as other
> > Apache communities.
> >
> > === An Excessive Fascination with the Apache Brand ===
> > Superset is solving two huge challenges:
> > * The challenge of enabling every knowledge worker to make data informed
> > decisions, particularly those who are not deeply skilled at writing SQL.
> > * The challenge of visualizing huge amounts of data interactively and in
> > real-time.
> >
> > Superset was first developed as a data visualization solution for Druid.io
> > as a way to visualize billions of rows of data.  Since then, usage of
> > Superset has expanded to address data visualization use cases across SQL
> > speaking data sources as well.
> >
> > Our rationale for developing Superset as an Apache project is detailed in
> > the Rationale Section.  We believe that the Apache brand and community
> > process will help us attract more contributors to this project, and help
> > grow the footprint of the project through usage at other organizations and
> > within other applications.  Establishing consensus among users and
> > developers will result in a more valuable tool for everyone.
> >
> > == Documentation ==
> > References to further reading material:
> > * [[http://airbnb.io/superset/|Superset Documentation]]
> > * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> > * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> >
> > == Initial Source ==
> > The origin of the proposed code base can be found at
> > https://github.com/airbnb/superset.  The code base is primarily in Python.
> >
> > == Source and Intellectual Property Submission Plan ==
> > Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> > incubator. We do not expect any complications for the submission of the
> > Superset code base.  Our code is already in Github and there is only a
> > single code base.
> >
> > == External Dependencies ==
> > List of Python packages, from the Python Package Index (Pypi):
> >
> > * boto3
> > * celery
> > * cryptography
> > * flask-appbuilder
> > * flask-cache
> > * flask-migrate
> > * flask-script
> > * flask-sqlalchemy
> > * flask-testing
> > * humanize
> > * gunicorn
> > * markdown
> > * pandas
> > * parsedatetime
> > * pydruid
> > * PyHive
> > * python-dateutil
> > * requests
> > * simplejson
> > * six
> > * sqlalchemy
> > * sqlalchemy-utils
> > * sqlparse
> > * thrift
> > * thrift-sasl
> > * werkzeug
> >
> > List of Javascript packages, from NPM:
> > * autobind-decorator
> > * bootstrap
> > * bootstrap-datepicker
> > * brace
> > * brfs
> > * cal-heatmap
> > * classnames
> > * d3
> > * d3-cloud
> > * d3-sankey
> > * d3-scale
> > * d3-tip
> > * datamaps
> > * datatables-bootstrap3-plugin
> > * datatables.net-bs
> > * font-awesome
> > * gridster
> > * immutability-helper
> > * immutable
> > * jquery
> > * lodash.throttle
> > * mapbox-gl
> > * moment
> > * moments
> > * mustache
> > * nvd3
> > * react
> > * react-ace
> > * react-bootstrap
> > * react-bootstrap-table
> > * react-dom
> > * react-draggable
> > * react-gravatar
> > * react-grid-layout
> > * react-map-gl
> > * react-redux
> > * react-resizable
> > * react-select
> > * react-syntax-highlighter
> > * reactable
> > * redux
> > * redux-localstorage
> > * redux-thunk
> > * shortid
> > * style-loader
> > * supercluster
> > * topojson
> > * victory
> > * viewport-mercator-project
> >
> > == Cryptography ==
> > The proposal does not include cryptographic code.
> >
> > == Required Resources ==
> >
> > === Mailing List ===
> > There is a current mailing list as a Google Group “airbnb_superset” that we
> > are planning on deprecating as Apache.org becomes ready to serve our
> > community.
> >
> > * superset-private
> > * superset-dev
> > * superset-user
> >
> > === Subversion Directory ===
> > Git is the preferred source control system.
> > http://svn.apache.org/repos/asf/incubator/superset
> >
> > == Git Repository ==
> > Git is the preferred source control system; we’re assuming
> > https://github.com/apache/incubator-superset based on the naming scheme.
> >
> > == Issue Tracking ==
> > JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
> > our project as much as possible. It’s been said that there are ways to keep
> > Github’s issues in sync with Jira, allowing us to get the best of both
> > worlds. If that is not possible, we will comply by using Jira.
> >
> > == Other Resources ==
> > We currently use a set of Github integrated services that are free to the
> > open source community, like Travis-ci, Code Climate, Coveralls,
> > Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> > these services as they allow us to scale contributions and optimize our
> > development flows. These services require some elevated rights on the
> > Github repository in order to set up or tune and we would like for the
> > committers to have the required rights.
> >
> >
> > == Initial Committers ==
> >
> > * Maxime Beauchemin <[hidden email]> - PPMC & Committer
> > * Alanna Scott <[hidden email]> - PPMC & Committer
> > * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> > * Vera Liu <[hidden email]> - Committer
> > * Jeff Feng <[hidden email]> - PPMC & Committer
> > * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
> > * Nishant Bangarwa <[hidden email]> - PPMC & Committer
> > * Slim Bouguerra <[hidden email]> - Committer
> > * Priyank Shah <[hidden email]> - Committer
> > * Harsha Chintalapani <[hidden email]> - Committer
> > * Daniel Dai <[hidden email]> - Champion & Committer
> > * Luke Han <[hidden email]> - Mentor
> >
> > == Affiliations ==
> > The initial committers are employees of Airbnb Inc. and Hortonworks.
> >
> > == Sponsors ==
> >
> > === Champion ===
> > Daniel Dai <[hidden email]>
> >
> > === Nominated Mentors ===
> > * Ashutosh Chauhan <[hidden email]>
> > * Luke Han <[hidden email]>
> >
> > === Sponsoring Entity ===
> > Incubator PMC
> >
> >
> >
> >
> >
> > On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <[hidden email]>
> > wrote:
> >
> >> +1 binding
> >>
> >> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
> >> <[hidden email]> wrote:
> >>> +1 (non-binding).
> >>>
> >>> Thanks
> >>> Naresh Agarwal
> >>>
> >>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:
> >>>
> >>>> +1 (binding)
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
> >>>>
> >>>>> +1 (binding)
> >>>>>
> >>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> >>>>> <[hidden email]> wrote:
> >>>>>> +1 (binding)
> >>>>>>
> >>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
> >>>>>>
> >>>>>>    +1 binding
> >>>>>>
> >>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
> >>>>>>>
> >>>>>>> +1 (non-binding)
> >>>>>>>
> >>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
> >>>>>>>
> >>>>>>>> +1 (binding)
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Ashutosh
> >>>>>>>>
> >>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
> >>>>>>>>
> >>>>>>>>> +1 binding
> >>>>>>>>>
> >>>>>>>>> Love to see Superset become a new incubator project.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best Regards!
> >>>>>>>>> ---------------------
> >>>>>>>>>
> >>>>>>>>> Luke Han
> >>>>>>>>>
> >>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <[hidden email]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Dear Apache Incubator Community,
> >>>>>>>>>>
> >>>>>>>>>> We have updated the Superset proposal
> >>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below)
> >>>>>>>>>> for Apache Incubation with an additional mentor (Luke Han -
> >>>>>>>>>> [hidden email]), and would like to start a vote thread for
> >>>>>>>>>> acceptance into the incubator.
> >>>>>>>>>>
> >>>>>>>>>> Our team is excited to share Superset with the Apache community and
> >>>>>>>>>> we hope for your continued support!
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Jeff & the Superset Team
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> = Superset =
> >>>>>>>>>>
> >>>>>>>>>> == Abstract ==
> >>>>>>>>>> Superset is an enterprise-ready web application for data
> >>>>>>>>>> exploration, data visualization and dashboarding.
> >>>>>>>>>>
> >>>>>>>>>> == Proposal ==
> >>>>>>>>>> Superset is business intelligence (BI) software that helps modern
> >>>>>>>>>> organizations visualize and interact with their data. Superset
> >>>>>>>>>> enables users to explore data from a variety of databases, assemble
> >>>>>>>>>> beautiful dashboards and share their findings.  Superset works
> >>>>>>>>>> neatly with all modern SQL-speaking databases, and integrates with
> >>>>>>>>>> Druid.io to provide real-time, interactive, blazing fast data
> >>>>>>>>>> access to large datasets.
> >>>>>>>>>>
> >>>>>>>>>> == Background ==
> >>>>>>>>>> Data is mission critical. To succeed in this era, organizations
> >>>>>>>>>> need to provide low-friction, intuitive and interactive access to
> >>>>>>>>>> data. It is paramount for knowledge workers to be capable of
> >>>>>>>>>> answering their own questions by querying, exploring and
> >>>>>>>>>> visualizing data.
> >>>>>>>>>>
> >>>>>>>>>> The entire business intelligence industry has pivoted from a model
> >>>>>>>>>> of centralized top-down platforms driven by IT organizations to
> >>>>>>>>>> self-service analytics and agile workflows by any user.  This shift
> >>>>>>>>>> unblocks centralized service bottlenecks for creating data
> >>>>>>>>>> visualizations while also creating an environment that is
> >>>>>>>>>> iterative and fast-moving.  This means that business intelligence
> >>>>>>>>>> software must also be easy and delightful to use.
> >>>>>>>>>> Self-service analytics doesn’t mean that admin and governance
> >>>>>>>>>> features are not needed.
> >>>>>>>>>> Modern BI tools provide fine-grain access controls and auditing
> >>>>>>>>>> capabilities to understand how data is being used.  Superset is a
> >>>>>>>>>> solution that delivers on all of these vectors.
> >>>>>>>>>>
> >>>>>>>>>> The technology stack is also constantly morphing - vendors are
> >>>>>>>>>> struggling to provide cheap, quick and easy solutions to access
> >>>>>>>>>> data.  Business intelligence users are finding existing solutions
> >>>>>>>>>> lacking as these software products either disregard or react slowly
> >>>>>>>>>> to recent game-changing technologies like Druid.io, PrestoDB,
> >>>>>>>>>> Apache Drill, Apache Kylin, d3.js, React.js and iPython’s Jupyter.
> >>>>>>>>>>
> >>>>>>>>>> == Rationale ==
> >>>>>>>>>> Business intelligence is more relevant today than at any other
> >>>>>>>>>> point in history.  Organizations are currently very limited in
> >>>>>>>>>> options for open source data visualization solutions, especially
> >>>>>>>>>> solutions that are both self-service and enterprise-ready.  Every
> >>>>>>>>>> company informing their decisions with data needs a BI tool.
> >>>>>>>>>>
> >>>>>>>>>> We believe that Superset will be a strong complement to existing
> >>>>>>>>>> Apache Software Foundation technologies by offering scalable user
> >>>>>>>>>> interactions to distributed storage and computation solutions.
> >>>>>>>>>> Users will often find that Superset can act as a catalyst for
> >>>>>>>>>> tooling that can visualize the byproduct of data and computation
> >>>>>>>>>> infrastructure.
> >>>>>>>>>>
> >>>>>>>>>> Superset has many key design elements that help fill a gap in
> >>>>>>>>>> current solutions for organizations:
> >>>>>>>>>> * Easy, low friction access to data through a simple, web-based
> >>>>>>>>>> data exploration interface.  Composing charts and dashboards is
> >>>>>>>>>> intuitive. Eliminating the need to write code or SQL empowers
> >>>>>>>>>> anyone to use it.
> >>>>>>>>>> * Access to a wide array of rich, interactive data visualization
> >>>>>>>>>> types.
> >>>>>>>>>> * Enterprise-ready: Integration with different authentication
> >>>>>>>>>> mechanisms and granular permissions centered around actions and
> >>>>>>>>>> data access.
> >>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at the
> >>>>>>>>>> speed of thought on very large datasets when integrated with
> >>>>>>>>>> Druid.io.
> >>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking
> >>>>>>>>>> relational database.
> >>>>>>>>>> * Extensible: Can be extended to talk to many noSQL databases like
> >>>>>>>>>> Apache Drill, Elastic Search, and other popular database engines.
> >>>>>>>>>> * Fast loading dashboards with configurable web-scale caching.
> >>>>>>>>>> * Plug-in framework that enables organizations to build custom
> >>>>>>>>>> analytical applications with new UI/UX interfaces.
> >>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking
> >>>>>>>>>> users with more flexibility.  SQL Lab integrates with the
> >>>>>>>>>> visualization engine seamlessly.
> >>>>>>>>>>
> >>>>>>>>>> == Initial Goals ==
> >>>>>>>>>> The initial goals of the Superset project are several-fold:
> >>>>>>>>>> * Move the existing codebase to Apache and integrate with the
> >>>>>>>>>> Apache development process.
> >>>>>>>>>> * Redesign the user interface and interaction model for creating
> >>>>>>>>>> visualizations/dashboards and connecting to data sources
> >>>>>>>>>> * Build robust support for security and governance of the tool
> >>>>>>>>>> including popular authorization modules (including Apache Ranger
> >>>>>>>>>> and Apache Sentry) and a more sophisticated permissions system
> >>>>>>>>>> * Grow the extensibility of the project both in terms of enhanced
> >>>>>>>>>> connectivity to NoSQL-based data sources and creating a plug-in
> >>>>>>>>>> framework that enables organizations to build custom analytical
> >>>>>>>>>> applications which require a new UI/UX
> >>>>>>>>>>
> >>>>>>>>>> == Current Status ==
> >>>>>>>>>> By many standards, Superset is already a successful open source
> >>>>>>>>>> project. As of March 2017, Superset is officially used in
> >>>>>>>>>> production at about a dozen companies, has received contributions
> >>>>>>>>>> from over one hundred contributors on Github, and has 1500+ forks
> >>>>>>>>>> and 12k+ stars.
> >>>>>>>>>>
> >>>>>>>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> >>>>>>>>>> significant contributions, and expressed their commitment to the
> >>>>>>>>>> project. The product is feature complete and has been viable for
> >>>>>>>>>> months. It already serves as the main interface for consuming data
> >>>>>>>>>> at many companies of different sizes.
> >>>>>>>>>>
> >>>>>>>>>> While the product is usable, there’s room for improvement across
> >>>>>>>>>> the board, starting with providing a smoother user experience
> >>>>>>>>>> around content creation, making sure all features work
> >>>>>>>>>> out-of-the-box on more platforms and databases, providing better
> >>>>>>>>>> user training guides and videos, having a predictable release
> >>>>>>>>>> process, and increasing the overall quality of the Superset
> >>>>>>>>>> releases.
> >>>>>>>>>>
> >>>>>>>>>> === Meritocracy ===
> >>>>>>>>>> We plan to invest in supporting a meritocracy. We will discuss the
> >>>>>>>>>> requirements in an open forum. Several companies have expressed
> >>>>>>>>>> interest in this project, and we intend to invite additional
> >>>>>>>>>> developers to participate. We will encourage and monitor community
> >>>>>>>>>> participation so that privileges can be extended to those that
> >>>>>>>>>> contribute.
> >>>>>>>>>>
> >>>>>>>>>> === Community ===
> >>>>>>>>>> The need for an enterprise-ready data visualization and
> >>>>>>>>>> exploration platform in the open source community is tremendous.
> >>>>>>>>>> While Superset is fairly well known, recognized and used within
> >>>>>>>>>> the Druid.io community, adoption is currently limited outside of
> >>>>>>>>>> that niche. There is a huge opportunity to grow the community to
> >>>>>>>>>> hundreds if not thousands of organizations, and we are hoping
> >>>>>>>>>> that embracing “the Apache way” will accelerate the growth of our
> >>>>>>>>>> community.
> >>>>>>>>>>
> >>>>>>>>>> We have already been active in seeking and inviting
> >>>>>>>>>> contributions, and are planning to scale the project by investing
> >>>>>>>>>> time and growing the support structure to grow the community.
> >>>>>>>>>>
> >>>>>>>>>> === Core Developers ===
> >>>>>>>>>> The initial committers for Superset include experienced full
> >>>>>>>>>> stack, front-end and data engineers:
> >>>>>>>>>> * Maxime Beauchemin (Airbnb)
> >>>>>>>>>> * Alanna Scott (Airbnb)
> >>>>>>>>>> * Bogdan Kyryliuk (Airbnb)
> >>>>>>>>>> * Vera Liu  (Airbnb)
> >>>>>>>>>> * Jeff Feng (Airbnb)
> >>>>>>>>>> * Ashutosh Chauhan (Hortonworks)
> >>>>>>>>>> * Nishant Bangarwa (Hortonworks)
> >>>>>>>>>> * Slim Bouguerra (Hortonworks)
> >>>>>>>>>> * Priyank Shah (Hortonworks)
> >>>>>>>>>> * Sriharsha Chintalapani (Hortonworks)
> >>>>>>>>>> * Daniel Dai (Hortonworks)
> >>>>>>>>>>
> >>>>>>>>>> We realize that additional employer diversity is needed, and we
> >>>>>>>>>> will work aggressively to recruit developers from additional
> >>>>>>>>>> companies.
> >>>>>>>>>>
> >>>>>>>>>> === Alignment ===
> >>>>>>>>>> The initial committers strongly believe that a system for
> >>>>>>>>>> interactive visualization of data will gain broader adoption as
> >>>>>>>>>> an open source, community driven project, where the community can
> >>>>>>>>>> contribute not only to the core components, but also to a growing
> >>>>>>>>>> collection of connectors and visualizations, and to improving
> >>>>>>>>>> integration with all potential data sources. Superset already
> >>>>>>>>>> integrates closely with Apache Hive, the Hive metastore, as well
> >>>>>>>>>> as most SQL-speaking databases found in modern data ecosystems.
> >>>>>>>>>>
> >>>>>>>>>> == Known Risks ==
> >>>>>>>>>>
> >>>>>>>>>> === Orphaned Products ===
> >>>>>>>>>> Superset is a vital component for visualizing, accessing and
> >>>>>>>>>> democratizing data at Airbnb.  Also at Hortonworks, Superset is a
> >>>>>>>>>> core component of the DataFlow product offering.  Thus, the risk
> >>>>>>>>>> of the project being orphaned is relatively low.  The project
> >>>>>>>>>> could be at risk if Airbnb changes their approach for
> >>>>>>>>>> democratizing data or if Hortonworks changes their strategy in
> >>>>>>>>>> the market.  In such an event, the committers plan to continue
> >>>>>>>>>> working on the project on their own time, though the progress
> >>>>>>>>>> will likely be slower.  We plan to mitigate this risk by
> >>>>>>>>>> recruiting additional committers.
> >>>>>>>>>>
> >>>>>>>>>> === Inexperience with Open Source ===
> >>>>>>>>>> The initial committers include veteran Apache members (committers
> >>>>>>>>>> and PPMC members) and other developers who have varying degrees
> >>>>>>>>>> of experience with open source projects. All have been involved
> >>>>>>>>>> with source code that has been released under an open source
> >>>>>>>>>> license, and several also have experience developing code with an
> >>>>>>>>>> open source development process.
> >>>>>>>>>>
> >>>>>>>>>> === Homogenous Developers ===
> >>>>>>>>>> The initial committers are employed by Airbnb Inc. and
> >>>>>>>>>> Hortonworks. We are committed to recruiting additional committers
> >>>>>>>>>> from other companies.
> >>>>>>>>>>
> >>>>>>>>>> === Reliance on Salaried Developers ===
> >>>>>>>>>> It is expected that Superset development will occur on both
> >>>>>>>>>> salaried time and on volunteer time, after hours. The majority of
> >>>>>>>>>> initial committers are paid by their employer to contribute to
> >>>>>>>>>> this project. However, they are all passionate about the project,
> >>>>>>>>>> and we are confident that the project will continue even if no
> >>>>>>>>>> salaried developers contribute to the project. We are committed
> >>>>>>>>>> to recruiting additional committers including non-salaried
> >>>>>>>>>> developers.
> >>>>>>>>>>
> >>>>>>>>>> === Relationships with Other Apache Products ===
> >>>>>>>>>> To the knowledge of the Initial Committers, there are no direct
> >>>>>>>>>> competitors to Superset within the Apache Software Foundation.
> >>>>>>>>>> That said, Apache Zeppelin is an indirect competitor, but it
> >>>>>>>>>> solves a different use case.
> >>>>>>>>>>
> >>>>>>>>>> Apache Zeppelin is a web-based notebook that enables interactive
> >>>>>>>>>> data analytics. It enables the creation of beautiful data-driven,
> >>>>>>>>>> interactive and collaborative documents with SQL, Scala and more.
> >>>>>>>>>> Although a user can create data visualizations using this
> >>>>>>>>>> project, it leverages a notebook-style user interface and it is
> >>>>>>>>>> geared towards the Spark community where Scala and SQL co-exist.
> >>>>>>>>>>
> >>>>>>>>>> We look forward to collaborating with those communities, as well
> >>>>>>>>>> as other Apache communities.
> >>>>>>>>>>
> >>>>>>>>>> === An Excessive Fascination with the Apache Brand ===
> >>>>>>>>>> Superset is solving two huge challenges:
> >>>>>>>>>> * The challenge of enabling every knowledge worker to make
> >>>>>>>>>> data-informed decisions, particularly those who are not deeply
> >>>>>>>>>> skilled at writing SQL.
> >>>>>>>>>> * The challenge of visualizing huge amounts of data interactively
> >>>>>>>>>> and in real-time.
> >>>>>>>>>>
> >>>>>>>>>> Superset was first developed as a data visualization solution for
> >>>>>>>>>> Druid.io as a way to visualize billions of rows of data.  Since
> >>>>>>>>>> then, usage of Superset has expanded to address data
> >>>>>>>>>> visualization use cases across SQL-speaking data sources as well.
> >>>>>>>>>>
> >>>>>>>>>> Our rationale for developing Superset as an Apache project is
> >>>>>>>>>> detailed in the Rationale Section.  We believe that the Apache
> >>>>>>>>>> brand and community process will help us attract more
> >>>>>>>>>> contributors to this project, and help grow the footprint of the
> >>>>>>>>>> project through usage at other organizations and within other
> >>>>>>>>>> applications.  Establishing consensus among users and developers
> >>>>>>>>>> will result in a more valuable tool for everyone.
> >>>>>>>>>>
> >>>>>>>>>> == Documentation ==
> >>>>>>>>>> References to further reading material:
> >>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> >>>>>>>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
> >>>>>>>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
> >>>>>>>>>>
> >>>>>>>>>> == Initial Source ==
> >>>>>>>>>> The origin of the proposed code base can be found at
> >>>>>>>>>> https://github.com/airbnb/superset.  The code base is primarily
> >>>>>>>>>> in Python.
> >>>>>>>>>>
> >>>>>>>>>> == Source and Intellectual Property Submission Plan ==
> >>>>>>>>>> We do not expect any complications for the submission of the
> >>>>>>>>>> Superset code base.  Our code is already in Github and there is
> >>>>>>>>>> only a single code base.
> >>>>>>>>>>
> >>>>>>>>>> == External Dependencies ==
> >>>>>>>>>> List of Python packages, from the Python Package Index (PyPI):
> >>>>>>>>>>
> >>>>>>>>>> * boto3
> >>>>>>>>>> * celery
> >>>>>>>>>> * cryptography
> >>>>>>>>>> * flask-appbuilder
> >>>>>>>>>> * flask-cache
> >>>>>>>>>> * flask-migrate
> >>>>>>>>>> * flask-script
> >>>>>>>>>> * flask-sqlalchemy
> >>>>>>>>>> * flask-testing
> >>>>>>>>>> * humanize
> >>>>>>>>>> * gunicorn
> >>>>>>>>>> * markdown
> >>>>>>>>>> * pandas
> >>>>>>>>>> * parsedatetime
> >>>>>>>>>> * pydruid
> >>>>>>>>>> * PyHive
> >>>>>>>>>> * python-dateutil
> >>>>>>>>>> * requests
> >>>>>>>>>> * simplejson
> >>>>>>>>>> * six
> >>>>>>>>>> * sqlalchemy
> >>>>>>>>>> * sqlalchemy-utils
> >>>>>>>>>> * sqlparse
> >>>>>>>>>> * thrift
> >>>>>>>>>> * thrift-sasl
> >>>>>>>>>> * werkzeug
> >>>>>>>>>>
> >>>>>>>>>> List of Javascript packages, from NPM:
> >>>>>>>>>> * autobind-decorator
> >>>>>>>>>> * bootstrap
> >>>>>>>>>> * bootstrap-datepicker
> >>>>>>>>>> * brace
> >>>>>>>>>> * brfs
> >>>>>>>>>> * cal-heatmap
> >>>>>>>>>> * classnames
> >>>>>>>>>> * d3
> >>>>>>>>>> * d3-cloud
> >>>>>>>>>> * d3-sankey
> >>>>>>>>>> * d3-scale
> >>>>>>>>>> * d3-tip
> >>>>>>>>>> * datamaps
> >>>>>>>>>> * datatables-bootstrap3-plugin
> >>>>>>>>>> * datatables.net-bs
> >>>>>>>>>> * font-awesome
> >>>>>>>>>> * gridster
> >>>>>>>>>> * immutability-helper
> >>>>>>>>>> * immutable
> >>>>>>>>>> * jquery
> >>>>>>>>>> * lodash.throttle
> >>>>>>>>>> * mapbox-gl
> >>>>>>>>>> * moment
> >>>>>>>>>> * moments
> >>>>>>>>>> * mustache
> >>>>>>>>>> * nvd3
> >>>>>>>>>> * react
> >>>>>>>>>> * react-ace
> >>>>>>>>>> * react-bootstrap
> >>>>>>>>>> * react-bootstrap-table
> >>>>>>>>>> * react-dom
> >>>>>>>>>> * react-draggable
> >>>>>>>>>> * react-gravatar
> >>>>>>>>>> * react-grid-layout
> >>>>>>>>>> * react-map-gl
> >>>>>>>>>> * react-redux
> >>>>>>>>>> * react-resizable
> >>>>>>>>>> * react-select
> >>>>>>>>>> * react-syntax-highlighter
> >>>>>>>>>> * reactable
> >>>>>>>>>> * redux
> >>>>>>>>>> * redux-localstorage
> >>>>>>>>>> * redux-thunk
> >>>>>>>>>> * shortid
> >>>>>>>>>> * style-loader
> >>>>>>>>>> * supercluster
> >>>>>>>>>> * topojson
> >>>>>>>>>> * victory
> >>>>>>>>>> * viewport-mercator-project
> >>>>>>>>>>
> >>>>>>>>>> == Cryptography ==
> >>>>>>>>>> The proposal does not include cryptographic code.
> >>>>>>>>>>
> >>>>>>>>>> == Required Resources ==
> >>>>>>>>>>
> >>>>>>>>>> === Mailing List ===
> >>>>>>>>>> There is a current mailing list as a Google Group
> >>>>>>>>>> “airbnb_superset” that we are planning on deprecating as the
> >>>>>>>>>> Apache.org lists become ready to serve our community.
> >>>>>>>>>>
> >>>>>>>>>> * superset-private
> >>>>>>>>>> * superset-dev
> >>>>>>>>>> * superset-user
> >>>>>>>>>>
> >>>>>>>>>> === Subversion Directory ===
> >>>>>>>>>> Git is the preferred source control system.
> >>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
> >>>>>>>>>>
> >>>>>>>>>> == Git Repository ==
> >>>>>>>>>> Git is the preferred source control system; we’re assuming
> >>>>>>>>>> https://github.com/apache/incubator-superset based on the naming
> >>>>>>>>>> scheme.
> >>>>>>>>>>
> >>>>>>>>>> == Issue Tracking ==
> >>>>>>>>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to
> >>>>>>>>>> manage our project as much as possible. It’s been said that there
> >>>>>>>>>> are ways to keep Github’s issues in sync with Jira, allowing us
> >>>>>>>>>> to get the best of both worlds. If that is not possible, we will
> >>>>>>>>>> comply by using Jira.
> >>>>>>>>>>
> >>>>>>>>>> == Other Resources ==
> >>>>>>>>>> We currently use a set of Github integrated services that are
> >>>>>>>>>> free to the open source community, like Travis-ci, Code Climate,
> >>>>>>>>>> Coveralls, Landscape.io, Requires.io, david-dm and Gitter. We
> >>>>>>>>>> would like to keep using these services as they allow us to scale
> >>>>>>>>>> contributions and optimize our development flows. These services
> >>>>>>>>>> require some elevated rights on the Github repository in order to
> >>>>>>>>>> set up or tune and we would like for the committers to have the
> >>>>>>>>>> required rights.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> == Initial Committers ==
> >>>>>>>>>>
> >>>>>>>>>> * Maxime Beauchemin <[hidden email]> - PPMC & Committer
> >>>>>>>>>> * Alanna Scott <[hidden email]> - PPMC & Committer
> >>>>>>>>>> * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
> >>>>>>>>>> * Vera Liu <[hidden email]> - Committer
> >>>>>>>>>> * Jeff Feng <[hidden email]> - PPMC & Committer
> >>>>>>>>>> * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
> >>>>>>>>>> * Nishant Bangarwa <[hidden email]> - PPMC & Committer
> >>>>>>>>>> * Slim Bouguerra <[hidden email]> - Committer
> >>>>>>>>>> * Priyank Shah <[hidden email]> - Committer
> >>>>>>>>>> * Harsha Chintalapani <[hidden email]> - Committer
> >>>>>>>>>> * Daniel Dai <[hidden email]> - Champion & Committer
> >>>>>>>>>> * Luke Han <[hidden email]> - Mentor
> >>>>>>>>>>
> >>>>>>>>>> == Affiliations ==
> >>>>>>>>>> The initial committers are employees of Airbnb Inc. and
> >>>>>>>>>> Hortonworks.
> >>>>>>>>>>
> >>>>>>>>>> == Sponsors ==
> >>>>>>>>>>
> >>>>>>>>>> === Champion ===
> >>>>>>>>>> Daniel Dai <[hidden email]>
> >>>>>>>>>>
> >>>>>>>>>> === Nominated Mentors ===
> >>>>>>>>>> * Ashutosh Chauhan <[hidden email]>
> >>>>>>>>>> * Luke Han <[hidden email]>
> >>>>>>>>>>
> >>>>>>>>>> === Sponsoring Entity ===
> >>>>>>>>>> Incubator PMC
> >>>>>>>>>>
> >> --
> >> Best Regards, Edward J. Yoon

Re: [VOTE] Superset Proposal for Apache Incubator

Jitendra Pandey
Re-affirming. +1 (binding)

On 4/27/17, 1:24 PM, "Ashutosh Chauhan" <[hidden email]> wrote:

    Re-affirming my vote as well:
   
    +1 (binding)
   
    On Thu, Apr 27, 2017 at 10:45 AM, Julian Hyde <[hidden email]> wrote:
   
    > Re-affirming my vote:
    >
    > +1 (binding)
    >
    > > On Apr 26, 2017, at 11:12 PM, Jeff Feng <[hidden email]> wrote:
    > >
    > > Hello everyone,
    > >
    > > Thank you for checking out our proposal on Superset and for your
    > > consideration for the Apache Incubator.  So far, I believe we have 8
    > > binding votes and 2 non-binding votes.
    > >
    > > As Taylor mentioned earlier, we made a minor update to the wording in the
    > > "Source and Intellectual Property Submission Plan" section based on a
    > > suggestion by John Ament.  The update was to help confirm the previously
    > > unstated assumption that we will submit an SGA.  I have copied the
    > > updated proposal from the wiki to the email below and highlighted (in
    > > yellow) the new sentence in the document.
    > >
    > > Folks on the cc line who have already voted, please let us know if the
    > > change impacts your vote.
    > >
    > > Thank you all,
    > > Jeff
    > >
    > >
    > >
    > > = Superset =
    > >
    > > == Abstract ==
    > > Superset is an enterprise-ready web application for data exploration,
    > > data visualization and dashboarding.
    > >
    > > == Proposal ==
    > > Superset is business intelligence (BI) software that helps modern
    > > organizations visualize and interact with their data. Superset enables
    > > users to explore data from a variety of databases, assemble beautiful
    > > dashboards and share their findings.  Superset works neatly with all
    > > modern SQL-speaking databases, and integrates with Druid.io to provide
    > > real-time, interactive, blazing fast data access to large datasets.
    > >
    > > == Background ==
    > > Data is mission critical. To succeed in this era, organizations need to
    > > provide low-friction, intuitive and interactive access to data. It is
    > > paramount for knowledge workers to be capable of answering their own
    > > questions by querying, exploring and visualizing data.
    > >
    > > The entire business intelligence industry has pivoted from a model of
    > > centralized top-down platforms driven by IT organizations to self-service
    > > analytics and agile workflows by any user.  This shift unblocks
    > > centralized service bottlenecks for creating data visualizations while
    > > also creating an environment that is iterative and fast-moving.  This
    > > means that business intelligence software must also be easy and
    > > delightful to use.  Self-service analytics doesn’t mean that admin and
    > > governance features are not needed.
    > > Modern BI tools provide fine-grained access controls and auditing
    > > capabilities to understand how data is being used.  Superset is a
    > > solution that delivers on all of these vectors.
    > >
    > > The technology stack is also constantly morphing - vendors are struggling
    > > to provide cheap, quick and easy solutions to access data.  Business
    > > intelligence users are finding existing solutions lacking as these
    > > software products either disregard or react slowly to recent
    > > game-changing technologies like Druid.io, PrestoDB, Apache Drill, Apache
    > > Kylin, d3.js, React.js and iPython’s Jupyter for instance.
    > >
    > > == Rationale ==
    > > Business intelligence is more relevant today than at any other point in
    > > history.  Organizations are currently very limited in options for open
    > > source data visualization solutions, especially solutions that are both
    > > self-service and enterprise-ready.  Every company informing their
    > > decisions with data needs a BI tool.
    > >
    > > We believe that Superset will be a strong complement to existing Apache
    > > Software Foundation technologies by offering scalable user interactions
    > > to distributed storage and computation solutions.  Users will often find
    > > that Superset can act as a catalyst for tooling that can visualize the
    > > byproduct of data and computation infrastructure.
    > >
    > > Superset has many key design elements that help fill a gap in current
    > > solutions for organizations:
    > > * Easy, low friction access to data through a simple, web-based data
    > > exploration interface.  Composing charts and dashboards is intuitive.
    > > Eliminating the need to write code or SQL empowers anyone to use it.
    > > * Access to a wide array of rich, interactive data visualization types.
    > > * Enterprise-ready: Integration with different authentication mechanisms
    > > and granular permissions centered around actions and data access.
    > > * Realtime & fast: Superset provides realtime analytics at the speed of
    > > thought on very large datasets when integrated with Druid.io.
    > > * Broad data access: Consume data out of any SQL-speaking relational
    > > database.
    > > * Extensible: Can be extended to talk to many NoSQL databases like Apache
    > > Drill, Elasticsearch, and other popular database engines.
    > > * Fast loading dashboards with configurable web-scale caching.
    > > * Plug-in framework that enables organizations to build custom analytical
    > > applications with new UI/UX interfaces.
    > > * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
    > > with more flexibility.  SQL Lab integrates with the visualization engine
    > > seamlessly.
    > >
    > > == Initial Goals ==
    > > The initial goals of the Superset project are several-fold:
    > > * Move the existing codebase to Apache and integrate with the Apache
    > > development process.
    > > * Redesign the user interface and interaction model for creating
    > > visualizations/dashboards and connecting to data sources
    > > * Build robust support for security and governance of the tool including
    > > popular authorization modules (including Apache Ranger and Apache Sentry)
    > > and a more sophisticated permissions system
    > > * Grow the extensibility of the project both in terms of enhanced
    > > connectivity to NoSQL-based data sources and creating a plug-in framework
    > > that enables organizations to build custom analytical applications which
    > > require a new UI/UX
    > >
    > > == Current Status ==
    > > By many standards, Superset is already a successful open source project.
    > > As of March 2017, Superset is officially used in production at about a
    > > dozen companies, has received contributions from over one hundred
    > > contributors on Github, and has 1500+ forks and 12k+ stars.
    > >
    > > Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
    > > significant contributions, and expressed their commitment to the project.
    > > The product is feature complete and has been viable for months. It
    > > already serves as the main interface for consuming data at many companies
    > > of different sizes.
    > >
    > > While the product is usable, there’s room for improvement across the
    > > board, starting with providing a smoother user experience around content
    > > creation, making sure all features work out-of-the-box on more platforms
    > > and databases, providing better user training guides and videos, having a
    > > predictable release process, and increasing the overall quality of the
    > > Superset releases.
    > >
    > > === Meritocracy ===
    > > We plan to invest in supporting a meritocracy. We will discuss the
    > > requirements in an open forum. Several companies have expressed interest
    > > in this project, and we intend to invite additional developers to
    > > participate. We will encourage and monitor community participation so
    > > that privileges can be extended to those that contribute.
    > >
    > > === Community ===
    > > The need for an enterprise-ready data visualization and exploration
    > > platform in the open source community is tremendous.  While Superset is
    > > fairly well known, recognized and used within the Druid.io community,
    > > adoption is currently limited outside of that niche. There is a huge
    > > opportunity to grow the community to hundreds if not thousands of
    > > organizations, and we are hoping that embracing “the Apache way” will
    > > accelerate the growth of our community.
    > >
    > > We have already been active in seeking and inviting contributions, and
    > > are planning to scale the project by investing time and growing the
    > > support structure to grow the community.
    > >
    > > === Core Developers ===
    > > The initial committers for Superset include experienced full stack,
    > > front-end and data engineers:
    > > * Maxime Beauchemin (Airbnb)
    > > * Alanna Scott (Airbnb)
    > > * Bogdan Kyryliuk (Airbnb)
    > > * Vera Liu  (Airbnb)
    > > * Jeff Feng (Airbnb)
    > > * Ashutosh Chauhan (Hortonworks)
    > > * Nishant Bangarwa (Hortonworks)
    > > * Slim Bouguerra (Hortonworks)
    > > * Priyank Shah (Hortonworks)
    > > * Sriharsha Chintalapani (Hortonworks)
    > > * Daniel Dai (Hortonworks)
    > >
    > > We realize that additional employer diversity is needed, and we will work
    > > aggressively to recruit developers from additional companies.
    > >
    > > === Alignment ===
    > > The initial committers strongly believe that a system for interactive
    > > visualization of data will gain broader adoption as an open source,
    > > community driven project, where the community can contribute not only to
    > > the core components, but also to a growing collection of connectors and
    > > visualizations, and to improving integration with all potential data
    > > sources.
    > > Superset already integrates closely with Apache Hive, the Hive metastore,
    > > as well as most SQL-speaking databases found in modern data ecosystems.
    > >
    > > == Known Risks ==
    > >
    > > === Orphaned Products ===
    > > Superset is a vital component for visualizing, accessing and
    > > democratizing data at Airbnb.  Also at Hortonworks, Superset is a core
    > > component of the DataFlow product offering.  Thus, the risk of the
    > > project being orphaned is relatively low.  The project could be at risk
    > > if Airbnb changes their approach for democratizing data or if Hortonworks
    > > changes their strategy in the market.  In such an event, the committers
    > > plan to continue working on the project on their own time, though the
    > > progress will likely be slower.  We plan to mitigate this risk by
    > > recruiting additional committers.
    > >
    > > === Inexperience with Open Source ===
    > > The initial committers include veteran Apache members (committers and
    > > PPMC members) and other developers who have varying degrees of experience
    > > with open source projects. All have been involved with source code that
    > > has been released under an open source license, and several also have
    > > experience developing code with an open source development process.
    > >
    > > === Homogenous Developers ===
    > > The initial committers are employed by Airbnb Inc. and Hortonworks. We
    > > are committed to recruiting additional committers from other companies.
    > >
    > > === Reliance on Salaried Developers ===
    > > It is expected that Superset development will occur on both salaried time
    > > and on volunteer time, after hours. The majority of initial committers
    > > are paid by their employer to contribute to this project. However, they
    > > are all passionate about the project, and we are confident that the
    > > project will continue even if no salaried developers contribute to the
    > > project. We are committed to recruiting additional committers including
    > > non-salaried developers.
    > >
    > > === Relationships with Other Apache Products ===
    > > To the knowledge of the Initial Committers, there are no direct
    > > competitors to Superset within the Apache Software Foundation.  That
    > > said, Apache Zeppelin is an indirect competitor, but it solves a
    > > different use case.
    > >
    > > Apache Zeppelin is a web-based notebook that enables interactive data
    > > analytics. It enables the creation of beautiful data-driven, interactive
    > > and collaborative documents with SQL, Scala and more.  Although a user
    > > can create data visualizations using this project, it leverages a
    > > notebook-style user interface and it is geared towards the Spark
    > > community where Scala and SQL co-exist.
    > >
    > > We look forward to collaborating with those communities, as well as other
    > > Apache communities.
    > >
    > > === An Excessive Fascination with the Apache Brand ===
    > > Superset is solving two huge challenges:
    > > * The challenge of enabling every knowledge worker to make data-informed
    > > decisions, particularly those who are not deeply skilled at writing SQL.
    > > * The challenge of visualizing huge amounts of data interactively and in
    > > real-time.
    > >
    > > Superset was first developed as a data visualization solution for
    > > Druid.io as a way to visualize billions of rows of data.  Since then,
    > > usage of Superset has expanded to address data visualization use cases
    > > across SQL-speaking data sources as well.
    > >
    > > Our rationale for developing Superset as an Apache project is detailed in
    > > the Rationale Section.  We believe that the Apache brand and community
    > > process will help us attract more contributors to this project, and help
    > > grow the footprint of the project through usage at other organizations
    > > and within other applications.  Establishing consensus among users and
    > > developers will result in a more valuable tool for everyone.
    > >
    > > == Documentation ==
    > > References to further reading material:
    > > * [[http://airbnb.io/superset/|Superset Documentation]]
    > > * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
    > > * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
    > >
    > > == Initial Source ==
    > > The origin of the proposed code base can be found at
    > > https://github.com/airbnb/superset.  The code base is primarily in
    > > Python.
    > >
    > > == Source and Intellectual Property Submission Plan ==
    > > Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
    > > incubator. We do not expect any complications for the submission of the
    > > Superset code base.  Our code is already in Github and there is only a
    > > single code base.
    > >
    > > == External Dependencies ==
    > > List of Python packages, from the Python Package Index (Pypi):
    > >
    > > * boto3
    > > * celery
    > > * cryptography
    > > * flask-appbuilder
    > > * flask-cache
    > > * flask-migrate
    > > * flask-script
    > > * flask-sqlalchemy
    > > * flask-testing
    > > * humanize
    > > * gunicorn
    > > * markdown
    > > * pandas
    > > * parsedatetime
    > > * pydruid
    > > * PyHive
    > > * python-dateutil
    > > * requests
    > > * simplejson
    > > * six
    > > * sqlalchemy
    > > * sqlalchemy-utils
    > > * sqlparse
    > > * thrift
    > > * thrift-sasl
    > > * werkzeug
    > >
    > > List of Javascript packages, from NPM:
    > > * autobind-decorator
    > > * bootstrap
    > > * bootstrap-datepicker
    > > * brace
    > > * brfs
    > > * cal-heatmap
    > > * classnames
    > > * d3
    > > * d3-cloud
    > > * d3-sankey
    > > * d3-scale
    > > * d3-tip
    > > * datamaps
    > > * datatables-bootstrap3-plugin
    > > * datatables.net-bs
    > > * font-awesome
    > > * gridster
    > > * immutability-helper
    > > * immutable
    > > * jquery
    > > * lodash.throttle
    > > * mapbox-gl
    > > * moment
    > > * moments
    > > * mustache
    > > * nvd3
    > > * react
    > > * react-ace
    > > * react-bootstrap
    > > * react-bootstrap-table
    > > * react-dom
    > > * react-draggable
    > > * react-gravatar
    > > * react-grid-layout
    > > * react-map-gl
    > > * react-redux
    > > * react-resizable
    > > * react-select
    > > * react-syntax-highlighter
    > > * reactable
    > > * redux
    > > * redux-localstorage
    > > * redux-thunk
    > > * shortid
    > > * style-loader
    > > * supercluster
    > > * topojson
    > > * victory
    > > * viewport-mercator-project
    > >
    > > == Cryptography ==
    > > The proposal does not include cryptographic code.
    > >
    > > == Required Resources ==
    > >
    > > === Mailing List ===
    > > There is a current mailing list, the Google Group “airbnb_superset”, that we
    > > plan to deprecate once the Apache.org lists are ready to serve our
    > > community.
    > >
    > > * superset-private
    > > * superset-dev
    > > * superset-user
    > >
    > > === Subversion Directory ===
    > > Git is the preferred source control system.
    > > http://svn.apache.org/repos/asf/incubator/superset
    > >
    > > == Git Repository ==
    > > Git is the preferred source control system; we’re assuming
    > > https://github.com/apache/incubator-superset based on the naming scheme.
    > >
    > > == Issue Tracking ==
    > > JIRA Superset (SUPERSET). We’d like to use GitHub issues and PRs to
    > > manage our project as much as possible. It’s been said that there are
    > > ways to keep GitHub issues in sync with Jira, which would give us the
    > > best of both worlds. If that is not possible, we will comply and use Jira.
    > >
    > > == Other Resources ==
    > > We currently use a set of GitHub-integrated services that are free to the
    > > open source community, like Travis CI, Code Climate, Coveralls,
    > > Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
    > > these services as they allow us to scale contributions and optimize our
    > > development flows. These services require some elevated rights on the
    > > GitHub repository to set up or tune, and we would like the
    > > committers to have the required rights.
    > >
    > >
    > > == Initial Committers ==
    > >
    > > * Maxime Beauchemin <[hidden email]> - PPMC & Committer
    > > * Alanna Scott <[hidden email]> - PPMC & Committer
    > > * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
    > > * Vera Liu <[hidden email]> - Committer
    > > * Jeff Feng <[hidden email]> - PPMC & Committer
    > > * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
    > > * Nishant Bangarwa <[hidden email]> - PPMC & Committer
    > > * Slim Bouguerra <[hidden email]> - Committer
    > > * Priyank Shah <[hidden email]> - Committer
    > > * Harsha Chintalapani <[hidden email]> - Committer
    > > * Daniel Dai <[hidden email]> - Champion & Committer
    > > * Luke Han <[hidden email]> - Mentor
    > >
    > > == Affiliations ==
    > > The initial committers are employees of Airbnb Inc. and Hortonworks.
    > >
    > > == Sponsors ==
    > >
    > > === Champion ===
    > > Daniel Dai <[hidden email]>
    > >
    > > === Nominated Mentors ===
    > > * Ashutosh Chauhan <[hidden email]>
    > > * Luke Han <[hidden email]>
    > >
    > > === Sponsoring Entity ===
    > > Incubator PMC
    > >
    > >
    > >
    > >
    > >
    > > On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <[hidden email]>
    > > wrote:
    > >
    > >> +1 binding
    > >>
    > >> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
    > >> <[hidden email]> wrote:
    > >>> +1 (non-binding).
    > >>>
    > >>> Thanks
    > >>> Naresh Agarwal
    > >>>
    > >>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <[hidden email]> wrote:
    > >>>
    > >>>> +1 (binding)
    > >>>>
    > >>>>
    > >>>>
    > >>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]> wrote:
    > >>>>
    > >>>>> +1 (binding)
    > >>>>>
    > >>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
    > >>>>> <[hidden email]> wrote:
    > >>>>>> +1 (binding)
    > >>>>>>
    > >>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
    > >>>>>>
    > >>>>>>    +1 binding
    > >>>>>>
    > >>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]> wrote:
    > >>>>>>>
    > >>>>>>> +1 (non-binding)
    > >>>>>>>
    > >>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <[hidden email]> wrote:
    > >>>>>>>
    > >>>>>>>> +1 (binding)
    > >>>>>>>>
    > >>>>>>>> Thanks,
    > >>>>>>>> Ashutosh
    > >>>>>>>>
    > >>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <[hidden email]> wrote:
    > >>>>>>>>
    > >>>>>>>>> +1 binding
    > >>>>>>>>>
    > >>>>>>>>> Love to see Superset become a new incubator project.
    > >>>>>>>>>
    > >>>>>>>>>
    > >>>>>>>>> Best Regards!
    > >>>>>>>>> ---------------------
    > >>>>>>>>>
    > >>>>>>>>> Luke Han
    > >>>>>>>>>
    > >> --
    > >> Best Regards, Edward J. Yoon

Re: [VOTE] Superset Proposal for Apache Incubator

Jacques Nadeau
+1 binding

On Apr 27, 2017 1:35 PM, "Jitendra Pandey" <[hidden email]> wrote:

> Re-affirming. +1 (binding)
>
> On 4/27/17, 1:24 PM, "Ashutosh Chauhan" <[hidden email]>
> wrote:
>
>     Re-affirming my vote as well:
>
>     +1 (binding)
>
>     On Thu, Apr 27, 2017 at 10:45 AM, Julian Hyde <[hidden email]>
> wrote:
>
>     > Re-affirming my vote:
>     >
>     > +1 (binding)
>     >
>     > > On Apr 26, 2017, at 11:12 PM, Jeff Feng <[hidden email]>
> wrote:
>     > >
>     > > Hello everyone,
>     > >
>     > > Thank you for checking out our proposal on Superset and for your
>     > > consideration for the Apache Incubator.  So far, I believe we have
> 8
>     > > binding votes and 2 non-binding votes.
>     > >
>     > > As Taylor mentioned earlier, we made a minor update to the wording
> in the
>     > > "Source and Intellectual Property Submission Plan" section based
> on a
>     > > suggestion by John Ament.  The update was to help confirm the
> previously
>     > > unstated assumption that we will submit an SGA.  I have copied the
>     > updated
>     > > proposal from the wiki to the email below and highlighted (in
> yellow) the
>     > > new sentence below in the document.
>     > >
>     > > Folks on the cc line who have already voted, please let us know if
> the
>     > > change impacts your vote.
>     > >
>     > > Thank you all,
>     > > Jeff
>     > >
>     > >
>     > >
>     > > = Superset =
>     > >
>     > > == Abstract ==
>     > > Superset is an enterprise-ready web application for data
> exploration,
>     > data
>     > > visualization and dashboarding.
>     > >
>     > > == Proposal ==
>     > > Superset is business intelligence (BI) software that helps modern
>     > > organizations visualize and interact with their data. Superset
> enables
>     > > users to explore data from a variety of databases, assemble beautiful
>     > > dashboards and share their findings.  Superset works neatly with
> all
>     > modern
>     > > SQL-speaking databases, and integrates with Druid.io to provide
>     > real-time,
>     > > interactive, blazing fast data access to large datasets.
>     > >
>     > > == Background ==
>     > > Data is mission critical. To succeed in this era, organizations
> need to
>     > > provide low-friction, intuitive and interactive access to data. It
> is
>     > > paramount for knowledge workers to be capable of answering their
> own
>     > > questions by querying, exploring and visualizing data.
>     > >
>     > > The entire business intelligence industry has pivoted from a model
> of
>     > > centralized top-down platforms driven by IT organizations to
> self-service
>     > > analytics and agile workflows by any user.  This shift unblocks
>     > centralized
>     > > service bottlenecks for creating data visualizations while also
> creating
>     > an
>     > > environment that is iterative and fast-moving.  This means that
> business
>     > > intelligence software must also be easy and delightful to use.
>     > > Self-service analytics doesn’t mean that admin and governance
> features
>     > are
>     > > not needed.
>     > > Modern BI tools provide fine-grain access controls and auditing
>     > > capabilities to understand how data is being used.  Superset is a
>     > solution
>     > > that delivers on all of these vectors.
>     > >
>     > > The technology stack is also constantly morphing - vendors are
> struggling
>     > > to provide cheap, quick and easy solutions to access data.
> Business
>     > > intelligence users are finding existing solutions lacking as these
>     > software
>     > > products either disregard or react slowly to recent game-changing
>     > > technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin,
> d3.js,
>     > > React.js and iPython’s Jupyter for instance.
>     > >
>     > > == Rationale ==
>     > > Business intelligence is more relevant today than at any other
> point in
>     > > history.  Organizations are currently very limited in options for
> open
>     > > source data visualization solutions, especially solutions that are
> both
>     > > self-service and enterprise-ready.  Every company informing their
>     > decisions
>     > > with data needs a BI tool.
>     > >
>     > > We believe that Superset will be a strong complement to existing
> Apache
>     > > Software Foundation technologies by offering scalable user
> interactions
>     > to
>     > > distributed storage and computation solutions.  Users will often
> find
>     > that
>     > > Superset can act as a catalyst for tooling that can visualize the
>     > byproduct
>     > > of data and computation infrastructure.
>     > >
>     > > Superset has many key design elements that help fill a gap in
> current
>     > > solutions for organizations:
>     > > * Easy, low friction access to data through a simple, web-based
> data
>     > > exploration interface.  Composing charts and dashboards is
> intuitive.
>     > > Eliminating the need to write code or SQL empowers anyone to use
> it.
>     > > * Access to a wide array of rich, interactive data visualization
> types.
>     > > * Enterprise-ready: Integration with different authentication
> mechanisms
>     > > and granular permissions centered around actions and data access.
>     > > * Realtime & fast: Superset provides realtime analytics at the
> speed of
>     > > thought on very large datasets when integrated with Druid.io.
>     > > * Broad data access: Consume data out of any SQL-speaking
> relational
>     > > database.
>     > > * Extensible: Can be extended to talk to many noSQL databases like
> Apache
>     > > Drill, Elastic Search, and other popular database engines.
>     > > * Fast loading dashboards with configurable web-scale caching.
>     > > * Plug-in framework that enables organizations to build custom
> analytical
>     > > applications with new UI/UX interfaces.
>     > > * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking
> users
>     > > with more flexibility.  SQL Lab integrates with the visualization
> engine
>     > > seamlessly.
>     > >
>     > > == Initial Goals ==
>     > > The initial goals of the Superset project are several-fold:
>     > > * Move the existing codebase to Apache and integrate with the
> Apache
>     > > development process.
>     > > * Redesign the user interface and interaction model for creating
>     > > visualizations/dashboards and connecting to data sources
>     > > * Build robust support for security and governance of the tool
> including
>     > > popular authorization modules (including Apache Ranger and Apache
> Sentry)
>     > > and a more sophisticated permissions system
>     > > * Grow the extensibility of the project both in terms of enhanced
>     > > connectivity to NoSQL-based data sources and creating a plug-in
> framework
>     > > that enables organizations to build custom analytical applications
> which
>     > > require a new UI/UX
>     > >
>     > > == Current Status ==
>     > > By many standards, Superset is already a successful open source
> project.
>     > As
>     > > of March 2017, Superset is officially used in production at about
> a dozen
>     > > companies, has received contributions from over one hundred
> contributors
>     > on
>     > > Github, 1500+ forks, and 12k+ stars.
>     > >
>     > > Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>     > > significant contributions, and expressed their commitment to the
> project.
>     > > The product is feature complete and has been viable for months. It
>     > already
>     > > serves as the main interface for consuming data at many companies
> of
>     > > different sizes.
>     > >
>     > > While the product is usable, there’s room for improvement across
> the
>     > board,
>     > > starting with providing a smoother user experience around content
>     > creation,
>     > > making sure all features work out-of-the-box on more platforms and
>     > > databases, providing better user training guides and videos,
> having a
>     > > predictable release process, and increasing the overall quality of
> the
>     > > Superset releases.
>     > >
>     > > === Meritocracy ===
>     > > We plan to invest in supporting a meritocracy. We will discuss the
>     > > requirements in an open forum. Several companies have expressed
> interest
>     > in
>     > > this project, and we intend to invite additional developers to
>     > participate.
>     > > We will encourage and monitor community participation so that
> privileges
>     > > can be extended to those that contribute.
>     > >
>     > > === Community ===
>     > > The need for an enterprise-ready data visualization and exploration
>     > > platform in the open source community is tremendous.  While
> Superset is
>     > > fairly well known, recognized and used within the Druid.io
> community,
>     > > adoption is currently limited outside of that niche. There is a
> huge
>     > > opportunity to grow the community to hundreds if not thousands of
>     > > organizations, and we are hoping that embracing “the Apache way”
> will
>     > > accelerate the growth of our community.
>     > >
>     > > We have already been active at seeking and inviting contributions,
> and
>     > are
>     > > planning to scale the project by investing time and growing the
> support
>     > > structure to grow the community.
>     > >
>     > > === Core Developers ===
>     > > The initial committers for Superset include experienced full stack,
>     > > front-end and data engineers:
>     > > * Maxime Beauchemin (Airbnb)
>     > > * Alanna Scott (Airbnb)
>     > > * Bogdan Kyryliuk (Airbnb)
>     > > * Vera Liu  (Airbnb)
>     > > * Jeff Feng (Airbnb)
>     > > * Ashutosh Chauhan (Hortonworks)
>     > > * Nishant Bangarwa (Hortonworks)
>     > > * Slim Bouguerra (Hortonworks)
>     > > * Priyank Shah (Hortonworks)
>     > > * Sriharsha Chintalapani (Hortonworks)
>     > > * Daniel Dai (Hortonworks)
>     > >
>     > > We realize that additional employer diversity is needed, and we
> will work
>     > > aggressively to recruit developers from additional companies.
>     > >
>     > > === Alignment ===
>     > > The initial committers strongly believe that a system for
> interactive
>     > > visualization of data will gain broader adoption as an open source,
>     > > community driven project, where the community can contribute not
> only to
>     > > the core components, but also to a growing collection of
> connectors,
>     > > visualizations and improving integration with all potential data
> sources.
>     > > Superset already integrates closely with Apache Hive, the Hive
> metastore,
>     > > as well as most SQL-speaking databases found in modern data
> ecosystems.
>     > >
>     > > == Known Risks ==
>     > >
>     > > === Orphaned Products ===
>     > > Superset is a vital component for both visualizing, accessing and
>     > > democratizing data at Airbnb.  Also at Hortonworks, Superset is a
> core
>     > > component of the DataFlow product offering.  Thus, the risk of the
>     > project
>     > > being orphaned is relatively low.  The project could be at risk if
> Airbnb
>     > > changes their approach for democratizing data or if Hortonworks
> changes
>     > > their strategy in the market.  In such an event, the committers
> plan to
>     > > continue working on the project on their own time, though the
> progress
>     > > will likely be slower.  We plan to mitigate this risk by recruiting
>     > > additional committers.
>     > >
>     > > === Inexperience with Open Source ===
>     > > The initial committers include veteran Apache members (committers
> and
>     > PPMC
>     > > members) and other developers who have varying degrees of
> experience with
>     > > open source projects. All have been involved with source code that
> has
>     > been
>     > > released under an open source license, and several also have
> experience
>     > > developing code with an open source development process.
>     > >
>     > > === Homogenous Developers ===
>     > > The initial committers are employed by Airbnb Inc. and
> Hortonworks. We
>     > are
>     > > committed to recruiting additional committers from other companies.
>     > >
>     > > === Reliance on Salaried Developers ===
>     > > It is expected that Superset development will occur on both
> salaried time
>     > > and on volunteer time, after hours. The majority of initial
> committers
>     > are
>     > > paid by their employer to contribute to this project. However,
> they are
>     > all
>     > > passionate about the project, and we are confident that the
> project will
>     > > continue even if no salaried developers contribute to the project.
> We are
>     > > committed to recruiting additional committers including
> non-salaried
>     > > developers.
>     > >
>     > > === Relationships with Other Apache Products ===
>     > > To the knowledge of the Initial Committers, there are no direct
>     > competitors
>     > > to Superset within the Apache Software Foundation.  That said,
> Apache
>     > > Zeppelin is an indirect competitor, but it solves a different use
> case.
>     > >
>     > > Apache Zeppelin is a web-based notebook that enables interactive
> data
>     > > analytics. It enables the creation of beautiful data-driven,
> interactive
>     > > and collaborative documents with SQL, Scala and more.  Although a
> user
>     > can
>     > > create data visualizations using this project, it leverages a
> notebook
>     > > style user interface and it is geared towards the Spark community
> where
>     > > Scala and SQL co-exist.
>     > >
>     > > We look forward to collaborating with those communities, as well
> as other
>     > > Apache communities.
>     > >
>     > > === An Excessive Fascination with the Apache Brand ===
>     > > Superset is solving two huge challenges:
>     > > * The challenge of enabling every knowledge worker to make data
> informed
>     > > decisions, particularly those who are not deeply skilled at
> writing SQL.
>     > > * The challenge of visualizing huge amounts of data interactively
> and in
>     > > real-time
>     > >
>     > > Superset was first developed as a data visualization solution for
>     > Druid.io
>     > > as a way to visualize billions of rows of data.  Since then, usage
> of
>     > > Superset has expanded to address data visualization use cases
> across SQL
>     > > speaking data sources as well.
>     > >
>     > > Our rationale for developing Superset as an Apache project is
> detailed in
>     > > the Rationale Section.  We believe that the Apache brand and
> community
>     > > process will help us attract more contributors to this project,
> and help
>     > > grow the footprint of the project through usage at other
> organizations
>     > and
>     > > within other applications.  Establishing consensus among users and
>     > > developers will result in a more valuable tool for everyone.
>     > >
>     > > == Documentation ==
>     > > References to further reading material:
>     > > * [[http://airbnb.io/superset/|Superset Documentation]]
>     > > * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>     > > * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>     > >
>     > > == Initial Source ==
>     > > The origin of the proposed code base can be found at
>     > > https://github.com/airbnb/superset.  The code base is primarily in
>     > Python.
>     > >
>     > > == Source and Intellectual Property Submission Plan ==
>     > > Airbnb will submit a Software Grant Agreement (SGA) as Superset
> joins the
>     > > incubator. We do not expect any complications for the submission
> of the
>     > > Superset code base.  Our code is already in Github and there is
> only a
>     > > single code base.
>     > >
>     > > == External Dependencies ==
>     > > List of Python packages, from the Python Package Index (Pypi):
>     > >
>     > > * boto3
>     > > * celery
>     > > * cryptography
>     > > * flask-appbuilder
>     > > * flask-cache
>     > > * flask-migrate
>     > > * flask-script
>     > > * flask-sqlalchemy
>     > > * flask-testing
>     > > * humanize
>     > > * gunicorn
>     > > * markdown
>     > > * pandas
>     > > * parsedatetime
>     > > * pydruid
>     > > * PyHive
>     > > * python-dateutil
>     > > * requests
>     > > * simplejson
>     > > * six
>     > > * sqlalchemy
>     > > * sqlalchemy-utils
>     > > * sqlparse
>     > > * thrift
>     > > * thrift-sasl
>     > > * werkzeug
>     > >
>     > > List of Javascript packages, from NPM:
>     > > * autobind-decorator
>     > > * bootstrap
>     > > * bootstrap-datepicker
>     > > * brace
>     > > * brfs
>     > > * cal-heatmap
>     > > * classnames
>     > > * d3
>     > > * d3-cloud
>     > > * d3-sankey
>     > > * d3-scale
>     > > * d3-tip
>     > > * datamaps
>     > > * datatables-bootstrap3-plugin
>     > > * datatables.net-bs
>     > > * font-awesome
>     > > * gridster
>     > > * immutability-helper
>     > > * immutable
>     > > * jquery
>     > > * lodash.throttle
>     > > * mapbox-gl
>     > > * moment
>     > > * moments
>     > > * mustache
>     > > * nvd3
>     > > * react
>     > > * react-ace
>     > > * react-bootstrap
>     > > * react-bootstrap-table
>     > > * react-dom
>     > > * react-draggable
>     > > * react-gravatar
>     > > * react-grid-layout
>     > > * react-map-gl
>     > > * react-redux
>     > > * react-resizable
>     > > * react-select
>     > > * react-syntax-highlighter
>     > > * reactable
>     > > * redux
>     > > * redux-localstorage
>     > > * redux-thunk
>     > > * shortid
>     > > * style-loader
>     > > * supercluster
>     > > * topojson
>     > > * victory
>     > > * viewport-mercator-project
>     > >
>     > > == Cryptography ==
>     > > The proposal does not include cryptographic code.
>     > >
>     > > == Required Resources ==
>     > >
>     > > === Mailing List ===
>     > > There is a current mailing list as a Google Group
> “airbnb_superset” that
>     > we
>     > > are planning on deprecating as the Apache.org lists become ready to
> serve our
>     > > community.
>     > >
>     > > * superset-private
>     > > * superset-dev
>     > > * superset-user
>     > >
>     > > === Subversion Directory ===
>     > > Git is the preferred source control system.
>     > > http://svn.apache.org/repos/asf/incubator/superset
>     > >
>     > > == Git Repository ==
>     > > Git is the preferred source control system, we’re assuming
>     > > https://github.com/apache/incubator-superset based on the naming
> scheme
>     > >
>     > > == Issue Tracking ==
>     > > JIRA Superset (SUPERSET). If possible, we’d like to use Github
> issues &
>     > PRs
>     > > to manage our project as much as possible. It’s been said that
> there are
>     > > ways to keep Github’s issues in sync with Jira, allowing us to get
> best
>     > of
>     > > both worlds. If that is not possible, we will comply and use Jira.
>     > >
>     > > == Other Resources ==
>     > > We currently use a set of Github integrated services that are free
> to the
>     > > open source community, like Travis-ci, Code Climate, Coveralls,
>     > > Landscape.io, Requires.io, david-dm and Gitter. We would like to
> keep
>     > using
>     > > these services as they allow us to scale contributions and
> optimize our
>     > > development flows. These services require some elevated rights on
> the
>     > > Github repository in order to set up or tune and we would like for
> the
>     > > committers to have the required rights.
>     > >
>     > >
>     > > == Initial Committers ==
>     > >
>     > > * Maxime Beauchemin <[hidden email]> - PPMC &
> Committer
>     > > * Alanna Scott <[hidden email]> - PPMC & Committer
>     > > * Bogdan Kyryliuk <[hidden email]> - PPMC & Committer
>     > > * Vera Liu <[hidden email]> - Committer
>     > > * Jeff Feng <[hidden email]> - PPMC & Committer
>     > > * Ashutosh Chauhan <[hidden email]> - Mentor & Committer
>     > > * Nishant Bangarwa <[hidden email]> - PPMC & Committer
>     > > * Slim Bouguerra <[hidden email]> - Committer
>     > > * Priyank Shah <[hidden email]> - Committer
>     > > * Harsha Chintalapani <[hidden email]> - Committer
>     > > * Daniel Dai <[hidden email]> - Champion & Committer
>     > > * Luke Han <[hidden email]> - Mentor
>     > >
>     > > == Affiliations ==
>     > > The initial committers are employees of Airbnb Inc. and
> Hortonworks.
>     > >
>     > > == Sponsors ==
>     > >
>     > > === Champion ===
>     > > Daniel Dai <[hidden email]>
>     > >
>     > > === Nominated Mentors ===
>     > > * Ashutosh Chauhan <[hidden email]>
>     > > * Luke Han <[hidden email]>
>     > >
>     > > === Sponsoring Entity ===
>     > > Incubator PMC
>     > >
>     > >
>     > >
>     > >
>     > >
>     > > On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <
> [hidden email]>
>     > > wrote:
>     > >
>     > >> +1 binding
>     > >>
>     > >> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
>     > >> <[hidden email]> wrote:
>     > >>> +1 (non-binding).
>     > >>>
>     > >>> Thanks
>     > >>> Naresh Agarwal
>     > >>>
>     > >>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <
> [hidden email]>
>     > >> wrote:
>     > >>>
>     > >>>> +1 (binding)
>     > >>>>
>     > >>>>
>     > >>>>
>     > >>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <[hidden email]>
> wrote:
>     > >>>>
>     > >>>>> +1 (binding)
>     > >>>>>
>     > >>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
>     > >>>>> <[hidden email]> wrote:
>     > >>>>>> +1 (binding)
>     > >>>>>>
>     > >>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <[hidden email]> wrote:
>     > >>>>>>
>     > >>>>>>    +1 binding
>     > >>>>>>
>     > >>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <[hidden email]>
>     > >>>>> wrote:
>     > >>>>>>>
>     > >>>>>>> +1 (non-binding)
>     > >>>>>>>
>     > >>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <
>     > >>>>> [hidden email]>
>     > >>>>>>> wrote:
>     > >>>>>>>
>     > >>>>>>>> +1 (binding)
>     > >>>>>>>>
>     > >>>>>>>> Thanks,
>     > >>>>>>>> Ashutosh
>     > >>>>>>>>
>     > >>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <
> [hidden email]
>     > >>>
>     > >>>>> wrote:
>     > >>>>>>>>
>     > >>>>>>>>> +1 binding
>     > >>>>>>>>>
>     > >>>>>>>>> Love to see Superset become a new incubator project.
>     > >>>>>>>>>
>     > >>>>>>>>>
>     > >>>>>>>>> Best Regards!
>     > >>>>>>>>> ---------------------
>     > >>>>>>>>>
>     > >>>>>>>>> Luke Han
>     > >>>>>>>>>
>     > >>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <
>     > >>>> [hidden email]>
>     > >>>>> wrote:
>     > >>>>>>>>>
>     > >>>>>>>>>> Dear Apache Incubator Community,
>     > >>>>>>>>>>
>     > >>>>>>>>>> We have updated the Superset proposal
>     > >>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal>
>     > >> (copied
>     > >>>>> below) for
>     > >>>>>>>>>>
>     > >>>>>>>>>> Apache Incubation with an additional mentor (Luke Han -
>     > >>>>>>>>>> [hidden email]),
>     > >>>>>>>>>> and would like to start a vote thread for acceptance into
>     > >> the
>     > >>>>> incubator.
>     > >>>>>>>>>>
>     > >>>>>>>>>> Our team is excited to share Superset with the Apache
>     > >>>> community
>     > >>>>> and we
>     > >>>>>>>>>> hope
>     > >>>>>>>>>> for your continued support!
>     > >>>>>>>>>>
>     > >>>>>>>>>> Cheers,
>     > >>>>>>>>>> Jeff & the Superset Team
>     > >>>>>>>>>>
>     > >>>>>>>>>>
>     > >>>>>>>>>>
>     > >>>>>>>>>>
>     > >>>>>>>>>> = Superset =
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Abstract ==
>     > >>>>>>>>>> Superset is an enterprise-ready web application for data
>     > >>>>> exploration,
>     > >>>>>>>> data
>     > >>>>>>>>>> visualization and dashboarding.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Proposal ==
>     > >>>>>>>>>> Superset is business intelligence (BI) software that helps
>     > >>>>> modern
>     > >>>>>>>>>> organizations visualize and interact with their data.
>     > >> Superset
>     > >>>>> enables
>     > >>>>>>>>>> users to explore data from a variety of databases, assemble
>     > >>>>> beautiful
>     > >>>>>>>>>> dashboards and share their findings.  Superset works
> neatly
>     > >>>>> with all
>     > >>>>>>>>>> modern
>     > >>>>>>>>>> SQL-speaking databases, and integrates with Druid.io to
>     > >>>> provide
>     > >>>>>>>> real-time,
>     > >>>>>>>>>> interactive, blazing fast data access to large datasets.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Background ==
>     > >>>>>>>>>> Data is mission critical. To succeed in this era,
>     > >>>> organizations
>     > >>>>> need to
>     > >>>>>>>>>> provide low-friction, intuitive and interactive access to
>     > >>>> data.
>     > >>>>> It is
>     > >>>>>>>>>> paramount for knowledge workers to be capable of answering
>     > >>>>> their own
>     > >>>>>>>>>> questions by querying, exploring and visualizing data.
>     > >>>>>>>>>>
>     > >>>>>>>>>> The entire business intelligence industry has pivoted from
>     > >> a
>     > >>>>> model of
>     > >>>>>>>>>> centralized top-down platforms driven by IT organizations
>     > >> to
>     > >>>>>>>> self-service
>     > >>>>>>>>>> analytics and agile workflows by any user.  This shift
>     > >>>> unblocks
>     > >>>>>>>>>> centralized
>     > >>>>>>>>>> service bottlenecks for creating data visualizations while
>     > >>>> also
>     > >>>>> creating
>     > >>>>>>>>>> an
>     > >>>>>>>>>> environment that is iterative and fast-moving.  This means
>     > >>>> that
>     > >>>>> business
>     > >>>>>>>>>> intelligence software must also be easy and delightful to
>     > >> use.
>     > >>>>>>>>>> Self-service analytics doesn’t mean that admin and
>     > >> governance
>     > >>>>> features
>     > >>>>>>>> are
>     > >>>>>>>>>> not needed.
>     > >>>>>>>>>> Modern BI tools provide fine-grain access controls and
>     > >>>> auditing
>     > >>>>>>>>>> capabilities to understand how data is being used.
>     > >> Superset
>     > >>>> is
>     > >>>>> a
>     > >>>>>>>> solution
>     > >>>>>>>>>> that delivers on all of these vectors.
>     > >>>>>>>>>>
>     > >>>>>>>>>> The technology stack is also constantly morphing - vendors
>     > >> are
>     > >>>>>>>> struggling
>     > >>>>>>>>>> to provide cheap, quick and easy solutions to access data.
>     > >>>>> Business
>     > >>>>>>>>>> intelligence users are finding existing solutions lacking
>     > >> as
>     > >>>>> these
>     > >>>>>>>>>> software
>     > >>>>>>>>>> products either disregard or react slowly to recent
>     > >>>>> game-changing
>     > >>>>>>>>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache
>     > >>>>> Kylin, d3.js,
>     > >>>>>>>>>> React.js and iPython’s Jupyter for instance.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Rationale ==
>     > >>>>>>>>>> Business intelligence is more relevant today than at any
>     > >> other
>     > >>>>> point in
>     > >>>>>>>>>> history.  Organizations are currently very limited in
>     > >> options
>     > >>>>> for open
>     > >>>>>>>>>> source data visualization solutions, especially solutions
>     > >> that
>     > >>>>> are both
>     > >>>>>>>>>> self-service and enterprise-ready.  Every company
> informing
>     > >>>>> their
>     > >>>>>>>>>> decisions
>     > >>>>>>>>>> with data needs a BI tool.
>     > >>>>>>>>>>
>     > >>>>>>>>>> We believe that Superset will be a strong complement to
>     > >>>>> existing Apache
>     > >>>>>>>>>> Software Foundation technologies by offering scalable user
>     > >>>>> interactions
>     > >>>>>>>> to
>     > >>>>>>>>>> distributed storage and computation solutions.  Users will
>     > >>>>> often find
>     > >>>>>>>> that
>     > >>>>>>>>>> Superset can act as a catalyst for tooling that can
>     > >> visualize
>     > >>>>> the
>     > >>>>>>>>>> byproduct
>     > >>>>>>>>>> of data and computation infrastructure.
>     > >>>>>>>>>>
>     > >>>>>>>>>> Superset has many key design elements that help fill a gap
>     > >> in
>     > >>>>> current
>     > >>>>>>>>>> solutions for organizations:
>     > >>>>>>>>>> * Easy, low friction access to data through a simple,
>     > >>>> web-based
>     > >>>>> data
>     > >>>>>>>>>> exploration interface.  Composing charts and dashboards
> is
>     > >>>>> intuitive.
>     > >>>>>>>>>> Eliminating the need to write code or SQL empowers anyone
>     > >> to
>     > >>>>> use it.
>     > >>>>>>>>>> * Access to a wide array of rich, interactive data
>     > >>>>> visualization types.
>     > >>>>>>>>>> * Enterprise-ready: Integration with different
>     > >> authentication
>     > >>>>>>>> mechanisms
>     > >>>>>>>>>> and granular permissions centered around actions and data
>     > >>>>> access.
>     > >>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at
>     > >> the
>     > >>>>> speed of
>     > >>>>>>>>>> thought on very large datasets when integrated with
>     > >> Druid.io.
>     > >>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking
>     > >>>>> relational
>     > >>>>>>>>>> database.
>     > >>>>>>>>>> * Extensible: Can be extended to talk to many noSQL
>     > >> databases
>     > >>>>> like
>     > >>>>>>>> Apache
>     > >>>>>>>>>> Drill, Elastic Search, and other popular database engines.
>     > >>>>>>>>>> * Fast loading dashboards with configurable web-scale
>     > >> caching.
>     > >>>>>>>>>> * Plug-in framework that enables organizations to build
>     > >> custom
>     > >>>>>>>> analytical
>     > >>>>>>>>>> applications with new UI/UX interfaces.
>     > >>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers
>     > >>>>> SQL-speaking users
>     > >>>>>>>>>> with more flexibility.  SQL Lab integrates with the
>     > >>>>> visualization engine
>     > >>>>>>>>>> seamlessly.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Initial Goals ==
>     > >>>>>>>>>> The initial goals of the Superset project are
> several-fold:
>     > >>>>>>>>>> * Move the existing codebase to Apache and integrate with
>     > >> the
>     > >>>>> Apache
>     > >>>>>>>>>> development process.
>     > >>>>>>>>>> * Redesign the user interface and interaction model for
>     > >>>> creating
>     > >>>>>>>>>> visualizations/dashboards and connecting to data sources
>     > >>>>>>>>>> * Build robust support for security and governance of the
>     > >> tool
>     > >>>>>>>> including
>     > >>>>>>>>>> popular authorization modules (including Apache Ranger and
>     > >>>>> Apache
>     > >>>>>>>> Sentry)
>     > >>>>>>>>>> and a more sophisticated permissions system
>     > >>>>>>>>>> * Grow the extensibility of the project both in terms of
>     > >>>>> enhanced
>     > >>>>>>>>>> connectivity to NoSQL-based data sources and creating a
>     > >>>> plug-in
>     > >>>>>>>> framework
>     > >>>>>>>>>> that enables organizations to build custom analytical
>     > >>>>> applications which
>     > >>>>>>>>>> require a new UI/UX
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Current Status ==
>     > >>>>>>>>>> By many standards, Superset is already a successful open
>     > >>>> source
>     > >>>>> project.
>     > >>>>>>>>>> As
>     > >>>>>>>>>> of March 2017, Superset is officially used in production
> at
>     > >>>>> about a
>     > >>>>>>>> dozen
>     > >>>>>>>>>> companies, has received contributions from over one
> hundred
>     > >>>>> contributors
>     > >>>>>>>>>> on
>     > >>>>>>>>>> Github, 1500+ forks, and 12k+ stars.
>     > >>>>>>>>>>
>     > >>>>>>>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks
> have
>     > >>>> made
>     > >>>>>>>>>> significant contributions, and expressed their commitment
>     > >> to
>     > >>>> the
>     > >>>>>>>> project.
>     > >>>>>>>>>> The product is feature complete and has been viable for
>     > >>>> months.
>     > >>>>> It
>     > >>>>>>>> already
>     > >>>>>>>>>> serves as the main interface for consuming data at many
>     > >>>>> companies of
>     > >>>>>>>>>> different sizes.
>     > >>>>>>>>>>
>     > >>>>>>>>>> While the product is usable, there’s room for improvement
>     > >>>>> across the
>     > >>>>>>>>>> board,
>     > >>>>>>>>>> starting with providing a smoother user experience around
>     > >>>>> content
>     > >>>>>>>>>> creation,
>     > >>>>>>>>>> making sure all features work out-of-the-box on more
>     > >> platforms
>     > >>>>> and
>     > >>>>>>>>>> databases, providing better user training guides and
>     > >> videos,
>     > >>>>> having a
>     > >>>>>>>>>> predictable release process, and increasing the overall
>     > >>>> quality
>     > >>>>> of the
>     > >>>>>>>>>> Superset releases.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Meritocracy ===
>     > >>>>>>>>>> We plan to invest in supporting a meritocracy. We will
>     > >> discuss
>     > >>>>> the
>     > >>>>>>>>>> requirements in an open forum. Several companies have
>     > >>>> expressed
>     > >>>>> interest
>     > >>>>>>>>>> in
>     > >>>>>>>>>> this project, and we intend to invite additional
>     > >> developers to
>     > >>>>>>>>>> participate.
>     > >>>>>>>>>> We will encourage and monitor community participation so
>     > >> that
>     > >>>>> privileges
>     > >>>>>>>>>> can be extended to those that contribute.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Community ===
>     > >>>>>>>>>> The need for an enterprise-ready data visualization and
>     > >>>>> exploration
>     > >>>>>>>>>> platform in the open source community is tremendous.
> While
>     > >>>>> Superset is
>     > >>>>>>>>>> fairly well known, recognized and used within the Druid.io
>     > >>>>> community,
>     > >>>>>>>>>> adoption is currently limited outside of that niche. There
>     > >> is
>     > >>>> a
>     > >>>>> huge
>     > >>>>>>>>>> opportunity to grow the community to hundreds if not
>     > >> thousands
>     > >>>>> of
>     > >>>>>>>>>> organizations, and we are hoping that embracing “the
> Apache
>     > >>>>> way” will
>     > >>>>>>>>>> accelerate the growth of our community.
>     > >>>>>>>>>>
>     > >>>>>>>>>> We have already been active at seeking and inviting
>     > >>>>> contributions, and
>     > >>>>>>>> are
>     > >>>>>>>>>> planning to scale the project by investing time and
> growing
>     > >>>> the
>     > >>>>> support
>     > >>>>>>>>>> structure to grow the community.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Core Developers ===
>     > >>>>>>>>>> The initial committers for Superset include experienced
>     > >> full
>     > >>>>> stack,
>     > >>>>>>>>>> front-end and data engineers:
>     > >>>>>>>>>> * Maxime Beauchemin (Airbnb)
>     > >>>>>>>>>> * Alanna Scott (Airbnb)
>     > >>>>>>>>>> * Bogdan Kyryliuk (Airbnb)
>     > >>>>>>>>>> * Vera Liu  (Airbnb)
>     > >>>>>>>>>> * Jeff Feng (Airbnb)
>     > >>>>>>>>>> * Ashutosh Chauhan (Hortonworks)
>     > >>>>>>>>>> * Nishant Bangarwa (Hortonworks)
>     > >>>>>>>>>> * Slim Bouguerra (Hortonworks)
>     > >>>>>>>>>> * Priyank Shah (Hortonworks)
>     > >>>>>>>>>> * Sriharsha Chintalapani (Hortonworks)
>     > >>>>>>>>>> * Daniel Dai (Hortonworks)
>     > >>>>>>>>>>
>     > >>>>>>>>>> We realize that additional employer diversity is needed,
>     > >> and
>     > >>>> we
>     > >>>>> will
>     > >>>>>>>> work
>     > >>>>>>>>>> aggressively to recruit developers from additional
>     > >> companies.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Alignment ===
>     > >>>>>>>>>> The initial committers strongly believe that a system for
>     > >>>>> interactive
>     > >>>>>>>>>> visualization of data will gain broader adoption as an
> open
>     > >>>>> source,
>     > >>>>>>>>>> community driven project, where the community can
>     > >> contribute
>     > >>>>> not only to
>     > >>>>>>>>>> the core components, but also to a growing collection of
>     > >>>>> connectors,
>     > >>>>>>>>>> visualizations and improving integration with all potential
>     > >> data
>     > >>>>> sources.
>     > >>>>>>>>>> Superset already integrates closely with Apache Hive, the
>     > >> Hive
>     > >>>>>>>> metastore,
>     > >>>>>>>>>> as well as most SQL-speaking databases found in modern
> data
>     > >>>>> ecosystems.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Known Risks ==
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Orphaned Products ===
>     > >>>>>>>>>> Superset is a vital component for both visualizing,
>     > >> accessing
>     > >>>>> and
>     > >>>>>>>>>> democratizing data at Airbnb.  Also at Hortonworks,
>     > >> Superset
>     > >>>> is
>     > >>>>> a core
>     > >>>>>>>>>> component of the DataFlow product offering.  Thus, the
>     > >> risk of
>     > >>>>> the
>     > >>>>>>>> project
>     > >>>>>>>>>> being orphaned is relatively low.  The project could be at
>     > >>>> risk
>     > >>>>> if
>     > >>>>>>>> Airbnb
>     > >>>>>>>>>> changes their approach for democratizing data or if
>     > >>>> Hortonworks
>     > >>>>> changes
>     > >>>>>>>>>> their strategy in the market.  In such an event, the
>     > >>>> committers
>     > >>>>> plan to
>     > >>>>>>>>>> continue working on the project on their own time, though
>     > >> the
>     > >>>>> progress
>     > >>>>>>>>>> will likely be slower.  We plan to mitigate this risk by
>     > >>>>> recruiting
>     > >>>>>>>>>> additional committers.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Inexperience with Open Source ===
>     > >>>>>>>>>> The initial committers include veteran Apache members
>     > >>>>> (committers and
>     > >>>>>>>> PPMC
>     > >>>>>>>>>> members) and other developers who have varying degrees of
>     > >>>>> experience
>     > >>>>>>>> with
>     > >>>>>>>>>> open source projects. All have been involved with source
>     > >> code
>     > >>>>> that has
>     > >>>>>>>>>> been
>     > >>>>>>>>>> released under an open source license, and several also
>     > >> have
>     > >>>>> experience
>     > >>>>>>>>>> developing code with an open source development process.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Homogenous Developers ===
>     > >>>>>>>>>> The initial committers are employed by Airbnb Inc. and
>     > >>>>> Hortonworks. We
>     > >>>>>>>> are
>     > >>>>>>>>>> committed to recruiting additional committers from other
>     > >>>>> companies.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Reliance on Salaried Developers ===
>     > >>>>>>>>>> It is expected that Superset development will occur on
> both
>     > >>>>> salaried
>     > >>>>>>>> time
>     > >>>>>>>>>> and on volunteer time, after hours. The majority of
> initial
>     > >>>>> committers
>     > >>>>>>>> are
>     > >>>>>>>>>> paid by their employer to contribute to this project.
>     > >> However,
>     > >>>>> they are
>     > >>>>>>>>>> all
>     > >>>>>>>>>> passionate about the project, and we are confident that
> the
>     > >>>>> project will
>     > >>>>>>>>>> continue even if no salaried developers contribute to the
>     > >>>>> project. We
>     > >>>>>>>> are
>     > >>>>>>>>>> committed to recruiting additional committers including
>     > >>>>> non-salaried
>     > >>>>>>>>>> developers.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Relationships with Other Apache Products ===
>     > >>>>>>>>>> To the knowledge of the Initial Committers, there are no
>     > >>>> direct
>     > >>>>>>>>>> competitors
>     > >>>>>>>>>> to Superset within the Apache Software Foundation.  That
>     > >> said,
>     > >>>>> Apache
>     > >>>>>>>>>> Zeppelin is an indirect competitor, but it solves a
>     > >> different
>     > >>>>> use case.
>     > >>>>>>>>>>
>     > >>>>>>>>>> Apache Zeppelin is a web-based notebook that enables
>     > >>>>> interactive data
>     > >>>>>>>>>> analytics. It enables the creation of beautiful
>     > >> data-driven,
>     > >>>>> interactive
>     > >>>>>>>>>> and collaborative documents with SQL, Scala and more.
>     > >>>> Although
>     > >>>>> a user
>     > >>>>>>>> can
>     > >>>>>>>>>> create data visualizations using this project, it
>     > >> leverages a
>     > >>>>> notebook
>     > >>>>>>>>>> style user interface and it is geared towards the Spark
>     > >>>>> community where
>     > >>>>>>>>>> Scala and SQL co-exist.
>     > >>>>>>>>>>
>     > >>>>>>>>>> We look forward to collaborating with those communities,
> as
>     > >>>>> well as
>     > >>>>>>>> other
>     > >>>>>>>>>> Apache communities.
>     > >>>>>>>>>>
>     > >>>>>>>>>> === An Excessive Fascination with the Apache Brand ===
>     > >>>>>>>>>> Superset is solving two huge challenges:
>     > >>>>>>>>>> * The challenge of enabling every knowledge worker to make data-informed
>     > >>>>>>>>>> decisions, particularly those who are not deeply skilled at writing SQL.
>     > >>>>>>>>>> * The challenge of visualizing huge amounts of data interactively and in
>     > >>>>>>>>>> real-time
>     > >>>>>>>>>>
>     > >>>>>>>>>> Superset was first developed as a data visualization
>     > >> solution
>     > >>>>> for
>     > >>>>>>>> Druid.io
>     > >>>>>>>>>> as a way to visualize billions of rows of data.  Since
>     > >> then,
>     > >>>>> usage of
>     > >>>>>>>>>> Superset has expanded to address data visualization use
>     > >> cases
>     > >>>>> across SQL
>     > >>>>>>>>>> speaking data sources as well.
>     > >>>>>>>>>>
>     > >>>>>>>>>> Our rationale for developing Superset as an Apache project
>     > >> is
>     > >>>>> detailed
>     > >>>>>>>> in
>     > >>>>>>>>>> the Rationale Section.  We believe that the Apache brand
>     > >> and
>     > >>>>> community
>     > >>>>>>>>>> process will help us attract more contributors to this
>     > >>>> project,
>     > >>>>> and help
>     > >>>>>>>>>> grow the footprint of the project through usage at other
>     > >>>>> organizations
>     > >>>>>>>> and
>     > >>>>>>>>>> within other applications.  Establishing consensus among
>     > >> users
>     > >>>>> and
>     > >>>>>>>>>> developers will result in a more valuable tool for
>     > >> everyone.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Documentation ==
>     > >>>>>>>>>> References to further reading material:
>     > >>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>     > >>>>>>>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post:  Superset: Airbnb’s Data Exploration Platform]]
>     > >>>>>>>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post:  Superset: Scaling Data Access & Visual Insights at Airbnb]]
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Initial Source ==
>     > >>>>>>>>>> The origin of the proposed code base can be found at
>     > >>>>>>>>>> https://github.com/airbnb/superset.  The code base is
>     > >>>>> primarily in
>     > >>>>>>>>>> Python.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Source and Intellectual Property Submission Plan ==
>     > >>>>>>>>>> We do not expect any complications for the submission of
>     > >> the
>     > >>>>> Superset
>     > >>>>>>>> code
>     > >>>>>>>>>> base.  Our code is already in Github and there is only a
>     > >>>> single
>     > >>>>> code
>     > >>>>>>>> base.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == External Dependencies ==
>     > >>>>>>>>>> List of Python packages, from the Python Package Index
>     > >> (PyPI):
>     > >>>>>>>>>>
>     > >>>>>>>>>> * boto3
>     > >>>>>>>>>> * celery
>     > >>>>>>>>>> * cryptography
>     > >>>>>>>>>> * flask-appbuilder
>     > >>>>>>>>>> * flask-cache
>     > >>>>>>>>>> * flask-migrate
>     > >>>>>>>>>> * flask-script
>     > >>>>>>>>>> * flask-sqlalchemy
>     > >>>>>>>>>> * flask-testing
>     > >>>>>>>>>> * humanize
>     > >>>>>>>>>> * gunicorn
>     > >>>>>>>>>> * markdown
>     > >>>>>>>>>> * pandas
>     > >>>>>>>>>> * parsedatetime
>     > >>>>>>>>>> * pydruid
>     > >>>>>>>>>> * PyHive
>     > >>>>>>>>>> * python-dateutil
>     > >>>>>>>>>> * requests
>     > >>>>>>>>>> * simplejson
>     > >>>>>>>>>> * six
>     > >>>>>>>>>> * sqlalchemy
>     > >>>>>>>>>> * sqlalchemy-utils
>     > >>>>>>>>>> * sqlparse
>     > >>>>>>>>>> * thrift
>     > >>>>>>>>>> * thrift-sasl
>     > >>>>>>>>>> * werkzeug
>     > >>>>>>>>>>
>     > >>>>>>>>>> List of JavaScript packages, from NPM:
>     > >>>>>>>>>> * autobind-decorator
>     > >>>>>>>>>> * bootstrap
>     > >>>>>>>>>> * bootstrap-datepicker
>     > >>>>>>>>>> * brace
>     > >>>>>>>>>> * brfs
>     > >>>>>>>>>> * cal-heatmap
>     > >>>>>>>>>> * classnames
>     > >>>>>>>>>> * d3
>     > >>>>>>>>>> * d3-cloud
>     > >>>>>>>>>> * d3-sankey
>     > >>>>>>>>>> * d3-scale
>     > >>>>>>>>>> * d3-tip
>     > >>>>>>>>>> * datamaps
>     > >>>>>>>>>> * datatables-bootstrap3-plugin
>     > >>>>>>>>>> * datatables.net-bs
>     > >>>>>>>>>> * font-awesome
>     > >>>>>>>>>> * gridster
>     > >>>>>>>>>> * immutability-helper
>     > >>>>>>>>>> * immutable
>     > >>>>>>>>>> * jquery
>     > >>>>>>>>>> * lodash.throttle
>     > >>>>>>>>>> * mapbox-gl
>     > >>>>>>>>>> * moment
>     > >>>>>>>>>> * moments
>     > >>>>>>>>>> * mustache
>     > >>>>>>>>>> * nvd3
>     > >>>>>>>>>> * react
>     > >>>>>>>>>> * react-ace
>     > >>>>>>>>>> * react-bootstrap
>     > >>>>>>>>>> * react-bootstrap-table
>     > >>>>>>>>>> * react-dom
>     > >>>>>>>>>> * react-draggable
>     > >>>>>>>>>> * react-gravatar
>     > >>>>>>>>>> * react-grid-layout
>     > >>>>>>>>>> * react-map-gl
>     > >>>>>>>>>> * react-redux
>     > >>>>>>>>>> * react-resizable
>     > >>>>>>>>>> * react-select
>     > >>>>>>>>>> * react-syntax-highlighter
>     > >>>>>>>>>> * reactable
>     > >>>>>>>>>> * redux
>     > >>>>>>>>>> * redux-localstorage
>     > >>>>>>>>>> * redux-thunk
>     > >>>>>>>>>> * shortid
>     > >>>>>>>>>> * style-loader
>     > >>>>>>>>>> * supercluster
>     > >>>>>>>>>> * topojson
>     > >>>>>>>>>> * victory
>     > >>>>>>>>>> * viewport-mercator-project
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Cryptography ==
>     > >>>>>>>>>> The proposal does not include cryptographic code.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Required Resources ==
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Mailing List ===
>     > >>>>>>>>>> There is a current mailing list as a Google Group
>     > >>>>> “airbnb_superset” that
>     > >>>>>>>>>> we
>     > >>>>>>>>>> are planning on deprecating as the Apache.org lists become ready
>     > >> to
>     > >>>>> serve our
>     > >>>>>>>>>> community.
>     > >>>>>>>>>>
>     > >>>>>>>>>> * superset-private
>     > >>>>>>>>>> * superset-dev
>     > >>>>>>>>>> * superset-user
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Subversion Directory ===
>     > >>>>>>>>>> Git is the preferred source control system.
>     > >>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Git Repository ==
>     > >>>>>>>>>> Git is the preferred source control system; we’re assuming
>     > >>>>>>>>>> https://github.com/apache/incubator-superset based on the
>     > >>>>> naming scheme
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Issue Tracking ==
>     > >>>>>>>>>> JIRA Superset (SUPERSET). If possible, we’d like to use
>     > >> Github
>     > >>>>> issues &
>     > >>>>>>>>>> PRs
>     > >>>>>>>>>> to manage our project as much as possible. It’s been said
>     > >> that
>     > >>>>> there are
>     > >>>>>>>>>> ways to keep Github’s issues in sync with Jira, allowing
>     > >> us to
>     > >>>>> get the best
>     > >>>>>>>> of
>     > >>>>>>>>>> both worlds. If that is not possible, we will comply by
>     > >> using
>     > >>>>> Jira.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Other Resources ==
>     > >>>>>>>>>> We currently use a set of Github integrated services that
>     > >> are
>     > >>>>> free to
>     > >>>>>>>> the
>     > >>>>>>>>>> open source community, like Travis-ci, Code Climate,
>     > >>>> Coveralls,
>     > >>>>>>>>>> Landscape.io, Requires.io, david-dm and Gitter. We would
>     > >> like
>     > >>>>> to keep
>     > >>>>>>>>>> using
>     > >>>>>>>>>> these services as they allow us to scale contributions and
>     > >>>>> optimize our
>     > >>>>>>>>>> development flows. These services require some elevated
>     > >> rights
>     > >>>>> on the
>     > >>>>>>>>>> Github repository in order to set up or tune and we would
>     > >> like
>     > >>>>> for the
>     > >>>>>>>>>> committers to have the required rights.
>     > >>>>>>>>>>
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Initial Committers ==
>     > >>>>>>>>>>
>     > >>>>>>>>>> * Maxime Beauchemin <[hidden email]> - PPMC
>     > >> &
>     > >>>>> Committer
>     > >>>>>>>>>> * Alanna Scott <[hidden email]> - PPMC &
>     > >> Committer
>     > >>>>>>>>>> * Bogdan Kyryliuk <[hidden email]> - PPMC &
>     > >> Committer
>     > >>>>>>>>>> * Vera Liu <[hidden email]> - Committer
>     > >>>>>>>>>> * Jeff Feng <[hidden email]> - PPMC & Committer
>     > >>>>>>>>>> * Ashutosh Chauhan <[hidden email]> - Mentor &
>     > >>>> Committer
>     > >>>>>>>>>> * Nishant Bangarwa <[hidden email]> - PPMC &
>     > >>>>> Committer
>     > >>>>>>>>>> * Slim Bouguerra <[hidden email]> - Committer
>     > >>>>>>>>>> * Priyank Shah <[hidden email]> - Committer
>     > >>>>>>>>>> * Harsha Chintalapani <[hidden email]> -
>     > >>>>> Committer
>     > >>>>>>>>>> * Daniel Dai <[hidden email]> - Champion & Committer
>     > >>>>>>>>>> * Luke Han <[hidden email]> - Mentor
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Affiliations ==
>     > >>>>>>>>>> The initial committers are employees of Airbnb Inc. and
>     > >>>>> Hortonworks.
>     > >>>>>>>>>>
>     > >>>>>>>>>> == Sponsors ==
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Champion ===
>     > >>>>>>>>>> Daniel Dai <[hidden email]>
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Nominated Mentors ===
>     > >>>>>>>>>> * Ashutosh Chauhan <[hidden email]>
>     > >>>>>>>>>> * Luke Han <[hidden email]>
>     > >>>>>>>>>>
>     > >>>>>>>>>> === Sponsoring Entity ===
>     > >>>>>>>>>> Incubator PMC
>     > >>>>>>>>>>
>     > >>>>>>>>>
>     > >>>>>>>>>
>     > >>>>>>>>
>     > >>>>>>
>     > >>>>>>
>     > >>>>>>    ---------------------------------------------------------------------
>     > >>>>>>    To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>     > >>>>>>    For additional commands, e-mail: [hidden email]
>     > >>>>>>
>     > >>>>>>
>     > >>>>>>
>     > >>>>>>
>     > >>>>>
>     > >>>>>
>     > >>>>>
>     > >>>>
>     > >>
>     > >>
>     > >>
>     > >> --
>     > >> Best Regards, Edward J. Yoon
>     > >>
>     > >>
>     > >>
>     >
>     >
>     >
>     >
>     >
>
>
>
>

Re: [VOTE] Superset Proposal for Apache Incubator

Julien Le Dem-3
+1 (binding)

On Thu, Apr 27, 2017 at 3:08 PM, Jacques Nadeau <[hidden email]> wrote:

> +1 binding
>
> On Apr 27, 2017 1:35 PM, "Jitendra Pandey" <[hidden email]>
> wrote:
>
> > Re-affirming. +1 (binding)
> >
> > On 4/27/17, 1:24 PM, "Ashutosh Chauhan" <[hidden email]>
> > wrote:
> >
> >     Re-affirming my vote as well:
> >
> >     +1 (binding)
> >
> >     On Thu, Apr 27, 2017 at 10:45 AM, Julian Hyde <[hidden email]>
> > wrote:
> >
> >     > Re-affirming my vote:
> >     >
> >     > +1 (binding)
> >     >
> >     > > On Apr 26, 2017, at 11:12 PM, Jeff Feng <[hidden email]>
> > wrote:
> >     > >
> >     > > Hello everyone,
> >     > >
> >     > > Thank you for checking out our proposal on Superset and for your
> >     > > consideration for the Apache Incubator.  So far, I believe we
> have
> > 8
> >     > > binding votes and 2 non-binding votes.
> >     > >
> >     > > As Taylor mentioned earlier, we made a minor update to the
> wording
> > in the
> >     > > "Source and Intellectual Property Submission Plan" section based
> > on a
> >     > > suggestion by John Ament.  The update was to help confirm the
> > previously
> >     > > unstated assumption that we will submit an SGA.  I have copied
> the
> >     > updated
> >     > > proposal from the wiki to the email below and highlighted (in
> > yellow) the
> >     > > new sentence below in the document.
> >     > >
> >     > > Folks on the cc line who have already voted, please let us know
> if
> > the
> >     > > change impacts your vote.
> >     > >
> >     > > Thank you all,
> >     > > Jeff
> >     > >
> >     > >
> >     > >
> >     > > = Superset =
> >     > >
> >     > > == Abstract ==
> >     > > Superset is an enterprise-ready web application for data
> > exploration,
> >     > data
> >     > > visualization and dashboarding.
> >     > >
> >     > > == Proposal ==
> >     > > Superset is business intelligence (BI) software that helps modern
> >     > > organizations visualize and interact with their data. Superset
> > enables
> >     > > users to explore data from a variety of databases, assemble
> beautiful
> >     > > dashboards and share their findings.  Superset works neatly with
> > all
> >     > modern
> >     > > SQL-speaking databases, and integrates with Druid.io to provide
> >     > real-time,
> >     > > interactive, blazing fast data access to large datasets.
> >     > >
> >     > > == Background ==
> >     > > Data is mission critical. To succeed in this era, organizations
> > need to
> >     > > provide low-friction, intuitive and interactive access to data.
> It
> > is
> >     > > paramount for knowledge workers to be capable of answering their
> > own
> >     > > questions by querying, exploring and visualizing data.
> >     > >
> >     > > The entire business intelligence industry has pivoted from a
> model
> > of
> >     > > centralized top-down platforms driven by IT organizations to
> > self-service
> >     > > analytics and agile workflows by any user.  This shift unblocks
> >     > centralized
> >     > > service bottlenecks for creating data visualizations while also
> > creating
> >     > an
> >     > > environment that is iterative and fast-moving.  This means that
> > business
> >     > > intelligence software must also be easy and delightful to use.
> >     > > Self-service analytics doesn’t mean that admin and governance
> > features
> >     > are
> >     > > not needed.
> >     > > Modern BI tools provide fine-grain access controls and auditing
> >     > > capabilities to understand how data is being used.  Superset is a
> >     > solution
> >     > > that delivers on all of these vectors.
> >     > >
> >     > > The technology stack is also constantly morphing - vendors are
> > struggling
> >     > > to provide cheap, quick and easy solutions to access data.
> > Business
> >     > > intelligence users are finding existing solutions lacking as
> these
> >     > software
> >     > > products either disregard or react slowly to recent game-changing
> >     > > technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin,
> > d3.js,
> >     > > React.js and iPython’s Jupyter for instance.
> >     > >
> >     > > == Rationale ==
> >     > > Business intelligence is more relevant today than at any other
> > point in
> >     > > history.  Organizations are currently very limited in options for
> > open
> >     > > source data visualization solutions, especially solutions that
> are
> > both
> >     > > self-service and enterprise-ready.  Every company informing their
> >     > decisions
> >     > > with data needs a BI tool.
> >     > >
> >     > > We believe that Superset will be a strong complement to existing
> > Apache
> >     > > Software Foundation technologies by offering scalable user
> > interactions
> >     > to
> >     > > distributed storage and computation solutions.  Users will often
> > find
> >     > that
> >     > > Superset can act as a catalyst for tooling that can visualize the
> >     > byproduct
> >     > > of data and computation infrastructure.
> >     > >
> >     > > Superset has many key design elements that help fill a gap in
> > current
> >     > > solutions for organizations:
> >     > > * Easy, low friction access to data through a simple, web-based
> > data
> >     > > exploration interface.  Composing charts and dashboards is
> > intuitive.
> >     > > Eliminating the need to write code or SQL empowers anyone to use
> > it.
> >     > > * Access to a wide array of rich, interactive data visualization
> > types.
> >     > > * Enterprise-ready: Integration with different authentication
> > mechanisms
> >     > > and granular permissions centered around actions and data access.
> >     > > * Realtime & fast: Superset provides realtime analytics at the
> > speed of
> >     > > thought on very large datasets when integrated with Druid.io.
> >     > > * Broad data access: Consume data out of any SQL-speaking
> > relational
> >     > > database.
> >     > > * Extensible: Can be extended to talk to many NoSQL databases
> like
> > Apache
> >     > > Drill, Elasticsearch, and other popular database engines.
> >     > > * Fast loading dashboards with configurable web-scale caching.
> >     > > * Plug-in framework that enables organizations to build custom
> > analytical
> >     > > applications with new UI/UX interfaces.
> >     > > * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking
> > users
> >     > > with more flexibility.  SQL Lab integrates with the visualization
> > engine
> >     > > seamlessly.
> >     > >
> >     > > == Initial Goals ==
> >     > > The initial goals of the Superset project are several-fold:
> >     > > * Move the existing codebase to Apache and integrate with the
> > Apache
> >     > > development process.
> >     > > * Redesign the user interface and interaction model for creating
> >     > > visualizations/dashboards and connecting to data sources
> >     > > * Build robust support for security and governance of the tool
> > including
> >     > > popular authorization modules (including Apache Ranger and Apache
> > Sentry)
> >     > > and a more sophisticated permissions system
> >     > > * Grow the extensibility of the project both in terms of enhanced
> >     > > connectivity to NoSQL-based data sources and creating a plug-in
> > framework
> >     > > that enables organizations to build custom analytical
> applications
> > which
> >     > > require a new UI/UX
> >     > >
> >     > > == Current Status ==
> >     > > By many standards, Superset is already a successful open source
> > project.
> >     > As
> >     > > of March 2017, Superset is officially used in production at about
> > a dozen
> >     > > companies, has received contributions from over one hundred
> > contributors
> >     > on
> >     > > Github, 1500+ forks, and 12k+ stars.
> >     > >
> >     > > Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> >     > > significant contributions, and expressed their commitment to the
> > project.
> >     > > The product is feature complete and has been viable for months.
> It
> >     > already
> >     > > serves as the main interface for consuming data at many companies
> > of
> >     > > different sizes.
> >     > >
> >     > > While the product is usable, there’s room for improvement across
> > the
> >     > board,
> >     > > starting with providing a smoother user experience around content
> >     > creation,
> >     > > making sure all features work out-of-the-box on more platforms
> and
> >     > > databases, providing better user training guides and videos,
> > having a
> >     > > predictable release process, and increasing the overall quality
> of
> > the
> >     > > Superset releases.
> >     > >
> >     > > === Meritocracy ===
> >     > > We plan to invest in supporting a meritocracy. We will discuss
> the
> >     > > requirements in an open forum. Several companies have expressed
> > interest
> >     > in
> >     > > this project, and we intend to invite additional developers to
> >     > participate.
> >     > > We will encourage and monitor community participation so that
> > privileges
> >     > > can be extended to those that contribute.
> >     > >
> >     > > === Community ===
> >     > > The need for an enterprise-ready data visualization and
> exploration
> >     > > platform in the open source community is tremendous.  While
> > Superset is
> >     > > fairly well known, recognized and used within the Druid.io
> > community,
> >     > > adoption is currently limited outside of that niche. There is a
> > huge
> >     > > opportunity to grow the community to hundreds if not thousands of
> >     > > organizations, and we are hoping that embracing “the Apache way”
> > will
> >     > > accelerate the growth of our community.
> >     > >
> >     > > We have already been active at seeking and inviting
> contributions,
> > and
> >     > are
> >     > > planning to scale the project by investing time and growing the
> > support
> >     > > structure to grow the community.
> >     > >
> >     > > === Core Developers ===
> >     > > The initial committers for Superset include experienced full
> stack,
> >     > > front-end and data engineers:
> >     > > * Maxime Beauchemin (Airbnb)
> >     > > * Alanna Scott (Airbnb)
> >     > > * Bogdan Kyryliuk (Airbnb)
> >     > > * Vera Liu  (Airbnb)
> >     > > * Jeff Feng (Airbnb)
> >     > > * Ashutosh Chauhan (Hortonworks)
> >     > > * Nishant Bangarwa (Hortonworks)
> >     > > * Slim Bouguerra (Hortonworks)
> >     > > * Priyank Shah (Hortonworks)
> >     > > * Sriharsha Chintalapani (Hortonworks)
> >     > > * Daniel Dai (Hortonworks)
> >     > >
> >     > > We realize that additional employer diversity is needed, and we
> > will work
> >     > > aggressively to recruit developers from additional companies.
> >     > >
> >     > > === Alignment ===
> >     > > The initial committers strongly believe that a system for
> > interactive
> >     > > visualization of data will gain broader adoption as an open
> source,
> >     > > community-driven project, where the community can contribute not
> > only to
> >     > > the core components, but also to a growing collection of
> > connectors,
> >     > > visualizations and improving integration with all potential data
> > sources.
> >     > > Superset already integrates closely with Apache Hive, the Hive
> > metastore,
> >     > > as well as most SQL-speaking databases found in modern data
> > ecosystems.
> >     > >
> >     > > == Known Risks ==
> >     > >
> >     > > === Orphaned Products ===
> >     > > Superset is a vital component for visualizing, accessing and
> >     > > democratizing data at Airbnb.  Also at Hortonworks, Superset is a
> > core
> >     > > component of the DataFlow product offering.  Thus, the risk of
> the
> >     > project
> >     > > being orphaned is relatively low.  The project could be at risk
> if
> > Airbnb
> >     > > changes their approach for democratizing data or if Hortonworks
> > changes
> >     > > their strategy in the market.  In such an event, the committers
> > plan to
> >     > > continue working on the project on their own time, though the
> > progress
> >     > > will likely be slower.  We plan to mitigate this risk by
> recruiting
> >     > > additional committers.
> >     > >
> >     > > === Inexperience with Open Source ===
> >     > > The initial committers include veteran Apache members (committers
> > and
> >     > PPMC
> >     > > members) and other developers who have varying degrees of
> > experience with
> >     > > open source projects. All have been involved with source code
> that
> > has
> >     > been
> >     > > released under an open source license, and several also have
> > experience
> >     > > developing code with an open source development process.
> >     > >
> >     > > === Homogenous Developers ===
> >     > > The initial committers are employed by Airbnb Inc. and
> > Hortonworks. We
> >     > are
> >     > > committed to recruiting additional committers from other
> companies.
> >     > >
> >     > > === Reliance on Salaried Developers ===
> >     > > It is expected that Supe