Open Data Science for All (OpenDS4All)

Projects that follow the best practices below can voluntarily self-certify and show that they've achieved an Open Source Security Foundation (OpenSSF) best practices badge.

If this is your project, please show your badge status on your project page! The badge status looks like this: "Badge level for project 9294 is passing".

These are the Passing level criteria. You can also view the Silver or Gold level criteria.

        

 Basics 13/13

  • Identification

    The Open Data Science for All (OpenDS4All) resources contain the building blocks for a data science curriculum and learning journey organized by category.

    What programming language(s) are used to implement the project?
  • Basic project website content


    The project website MUST succinctly describe what the software does (what problem does it solve?). [description_good]

    https://github.com/odpi/OpenDS4All - OpenDS4All is a project created to accelerate the creation of data science curricula at academic institutions. While a great deal of online material is available for data science, including online courses, we recognize that the best way for many students to learn (and for many institutions to deliver) content is through a combination of lectures, recitation or flipped classroom activities, and hands-on assignments. OpenDS4All attempts to fill this important niche. Our goal is to provide recommendations, slide sets, sample Jupyter notebooks, and other materials for creating, customizing, and delivering data science and data engineering education.



    The project website MUST provide information on how to: obtain, provide feedback (as bug reports or enhancements), and contribute to the software. [interact]

    https://github.com/odpi/OpenDS4All/blob/master/COMMUNITY-GUIDE.md - This project welcomes contributors from any organization or individual, provided they are willing to follow the simple processes outlined below, as well as adhere to the Code of Conduct.

    https://github.com/odpi/OpenDS4All/blob/master/CONTRIBUTING.md - List of contributing guidelines. All are welcome to contribute.



    The information on how to contribute MUST explain the contribution process (e.g., are pull requests used?) (URL required) [contribution]

    Non-trivial contribution file in repository: https://github.com/odpi/OpenDS4All/blob/master/CONTRIBUTING.md. - List of contributing guidelines. All are welcome to contribute. The easiest way to create a pull request is by selecting your working branch, and clicking on 'New pull request'. Add your Developer Certificate of Origin and submit your request to the OpenDS4All committers for review and inclusion in the repository. You can submit an issue to https://github.com/odpi/OpenDS4All//issues. If you have any sensitive concerns or wish to report a security issue, please email odpi-opends4all-private@lists.odpi.org instead and do not submit a public issue.



    The information on how to contribute SHOULD include the requirements for acceptable contributions (e.g., a reference to any required coding standard). (URL required) [contribution_requirements]

    https://github.com/odpi/OpenDS4All/blob/master/CONTRIBUTING.md - The minimum requirement for a module to be considered for inclusion in this repository is that it contains:

      • a set of PowerPoint slides (with presenter notes)
          • 30 or more slides are recommended
          • there must be enough substance in the slide deck to cover at least a 50-minute lecture
      • a Jupyter notebook (illustrating how material covered in the slides is applied to one or more data sets)
          • use public data sets that are available for download or accessible through a hyperlink
          • do not assume dependent packages are pre-installed in the user's Jupyter environment
          • import all modules needed to run the code cells successfully
          • keep the markdown cells as simple as possible
          • NB! The Jupyter notebook may be omitted in special cases, such as in Foundational modules where no accompanying data sets exist. However, this should be the exception rather than the rule.
      • a short summary of the module with a set of learning outcomes (in a text or a markdown file)
          • 300 or fewer words are recommended (for the summary)
          • use active verbs when formulating outcomes
          • make sure the outcomes are measurable
          • examples of learning outcomes are:
              • understand sampling, probability theory, and probability distributions
              • implement descriptive and inferential statistics using Python
              • demonstrate ability to visualize data and extract insight
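    The minimum requirements above could be checked mechanically before a pull request is opened. The following is only a sketch: the function name `check_module`, the idea of identifying files purely by extension (`.pptx`, `.ipynb`, `.md`/`.txt`), and treating the 300-word recommendation as a hard limit are all assumptions made for illustration, not part of the project's tooling.

```python
from pathlib import Path

MAX_SUMMARY_WORDS = 300  # recommended upper bound taken from the guidelines


def check_module(module_dir):
    """Return a list of human-readable problems found in a module folder.

    Hypothetical helper: files are recognised by extension only.
    """
    root = Path(module_dir)
    problems = []

    # A slide deck is always required.
    if not any(root.glob("*.pptx")):
        problems.append("missing PowerPoint slide deck (*.pptx)")

    # The notebook may be omitted in special cases, so this is only a warning.
    if not any(root.glob("*.ipynb")):
        problems.append("warning: no Jupyter notebook (*.ipynb) found")

    # The summary lives in a text or markdown file.
    summaries = sorted(root.glob("*.md")) + sorted(root.glob("*.txt"))
    if not summaries:
        problems.append("missing summary (*.md or *.txt)")
    for summary in summaries:
        words = len(summary.read_text(encoding="utf-8").split())
        if words > MAX_SUMMARY_WORDS:
            problems.append(
                f"{summary.name}: {words} words exceeds {MAX_SUMMARY_WORDS}"
            )

    return problems
```

    Calling `check_module("path/to/module")` returns an empty list when nothing is flagged. Slide count, lecture length, and the measurability of learning outcomes still need human review.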

    This repository now also accepts data use cases.

    Data use cases should include:

      • One or more data sets
      • A description of:
          • The purpose / goal of analyzing this data and what business problem(s) can be solved with similar data (objective)
          • The data set
              • The origin of the data (source)
              • The features of the data set (attribute information)
      • A Jupyter Notebook illustrating how the data is analysed


  • FLOSS license

    What license(s) is the project released under?



    The software produced by the project MUST be released as FLOSS. [floss_license]

    The Apache-2.0 license is approved by the Open Source Initiative (OSI).



    It is SUGGESTED that any required license(s) for the software produced by the project be approved by the Open Source Initiative (OSI). [floss_license_osi]

    The Apache-2.0 license is approved by the Open Source Initiative (OSI).



    The project MUST post the license(s) of its results in a standard location in their source repository. (URL required) [license_location]

    Non-trivial license location file in repository: https://github.com/odpi/OpenDS4All/blob/master/LICENSE.


  • Documentation


    The project MUST provide basic documentation for the software produced by the project. [documentation_basics]

    https://github.com/odpi/OpenDS4All/blob/master/Instructor_Notes.md - Documentation on how instructors should use the content produced by the project.

    https://github.com/odpi/OpenDS4All/blob/master/opends4all-resources/README.md - Teaching resources on using the content in a classroom.



    The project MUST provide reference documentation that describes the external interface (both input and output) of the software produced by the project. [documentation_interface]

    The OpenDS4All content is educational content, not software. Users can view the content easily through a quick download option.


  • Other


    The project sites (website, repository, and download URLs) MUST support HTTPS using TLS. [sites_https]

    Given only https: URLs.



    The project MUST have one or more mechanisms for discussion (including proposed changes and issues) that are searchable, allow messages and topics to be addressed by URL, enable new people to participate in some of the discussions, and do not require client-side installation of proprietary software. [discussion]

    GitHub supports discussions on issues and pull requests.



    The project SHOULD provide documentation in English and be able to accept bug reports and comments about code in English. [english]

    https://github.com/odpi/OpenDS4All/blob/master/COMMUNITY-GUIDE.md - Pull request information can be found here in English; pull requests and comments can be provided in English.



    The project MUST be maintained. [maintained]

    The Northeast Big Data Innovation Hub in the Data Science Institute at Columbia University in New York is currently managing the OpenDS4All content and evolution. Professional and student volunteers are supporting content development and community growth across various domains and disciplines.






  • Public version-controlled source repository


    The project MUST have a version-controlled source repository that is publicly readable and has a URL. [repo_public]

    Repository on GitHub, which provides public git repositories with URLs.



    The project's source repository MUST track what changes were made, who made the changes, and when the changes were made. [repo_track]

    Repository on GitHub, which uses git. git can track the changes, who made them, and when they were made. Resource pages show content additions and release history toward the bottom of the pages.



    To enable collaborative review, the project's source repository MUST include interim versions for review between releases; it MUST NOT include only final releases. [repo_interim]

    Between releases, we plan to share interim versions for review with the OpenDS4All team, which is managed by the Northeast Big Data Innovation Hub.



    It is SUGGESTED that common distributed version control software be used (e.g., git) for the project's source repository. [repo_distributed]

    Repository on GitHub, which uses git. git is distributed.


  • Unique version numbering


    The project results MUST have a unique version identifier for each release intended to be used by users. [version_unique]

    We will be consistently posting new content to this repository. Version history will be found toward the bottom of each resource page: https://github.com/odpi/OpenDS4All/tree/master/opends4all-resources/opends4all-exploratory-data-analysis



    It is SUGGESTED that the Semantic Versioning (SemVer) or Calendar Versioning (CalVer) version numbering format be used for releases. It is SUGGESTED that those who use CalVer include a micro level value. [version_semver]
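    As an illustration of the two numbering formats named in this criterion, the sketch below checks whether a version string looks like SemVer or like one possible CalVer layout. The patterns are simplified assumptions: the SemVer pattern is looser than the full specification for pre-release and build identifiers, and `YYYY.MM.MICRO` is only one of many CalVer schemes, chosen here because it includes the micro level the criterion suggests. The function names are hypothetical.

```python
import re

# Simplified SemVer: MAJOR.MINOR.PATCH without leading zeros, plus optional
# pre-release ("-rc.1") and build metadata ("+build.5").
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)

# One possible CalVer layout, YYYY.MM.MICRO, with the month limited to 1-12.
CALVER = re.compile(r"^(\d{4})\.(0?[1-9]|1[0-2])\.(\d+)$")


def is_semver(version):
    """Return True if the string matches the simplified SemVer pattern."""
    return SEMVER.match(version) is not None


def is_calver(version):
    """Return True if the string matches the assumed YYYY.MM.MICRO layout."""
    return CALVER.match(version) is not None
```

    A string such as `2024.8.1` satisfies both patterns, so a project would normally pick one scheme and document it rather than infer it from the string.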


    It is SUGGESTED that projects identify each release within their version control system. For example, it is SUGGESTED that those using git identify each release using git tags. [version_tags]

    The team will work on using git tags to identify each upcoming release.


  • Release notes


    The project MUST provide, in each release, release notes that are a human-readable summary of major changes in that release to help users determine if they should upgrade and what the upgrade impact will be. The release notes MUST NOT be the raw output of a version control log (e.g., the "git log" command results are not release notes). Projects whose results are not intended for reuse in multiple locations (such as the software for a single website or service) AND employ continuous delivery MAY select "N/A". (URL required) [release_notes]

    Release notes and content additions can be found toward the bottom of the resource pages. Please find an example page here: https://github.com/odpi/OpenDS4All/tree/master/opends4all-resources/opends4all-data-wrangling-and-integration



    The release notes MUST identify every publicly known run-time vulnerability fixed in this release that already had a CVE assignment or similar when the release was created. This criterion may be marked as not applicable (N/A) if users typically cannot practically update the software themselves (e.g., as is often true for kernel updates). This criterion applies only to the project results, not to its dependencies. If there are no release notes or there have been no publicly known vulnerabilities, choose N/A. [release_notes_vulns]

    This repository is not used to develop software. Learner and educator content will be continuously uploaded to it. There have been no publicly known vulnerabilities attributed to this repository or its use.


  • Bug-reporting process


    The project MUST provide a process for users to submit bug reports (e.g., using an issue tracker or a mailing list). (URL required) [report_process]

    https://github.com/odpi/OpenDS4All/blob/master/COMMUNITY-GUIDE.md - Users may submit bug reports. Instructions to do so can be found on the linked page.



    The project SHOULD use an issue tracker for tracking individual issues. [report_tracker]

    The team will be using an issue tracker to track individual issues.



    The project MUST acknowledge a majority of bug reports submitted in the last 2-12 months (inclusive); the response need not include a fix. [report_responses]

    All bug reports in the last 2-12 months have been acknowledged. We will continue to acknowledge them as they arise.



    The project SHOULD respond to a majority (>50%) of enhancement requests in the last 2-12 months (inclusive). [enhancement_responses]

    All enhancement requests have been responded to. We will continue to respond as they arise.



    The project MUST have a publicly available archive for reports and responses for later searching. (URL required) [report_archive]

    https://github.com/odpi/OpenDS4All/pulls - All pull requests and reports can be found here.


  • Vulnerability report process


    The project MUST publish the process for reporting vulnerabilities on the project site. (URL required) [vulnerability_report_process]

    https://github.com/odpi/OpenDS4All/blob/master/COMMUNITY-GUIDE.md - Issues and vulnerabilities may be reported using the instructions found on this webpage.



    If private vulnerability reports are supported, the project MUST include how to send the information in a way that is kept private. (URL required) [vulnerability_report_private]

    https://github.com/odpi/OpenDS4All/blob/master/COMMUNITY-GUIDE.md - All private reports will be kept private and handled/managed privately.



    The project's initial response time for any vulnerability report received in the last 6 months MUST be less than or equal to 14 days. [vulnerability_report_response]

  • Working build system


    If the software produced by the project requires building for use, the project MUST provide a working build system that can automatically rebuild the software from source code. [build]

    We do not require building for use. The project does not produce software.



    It is SUGGESTED that common tools be used for building the software. [build_common_tools]

    The project does not produce software.



    The project SHOULD be buildable using only FLOSS tools. [build_floss_tools]

    We do not require building for use. The project does not produce software.


  • Automated test suite


    The project MUST use at least one automated test suite that is publicly released as FLOSS (this test suite may be maintained as a separate FLOSS project). The project MUST clearly show or document how to run the test suite(s) (e.g., via a continuous integration (CI) script or via documentation in files such as BUILD.md, README.md, or CONTRIBUTING.md). [test]

    We do not require an automated test suite. The project does not produce software.



    A test suite SHOULD be invocable in a standard way for that language. [test_invocation]

    We do not require an automated test suite. We do not produce software.



    It is SUGGESTED that the test suite cover most (or ideally all) the code branches, input fields, and functionality. [test_most]

    We do not require a test suite. We do not produce software.



    It is SUGGESTED that the project implement continuous integration (where new or changed code is frequently integrated into a central code repository and automated tests are run on the result). [test_continuous_integration]

    We do plan to add additional Jupyter notebooks and Google Colab notebooks that include code to the repository. They will be tested prior to their addition. These new projects will be frequently and continuously integrated into the central repository and its resource pages.


  • New functionality testing


    The project MUST have a general policy (formal or not) that as major new functionality is added to the software produced by the project, tests of that functionality should be added to an automated test suite. [test_policy]

    We do not produce software. Functionality of the repository itself will be regularly tested.



    The project MUST have evidence that the test_policy for adding tests has been adhered to in the most recent major changes to the software produced by the project. [tests_are_added]

    We do not produce software. Functionality of the repository itself will be regularly tested.



    It is SUGGESTED that this policy on adding tests (see test_policy) be documented in the instructions for change proposals. [tests_documented_added]

    We do not produce software. Functionality of the repository itself will be regularly tested.


  • Warning flags


    The project MUST enable one or more compiler warning flags, a "safe" language mode, or use a separate "linter" tool to look for code quality errors or common simple mistakes, if there is at least one FLOSS tool that can implement this criterion in the selected language. [warnings]

    FLOSS tools are not currently used in the repository.



    The project MUST address warnings. [warnings_fixed]

    The project team will address warnings as they arise. However, this project does not produce software.



    It is SUGGESTED that projects be maximally strict with warnings in the software produced by the project, where practical. [warnings_strict]

    This project does not produce software.


  • Secure development knowledge


    The project MUST have at least one primary developer who knows how to design secure software. (See ‘details’ for the exact requirements.) [know_secure_design]

    The Northeast Big Data Innovation Hub HQ Team will support the security of the repository for all of the parameters above. We will address warnings and vulnerabilities as they arise.



    At least one of the project's primary developers MUST know of common kinds of errors that lead to vulnerabilities in this kind of software, as well as at least one method to counter or mitigate each of them. [know_common_errors]

    The Northeast Big Data Innovation Hub HQ Team will support the security of the repository for all of the parameters above. The Northeast Big Data Innovation Hub HQ Team will address warnings and vulnerabilities as they arise.


  • Use basic good cryptographic practices

    Note that some software does not need to use cryptographic mechanisms. If your project produces software that (1) includes, activates, or enables encryption functionality, and (2) might be released from the United States (US) to outside the US or to a non-US-citizen, you may be legally required to take a few extra steps. Typically this just involves sending an email. For more information, see the encryption section of Understanding Open Source Technology & US Export Controls.

    The software produced by the project MUST use, by default, only cryptographic protocols and algorithms that are publicly published and reviewed by experts (if cryptographic protocols and algorithms are used). [crypto_published]

    This project is not producing software.



    If the software produced by the project is an application or library, and its primary purpose is not to implement cryptography, then it SHOULD only call on software specifically designed to implement cryptographic functions; it SHOULD NOT re-implement its own. [crypto_call]

    This project is not producing software.



    All functionality in the software produced by the project that depends on cryptography MUST be implementable using FLOSS. [crypto_floss]

    This project is not producing software.



    The security mechanisms within the software produced by the project MUST use default keylengths that at least meet the NIST minimum requirements through the year 2030 (as stated in 2012). It MUST be possible to configure the software so that smaller keylengths are completely disabled. [crypto_keylength]

    This project is not producing software.



    The default security mechanisms within the software produced by the project MUST NOT depend on broken cryptographic algorithms (e.g., MD4, MD5, single DES, RC4, Dual_EC_DRBG), or use cipher modes that are inappropriate to the context, unless they are necessary to implement an interoperable protocol (where the protocol implemented is the most recent version of that standard broadly supported by the network ecosystem, that ecosystem requires the use of such an algorithm or mode, and that ecosystem does not offer any more secure alternative). The documentation MUST describe any relevant security risks and any known mitigations if these broken algorithms or modes are necessary for an interoperable protocol. [crypto_working]

    This project is not producing software.



    The default security mechanisms within the software produced by the project SHOULD NOT depend on cryptographic algorithms or modes with known serious weaknesses (e.g., the SHA-1 cryptographic hash algorithm or the CBC mode in SSH). [crypto_weaknesses]

    This project is not producing software.



    The security mechanisms within the software produced by the project SHOULD implement perfect forward secrecy for key agreement protocols so that a session key derived from a set of long-term keys cannot be compromised if one of the long-term keys is compromised in the future. [crypto_pfs]

    This project is not producing software.



    If the software produced by the project causes the storing of passwords for authentication of external users, the passwords MUST be stored as iterated hashes with a per-user salt by using a key stretching (iterated) algorithm (e.g., Argon2id, Bcrypt, Scrypt, or PBKDF2). See also the OWASP Password Storage Cheat Sheet. [crypto_password_storage]

    This project is not producing software.
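    Although this project does not store passwords, the password storage criterion above can be illustrated with a short sketch for readers who reuse the course material in software of their own. It uses PBKDF2, one of the algorithms the criterion names, through Python's standard `hashlib.pbkdf2_hmac`. The function names, the 16-byte salt, and the iteration count are assumptions chosen for illustration and should be set according to current guidance and local hardware.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # illustrative work factor, not a project requirement


def hash_password(password):
    """Return (salt, derived_key) using a fresh random salt for each user."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, key


def verify_password(password, salt, expected_key):
    """Recompute the derived key and compare it in constant time."""
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(key, expected_key)
```

    The salt and derived key would be stored together per user; the plain password is never stored. Because each call draws a new salt, two users with the same password end up with different stored values.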



    The security mechanisms within the software produced by the project MUST generate all cryptographic keys and nonces using a cryptographically secure random number generator, and MUST NOT do so using generators that are cryptographically insecure. [crypto_random]

    This project is not producing software.
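    For readers who want to see what the random number criterion above means in practice, here is a minimal sketch using Python's standard `secrets` module, which draws from the operating system's cryptographically secure source. The variable names and the sizes (32-byte key, 12-byte nonce) are assumptions for illustration only; the right sizes depend on the algorithm in use.

```python
import secrets

# Key and nonce material from a cryptographically secure generator.
key = secrets.token_bytes(32)      # 256 bits of key material
nonce = secrets.token_bytes(12)    # 96-bit nonce (size chosen for illustration)
token = secrets.token_urlsafe(32)  # URL-safe text token built from 32 random bytes

# By contrast, the general-purpose `random` module is designed for modelling
# and simulation and should not be used for keys, nonces, or tokens.
```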


  • Secured delivery against man-in-the-middle (MITM) attacks


    The project MUST use a delivery mechanism that counters MITM attacks. Using https or ssh+scp is acceptable. [delivery_mitm]

    Our project uses HTTPS.



    A cryptographic hash (e.g., a sha1sum) MUST NOT be retrieved over http and used without checking for a cryptographic signature. [delivery_unsigned]

    We will not be using hashes within this project.


  • Publicly known vulnerabilities fixed


    There MUST be no unpatched vulnerabilities of medium or higher severity that have been publicly known for more than 60 days. [vulnerabilities_fixed_60_days]

    There are no known unpatched vulnerabilities. All that arise will be patched within 60 days.



    Projects SHOULD fix all critical vulnerabilities rapidly after they are reported. [vulnerabilities_critical_fixed]

    We will fix all critical vulnerabilities rapidly as they are reported.


  • Other security issues


    The public repositories MUST NOT leak a valid private credential (e.g., a working password or private key) that is intended to limit public access. [no_leaked_credentials]

    The public repository will not leak any private information.


  • Static code analysis


    At least one static code analysis tool (beyond compiler warnings and "safe" language modes) MUST be applied to any proposed major production release of the software before its release, if there is at least one FLOSS tool that implements this criterion in the selected language. [static_analysis]

    This project does not produce software.



    It is SUGGESTED that at least one of the static analysis tools used for the static_analysis criterion include rules or approaches to look for common vulnerabilities in the analyzed language or environment. [static_analysis_common_vulnerabilities]


    All medium and higher severity exploitable vulnerabilities discovered with static code analysis MUST be fixed in a timely way after they are confirmed. [static_analysis_fixed]


    It is SUGGESTED that static source code analysis occur on every commit or at least daily. [static_analysis_often]

  • Dynamic code analysis


    It is SUGGESTED that at least one dynamic analysis tool be applied to any proposed major production release of the software before its release. [dynamic_analysis]

    This project does not produce software.



    It is SUGGESTED that if the software produced by the project includes software written using a memory-unsafe language (e.g., C or C++), then at least one dynamic tool (e.g., a fuzzer or web application scanner) be routinely used in combination with a mechanism to detect memory safety problems such as buffer overwrites. If the project does not produce software written in a memory-unsafe language, choose "not applicable" (N/A). [dynamic_analysis_unsafe]

    This project does not produce software.



    It is SUGGESTED that the project use a configuration for at least some dynamic analysis (such as testing or fuzzing) which enables many assertions. In many cases these assertions should not be enabled in production builds. [dynamic_analysis_enable_assertions]

    This project does not produce software.



    All medium and higher severity exploitable vulnerabilities discovered with dynamic code analysis MUST be fixed in a timely way after they are confirmed. [dynamic_analysis_fixed]

    This project does not produce software.



This data is available under the Creative Commons Attribution version 3.0 or later license (CC-BY-3.0+). All are free to share and adapt the data, but must give appropriate credit. Please credit Emily-Rothenberg and the OpenSSF Best Practices badge contributors.

Project badge entry owned by: Emily-Rothenberg.
Entry created on 2024-08-05 18:42:06 UTC, last updated on 2024-08-06 15:39:08 UTC. Last achieved passing badge on 2024-08-06 15:39:08 UTC.
