thecrowler

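Projects that follow the best practices below can voluntarily self-certify and show that they have achieved an Open Source Security Foundation (OpenSSF) best practices badge.

If this is your project, please show your badge status on your project page. Badge status: the badge level for project 8344 is passing.

These are the Passing level criteria. You can also view the Silver or Gold level criteria.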

        

 Basics 13/13

  • Identification

    The CROWler is a specialized web crawler developed to efficiently navigate and index web pages. This tool leverages the robust capabilities of Selenium and Google Chrome (to covertly crawl a site), offering a reliable and precise crawling experience. It is designed with user customization in mind, allowing users to specify the scope and targets of their crawling tasks.

    What programming language(s) are used to implement the project?
  • Basic project website content


    The project website MUST succinctly describe what the software does (what problem does it solve?). [description_good]

    The repository README.md states:

    What is it? The CROWler is a specialized web crawler developed to efficiently navigate and index web pages. This tool leverages the robust capabilities of Selenium and Google Chrome (to covertly crawl a site), offering a reliable and precise crawling experience. It is designed with user customization in mind, allowing users to specify the scope and targets of their crawling tasks.

    To enhance its functionality, CROWler includes a suite of command-line utilities. These utilities facilitate seamless management of the crawler's database, enabling users to effortlessly add or remove websites from the Sources list. Additionally, the system is equipped with an API, providing a streamlined interface for database queries. This feature ensures easy integration and access to indexed data for various applications.

    Please check: https://github.com/pzaino/thecrowler#the-crowler



    The project website MUST provide information on how to obtain the software, provide feedback (as bug reports or enhancements), and contribute to it. [interact]

    CONTRIBUTING.md has this section:

    "Report bugs using GitHub's issues": We use GitHub issues to track public bugs. Report a bug by opening a new issue; it's that easy!

    Please check: https://github.com/pzaino/thecrowler/blob/main/CONTRIBUTING.md



    The information on how to contribute MUST explain the contribution process (e.g., are pull requests used?). (URL required) [contribution]

    Non-trivial contribution file in repository: https://github.com/pzaino/thecrowler/blob/main/CONTRIBUTING.md.



    The information on how to contribute SHOULD include the requirements for acceptable contributions (e.g., a reference to any required coding standard). (URL required) [contribution_requirements]

    The CONTRIBUTING.md file states the coding style we use:

    Use a Consistent Coding Style: 4 spaces for indentation rather than tabs. You can try running gofmt for style unification.

    Please check: https://github.com/pzaino/thecrowler/blob/main/CONTRIBUTING.md


  • FLOSS license

    What license(s) is the project released under?



    The software produced by the project MUST be released as FLOSS. [floss_license]

    The Apache-2.0 license is approved by the Open Source Initiative (OSI).



    It is SUGGESTED that any required license(s) for the software produced by the project be approved by the Open Source Initiative (OSI). [floss_license_osi]

    The Apache-2.0 license is approved by the Open Source Initiative (OSI).



    The project MUST post its license(s) in a standard location in its source repository. (URL required) [license_location]

    Non-trivial license location file in repository: https://github.com/pzaino/thecrowler/blob/main/LICENSE.


  • Documentation


    The project MUST provide basic documentation for the software produced by the project. [documentation_basics]

    Some documentation basics file contents found.



    The project MUST provide reference documentation that describes the external interface (both input and output) of the software produced by the project. [documentation_interface]

    The project provides comprehensive documentation on how to install, build from source, and use the software.

    The README.md explains how to build, install, and configure it, while the doc section shows how to use it.

    Link to README.md: https://github.com/pzaino/thecrowler#the-crowler
    Link to full documentation: https://github.com/pzaino/thecrowler/tree/main/doc
    Specifically on how to use: https://github.com/pzaino/thecrowler/blob/main/doc/usage.md


  • Other


    The project sites (website, repository, and download URLs) MUST support HTTPS using TLS. [sites_https]

    Given only https: URLs.



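    The project MUST have one or more mechanisms for discussion (including proposed changes and issues) that are searchable, allow messages and topics to be addressed by URL, enable new people to participate in some of the discussions, and do not require client-side installation of proprietary software. [discussion]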

    GitHub supports discussions on issues and pull requests.



    The project SHOULD provide documentation in English and be able to accept bug reports and comments about code in English. [english]

    All documentation linked above is in English and has been reviewed for clarity to the best of our abilities. Bug reports are accepted through GitHub Issues.



    The project MUST be maintained. [maintained]





  • Public version-controlled source repository


    The project MUST have a version-controlled source repository that is publicly readable and has a URL. [repo_public]

    Repository on GitHub, which provides public git repositories with URLs.



    The project's source repository MUST track what changes were made, who made the changes, and when the changes were made. [repo_track]

    Repository on GitHub, which uses git. git can track the changes, who made them, and when they were made.



    To enable collaborative review, the project's source repository MUST include interim versions for review between releases; it MUST NOT include only final releases. [repo_interim]

    The project includes interim releases (and pre-releases for testing) and has an official Develop branch for code that needs human testing in addition to automated testing. The repository also runs many automated quality tests, the code comes with unit tests (mandatory for new features as well), and coding style is checked automatically.



    It is SUGGESTED that common distributed version control software (e.g., git) be used for the project's source repository. [repo_distributed]

    Repository on GitHub, which uses git. git is distributed.


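  • Unique version numbering


    The project results MUST have a unique version identifier for each release intended to be used by users. [version_unique]

    The project uses unique version numbers, and release candidates (RCs) are versioned as well when available.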



    It is SUGGESTED that the Semantic Versioning (SemVer) format be used for releases. [version_semver]


    It is SUGGESTED that projects identify each release within their version control system. For example, it is SUGGESTED that those using git identify each release using git tags. [version_tags]

    Developers and contributors open their PRs against the Develop branch, not Main (Main is protected and requires an approved PR from Develop). PRs against Develop are tested automatically and reviewed by humans before the code is merged into Develop; the same applies to code merged from Develop into Main. Developers and contributors are also required to install pre-commit, which is configured to run many checks (including all unit tests) at every git commit. Once code has passed all automated tests and human reviews and has been merged into Main, an RC tag is issued for testing on third-party systems. An RC lasts one month, to give users time to test. If no issues are reported, Main is released again with the final release number.

    Checks in pre-commit: https://github.com/pzaino/thecrowler/blob/main/.pre-commit-config.yaml
    Tags: https://github.com/pzaino/thecrowler/tags

    The project is new, so no official releases have been issued yet.
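
    The tag scheme described above (version tags such as v0.9.3, with RC tags preceding the final release) can be checked mechanically. The following is a minimal sketch in Go, not code from the project: the names Version, ParseTag and IsFinal are hypothetical, and the "-rc.1" suffix shape is an assumption, since the exact RC tag format is not stated here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Version holds the parts of a tag such as "v0.9.3" or "v0.9.3-rc.1".
// The type and its fields are illustrative, not taken from the project.
type Version struct {
	Major, Minor, Patch int
	PreRelease          string // empty for a final release
}

// tagPattern accepts an optional leading "v" (common in git tags) and an
// optional pre-release suffix after a hyphen.
var tagPattern = regexp.MustCompile(`^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$`)

// ParseTag converts a git tag into a Version, or reports why the tag is
// not in MAJOR.MINOR.PATCH form.
func ParseTag(tag string) (Version, error) {
	m := tagPattern.FindStringSubmatch(tag)
	if m == nil {
		return Version{}, fmt.Errorf("tag %q is not in MAJOR.MINOR.PATCH form", tag)
	}
	var n [3]int
	for i := range n {
		x, err := strconv.Atoi(m[i+1])
		if err != nil {
			return Version{}, fmt.Errorf("tag %q: %w", tag, err)
		}
		n[i] = x
	}
	return Version{Major: n[0], Minor: n[1], Patch: n[2], PreRelease: m[4]}, nil
}

// IsFinal reports whether the version carries no pre-release suffix.
func (v Version) IsFinal() bool { return v.PreRelease == "" }

func main() {
	for _, tag := range []string{"v0.9.3", "v0.9.3-rc.1", "release-1"} {
		v, err := ParseTag(tag)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s -> %+v final=%t\n", tag, v, v.IsFinal())
	}
}
```

    A check like this could run in CI before a tag is published, so that a malformed tag is caught before users see it.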


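  • Release notes


    The project MUST provide, in each release, release notes that are a human-readable summary of major changes in that release to help users determine if they should upgrade and what the upgrade impact will be. The release notes MUST NOT be the raw output of a version control log (e.g., the "git log" command results are not release notes). Projects whose results are not intended for reuse in multiple locations (such as the software for a single website or service) and that employ continuous delivery MAY select "N/A". (URL required) [release_notes]

    Each release (and pre-release) has a fully detailed release note: https://github.com/pzaino/thecrowler/releases/tag/v0.9.3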

    Full changelog example: https://github.com/pzaino/thecrowler/compare/v0.9.2...v0.9.3



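    The release notes MUST list every publicly known vulnerability fixed in each new release. Select "N/A" if there are no release notes or there have been no publicly known vulnerabilities. [release_notes_vulns]

    The release notes will include all publicly known run-time vulnerabilities once we start releasing production versions.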


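  • Bug-reporting process


    The project MUST provide a process for users to submit bug reports (e.g., using an issue tracker or a mailing list). (URL required) [report_process]

    The project MUST use an issue tracker for tracking individual issues. [report_tracker]

    We use GitHub issue numbers to track each individual issue.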



    The project MUST acknowledge a majority of bug reports submitted in the last 2-12 months (inclusive); the response need not include a fix. [report_responses]

    We aim to keep all reported issues public (users report them through GitHub), and we acknowledge each one as soon as we are able to reproduce it, which can be within minutes.



    The project SHOULD respond to a majority (>50%) of enhancement requests in the last 2-12 months (inclusive). [enhancement_responses]

    We do our best to respond to enhancement requests as quickly as possible, but the project is fully free, so the people involved must attend to their day jobs first and work on the project in the time that remains.



    The project MUST have a publicly available archive for reports and responses for later searching. (URL required) [report_archive]

    We use GitHub Issues and Discussions, so everything is public: https://github.com/pzaino/thecrowler/issues?q=is%3Aissue+is%3Aclosed


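  • Vulnerability report process


    The project MUST publish the process for reporting vulnerabilities on the project site. (URL required) [vulnerability_report_process]

    If private vulnerability reports are supported, the project MUST include how to send the information in a way that is kept private. (URL required) [vulnerability_report_private]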

    Private vulnerability reports are supported and they can be submitted via email to the project author.



    The project's initial response time for any vulnerability report received in the last 6 months MUST be less than or equal to 14 days. [vulnerability_report_response]

    We check the project's GitHub and email every day, so every reported vulnerability will be looked at within 14 working days.


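  • Working build system


    If the software produced by the project requires building for use, the project MUST provide a working build system that can automatically rebuild the software from source code. [build]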

    The project provides a working build system to rebuild the entire set of micro-services at once:

    https://github.com/pzaino/thecrowler/blob/main/docker-rebuild.sh



    It is SUGGESTED that common tools be used for building the software. [build_common_tools]

    The project uses Docker and Docker Compose to create containerized builds of all required components. It also offers build scripts that add platform detection to help Docker pull or build the appropriate containers.

    https://github.com/pzaino/thecrowler/blob/main/docker-compose.yml

    https://github.com/pzaino/thecrowler/blob/main/docker-build.sh



    The project SHOULD be buildable using only FLOSS tools. [build_floss_tools]

    The project is buildable using only FLOSS tools; building it requires only Go. The database layer is based on PostgreSQL and SQLite, and components are packaged into Docker images at build time using freely available tools. It works with the Docker packages provided by Linux distributions (that is, it does not require installing Docker from docker.io).


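  • Automated test suite


    The project MUST use at least one automated test suite that is publicly released as FLOSS (this test suite may be maintained as a separate FLOSS project). [test]

    A test suite SHOULD be invocable in a standard way for that language. [test_invocation]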

    A user can run go test to run unit tests locally on their system and at any time.



    It is SUGGESTED that the test suite cover most (or ideally all) of the code branches, input fields, and functionality. [test_most]

    The test suite as a whole covers everything, and we are also adding system tests to measure performance under stress.



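    It is SUGGESTED that the project implement continuous integration (where new or changed code is frequently integrated into a central code repository and automated tests are run on the result). [test_continuous_integration]

    The project supports continuous integration. Containers can also be built offline and used to replace existing containers.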


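  • New functionality testing


    The project MUST have a general policy (formal or not) that as major new functionality is added to the software produced by the project, tests of that functionality should be added to an automated test suite. [test_policy]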

    From the CONTRIBUTING.md file:

    "If you've added code that should be tested, add tests."

    link here: https://github.com/pzaino/thecrowler/blob/main/CONTRIBUTING.md



    The project MUST have evidence that the test_policy for adding tests has been adhered to in the most recent major changes to the software produced by the project. [tests_are_added]

    It is SUGGESTED that this policy on adding tests (see test_policy) be documented in the instructions for change proposals. [tests_documented_added]

    It is:

    "If you've added code that should be tested, add tests. For more information on testing, see Test Policy for TheCROWler."

    From the CONTRIBUTING.md file: https://github.com/pzaino/thecrowler/blob/main/CONTRIBUTING.md


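  • Warning flags


    The project MUST enable one or more compiler warning flags, a "safe" language mode, or use a separate "linter" tool to look for code quality errors or common simple mistakes, if there is at least one FLOSS tool that can implement this criterion in the selected language. [warnings]

    The project is written in Go and formatted with gofmt. pre-commit hooks prevent a developer from pushing code to the develop branch when checks fail. Tests are executed even when the code compiles cleanly, and all Go warnings are enabled.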



    The project MUST address warnings. [warnings_fixed]

    As mentioned, we do address warnings.



    It is SUGGESTED that projects be maximally strict with warnings in the software produced by the project, where practical. [warnings_strict]

    We use Go and gofmt, as well as:
    - go-fmt
    - no-go-testing
    - golangci-lint
    - go-unit-tests

    https://github.com/pzaino/thecrowler/blob/main/.pre-commit-config.yaml


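  • Secure development knowledge


    The project MUST have at least one primary developer who knows how to design secure software. [know_secure_design]

    We have a primary developer who has completed multiple courses and has more than 30 years of experience in coding and secure coding.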



    At least one of the project's primary developers MUST know of common kinds of errors that lead to vulnerabilities in this kind of software, as well as at least one method to counter or mitigate each of them. [know_common_errors]

    Our primary developer works in the cybersecurity field, has experience designing and developing IDS, IPS, firewall, and antivirus software, and deals daily with vulnerabilities, CVEs, CWEs, and CPEs.


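  • Use basic good cryptographic practices

    Note that some software does not need to use cryptographic mechanisms.

    The software produced by the project MUST use, by default, only cryptographic protocols and algorithms that are publicly published and reviewed by experts (if cryptographic protocols and algorithms are used). [crypto_published]


    If the software produced by the project is an application or library, and its primary purpose is not to implement cryptography, then it SHOULD only call on software specifically designed to implement cryptographic functions; it SHOULD NOT re-implement its own. [crypto_call]

    The software uses standard HTTPS for its few cryptographic elements, relying on the standard Go libraries.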



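    All functionality in the software produced by the project that depends on cryptography MUST be implementable using FLOSS. [crypto_floss]


    The security mechanisms within the software produced by the project MUST use default key lengths that at least meet the NIST minimum requirements through the year 2030 (as stated in 2012). It MUST be possible to configure the software so that smaller key lengths are completely disabled. [crypto_keylength]


    The default security mechanisms within the software produced by the project MUST NOT depend on broken cryptographic algorithms (e.g., MD4, MD5, single DES, RC4, Dual_EC_DRBG) or use cipher modes that are inappropriate to the context (e.g., ECB mode is almost never appropriate because it reveals identical blocks within the ciphertext, as demonstrated by the ECB penguin, and CTR mode is often inappropriate because it does not perform authentication and causes duplicates if the input state is repeated). [crypto_working]


    The default security mechanisms within the software produced by the project SHOULD NOT depend on cryptographic algorithms or modes with known serious weaknesses (e.g., the SHA-1 cryptographic hash algorithm or the CBC mode in SSH). [crypto_weaknesses]


    The security mechanisms within the software produced by the project SHOULD implement perfect forward secrecy (PFS) for key agreement protocols, so that a session key derived from a set of long-term keys cannot be compromised if one of the long-term keys is compromised in the future. [crypto_pfs]


    If the software produced by the project causes the storing of passwords for authentication of external users, the passwords MUST be stored as iterated hashes with a per-user salt by using a key stretching (iterated) algorithm (e.g., PBKDF2, Bcrypt, or Scrypt). [crypto_password_storage]


    The security mechanisms within the software produced by the project MUST generate all cryptographic keys and nonces using a cryptographically secure random number generator, and MUST NOT do so using generators that are cryptographically insecure. [crypto_random]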
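
    In Go, the random-number criterion above amounts to drawing key and nonce material from crypto/rand rather than math/rand. A minimal sketch follows; the helper name randomToken is hypothetical and the example is not taken from the project.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomToken returns n bytes read from the operating system's
// cryptographically secure random source, hex-encoded (2*n characters).
func randomToken(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	token, err := randomToken(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("token length:", len(token))
}
```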

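  • Secured delivery against man-in-the-middle (MITM) attacks


    The project MUST use a delivery mechanism that counters MITM attacks. Using https or ssh+scp is acceptable. [delivery_mitm]


    A cryptographic hash (e.g., a sha1sum) MUST NOT be retrieved over http and used without checking for a cryptographic signature. [delivery_unsigned]

  • Publicly known vulnerabilities fixed


    There MUST be no unpatched vulnerabilities of medium or higher severity that have been publicly known for more than 60 days. [vulnerabilities_fixed_60_days]

    We also use a vulnerability bot that automatically opens pull requests with updated dependencies when a dependency is found to have a vulnerability.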



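    Projects SHOULD fix all critical vulnerabilities rapidly after they are reported. [vulnerabilities_critical_fixed]

  • Other security issues


    The public repositories MUST NOT leak a valid private credential (e.g., a working password or private key) that is intended to limit public access. [no_leaked_credentials]

    We check for private credentials on the developer's local system using pre-commit, and check again on GitHub.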
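
    To give a sense of what a credential check of this kind looks for, the sketch below flags text containing a PEM private key header. It is a simplified illustration, not the hook the project actually runs: the function name containsPrivateKey and the pattern are assumptions, and real scanners cover many more credential types.

```go
package main

import (
	"fmt"
	"regexp"
)

// privateKeyHeader matches PEM headers such as "-----BEGIN RSA PRIVATE KEY-----",
// "-----BEGIN OPENSSH PRIVATE KEY-----" or "-----BEGIN PRIVATE KEY-----".
var privateKeyHeader = regexp.MustCompile(`-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----`)

// containsPrivateKey reports whether content includes a PEM private key header.
func containsPrivateKey(content string) bool {
	return privateKeyHeader.MatchString(content)
}

func main() {
	samples := map[string]string{
		"config.yaml": "database:\n  host: localhost\n",
		"id_rsa":      "-----BEGIN RSA PRIVATE KEY-----\n(key material elided)\n",
	}
	for name, content := range samples {
		fmt.Printf("%s: private key found = %t\n", name, containsPrivateKey(content))
	}
}
```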


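  • Static code analysis


    At least one static code analysis tool MUST be applied to any proposed major production release of the software before its release, if there is at least one FLOSS tool that implements this criterion in the selected language. [static_analysis]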

    SonarQube (locally), CodeQL, Codacy on GitHub.com



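    It is SUGGESTED that at least one of the static analysis tools used for the static_analysis criterion include rules or approaches to look for common vulnerabilities in the analyzed language or environment. [static_analysis_common_vulnerabilities]


    All medium and higher severity exploitable vulnerabilities discovered with static code analysis MUST be fixed in a timely way after they are confirmed. [static_analysis_fixed]


    It is SUGGESTED that static source code analysis occur on every commit or at least daily. [static_analysis_often]

    We also use SonarLint in VS Code, so security issues are detected in real time as we write code. We check again with sonar-scanner before each commit, and with CodeQL and Codacy at every commit on GitHub.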


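  • Dynamic code analysis


    It is SUGGESTED that at least one dynamic analysis tool be applied to any proposed major production release of the software before its release. [dynamic_analysis]


    It is SUGGESTED that if the software produced by the project includes software written using a memory-unsafe language (e.g., C or C++), then at least one dynamic tool (e.g., a fuzzer or web application scanner) be routinely used in combination with a mechanism to detect memory safety problems such as buffer overwrites. If the project does not produce software written in a memory-unsafe language, choose "not applicable" (N/A). [dynamic_analysis_unsafe]

    The project is written entirely in Go, so it contains no memory-unsafe code.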



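    It is SUGGESTED that the software produced by the project include many run-time assertions that are checked during dynamic analysis. [dynamic_analysis_enable_assertions]


    All medium and higher severity exploitable vulnerabilities discovered with dynamic code analysis MUST be fixed in a timely way after they are confirmed. [dynamic_analysis_fixed]

    Not only do we fix them; we also build hardened containers, and code runs under low-privilege accounts inside them. Where applicable, containers also have a read-only filesystem enabled.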



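This data is available under the Creative Commons Attribution version 3.0 or later license (CC-BY-3.0+). All are free to share and adapt the data, but must give appropriate credit. Please credit Paolo Fabio Zaino and the OpenSSF Best Practices badge contributors.

Project badge entry owned by: Paolo Fabio Zaino.
Entry created on 2024-01-25 23:46:04 UTC, last updated on 2024-06-04 01:29:01 UTC. Last achieved passing badge on 2024-06-04 01:29:01 UTC.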
