
Robots Exclusion Tools

A "robots.txt" parsing and querying library for .NET

Closely following the NoRobots RFC and other details on robotstxt.org.

Features

  • Load robots.txt by string, by URI (async) or by stream (async)
  • Supports multiple user-agents and "*"
  • Supports Allow and Disallow
  • Supports Crawl-delay entries
  • Supports Sitemap entries
  • Supports wildcard paths (*) as well as must-end-with declarations ($) (see the sketch after this list)
  • Built-in "robots.txt" tokenization system (allowing extension to support other custom fields)
  • Built-in "robots.txt" validator (allowing a tokenized file to be validated)
  • Dedicated parser for the data from the <meta name="robots" /> tag and the X-Robots-Tag header
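
A minimal sketch of how the wildcard and must-end-with declarations behave when queried. The robots.txt content in the comments and the example.org URLs are illustrative only; FromUriAsync and IsAllowedAccess are the same calls shown in the Example Usage section below, and the expected results in the comments assume standard wildcard semantics.

using TurnerSoftware.RobotsExclusionTools;

// Suppose http://www.example.org/robots.txt contains:
//   User-agent: *
//   Disallow: /private/*
//   Disallow: /*.pdf$
var robotsFileParser = new RobotsFileParser();
RobotsFile robotsFile = await robotsFileParser.FromUriAsync(new Uri("http://www.example.org/robots.txt"));

// Wildcard path: anything under /private/ is blocked
robotsFile.IsAllowedAccess(new Uri("http://www.example.org/private/report"), "MyUserAgent"); // false

// Must-end-with: the second rule only blocks URLs whose path ends in ".pdf"
robotsFile.IsAllowedAccess(new Uri("http://www.example.org/files/guide.pdf"), "MyUserAgent"); // false
robotsFile.IsAllowedAccess(new Uri("http://www.example.org/files/guide.pdf.html"), "MyUserAgent"); // true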

Licensing and Support

Robots Exclusion Tools is licensed under the MIT license. It is free to use in personal and commercial projects.

There are support plans available that cover all active Turner Software OSS projects. Support plans provide private email support, expert usage advice for our projects, priority bug fixes and more. These support plans help fund our OSS commitments to provide better software for everyone.

NoRobots RFC Compatibility

This library attempts to stick closely to the rules defined in the RFC document, including:

  • Global/any user-agent when none is explicitly defined (Section 3.2.1 of the RFC)
  • Field names (e.g. "User-agent") are character restricted (Section 3.3)
  • Allow/Disallow rules are applied in order of occurrence (Section 3.2.2), as illustrated below
  • Loading by URI applies default rules based on access to "robots.txt" (Section 3.1)
  • Interoperability for varying line endings (Section 5.2)
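
A rough illustration of the global user-agent and order-of-occurrence behaviour, using only the calls shown later in Example Usage. The robots.txt content in the comments is hypothetical, and the expected results follow the RFC's order-of-occurrence rule.

using TurnerSoftware.RobotsExclusionTools;

// Suppose http://www.example.org/robots.txt contains a single global ("*") group:
//   User-agent: *
//   Allow: /public/special-page
//   Disallow: /public/
var parser = new RobotsFileParser();
RobotsFile robotsFile = await parser.FromUriAsync(new Uri("http://www.example.org/robots.txt"));

// "AnyBot" has no group of its own, so the "*" group applies.
// Rules are checked in order of occurrence, so the earlier Allow wins for the one page.
robotsFile.IsAllowedAccess(new Uri("http://www.example.org/public/special-page"), "AnyBot"); // true
robotsFile.IsAllowedAccess(new Uri("http://www.example.org/public/other-page"), "AnyBot");   // false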

Tokenization & Validation

At the core of the library is a tokenization system that parses the file format. It follows the formal syntax rules defined in Section 3.3 of the NoRobots RFC, including which characters are valid. When used in conjunction with the token validator, it can enforce the correct token structure too.

The major benefit of designing the library around this system is that it allows for greater extensibility. If you want to support custom fields that the core RobotsFile class doesn't use, you can parse the data with the tokenizer yourself.
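
The tokenizer's own API isn't shown here, but the idea it enables can be sketched with a hand-rolled scan for a hypothetical custom field. The "Host" field and the string handling below are purely illustrative; the built-in tokenizer and validator provide a character-restricted, validated version of this same field/value splitting.

using System;
using System.Linq;

// Illustrative only: pull a custom "Host" field out of raw robots.txt text.
var robotsText = "User-agent: *\nDisallow: /private/\nHost: www.example.org\n";

var hostValues = robotsText
    .Split('\n')
    .Select(line => line.Split('#')[0].Trim())  // strip comments and surrounding whitespace
    .Where(line => line.StartsWith("Host:", StringComparison.OrdinalIgnoreCase))
    .Select(line => line.Substring("Host:".Length).Trim())
    .ToList();

Console.WriteLine(hostValues.FirstOrDefault()); // www.example.org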

Parsing in-request robots rules (metatags and header)

Similar to the rules in a "robots.txt" file, there can be in-request rules that decide whether a page may be indexed or its links followed. Extracting this data from a request isn't currently part of this library, which avoids taking a dependency on an HTML parser.

If you extract the raw rules from the metatags and the X-Robots-Tag header, you can pass them into the parser. The parser takes an array of rules and returns a RobotsPageDefinition object, which allows the rules to be queried by user agent.
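
For example, the raw header rules can be collected with HttpClient and handed straight to the parser. The URL below is illustrative, and gathering the metatag content from the HTML is left to whichever HTML parser you already use; RobotsPageParser.FromRules is the same call shown in the Example Usage section.

using System;
using System.Linq;
using System.Net.Http;
using TurnerSoftware.RobotsExclusionTools;

using var httpClient = new HttpClient();
using var response = await httpClient.GetAsync("http://www.example.org/some-page");

// Each X-Robots-Tag header value is one rule string, e.g. "noindex, nofollow"
// or "googlebot: none". Rules extracted from metatags would be added to this list too.
var pageRules = response.Headers.TryGetValues("X-Robots-Tag", out var headerValues)
    ? headerValues.ToArray()
    : Array.Empty<string>();

var robotsPageParser = new RobotsPageParser();
RobotsPageDefinition robotsPageDefinition = robotsPageParser.FromRules(pageRules);

robotsPageDefinition.CanIndex("MyUserAgent"); // depends on the rules the page returned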

Like the RobotsFileParser, this parser is built around the tokenization and validation system and is similarly extendable.

There is no RFC defining the format of metatag or X-Robots-Tag data. The parser follows the base formatting rules for fields described in the NoRobots RFC, combined with the rules from Google's documentation on the robots metatag. There are ambiguities in the rules described there (such as whether rules are inherited from the global scope), so the behaviour may differ from other implementations.

Example Usage

Parsing a "robots.txt" file from URI

using TurnerSoftware.RobotsExclusionTools;

var robotsFileParser = new RobotsFileParser();
RobotsFile robotsFile = await robotsFileParser.FromUriAsync(new Uri("http://www.example.org/robots.txt"));

var allowedAccess = robotsFile.IsAllowedAccess(
    new Uri("http://www.example.org/some/url/i-want-to/check"),
    "MyUserAgent"
);

Parsing robots data from metatags or the X-Robots-Tag

using TurnerSoftware.RobotsExclusionTools;

//These rules are gathered by you from the Robots metatag and `X-Robots-Tag` header
var pageRules = new[] {
    "noindex, notranslate",
    "googlebot: none",
    "otherbot: nofollow",
    "superbot: all"
};

var robotsPageParser = new RobotsPageParser();
RobotsPageDefinition robotsPageDefinition = robotsPageParser.FromRules(pageRules);

robotsPageDefinition.CanIndex("SomeNotListedBot/1.0"); //False
robotsPageDefinition.CanFollowLinks("SomeNotListedBot/1.0"); //True
robotsPageDefinition.Can("translate", "SomeNotListedBot/1.0"); //False

robotsPageDefinition.CanIndex("GoogleBot/1.0"); //False
robotsPageDefinition.CanFollowLinks("GoogleBot/1.0"); //False
robotsPageDefinition.Can("translate", "GoogleBot/1.0"); //False

robotsPageDefinition.CanIndex("OtherBot/1.0"); //False
robotsPageDefinition.CanFollowLinks("OtherBot/1.0"); //False
robotsPageDefinition.Can("translate", "OtherBot/1.0"); //False

robotsPageDefinition.CanIndex("superbot/1.0"); //True
robotsPageDefinition.CanFollowLinks("superbot/1.0"); //True
robotsPageDefinition.Can("translate", "superbot/1.0"); //True

