Catalyst alternatives and similar packages
Based on the "Machine Learning and Data Science" category.
- ML.NET: Cross-platform open-source machine learning framework that makes machine learning accessible to .NET developers.
- Accord.NET: Machine learning framework combined with audio and image processing libraries (computer vision, computer audition, signal processing and statistics).
- TensorFlow.NET: .NET Standard bindings for Google's TensorFlow for developing, training and deploying machine learning models in C# and F#.
- AForge.NET: Framework for developers and researchers in the fields of computer vision and artificial intelligence (image processing, neural networks, genetic algorithms, machine learning, robotics).
- F# Data: F# type providers for accessing XML, JSON, CSV and HTML files (based on sample documents) and for accessing WorldBank data.
- Deedle: Data frame and (time) series library for exploratory data manipulation with C# and F# support.
- Accord.NET Extensions: Advanced image processing and computer vision algorithms made as fluent extensions.
- numl: Designed to include the most popular supervised and unsupervised learning algorithms while minimizing the friction involved with creating the predictive models.
- Spreads: Series and Panels for Real-time and Exploratory Analysis of Data Streams. Spreads library is optimized for performance and memory usage. It is several times faster than other open source projects.
- Infer.NET: A framework for running Bayesian inference in graphical models. It can also be used for probabilistic programming. [Proprietary] [Free] [Research]
- SciSharp STACK: A rich machine learning ecosystem for .NET created by porting the most popular Python libraries to C#.
README
catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
⚡ Features
- Fast, modern pure-C# NLP library, supporting .NET Standard 2.0
- Cross-platform, runs anywhere .NET Core is supported: Windows, Linux, macOS and even ARM
- Non-destructive tokenization, >99.9% RegEx-free, >1M tokens/s on a modern CPU
- Named Entity Recognition (gazetteer, rule-based & perceptron-based)
- Pre-trained models based on the Universal Dependencies project
- Custom models for learning Abbreviations & Senses
- Out-of-the-box support for training FastText and StarSpace embeddings (pre-trained models coming soon)
- Part-of-speech tagging
- Language detection using FastText or cld3
- Efficient binary serialization based on MessagePack
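
To illustrate what "non-destructive tokenization" in the list above generally means, here is a minimal sketch that keeps tokens as offsets into the original text instead of as copied or normalized strings, so the source text is never altered and can always be recovered. This is not catalyst's implementation: `TokenizeSpans` is a hypothetical helper, and splitting on whitespace is a simplifying assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of catalyst): returns (Start, Length) spans
// that point into the original string, which is left untouched.
List<(int Start, int Length)> TokenizeSpans(string text)
{
    var spans = new List<(int Start, int Length)>();
    int i = 0;
    while (i < text.Length)
    {
        // Skip whitespace; it stays in the source text, so nothing is lost.
        while (i < text.Length && char.IsWhiteSpace(text[i])) i++;
        int start = i;
        // Advance to the end of the current run of non-whitespace characters.
        while (i < text.Length && !char.IsWhiteSpace(text[i])) i++;
        if (i > start) spans.Add((start, i - start));
    }
    return spans;
}

string source = "The quick  brown fox"; // note the double space
var tokens = TokenizeSpans(source);
foreach (var (start, length) in tokens)
{
    // Each token is read back from the original text through its span.
    Console.WriteLine($"[{start}, {start + length}) {source.Substring(start, length)}");
}
```

Because only offsets are stored, details such as the double space between "quick" and "brown" survive tokenization and remain available to later processing steps.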
✨ Getting Started
Using catalyst is as simple as installing its NuGet Package and setting the storage to use our online repository. This way, models will be lazy-loaded, either from disk or by downloading them from our online repository. Also check out some of the sample projects for more examples of how to use catalyst.
```csharp
using System;
using Catalyst;
using Mosaik.Core;

// Use a local disk folder for models, backed by the online repository
Storage.Current = new OnlineRepositoryStorage(new DiskStorage("catalyst-models"));
var nlp = await Pipeline.ForAsync(Language.English);
var doc = new Document("The quick brown fox jumps over the lazy dog", Language.English);
nlp.ProcessSingle(doc);
Console.WriteLine(doc.ToJson());
```
You can also take advantage of C#'s lazy evaluation and native multi-threading support to process a large number of documents in parallel:
```csharp
var docs = GetDocuments();
var parsed = nlp.Process(docs);
DoSomething(parsed);

IEnumerable<IDocument> GetDocuments()
{
    // Generates a few documents, to demonstrate multi-threading & lazy evaluation
    for (int i = 0; i < 1000; i++)
    {
        yield return new Document("The quick brown fox jumps over the lazy dog", Language.English);
    }
}

void DoSomething(IEnumerable<IDocument> docs)
{
    foreach (var doc in docs)
    {
        Console.WriteLine(doc.ToJson());
    }
}
```
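
The lazy-evaluation half of the example above can be seen in isolation with plain strings and no catalyst dependency. The following is a minimal sketch: `GetTexts` and `ProcessTexts` are hypothetical stand-ins for `GetDocuments()` and `nlp.Process()`, and the `produced` counter exists only to show when items are actually generated. It does not demonstrate multi-threading.

```csharp
using System;
using System.Collections.Generic;

int produced = 0;

// Hypothetical stand-in for GetDocuments(): yields items one at a time.
IEnumerable<string> GetTexts(int count)
{
    for (int i = 0; i < count; i++)
    {
        produced++; // counts how many items were really generated
        yield return $"document {i}";
    }
}

// Hypothetical stand-in for nlp.Process(): transforms each item on demand.
IEnumerable<string> ProcessTexts(IEnumerable<string> texts)
{
    foreach (var text in texts)
    {
        yield return text.ToUpperInvariant();
    }
}

// Building the pipeline runs no iterator code yet.
var pipeline = ProcessTexts(GetTexts(1000));
Console.WriteLine(produced); // prints 0

// Consume only the first three results, then stop.
int seen = 0;
foreach (var text in pipeline)
{
    Console.WriteLine(text);
    if (++seen == 3) break;
}
Console.WriteLine(produced); // prints 3
```

Only the three consumed items are ever generated, which is why a sequence of documents can be passed along without first materializing it in memory.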
Training a new FastText word2vec embedding model is as simple as this:
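```csharp
var nlp = await Pipeline.ForAsync(Language.English);
var ft = new FastText(Language.English, 0, "wiki-word2vec");
ft.Data.Type = FastText.ModelType.CBow;
ft.Data.Loss = FastText.LossType.NegativeSampling;
// GetDocuments() is the document generator from the previous example
ft.Train(nlp.Process(GetDocuments()));
// Await the store operation so the model is fully written before continuing
await ft.StoreAsync();
```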
For fast embedding search, we have also released a C# version of the "Hierarchical Navigable Small World" (HNSW) algorithm on NuGet, based on our fork of Microsoft's HNSW.Net. We have also released a C# version of the "Uniform Manifold Approximation and Projection" (UMAP) algorithm for dimensionality reduction on GitHub and on NuGet.
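
For context on what embedding search computes, here is a minimal sketch of exact nearest-neighbour lookup by cosine similarity. HNSW is an approximate method that avoids this linear scan over every stored vector. The sketch does not use the HNSW package's API: `Cosine` and `Nearest` are hypothetical helpers, and the two-dimensional vectors are made-up illustration data.

```csharp
using System;

// Hypothetical helper: cosine similarity of two equal-length, non-zero vectors.
double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical helper: index of the stored vector most similar to the query,
// found by comparing the query against every vector (O(n) per query).
int Nearest(float[][] vectors, float[] query)
{
    int best = -1;
    double bestScore = double.NegativeInfinity;
    for (int i = 0; i < vectors.Length; i++)
    {
        double score = Cosine(vectors[i], query);
        if (score > bestScore)
        {
            bestScore = score;
            best = i;
        }
    }
    return best;
}

// Made-up embeddings for illustration only.
var embeddings = new[]
{
    new float[] { 1f, 0f },
    new float[] { 0f, 1f },
    new float[] { 0.7f, 0.7f },
};

int index = Nearest(embeddings, new float[] { 0.9f, 0.8f });
Console.WriteLine(index); // prints 2
```

The linear scan is fine for a few thousand vectors; an index such as HNSW becomes worthwhile when the collection is large and a small loss of exactness is acceptable.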
📖 Documentation (coming soon)
Documentation | |
---|---|
Getting Started | How to use catalyst and its features. |
API Reference | The detailed reference for catalyst's API. |
Contribute | How to contribute to the catalyst codebase. |
Samples | Sample projects demonstrating catalyst capabilities. |

Join our Gitter channel.