Skip to content

Fuzzy Prefix Search written in Rust

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

tpisto/fuzzy_prefix_search

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

41 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

github crates.io docs.rs tests

Fuzzy Prefix Search

Flexible Trie implementation in Rust for fuzzy prefix string searching and auto-completion.

Documentation:

Features

  • Prefix-based fuzzy searching (Levenshtein distance)
  • Fuzzy search with customizable edit distance
  • Multiple data associations per word
  • Jaro-Winkler similarity scoring for search results
  • Thread safe
  • No dependencies

Installation

Add this to your Cargo.toml:

[dependencies]
fuzzy_prefix_search = "0.2"

Usage

Here's a quick example of how to use the Fuzzy Prefix Search:

use fuzzy_prefix_search::Trie;

fn main() {
    let mut trie = Trie::new();

    // Insert words with associated data
    trie.insert("apple", 1);
    trie.insert("application", 2);
    trie.insert("appetite", 3);

    // Perform a fuzzy search
    let results = trie.search_within_distance("appl", 1);

    for result in results {
        println!("Word: {}, Data: {:?}", result.word, result.data);
    }

    // Search with similarity scoring
    let scored_results = trie.search_within_distance_scored("aple", 2);

    for result in scored_results {
        println!("Word: {}, Data: {:?}, Score: {}", result.word, result.data, result.score);
    }
}

Advanced Usage

Custom Data Types

The Trie supports any data type that implements Clone, PartialEq, Eq, and Hash:

#[derive(Clone, PartialEq, Eq, Hash)]
struct CustomData {
    id: u32,
    value: String,
}

let mut trie = Trie::new();
trie.insert("example", CustomData { id: 1, value: "Test".to_string() });

Removing Data

You can remove all occurrences of a specific data value:

trie.remove_all(&2);

Performance

  • O(k) time complexity for insertion, where k is the length of the word
  • Space-efficient storage using a tree structure with shared prefixes
  • TODO: Benchmarks, optimizations, algorithm selection...

Caveat Emptor: we use unsafe in deletes for 2x read performance compared to Rc/RefCell approach.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Resources

License

This project is licensed under the MIT or Apache 2.0 License - see the LICENSE files for details.

About

Fuzzy Prefix Search written in Rust

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages