New ask Hacker News story: Ask HN: Semantic Vector Searching in WASM?
2 by jzombie | 0 comments on Hacker News.
I'm working on a simple markdown-driven CMS, in Python, that outputs static HTML files that can be served directly from GitHub (as an example). I'm interested in implementing local-only search of this content, w/o using a backend, and preferably w/o cobbling together some NLP algorithms in JavaScript. I've also been training ML models for other use cases and experimenting w/ vector databases. I started looking into creating embeddings from my markdown content using Python, potentially using word2vec or finalfusion to help me get those embeddings into a Rust environment that I can compile into WASM, and then using cosine similarity for my search. Now, I have neither Rust nor WASM experience, but that doesn't really deter me much. I'm just curious whether the overhead (of shipping the embeddings and the WASM module to the browser) would be significant, or whether I should skip the semantic embeddings and use a different type of vector representation such as TF-IDF.
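For reference, the TF-IDF alternative can be roughly sketched in plain Python at build time. This is only a minimal illustration under a few assumptions: the function names (`tokenize`, `build_index`, `search`) are hypothetical, the tokenizer is a naive regex split, and the smoothed IDF formula is one common variant rather than the only choice. The ranking step is the same cosine similarity that would apply to semantic embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector (dict), ignoring terms not in the index."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    # Smoothed IDF (an assumption): terms in every document keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [vectorize(tokens, idf) for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, idf, vectors, top_k=3):
    """Return up to top_k (doc_index, score) pairs with a nonzero score."""
    q = vectorize(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return [(i, s) for s, i in scored[:top_k] if s > 0.0]
```

Since `idf` and `vectors` are plain dicts and lists, they could in principle be dumped to JSON alongside the static HTML, leaving only `vectorize` and `cosine` to be reimplemented on the client, whether in JavaScript or in Rust compiled to WASM.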