Convert to front-end friendly, remove 'fs' #62

knod · 2016-12-11T12:58:02Z

This module could pretty easily be converted to be front end friendly by removing the need for 'fs'. Since the stopwords files are the only things being accessed with 'fs', this could be solved pretty easily.

I created my own solution, but I'm not sure it was particularly elegant. My current thought is to create an object with a key of the language code and a value of the stopwords array. Do you have other thoughts about more elegant solutions?

ageitgey · 2016-12-14T22:47:47Z

Supporting front-end use cases just hasn't ever come up before. I'd be worried that the stopwords data would be too large to be used in most front-end applications if it was bundled in.

knod · 2016-12-15T02:53:49Z

I'll admit I haven't tested it on a throttled system. It did work for me on a MacBook Pro, OSX 10.11, Chrome 55.0.2883.75 with no problems. Used it in the development of a Chrome extension.

The lists aren't actually that big. What I did at the time was turn those stopwords files into js files/node modules and require them in an object. I don't think a plain old object would be any more of a problem than that. We just need to test it on a slower system and see if it becomes an issue.

aknobloch · 2018-02-15T20:56:53Z

I also was looking at node-unfluff for front end bro purposes. The solution offered by @knod works, an alternative could also be to utilize local HTML5 File IO when applicable, and fallback to fs when the browser does not support HTML5. This would leave the current structure untouched, and front-end users could work around that limitation in WebPack by setting up their configuration to supply an empty module for fs, bearing in mind that the extension would not work on browsers that don't support HTML5. But the list of users not on HTML5-compatible browsers is shrinking daily.

I'd be happy to code this and submit a PR, but I also wonder if this would be better suited to its own fork and module. A separate module via npm install unfluff-standalone or something similar.

ageitgey · 2018-02-15T21:14:08Z

If it's a feature you want, feel free to either submit a PR or make a fork. I'm happy with whichever route works best for you. I just don't have the need to use this in a browser myself, so it's not something I ever attempted to support.

aknobloch · 2018-02-15T21:22:45Z

Makes sense - but I notice there's a few PR's that address this issue while maintaining the robustness of the original design. However, these have been there for a while with little activity, so I'm wondering if there is a reason why they aren't being merged. The ideal solution would be to have this base updated with a solution so that separate maintenance on the stopwords files doesn't have to be done.

Thanks for the speedy reply, by the way!

ageitgey · 2018-02-15T21:39:35Z

so I'm wondering if there is a reason why they aren't being merged.

A fair question! I don't have a good answer beyond I thought the existing PRs made the code slightly worse for me and I didn't need the new feature, so I did a bad job of ever getting around to doing anything with it :)

The ideal solution would be to have this base updated with a solution so that separate maintenance on the stopwords files doesn't have to be done.

Sure, that makes sense.

My issues with the current PRs are just that they required loading all stop words for all languages into memory even if when 99.9% of users would only need them for one language. Loading all languages not only would that make memory usage higher on the server side (100KB per process), but it seemed like a pretty bad solution for delivering the library to users client side. Is it really worth adding 100KB of useless data to to every page load for ever user for no reason?

I imagine there might be a more clever solution where you pre-generate a separate js bundle file per language or something like that. I just never looked into it.

But that's the honest answer - I didn't love the current solutions but I also didn't spend any time coming up with a better answer either. But I'm totally open to discussing it or revisiting my opinion or whatever.

aknobloch · 2018-02-15T21:46:41Z

Gotcha, that makes perfect sense. And thanks for the explanation. I'll give the problem some more thought and submit a PR request if I come up with a solution that I think works well.

ageitgey · 2018-02-15T22:35:43Z

Cool, thanks!

knod · 2018-02-23T21:16:25Z

Oh, cool. I didn't know that that was the problem with the PRs. That's good to know.

cjanietz · 2018-04-01T21:34:22Z

I have made a solution which uses require for getting the stopword files. When combined with webpack and raw-loader you can make the download of the actual stopword files completely async and dependent on the relevant language of your content. If you would like to check it out, check my fork.

bhowmikp · 2018-04-04T05:58:09Z

Hopefully your PR gets accepted because I too would like this feature

ageitgey · 2018-04-04T19:14:39Z

@cjanietz I really like your approach. I created a PR here: #84

Let me know if you are OK with me pulling that in.

justinmchase · 2021-11-27T23:10:30Z

I just tried to use this in an Electron app, which uses webpack to build the UI and non-UI parts of the app. It doesn't seem to support bundling txt files like along with the modules (as far as I can figure out). So if these were just hardcoded into js modules it would solve it there as well.

This was referenced Dec 19, 2016

Frontend support #64

Closed

Frontend support #65

Open

This was referenced Dec 22, 2016

Non-fs method 2: Single file containing one object #68

Open

Non-fs method 1: txt to json #67

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert to front-end friendly, remove 'fs' #62

Convert to front-end friendly, remove 'fs' #62

knod commented Dec 11, 2016

ageitgey commented Dec 14, 2016

knod commented Dec 15, 2016

aknobloch commented Feb 15, 2018 •

edited

Loading

ageitgey commented Feb 15, 2018

aknobloch commented Feb 15, 2018

ageitgey commented Feb 15, 2018

aknobloch commented Feb 15, 2018

ageitgey commented Feb 15, 2018

knod commented Feb 23, 2018

cjanietz commented Apr 1, 2018

bhowmikp commented Apr 4, 2018

ageitgey commented Apr 4, 2018

justinmchase commented Nov 27, 2021

Convert to front-end friendly, remove 'fs' #62

Convert to front-end friendly, remove 'fs' #62

Comments

knod commented Dec 11, 2016

ageitgey commented Dec 14, 2016

knod commented Dec 15, 2016

aknobloch commented Feb 15, 2018 • edited Loading

ageitgey commented Feb 15, 2018

aknobloch commented Feb 15, 2018

ageitgey commented Feb 15, 2018

aknobloch commented Feb 15, 2018

ageitgey commented Feb 15, 2018

knod commented Feb 23, 2018

cjanietz commented Apr 1, 2018

bhowmikp commented Apr 4, 2018

ageitgey commented Apr 4, 2018

justinmchase commented Nov 27, 2021

aknobloch commented Feb 15, 2018 •

edited

Loading