-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert to front-end friendly, remove 'fs' #62
Comments
Supporting front-end use cases just hasn't ever come up before. I'd be worried that the stopwords data would be too large to be used in most front-end applications if it was bundled in. |
I'll admit I haven't tested it on a throttled system. It did work for me on a MacBook Pro, OSX 10.11, Chrome 55.0.2883.75 with no problems. Used it in the development of a Chrome extension. The lists aren't actually that big. What I did at the time was turn those stopwords files into js files/node modules and |
I also was looking at node-unfluff for front end bro purposes. The solution offered by @knod works, an alternative could also be to utilize local HTML5 File IO when applicable, and fallback to I'd be happy to code this and submit a PR, but I also wonder if this would be better suited to its own fork and module. A separate module via |
If it's a feature you want, feel free to either submit a PR or make a fork. I'm happy with whichever route works best for you. I just don't have the need to use this in a browser myself, so it's not something I ever attempted to support. |
Makes sense - but I notice there's a few PR's that address this issue while maintaining the robustness of the original design. However, these have been there for a while with little activity, so I'm wondering if there is a reason why they aren't being merged. The ideal solution would be to have this base updated with a solution so that separate maintenance on the Thanks for the speedy reply, by the way! |
A fair question! I don't have a good answer beyond I thought the existing PRs made the code slightly worse for me and I didn't need the new feature, so I did a bad job of ever getting around to doing anything with it :)
Sure, that makes sense. My issues with the current PRs are just that they required loading all stop words for all languages into memory even if when 99.9% of users would only need them for one language. Loading all languages not only would that make memory usage higher on the server side (100KB per process), but it seemed like a pretty bad solution for delivering the library to users client side. Is it really worth adding 100KB of useless data to to every page load for ever user for no reason? I imagine there might be a more clever solution where you pre-generate a separate js bundle file per language or something like that. I just never looked into it. But that's the honest answer - I didn't love the current solutions but I also didn't spend any time coming up with a better answer either. But I'm totally open to discussing it or revisiting my opinion or whatever. |
Gotcha, that makes perfect sense. And thanks for the explanation. I'll give the problem some more thought and submit a PR request if I come up with a solution that I think works well. |
Cool, thanks! |
Oh, cool. I didn't know that that was the problem with the PRs. That's good to know. |
I have made a solution which uses require for getting the stopword files. When combined with webpack and raw-loader you can make the download of the actual stopword files completely async and dependent on the relevant language of your content. If you would like to check it out, check my fork. |
Hopefully your PR gets accepted because I too would like this feature |
I just tried to use this in an Electron app, which uses webpack to build the UI and non-UI parts of the app. It doesn't seem to support bundling txt files like along with the modules (as far as I can figure out). So if these were just hardcoded into js modules it would solve it there as well. |
This module could pretty easily be converted to be front end friendly by removing the need for 'fs'. Since the stopwords files are the only things being accessed with 'fs', this could be solved pretty easily.
I created my own solution, but I'm not sure it was particularly elegant. My current thought is to create an object with a key of the language code and a value of the stopwords array. Do you have other thoughts about more elegant solutions?
The text was updated successfully, but these errors were encountered: