diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e4dd871..bad4505 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -52,7 +52,7 @@ Goldenset Match Error: 동일조건변경허락 (동일조건변경허락Noun) -
 Goldenset Match Error: 기획조정실장 (기획조정실장Noun) -> (기획Noun 조정Noun 실장Noun)
 Goldenset Match Error: 안올라 (안Noun 올라Noun) -> (안Noun 올라Verb)
 ```
-5. Run [src/main/scala/com/twitter/penguin/korean/tools/CreateParsingGoldenset.scala](src/main/scala/com/twitter/penguin/korean/tools/CreateParsingGoldenset.scala) to update the golden set. You can run it via maven or your IDE. I would recommend using an IDE.
+5. Run [src/main/scala/org/openkoreantext/processor/tools/CreateParsingGoldenset.scala](src/main/scala/org/openkoreantext/processor/tools/CreateParsingGoldenset.scala) to update the golden set. You can run it via maven or your IDE. I would recommend using an IDE.
 ## Pull requests
diff --git a/README.md b/README.md
index 6f13ade..59059ee 100644
--- a/README.md
+++ b/README.md
@@ -109,7 +109,7 @@ You can find these [examples](examples) in examples folder.
-
+
 ## Contribution
diff --git a/docs/contribution-guide.md b/docs/contribution-guide.md
index 808d092..a030447 100644
--- a/docs/contribution-guide.md
+++ b/docs/contribution-guide.md
@@ -21,13 +21,13 @@ git checkout -b "feature_branch_name"
 이 예제에서는 사전을 수정해 보겠습니다. 사전 파일들은
-[src/main/resources/com/twitter/penguin/korean/util/](../../../tree/master/src/main/resources/com/twitter/penguin/korean/util) 에 있습니다.
+[src/main/resources/org/openkoreantext/processor/util/](../../../tree/master/src/main/resources/org/openkoreantext/processor/util) 에 있습니다.
-```src/main/resources/com/twitter/penguin/korean/util/noun/wikipedia_title_nouns.txt``` 에 동사가 들어가 있네요. 삭제했습니다. (이런 경우가 많이 있습니다. 수작업으로 없애 주어야 하는데요 여러분의 도움을 구합니다. 아울러 복합명사도 최대한 분리 되어야 합니다. 하동청룡리석불좌상 -> 하동 청룡리 석불 좌상)
+```src/main/resources/org/openkoreantext/processor/util/noun/wikipedia_title_nouns.txt``` 에 동사가 들어가 있네요. 삭제했습니다. (이런 경우가 많이 있습니다. 수작업으로 없애 주어야 하는데요 여러분의 도움을 구합니다. 아울러 복합명사도 최대한 분리 되어야 합니다. 하동청룡리석불좌상 -> 하동 청룡리 석불 좌상)
 ![editor](imgs/img2-1.png)
-사전을 정리하기 위해서 [src/main/scala/com/twitter/penguin/korean/tools/CleanupDictionaries.scala](../../../tree/master/src/main/scala/com/twitter/penguin/korean/tools/CleanupDictionaries.scala) 를 실행합니다.
+사전을 정리하기 위해서 [src/main/scala/org/openkoreantext/processor/tools/CleanupDictionaries.scala](../../../tree/master/src/main/scala/org/openkoreantext/processor/tools/CleanupDictionaries.scala) 를 실행합니다.
 파일을 열고 Run > Run... 을 실행합니다.
@@ -42,10 +42,10 @@ git checkout -b "feature_branch_name"
 ```
 ~/workspace/twitter-korean-text remove_verbs_from_wiki* ➜ git diff
-diff --git a/src/main/resources/com/twitter/penguin/korean/util/noun/wikipedia_title_nouns.txt b/src/main/resources/com/twitter/penguin/korean/util/noun/wikipedia_title_nouns.txt
+diff --git a/src/main/resources/org/openkoreantext/processor/util/noun/wikipedia_title_nouns.txt b/src/main/resources/org/openkoreantext/processor/util/noun/wikipedia_title_nouns.txt
 index 8a6d3c5..c1386d3 100644
---- a/src/main/resources/com/twitter/penguin/korean/util/noun/wikipedia_title_nouns.txt
-+++ b/src/main/resources/com/twitter/penguin/korean/util/noun/wikipedia_title_nouns.txt
+--- a/src/main/resources/org/openkoreantext/processor/util/noun/wikipedia_title_nouns.txt
++++ b/src/main/resources/org/openkoreantext/processor/util/noun/wikipedia_title_nouns.txt
 @@ -1183,8 +1183,6 @@
 가야정
 가야정류장
@@ -171,7 +171,7 @@ Goldenset Match Error: 락이가 (락이Noun* 가Josa) -> (락Noun 이Suffix 가
 * 변화한 예시가 acceptable하면 Goldenset을 업데이트 합니다.
-[src/main/scala/com/twitter/penguin/korean/tools/CreateParsingGoldenset.scala](../../../tree/master/src/main/scala/com/twitter/penguin/korean/tools/CreateParsingGoldenset.scala) 파일을 실행하면 goldenset을 자동으로 업데이트 합니다. (IntelliJ 안에서 실행 해 주세요.)
+[src/main/scala/org/openkoreantext/processor/tools/CreateParsingGoldenset.scala](../../../tree/master/src/main/scala/org/openkoreantext/processor/tools/CreateParsingGoldenset.scala) 파일을 실행하면 goldenset을 자동으로 업데이트 합니다. (IntelliJ 안에서 실행 해 주세요.)
 * 다시 테스트를 실행해 봅니다.
 ```
@@ -219,7 +219,7 @@ Tests run: 66, Failures: 0, Errors: 0, Skipped: 0
 ~/workspace/twitter-korean-text remove_verbs_from_wiki* ➜ git commit -am "dictionary update"
 [dictionary_update_name 8dffbfc] dictionary update
  2 files changed, 8 insertions(+), 41 deletions(-)
- rewrite src/test/resources/com/twitter/penguin/korean/util/goldenset.txt.gz (61%)
+ rewrite src/test/resources/org/openkoreantext/processor/util/goldenset.txt.gz (61%)
 ~/workspace/twitter-korean-text remove_verbs_from_wiki ➜ git push origin remove_verbs_from_wiki
 Counting objects: 20, done.
 Delta compression using up to 8 threads.
diff --git a/docs/sbt.md b/docs/sbt.md
index 785880b..f84035a 100644
--- a/docs/sbt.md
+++ b/docs/sbt.md
@@ -6,8 +6,8 @@ SBT
 - run
   - **사전 업데이트 등의 작업 후에 할 것** — `$ sbt "runMain org.openkoreantext.processor.tools.UpdateAllTheExamples"`
 - 기타
-  - `$ sbt "runMain org.openkoreantext.processor.qa.BatchGetUnknownNouns ./src/main/resources/com/twitter/penguin/korean/util/example_tweets.txt"`
-  - `$ sbt "runMain org.openkoreantext.processor.qa.BatchGetUnknownNouns ./src/main/resources/com/twitter/penguin/korean/util/example_tweets.txt"` Looking to contribute something? Here's how you can help.
+  - `$ sbt "runMain org.openkoreantext.processor.qa.BatchGetUnknownNouns ./src/main/resources/org/openkoreantext/processor/util/example_tweets.txt"`
+  - `$ sbt "runMain org.openkoreantext.processor.qa.BatchGetUnknownNouns ./src/main/resources/org/openkoreantext/processor/util/example_tweets.txt"` Looking to contribute something? Here's how you can help.
 Bugs reports
 ------------
diff --git a/src/main/scala/org/openkoreantext/processor/tools/CreateConjugationExamples.scala b/src/main/scala/org/openkoreantext/processor/tools/CreateConjugationExamples.scala
index 1b3c5f4..832bda7 100644
--- a/src/main/scala/org/openkoreantext/processor/tools/CreateConjugationExamples.scala
+++ b/src/main/scala/org/openkoreantext/processor/tools/CreateConjugationExamples.scala
@@ -36,7 +36,7 @@ object CreateConjugationExamples extends Runnable {
   def updateConjugateExamples(file: String, isAdj: Boolean, outputFileName: String) {
     System.err.println("Writing the expansion goldenset in " + outputFileName)
-    val outputPath = "src/test/resources/com/twitter/penguin/korean/util/" + outputFileName
+    val outputPath = "src/test/resources/org/openkoreantext/processor/util/" + outputFileName
     val out = new FileOutputStream(outputPath)
     val words = readWordsAsSeq(file)
diff --git a/src/main/scala/org/openkoreantext/processor/tools/CreateParsingExamples.scala b/src/main/scala/org/openkoreantext/processor/tools/CreateParsingExamples.scala
index 37b6954..2b16db6 100644
--- a/src/main/scala/org/openkoreantext/processor/tools/CreateParsingExamples.scala
+++ b/src/main/scala/org/openkoreantext/processor/tools/CreateParsingExamples.scala
@@ -42,7 +42,7 @@ object CreateParsingExamples extends Runnable {
   }.toSet
-  val outputFile: String = "src/test/resources/com/twitter/penguin/korean/util/current_parsing.txt"
+  val outputFile: String = "src/test/resources/org/openkoreantext/processor/util/current_parsing.txt"
   System.err.println("Writing the new goldenset to " + outputFile)
diff --git a/src/main/scala/org/openkoreantext/processor/tools/CreatePhraseExtractionExamples.scala b/src/main/scala/org/openkoreantext/processor/tools/CreatePhraseExtractionExamples.scala
index 2514698..633097e 100644
--- a/src/main/scala/org/openkoreantext/processor/tools/CreatePhraseExtractionExamples.scala
+++ b/src/main/scala/org/openkoreantext/processor/tools/CreatePhraseExtractionExamples.scala
@@ -46,7 +46,7 @@ object CreatePhraseExtractionExamples extends Runnable {
   }.toSet
-  val outputFile: String = "src/test/resources/com/twitter/penguin/korean/util/current_phrases.txt"
+  val outputFile: String = "src/test/resources/org/openkoreantext/processor/util/current_phrases.txt"
   System.err.println("Writing the new phrases to " + outputFile)
diff --git a/src/main/scala/org/openkoreantext/processor/tools/DeduplicateAndSortDictionaries.scala b/src/main/scala/org/openkoreantext/processor/tools/DeduplicateAndSortDictionaries.scala
index d7ee2e1..b3e4b67 100644
--- a/src/main/scala/org/openkoreantext/processor/tools/DeduplicateAndSortDictionaries.scala
+++ b/src/main/scala/org/openkoreantext/processor/tools/DeduplicateAndSortDictionaries.scala
@@ -58,7 +58,7 @@ object DeduplicateAndSortDictionaries extends Runnable {
   def run {
     RESOURCES_TO_CLEANUP.foreach { f: String =>
-      val outputFolder = "src/main/resources/com/twitter/penguin/korean/util/"
+      val outputFolder = "src/main/resources/org/openkoreantext/processor/util/"
       System.err.println("Processing %s.".format(f))
       val words = readWords(outputFolder + f).toList.sorted