Wrong main.py file: it should contain the PDF reader code, but instead it contains pygame code
Here is the code. The only change I made was to turn print('-'*10) into a function called divide().
print('-'*10)
divide()
import re
from collections import Counter

from PyPDF2 import PdfReader


def extract_text_from_pdf(pdf_file: str) -> list[str]:
    # Open the PDF in binary mode and hand the open file object to PdfReader
    with open(pdf_file, 'rb') as pdf:
        reader = PdfReader(pdf, strict=False)
        print('Pages:', len(reader.pages))
        divide()

        # Extract the text of every page while the file is still open
        return [page.extract_text() for page in reader.pages]


def divide():
    # Print a horizontal separator line
    print('-' * 75)


def count_words(text_list: list[str]) -> Counter:
    all_words: list[str] = []
    for text in text_list:
        # Split on whitespace and on common punctuation marks
        split_text: list[str] = re.split(r'\s+|[,;?!.-]\s*', text.lower())
        all_words += [word for word in split_text if word]  # exclude empty strings

    return Counter(all_words)


def main():
    extracted_text: list[str] = extract_text_from_pdf('sample.pdf')
    counter: Counter = count_words(text_list=extracted_text)

    for page in extracted_text:
        print(page)

    divide()

    # Show the five most frequent words
    for word, mentions in counter.most_common(5):
        print(f'{word:10} : {mentions} uses')


if __name__ == '__main__':
    main()
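For anyone who wants to check the word-counting part without PyPDF2 or a sample.pdf on hand, here is a minimal sketch that runs count_words() on plain strings. The `pages` list is made-up sample text (an assumption for illustration, not the contents of any real PDF); the function body is the same as in the code above.

```python
import re
from collections import Counter


def count_words(text_list: list[str]) -> Counter:
    """Count word occurrences across all pages, using the same split rule as above."""
    all_words: list[str] = []
    for text in text_list:
        # Split on whitespace and on common punctuation marks
        split_text = re.split(r'\s+|[,;?!.-]\s*', text.lower())
        all_words += [word for word in split_text if word]  # exclude empty strings
    return Counter(all_words)


# Hypothetical page texts standing in for the output of extract_text_from_pdf()
pages = ['The cat sat. The cat ran!', 'A dog, the cat; a bird?']
counter = count_words(pages)

for word, mentions in counter.most_common(3):
    print(f'{word:10} : {mentions} uses')
```

Because the text is lowercased before splitting, "The" and "the" are counted as the same word, and the `if word` filter drops the empty strings that re.split() produces after trailing punctuation.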