DOC (string dtype): update user guide page "Working with text data" #60348

Open
jorisvandenbossche opened this issue Nov 17, 2024 · 5 comments · May be fixed by #60535
Comments

@jorisvandenbossche
Member

With the new default string dtype in pandas 3.0, we should update the documentation to properly reflect this. The main page about working with string data is https://pandas.pydata.org/pandas-docs/version/2.2/user_guide/text.html

This page currently mentions object dtype vs nullable StringDtype, some differences, and then shows most examples using dtype="string".

We should update that page to reflect that there is now a default "str" dtype. It should also add a historical note that object dtype was the default before pandas 3.0 and can still be encountered, explain how to convert such data to the str dtype (with a reference to the upgrade guide), and cover related points such as the differences between the "str" and "string" dtypes.
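
For readers of this thread, here is a minimal sketch of the kind of conversion the updated page might show. It is not taken from the existing docs. It assumes pandas is installed, and it assumes (based on the description above) that the `"str"` alias only resolves to the new default string dtype on pandas 3.0 or later, so that step is guarded by a simple version check. The variable names are illustrative only.

```python
import pandas as pd

# Explicit object dtype: the default for string data before pandas 3.0,
# and something you can still encounter afterwards.
ser = pd.Series(["a", "b", None], dtype=object)
print(ser.dtype)  # object

# The nullable string dtype is requested with the "string" alias.
nullable = ser.astype("string")
print(nullable.dtype)
print(nullable.isna().tolist())

# Assumption: on pandas >= 3.0, "str" maps to the new default string dtype.
# On older versions astype("str") behaves differently, so it is skipped here.
pandas_major = int(pd.__version__.split(".")[0])
if pandas_major >= 3:
    default_str = ser.astype("str")
    print(default_str.dtype)
```

Keeping the `"str"` conversion behind a version check means the same snippet can be run on either side of the 3.0 change while comparing the resulting dtypes.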

@jorisvandenbossche jorisvandenbossche added Docs Strings String extension data type and string data labels Nov 17, 2024
@jorisvandenbossche jorisvandenbossche added this to the 3.0 milestone Nov 17, 2024
@ensalada-de-pollo ensalada-de-pollo removed their assignment Nov 18, 2024
@TEARFEAR

take

@Uvi-12
Contributor

Uvi-12 commented Dec 6, 2024

@jorisvandenbossche @mroeschke Should all instances of dtype="string" in the documentation be changed to dtype="str" to align with the new default string dtype in pandas 3.0?

@Uvi-12
Contributor

Uvi-12 commented Dec 9, 2024

@mroeschke I want to work on this, please guide me.

@jorisvandenbossche
Member Author

> Should all instances of dtype="string" in the documentation be changed to dtype="str" to align with the new default string dtype in pandas 3.0?

Not blindly all of them, because in some places it might be explaining something explicitly about the nullable "string" version. But where it is in general about string data, I think indeed it can be changed.

@Uvi-12
Contributor

Uvi-12 commented Dec 10, 2024

> > Should all instances of dtype="string" in the documentation be changed to dtype="str" to align with the new default string dtype in pandas 3.0?
>
> Not blindly all of them, because in some places it might be explaining something explicitly about the nullable "string" version. But where it is in general about string data, I think indeed it can be changed.

Alright, Thank you. I will work on it.
